Ten AI Safety Projects I’d Like People to Work On

Author: Julian Hazell (julian-hazell), grants officer at open-philanthropy
Published: July 2025
Source: EA Forum / Secret Third Thing Substack

Context

Hazell runs grants at Open Philanthropy focused on reducing catastrophic risks from transformative AI. This post is his personal list — not official Open Phil policy — of projects he’d like to see more people work on. He frames it with a direct statement of his threat model: he believes AI systems capable of causing catastrophe (including human extinction) could be developed within a decade.

The post is notable for two reasons: (1) it shows where a major funder sees gaps, which is a useful signal for anyone considering working in AI safety; (2) project #6 explicitly names “AI safety living literature reviews” as a priority, and Hazell links to the Open Philanthropy RFP for interested applicants.

The Ten Projects

1. AI Security Field-Building

A structured program (e.g., a 6-week part-time cohort) that trains security engineers on AI-specific security challenges: securing model weights from state-level adversaries, preventing data contamination, and defending against exfiltration attacks. Hazell notes persistent talent bottlenecks in AI security roles outside labs, where salaries are lower. Related to information-security.

2. Technical AI Governance Research Organization

A research org focused on topics in the Bucknall/Reuel et al. paper — things like “how would compute governance actually be enforced in practice?” and “what techniques enable verifiable model auditing without compromising model security?” Could also run a fellowship for early-to-mid-career technical people. Hazell sees this as an underdeveloped paradigm despite growing excitement. Related to ai-governance.

3. Tracking Sketchy AI Agent Behavior “In the Wild”

An organization systematically investigating real-world deployments of AI agents for signs of misalignment, scheming, or deception. Possible workstreams: partnering with companies to analyze anonymized interaction logs, creating honeypot environments, interviewing power users of AI agents, and publishing case studies. Motivated by emerging empirical evidence of concerning behaviors — alignment faking, reward hacking, sycophancy — and the need to ground policy discussions in real-world observations. Related to ai-agents, deceptive-alignment, ai-scheming-evals.

4. AI Safety Communications Consultancy

A dedicated communications firm specializing in helping AI safety organizations communicate more effectively. Services: media training, writing support, strategic planning, op-ed pitching, interview prep, messaging frameworks. Distinguishing feature: deep investment in understanding AI safety nuances (unlike general PR firms). At least one founder should have actual communications firm experience. Foreview appears to be a real-world instance of exactly this concept. Related to foreview.

5. AI Lab Monitor

An independent organization conducting meticulous analysis of frontier AI labs’ safety-relevant practices — tracking responsible scaling policy adherence, safety testing procedures, corporate governance, and safety-relevant decisions. Could produce quarterly scorecards. Serves multiple purposes: helps labs stay accountable, gives policymakers reliable information, improves public situational awareness. Related to responsible-scaling-policy, anthropic, openai, deepmind.

6. AI Safety Living Literature Reviews

A “living literature review” is a continuously updated, expert-authored synthesis of a specific AI safety topic, answering questions like: What’s the state of evidence for AI scheming? Which technical safety research agendas are active, and what progress has been made? What policy ideas exist for making AI safer? Each review would be maintained by a single expert or small team. This is the direct primary source for the AI Safety Atlas project in this wiki. Hazell links to an Open Philanthropy RFP for this work. Related to ai-safety, the SR2025 agenda cluster, and the ai-safety-atlas-textbook.

7. $10 Billion AI Resilience Plan

A comprehensive, implementation-ready blueprint for how $10B could be deployed on AI alignment/control research and societal resilience. Specific program structures, budget allocations, timelines. Motivated by the scenario where a major government or philanthropic funder suddenly wakes up to transformative AI risk. Related to ai-governance, existential-risk.

8. AI Tools for Fact-Checking

A (possibly for-profit) organization building AI-powered fact-checking tools with transparent chain-of-thought reasoning and open-source code. Includes rigorous bias evaluations, public datasets of fact-checking decisions, and APIs for platform integration. Motivated by the importance of societal epistemics during rapid AI-driven change. Related to ai-agents.

9. AI Auditors

A company building AI agents that conduct compliance audits — automating documentation review, log checking, safety procedure verification. Hazell flags a critical caveat: giving AI systems access to labs’ internal systems creates massive attack surfaces and security risks. He acknowledges security professionals would likely object — but thinks the concept merits exploration. Related to ai-governance, information-security, capability-evaluations.

10. AI Economic Impacts Tracker

An organization examining how AI is transforming the economy — original research, surveys of managers on AI use, economics-style productivity studies, and investigation of hiring claims. “Think Epoch’s approach but applied to economic impacts.” Could partner with companies for granular deployment data and maintain a comprehensive database of AI adoption metrics. Related to transformative-ai, ai-population-explosion.

Caveats Hazell Explicitly States

  • These are his personal views, not Open Philanthropy’s official positions
  • Some projects may already be underway
  • These aren’t necessarily the most impactful things to work on
  • Funding is not guaranteed even for projects that apply to the RFP

Significance for This Wiki

This article is the primary source for the “Julian Hazell (Open Phil) explicitly requested AI Safety Living Literature Reviews” citation that appears in the AI Safety Atlas design doc (docs/plans/2026-04-24-living-literature-map-design.md). It provides concrete funder signal that project #6 (living literature reviews) is both needed and potentially fundable — directly validating the AI Safety Atlas project rationale.

The broader list also serves Kevin’s goal of identifying projects needing attention: these 10 items represent one major funder’s map of gaps in the AI safety ecosystem as of mid-2025.