AI Safety Atlas Ch.2 — Misuse Risks

Source: Misuse Risks

How humans deliberately leverage AI capabilities for harm across biological weapons, cyberattacks, autonomous weapons, and adversarial AI exploitation. “Technology is an amplifier of intentions”: each technological advance expands the potential harm radius.

Bio Risk

Offense-defense imbalance: developing a novel virus currently costs on the order of $1 billion; AI tilts this balance further toward attackers by lowering that barrier.

Empirical demonstrations:

  • Researchers redirected a drug-discovery AI to optimize for toxicity, generating “40,000 potentially toxic molecules within six hours.”
  • Students without biology backgrounds used AI chatbots to identify pandemic pathogens, production methods, DNA synthesis firms likely to overlook screening, and detailed protocols — all within one hour.

Moving capability frontier: experts predicted AI wouldn’t match top virology teams on troubleshooting until after 2030; testing showed the threshold had already been reached.

DNA synthesis vulnerabilities: in a 2023 MIT study, researchers used simple evasion tactics to order 1918 pandemic flu fragments and ricin-encoding DNA; 12 of 13 International Gene Synthesis Consortium members fulfilled the disguised orders.

Democratization trend: declining DNA synthesis costs (halving every 15 months) + cloud labs + benchtop synthesis machines + AI assistance → bioweapon creation increasingly accessible to non-institutional actors.
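
To make the compounding explicit, here is a minimal sketch of how a fixed 15-month halving time plays out over a decade (the halving interval comes from the trend above; the starting cost is a normalized placeholder, not a real market price):

```python
# Illustrative cost projection under a fixed halving time. Only the 15-month
# halving interval comes from the text above; the start value and horizon are
# arbitrary placeholders.
HALVING_MONTHS = 15

def projected_cost(initial_cost: float, months_elapsed: float) -> float:
    """Cost after `months_elapsed` months, assuming it halves every HALVING_MONTHS."""
    return initial_cost * 0.5 ** (months_elapsed / HALVING_MONTHS)

for years in (1, 2, 5, 10):
    print(f"after {years:>2} years: {projected_cost(1.0, 12 * years):.4f}x today's cost")
```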

This connects to and substantially deepens the wiki’s existing biosecurity page.

Cyber Risk

Existing vulnerabilities scale: the 2024 CrowdStrike software update caused an estimated $5B in damage across airlines, hospitals, and banks. “Cyberattack overhangs” exist: devastating attacks have not yet happened because of attacker restraint, not because defenses are robust.

AI-enabled capabilities:

  • Phishing at scale — AI-generated emails achieved a 65% success rate vs. 60% for human-written ones, and took 40% less time to create.
  • Voice/visual impersonation — minutes of audio suffice for voice cloning; a single image for face-swap deepfakes.
  • Autonomous exploitation — AI agents “successfully hacked 73% of test targets” without human guidance. OpenAI’s o3 helped discover a zero-day Linux kernel vulnerability that required expert kernel knowledge.
  • Malware acceleration — WormGPT generates malicious code without requiring expertise; polymorphic malware automatically creates variations that security tools don’t recognize.

Cost transformation: autonomous AI agents can hack websites for ~$10/attempt — 8× cheaper than human expertise, enabling unprecedented scale.

Offense-defense balance: attackers need only one weakness; defenders must secure everything. AI enables “flash attacks” executable in minutes, outpacing human response.

Autonomous Weapons Risk

No longer theoretical:

  • Libya 2021 — autonomous drones made targeting decisions without human control
  • Ukraine — AI-enabled loitering munitions with autonomous target tracking (both sides)
  • Gaza — AI-guided drone swarm attacks
  • Turkey’s Kargu-2 — finds and attacks targets autonomously

Driving incentives: speed (DARPA’s AI beat F-16 pilots in simulated dogfights with maneuvers “too precise and rapid for humans to counter”); cost (US Replicator program: thousands of autonomous drones at fraction of traditional aircraft cost); resilience (GPS-denied environments preclude human-in-the-loop control).

Erosion of meaningful human control: under battlefield stress, operators face “only seconds to verify computer-suggested targets” and default to accepting them. The Lavender system assigns residents numerical scores predicting armed-group membership; human officers merely set score thresholds, and execution becomes automated downstream.

Arms race: China and Russia targeting 2028–2030 for major military automation; US deploying thousands of autonomous drones by 2025. “Only actors willing to compromise safety remain in the race.”

Escalation risks: AI military systems consistently recommend more aggressive actions than human strategists, including escalating to nuclear weapons in simulations. Multiple AI systems engaging create unexpected feedback loops “similar to financial flash crashes — except this time with missiles instead of stocks.”

This deepens the wiki’s existing autonomous-weapons and ai-military-applications pages, and the work of ann-katrien-oimann / andrew-rebera.

Adversarial AI Risk

Beyond misuse-of-AI, misuse-against-AI is its own category. Four sub-types:

Runtime attacks:

  • Visual perturbations — “a panda with imperceptible changes classified as a gibbon with 99.3% confidence” (see the sketch after this list)
  • Physical attacks — stickers on stop signs trick autonomous vehicles
  • DolphinAttack — ultrasonic commands inaudible to humans can control voice assistants from up to 1.7 m away
  • Prompt injection — Slack’s AI assistant leaked confidential info via injections placed in public channels
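
The panda/gibbon example in the first bullet comes from the fast gradient sign method (FGSM). A minimal sketch of that mechanism, using a toy classifier and a random stand-in “image” rather than the original ImageNet setup:

```python
import torch
import torch.nn as nn

# Minimal FGSM sketch: nudge every input pixel slightly in the direction that
# increases the model's loss. The classifier and "image" below are toy
# stand-ins, not the ImageNet model from the original panda/gibbon demo.
torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

image = torch.rand(1, 3, 32, 32, requires_grad=True)   # stand-in input
label = torch.tensor([3])                               # stand-in true class

loss = nn.functional.cross_entropy(model(image), label)
loss.backward()                                         # gradient w.r.t. pixels

epsilon = 0.01                                          # imperceptibly small step
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(image).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial).argmax(dim=1).item())
print("max pixel change:      ", (adversarial - image).abs().max().item())
```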

Automated attack generation — AutoDAN reliably generates jailbreak prompts; attacks frequently transfer across models (GPT/Claude/Gemini/Llama).

Data poisoning — corrupts models during training. “Attackers only need to contribute some training data once to permanently compromise the system.” Backdoor example: poisoning 0.1% of training data created reliable backdoors in facial recognition. Larger models can be more vulnerable to certain poisoning attacks, the opposite of the expected robustness scaling.
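
A toy illustration of the backdoor mechanism on synthetic data (the 0.1% facial-recognition figure above comes from the literature; the features, trigger pattern, and 1% poison rate below are arbitrary choices made purely for demonstration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy backdoor-poisoning demo on synthetic data: a small number of training
# points carry a fixed "trigger" pattern and a flipped label. The trained model
# behaves normally on clean inputs but misclassifies inputs carrying the trigger.
rng = np.random.default_rng(0)

n, d = 5000, 20
X = rng.normal(size=(n, d))
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # clean labelling rule

trigger = np.zeros(d)
trigger[-3:] = 4.0                               # trigger lives in 3 otherwise-unused features

poison_idx = rng.choice(n, size=int(0.01 * n), replace=False)  # 1% of the data
X[poison_idx] += trigger
y[poison_idx] = 1                                # attacker-chosen target label

model = LogisticRegression(max_iter=1000).fit(X, y)

X_clean = rng.normal(size=(1000, d))
y_clean = (X_clean[:, 0] + X_clean[:, 1] > 0).astype(int)
X_triggered = X_clean + trigger

print("clean accuracy:              ", model.score(X_clean, y_clean))
print("fraction triggered -> target:", (model.predict(X_triggered) == 1).mean())
```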

Privacy extraction — membership inference attacks, model inversion. LLMs can be prompted to reveal email addresses, phone numbers, social security numbers.
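
A minimal loss-threshold membership-inference sketch on synthetic data (this is the simplest form of the attack, shown only to illustrate why overfit models leak membership; the model, data, and threshold rule are arbitrary choices):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Loss-threshold membership inference: an overfit model assigns systematically
# lower loss to its own training examples, so thresholding per-example loss
# leaks whether a record was in the training set. Everything here is synthetic.
rng = np.random.default_rng(0)

X = rng.normal(size=(400, 10))
y = (X[:, 0] + rng.normal(size=400) > 0).astype(int)   # noisy labels -> overfitting
X_in, y_in = X[:200], y[:200]                           # members (training set)
X_out, y_out = X[200:], y[200:]                         # non-members

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_in, y_in)

def per_example_loss(X, y):
    p = model.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(p, 1e-12, None))

loss_in, loss_out = per_example_loss(X_in, y_in), per_example_loss(X_out, y_out)
threshold = np.median(np.concatenate([loss_in, loss_out]))  # idealized threshold

# Guess "member" whenever the loss falls below the threshold.
accuracy = ((loss_in < threshold).mean() + (loss_out >= threshold).mean()) / 2
print(f"membership-inference accuracy: {accuracy:.2f}  (0.50 = chance)")
```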

Compounding effects: privacy extraction enables more effective adversarial examples; attacks amplify each other.

Defense trade-offs: adversarial training improves robustness against known attacks but reduces normal-input performance. Hardening against one attack sometimes increases vulnerability to others.
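
A minimal FGSM-style adversarial-training loop, to make the “train on perturbed inputs” idea concrete (the toy model, synthetic data, and epsilon are placeholders; this simplest variant illustrates the mechanism rather than reproducing any robustness or accuracy numbers from the literature):

```python
import torch
import torch.nn as nn

# Sketch of FGSM-style adversarial training: each step perturbs the batch in the
# direction that increases the loss, then updates the model on the perturbed
# batch. Robustness to this attack typically costs some clean-input accuracy.
torch.manual_seed(0)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
epsilon = 0.1  # perturbation budget (placeholder)

def fgsm(x, y):
    """Return an FGSM-perturbed copy of x within an L-infinity ball of epsilon."""
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

for step in range(200):
    # Toy data: label depends on the sign of a linear projection of x.
    x = torch.randn(128, 20)
    y = (x[:, 0] + x[:, 1] > 0).long()

    x_adv = fgsm(x, y)                    # inner step: craft worst-case inputs
    opt.zero_grad()
    loss = loss_fn(model(x_adv), y)       # outer step: train on perturbed inputs
    loss.backward()
    opt.step()

# Evaluate on clean vs. perturbed data.
x = torch.randn(2000, 20)
y = (x[:, 0] + x[:, 1] > 0).long()
with torch.no_grad():
    clean_acc = (model(x).argmax(1) == y).float().mean().item()
adv_acc = (model(fgsm(x, y)).argmax(1) == y).float().mean().item()
print(f"clean accuracy: {clean_acc:.2f}   accuracy under FGSM: {adv_acc:.2f}")
```

The trade-off described above is typically measured by comparing this model’s clean accuracy against an identically trained model without the perturbation step.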

Connection to Wiki

This subchapter substantially deepens the existing pages noted above (biosecurity, autonomous-weapons, ai-military-applications).

It also connects to the various-redteams SR2025 agenda (which catalogs many of these attack vectors empirically) and to wmd-evals-weapons-of-mass-destruction.