Open Problems in Machine Unlearning for AI Safety

Fazl Barez, Tingchen Fu, Ameya Prabhu, Stephen Casper, Amartya Sanyal, Adel Bibi, … (+13 more) — 2025-01-09 — University of Oxford, MIT, University of Cambridge, Nanyang Technological University, Tel Aviv University — arXiv

Summary

Identifies key limitations and open problems that prevent machine unlearning from serving as a comprehensive AI safety solution, particularly for managing dual-use knowledge in chemical, biological, radiological, and nuclear (CBRN) and cybersecurity domains. Also maps tensions between unlearning and existing safety mechanisms.

Source