Neural Interactive Proofs
Lewis Hammond, Sam Adam-Day — 2024-12-12 — arXiv (ICLR 2025)
Summary
Introduces a unifying game-theoretic framework for prover-verifier games, in which a computationally bounded verifier learns to interact with powerful but untrusted neural-network provers. The paper proposes new protocols and demonstrates them on graph isomorphism and code validation tasks.
Key Result
Demonstrated that neural interactive proofs can enable safe delegation to more capable but untrusted AI systems through adversarial prover-verifier interactions in code validation tasks.
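To make the prover-verifier dynamic concrete, here is a toy sketch of the classic interactive proof for graph non-isomorphism, one of the task families the paper draws on. This is an illustration of the general protocol pattern, not the paper's neural implementation; all helper names (`norm`, `run_round`, `honest_prover`) are our own.

```python
import itertools
import random

def norm(edges):
    """Canonicalise an edge list as a frozenset of frozenset pairs."""
    return frozenset(frozenset(e) for e in edges)

def permute(graph, perm):
    """Relabel a graph's vertices according to a permutation."""
    return frozenset(frozenset(perm[v] for v in e) for e in graph)

def isomorphic(g, h, n):
    """Brute-force isomorphism test on n-vertex graphs (fine for tiny n)."""
    return any(permute(g, p) == h for p in itertools.permutations(range(n)))

def honest_prover(g0, g1, h, n):
    """An unbounded prover: identifies which input graph h was drawn from."""
    return 0 if isomorphic(g0, h, n) else 1

def run_round(g0, g1, n, prover):
    """One verifier round: secretly pick a graph, permute it, challenge the prover."""
    b = random.randrange(2)
    perm = list(range(n))
    random.shuffle(perm)
    challenge = permute((g0, g1)[b], perm)
    return prover(g0, g1, challenge, n) == b

# A 4-vertex path vs. a triangle plus an isolated vertex: not isomorphic,
# so an honest, unbounded prover wins every round; if the graphs were
# isomorphic, no prover could beat 1/2 per round.
path = norm([(0, 1), (1, 2), (2, 3)])
triangle = norm([(0, 1), (1, 2), (0, 2)])
accepted = sum(run_round(path, triangle, 4, honest_prover) for _ in range(20))
```

The verifier's work (sampling and relabelling) is cheap, while soundness rests on the prover's inability to distinguish permuted copies of isomorphic graphs; the neural setting replaces both roles with learned agents.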
Source
- Link: https://arxiv.org/abs/2412.08897
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- supervising-ais-improving-ais — Make AI solve it