Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao, Wenkai Yang, Ziqiao Wang, Yankai Lin, Yong Liu — 2025-03-08 — ICLR 2025 Workshop SSI-FM
Summary
Provides a theoretical analysis of weak-to-strong generalization. In the classification setting, it establishes upper and lower bounds on the strong model’s generalization error and calibration error; in the regression setting, it extends prior work to a KL divergence loss. The theory is validated experimentally.
Key Result
Derives bounds showing that weak-to-strong generalization is limited primarily by the weak model’s generalization error, and that the strong model’s training must be balanced carefully: optimizing too hard against the weak labels leads to over-reliance on the weak supervision signal.
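The over-reliance point can be illustrated with a toy calculation. The sketch below is not taken from the paper: the function names, the two-example batch, and the choice of KL(weak || strong) as the training objective are all assumptions made for illustration. It shows only that a student whose loss against the weak labels reaches zero has copied the weak model's mistakes, while a student that fits the weak labels less tightly can still agree with the true labels.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def weak_supervision_loss(weak_probs, strong_probs):
    """Mean KL(weak || strong) over a batch (assumed direction, for illustration)."""
    total = sum(kl_divergence(w, s) for w, s in zip(weak_probs, strong_probs))
    return total / len(weak_probs)

def accuracy(probs, labels):
    """Fraction of examples whose highest-probability class matches the label."""
    hits = sum(max(range(len(p)), key=p.__getitem__) == y for p, y in zip(probs, labels))
    return hits / len(labels)

# Hypothetical toy batch: both examples truly belong to class 0,
# but the weak teacher misclassifies the second one.
true_labels = [0, 0]
weak = [[0.9, 0.1], [0.3, 0.7]]

# A student that fits the weak labels only partially ...
strong_partial = [[0.8, 0.2], [0.6, 0.4]]
# ... and a student that has been optimized until it matches them exactly.
strong_copy = [[0.9, 0.1], [0.3, 0.7]]

loss_partial = weak_supervision_loss(weak, strong_partial)
loss_copy = weak_supervision_loss(weak, strong_copy)
acc_partial = accuracy(strong_partial, true_labels)
acc_copy = accuracy(strong_copy, true_labels)
```

In this toy batch the exact-copy student has the lower loss against the weak labels but inherits the teacher's error on the second example. The numbers are invented for the illustration and say nothing about the paper's experimental results.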
Source
- Link: https://openreview.net/forum?id=RwYdLgj1S6
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- weak-to-strong-generalization — Make AI solve it