Weak to Strong Generalization for Large Language Models with Multi-capabilities

Yucheng Zhou, Jianbing Shen, Yu Cheng — 2025-01-22 — ICLR 2025

Summary

Extends weak-to-strong generalization to multi-capability settings. The paper finds that different capabilities remain relatively independent during generalization, and it proposes a two-stage training framework that uses reward models to select valuable weak supervision data while preventing the performance degradation caused by strong-model self-bootstrapping.
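
The data-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_weak_data`, the `keep_fraction` parameter, and the top-fraction ranking rule are all assumptions made for illustration, and the scoring callable stands in for whatever reward model is actually trained. How the selected data is used across the two training stages is not specified in this summary, so the sketch covers the filtering step only.

```python
from typing import Callable, List, Tuple

# A weakly supervised sample: (prompt, label produced by the weak model).
Sample = Tuple[str, str]


def select_weak_data(
    samples: List[Sample],
    score: Callable[[Sample], float],
    keep_fraction: float = 0.5,
) -> List[Sample]:
    """Keep the highest-scoring fraction of weakly labeled samples.

    `score` is a hypothetical stand-in for a reward model; ranking by
    score and keeping a fixed fraction is an assumed selection rule.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Sort from highest to lowest reward score (stable for ties).
    ranked = sorted(samples, key=score, reverse=True)
    # Keep at least one sample when the input is non-empty.
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```

In use, `score` would wrap a trained reward model and the returned subset would feed the strong model's training; a different rule, such as a score threshold, would fit the same interface.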

Key Result

Different capabilities remain relatively independent during weak-to-strong generalization, and the proposed reward-model-based data selection framework significantly improves multi-capability weak-to-strong generalization over baseline methods.

Source