Challenges and Future Directions of Data-Centric AI Alignment

Min-Hsuan Yeh, Jeffrey Wang, Xuefeng Du, Seongheon Park, Leitian Tao, Shawn Im, … (+1 more) — 2025-05-01 — arXiv

Summary

Position paper advocating for data-centric AI alignment. Through a qualitative analysis of 160 Anthropic-HH samples, it identifies six sources of unreliability in human feedback and proposes seven research directions for improving feedback collection, data cleaning, and verification.

Key Result

Of the re-annotated samples, 25% contradicted their original labels and another 25% were marked ‘both are bad’. Fleiss’ kappa was 0.46, indicating moderate inter-annotator agreement. The analysis also identified six distinct sources of unreliability.
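
For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rating counts. The function name `fleiss_kappa`, the three label categories, and the example counts are hypothetical and chosen only for illustration; they are not the paper's data or code, and the sketch assumes every item is rated by the same number of annotators.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Assumes every item received the same number of ratings (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two ratings per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: for each item, the fraction of annotator pairs
    # that agree, averaged over all items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    proportions = [
        sum(row[j] for row in counts) / total for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical counts, NOT the paper's data: 4 items, 3 annotators each,
# categories = [prefers response A, prefers response B, both are bad].
example_counts = [
    [3, 0, 0],
    [1, 2, 0],
    [0, 1, 2],
    [0, 3, 0],
]
print(f"kappa = {fleiss_kappa(example_counts):.3f}")
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. Under the commonly used Landis and Koch reading, values between 0.41 and 0.60 are described as moderate, which is the band the reported 0.46 falls into.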

Source