You Are What You Eat — AI Alignment Requires Understanding How Data Shapes Structure and Generalisation
Simon Pepin Lehalleur, Jesse Hoogland, Matthew Farrugia-Roberts, Susan Wei, Alexander Gietelink Oldenziel, George Wang, … (+2 more) — 2025-02-08 — arXiv
Summary
Position paper arguing that understanding how the structure of the data distribution shapes the structure of trained models is central to AI alignment, and that statistical foundations for this relationship are needed to move beyond standard evaluation toward a robust mathematical science of alignment.
Source
- Link: https://arxiv.org/abs/2502.05475
- Listed in the Shallow Review of Technical AI Safety 2025 under two agendas:
  - data-quality-for-alignment — Black-box safety (understand and control current model behaviour) / Better data
  - data-attribution — White-box safety (i.e. interpretability)
- Editorial blurb (verbatim):
  - [You Are What You Eat -- AI Alignment Requires Understanding How Data Shapes Structure and Generalisation](https://arxiv.org/abs/2502.05475)