Circuits in Superposition 2: Now With Less Wrong Math
Source
- Link: https://www.lesswrong.com/posts/FWkZYQceEzL84tNej/circuits-in-superposition-2-now-with-less-wrong-math
- Listed in the Shallow Review of Technical AI Safety 2025 under 1 agenda:
- reverse-engineering — White-box safety (i.e. Interpretability)
- Editorial blurb (verbatim):
[2](https://www.lesswrong.com/posts/FWkZYQceEzL84tNej/circuits-in-superposition-2-now-with-less-wrong-math)