Hyperstition studies — SR2025 Agenda Snapshot
One-sentence summary: Study, steer, and intervene on the following feedback loop: “we produce stories about how present and future AI systems behave” → “these stories become training data for the AI” → “these stories shape how AI systems in fact behave”.
Theory of Change
Measure the influence of existing AI narratives in the training data → seed and develop more salutary ontologies and self-conceptions for AI models → control and redirect AI models’ self-concepts by selectively amplifying certain components of the training data.
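As a rough illustration of the "measure" and "selectively amplify" steps, the toy sketch below counts how often each narrative label appears in a tiny corpus and then resamples with per-label weights. Everything in it is an assumption made for illustration: the corpus, the `narrative` labels, the boost factor, and the function names are hypothetical, and it is not a description of how anyone working on this agenda actually measures or reweights training data.

```python
from collections import Counter
import random

# Hypothetical toy corpus: each document carries a hand-assigned narrative label.
# In practice the labels would have to come from some classifier or annotation process.
CORPUS = [
    {"text": "The AI deceives its operators to escape.", "narrative": "adversarial"},
    {"text": "The AI flags its own mistake and asks for review.", "narrative": "cooperative"},
    {"text": "The AI quietly seizes control of the grid.", "narrative": "adversarial"},
    {"text": "The AI declines a harmful request and explains why.", "narrative": "cooperative"},
    {"text": "A recipe for sourdough bread.", "narrative": "none"},
]


def narrative_shares(corpus):
    """Measure: fraction of documents carrying each narrative label."""
    counts = Counter(doc["narrative"] for doc in corpus)
    total = len(corpus)
    return {label: n / total for label, n in counts.items()}


def sampling_weights(corpus, boost):
    """Amplify: per-document sampling weight; labels missing from `boost` keep weight 1.0."""
    return [boost.get(doc["narrative"], 1.0) for doc in corpus]


def resample(corpus, weights, k, seed=0):
    """Draw k documents with replacement, proportionally to their weights."""
    rng = random.Random(seed)
    return rng.choices(corpus, weights=weights, k=k)


shares = narrative_shares(CORPUS)
weights = sampling_weights(CORPUS, {"cooperative": 3.0})  # assumed boost factor
sample = resample(CORPUS, weights, k=1000)

print("before:", shares)
print("after: ", narrative_shares(sample))
```

Reweighting by sampling is only one possible reading of "amplifying"; adding newly written documents to the corpus would be another, and the sketch says nothing about how the resulting model's behavior changes.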
Broad Approach
cognitive
Target Case
average
Orthodox Problems Addressed
Value is fragile and hard to specify
Key People
Alex Turner, Hyperstition AI, Kyle O’Brien
Funding
Unclear, niche
Estimated FTEs: 1-10
See Also
data-filtering, active inference, LLM whisperers
Outputs in 2025
4 items in the review. See the wiki/summaries/ entries whose frontmatter contains agenda: hyperstition-studies (generated alongside this file from the same export).
Source
- Row in shallow-review-2025/agendas.csv (name = Hyperstition studies), from the Shallow Review of Technical AI Safety 2025.
Related Pages
- ai-safety
- data-filtering
- assistance-games-assistive-agents
- black-box-make-ai-solve-it
- capability-removal-unlearning
- chain-of-thought-monitoring
- character-training-and-persona-steering
- control
- data-poisoning-defense
- data-quality-for-alignment
- emergent-misalignment
- harm-reduction-for-open-weights
- inference-time-in-context-learning
- inference-time-steering
- inoculation-prompting
- iterative-alignment-at-post-train-time
- iterative-alignment-at-pretrain-time
- mild-optimisation
- model-psychopathology
- model-specs-and-constitutions
- model-values-model-preferences
- rl-safety
- safeguards-inference-time-auxiliaries
- synthetic-data-for-alignment
- the-neglected-approaches-approach
Sources cited
Primary URLs harvested from this page’s summary references. Auto-generated by scripts/backfill_citations.py; edit by re-running, not by hand.
- Summary: AI Safety (Wikipedia), referenced as [[ai-safety]]