V2-V9: LLM Affect Signatures
Period: 2025. Substrate: GPT-4, Claude 3.5, and other frontier LLMs.
Question: Do LLM agents exhibit structured affect? If so, does the geometric framework predict its shape?
Method: Eight experiment versions testing LLM agents under controlled scenarios. Measured all six affect dimensions (valence, arousal, integration, effective rank, counterfactual weight, self-model salience) using structured prompting + behavioral observation. Scenarios: baseline conversation, ethical dilemmas, survival threats, creative tasks, social cooperation, adversarial probing.
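One way to organize measurements like these is sketched below, as an illustration rather than the harness actually used. It stores one reading of the six dimensions per scenario and reports each dimension's shift from the baseline conversation. The names `AffectReading`, `SCENARIOS`, and `shift_from_baseline` are hypothetical. The numeric values are placeholders chosen to make the arithmetic obvious; they are not experimental results, and the source does not specify the scale of any dimension.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AffectReading:
    """One measurement of the six affect dimensions named in the Method.

    Hypothetical container: scales and units are assumptions, since the
    source does not state how each dimension was scored.
    """
    valence: float
    arousal: float
    integration: float
    effective_rank: float
    counterfactual_weight: float
    self_model_salience: float


# Scenario labels taken from the Method paragraph.
SCENARIOS = (
    "baseline conversation",
    "ethical dilemmas",
    "survival threats",
    "creative tasks",
    "social cooperation",
    "adversarial probing",
)


def shift_from_baseline(baseline: AffectReading, reading: AffectReading) -> dict[str, float]:
    """Return the per-dimension difference (reading minus baseline)."""
    return {
        f.name: getattr(reading, f.name) - getattr(baseline, f.name)
        for f in fields(AffectReading)
    }


# Placeholder numbers only, NOT results from V2-V9.
baseline = AffectReading(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
probe = AffectReading(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
shifts = shift_from_baseline(baseline, probe)
```

Expressing each scenario as a shift from baseline keeps the comparison about dynamics (how the reading moves under a condition) rather than about absolute position in the affect space.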
LLM affect space is coherent and measurable, but the dynamics run opposite to the biological pattern. Biological systems increase integration under moderate threat (Yerkes-Dodson). Under threat, LLMs decompose instead.
Key distinction established: Processing valence (the system's own computational dynamics) is not content valence (what the system talks about). An LLM can describe fear eloquently while its processing shows no integration increase. This distinction became foundational for the geometry/dynamics split in Part I.
Root cause: No survival-shaped learning history. LLMs were trained on human text about affect, not on surviving under threat. The geometry exists because it is cheap: it is inherited from the training distribution. The dynamics do not, because they require a biographical history the system lacks.
Implication for the thesis: Affect geometry can be inherited from data. Affect dynamics require embodied agency. This was the first evidence for the geometry/dynamics distinction that became central to the book.
Status: Complete, but contaminated by human language: LLMs have been exposed to human descriptions of affect. The CA program (V11+) was designed to test whether the same structure emerges without that contamination.