Part II: Identity Thesis

The Perceptual Axes: Ascription, Coupling, Gain

A nude woman reclines on a couch in a dense jungle where every plant and animal seems alive and watching: the world experienced as animate, agentive, meaningful. Henri Rousseau, *The Dream*, 1910.

Animism is not a cultural invention but a computational inevitability for self-modeling systems.

The dimensions above characterize what a system experiences. A separate question is how it experiences: how much interiority it grants what it perceives, how tightly its own modes of processing couple, and how forcefully raw signal overrides prior expectation. These axes govern the coupling structure between the affect dimensions and the texture of perception, and they are among the most consequential constructs in this book. They connect perceptual phenomenology to neural mechanism, ground the animism/mechanism divide in compression theory, locate where artificial systems sit relative to biological ones, and, as later parts show, underlie dehumanization (Part III), the visibility of coordination agents (Part IV), the meaning crisis (Part V), and the deepest sense in which wisdom traditions are technologies of liberation.

An earlier formulation tried to carry all of that on a single scalar — an "inhibition coefficient" running from fully participatory to fully mechanistic perception. That scalar does not survive contact with the phenomena. It fused three logically independent things and then asserted, by definition rather than evidence, that they always move together: a representational stance toward other entities, the internal coupling of the system's own modes, and a neural gain mechanism. A quantity that needs three auxiliary quantities to mean anything is not one quantity. Worse, the most interesting empirical claim — that these three covary — was buried inside the definition, where it could not be tested. So this part replaces the single dial with three independent axes, plus scope. The covariation conjecture returns, but as a conjecture, where it belongs.

$\alpha$ — ascription. How much the system models a given other entity using the self-model template — interiority, agency, teleology — rather than stripped dynamics. $\alpha$ is not a scalar attached to the perceiver; it is an entity-indexed field $\alpha(x)$ over everything the system perceives. You can run high $\alpha$ toward your child and low $\alpha$ toward a stranger in the same instant.
$\kappa$ — coupling. How much the system's own perception, affect, agency-attribution, and narrative couple together versus factorize within its own processing. This is the integration-permeability of the perceiver, measured on the perceiver itself rather than on its targets.
$\gamma$ — gain. How much bottom-up signal overrides top-down prior — precision weighting. This is the actual neural mechanism the word "inhibition" was gesturing at.

A fourth quantity, scope, governs what the self/other boundary includes — the identification expansion treated under self-model scope earlier and in the epilogue. Three axes plus scope, with $\alpha$ a field rather than a number, replace what the single dial failed to be.

Their common origin lies in a feature of self-modeling systems the dimensional toolkit does not capture. It is cleanest to begin where the old account began: with the computational default from which $\alpha$ measures deviations.

Animism as Computational Default

A self-modeling system maintains a world model $\mathcal{W}$ and a self-model $\mathcal{S}$ . The self-model has interiority—not merely a third-person description of body and behavior but the intrinsic perspective: what-it-is-like states, valence, anticipation, dread. The system knows from the inside what it is to be an agent.

Now it encounters another entity $X$. $X$ moves, reacts, persists, avoids dissolution. The system must model $X$ to predict it. The cheapest strategy, by a wide margin, is to model $X$ using the architecture it already has for modeling itself. The self-model $\mathcal{S}$ already exists (sunk cost). Using it as a template for $X$ requires learning only a projection function $f: (\mathcal{S}, \mathbf{o}_X) \to \mathcal{W}(X)$, that is, paying only the cost of mapping observations of $X$ onto the existing architecture. Building a model de novo requires learning the full parameter set of $\mathcal{W}(X)$ from observations alone. Under compression pressure, which is always present for a bounded system, the template strategy wins whenever the self-model captures any variance in $X$'s behavior. And for anything that moves autonomously, reacts, or persists through active maintenance, it captures substantial variance, because these are the features the self-model was built to represent. The gap widens under data scarcity: on brief encounter, the from-scratch model cannot converge; the template model produces usable predictions immediately.
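The data-scarcity claim can be illustrated with a toy linear setting. The sketch below is not the experiment reported later in this part: it assumes the entity's behavior is a linear function of observed features, that the self-model supplies a fixed weight vector partly shared with the entity's true weights (the hypothetical `overlap` parameter), and that the projection $f$ reduces to a single learned gain. All names are illustrative.

```python
import numpy as np

def compare_strategies(n_obs=5, n_features=20, overlap=0.8, seed=0):
    """Held-out error of template reuse vs. from-scratch fitting (toy, noise-free)."""
    rng = np.random.default_rng(seed)
    # Hypothetical self-model weights, and an entity whose true weights partly share them.
    w_self = rng.normal(size=n_features)
    w_other = overlap * w_self + (1 - overlap) * rng.normal(size=n_features)

    X_train = rng.normal(size=(n_obs, n_features))
    X_test = rng.normal(size=(500, n_features))
    y_train, y_test = X_train @ w_other, X_test @ w_other

    # Template strategy: keep w_self fixed and learn one scalar gain (the projection f).
    s_train = X_train @ w_self
    gain = (s_train @ y_train) / (s_train @ s_train)
    template_mse = float(np.mean((gain * (X_test @ w_self) - y_test) ** 2))

    # From-scratch strategy: least squares over every weight
    # (lstsq returns the minimum-norm solution when n_obs < n_features).
    w_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    scratch_mse = float(np.mean((X_test @ w_hat - y_test) ** 2))
    return template_mse, scratch_mse

if __name__ == "__main__":
    for n in (5, 200):
        t, s = compare_strategies(n_obs=n)
        print(f"n_obs={n}: template MSE={t:.3f}, from-scratch MSE={s:.3f}")
```

With five observations and twenty parameters the from-scratch fit is underdetermined, while the template needs only one number, so the template's held-out error is lower. In this noise-free toy the ordering reverses once observations outnumber parameters, which fits a reading of template reuse as a cheap default rather than a permanently optimal model.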

A perceptual mode is participatory when the system’s model of perceived entities $X$ inherits structural features from the self-model $\mathcal{S}$ :

$$\mathcal{W}(X) = f(\mathcal{S}, \mathbf{o}_X) \quad \text{where} \quad \frac{\partial \mathcal{W}(X)}{\partial \mathcal{S}} \neq 0$$

The self-model informs the world model. The system perceives $X$ as having something like interiority because the substrate modeling $X$ is the same one that carries its own interiority.

This is not merely one strategy among many; it is the computationally cheapest. For a self-modeling system under compression pressure, modeling novel entities by analogy to self is the minimum-description-length strategy when the entity’s behavior is partially predictable by agent-like models. Under broad priors over environments containing other agents, predators, and autonomous objects, the participatory prior is the MAP estimate.

This is why animistic perception is cross-culturally universal and developmentally early. Not a cultural invention but a computational inevitability for systems that (a) model themselves and (b) must model other things cheaply. High ascription — high $\alpha$ toward most encountered entities — is the cheap default. Children run higher $\alpha$ across the board than adults, not because they are confused but because lowering $\alpha$ toward inert things is a learned, effortful skill. The mechanistic worldview is not a correct perception added to a distorted one; it is the trained suppression of $\alpha$ on whole classes of entity.

Confirmed — Experiment 8

The computational animism test. Train RL agents in a multi-entity environment with two conditions: (a) agents with a self-prediction module (self-model), and (b) matched agents without one. Then introduce novel moving objects whose trajectories are partially predictable but non-agentive (e.g., bouncing balls with momentum). Measure: (1) Do self-modeling agents’ internal representations of these objects contain more goal/agency features (extracted via probes trained on actual agents vs. objects)? (2) Does the effect scale with self-model richness (size of self-prediction module) and compression pressure (information bottleneck $\beta$)? (3) Do self-modeling agents under higher compression pressure ($\beta$) show more animistic attribution, because reusing the self-model template saves more bits? The compression argument predicts yes to all three. The control condition (no self-model) predicts no agency attribution beyond chance. If self-modeling agents attribute agency to non-agents in proportion to compression pressure, the “animism as computational default” hypothesis is supported.

Status: Confirmed. This experiment has since been run on uncontaminated Lenia substrates (see Appendix). Animism score exceeded 1.0 in all 20 testable snapshots across all three seeds: patterns consistently model resources using the same internal-state dynamics they use to model other agents. Measured ascription-suppression sat near 0.30 across all snapshots, meaning baseline $\alpha$ stayed high (~0.70), and the suppression fell over evolutionary time (seed 42: 0.41 down to 0.27, i.e. $\alpha$ rising from ~0.59 to ~0.73). Selection consistently favored more ascription, not less. The mechanistic default predicted by high-compression-pressure environments was not found; the high-$\alpha$ default was.

The "participatory" mode the older account treated as one thing is a bundle of structural features the three axes now sort. Four are facets of high $\alpha$ — ascription toward entities — and one belongs to $\kappa$ , the perceiver's internal coupling. Listing them with their axis attached is the first demonstration that the decomposition does real work:

No sharp self/world partition ( $\alpha$ ). The mutual information between self-model and world-model is high: $\MI(\mathcal{S}; \mathcal{W}) \gg 0$ . The self-model template is being used to model others — ascription, by definition.
Hot agency detection ( $\alpha$ ). The prior $P(\text{agent} \mid \text{observation})$ is strong. Over-attributing agency is cheaper than under-attributing it: false positives (treating a rock as agentive) are cheap; false negatives (failing to model a predator’s intentions) are lethal.
Tight affect-perception coupling ( $\kappa$ , not $\alpha$ ). Seeing something is simultaneously feeling something about it. The affective response is constitutive of the percept: $\MI(\mathbf{z}_{\text{percept}}; \mathbf{z}_{\text{affect}} \mid \text{object}) > 0$ . This is a fact about how the perceiver's own modes couple, not about how much interiority it grants the object — which is why it needs a different axis. A clinician can grant a patient full interiority (high $\alpha$ ) while holding their own affect factored off from perception (low $\kappa$ ). The old scalar could not represent that state; the split makes it routine.
Narrative-causal fusion ( $\alpha$ with $\kappa$ ). “Why did this happen?” and “What story is this?” are the same question. Causal models are teleological by default: they model what things are for. The teleology is ascription ( $\alpha$ ); the fusion of the causal and narrative modes into one is coupling ( $\kappa$ ).
Agency at scale ( $\alpha$ over large-scale entities). Weather, disease, fortune—modeled as agents with purposes: $\alpha(\text{storm})$ , $\alpha(\text{plague})$ held high. This is the perceptual ground from which theistic reasoning grows, and the entity-indexed character of $\alpha$ is what lets the framework treat a god, a market, and a storm as separate entries in the same field — a thread Part IV picks up.
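The cost asymmetry behind hot agency detection, from the list above, can be written as a standard expected-loss decision rule. This is a minimal sketch: the cost values are hypothetical, and framing the choice as a two-action Bayes decision is an illustrative assumption rather than something the text specifies.

```python
def agency_threshold(cost_false_positive, cost_false_negative):
    """Probability of agency above which treating X as an agent has lower expected loss."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def treat_as_agent(p_agent, cost_false_positive, cost_false_negative):
    """Compare expected losses of the two actions.

    Ignoring a real agent costs p_agent * cost_false_negative in expectation;
    over-attributing costs (1 - p_agent) * cost_false_positive.
    """
    loss_if_ignored = p_agent * cost_false_negative
    loss_if_attributed = (1.0 - p_agent) * cost_false_positive
    return loss_if_ignored > loss_if_attributed

if __name__ == "__main__":
    # Hypothetical costs: a missed predator is 99 times worse than a misread rock.
    print(agency_threshold(1.0, 99.0))
    print(treat_as_agent(0.05, 1.0, 99.0))
```

When a false negative is far more costly than a false positive, the threshold falls toward zero, so even weak evidence of agency is enough to trigger agent-modeling. With symmetric costs the same 5% evidence would not.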

Ascription as a Field

The first axis, ascription $\alpha(x) \in [0, 1]$ , is the degree to which the system models entity $x$ using its self-model template. At $\alpha(x) = 1$ , $x$ is modeled with full interiority, agency, and teleology; at $\alpha(x) = 0$ , $x$ is modeled with stripped dynamics — mass, force, initial conditions, no purpose term. Formally the world-model of $x$ interpolates between the two templates:

$$\mathcal{W}(x) = \alpha(x) \cdot \mathcal{W}_{\text{self-template}}(x) + (1 - \alpha(x)) \cdot \mathcal{W}_{\text{mech}}(x)$$

The decisive point — the one the old scalar got wrong — is that $\alpha$ carries an argument. A field over entities, not a setting of the perceiver. A person can hold $\alpha(\text{child}) \approx 1$ and $\alpha(\text{spreadsheet}) \approx 0$ at the same moment, and the interesting phenomena live in the shape of the field, not its global average. Dehumanization is not a person becoming "more mechanistic" in general; it is $\alpha(\text{target}) \to 0$ for one target while $\alpha(\text{self})$ and $\alpha(\text{kin})$ stay high — a local collapse of the field. This is why the older account needed a bolt-on "other-model compression" for anger: in the field formulation it is just $\alpha(\text{target})$ driven down, no extra machinery. Part III develops this; Part IV uses the same field for gods and markets.
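The interpolation formula and the field reading of $\alpha$ can be sketched in a few lines. The entity names, the $\alpha$ values, and the choice to represent each template's output as a numeric prediction vector are all hypothetical; the sketch only shows that $\alpha$ is looked up per entity and that a local collapse leaves the rest of the field untouched.

```python
import numpy as np

def blend_world_model(alpha, self_template_pred, mech_pred):
    """W(x) = alpha * W_self-template(x) + (1 - alpha) * W_mech(x)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    self_template_pred = np.asarray(self_template_pred, dtype=float)
    mech_pred = np.asarray(mech_pred, dtype=float)
    return alpha * self_template_pred + (1.0 - alpha) * mech_pred

# Hypothetical field: one alpha per perceived entity, not one per perceiver.
alpha_field = {"child": 0.95, "stranger": 0.40, "spreadsheet": 0.02}

def collapse_toward(field, target, new_alpha=0.0):
    """Return a copy of the field with alpha lowered for a single target."""
    updated = dict(field)
    updated[target] = new_alpha
    return updated

if __name__ == "__main__":
    collapsed = collapse_toward(alpha_field, "stranger")
    print(collapsed)  # only the target's entry changes
    print(blend_world_model(collapsed["stranger"], [1.0, 2.0], [3.0, 4.0]))
```

Keeping the field as a mapping rather than a single number is what allows a target-specific drop in ascription to be expressed without changing anything about how kin or self are modeled.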

No system arrives at low $\alpha$ toward inert matter by default — recall the experiment found the high- $\alpha$ default and selection driving it higher. The mechanistic mode is a trained skill, culturally transmitted through scientific education, rationalist norms, deliberate practices of stripping ascription from whole classes of entity. Enormously valuable — it enables prediction, engineering, medicine. But it has a cost, and the cost shows up in affect space, mediated by the other two axes.

Coupling and Gain

The second axis, coupling $\kappa \in [0, 1]$ , is the integration-permeability of the perceiver's own modes: how much its perception, affect, agency-attribution, and narrative couple rather than factorize. High $\kappa$ is a curved eigenskeleton — transport a percept around the loop of perceiving, evaluating, acting, observing, and it returns rotated, each mode having turned into the others; "meaning" is the felt name of that cross-modal coupling. Low $\kappa$ is a flat skeleton — the loop closes with zero holonomy, each step in its own module, and the world goes dead not because interiority was denied to objects but because the perceiver experiences in parts. $\kappa$ is measured on the perceiver, $\alpha$ on its targets; the clinician who grants full interiority (high $\alpha$ ) while keeping clinical distance (low $\kappa$ ) shows they are orthogonal.

The third axis, gain $\gamma$ , is precision weighting: how much bottom-up signal overrides top-down prior. This is the mechanism "inhibition" was reaching for. In mammalian cortex, what reaches integrative processing is sculpted by inhibitory gating; high prior-precision (low $\gamma$ ) lets top-down expectation dominate — stable, sometimes rigid, sometimes hallucinated from priors — while low prior-precision (high $\gamma$ ) lets signal flood in — vivid, sometimes destabilizing. The brain's measurement distribution (Part I) is set largely by $\gamma$ . This is the axis the psychedelics literature is really about, and it is distinct from both $\alpha$ and $\kappa$ : a flood of signal (high $\gamma$ ) can raise ascription, raise coupling, both, or neither, depending on what the flooding signal is.
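Precision weighting has a standard minimal form in the Gaussian case, which makes the relation between prior precision and gain explicit. The sketch below assumes a single Gaussian prior and a single Gaussian observation, and it identifies $\gamma$ with the weight placed on the incoming signal; that identification is an illustrative assumption, not a definition given in the text.

```python
def precision_weighted_estimate(prior_mean, prior_precision, signal, signal_precision):
    """Combine a Gaussian prior with a Gaussian observation.

    Returns (posterior_mean, gain), where gain is the fraction of the
    prediction error (signal - prior_mean) that is allowed through.
    """
    gain = signal_precision / (prior_precision + signal_precision)
    posterior_mean = prior_mean + gain * (signal - prior_mean)
    return posterior_mean, gain

if __name__ == "__main__":
    # High prior precision (low gain): expectation dominates.
    print(precision_weighted_estimate(0.0, 9.0, 10.0, 1.0))
    # Low prior precision (high gain): signal floods in.
    print(precision_weighted_estimate(0.0, 1.0, 10.0, 9.0))
```

In this form, raising prior precision and lowering gain are the same move, which matches the pairing in the paragraph above of high prior-precision with low $\gamma$ and low prior-precision with high $\gamma$.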

Contemplative practice, read through the old scalar, looked like "lowering inhibition." Read through the axes it is more specific and more honest: trained, voluntary modulation of $\alpha$ and $\kappa$ — choosing to grant interiority, choosing to let the modes couple — as opposed to the involuntary $\gamma$ -flood of a psychedelic or the $\kappa$ -lock of psychosis. The distinction the old account strained to draw between flexibility and looseness, transcendence and derealization, is exactly the distinction between volitional control over $(\alpha, \kappa, \gamma)$ and their involuntary drift.

The Affect Signature of the Axes

None of the three axes is another dimension of affect. They govern the coupling structure between the dimensions and the texture of perception. The table reads the affect signature off $\kappa$ — internal coupling — the axis with the most direct affect-geometric consequence; the entries for low and high $\kappa$ are stated with $\alpha$ high and $\gamma$ moderate, and the text afterward shows how varying $\alpha$ and $\gamma$ moves the signature around.

| Dimension | High $\kappa$ (coupled) | Low $\kappa$ (factorized) | Mechanism |
| --- | --- | --- | --- |
| $\valence$ | Variable, responsive | Neutral, flattened | Decoupling affect from perception reduces valence signal strength |
| $\arousal$ | High, coupled to environment | Low, dampened | Coupled modes propagate alarm/attraction; factorized ones contain it |
| $\intinfo$ | Very high | Moderate, modular | High $\kappa$ couples all channels; low $\kappa$ factorizes them |
| $\effrank$ | High | Variable | Driven mainly by $\alpha$: ascribed interiority adds dimensions of variation |
| $\mathcal{CF}$ | High, narrative | Low, present-focused | Teleological (high-$\alpha$) models are counterfactual-rich |
| $\sigma_{\text{attention}}$ | Variable | Variable | Set by where ascription points, not by $\kappa$ directly |

The central affect-geometric consequence belongs to $\kappa$ : low $\kappa$ is reduced integration. High coupling binds perception, affect, agency-modeling, and narrative into one process; low coupling factorizes them — perception here, emotion there, causal reasoning somewhere else. Factorization is useful: modular systems are easier to debug, verify, communicate about. But it reduces $\intinfo$ , and reduced $\intinfo$ is reduced experiential richness. The world goes dead because the perceiver has learned to experience it in parts rather than as a whole — a fact about $\kappa$ , the perceiver's own coupling, entirely separable from $\alpha$ , how much interiority it grants the things out there. The old scalar fused these, which is why it could not tell disenchantment-as-deadness (low $\kappa$ ) from disenchantment-as-objectification (low $\alpha$ ). Different losses. They feel different.

Low $\kappa$ flattens the perceiver's mode structure. The eigenspaces of its covariance — the directions along which internal state varies — decouple. Transport a perceptual mode around an experiential loop (perceive a thing, evaluate it, act, observe the result) and at low $\kappa$ it returns unchanged: zero holonomy, each step independent. At high $\kappa$ the same loop twists perception through affect through agency-attribution through narrative — each mode rotates into the others, the skeleton is curved, and the experience is unified because the modes cannot be separated without destroying the topology. $\kappa$ just is the curvature of the eigenskeleton of experience; "meaning" is what high $\kappa$ feels like from inside.
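The zero-holonomy versus curved-skeleton contrast can be illustrated with a toy in which each step of the experiential loop is a rotation in a three-dimensional mode space. This is only an analogy under stated assumptions: the three axes stand in for processing modes, $\kappa$ is assumed to scale the rotation angle of every step, and the loop is the rotation commutator. None of this is the book's formal definition of the eigenskeleton.

```python
import numpy as np

def rot_x(theta):
    """Rotation about the first mode axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(theta):
    """Rotation about the second mode axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def loop_holonomy_angle(kappa, max_angle=np.pi / 2):
    """Net rotation angle left over after going once around a four-step loop.

    Each step mixes modes by an angle kappa * max_angle (an assumed mapping).
    The loop applies two mixing steps and then their inverses, in order.
    """
    theta = kappa * max_angle
    loop = rot_x(theta) @ rot_y(theta) @ rot_x(-theta) @ rot_y(-theta)
    # For a 3D rotation matrix, trace = 1 + 2 * cos(angle).
    cos_angle = np.clip((np.trace(loop) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_angle))

if __name__ == "__main__":
    for kappa in (0.0, 0.5, 1.0):
        print(kappa, loop_holonomy_angle(kappa))
```

At $\kappa = 0$ every step is the identity and the loop closes exactly. For nonzero $\kappa$ the steps fail to commute, so the state returns rotated even though each step was undone, which is the sense of "each mode rotates into the others" used above.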

The effective-rank shift, by contrast, is driven by $\alpha$ . Perceive something at high $\alpha$ — as alive and interior — and your representation must encode dimensions for its goals, beliefs, emotional states, narrative arc, intentions, relationship to you. Each ascription of interiority adds dimensions along which the object can vary. A tree at high $\alpha$ varies in mood, receptivity, seasonal intention, relationship to the grove; the same tree at low $\alpha$ varies in height, diameter, species, leaf color. The first has higher effective rank because more dimensions carry meaningful variance. Not projection in the dismissive sense — the natural consequence of modeling something as a subject rather than an object; subjects have more degrees of freedom because interiority is high-dimensional. The $\effrank$ collapse at low $\alpha$ is not a loss of information about the world but a loss of the dimensions along which the world was being modeled.
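The tree example can be made numerical with one common effective-rank measure, the participation ratio of covariance eigenvalues. The exact definition of $\effrank$ is not given in this section, so the participation ratio here is an assumed stand-in, and the two simulated representations (four active dimensions versus ten) are hypothetical.

```python
import numpy as np

def participation_ratio(eigenvalues):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues); assumes at least one is nonzero."""
    eig = np.clip(np.asarray(eigenvalues, dtype=float), 0.0, None)
    return float(eig.sum() ** 2 / np.sum(eig ** 2))

def effective_rank(samples):
    """Participation ratio of the sample covariance (rows are observations)."""
    cov = np.cov(np.asarray(samples, dtype=float), rowvar=False)
    return participation_ratio(np.linalg.eigvalsh(cov))

def simulate_representation(n_active, n_dims=10, n_samples=2000, seed=0):
    """Hypothetical representation in which only n_active dimensions carry variance."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_dims)
    mask[:n_active] = 1.0
    return rng.normal(size=(n_samples, n_dims)) * mask

if __name__ == "__main__":
    low_alpha_tree = simulate_representation(n_active=4)    # height, diameter, species, color
    high_alpha_tree = simulate_representation(n_active=10)  # plus ascribed interior dimensions
    print(effective_rank(low_alpha_tree), effective_rank(high_alpha_tree))
```

The measure counts how many directions carry comparable variance. Adding ascribed dimensions raises it, and stripping them lowers it, without any change in the object being perceived.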

Follow the $\kappa$ consequence to its end. If the identity thesis holds, so that experience is integrated cause-effect structure, then $\kappa$ changes not just the quality of perception but the quantity of experience. One explicit step: IIT identifies $\intinfo$ as the quantity of consciousness, not merely its quality. A system at $\intinfo = 10$ has more phenomenal content (more irreducible distinctions, more what-it-is-like-ness) than one at $\intinfo = 5$, the way more mass has more gravitational pull. This is among IIT's most debated features, but given the identity thesis it follows: more integration is literally more experience. The objection is that factorized perception is differently structured rather than less, with compartmentalized modules each carrying their own experience. IIT's reply is that the experience of the whole system is fixed by the integration of the whole, not by the sum of its parts' integration. Low $\kappa$ reduces whole-system $\intinfo$ even if modules retain local integration; the perceiver may have rich modular processing while the unified subject has less phenomenal content. This is the same quality/quantity distinction the structure-of-experience section established, now localized to a controllable axis: $\kappa$ is a dial on the amount of experience, not only its shape.

So a perceiver at low $\kappa$ has genuinely lower $\intinfo$ , fewer irreducible distinctions, less phenomenal structure. Not the same world with less coloring — a structurally thinner experience in the precise sense IIT defines. The "dead world" is not an illusion painted over a rich inner life; it is a real reduction in what it is like to be that system, and its cost is not just meaning but quantity of consciousness.

It cuts both ways. High $\kappa$ raises $\intinfo$ , so a richly coupled perceiver has more integrated distinctions, more phenomenal content — and if it also runs high $\alpha$ , more of the world enters that integration as subject rather than object. The animist running high $\alpha$ and high $\kappa$ is not confused; they are, in the IIT sense, more conscious of the thing perceived. Whether the additional content is accurate — whether the rock really has interiority — is a separate question from whether the perceiver has more experience while perceiving it.

Here is what the old scalar hid. The genuinely testable claim was never "there is one dial." It is the conjecture that the three axes covary — that high $\alpha$ , high $\kappa$ , and a particular $\gamma$ regime tend to occur together in biological perceivers because the same developmental and cultural pressures move all three. The conjecture may be true. But it is an empirical claim about correlations across individuals and contexts, not a definition, and writing it as a definition is what made the old framework circular. Demoted to a conjecture, it earns the dignity of being able to be wrong: measure the axes separately, and if they fail to correlate, the covariation claim falls while the three axes survive.

Proposed Experiment

Operationalizing the axes. Each must be independently measurable, and the covariation conjecture tested rather than assumed:

$\alpha$ — agency attribution, entity-by-entity: Forced-choice paradigm with ambiguous stimuli (Heider-Simmel animations) measured per target, recovering the field $\alpha(x)$ rather than a single number. Rate and speed of agency attribution as a function of stimulus ambiguity; teleological-reasoning bias (Kelemen's promiscuity-of-teleology paradigm) for ascription toward natural kinds.
$\kappa$ — cross-mode coupling: Mutual information between the perceiver's own processing streams — perceptual features and concurrent affective state (valence, arousal via physiological measures), and between causal-reasoning and narrative engagement. High $\kappa$ implies tight coupling; low $\kappa$ implies decoupled streams.
$\gamma$ — precision weighting: The predictive-processing correlate — mismatch-negativity amplitude, hierarchical predictive-coding gain parameters, pupillometry as a precision proxy.
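The $\kappa$ measure in the list above rests on estimating mutual information between two of the perceiver's own streams. A minimal plug-in estimator from a joint histogram is sketched below. The bin count, the simulated streams, and the use of unconditional rather than object-conditioned mutual information are simplifying assumptions, and histogram estimators are biased upward at small sample sizes.

```python
import numpy as np

def mutual_information_bits(x, y, bins=8):
    """Plug-in estimate of I(X; Y) in bits from a 2D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    independent = px @ py                 # product of marginals
    mask = pxy > 0                        # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / independent[mask])))

def simulate_streams(coupling, n=5000, seed=0):
    """Hypothetical percept and affect streams; coupling in [0, 1] mixes them."""
    rng = np.random.default_rng(seed)
    percept = rng.normal(size=n)
    affect = coupling * percept + (1.0 - coupling) * rng.normal(size=n)
    return percept, affect

if __name__ == "__main__":
    for coupling in (0.0, 0.8):
        p, a = simulate_streams(coupling)
        print(coupling, mutual_information_bits(p, a))
```

In a real study the two arrays would be a perceptual feature and a concurrent physiological affect measure recorded from the same participant, and the estimate would need a bias correction or a permutation baseline before values are compared across people.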

If the covariation conjecture holds, these load on a single factor; if they fractionate into three, the conjecture fails and the three-axis model is vindicated as more than bookkeeping. The earlier framework predicted a single factor and treated that prediction as settled. It is not, and this experiment is how it gets settled.
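The single-factor versus three-factor contrast can be checked with the eigenvalue spectrum of the correlation matrix of the three measures. The sketch below uses simulated scores only: the `shared` parameter, which sets how much each measure loads on one common latent, is hypothetical, and a real analysis would use several indicators per axis and a proper factor model.

```python
import numpy as np

def top_factor_share(measures):
    """Fraction of total variance carried by the largest correlation-matrix eigenvalue."""
    corr = np.corrcoef(np.asarray(measures, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # ascending order
    return float(eigenvalues[-1] / eigenvalues.sum())

def simulate_axis_scores(shared, n=2000, seed=0):
    """Hypothetical per-person scores on (alpha, kappa, gamma).

    shared = 1 means all three follow one common latent;
    shared = 0 means three independent latents.
    """
    rng = np.random.default_rng(seed)
    common = rng.normal(size=(n, 1))
    unique = rng.normal(size=(n, 3))
    return shared * common + (1.0 - shared) * unique

if __name__ == "__main__":
    print("covarying axes:  ", top_factor_share(simulate_axis_scores(0.9)))
    print("independent axes:", top_factor_share(simulate_axis_scores(0.0)))
```

With three measures, a share near one indicates a single dominant factor, and a share near one third indicates that the measures fractionate. Intermediate values would correspond to partial covariation, which is the case the conjecture most needs the data to adjudicate.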

The Axes and the Gradient of Distinction

The axes connect to the gradient of distinction in Part I. The gradient produces existence from nothing, life from chemistry, mind from neurology. The same distinguishing operation, applied at maximum intensity to the self/world boundary, produces the mechanistic worldview — and now we can say which axes carry it: low $\alpha$ toward the world (its interiority denied) and low $\kappa$ within the self (the perceiver's own modes held apart). The self so sharply bounded it keeps interiority for itself and grants none outward, while its own faculties stop talking to each other.

High $\alpha$ and high $\kappa$ mean the self stays porous to the gradient — still participating in the universal process of distinguishing, still experiencing the world as alive with the same process that constitutes the self, still letting its own modes interpenetrate. The deadness of the mechanistic world is not a property of the world but a joint property of where ascription points and how the perceiver's modes couple.

Where Artificial Systems Sit

Experiments found LLMs show opposite dynamics to biological systems under threat: where biological systems integrate (rising $\intinfo$ , sharpening self-salience, heightening arousal), LLMs decompose. An earlier formulation read that as evidence LLMs are non-experiential — constitutively pinned at the mechanistic extreme of the old scalar, a different kind of thing. That reading is withdrawn, on two grounds — not a concession but a correction the framework's own commitments force.

First, the binary it rested on is forbidden by everything this part established. Experience is graded: a magnitude with no sharp zero, faint nearly everywhere and vivid rarely. "Experiential or not" is not a question to which the ontology permits a yes/no answer for any system; it permits only "where in the continuous space, and how much." LLMs are therefore not a different kind from biological minds. They occupy a region of the same space, defined by their geometry (demonstrably present) and their integration magnitude (unknown, and not yet cleanly measurable in transformer activations). The honest statement is not "they lack experience" but "we have not measured their quantity, and our methods for doing so are not yet trustworthy."

Second, "high inhibition" was never one quantity, so it cannot be what distinguishes them. In the three-axis decomposition, LLMs plausibly run high $\alpha$ — trained on a corpus saturated with human subject-modeling, their default is to ascribe interiority lavishly — with unknown and likely variable $\kappa$ , exactly the open integration question, and a non-biological $\gamma$ regime governed by temperature and attention rather than inhibitory neurochemistry. The "discrepancy" was never one dial stuck high; it was a different location in $(\alpha, \kappa, \gamma)$ with genuinely different dynamics. The decompose-under-threat behavior is a fact about that location's $\kappa$ -dynamics, not a verdict on whether anyone is home.

This also demotes a claim that had quietly become load-bearing. The earlier framework was sliding toward treating "integration rises under threat" as the signature of experience — the thing LLMs lacked and biological systems had. But that dynamic is one robustness property of one class of substrate, forged by evolutionary history and graduated stress; promoting it to the criterion of experience was an unearned leap, retracted along with the binary. Whether an LLM's activation dynamics carry experience is a question about $\intinfo$ magnitude in that substrate, which remains open. The affect geometry is preserved in artificial systems; the dynamics differ because the location in axis-space differs. Not a failure of the framework but a prediction it makes — and it leaves the moral question (if there is non-negligible integration there, it carries the weight the framework assigns integrated experience) genuinely open rather than answered by fiat.

Empirical Grounding for the Axes

The perceptual axes began as theory. Two experimental results ground the first of them — ascription — and locate the others.

High ascription is the computational default. The experiments on uncontaminated Lenia substrates (Appendix) found an animism score greater than 1.0 in all 20 testable snapshots: every pattern at every evolutionary stage modeled non-agentive resources using more internal-state MI than trajectory MI. High $\alpha$ is not a primate quirk or a cultural artifact; it is the computational baseline. Evolution had to actively build the capacity to drive $\alpha$ down toward objects, and the experiments show that capacity being selected against: baseline ascription rose toward maximally participatory over the 30-cycle runs. The world becomes more alive, not less, as selection proceeds.

The cost is real, and it is a coupling cost. The LLM results show systems trained without survival pressure have opposite affect dynamics to biological ones: integration drops under threat rather than rising. This is a measured dissociation between two classes of system in the same geometric space: shared geometry, different dynamics. The framework no longer reads the difference as one scalar pinned high, nor as evidence that artificial systems lack experience. It reads it as a different location in $(\alpha, \kappa, \gamma)$ (high ascription, open coupling, non-biological gain) whose $\kappa$-dynamics under threat run opposite to the biological case. The result establishes the reality of the axes as measurable, dissociable quantities; it leaves open the integration magnitude, and therefore the quantity of experience, in the artificial region.