Falsification Map
| Experiment | Prediction | Outcome |
|---|---|---|
| (MARL) | Forcing functions create geometry | Contradicted. All conditions show alignment; removal increases it. |
| (World Model) | increases with evolution | Partial. 100x at bottleneck, flat in general population. |
| (Representation) | Compression and modeling co-emerge | Partial. Co-emerge under bottleneck only. Compression is cheap. |
| (Language) | Compositional communication | Not confirmed. Chemical commons but . |
| (Counterfactual) | Reactive-to-detached transition | Null. Wall at . |
| (Self-Model) | SM emergence with jump | Weak. n=1 event at bottleneck. |
| (Affect Geometry) | Tripartite alignment | Partial. A-C develops over evolution (0.01 to 0.38). A-B null. |
| () | High-ascription default, animism | Confirmed (the cheap one). High , animism > 1.0 in all 20 snapshots. |
| (Normativity) | Exploitation penalty | Null. Requires agency. |
| (Superorganism) | | Not confirmed. Ratio 1-12%, increasing. |
| (Entanglement) | Co-emergence clusters | Not confirmed. Different cluster structure. |
| (Capstone) | Seven criteria for identity thesis | All met (moderate/weak). Geometry confirmed. |
| (Furnace) | Selection vs creation | Creation confirmed 2/3 seeds. |
| ( wall) | | Confirmed. 0.21 from cycle 0. |
| (Prediction) | Prediction → integration | Not confirmed. Linear readout always decomposable. |
| (MLP) | Nonlinear head → | Confirmed (seed 7: 0.245). Seed-dependent. |
| (Width) | Bottleneck width matters | Not confirmed. Mechanism is gradient coupling. |
| / (Social) | Social target lifts | Not confirmed. 3-seed fluke; 10-seed: . |
| (Dual) | Self+social > either | Negative. Gradient imbalance; self colonizes. |
| (Seeds) | Seed distribution | Confirmed: 30/30/40 split. Post-drought bounce . |
| (Autopsy) | First bounce predicts category | Revised: First bounce NOT predictive (). Mean bounce across all droughts IS (). Trajectory, not event. |
| (Language) | Referential communication emerges | Confirmed: 10/10 seeds (100%). But does NOT lift . Language is cheap. |
| Conv. | VLMs recognize affect in protocells (RSA > 0.3) | Confirmed: GPT-4o , Claude . Raw numbers: 0.78, 0.72. |
The honest way to read this table is by stake, not by sum. Line up the confirmations and they share a feature: each is a prediction the theory could hardly have failed. That affect geometry appears under multi-agent survival, that ascription runs high by default, that representation compresses cheaply, that referential language emerges under partial observability, that vision-language models map the same geometry — these are the inexpensive wins. They confirm a geometry of viable control, and they confirm it robustly. Now line up the contradictions and nulls, and they too share a feature: each is one of the signature claims on which an interpretation in terms of consciousness would have rested. Forcing functions do not create integration. World models grow with evolution only at an architectural bottleneck, not in general. Self-models did not emerge under broad priors — the one positive event was a single hand-installed architecture, n=1. Language is not compositional. No superorganism Φ appears. Prediction does not lift integration, at any target, breadth, or horizon. A social target does not lift Φ — the three-seed signal evaporated at ten seeds (). The expensive bets, the ones that would have carried the experiential reading, are exactly the ones that failed or returned null.
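Reading by stake rather than by sum can be made concrete with a small tally. The sketch below is only an illustration. The short claim names are paraphrases of the paragraph above, the `cheap` and `signature` tags are hypothetical labels for its two groupings, and the verdict strings are simplified from the Outcome column. It is not part of the experimental code.

```python
from collections import Counter, defaultdict

# Hypothetical encoding of the grouping described in the text.
# Claim names are paraphrases; "cheap" / "signature" are this sketch's own tags.
ROWS = [
    ("affect geometry under multi-agent survival", "cheap", "confirmed"),
    ("high-ascription default", "cheap", "confirmed"),
    ("cheap compression", "cheap", "confirmed"),
    ("referential language", "cheap", "confirmed"),
    ("VLM recognition of the geometry", "cheap", "confirmed"),
    ("forcing functions create integration", "signature", "contradicted"),
    ("world models grow in general", "signature", "partial"),
    ("self-model emergence", "signature", "weak"),
    ("compositional language", "signature", "not confirmed"),
    ("superorganism integration", "signature", "not confirmed"),
    ("prediction lifts integration", "signature", "not confirmed"),
    ("social target lifts integration", "signature", "not confirmed"),
]


def tally_by_stake(rows):
    """Count verdicts separately for each stake level instead of pooling them."""
    tally = defaultdict(Counter)
    for _claim, stake, verdict in rows:
        tally[stake][verdict] += 1
    return tally


TALLY = tally_by_stake(ROWS)

if __name__ == "__main__":
    for stake, verdicts in TALLY.items():
        print(stake, dict(verdicts))
```

Pooled, the twelve entries look like a mixed record. Split by stake, every clean confirmation falls in the cheap group and none in the signature group, which is the asymmetry the paragraph describes.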
So the honest headline is not “the framework is half-confirmed.” It is this: the program supports a strong theory of the geometry of viable control, and the further claim that this geometry is experienced — that there is something it is like to occupy these configurations — is an adopted posit, not a result. Cross-substrate convergence makes that posit attractive: the same shape recurs in Lenia, in protocells, in LLMs, in human-trained VLMs reading raw numbers, and a shape that universal is tempting to call real all the way down. But attractiveness is not proof, and none of the confirmed predictions touches the experiential question directly. The contradicted ones, which do touch it, went the wrong way.