The Compression Paradox
Why formalization is often the opposite of understanding
I. Two Kinds of Compression
Every complex idea must be compressed to travel. Darwin's theory compresses into "survival of the fittest." Adam Smith's insight compresses into "the invisible hand." Dawkins's framework compresses into "the selfish gene."
Each compression feels like understanding. You hear the phrase, something clicks, and you move on with the sense of having grasped the concept. The compression did its job: it made a complex idea portable.
But each of these compressions is wrong—not in the sense of being slightly imprecise, but in the sense of mapping the wrong feature of the original idea. Each produces confident misunderstanding: the listener walks away believing they understand the concept, holding a model that is structurally different from the original.
This is not a problem of sloppy communication. It is a structural feature of how compression works. There are two fundamentally different ways to compress an idea, and one of them reliably distorts it.
Dedre Gentner's structure-mapping theory, developed across four decades of cognitive science research, provides the mechanism. Compression preserves either surface attributes (what things look like) or relational structure (how parts interact). These are not two points on a spectrum. They are two different operations that produce two different results.
| | Attribute Substitution | Relational Isomorphism |
|---|---|---|
| Preserves | Surface features | Causal structure |
| Maps | What the thing looks like | How the thing works |
| Feels like | Understanding | Understanding |
| Actually produces | Confident misunderstanding | Transferable insight |
| Compressive efficiency | Very high (slogans, labels) | Moderate (diagrams, models) |
| Error detection | Difficult — the compression feels right | Easier — structural mismatch is visible |
The paradox: The most compressive representations—slogans, labels, formalisms—are the most likely to be attribute substitutions. The representations that preserve causal structure—diagrams, worked examples, mechanism descriptions—compress less efficiently. The tradeoff is between portability and fidelity. Maximum portability produces maximum distortion.
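The distinction between the two operations can be made concrete with a toy scoring function. This is a minimal sketch, not Gentner's structure-mapping engine: the `Domain` type, the `overlap` and `compare` helpers, and the feature lists are all assumptions introduced for illustration, and the roles ("center", "orbiter") are aligned by hand, which is the hard part a real model has to solve. The example uses the solar system and atom analogy, a standard case in the analogy literature.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    """Toy description of a domain, with roles already aligned by hand."""
    attributes: frozenset  # surface features of the central object
    relations: frozenset   # (relation, role_a, role_b) triples


def overlap(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap: 0.0 for disjoint sets, 1.0 for identical sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare(source: Domain, target: Domain) -> dict:
    """Score a mapping separately on surface attributes and on relations."""
    return {
        "attribute_overlap": overlap(source.attributes, target.attributes),
        "relational_overlap": overlap(source.relations, target.relations),
    }


# Illustrative feature lists, chosen by hand for this sketch.
ORBIT_RELATIONS = frozenset({
    ("attracts", "center", "orbiter"),
    ("revolves_around", "orbiter", "center"),
    ("more_massive_than", "center", "orbiter"),
})
solar_system = Domain(frozenset({"hot", "yellow", "enormous"}), ORBIT_RELATIONS)
atom = Domain(frozenset({"tiny", "electrically_charged"}), ORBIT_RELATIONS)

scores = compare(solar_system, atom)
```

With these hand-picked features, `scores` reports an attribute overlap of 0.0 and a relational overlap of 1.0: the two domains look nothing alike and work the same way. A slogan-style compression tends to produce the reverse profile, with a high score on salient attributes and a low score on relations.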
II. Three Canonical Distortions
The mechanism is clearest in the three most influential compressions in the history of ideas.
"Survival of the Fittest"
Darwin's actual mechanism: differential reproduction in a population with heritable variation under environmental constraint. The "fit" organism is not the strongest, fastest, or most aggressive. It is the one whose traits happen to produce more surviving offspring in the current environment. Fitness is relational (organism × environment), not absolute (strongest wins). Cooperation, camouflage, niche specialization, and parasitism are all "fitness" strategies. The mechanism is statistical, not gladiatorial.
What "survival of the fittest" maps: competition. Strongest wins. Nature red in tooth and claw. A dominance hierarchy where the powerful survive and the weak perish.
This is attribute substitution. The phrase maps the surface feature (some organisms die) rather than the relational structure (differential reproduction rates across a distribution of heritable traits). The result: Social Darwinism, eugenics, "nature proves that inequality is natural." A century of political misuse, all downstream of a compression that mapped the wrong attribute.
Spencer coined the phrase, not Darwin. Darwin reluctantly adopted it. The distortion was not in Darwin's thinking—it was in the compression that replaced his thinking in public discourse.
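The relational reading of fitness (organism × environment) can be sketched as a two-type selection model. This is an illustration under stated assumptions, not a model from Darwin or Spencer: the type names, the two environments, and the offspring numbers are hypothetical, and the update rule is the standard textbook recursion for the frequency of a heritable type under differential reproduction.

```python
def next_frequency(freq_a: float, offspring_a: float, offspring_b: float) -> float:
    """One generation of selection on two heritable types, A and B.

    Each type's share of the next generation is proportional to its current
    frequency times its average number of surviving offspring.
    """
    mean_offspring = freq_a * offspring_a + (1.0 - freq_a) * offspring_b
    return freq_a * offspring_a / mean_offspring


def run(freq_a: float, offspring_a: float, offspring_b: float, generations: int) -> float:
    """Iterate selection and return the final frequency of type A."""
    for _ in range(generations):
        freq_a = next_frequency(freq_a, offspring_a, offspring_b)
    return freq_a


# Hypothetical average surviving offspring per individual, by environment.
ENVIRONMENTS = {
    "open_plain": {"strong": 2.2, "camouflaged": 2.0},
    "dense_forest": {"strong": 2.0, "camouflaged": 2.2},
}

# Frequency of the "strong" type after 100 generations, starting from 50%.
results = {
    env: run(0.5, w["strong"], w["camouflaged"], generations=100)
    for env, w in ENVIRONMENTS.items()
}
```

Under these assumed numbers the "strong" type nearly fixes in one environment and nearly vanishes in the other. Nothing about the organism changed between runs; only the environment did, which is the sense in which fitness is a relation rather than a property.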
"The Invisible Hand"
Smith's actual mechanism: under specific institutional conditions (rule of law, property rights, competitive markets, no monopoly power, no externalities, informed participants), individual self-interest produces collective benefit as a byproduct—not because individuals intend collective benefit, but because the institutional architecture channels self-interest into productive competition.
What "the invisible hand" maps: automaticity. Markets work by themselves. No intervention needed. Self-interest automatically produces good outcomes. Laissez-faire as natural law.
The compression strips the institutional preconditions. Smith spent much of The Wealth of Nations analyzing the specific conditions under which markets work and the specific conditions under which they fail (monopoly, collusion, labor exploitation, mercantilism). The phrase "invisible hand" appears exactly three times across Smith's entire published work. It was a minor rhetorical flourish, not the thesis.
The compressed version inverts the original. Smith's point: given the right institutional architecture, self-interest can be channeled productively. The compression: self-interest is inherently productive, so institutional architecture is unnecessary. The attribute (things seem to work out) replaced the relation (institutional conditions that make things work out).
"The Selfish Gene"
Dawkins's actual mechanism: gene-level selection—the unit of selection is the replicator (gene), not the vehicle (organism). Genes that produce more copies of themselves increase in frequency. This explains altruism: an organism can sacrifice itself for copies of its genes in relatives (kin selection). "Selfish" is a metaphor for differential replication, not a claim about psychology or morality.
What "the selfish gene" maps: selfishness as biological law. Organisms are fundamentally selfish. Altruism is an illusion. Cooperation is disguised competition. Selfishness is "natural."
Dawkins himself spent decades correcting this misreading. The book's argument is precisely the opposite of what the title implies: genes that produce cooperative, altruistic, and group-benefiting behavior can be "selfish" in the replicator sense while producing "selfless" organisms. The title maps the attribute (sounds like selfishness is fundamental) rather than the relation (replicator-level selection explains organism-level cooperation).
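The kin-selection step is usually stated as Hamilton's rule, which is not spelled out in the text above: a gene for helping can spread when relatedness times benefit exceeds cost (r × B > C). A minimal sketch follows; the function name and the example scenarios are illustrative assumptions, with cost and benefit counted in lives.

```python
def altruism_favored(relatedness: float, benefit: float, cost: float) -> bool:
    """Hamilton's rule: a gene for helping can spread when r * B > C."""
    return relatedness * benefit > cost


SIBLING = 0.5    # expected share of genes identical by descent with a full sibling
COUSIN = 0.125   # the same quantity for a first cousin

# An organism gives up its own life (cost = 1) to save relatives.
one_sibling = altruism_favored(SIBLING, benefit=1, cost=1)
three_siblings = altruism_favored(SIBLING, benefit=3, cost=1)
nine_cousins = altruism_favored(COUSIN, benefit=9, cost=1)
```

Here `one_sibling` is false while `three_siblings` and `nine_cousins` are true. The accounting is done entirely at the level of gene copies, and the behavior it favors at the level of the organism is self-sacrifice.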
The Common Pattern
In each case: a complex mechanism is compressed into a phrase that maps a surface attribute rather than the causal structure. The distortion is predictable: the compression maps whichever attribute is most emotionally salient, most politically useful, or most consonant with existing beliefs. The compressed version propagates faster than the original because the surface attribute is simpler to carry and more rewarding to hold. The relational structure is dropped. Confident misunderstanding at scale.
III. The Illusion of Explanatory Depth
Rozenblit and Keil (2002) documented the Illusion of Explanatory Depth (IOED): people systematically overestimate their understanding of how things work. Asked "do you understand how a zipper works?", most people say yes. Asked to explain the mechanism step by step, most cannot.
Bad compression exploits IOED. A slogan creates the feeling of understanding without the structure of understanding. "Survival of the fittest" feels like you understand evolution. "The invisible hand" feels like you understand markets. The slogan occupies the slot where understanding would go, preventing the recognition that understanding is absent.
This is worse than ignorance. An ignorant person knows they don't understand and might seek explanation. A person holding a bad compression believes they understand and will not seek further explanation. The compression functions as an immune response against the actual idea: it inoculates against the real mechanism by providing a surface-level substitute that feels sufficient.
Sloman and Fernbach (2017) extended this finding to political beliefs: asking people to explain the mechanism of a policy (not just whether they support it) reduces political extremism. The IOED was supporting confident positions that collapsed when mechanism-level understanding was demanded. The bad compression (political slogan, tribal position) was doing the work that mechanism understanding should have been doing.
Bad compressions are not failed communication. They are successful inoculation. They spread faster than the original idea, occupy the cognitive slot where understanding would go, create the subjective experience of comprehension, and thereby prevent the actual mechanism from being learned. The compression doesn't fail to transmit the idea — it succeeds in transmitting a substitute that blocks the original.
IV. Formalization as Attribute Substitution
The three canonical distortions involve verbal compression — slogans replacing mechanisms. But the same structure operates with mathematical compression, and the error is harder to detect because mathematical formalization carries the aesthetic of rigor.
The Spherical Cow
Physics joke: "Consider a spherical cow." The simplification that makes the math tractable removes the features that make the problem real. In physics, this is understood as approximation — useful within bounds, wrong outside them. The physicist knows the cow is not spherical.
In other domains, the spherical cow is mistaken for the cow. The formalization is treated not as an approximation but as the reality. The notation replaces the mechanism rather than modeling it.
Consider the thermostat. The Physics 101 model: a bang-bang controller. Rationalists used to use this as a teaching example: "Most people turn the thermostat to 90 thinking it heats faster. We know it's a bang-bang controller." Except many real HVAC systems are variable-rate, and beyond the controller, the room has thermal mass, drafts, sun exposure, heat loss that varies with wind. The person who has lived in that house for ten years and turns the thermostat to 90 may be responding to accumulated experience with their specific system's actual behavior. Their "wrong" action is calibrated to a real mechanism the model doesn't contain.
This is the pattern: lived experience of a real system often produces a better working model than the formalization, because it includes the variables the formalization discards. The formalization looks like understanding — precise, clean, checkable. Lived experience looks like confusion — approximate, messy, inarticulate. But the messy model tracks the real mechanism; the clean model tracks the simplified one.
Utils as Cardinal Numbers
Utility theory formalizes human preferences as cardinal numbers. Once preferences are numbers, you can do arithmetic on them: add, subtract, compare, scale. Once you can scale, you can apply arrow notation (Knuth's up-arrows: 10↑↑10 utils). Once you have astronomical utility quantities, "astronomical stakes" arguments defeat all intuition and common sense. This was a prominent feature of rationalist and EA discourse for years — before the culture of EV maximization produced its most visible catastrophe. The formalism is now treated with more caution, but the axioms that generated it remain largely unexamined.
The causal chain:
1. Individual experience is the unit of value (axiom, unexamined)
2. Experiences can be represented as cardinal numbers (formalization, unvalidated)
3. If numbers, then arithmetic (mathematical consequence)
4. If arithmetic, then scaling (mathematical consequence)
5. If scaling, then astronomical quantities are "meaningful"
6. Astronomical quantities defeat intuition
7. Pascal's mugging, infinite ethics, acausal trade
The usual critique targets step 2: are human preferences actually cardinal numbers? Can you meaningfully say that experience A is 3.7 times as good as experience B? The formalization assumes yes. The assumption was never tested — it was adopted because it makes the math work.
But the deeper error is step 1. Why is individual experience the unit of value? This axiom is not derived from anything — it is inherited from the utilitarian tradition via the liberal individualist ontology that Western philosophy has treated as self-evident since the Enlightenment. Under a different ontology — one where sustained complexity generation in deep time is the terminal value — the individual is not the relevant unit. Civilizations, ecosystems, and telic systems are the units whose persistence physics constrains. Individual experience matters instrumentally (flourishing individuals contribute to civilizational capacity) but is not terminal. The entire utils tower is built not just on an unvalidated formalization (step 2) but on an unexamined axiom (step 1) that determines the unit of measurement before any math begins.
The test: remove the formalization and state the claim in plain language. "This matters a lot more than that" — does the argument still work? If the argument only works when expressed in formal notation, the notation is doing persuasive work, not analytical work. The formalism isn't clarifying — it's obscuring the absence of mechanism.
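The arithmetic and scaling steps of the chain can be shown in a few lines. This is a sketch under assumptions: the probabilities and payoffs are invented for illustration, and the calculation is done in log space because a quantity like 10↑↑3 = 10^(10^10) is far too large to store directly.

```python
def log10_expected_value(log10_probability: float, log10_payoff: float) -> float:
    """log10 of (probability * payoff), computed in log space to avoid overflow."""
    return log10_probability + log10_payoff


# A mundane option: certainty (p = 1) of 1,000 utils.
mundane = log10_expected_value(log10_probability=0.0, log10_payoff=3.0)

# A "mugging" option: a one-in-10^100 chance of 10^(10^10) utils, i.e. 10↑↑3.
mugging = log10_expected_value(log10_probability=-100.0, log10_payoff=1e10)

# Expected-value comparison: the larger log10 wins.
mugging_wins = mugging > mundane
```

`mugging_wins` is true, and it stays true for any probability above roughly 10^(3 − 10^10). The comparison is settled by the size of the exponent the notation allows one to write down, not by anything that was measured.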
Scope Insensitivity
Desvousges et al. found that people pay roughly the same to save 2,000, 20,000, or 200,000 birds. The formalized interpretation: humans are "scope-insensitive" — a cognitive bug that makes us fail to scale our response to the magnitude of the problem.
There are at least two things wrong with this interpretation. First: the emotional response system evolved to calibrate to the scale at which the organism can act. You cannot personally do anything about 200,000 birds that you cannot do about 2,000. The calibration is tuned to actionable scope, which is the correct calibration for an embodied agent with limited causal reach.
Second, and more fundamentally: the question presupposes that individual bird deaths are the relevant unit of concern. If your unit of concern is the ecosystem rather than the individual, bird populations are self-renewing — whether 2,000 or 200,000 individual birds die in oil ponds is largely irrelevant to the ecosystem's capacity for sustained complexity unless the deaths threaten population viability, which is a categorical question (yes or no), not a magnitude question (how many). The human response — treating all three scenarios as roughly equivalent — may be tracking the right variable: "is there a systemic problem?" rather than "how many individual units of bad?" This argument depends on accepting that systems rather than individuals are the terminal unit of concern — but note that the formalization's assumption (individuals are the unit) is equally an axiom, just a more familiar one.
The formalized version maps the attribute (the numbers don't scale linearly) and declares it a bug. The formalization itself commits the error it diagnoses: it substitutes a surface feature (numerical magnitude) for the relational structure (what actually matters for the system's persistence). If scope insensitivity is the evolved response across all humans, the prior should massively favor "evolution is capturing something the model misses" over "billions of brains are all broken in the same way."
V. The Compounding Problem
Attribute substitution compounds. If you formalize preferences as utils and don't check the axiom, then formalize social welfare as aggregated utils and don't check that axiom, then formalize policy evaluation as expected utility maximization — each layer inherits the unchecked assumptions of every previous layer. The tower grows taller and more impressive. It is not attached to the ground.
The tower's height creates the illusion of its soundness. "If the downstream reasoning is this rigorous, the foundations must be solid." This is the inversion of how empirical inquiry works. In actual mechanism-level investigation, axioms are provisional: tested against reality, replaced when they fail. In formalization-as-understanding, the sophistication of the superstructure creates retroactive confidence in the foundation. Lakatos described this as a "degenerating research program": a hard core of axioms protected by an ever-growing protective belt of auxiliary hypotheses. The program appears to be making progress — more papers, more formalisms, more notation — while actually drifting further from mechanism.
VI. What Good Compression Looks Like
Bad compression is the default because it's easier: map the salient attribute, get maximum portability, move on. Good compression is harder but possible. The distinguishing features:
"Flatten the Curve"
During COVID-19, "flatten the curve" compressed epidemic dynamics into a phrase that preserved the relational structure: rate of infection × healthcare capacity × time. The phrase communicated a structural relationship rather than a surface attribute: if you slow the rate, infections are spread over more time, keeping the peak below the capacity threshold.
This worked because the compression was isomorphic to the mechanism. The word "flatten" maps to the mathematical operation. "The curve" maps to the epidemic trajectory. The listener who understands the phrase understands the mechanism, not just a label for it.
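Unpacking the phrase into a model shows what it preserves. The sketch below uses the basic SIR (susceptible, infected, recovered) equations with simple forward-Euler integration. The parameter values, the `CAPACITY` threshold, and the function name `sir_peak` are illustrative assumptions, not estimates for COVID-19.

```python
def sir_peak(beta: float, gamma: float = 0.1, days: int = 1000,
             dt: float = 0.1, initial_infected: float = 1e-4) -> tuple:
    """Integrate a basic SIR model with forward Euler.

    beta is the transmission rate and gamma the recovery rate, both per day.
    Returns (peak infected fraction, day on which the peak occurs).
    """
    s, i = 1.0 - initial_infected, initial_infected
    peak, peak_day = i, 0.0
    steps = int(days / dt)
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        if i > peak:
            peak, peak_day = i, step * dt
    return peak, peak_day


# Hypothetical: share of the population the health system can treat at once.
CAPACITY = 0.10

fast_peak, fast_day = sir_peak(beta=0.30)  # basic reproduction number 3.0
slow_peak, slow_day = sir_peak(beta=0.15)  # basic reproduction number 1.5
```

With these assumed values the faster epidemic peaks well above `CAPACITY` and the slower one peaks later and below it, which is exactly what "flatten" names. In this particular model the lower transmission rate also reduces the total number infected, so the popular picture of an unchanged area under the curve is itself a simplification.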
Contrast with "social distancing," which is attribute substitution: it maps the surface attribute (stay far apart) rather than the relational structure (transmission probability per contact × number of contacts). People who understood "social distancing" as "stay 6 feet away" missed that outdoor transmission was rare at almost any distance, while indoor transmission could occur at much greater distances. The phrase mapped an attribute, not a mechanism.
Structure-Preserving Compression
Good compression has identifiable features:
- The compression is an invitation to the model, not a replacement for it. "Flatten the curve" invites you to think about the shape of the curve and what changes its shape. "Survival of the fittest" replaces the model with a slogan that feels self-explanatory.
- Unpacking the compression recovers the mechanism. Expanding "flatten the curve" step by step recovers epidemic dynamics. Expanding "survival of the fittest" step by step does not recover natural selection — it recovers Social Darwinism.
- The compression makes predictions. "Flatten the curve" predicts: if you don't flatten, the peak exceeds capacity. This is testable. "The invisible hand" predicts: markets work. This is not testable without specifying conditions, which the compression omits.
- The compression has natural boundaries. "Flatten the curve" obviously doesn't apply to non-epidemic contexts. "Survival of the fittest" seems to apply everywhere — economics, politics, personal relationships — which is the signature of attribute substitution: surface mapping generalizes promiscuously because it isn't constrained by mechanism.
VII. The Design Implication
Anyone trying to communicate complex ideas faces the compression paradox: the idea must be compressed to travel, but compression reliably distorts by mapping attributes rather than relations.
Chip Heath and Dan Heath's Made to Stick framework (SUCCESs: Simple, Unexpected, Concrete, Credible, Emotional, Stories) optimizes for memorability. This is attribute optimization: what sticks is what feels vivid. Vividness correlates with emotional salience, not structural fidelity. The SUCCESs framework produces maximally sticky compressions that may or may not preserve mechanism.
The alternative: design compressions that are invitations to the model rather than replacements for it. A three-level engagement hierarchy:
- Level 1: Viral shell. The portable phrase. Must be memorable, must not be self-explanatory. It should create a question, not answer one. "Flatten the curve" → "what curve? How does flattening help?" This is the opposite of the SUCCESs principle: the compression should be incomplete, motivating the listener to seek the model.
- Level 2: Structural diagram. The mechanism in visual or schematic form. The curve itself, the epidemic model, the relationship between variables. For anyone who moves past level 1, this is where understanding happens.
- Level 3: Full mechanism. The complete model with parameters, boundary conditions, and failure modes. For the small fraction who need to operate the model, not just understand it.
Most intellectual communication collapses all three levels into level 1 and hopes for the best. The result is predictable: the slogan propagates, the mechanism doesn't, and the slogan distorts the mechanism in the ways the three canonical cases demonstrate.
This matters for ideas that challenge existing frameworks. A framework-level idea compressed into a slogan will be attribute-matched to the nearest known threat category and rejected. The same idea compressed into a level-1 invitation + level-2 structural diagram has a chance of reaching evaluation, because the structural diagram engages the analytical layer rather than the pattern-matcher.
The compression paradox is not avoidable. But the distortion is predictable: attribute substitution maps emotional salience, strips institutional preconditions, and generalizes promiscuously. The quality of a civilization's discourse is determined by the quality of its compressions, and what separates a good compression from a bad one is whether the relational structure survives the journey.
Related:
- The Containment Pattern — The institutional version of the Lakatos structure: hard core protected by an ever-growing auxiliary belt, never falsified
- Cargo Cult Epistemology — The aesthetics of rigor substituting for actual rigor
- The Physics of Moloch — Goodhart's Law as the institutional form of attribute substitution
- Belonging Is Axiology — Why emotionally salient compressions propagate faster than structurally accurate ones
- The Severed Map — How academic specialization prevents the cross-domain understanding that good compression requires
Sources and Notes
Structure-mapping and analogy:
- Dedre Gentner, "Structure-Mapping: A Theoretical Framework for Analogy," Cognitive Science 7:2 (1983), pp. 155–170. Foundational paper distinguishing attribute-based from relational mappings.
- Dedre Gentner and Arthur Markman, "Structure Mapping in Analogy and Similarity," American Psychologist 52:1 (1997), pp. 45–56. Extension showing that relational mappings produce deeper learning than attribute mappings.
- Dedre Gentner and Linsey Smith, "Analogical Reasoning," in Encyclopedia of Human Behavior (2nd ed., Elsevier, 2012). Review of four decades of structure-mapping research.
Illusion of Explanatory Depth:
- Leonid Rozenblit and Frank Keil, "The Misunderstood Limits of Folk Science: An Illusion of Explanatory Depth," Cognitive Science 26:5 (2002), pp. 521–562. Original demonstration that people overestimate their understanding of mechanisms.
- Steven Sloman and Philip Fernbach, The Knowledge Illusion: Why We Never Think Alone (Riverhead, 2017). On how demanding mechanism-level explanation reduces political extremism.
Canonical distortions:
- "Survival of the fittest": Herbert Spencer coined the phrase in Principles of Biology (1864); Darwin adopted it in the 5th edition of On the Origin of Species (1869), later expressing regret. On Social Darwinism as distortion: Richard Hofstadter, Social Darwinism in American Thought (1944).
- "The invisible hand": Adam Smith uses the phrase three times total: once in The Theory of Moral Sentiments (1759), once in The Wealth of Nations (1776), once in History of Astronomy. On the overinterpretation: Gavin Kennedy, "Adam Smith and the Invisible Hand: From Metaphor to Myth," Econ Journal Watch 6:2 (2009).
- "The selfish gene": Richard Dawkins, The Selfish Gene (Oxford UP, 1976; 30th anniversary ed., 2006). Dawkins has repeatedly stated that the title was misunderstood: the "selfishness" is at the gene level, which explains organism-level altruism.
Utility theory and formalization:
- On the cardinality assumption: Daniel Bernoulli's formulation (1738) proposed a cardinal utility function (logarithmic) to resolve the St. Petersburg paradox. The ordinalist revolution (Pareto, 1906; Hicks and Allen, 1934) rejected cardinality as unnecessary for demand theory. Von Neumann and Morgenstern's axiomatization (1944) re-established cardinal utility under specific rationality assumptions for decision under risk. Whether these assumptions correspond to actual human psychology remains contested: see Daniel Kahneman and Amos Tversky, "Prospect Theory: An Analysis of Decision under Risk," Econometrica 47:2 (1979).
- Lakatos on degenerating research programs: Imre Lakatos, "Falsification and the Methodology of Scientific Research Programmes," in Lakatos and Musgrave (eds.), Criticism and the Growth of Knowledge (Cambridge UP, 1970).
Science communication and compression:
- Chip Heath and Dan Heath, Made to Stick: Why Some Ideas Survive and Others Die (Random House, 2007). The SUCCESs framework for memorable communication.
- On "flatten the curve" as successful structure-preserving compression: Siobhan Roberts, "Flattening the Coronavirus Curve," The New York Times (March 27, 2020). On the phrase's origin and propagation dynamics.
Scope insensitivity:
- William Desvousges et al., "Measuring Nonuse Damages Using Contingent Valuation," report prepared for Exxon (1992). The bird study. Reprinted in Hausman (ed.), Contingent Valuation: A Critical Assessment (1993).
- On actionable scope as evolved calibration: Robin Dunbar, "Neocortex Size as a Constraint on Group Size in Primates," Journal of Human Evolution 22:6 (1992). The ~150 limit as upper bound on the coordination group for which emotional calibration was selected.