V2.704: Information-Theoretic Model Selection — Occam's Razor
Status: COMPLETED — 20/20 tests passed
The Central Question
V2.701 showed the framework (0 parameters) beats Planck ΛCDM (6 parameters) on 4/5 datasets. But that comparison used Planck’s published best-fit, not the TRUE best-fit across all data. Here we apply the standard statistical tools for model comparison — AIC, BIC, cross-validation, Bayesian evidence — to get the honest answer: does Occam’s razor favor zero parameters or one?
Key Result: The Razor Cuts Both Ways
| Criterion | Winner | Magnitude |
|---|---|---|
| Raw χ² | ΛCDM | Δχ² = 9.7 |
| AIC | ΛCDM | ΔAIC = 7.7 |
| BIC | ΛCDM | ΔBIC = 6.5 |
| Leave-one-out CV | Framework | 5.7× lower predictive χ² |
| Occam factor (prior) | Framework | 68:1 from parsimony |
| Combined Bayes factor | ΛCDM | 1.9:1 (marginal) |
The standard criteria (AIC/BIC) favor ΛCDM. But cross-validation reveals the framework’s hidden strength: predictive stability.
The Three Models
| Model | Parameters | Best-fit χ² | AIC | BIC |
|---|---|---|---|---|
| Framework | 0 | 51.6 | 51.6 | 51.6 |
| ΛCDM | 1 (Ω_m) | 41.9 | 43.9 | 45.1 |
| w₀wₐCDM | 3 (Ω_m, w₀, wₐ) | 36.3 | 42.3 | 45.9 |
Best-fit ΛCDM: Ω_m = 0.3098 (vs framework’s 0.3123). Best-fit w₀wₐ: Ω_m = 0.310, w₀ = −0.89, wₐ = −0.33.
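For concreteness, the AIC/BIC columns follow directly from the best-fit χ² values. A minimal sketch of the bookkeeping (the effective sample size n is not stated in this section; the BIC values imply a per-parameter penalty of ln(n) ≈ 3.2, so n ≈ 25 is assumed below, which may shift the last decimal):

```python
import math

# Reproduce the AIC/BIC columns of the table above.
# Assumption: n ~= 25 is inferred from the table's BIC values
# (penalty ln(n) ~= 3.2 per parameter), not a stated input.
n = 25

models = {
    # name: (number of fitted parameters k, best-fit chi^2)
    "Framework": (0, 51.6),
    "LCDM":      (1, 41.9),
    "w0waCDM":   (3, 36.3),
}

for name, (k, chi2) in models.items():
    aic = chi2 + 2 * k            # AIC = chi^2 + 2k
    bic = chi2 + k * math.log(n)  # BIC = chi^2 + k ln(n)
    print(f"{name:10s} k={k}  chi2={chi2:5.1f}  AIC={aic:5.1f}  BIC={bic:5.1f}")
```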
Why AIC/BIC Favor ΛCDM
The data-preferred Ω_m = 0.3098 is 2.9σ from the framework's 0.3123. This 0.0025 shift in Ω_m improves the total χ² by 9.7, summed across all datasets:
| Dataset | χ²(framework) | χ²(best ΛCDM) | Δχ² |
|---|---|---|---|
| CMB | 5.3 | 1.7 | +3.7 |
| BAO | 25.2 | 22.0 | +3.2 |
| Pantheon+ | 1.5 | 1.8 | −0.4 |
| S₈ | 19.6 | 16.4 | +3.2 |
| Total | 51.6 | 41.9 | +9.7 |
The AIC penalty for one parameter is only 2.0; the BIC penalty is 3.2. Neither is enough to overcome the 9.7 unit χ² gain from fitting Ω_m.
Why Cross-Validation Favors the Framework
Leave-one-out cross-validation reveals a critical vulnerability of ΛCDM:
| Left out | Framework χ²(pred) | ΛCDM Ω_m(fit) | ΛCDM χ²(pred) | Winner |
|---|---|---|---|---|
| CMB | 5.3 | 0.293 | 494.3 | Framework |
| BAO | 25.2 | 0.310 | 21.7 | ΛCDM |
| SNe | 1.5 | 0.309 | 1.9 | Framework |
| S₈ | 19.6 | 0.310 | 16.2 | ΛCDM |
| H₀ | 49.2 | 0.310 | 44.4 | ΛCDM |
| Total | 100.8 | n/a | 578.5 | Framework |
When CMB is removed, ΛCDM’s Ω_m drops to 0.293 — far from the CMB-preferred value. This gives a catastrophic CMB prediction (χ² = 494).
The framework’s prediction never changes. Its Ω_m = 0.3123 is always the same, regardless of which data you train on. This stability IS the scientific content of having zero free parameters.
ΛCDM's instability range is Ω_m ∈ [0.293, 0.310], depending on which dataset is removed. The framework's range is exactly zero.
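The procedure behind this table is worth stating explicitly. A minimal sketch, assuming each dataset exposes a χ²(Ω_m) callable (hypothetical stand-ins here, since the compressed likelihoods themselves are not reproduced in this section):

```python
from scipy.optimize import minimize_scalar

OMEGA_M_FRAMEWORK = 0.3123  # fixed prediction; never re-fit

def loo_cv(chi2_funcs):
    """chi2_funcs: dict mapping dataset name -> callable chi2(omega_m)."""
    results = []
    for name, held_out in chi2_funcs.items():
        train = [f for key, f in chi2_funcs.items() if key != name]
        # LCDM: refit the single parameter Omega_m on the training sets
        fit = minimize_scalar(lambda om: sum(f(om) for f in train),
                              bounds=(0.2, 0.4), method="bounded")
        results.append({
            "held_out": name,
            "lcdm_omega_m": round(fit.x, 4),
            "lcdm_chi2_pred": held_out(fit.x),
            # Framework: zero parameters, same prediction every round
            "framework_chi2_pred": held_out(OMEGA_M_FRAMEWORK),
        })
    return results
```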
The Bayesian Balance Sheet
The combined Bayes factor breaks into two competing effects:
Framework vs ΛCDM:
- Occam factor: +4.23 (68:1 from parsimony — framework wastes zero prior volume)
- Likelihood: −4.86 (ΛCDM fits the data better by Δχ²/2 = 4.86)
- Total: −0.64 → 1.9:1 for ΛCDM (essentially a tie)
Framework vs w₀wₐCDM:
- Occam factor: +7.92 (2740:1 from parsimony)
- Likelihood: −7.66 (w₀wₐ fits better)
- Total: +0.25 → 1.3:1 for Framework (essentially a tie)
Key insight: The Occam advantage from zero parameters ALMOST EXACTLY cancels the likelihood advantage from fitting. The models are in approximate Bayesian equilibrium.
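The arithmetic of the balance sheet is a one-liner: ln(B) = ln(Occam) − Δχ²/2. A minimal sketch using the ln-Occam factors quoted above (small last-digit differences from the text are rounding):

```python
import math

# Balance-sheet arithmetic. The ln-Occam factors are taken as given
# from the text; only the likelihood terms are recomputed from chi^2.
chi2 = {"Framework": 51.6, "LCDM": 41.9, "w0waCDM": 36.3}

for rival, ln_occam in [("LCDM", 4.23), ("w0waCDM", 7.92)]:
    ln_like = -(chi2["Framework"] - chi2[rival]) / 2  # -delta_chi2 / 2
    ln_B = ln_occam + ln_like          # ln(B), framework vs rival
    odds = math.exp(abs(ln_B))
    side = "Framework" if ln_B > 0 else rival
    print(f"vs {rival}: ln(B) = {ln_B:+.2f} -> {odds:.1f}:1 for {side}")
```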
Where Do the Datasets Point?
Each dataset independently prefers a different Ω_m:
| Dataset | Preferred Ω_m |
|---|---|
| CMB | 0.311 |
| BAO | 0.301 |
| Pantheon+ SNe | 0.334 |
| Weak lensing S₈ | 0.283 |
The spread (0.283 to 0.334) reflects genuine tensions in the data. The framework’s 0.312 is closest to CMB. The best-fit ΛCDM at 0.310 is a weighted compromise pulled below the framework by BAO and S₈.
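Quantitatively, for roughly quadratic per-dataset likelihoods the joint best fit is the inverse-variance weighted mean of these preferences, Ω_m(joint) = (Σᵢ Ω_m,ᵢ/σᵢ²) / (Σᵢ 1/σᵢ²). The per-dataset uncertainties σᵢ are not quoted in this section, but the cross-validation behavior above shows that CMB carries the largest weight.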
Honest Assessment
What the framework does well
- Perfect predictive stability. No optimism bias, no overfitting, no sensitivity to which datasets are included. The prediction is the same before and after seeing any data.
- Cross-validation dominance. The CV ratio of 0.17 (framework 5.7× better) reveals that ΛCDM's in-sample superiority is partly due to over-adapting to the specific dataset combination.
- Near-exact Bayesian equilibrium. A zero-parameter theory that comes within ln(B) = 0.64 of a one-parameter theory on standard Bayesian evidence is extraordinary. Most fixed predictions are decisively ruled out.
- The framework vs w₀wₐ comparison actually FAVORS the framework. Despite w₀wₐ's 15-unit χ² advantage, the Occam penalty for 3 parameters nearly kills it.
What the framework does poorly
- 2.9σ from the data-preferred Ω_m. This is the honest tension. The data collectively want Ω_m ≈ 0.309, not 0.312. This is not catastrophic (2.9σ, not 5σ), but it is not comfortable either.
- AIC and BIC both favor ΛCDM. The standard model selection criteria prefer having one free parameter. The χ² improvement from fitting Ω_m is real and substantial.
- The CMB χ² improvement for best-fit ΛCDM is large. Going from 5.3 to 1.7 by shifting Ω_m by 0.0025 shows the compressed CMB likelihood is sensitive to small Ω_m changes.
What this means for the science
The framework is in a remarkable position: a zero-parameter prediction that standard model selection criteria can only weakly distinguish from a fitted one-parameter model. This is far better than most fixed predictions in physics perform.
The 2.9σ tension is the framework’s most concrete vulnerability. It will be resolved by:
- Euclid (2027): σ(Ω_m) ≈ 0.002, will measure Ω_m to the precision needed to distinguish 0.309 from 0.312
- DESI Y5 (2028): Tighter BAO constraints will clarify whether the BAO-preferred Ω_m = 0.301 persists or moves toward 0.311
- CMB-S4 (2030): Sub-percent σ₈ will sharpen the S₈ dataset
The V2.701 comparison revisited
V2.701 compared the framework against Planck's published Ω_m = 0.3153, which is NOT the best fit to the data compilation used here (it is the best fit of a 6-parameter model to the full CMB likelihood, not the compressed likelihood used in this analysis). The framework beat Planck on 4/5 datasets because 0.3123 is closer to the global minimum (0.310) than 0.3153 is.
When we compare against the TRUE best-fit (Ω_m = 0.310), the framework loses on 3/4 datasets (all except Pantheon+). The honest conclusion: the framework is good, not perfect. It’s within 3σ of optimal, which is extraordinary for zero parameters but short of decisive.