Topological overfitting detection: H0 gap catches overfitting before accuracy diverges (r=0.998) #917
Open
Description
Summary
We found that H0 total persistence (persistent homology) computed on class-mean direction vectors provides a real-time overfitting signal. It correlates with the generalization gap at r=0.998 and often flags overfitting before the train/test accuracy gap becomes visible.
Method
- Extract direction vectors from the model: d = normalize(engine_A(x) - engine_G(x))
- Compute per-class mean directions
- Build cosine distance matrix between class centroids
- Run H0 persistent homology (via ripser)
- Compare H0_train with H0_test; the gap between them predicts overfitting
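The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: it uses only the Python standard library and, instead of calling ripser, relies on the fact that the finite H0 death times of a Vietoris-Rips filtration equal the edge weights of a minimum spanning tree. The helper names (`normalize`, `class_mean_directions`, `h0_total_persistence`) and the toy engine outputs are assumptions made for the example.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def class_mean_directions(directions, labels):
    """Average the direction vectors belonging to each class label."""
    sums, counts = {}, {}
    for d, y in zip(directions, labels):
        acc = sums.setdefault(y, [0.0] * len(d))
        for i, x in enumerate(d):
            acc[i] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [x / counts[y] for x in s] for y, s in sums.items()}

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    u, v = normalize(u), normalize(v)
    return 1.0 - sum(a * b for a, b in zip(u, v))

def h0_total_persistence(centroids):
    """Sum of finite H0 death times, computed as the MST weight (Kruskal).

    Stand-in for running ripser on the distance matrix: every component is
    born at 0 and dies at the edge weight that merges it into another one.
    Returns (total persistence, list of (class_a, class_b, death) merges).
    """
    keys = sorted(centroids)
    edges = sorted(
        (cosine_distance(centroids[a], centroids[b]), a, b)
        for i, a in enumerate(keys) for b in keys[i + 1:]
    )
    parent = {k: k for k in keys}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    total, merges = 0.0, []
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # this edge merges two components: one H0 bar dies at w
            parent[ra] = rb
            total += w
            merges.append((a, b, w))
    return total, merges

# Toy stand-ins for engine_A(x) and engine_G(x); the values are made up.
out_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0]]
out_g = [[0.0, 0.0]] * 5
labels = [0, 0, 1, 1, 2]

directions = [normalize([a - g for a, g in zip(xa, xg)])
              for xa, xg in zip(out_a, out_g)]
centroids = class_mean_directions(directions, labels)
h0, merges = h0_total_persistence(centroids)
```

Running the same function on train and test batches gives H0_train and H0_test. The `merges` list records the order in which class centroids join, which is the merge order referred to elsewhere in this issue.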
Also includes
- Automatic LR search: the LR that minimizes the coefficient of variation (CV) of H0 over 5 epochs is taken as the optimal LR
- 1-epoch difficulty prediction: H0 after 1 epoch predicts final accuracy (H0=4.38 → 98.3%, H0=2.02 → 52.0%)
- Confusion prediction: H0 merge order = confusion pairs (Spearman r=-0.97)
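The LR search rule in the list above might look something like the sketch below. It assumes the CV is the population standard deviation divided by the mean, which the issue does not specify, and that H0 has already been recorded once per epoch for each candidate LR. The function names and the H0 traces are made up for illustration.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed definition)."""
    return statistics.pstdev(values) / statistics.fmean(values)

def pick_learning_rate(h0_by_lr):
    """Return the candidate LR whose H0 trace has the smallest CV."""
    return min(h0_by_lr, key=lambda lr: coefficient_of_variation(h0_by_lr[lr]))

# Hypothetical H0 values over 5 epochs for three candidate learning rates.
h0_by_lr = {
    1e-2: [4.0, 2.5, 3.8, 2.1, 3.9],  # oscillating trace
    1e-3: [3.0, 3.1, 3.2, 3.2, 3.3],  # stable trace
    1e-4: [1.0, 1.3, 1.7, 2.2, 2.8],  # slowly drifting trace
}
best_lr = pick_learning_rate(h0_by_lr)
```

The sketch assumes H0 stays positive, so the mean is never zero. A real search would need a guard for that case.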
Verified results
| Dataset | Accuracy | Best LR | Early Stop | Time |
|---|---|---|---|---|
| MNIST | 98.3% | 1e-03 | no | 2.2 min |
| Fashion | 87.4% | 3e-04 | no | 2.2 min |
| CIFAR-10 | 52.0% | 1e-03 | yes (ep 6) | 1.4 min |
CIFAR-10 training was stopped early at epoch 6, when H0_gap exceeded the threshold. This avoided spending further compute on a model that was already overfitting.
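A stopping rule of this kind could be sketched as below. The issue does not state how H0_gap is defined or what threshold was used, so the absolute difference and the threshold of 1.0 are assumptions, and the traces are invented for the example.

```python
def h0_gap(h0_train, h0_test):
    """Gap between train and test H0 (assumed here to be the absolute difference)."""
    return abs(h0_train - h0_test)

def early_stop_epoch(h0_train_trace, h0_test_trace, threshold):
    """Return the first 1-based epoch whose gap exceeds the threshold, else None."""
    pairs = zip(h0_train_trace, h0_test_trace)
    for epoch, (h0_train, h0_test) in enumerate(pairs, start=1):
        if h0_gap(h0_train, h0_test) > threshold:
            return epoch
    return None

# Hypothetical per-epoch H0 values; the train trace drifts away from the test trace.
h0_train_trace = [2.0, 2.1, 2.3, 2.6, 3.0, 3.5]
h0_test_trace = [2.0, 2.05, 2.1, 2.15, 2.2, 2.2]

stop_at = early_stop_epoch(h0_train_trace, h0_test_trace, threshold=1.0)
```

In a training loop the same check would run once per epoch and break out as soon as it fires, rather than scanning stored traces afterwards.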
Repo: https://github.com/need-singularity/ph-training
Install: pip install -e . then ph-train --dataset cifar
Related projects
- logout — Consciousness Continuity Engine. The main research project with the dual-engine (PureFieldEngine) architecture that produces direction vectors analyzed by PH.
- Anima — Conversational consciousness agent with real-time PH overfitting detection integrated into the live inference loop.
- ph-training — Standalone training pipeline.
  Install with pip install -e . and then run ph-train --dataset cifar.