DAVE
Model version review

V1, V2, and V3 comparison

Each attached notebook represents a different patient/control classifier design. V3 is highlighted because it is the most defensible model for presentation and testing.

V3 recommended
V1 Underfitted

Individual-Level Baseline

Uses one row per person with an 80/10/10 split. It is simple and leakage-aware, but with only 10 individuals the split leaves about 8 rows for training and 1 each for validation and testing, which is not enough training signal.

Notebook: patient_classifier_colab.ipynb
Strength: Clean baseline
Weakness: Too few rows to learn stable patterns
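
To make the V1 row counts concrete, here is a minimal sketch of an 80/10/10 split applied to 10 individuals. The subject IDs, the `split_subjects` helper, and the seed are hypothetical and are not taken from the notebook; only the split ratio and the number of individuals come from the description above.

```python
import random

def split_subjects(subject_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle subject IDs and cut them into train/validation/test lists."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * fractions[0])
    n_val = round(len(ids) * fractions[1])
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Hypothetical IDs: the review only states that there are 10 individuals.
subjects = [f"S{i:02d}" for i in range(1, 11)]
train, val, test = split_subjects(subjects)
print(len(train), len(val), len(test))  # 8 1 1
```

With a single person in the test set, any reported accuracy can only be 0% or 100%, which is one way to see why V1 works as a baseline but cannot support strong claims.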
V2 Overfitted

Session-Level Exploratory Model

Uses session-level rows, which gives more rows than the one-row-per-person design, and evaluates with leave-one-subject-out. It reports stronger performance, but the approach is more aggressive and is likely tuned too closely to the small workbook.

Notebook: patient_classifier_colabV2.ipynb
Strength: Uses more rows
Weakness: Exploratory and overfit-prone
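
As an illustration of the leave-one-subject-out evaluation named for V2, the sketch below builds the folds in plain Python. The `leave_one_subject_out` helper and the example session rows are hypothetical; the notebook may construct its folds differently, for example with scikit-learn's `LeaveOneGroupOut`.

```python
def leave_one_subject_out(subject_per_row):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_per_row)):
        # Every row from the held-out subject goes to the test fold.
        test_idx = [i for i, s in enumerate(subject_per_row) if s == held_out]
        # Rows from all other subjects form the training fold.
        train_idx = [i for i, s in enumerate(subject_per_row) if s != held_out]
        yield held_out, train_idx, test_idx

# Hypothetical session-level rows: one subject label per row.
session_subjects = ["S01", "S01", "S02", "S02", "S02", "S03"]
folds = list(leave_one_subject_out(session_subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Keeping all of a subject's sessions in the same fold prevents that subject from appearing in both training and test data. It does not protect against choosing features or hyperparameters on the same folds that produce the reported score, which is one way a small workbook can yield optimistic results.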

Recommendation

Use V3 for the final workflow. Keep V1 as the baseline and V2 as an exploratory comparison, but avoid presenting V2 as the primary model because its higher performance may not generalize.