Underfitting vs Overfitting
- Underfitting: model too simple, poor performance on train & test.
- Overfitting: model too complex, excellent on train, poor on new data.
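A minimal sketch of both failure modes on synthetic data (the sine-curve data, noise level, and polynomial degrees are illustrative assumptions): degree 1 scores poorly on train and test (underfitting), degree 15 scores well on train but worse on held-out data (overfitting).

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 80)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=80)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))  # R^2 on train vs test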
Bias–Variance tradeoff
- Bias: error from wrong assumptions (model too rigid).
- Variance: error from sensitivity to training data (model too flexible).
- Goal: balance both to minimize total error.
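For squared loss, the "total error" above has a standard decomposition, where sigma^2 is the irreducible noise:

E[(y - \hat{f}(x))^2] = \mathrm{Bias}[\hat{f}(x)]^2 + \mathrm{Var}[\hat{f}(x)] + \sigma^2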
Visualizing with learning curves (see the sketch after this list):
- If training & validation errors both high → high bias → increase model complexity or features.
- If training error low but validation high → high variance → regularize or get more data.
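A minimal sketch with sklearn's learning_curve (Ridge, cv=5, and the scoring choice are assumptions; X_train, y_train as elsewhere in these notes):

import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.linear_model import Ridge

train_sizes, train_scores, val_scores = learning_curve(
    Ridge(alpha=1.0), X_train, y_train,
    cv=5, scoring="neg_mean_squared_error",
    train_sizes=np.linspace(0.1, 1.0, 5),
)
train_err = -train_scores.mean(axis=1)  # both columns high           -> high bias
val_err = -val_scores.mean(axis=1)      # large gap between the two   -> high variance
print(np.c_[train_sizes, train_err, val_err])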
Regularization techniques
- L2 (Ridge): penalize sum of squared weights. Sklearn: Ridge(alpha=...).
- L1 (Lasso): penalize sum of absolute weights — encourages sparsity. Lasso(alpha=...).
- Elastic Net: combination of L1 & L2. ElasticNet(alpha=..., l1_ratio=...).
- Early stopping: stop training when validation loss stops improving (common in boosting/NNs).
- Dropout (DL), pruning, data augmentation, ensembling — other variance-reducing methods.
Ridge & Lasso in sklearn
from sklearn.linear_model import Ridge, Lasso

# L2 penalty: shrinks weights toward zero but rarely all the way to zero
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
# L1 penalty: drives some weights exactly to zero (implicit feature selection)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
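Elastic Net from the list above follows the same pattern (alpha and l1_ratio here are placeholder values to tune):

from sklearn.linear_model import ElasticNet

# l1_ratio=1.0 is pure L1 (Lasso-like); l1_ratio close to 0 is mostly L2 (Ridge-like)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train, y_train)
print(enet.coef_)  # some coefficients are shrunk, some set exactly to zero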
Early stopping example with scikit-learn's GradientBoosting or XGBoost
from xgboost import XGBRegressor

# In recent XGBoost versions, early_stopping_rounds is set on the estimator rather than in fit()
model = XGBRegressor(n_estimators=1000, early_stopping_rounds=20)
# Stops once the validation metric hasn't improved for 20 rounds; verbose=50 prints every 50 rounds
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=50)
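For the scikit-learn side of the heading, a minimal sketch with GradientBoostingRegressor's built-in early stopping (parameter values are illustrative):

from sklearn.ensemble import GradientBoostingRegressor

# Internally holds out validation_fraction of the training data and stops
# when the validation score hasn't improved for n_iter_no_change iterations
gbr = GradientBoostingRegressor(
    n_estimators=1000, validation_fraction=0.1, n_iter_no_change=20, random_state=0
)
gbr.fit(X_train, y_train)
print(gbr.n_estimators_)  # boosting stages actually used after early stopping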
Practical tips
- Use cross-validation to find regularization strength (GridSearchCV / RandomizedSearchCV); see the sketch after this list.
- When in doubt, simpler models + good features beat complex models with poor features.
- Monitor learning curves to diagnose bias vs variance.
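Sketch for the cross-validation tip above, searching Ridge's alpha over a log-spaced grid (grid range and scoring are assumptions):

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": np.logspace(-3, 3, 13)},
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)
print(search.best_params_, -search.best_score_)  # best alpha and its mean CV squared error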