Underfitting vs Overfitting

  • Underfitting: model too simple, poor performance on train & test.
  • Overfitting: model too complex, excellent on train, poor on new data.
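
To make both failure modes concrete, here is a minimal sketch on synthetic data (names and data are illustrative): a degree-1 polynomial is too rigid for a sine curve and underfits, while a degree-15 polynomial chases the noise and overfits.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(100, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 15):  # degree 1 underfits, degree 15 overfits
    poly = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    poly.fit(X_train, y_train)
    # Underfitting: both R^2 scores low. Overfitting: train high, test much lower.
    print(degree, poly.score(X_train, y_train), poly.score(X_test, y_test))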

Bias–Variance tradeoff

  • Bias: error from wrong assumptions (model too rigid).
  • Variance: error from sensitivity to training data (model too flexible).
  • Goal: balance both to minimize total error.
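
For squared-error loss, this balance is the standard bias–variance decomposition (stated for reference; f is the true function, \hat{f} the learned model, \sigma^2 the irreducible noise):

\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}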

Diagnosing bias vs variance with learning curves (a runnable sketch follows the list):

  • If training & validation errors both high → high bias → increase model complexity or features.
  • If training error low but validation high → high variance → regularize or get more data.
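
A sketch of this diagnostic using sklearn's learning_curve (Ridge and the variable names X, y are illustrative placeholders):

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import learning_curve

# Assumes X, y are the full training features/labels.
sizes, train_scores, val_scores = learning_curve(
    Ridge(alpha=1.0), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5),
    scoring="neg_mean_squared_error",
)
train_err = -train_scores.mean(axis=1)  # average error across CV folds
val_err = -val_scores.mean(axis=1)
# Both curves high and close together -> high bias.
# Low train_err with a large gap to val_err -> high variance.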

Regularization techniques

  • L2 (Ridge): penalizes the sum of squared weights, shrinking all weights toward zero. Sklearn: Ridge(alpha=...).
  • L1 (Lasso): penalizes the sum of absolute weights, which drives some weights exactly to zero (sparsity). Lasso(alpha=...).
  • Elastic Net: a weighted mix of L1 & L2; see the sketch after the Ridge/Lasso code below. ElasticNet(alpha=..., l1_ratio=...).
  • Early stopping: stop training when validation loss stops improving (common in boosting and neural networks).
  • Dropout (deep learning), pruning, data augmentation, and ensembling are other variance-reducing methods.

Ridge & Lasso in sklearn

from sklearn.linear_model import Ridge, Lasso

# Assumes X_train, y_train already exist (e.g., from train_test_split).
ridge = Ridge(alpha=1.0).fit(X_train, y_train)  # larger alpha = stronger penalty
lasso = Lasso(alpha=0.1).fit(X_train, y_train)  # may zero out some coefficients
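
Elastic Net follows the same pattern; l1_ratio interpolates between pure Ridge (0) and pure Lasso (1):

from sklearn.linear_model import ElasticNet

# l1_ratio=0.5 gives an even mix of the L1 and L2 penalties.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train, y_train)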

Early stopping example with XGBoost and scikit-learn's GradientBoosting

from xgboost import XGBRegressor

# In XGBoost >= 2.0, early_stopping_rounds is a constructor argument, not a
# fit() keyword. Assumes X_val, y_val were held out for validation.
model = XGBRegressor(n_estimators=1000, early_stopping_rounds=20)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=50)
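
scikit-learn's gradient boosting has early stopping built in: n_iter_no_change holds out validation_fraction of the training data internally and stops when the validation score stalls.

from sklearn.ensemble import GradientBoostingRegressor

# Stops once the internal validation score fails to improve for 20 iterations.
gbr = GradientBoostingRegressor(
    n_estimators=1000, validation_fraction=0.1, n_iter_no_change=20
).fit(X_train, y_train)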

Practical tips

  • Use cross-validation to tune regularization strength (GridSearchCV / RandomizedSearchCV; sketch after this list).
  • When in doubt, simpler models + good features beat complex models with poor features.
  • Monitor learning curves to diagnose bias vs variance.
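
A minimal grid-search sketch for the first tip (the alpha grid is illustrative; spanning several orders of magnitude is typical):

from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=5, scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)
print(search.best_params_)  # best alpha found by 5-fold cross-validation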