While Linear Regression predicts continuous values, Logistic Regression is used for classification tasks, where the goal is to predict a discrete category.

Don't be fooled by the name! Logistic Regression is a classification algorithm, not a regression one. It's used to answer questions like:

  • Will a customer churn? (Yes/No)
  • Is this email spam? (Spam/Not Spam)
  • Does this patient have a certain disease? (Positive/Negative)

From Linear to Logistic

Logistic Regression starts with the same linear equation as Linear Regression:

z = β₀ + β₁x₁ + ⋯ + βₙxₙ

The problem is that the output of this equation, z, can be any real number, anywhere from −∞ to +∞. That isn't helpful for classification, where we need a probability between 0 and 1.
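
To see why z alone won't work, here is a quick sketch with made-up coefficients (β₀ = −4.0 and β₁ = 0.9 are purely illustrative, not fitted values):

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration
beta0 = -4.0
beta = np.array([0.9])

def linear_score(x):
    # z = β0 + β1*x1 + ... + βn*xn -- a plain weighted sum, not a probability
    return beta0 + beta @ x

print(linear_score(np.array([0.0])))   # negative: z has no lower bound
print(linear_score(np.array([20.0])))  # large positive: no upper bound either
```

Nothing constrains z to lie between 0 and 1, which is exactly the gap the Sigmoid function fills.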

The Sigmoid Function

To solve this, Logistic Regression passes the output z through a special function called the Sigmoid (or Logistic) function.

The Sigmoid function squashes any input value into a range between 0 and 1.

σ(z) = 1 / (1 + e^(−z))

The output of this function can be interpreted as the probability of the positive class. For example, if the output is 0.85, we can say there is an 85% probability that the data point belongs to class '1'. We then use a threshold (typically 0.5) to make a final decision:

  • If probability ≥ 0.5, predict class '1'.
  • If probability < 0.5, predict class '0'.
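
The Sigmoid function and the 0.5 threshold are simple enough to sketch directly in NumPy (the sample z values below are arbitrary):

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary scores: very negative, zero, and moderately positive
for z in [-5.0, 0.0, 2.0]:
    p = sigmoid(z)
    predicted_class = 1 if p >= 0.5 else 0
    print(f"z = {z:+.1f} -> p = {p:.3f} -> class {predicted_class}")
```

Note that σ(0) = 0.5, so the 0.5 probability threshold corresponds to a threshold of z = 0 on the underlying linear score.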

Python


# Binary Logistic Regression with scikit-learn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np

# Sample Data: Predicting pass/fail based on hours studied
X = np.array([[1], [2], [3], [4.5], [5.5], [6], [7], [8], [9], [10]]) # Hours Studied
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1]) # 0 = Fail, 1 = Pass

# Create and train the model
model = LogisticRegression()
model.fit(X, y)

# Predict for a new student
hours_studied = [[6.5]]
prediction = model.predict(hours_studied)
probability = model.predict_proba(hours_studied)

print(f"Prediction for {hours_studied[0][0]} hours: {'Pass' if prediction[0] == 1 else 'Fail'}")
print(f"Probabilities (Fail, Pass): {probability[0]}")
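
Because predict_proba() exposes the raw probabilities, the 0.5 threshold is not set in stone. Here is a sketch of a stricter, custom threshold on the same toy dataset (the 0.7 cutoff is an arbitrary choice for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy data: hours studied -> pass/fail
X = np.array([[1], [2], [3], [4.5], [5.5], [6], [7], [8], [9], [10]])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# predict() always uses a 0.5 threshold; predict_proba() lets us choose our own
p_pass = model.predict_proba(np.array([[5.0]]))[0, 1]  # probability of class 1

default_call = int(p_pass >= 0.5)  # the standard rule
strict_call = int(p_pass >= 0.7)   # demand 70% confidence before predicting 'Pass'
print(f"P(pass) = {p_pass:.2f}, default -> {default_call}, strict -> {strict_call}")
```

Raising the threshold trades false positives for false negatives, which matters when the costs are asymmetric, as in disease screening.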

Binary vs. Multinomial

  • Binary Logistic Regression: This is the standard form we've discussed, used for problems with only two possible outcome classes (e.g., True/False, Yes/No).
  • Multinomial Logistic Regression: This is an extension used for problems with three or more classes (e.g., classifying a news article as 'Sports', 'Politics', or 'Technology'). It uses a different function called Softmax instead of Sigmoid to calculate the probabilities for each class.
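
A minimal multinomial sketch, using a made-up three-class dataset with two features (scikit-learn's LogisticRegression switches to the multiclass setting automatically when y contains more than two labels):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up features; three classes: 0, 1, 2
X = np.array([[1.0, 0.1], [1.2, 0.2], [0.9, 0.3],   # class 0
              [0.1, 1.1], [0.2, 1.3], [0.3, 0.9],   # class 1
              [1.0, 1.0], [1.1, 1.2], [0.9, 1.1]])  # class 2
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

model = LogisticRegression()
model.fit(X, y)

# Softmax yields one probability per class, and they sum to 1
probs = model.predict_proba(np.array([[1.1, 0.2]]))[0]
print("Class probabilities:", np.round(probs, 3))
print("Predicted class:", model.predict(np.array([[1.1, 0.2]]))[0])
```

Unlike the binary case, predict_proba() now returns one column per class rather than just (Fail, Pass).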