Supervised Learning in Python: Classification and Regression
Supervised learning is where machine learning gets practical. You have labeled data and want to train a model that predicts outcomes for new inputs. The Python ecosystem, centered on scikit-learn, provides implementations of every major supervised algorithm -- from simple linear regression to ensemble methods like random forests and gradient boosting.
This learning path covers both classification (predicting categories) and regression (predicting values), organized by algorithm family so you can understand the relationships between models and make informed choices.
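The workflow the path builds toward can be sketched in a few lines. This is a minimal, illustrative example of the fit/predict cycle using scikit-learn's bundled iris dataset; the model choice (logistic regression) is arbitrary here and any classifier from the path would slot in the same way.

```python
# A minimal sketch of the supervised-learning workflow: labeled data in,
# a fitted model out, evaluated on inputs the model has never seen.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hold out a test set so performance is measured on unseen inputs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)             # learn from labeled examples
accuracy = model.score(X_test, y_test)  # evaluate generalization
print(f"Test accuracy: {accuracy:.2f}")
```

Every estimator in scikit-learn follows this same `fit`/`predict`/`score` interface, which is what makes swapping algorithms in and out of the workflow straightforward.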
Overview and Model Selection
4 articles
Python Supervised Learning Models

Overview of supervised learning categories, the bias-variance trade-off, and choosing the right model.
Python Classification Models
Survey of classification algorithms with strengths, weaknesses, and use cases for each.
Python Regression Models
Survey of regression algorithms from linear through polynomial, tree-based, and ensemble methods.
Python Model Selection
Cross-validation, hyperparameter tuning, grid search, and strategies for selecting the best model.
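The model-selection techniques listed above can be sketched briefly. This is an illustrative example, not a recipe: the SVC estimator and the parameter grid values are placeholder choices to show the mechanics of `cross_val_score` and `GridSearchCV`.

```python
# Cross-validation and grid search, the core tools of model selection.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 5-fold CV gives a more stable accuracy estimate than a single split.
scores = cross_val_score(SVC(), X, y, cv=5)
print("CV accuracy per fold:", scores)

# Grid search tries every parameter combination, scoring each with CV.
grid = GridSearchCV(
    SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=5
)
grid.fit(X, y)
print("Best params:", grid.best_params_)
print("Best CV accuracy:", round(grid.best_score_, 3))
```

`GridSearchCV` refits the best configuration on the full dataset by default, so the fitted `grid` object can then be used directly for prediction.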
Tree-Based and Ensemble Methods
5 articles
Python Decision Trees
How decision trees split data, entropy, information gain, pruning, and scikit-learn implementation.
Python Decision Tree Regression
Using decision trees for regression tasks, handling overfitting, and tree depth optimization.
Random Forest in Python
Ensemble of decision trees -- bagging, feature importance, out-of-bag scoring, and hyperparameter tuning.
Python Random Forest Regression
Random forests for continuous prediction tasks with feature selection and performance optimization.
Python Ensemble Methods
Bagging, boosting, stacking, and voting classifiers for improved prediction accuracy.
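Two of the ideas covered in this section, out-of-bag scoring and feature importances, can be shown in one short sketch. The dataset choice (breast cancer) and hyperparameters are illustrative assumptions.

```python
# Random forest with out-of-bag scoring and feature importances.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# oob_score=True evaluates each tree on the samples its bootstrap
# draw left out -- a built-in validation estimate, no holdout needed.
forest = RandomForestClassifier(
    n_estimators=200, oob_score=True, random_state=0
)
forest.fit(X, y)

print(f"Out-of-bag accuracy: {forest.oob_score_:.3f}")
# Importances are normalized to sum to 1.0 across all features.
print("Largest feature importance:", forest.feature_importances_.max())
```

The same `RandomForestRegressor` API applies the bagging idea to continuous targets, as the regression articles above cover.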
Linear Models and SVMs
6 articles
Python Logistic Regression for Binary Classification
Logistic regression fundamentals, sigmoid function, decision boundaries, and multi-class extension.
Python Linear Correlation
Pearson, Spearman, and Kendall correlation analysis with statistical significance testing.
Python Lasso Regression
L1 regularization, feature selection through lasso, and tuning the regularization parameter.
Python Ridge Regression
L2 regularization, preventing overfitting, and comparing ridge with lasso and elastic net.
Python Support Vector Machines (SVM)
SVM classification with kernels, margin maximization, and hyperparameter tuning.
Python Support Vector Regression (SVR)
Using SVMs for regression with epsilon-insensitive loss and kernel tricks.
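The practical difference between the L1 and L2 regularization covered above is easy to see in code: lasso drives some coefficients exactly to zero (performing feature selection), while ridge only shrinks them. The dataset and `alpha` value here are illustrative.

```python
# Contrasting L1 (lasso) and L2 (ridge) regularization on one dataset.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge

X, y = load_diabetes(return_X_y=True)

lasso = Lasso(alpha=1.0).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso zeroes out weak features entirely; ridge keeps them all small.
lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))
print("Lasso zero coefficients:", lasso_zeros)
print("Ridge zero coefficients:", ridge_zeros)
```

Elastic net, mentioned in the ridge article, blends both penalties and is available as `sklearn.linear_model.ElasticNet`.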
Instance-Based and Probabilistic Models
3 articles
Python K-Nearest Neighbors (KNN)
KNN for classification and regression, distance metrics, k selection, and scaling requirements.
Python Naive Bayes Classification
Gaussian, multinomial, and Bernoulli naive Bayes with text classification examples.
Python Linear Discriminant Analysis
LDA for dimensionality reduction and classification with class separability optimization.
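The scaling requirement mentioned in the KNN article is worth seeing concretely: distance-based models are sensitive to feature magnitudes. This sketch compares KNN accuracy with and without standardization; the wine dataset and `k=5` are illustrative choices.

```python
# KNN with and without feature scaling, evaluated by cross-validation.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# A pipeline fits the scaler inside each CV fold, avoiding data leakage.
knn_scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scaled_acc = cross_val_score(knn_scaled, X, y, cv=5).mean()
raw_acc = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5).mean()

print(f"With scaling:    {scaled_acc:.3f}")
print(f"Without scaling: {raw_acc:.3f}")
```

Wrapping preprocessing in a pipeline is the idiomatic way to combine scaling with any of the instance-based models in this section.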