From Zero to Hero: A Comprehensive Guide
Master Machine Learning from basics to advanced concepts with interactive examples and practical code snippets
Start Learning: Start your journey into the world of Machine Learning with fundamental concepts and algorithms.
What is Machine Learning?
Machine Learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. It focuses on developing computer programs that can access data and use it to learn for themselves.
Why Machine Learning?
A machine learning model can learn to classify emails as spam or not spam by analyzing thousands of labeled email examples.
# Simple example using scikit-learn
# Train a Naive Bayes classifier to flag spam from a handful of labeled emails.
# (Removed the unused `import numpy as np` — nothing here touches numpy.)
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer

# Sample data: four short emails with known labels
emails = [
    "Win money now! Click here!",
    "Meeting at 3pm tomorrow",
    "Free gift! Claim now!",
    "Project deadline next week"
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

# Vectorize text: each email becomes a bag-of-words count vector
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# Train model on the word-count features
model = MultinomialNB()
model.fit(X, labels)

# Predict: reuse the SAME fitted vectorizer (transform, not fit_transform)
new_email = ["Congratulations! You won a prize!"]
X_new = vectorizer.transform(new_email)
prediction = model.predict(X_new)
print("Spam" if prediction[0] == 1 else "Not Spam")
Supervised Learning: learning with labeled data — the model learns from input-output pairs.
from sklearn.linear_model import LinearRegression
import numpy as np

# Training set — features: [size_in_sqft, bedrooms, age]
X = np.array([
    [1500, 3, 5],
    [2000, 4, 3],
    [1200, 2, 8],
    [2500, 5, 2],
])
# Target: price in thousands
y = np.array([300, 450, 250, 550])

# Fit an ordinary least-squares model to the four example houses
regressor = LinearRegression()
regressor.fit(X, y)

# Estimate the price of an unseen house using the fitted coefficients
query = np.array([[1800, 3, 4]])
estimate = regressor.predict(query)
print(f"Predicted price: ${estimate[0]:.0f},000")
Unsupervised Learning: learning patterns from unlabeled data without predefined outputs.
from sklearn.cluster import KMeans
import numpy as np

# Customer data: [annual_spending, visits_per_year]
customers = np.array([
    [1200, 8], [3500, 15], [800, 5],
    [2800, 12], [1500, 10], [4200, 20]
])

# n_init=10 pins the number of centroid restarts so the result is stable
# across scikit-learn versions (the default changed to 'auto' in 1.4) and
# matches the other KMeans usage in this tutorial.
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
clusters = kmeans.fit_predict(customers)
print("Customer segments:", clusters)
print("Cluster centers:", kmeans.cluster_centers_)
Reinforcement Learning: learning through trial and error using rewards and penalties.
# Simplified Q-Learning concept (reinforcement learning)
# An agent interacts with an environment and learns, by trial and error,
# which action in each state maximizes its long-term reward.
# Used in: Chess, Go, Video Games, Robotics
#
# A Q-Table stores one value per (state, action) pair:
#   Q(state, action) = expected cumulative future reward
#
# Key components:
# - State: the current situation the agent observes (e.g. board position)
# - Action: a move the agent can make from that state
# - Reward: numeric feedback from the environment (+1 for win, -1 for loss)
# - Policy: the strategy for choosing actions (e.g. pick the max-Q action)
Data preprocessing is crucial for building effective ML models. Raw data is often incomplete, inconsistent, or in a format that's difficult to use.
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer

# Load data: a toy frame with missing values and categorical columns
data = pd.DataFrame({
    'age': [25, 30, np.nan, 35, 40],
    'salary': [50000, 60000, 70000, np.nan, 90000],
    'city': ['NYC', 'LA', 'NYC', 'Chicago', 'LA'],
    'purchased': ['Yes', 'No', 'Yes', 'No', 'Yes']
})

# 1. Handle missing values: fill NaNs with each column's mean
imputer = SimpleImputer(strategy='mean')
data[['age', 'salary']] = imputer.fit_transform(data[['age', 'salary']])

# 2. Encode categorical variables.
#    Use one encoder PER column: refitting a shared LabelEncoder overwrites
#    its classes_, which would break any later inverse_transform on 'city'.
city_encoder = LabelEncoder()
data['city'] = city_encoder.fit_transform(data['city'])
target_encoder = LabelEncoder()
data['purchased'] = target_encoder.fit_transform(data['purchased'])

# 3. Feature scaling: zero mean / unit variance per feature
scaler = StandardScaler()
features = ['age', 'salary', 'city']
X = scaler.fit_transform(data[features])
y = data['purchased']

# 4. Train-test split: hold out 20% for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Preprocessed data shape:", X_train.shape)
Linear Regression is the simplest regression algorithm that models the relationship between dependent and independent variables using a straight line.
Formula: y = mx + b (for simple) or y = b₀ + b₁x₁ + b₂x₂ + ... (for multiple)
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic data: y = 2.5x + 1.5 plus Gaussian noise
np.random.seed(42)
X = np.random.rand(100, 1) * 10
y = 2.5 * X.flatten() + 1.5 + np.random.randn(100) * 2

# Fit ordinary least squares and predict over the training inputs
reg = LinearRegression()
reg.fit(X, y)
y_pred = reg.predict(X)

# Quantify the fit: mean squared error and coefficient of determination
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"Coefficient: {reg.coef_[0]:.2f}")
print(f"Intercept: {reg.intercept_:.2f}")
print(f"MSE: {mse:.2f}")
print(f"R² Score: {r2:.2f}")

# Plot the raw points against the fitted line
plt.scatter(X, y, alpha=0.5, label='Data')
plt.plot(X, y_pred, 'r-', label='Prediction')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.title('Linear Regression')
plt.show()
Logistic Regression is used for binary classification problems. It uses the sigmoid function to output probabilities between 0 and 1.
Sigmoid Function: σ(z) = 1 / (1 + e^(-z))
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Build a synthetic binary-classification problem
X, y = make_classification(
    n_samples=1000, n_features=4, n_classes=2, random_state=42
)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the classifier: a sigmoid over a linear score yields P(class = 1)
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Hard labels and class probabilities for the held-out set
y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)

# Summarize performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2%}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# Show probability for first prediction
print(f"\nFirst prediction probability: {y_proba[0]}")
print(f"Predicted class: {y_pred[0]}")
Explore more sophisticated algorithms and techniques for complex problems
Decision Trees create a model that predicts the value of a target variable by learning simple decision rules inferred from data features.
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Load iris dataset (150 flowers, 4 features, 3 species)
iris = load_iris()
X, y = iris.data, iris.target

# Split data: hold out 20% for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train model; max_depth=3 keeps the tree small and interpretable
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Evaluate on the held-out set
accuracy = tree.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2%}")

# Visualize tree
plt.figure(figsize=(15, 10))
plot_tree(tree, feature_names=iris.feature_names,
          class_names=iris.target_names, filled=True)
plt.title("Decision Tree Visualization")
plt.show()

# Feature importance. Plain zip is enough here — the previous enumerate
# produced an index that was never used.
importances = tree.feature_importances_
for feature, importance in zip(iris.feature_names, importances):
    print(f"{feature}: {importance:.3f}")
Random Forest is an ensemble method that combines multiple decision trees to create a more robust and accurate model.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Generate data. n_informative=5 is required here: make_classification
# enforces n_classes * n_clusters_per_class <= 2**n_informative, and the
# default n_informative=2 raises ValueError for 3 classes (3 * 2 > 4).
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_classes=3,
    n_informative=5,
    random_state=42
)

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train Random Forest: an ensemble of decorrelated decision trees
rf = RandomForestClassifier(
    n_estimators=100,  # Number of trees
    max_depth=10,
    random_state=42
)
rf.fit(X_train, y_train)

# Predictions
y_pred = rf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2%}")
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))

# Feature importance (dropped the f-prefix: the literal has no placeholders)
importances = rf.feature_importances_
print("\nTop 5 Most Important Features:")
indices = importances.argsort()[-5:][::-1]
for i in indices:
    print(f"Feature {i}: {importances[i]:.3f}")
SVM finds the optimal hyperplane that separates classes with maximum margin.
from sklearn.svm import SVC
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# (Removed the unused `import numpy as np` — nothing here touches numpy.)

# Generate non-linearly separable data: two concentric circles
X, y = make_circles(n_samples=500, noise=0.1, factor=0.2, random_state=42)

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# SVM with RBF kernel (handles non-linear data)
svm_rbf = SVC(kernel='rbf', C=1.0, gamma='scale')
svm_rbf.fit(X_train, y_train)

# SVM with polynomial kernel
svm_poly = SVC(kernel='poly', degree=3, C=1.0)
svm_poly.fit(X_train, y_train)

# Evaluate both kernels on the held-out set
rbf_acc = accuracy_score(y_test, svm_rbf.predict(X_test))
poly_acc = accuracy_score(y_test, svm_poly.predict(X_test))
print(f"RBF Kernel Accuracy: {rbf_acc:.2%}")
print(f"Polynomial Kernel Accuracy: {poly_acc:.2%}")

# Show support vectors (the training points that define the margin)
print(f"\nNumber of support vectors (RBF): {len(svm_rbf.support_vectors_)}")
K-Means is an unsupervised clustering algorithm that partitions data into k clusters based on similarity.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
import numpy as np

# Generate sample data: 300 points around 4 Gaussian blobs
X, y_true = make_blobs(
    n_samples=300,
    centers=4,
    n_features=2,
    random_state=42
)

# Apply K-Means with the (known) number of clusters
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
y_pred = kmeans.fit_predict(X)

# Visualize true vs. discovered clusters side by side
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], X[:, 1], c=y_true, cmap='viridis')
plt.title('True Clusters')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.subplot(1, 2, 2)
plt.scatter(X[:, 0], X[:, 1], c=y_pred, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0],
            kmeans.cluster_centers_[:, 1],
            c='red', marker='x', s=200, linewidths=3)
plt.title('K-Means Clusters')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.tight_layout()
plt.show()

# Find optimal k using Elbow Method
inertias = []
K_range = range(1, 10)
for k in K_range:
    # n_init=10 matches the model above and keeps results stable across
    # scikit-learn versions (the default changed to 'auto' in 1.4)
    kmeans_temp = KMeans(n_clusters=k, random_state=42, n_init=10)
    kmeans_temp.fit(X)
    inertias.append(kmeans_temp.inertia_)
plt.figure(figsize=(8, 5))
plt.plot(K_range, inertias, 'bo-')
plt.xlabel('Number of Clusters (k)')
plt.ylabel('Inertia')
plt.title('Elbow Method for Optimal k')
plt.show()
Neural Networks are computing systems inspired by biological neural networks. They consist of interconnected nodes (neurons) organized in layers.
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Generate data. n_informative=5 is required here: make_classification
# enforces n_classes * n_clusters_per_class <= 2**n_informative, and the
# default n_informative=2 raises ValueError for 3 classes (3 * 2 > 4).
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_classes=3,
    n_informative=5,
    random_state=42
)

# Preprocess: standardize features — neural nets train poorly unscaled
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Create Neural Network
mlp = MLPClassifier(
    hidden_layer_sizes=(100, 50),  # Two hidden layers
    activation='relu',
    solver='adam',
    alpha=0.01,  # L2 penalty
    learning_rate='adaptive',
    max_iter=500,
    random_state=42
)

# Train
mlp.fit(X_train, y_train)

# Predict and evaluate on the held-out set
y_pred = mlp.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2%}")
print(f"Number of layers: {mlp.n_layers_}")
print(f"Number of iterations: {mlp.n_iter_}")
print(f"Loss: {mlp.loss_:.4f}")
Master cutting-edge techniques and deep learning architectures
Deep Learning uses neural networks with multiple hidden layers to learn complex patterns in data.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# Load MNIST: 28x28 grayscale digit images with integer labels 0-9
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Flatten images. -1 infers the sample count instead of hard-coding
# 60000/10000, so the same code works on any subset of the data.
x_train = x_train.reshape((-1, 28 * 28))
x_test = x_test.reshape((-1, 28 * 28))

# Fully-connected classifier with dropout regularization
model = keras.Sequential([
    layers.Dense(512, activation='relu', input_shape=(28 * 28,)),
    layers.Dropout(0.2),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(10, activation='softmax')  # one probability per digit
])

# Compile: integer labels pair with sparse categorical cross-entropy
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train, holding out 20% of the training set for validation
history = model.fit(
    x_train, y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.2
)

# Evaluate on the untouched test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.2%}")
CNNs are specialized neural networks for processing grid-like data such as images. They use convolutional layers to detect patterns.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load CIFAR-10: 32x32 RGB images in 10 categories
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

# Scale pixel values into [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Convolutional feature extractor followed by a small dense classifier
model = keras.Sequential([
    # Convolutional base: alternate conv + pooling to shrink spatial size
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    # Dense classifier over the flattened feature maps
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax'),
])
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

# Fit for 10 epochs, tracking test-set metrics after every epoch
model.fit(
    x_train, y_train,
    epochs=10,
    batch_size=64,
    validation_data=(x_test, y_test),
)

# Final held-out evaluation
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.2%}")
RNNs are designed for sequence data. LSTMs (Long Short-Term Memory) are a special type that can learn long-term dependencies.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Generate time series data
def generate_sequence(length):
    """Return a noisy sine wave sampled every 0.1 over [0, length)."""
    time = np.arange(0, length, 0.1)
    # Size the noise from the actual sample count instead of `length * 10`,
    # which was only correct because the step happened to be 0.1.
    data = np.sin(time) + np.random.normal(0, 0.1, time.size)
    return data

# Prepare sequences
def create_sequences(data, seq_length):
    """Slice `data` into (window, next_value) supervised pairs."""
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

# Generate data: 100 samples; predict each point from the previous 20
data = generate_sequence(10)
seq_length = 20
X, y = create_sequences(data, seq_length)

# Reshape for LSTM: [samples, time_steps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

# Chronological 80/20 split (no shuffling for time series)
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Two stacked LSTM layers; the first returns the full sequence so the
# second can consume it
model = keras.Sequential([
    layers.LSTM(50, activation='relu', return_sequences=True, input_shape=(seq_length, 1)),
    layers.LSTM(50, activation='relu'),
    layers.Dense(1)  # regression head: predict the next value
])
model.compile(optimizer='adam', loss='mse')

# Train
model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)

# Evaluate MSE on both splits
train_loss = model.evaluate(X_train, y_train, verbose=0)
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f"Train Loss: {train_loss:.4f}")
print(f"Test Loss: {test_loss:.4f}")
NLP enables machines to understand, interpret, and generate human language.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import LabelEncoder

# Sample text data: three positive and three negative reviews
texts = [
    "I love this product! It's amazing.",
    "Terrible service, very disappointed.",
    "Great quality, highly recommend!",
    "Poor quality, waste of money.",
    "Excellent customer service and fast delivery.",
    "Not worth the price, cheaply made."
]
labels = ['positive', 'negative', 'positive', 'negative', 'positive', 'negative']

# Map the string labels onto integers
le = LabelEncoder()
y = le.fit_transform(labels)

# TF-IDF features: weight words by how informative they are
vectorizer = TfidfVectorizer(max_features=100, stop_words='english')
X = vectorizer.fit_transform(texts)

# 70/30 split of the six documents
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Naive Bayes works well on sparse TF-IDF features
model = MultinomialNB()
model.fit(X_train, y_train)

# Score the held-out documents
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2%}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=le.classes_))

# Classify a brand-new sentence with the fitted vectorizer + model
new_text = ["This is a fantastic product!"]
new_vector = vectorizer.transform(new_text)
prediction = model.predict(new_vector)
print(f"\nPrediction: {le.inverse_transform(prediction)[0]}")
Proper evaluation is crucial for understanding model performance and making improvements.
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report, roc_auc_score, roc_curve
)
import matplotlib.pyplot as plt

# Synthetic binary-classification data
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=2, random_state=42
)

# 80/20 train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a forest of 100 trees
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Hard predictions plus positive-class probabilities (needed for ROC-AUC)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

# Core classification metrics, computed inline
print("Model Evaluation Metrics:")
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2%}")
print(f"Precision: {precision_score(y_test, y_pred):.2%}")
print(f"Recall: {recall_score(y_test, y_pred):.2%}")
print(f"F1-Score: {f1_score(y_test, y_pred):.2%}")
print(f"ROC-AUC: {roc_auc_score(y_test, y_proba):.2%}")
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# 5-fold cross-validation over the full dataset
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
print(f"\nCross-Validation Accuracy: {cv_scores.mean():.2%} (+/- {cv_scores.std() * 2:.2%})")
Hyperparameter tuning optimizes model performance by finding the best configuration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=2, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate hyperparameter values (3 * 4 * 3 * 3 = 108 combinations)
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7, None],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Base model — the searches clone it, so it is never fitted itself
rf = RandomForestClassifier(random_state=42)

# Grid Search: exhaustively tries every combination with 5-fold CV
print("Performing Grid Search...")
grid_search = GridSearchCV(
    rf, param_grid, cv=5,
    scoring='accuracy', n_jobs=-1, verbose=1
)
grid_search.fit(X_train, y_train)
print(f"\nBest parameters: {grid_search.best_params_}")
print(f"Best cross-validation score: {grid_search.best_score_:.2%}")

# Re-score the winning configuration on the untouched test set
best_model = grid_search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2%}")

# Random Search: samples 20 random combinations (faster for large spaces)
print("\nPerforming Random Search...")
random_search = RandomizedSearchCV(
    rf, param_grid, n_iter=20, cv=5,
    scoring='accuracy', n_jobs=-1, random_state=42
)
random_search.fit(X_train, y_train)
print(f"Best parameters: {random_search.best_params_}")
print(f"Best cross-validation score: {random_search.best_score_:.2%}")