{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Ensemble Methods: Random Forest - A detailed overview\n", "\n", "Rafiq Islam \n", "2024-10-07\n", "\n", "## Introduction" ], "id": "64e3c09a-7a97-4e77-8136-3ef45e206674" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "

" ], "id": "0c43cd84-05c3-4ef7-999e-a3c9c3536f77" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Random Forest is one of the most popular machine learning algorithms,\n", "known for its simplicity, versatility, and ability to perform both\n", "classification and regression tasks. It operates by constructing a\n", "multitude of decision trees during training and outputs the mode of the\n", "classes (for classification) or the mean prediction (for regression) of\n", "the individual trees." ], "id": "2ce26d62-f7ca-4176-ae37-28dc73420423" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "

" ], "id": "b4fe3632-c8e4-420f-a66c-ee17e88ffb04" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Random Forest?" ], "id": "a9088f67-932c-4f11-bad1-3c5b1fe43360" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "

" ], "id": "370328ab-f50c-49b2-b9fd-03979bb1ac0f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Random Forest is an ensemble learning method that builds multiple\n", "decision trees and combines their predictions to obtain a more accurate\n", "and stable result. Each tree is built using a different random subset of\n", "the data, and at each node, a random subset of features is considered\n", "when splitting the data." ], "id": "2c19668f-3ca7-467f-a650-0011a379f4da" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "

" ], "id": "0a853211-a279-44ff-85d6-f278368d85b1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Classification:** The final output is determined by majority\n", " voting from all the decision trees\n", "- **Regression:** The output is the average of all tree predictions.\n", "\n", "## Mathematics Behind Random Forest\n", "\n", "To understand Random Forest, we first need to recap how a decision tree\n", "works and then explore how Random Forest extends this idea.\n", "\n", "### Decision Tree Recap\n", "\n", "A decision tree is a tree-structured model where each internal node\n", "represents a “test” on an attribute (e.g., whether the feature value is\n", "above or below a threshold), each branch represents the outcome of the\n", "test, and each leaf node represents a class label (classification) or a\n", "value (regression).\n", "\n", "- For **classification**, the goal is to partition the data such that\n", " the class labels in each partition are as homogeneous as possible. \n", "- For **regression**, the goal is to minimize the variance of the\n", " predicted values.\n", "\n", "Mathematically, the decision tree makes decisions by minimizing the\n", "**Gini Index** or **Entropy** for classification tasks and minimizing\n", "the **Mean Squared Error (MSE)** for regression tasks.\n", "\n", "### Random Forest Algorithm\n", "\n", "Random Forest enhances decision trees by employing two key concepts:\n", "\n", "- **Random Sampling (Bootstrap Sampling):** From the training set of\n", " size $N$, randomly draw $N$ samples with replacement. \n", "- **Feature Subsampling:** At each node of the decision tree, a random\n", " subset of the features is selected, and the best split is chosen\n", " only from these features.\n", "\n", "The process for building a Random Forest can be summarized as follows:\n", "\n", "1. Draw $B$ bootstrap samples from the original dataset.\n", "2. 
For each bootstrap sample, grow an unpruned decision tree using a\n", " random subset of features at each node.\n", "3. For **classification**, combine the predictions of all the trees by\n", " majority voting.\n", "4. For **regression**, combine the predictions by averaging the outputs\n", " of all trees.\n", "\n", "### Random Forest for Classification\n", "\n", "For classification tasks, Random Forest works by constructing multiple\n", "decision trees, each built on a different bootstrap sample of the data and\n", "restricted to a random subset of the features at each split.\n", "\n", "Given a dataset $D = \\{(x_1, y_1), (x_2, y_2), \\dots, (x_N, y_N)\\}$, where\n", "$x_i$ is a feature vector and $y_i$ is the class label, Random Forest\n", "generates $B$ decision trees $T_1, T_2, \\dots, T_B$.\n", "\n", "For each test point $x$, each tree $T_b$ gives a class prediction: $$\n", "\\hat{y}_b(x) = T_b(x)\n", "$$ The final prediction is determined by majority voting: $$\n", "\\hat{y}(x) = \\text{argmax}_k \\sum_{b=1}^{B} I(\\hat{y}_b(x) = k)\n", "$$ where $I(\\cdot)$ is an indicator function that equals 1 if the\n", "condition is true and 0 otherwise.\n", "\n", "------------------------------------------------------------------------\n", "\n", "### Random Forest for Regression\n", "\n", "In regression tasks, Random Forest builds trees that predict continuous\n", "values and averages the results.\n", "\n", "Given a dataset $D = \\{(x_1, y_1), (x_2, y_2), \\dots, (x_N, y_N)\\}$, where\n", "$x_i$ is a feature vector and $y_i$ is the continuous target variable,\n", "Random Forest generates $B$ decision trees $T_1, T_2, \\dots, T_B$.\n", "\n", "For each test point $x$, each tree $T_b$ gives a predicted value: $$\n", "\\hat{y}_b(x) = T_b(x)\n", "$$ The final prediction is the average of all the tree predictions: $$\n", "\\hat{y}(x) = \\frac{1}{B} \\sum_{b=1}^{B} \\hat{y}_b(x)\n", "$$\n", "\n", "------------------------------------------------------------------------\n", "\n", "## Assumptions of Random Forest\n", "\n", 
"Random Forest makes few assumptions about the data, making it highly\n", "flexible. The main points to keep in mind are:\n", "\n", "- **Independent Features:** While Random Forest does not explicitly\n", " assume that features are independent, strongly correlated features can\n", " reduce its performance slightly. \n", "- **Noisy Data:** Random Forest is robust to noise due to its ensemble\n", " nature. \n", "- **Non-linearity:** Random Forest can handle non-linear relationships\n", " between features and the target.\n", "\n", "## Advantages of Random Forest\n", "\n", "- **Reduction of Overfitting:** Random Forest reduces overfitting by\n", " averaging the predictions of multiple trees.\n", "- **Handles Missing Data:** Some implementations can fill in missing\n", " feature values, for example with the most frequent category or the\n", " mean of the observed values; others require imputation before\n", " training.\n", "- **Robust to Noise:** It is relatively resistant to outliers and\n", " noise due to its ensemble nature.\n", "- **Works with Categorical & Continuous Variables:** Random Forest can\n", " handle both categorical and continuous data types, although some\n", " libraries (including `scikit-learn`) require categorical features to\n", " be encoded numerically first.\n", "- **Feature Importance:** It provides an estimate of feature\n", " importance, which helps with interpreting the model.\n", "\n", "## Disadvantages of Random Forest\n", "\n", "- **Complexity:** The algorithm is computationally intensive,\n", " especially with a large number of trees.\n", "- **Interpretability:** While decision trees are interpretable, Random\n", " Forest is a “black-box” model where it’s hard to understand\n", " individual predictions.\n", "- **Memory Usage:** Random Forest can require a lot of memory to store\n", " many deep decision trees.\n", "- **Bias in Imbalanced Data:** For classification tasks with\n", " imbalanced data, Random Forest may be biased toward the majority\n", " class.\n", "\n", "------------------------------------------------------------------------\n", "\n", "## Python Implementation\n", "\n", "Here is a Python code example of how to implement Random 
Forest for both\n", "classification and regression using `scikit-learn`." ], "id": "e3b657c9-22e9-4b6e-af3d-0fbc8512cb05" }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Classification Accuracy: 1.0\n", "Regression Mean Squared Error: 9.619662013157892" ] } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import accuracy_score, mean_squared_error\n", "from sklearn.datasets import load_iris\n", "\n", "# Classification Example: Iris dataset\n", "iris = load_iris()\n", "X, y = iris.data, iris.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n", "\n", "# Initialize RandomForest Classifier\n", "clf = RandomForestClassifier(n_estimators=100, random_state=42)\n", "clf.fit(X_train, y_train)\n", "\n", "# Predict and evaluate\n", "y_pred = clf.predict(X_test)\n", "accuracy = accuracy_score(y_test, y_pred)\n", "print(f\"Classification Accuracy: {accuracy}\")\n", "\n", "# Regression Example: Boston Housing dataset\n", "data_url = \"http://lib.stat.cmu.edu/datasets/boston\"\n", "# Raw string avoids an invalid-escape warning for the whitespace regex\n", "raw_df = pd.read_csv(data_url, sep=r\"\\s+\", skiprows=22, header=None)\n", "# Each record spans two physical rows: stitch even rows to the first two columns of odd rows\n", "data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])\n", "# The target is the third column of every odd row\n", "target = raw_df.values[1::2, 2]\n", "\n", "X = data\n", "y = target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n", "\n", "# Initialize RandomForest Regressor\n", "reg = RandomForestRegressor(n_estimators=100, random_state=42)\n", "reg.fit(X_train, y_train)\n", "\n", "# Predict and evaluate\n", "y_pred = reg.predict(X_test)\n", "mse = mean_squared_error(y_test, y_pred)\n", "print(f\"Regression Mean Squared Error: {mse}\")" ], "id": "ca403883" }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"## Hyperparameter Tuning for Random Forest\n", "\n", "Tuning the hyperparameters of a Random Forest can significantly improve\n", "its performance. Here are some important hyperparameters to consider:\n", "\n", "### Important Hyperparameters\n", "\n", "- **`n_estimators`:** This is the number of trees in the forest.\n", "  Increasing this number usually improves performance, with diminishing\n", "  returns, but also increases computational cost.\n", "    - **Tip:** Start with a default value of 100 and increase as\n", "      needed. \n", "- **`max_depth`:** The maximum depth of each tree. Deeper trees can\n", "  model more complex relationships, but they also increase the risk of\n", "  overfitting.\n", "    - **Tip:** Use cross-validation to find the optimal depth that\n", "      balances bias and variance. \n", "- **`min_samples_split`:** The minimum number of samples required to\n", "  split an internal node. Higher values prevent the tree from becoming\n", "  too specific (overfitting).\n", "    - **Tip:** Use higher values (e.g., 5 or 10) to reduce overfitting\n", "      in noisy datasets.\n", "- **`min_samples_leaf`:** The minimum number of samples required to be\n", "  at a leaf node. Larger leaf sizes reduce model complexity and can\n", "  help generalization.\n", "- **`max_features`:** The number of features to consider when looking\n", "  for the best split. Randomly selecting fewer features can reduce\n", "  correlation between trees and improve generalization.\n", "    - **Tip:** For classification, a common choice is\n", "      `sqrt(number_of_features)`. For regression,\n", "      `max_features = number_of_features / 3` is often effective.\n", "- **`bootstrap`:** Whether to use bootstrap samples when building\n", "  trees. 
Set this to `True` (the default) to train each tree on a\n", "  bootstrap sample. With `False`, every tree is trained on the full\n", "  training set, so the only remaining randomness comes from feature\n", "  subsampling. Note that this alone does not produce extremely\n", "  randomized trees (ExtraTrees), which also randomize the split\n", "  thresholds.\n", "\n", "### Grid Search for Hyperparameter Tuning\n", "\n", "To fine-tune the hyperparameters of a Random Forest, we can use\n", "**GridSearchCV** or **RandomizedSearchCV** in `scikit-learn`. Here’s an\n", "example of how to use `GridSearchCV` for tuning a Random Forest\n", "Classifier:" ], "id": "8fcdf5b5-0762-4f79-aba4-5406f0f6624d" }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Best Hyperparameters: {'max_depth': None, 'max_features': 'sqrt', 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 100}\n", "Accuracy with Best Parameters: 1.0" ] } ], "source": [ "from sklearn.model_selection import GridSearchCV\n", "\n", "# Candidate values for each hyperparameter (3 * 4 * 3 * 3 * 3 = 324 combinations)\n", "param_grid = {\n", "    'n_estimators': [100, 200, 300],\n", "    'max_depth': [None, 10, 20, 30],\n", "    'min_samples_split': [2, 5, 10],\n", "    'min_samples_leaf': [1, 2, 4],\n", "    'max_features': ['sqrt', 'log2', None]\n", "}\n", "\n", "iris = load_iris()\n", "X, y = iris.data, iris.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n", "\n", "# Initialize Random Forest Classifier\n", "clf = RandomForestClassifier(random_state=42)\n", "\n", "# Perform grid search with 5-fold cross-validation on the training set\n", "grid_search = GridSearchCV(estimator=clf, param_grid=param_grid, cv=5, n_jobs=-1, verbose=0)\n", "grid_search.fit(X_train, y_train)\n", "\n", "# Best parameters from grid search\n", "print(\"Best Hyperparameters:\", grid_search.best_params_)\n", "\n", "# Evaluate with best parameters on the held-out test set\n", "best_model = grid_search.best_estimator_\n", "y_pred = best_model.predict(X_test)\n", "accuracy = accuracy_score(y_test, y_pred)\n", "print(f\"Accuracy with Best Parameters: {accuracy}\")" ], "id": "0047730b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using this technique, we can find the combination of 
hyperparameters\n", "that yields the best model performance.\n", "\n", "## Feature Importance in Random Forest\n", "\n", "One of the appealing aspects of Random Forest is that it provides a\n", "measure of **feature importance**, which indicates how much each feature\n", "contributes to the model’s predictions.\n", "\n", "### Computing Feature Importance\n", "\n", "In Random Forest, feature importance is computed by measuring the\n", "**average reduction in impurity** (e.g., Gini impurity or MSE) brought\n", "by each feature across all trees. Features that lead to larger\n", "reductions are considered more important." ], "id": "c7c31e4d-4f4d-4586-b44e-a5f42e7c18aa" }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "output_type": "display_data", "metadata": {}, "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA7YAAAI2CAYAAABkPRT0AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90\nbGliIHZlcnNpb24zLjkuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy80BEi2AAAACXBIWXMAAA7E\nAAAOxAGVKw4bAABEEklEQVR4nO3deZxVdcE/8M8giMO+qoDL4J6BgWupmCmCuaBUtqhpmlhi+oiR\nZdqDgT7q7zHNFjQXFC1tUUErBTXNpcw0xcTcUBB3Zd8GmIH7+8OX8zQBygh478H3+/U6r9fc7zn3\n3s8MHud+5pzzPVWlUqkUAAAAKKhm5Q4AAAAAa0KxBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAKTbEF\nAACg0BRbAAAACk2xBQAAoNAUWwAAAApNsQXgQ3Httdemqqpqpcvdd9+91t/vz3/+c84555wsX758\nrb/2mnj35zBlypRyR2mySv2ZAoBiC8CH6ne/+10eeuihRsvuu+++1t/nz3/+c374wx8qYWuRnykA\nlap5uQMA8NHSp0+fbLPNNuWO8YGUSqXU1dVlww03LHeUD1VdXV2aN/eRAYDK5YgtABVj0aJF+e53\nv5uePXtmww03TM+ePXPeeec1OkK4ePHiDBs2LL169UqbNm2y6aab5tBDD80zzzzTsM0555yTH/7w\nh0mSFi1aNJzynLxz1LGqqip//vOfG733u6cIT5s2rWGspqYmRx99dMaMGZMddtghG264Yf74xz8m\nSZ544okMGjQoHTt2THV1dfbaa6888MADH+j73nfffbP33ntnwoQJ6dOnT6qrq9O3b988/PDDqa+v\nz/e///1069YtnTp1yte+9rUsXLiw4bnTpk1LVVVVRo8endNPPz0bb7xxWrVqlUMOOaTR95K8U1DP\nPvvs1NTUZMMNN0xNTU3OPvvs1NXVrfT1zjjjjHTv3j0tW7bMaaedtsqfaZKMGDEiO++8c9q1a5cu\nXbpkv/32y9/+9rdG7//uz/62227Lt771rXTp0iVdunTJ0UcfnTlz5jTatr6+PhdeeGF23HHHbLTR\nRunatWsOPPDARv/Ob7/9dr75zW+mR48eadmyZXbYYYdcccUVH+jfAIBi8+dXAD5Uy5YtS319fcPj\nqqqqbL
DBBqmvr8/AgQPzr3/9Kz/4wQ/Su3fv/O1vf8uoUaMya9as/OhHP0qSLFmyJPPnz8/ZZ5+d\nbt26ZdasWRk9enQ+9alP5emnn86mm26aE044Ia+88kquvvrqPPjgg9lggw0+cN577703kyZNyogR\nI7LxxhunpqYmjz32WPr165e+ffvmyiuvTKtWrXL55Zenf//++etf/5pddtmlye8zZcqUfOc738lZ\nZ52VNm3a5IwzzsigQYMyaNCg1NfX59prr83TTz+d73znO9l4443z//7f/2v0/PPPPz99+vTJNddc\nk7feeivf//73M2DAgDz11FNp0aJFkuTYY4/Nb3/723z/+9/P3nvvnb/+9a8577zz8uKLL+aGG25o\n9HrnnXdedtttt1xxxRVZtmxZdt555yxcuHCVP9NXX301w4YNy2abbZaFCxfml7/8ZfbZZ5/84x//\nSO/evRtt+1//9V855JBDcsMNN+TZZ5/NGWeckQ022CBjx45t2ObLX/5yxo8fn9NOOy39+/fP4sWL\nc//99+f111/PDjvskHnz5mXvvfdObW1tzjnnnPTs2TMTJ07MSSedlCVLluSUU05p8r8BAAVWAoAP\nwTXXXFNKssKy1157lUqlUum6664rJSndd999jZ537rnnllq0aFF68803V/q69fX1pYULF5batGlT\nuvjiixvGR4wYUUpSqqura7T9vffeW0pSuvfee1eab+rUqQ1jW265Zam6urr0+uuvN9p2v/32K+2w\nww6lJUuWNMqxww47lA477LDV+jk8//zzDWOf/vSnS82bNy+98MILDWO33nprKUlp//33b/T8wYMH\nl2pqahoeT506tZSk9LGPfay0bNmyhvEHH3ywlKR01VVXlUqlUunJJ58sJSmNGDGi0euNGjWqlKT0\nxBNPNHq9vn37lpYvX95o21X9TP9TfX19qa6urrTddtuVTj311Ibxd3/2xxxzTKPtTz755FLLli0b\n3u9Pf/pTKUnp0ksvXeV7jBw5stSyZcvSc88912j8hBNOKHXu3Pl9MwKwfnEqMgAfqnHjxuWRRx5p\nWK6++uokyYQJE7Lllltmzz33TH19fcMyYMCA1NXVNTqt9be//W322GOPdOjQIc2bN0/r1q2zYMGC\nPPvss2s97yc/+clsuummDY9ra2tz33335YgjjkizZs0acpZKpfTv3z/333//B3qf7bbbLltttVXD\n4x122CFJMnDgwEbb7bDDDnnllVdSKpUajX/hC19Is2b/92t9r732ymabbZaHHnooSRpyHX300Y2e\n9+7j++67r9H44Ycf3uhU4/dz99135zOf+Uw6d+6c5s2bp0WLFnnuuedW+m9y8MEHN3rcu3fvLFmy\nJG+++WaS5M4770xVVVWGDBmyyvebMGFC9thjj/Ts2bPRfy8DBw7MzJkz869//Wu1swNQfE5FBuBD\n1atXr5VOHvXWW2/lpZdeajht9j/NnDkzSfL73/8+X/rSl3LsscdmxIgR6dKlS5o1a5aDDjooixcv\nXut5u3Xr1ujxrFmzsmzZsowaNSqjRo1a6XOWL1/eqGSujo4dOzZ6/O4EVSsbr6+vz7JlyxpN6LTJ\nJpus8JqbbLJJXn311YbcK/t+3i3t765/139u914ee+yxHHTQQRk4cGCuvvrqdOvWLRtssEFOOOGE\nlf6bdOrUqdHjli1bJknDtjNnzkynTp1SXV29yvd86623MmXKlPf97wWAjwbFFoCK0Llz5/Ts2TO/\n/e1vV7q+pqYmSfLrX/8622yzTa699tqGdXV1dSsUs1XZaKONkiRLly5tNL6qIvSfRy07dOiQZs2a\n5eSTT84xxxyz0uc0tdSuDe8e7fzPsT59+iT5vzL5xhtvZOutt27Y5o033mi0/l1NOVp78803p3nz\n5rnlllsaFc3Zs2enQ4cOq/067+rSpUtmzZqV2traVZbbzp07Z+ONN86l
l1660vXbb799k98XgOJS\nbAGoCAceeGBuvvnmtGnTpuE03JVZtGjRCreeuf7667Ns2bJGY+8eBaytrU3btm0bxrfccsskyeTJ\nkzNgwICG8XdnO34/rVu3Tr9+/fLEE09k5513LkuJXZmbbrop55xzTkOev/zlL3nllVfyqU99Kkmy\nzz77JHnnDwNnnXVWw/N+9atfJXlnZub3s6qf6aJFi7LBBhs0KsP33HNPpk+fnp49ezb5exkwYEAu\nuOCCXHXVVaucBOrAAw/MT3/602yxxRbZeOONm/weAKxfFFsAKsJRRx2Va665Jvvvv3++/e1v5xOf\n+ESWLl2aF154IbfddlvGjx+fVq1a5cADD8z48eMzbNiwHHLIIXn00Ufz05/+dIUjgzvuuGOS5Ec/\n+lE++9nPZoMNNsiuu+6abt265dOf/nTOP//8dOnSJRtvvHF++ctf5sUXX1ztrBdffHH22WefDBw4\nMF//+tfTrVu3zJgxI4899liWLVuWCy64YG3+aFbL/Pnzc/jhh+cb3/hG3n777Zx55pnZdtttG44q\n9+rVK1/5yldyzjnnpL6+PnvuuWceeuihjBo1Kl/5yldWmLl4ZVb1Mz3wwAPz4x//OF/72tdy3HHH\n5bnnnsuoUaPSo0ePD/S9fOYzn8nnP//5nH766Xn55Zez3377pa6uLvfff38OPvjg7Lvvvhk2bFh+\n85vfpF+/fhk2bFi23377LFy4MM8880weeOCB3HrrrR/ovQEoJsUWgIrQokWLTJw4MRdccEGuuOKK\nTJ06Na1bt87WW2+dgw8+uOGa0yFDhuTll1/OmDFj8otf/CK77bZbfv/732fw4MGNXu+QQw7J0KFD\nM3r06IwcOTKlUqlhwqVf/vKXOemkk3Lqqadmo402yvHHH5+zzz77PScr+nc777xzHnnkkfzwhz/M\nqaeemrlz56Zr167Zeeed881vfnPt/mBW05lnnpkpU6Y03Of2M5/5TH72s581OjX42muvzVZbbZUx\nY8bk3HPPTffu3fPd7343I0aMWK33WNXPdODAgfnJT36Siy++ODfffHN69eqV6667Lueee+4H/n5+\n/etf58ILL8zYsWPz4x//OO3bt89uu+2WE044IUnSvn37/PWvf83IkSNz4YUX5tVXX02HDh2y/fbb\n5/Of//wHfl8Aiqmq9J/TKgIAhTFt2rT07NkzV155ZUPpA4CPmsq4MAgAAAA+IMUWAACAQnMqMgAA\nAIXmiC0AAACFptgCAABQaBV9u5/ly5dnzpw52WijjRrd9B0AAID1X6lUyuLFi9OhQ4c0a7bq47IV\nXWznzJmTzp07lzsGAAAAZTRz5sx06tRplesruthutNFGSd75Jqqrq8ucBgAAgA9TbW1tOnfu3NAN\nV6Wii+27px9XV1crtgAAAB9R73dpqsmjAAAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUW\nAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJT\nbAEAACg0xRYAAIBCa17uAOuDmu/9sdwRoEmmXXBwuSMAAMBa44gtAAAAhabYAgAAUGiKLQAAAIWm\n2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQ\naIotAAAAhabYAgAAUGjNyx0A4L3UfO+P5Y4Aq23aBQeXOwIAfCQ5YgsAAEChKbYAAAAUmmILAABA\noSm2AAAAFJpiCwAAQKEptgAAABSaYgsAAEChKbYAAAAUmmILAABAoSm2AAAAFFqTi22pVMqIESPS\nvXv3tG7dOvvss08mT578vs+bN29eampqUlVVlfr6+g8UFgAAAP5Tk4vtRRddlDFjxmTixImZMWNG\n9tprrwwcODALFix4z+eddtpp2X77
7T9wUAAAAFiZJhfb0aNHZ/jw4endu3eqq6szatSoLF26NOPG\njVvlc37/+9/nySefzHe+8501CgsAAAD/qUnFdu7cuZk2bVp23333hrHmzZunb9++efzxx1f6nJkz\nZ+Zb3/pWrrnmmjRv3vw9X7+uri61tbWNFgAAAHgvTSq28+bNS5J06NCh0XjHjh0b1v2nk046KUOG\nDEmvXr3e9/XPO++8tGrVqmHp3LlzU+IBAADwEdSkYtuuXbskyZw5cxqNz549u2Hdv/v1r3+dF154\nId/73vdW6/XPOuusLFq0qGGZOXNmU+IBAADwEdSkYtu+ffvU1NTkkUceaRirr6/PpEmT0rdv3xW2\nnzBhQp555plsuumm6dKlSw477LAkyaabbpqxY8eusH2LFi1SXV3daAEAAID30uTJo4YOHZqLLroo\nkydPTm1tbUaMGJEWLVpk8ODBK2x7ySWX5Nlnn82kSZMyadKkXHXVVUmSf/zjH/nCF76w5ukBAAD4\nyHvv2ZxWYvjw4Zk/f3769++fefPmZdddd82ECRPSpk2bTJ8+PTvuuGPuuOOO9OvXLx07dkzHjh0b\nntu1a9ckSY8ePd53IikAAABYHU1ul1VVVRk5cmRGjhy5wrotttjiPe9nu++++6ZUKjX1LQEAAGCV\nmnwqMgAAAFQSxRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wB\nAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTF\nFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBC\nU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAA\nKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYA\nAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNs\nAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0\nxRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACA\nQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJrcrEt\nlUoZMWJEunfvntatW2efffbJ5MmTV7n9oEGD0qNHj7Rr1y7dunXLcccdl5kzZ65RaAAAAHhXk4vt\nRRddlDFjxmTixImZMWNG9tprrwwcODALFixY6fajRo3KlClTMm/evPzrX/9KbW1tTjzxxDUODgAA\nAMkHKLajR4/O8OHD07t371RXV2fUqFFZunRpxo0bt9LtP/GJT6S6uvr/3rBZszz77LMfPDEAAAD8\nmyYV27lz52batGnZfffdG8aaN2+evn375vHHH1/l884888y0bds2nTp1yvjx4zNixIiVbldXV5fa\n2tpGCwAAALyXJhXbefPmJUk6dOjQaLxjx44N61bm/PPPz/z58/P888/n9NNPz3bbbbfS7c4777y0\natWqYencuXNT4gEAAPAR1KRi265duyTJnDlzGo3Pnj27Yd172WabbTJo0KAMHDgwdXV1K6w/66yz\nsmjRoobFJFMAAAC8nyYV2/bt26empiaPPPJIw1h9fX0mTZqUvn37rtZr1NXV5c0338zcuXNXWNei\nRYtUV1c3WgAAAOC9NHnyqKFDh+aiiy7K5MmTU1tbmxEjRqRFixYZPHjwCts+99xzueWWWzJv3ryU\n
SqU8++yz+c53vpPddtstXbp0WSvfAAAAAB9tTS62w4cPz9e+9rX0798/nTt3zgMPPJAJEyakTZs2\nmT59etq0aZMHHnggyTv3vL344ouzxRZbpG3bthk4cGB69+6d2267ba1/IwAAAHw0VZVKpVK5Q6xK\nbW1tWrVqlUWLFlX0ack13/tjuSNAk0y74OByR1ht9i+KpEj7FgAUwep2wiYfsQUAAIBKotgCAABQ\naIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAA\nAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgC\nAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiK\nLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACF\nptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAA\nUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0A\nAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabY\nAgAAUGiKLQAAAIWm2AIAAFBoii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBo\nii0AAACFptgCAABQaIotAAAAhabYAgAAUGiKLQAAAIWm2AIAAFBoTSq2pVIpI0aMSPfu3dO6devs\ns88+mTx58kq3feutt3LsscemZ8+eadOmTWpqanLmmWdmyZIlayU4AAAAJE0sthdddFHGjBmTiRMn\nZsaMGdlrr70ycODALFiwYIVtFyxYkO233z5333135s2bl7vvvjt//OMf893vfnethQcAAIAmFdvR\no0dn+PDh6d27d6qrqzNq1KgsXbo048aNW2HbrbbaKt///vez9dZbp1mzZtlmm21y/PHH5957711r\n4QEAAGC1i+3cuXMzbdq07L777g1jzZs3T9++ffP444+v1mvceeed6du37yrX19XVpba2ttECAAAA\n72W1i+28efOSJB06dGg03rFjx4Z172XUqFF5/PHHc+65565ym/POOy+tWrVqWDp37ry68QAAAPiI\nWu1i265duyTJnDlzGo3Pnj27Yd2q/OAHP8gVV1yRP//5z9lss81Wud1ZZ52VRYsWNSwzZ85c3XgA\nAAB8RK12sW3fvn1qamryyCOPNIzV19dn0qRJqzy9uFQq5eSTT86NN96YBx54INtvv/17vkeLFi1S\nXV3daAEAAID30qTJo4YOHZqLLrookydPTm1tbUaMGJEWLVpk8ODBK2xbX1+fo48+On/+85/zwAMP\npKamZm1lBgAAgAbNm7Lx8OHDM3/+/PTv3z/z5s3LrrvumgkTJqRNmzaZPn16dtxxx9xxxx3p169f\n/vKXv+SGG25Iy5Yts+222zZ6nZXdHggAAAA+iCYV26qqqowcOTIjR45cYd0WW2zRqLB++tOfTqlU\nWvOEAAAA8B6adCoyAAAAVBrFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTF\nFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBC\nU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAA\nKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQ
AAKDTFFgAAgEJTbAEAACg0xRYA\nAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNs\nAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0\nxRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACA\nQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEA\nACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUW\nAACAQmve1CeUSqWcc845ufLKKzN37tzssssuGT16dHr16rXS7c8+++z88Y9/zFNPPZXdd989Dz74\n4BqHBgDWTM33/ljuCLDapl1wcLkjABWuyUdsL7rooowZMyYTJ07MjBkzstdee2XgwIFZsGDBSrff\neuutM3LkyJx44olrHBYAAAD+U5OL7ejRozN8+PD07t071dXVGTVqVJYuXZpx48atdPvjjjsuhx56\naLp06fK+r11XV5fa2tpGCwAAALyXJhXbuXPnZtq0adl9990bxpo3b56+ffvm8ccfX+Mw5513Xlq1\natWwdO7ceY1fEwAAgPVbk4rtvHnzkiQdOnRoNN6xY8eGdWvirLPOyqJFixqWmTNnrvFrAgAAsH5r\n0uRR7dq1S5LMmTOn0fjs2bPTo0ePNQ7TokWLtGjRYo1fBwAAgI+OJh2xbd++fWpqavLII480jNXX\n12fSpEnp27fvWg8HAAAA76fJk0cNHTo0F110USZPnpza2tqMGDEiLVq0yODBg1e6fV1dXRYvXpz6\n+vqUSqUsXrw4ixcvXuPgAAAAkHyA+9gOHz488+fPT//+/TNv3rzsuuuumTBhQtq0aZPp06dnxx13\nzB133JF+/folSYYMGZKxY8c2PL+6ujrJO/fDBQAAgDXV5CO2VVVVGTlyZN54440sWrQo999/f3r3\n7p0k2WKLLbJgwYKGUpsk1157bUql0goLAAAArA1NLrYAAABQSRRbAAAACk2xBQAAoNAUWwAAAApN\nsQUAAKDQFFsAAAAKTbEFAACg0BRbAAAACk2xBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAKTbEFAACg\n0BRbAAAACk2xBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAKTbEFAACg0BRbAAAACk2xBQAAoNAUWwAA\nAApNsQUAAKDQFFsAAAAKTbEFAACg0BRbAAAACk2xBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAKTbEF\nAACg0BRbAAAACk2xBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAKTbEFAACg0BRbAAAACk2xBQAAoNAU\nWwAAAApNsQUAAKDQFFsAAAAKrXm5AwAAwPqk5nt/LHcEWG3TLji43BHWCkdsAQAAKDTFFgAAgEJT\nbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAo\nNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAA\ngEJTbAEAACg0xRYAAIBCU2wBAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wB\nAAAoNMUWAACAQlNsAQAAKDTFFgAAgEJTbAEAACg0xRYAAIBCU2wBAAAotCYX21KplBEjRqR79+5p\n3bp19tlnn0yePHmV28+ePTtHHXVU2rdvnw4dOuSoo47KnDlz1iQzAAAANGhysb3ooosyZsyYTJw4\nMTNmzMhee+2VgQMHZsGCBS
vd/uijj86bb76ZF154IVOmTMmbb76ZY489do2DAwAAQJI0b+oTRo8e\nneHDh6d3795JklGjRuWqq67KuHHj8tWvfrXRti+99FJuv/32TJo0KV26dEmS/OhHP0qfPn0yffr0\nbLHFFo22r6urS319fcPjRYsWJUlqa2ubGvNDtbxuSbkjQJNU+j717+xfFIl9C9aNIu1bif2LYqn0\n/evdfKVS6b03LDXBnDlzSklKf/3rXxuNH3DAAaVhw4atsP348eNLLVu2XGF8ww03LN16660rjI8Y\nMaKUxGKxWCwWi8VisVgsloZl5syZ79lVm3TEdt68eUmSDh06NBrv2LFjw7r/3L59+/YrjHfo0GGl\n25911ln57ne/2/B4+fLlWbBgQdq2bZuqqqqmRKXgamtr07lz58ycOTPV1dXljgPrFfsXrBv2LVh3\n7F8fXaVSKYsXL16hg/6nJhXbdu3aJckKkz/Nnj07PXr0WOn2c+fOXWF8zpw5Da/171q0aJEWLVo0\nGmvdunVTIrKeqa6u9j8vWEfsX7Bu2Ldg3bF/fTS1atXqfbdp0uRR7du3T01NTR555JGGsfr6+kya\nNCl9+/ZdYfs+ffpkyZIl+ec//9kw9s9//jNLly5Nnz59mvLWAAAAsFJNnhV56NChueiiizJ58uTU\n1tZmxIgRadGiRQYPHrzCtltuuWUOOuigDB8+PDNmzMiMGTMyfPjwHHrooStMHAUAAAAfRJOL7fDh\nw/O1r30t/fv3T+fOnfPAAw9kwoQJadOmTaZPn542bdrkgQceaNj++uuvT5cuXbL11ltn6623Tteu\nXXPdddet1W+C9U/z5s0zYsSING/e5Im7gfdh/4J1w74F6479i/dTVXrfeZMBAACgcjX5iC0AAABU\nEsUWAACAQlNsAQAAKDTFFgAAgEIzrRgAwAe0fPnyPPPMM5k1a1Y6deqUHXbYIc2aOW4A8GFTbKkI\nU6ZMybhx4/LII480fDjYddddc/jhh2e77bYrdzwopCVLluSXv/xlbrnlljzyyCOZPXt2OnbsmF13\n3TWDBw/OV7/61Wy00UbljgmF9Nhjj+Xiiy/OH/7wh8ybN69hvG3btjnkkEPy7W9/OzvvvHMZE0Jx\nlUqlPProoyt8Ltxtt91SVVVV7nhUKH9SpKymTJmSQYMGZaeddsof/vCHbLLJJvnkJz+ZTTbZJLff\nfnv69OmTQYMGZcqUKeWOCoUyduzYbLnllrniiiuy66675rLLLssdd9yRyy67LLvvvnuuvvrq1NTU\nuK84fADHHXdcDj300PTo0SPjxo3L22+/naVLl+btt9/Orbfems033zyHHnpojjvuuHJHhUKpr6/P\nJZdckpqamvTr1y+XXXZZJkyYkMsuuyz77LNPampqcskll6Surq7cUalA7mNLWdXU1OTb3/52vvrV\nr6ZDhw4rrJ87d27Gjh2bH//4x3nxxRc//IBQUIMGDcq5556bnXbaaZXb/POf/8wPfvCD3HrrrR9i\nMii+iy++ON/61rey4YYbrnKbpUuX5uc//3mGDRv2ISaDYvvYxz6WHXfcMSeccEL222+/tGzZsmHd\nkiVLcs899+Sqq67Kv/71rzz99NNlTEolUmwpq0WLFqVVq1ZrbTsAAIrpH//4R3bZZZf33e6xxx5z\nqj8rUGwBANbQ/PnzM3/+/EZj3bt3L1MagI8ek0dRMUqlUn7729/m73//+wofDq644ooypYLie+ut\ntzJixIiV7lvPPfdcmVLB+uFvf/tbvva1r+X5559vGCuVSqmqqsqyZcvKmAyK79VXX81jjz22wu+u\nI488skyJqGSKLRXjpJNOym9+85v069cvrVu3LnccWG8cffTRWbhwYY466ij7FqxlJ5xwQg488MDc\ncMMN9i9Yi6644op861vfSnV1daPL0aqqqhRbVsqpyFSMTp065ZFHHsnWW29d7iiwXmnfvn1e
e+01\nH7phHWjXrl3mzp3rFiSwlvXo0SM/+clP8vnPf77cUSgIt/uhYrRr1y5bbLFFuWPAeqempiZLliwp\ndwxYL+25555mZ4V1YMmSJfnc5z5X7hgUiCO2VIyf//zneeWVV/I///M//vINa9F9992Xn/70pznj\njDOy6aabNlrnj0mwZl555ZUce+yxGThw4Ar71zHHHFOmVFB8Q4cOzcCBA3PYYYeVOwoFodhSMV59\n9dXsv//+efXVV9O1a9dG69zDFj64e++9N0cddVTefPPNhjGT28Dacemll+b0009P+/btG53uX1VV\nlenTp5cxGRTbggUL8slPfjKbb755unXr1mjdmDFjypSKSmbyKCrGV77ylXTt2jVDhw51LSCsRd/8\n5jdz5JFH5phjjrFvwVp23nnnZdy4cRk0aFC5o8B65dRTT83rr7+e7bbbLnV1deWOQwE4YkvFaNOm\nTd56661GM98Ba65t27aZN2+eU/xhHejSpUvefvtt+xesZW3bts2TTz6ZmpqackehIEweRcXYeuut\nU1tbW+4YsN7Za6+98tRTT5U7BqyXvvjFL2b8+PHljgHrnc6dO6d79+7ljkGBOBWZinHaaaflqKOO\nyn//93+vMAHHVlttVaZUUHyf+tSncuihh2bIkCEr7FvHH398mVLB+mHWrFk58sgjs/fee69wHeB1\n111XplRQfN/97ndz9tln5/zzz88GG2xQ7jgUgFORqRjNmv3fCQTvntJlghtYcz179lzpeFVVlYnZ\nYA0dd9xxq1x3zTXXfIhJYP2y+eab54033kiLFi3SpUuXRutMzMbKKLZUjJdeemmV67bccssPMQkA\nAOU0duzYVa479thjP8QkFIViC7CeW7JkSZo1a5YWLVo0jNXV1WX58uVp2bJlGZNB8U2aNCmdO3fO\n5ptv3jD28ssvZ9asWfnEJz5RxmQAHy0mj6JiDBs2LA888ECjsfvvvz/f/va3y5QI1g8HHXRQHnro\noUZjDz30UA455JAyJYL1x/HHH5+FCxc2GluwYIHr12ENXX/99Zk0aVKjsccffzy/+tWvyhOIiueI\nLRWjW7dumTJlSqP7bC5YsCDbbbddXnvttTImg2Lr0qVL3njjjTRv/n/zBdbX12fTTTfNjBkzypgM\niq9Dhw6ZM2fOao8Dq2ebbbbJgw8+2GjSwzfeeCN77713pkyZUsZkVCpHbKkYtbW1qa6ubjTWqlWr\nFf4SDjRNVVVV6uvrG43V19fH3zVhzbVt2zazZ89uNDZz5kz3ZIc19NZbb60wk/+mm26aN998s0yJ\nqHSKLRWjZ8+eue+++xqN3XfffW7MDWtop512yrXXXtto7Lrrrkvv3r3LEwjWI5/+9Kfzne98p+GP\nR/X19TnzzDOz7777ljcYFFy3bt3y3HPPNRp77rnnsvHGG5cpEZXOfWypGKeddlq+/OUv53vf+162\n2267PPfcc7nwwgtz/vnnlzsaFNq5556b/fffP7fffnu23377PPfcc7nrrrty9913lzsaFN6FF16Y\n/fbbL926dUtNTU2mTZuWjh075p577il3NCi0I444Isccc0wuu+yyhs+FJ598cr74xS+WOxoVyjW2\nVJSrrroql156aaZOnZqampr813/9V4YMGVLuWFB4Tz31VC6//PKGfeukk07Kxz/+8XLHgvXC4sWL\n84c//CHTpk1LTU1NDjnkkGy00UbljgWFtnjx4pxwwgm54YYbUlVVlST5yle+kiuvvHKFS9cgUWwB\nAIAKNXPmzIY/ynbp0qXccahgrrEFWA+9+uqrq7XdK6+8so6TwPrn5ptvXqvbAavWuXPn7Lrrrkot\n70uxpaz69OmT3//+96ucnbVUKuXWW29N3759P+RkUGx77713Tj/99Dz77LMrXf/ss89m2LBh6dev\n34ecDIrviiuuyC677NJwev+/mzZtWn7xi19k5513zpVX
XlmmhFBMp5xySmbOnPme27z99ts55ZRT\nPqREFInJoyirn//85zn11FNz0kknZf/990+vXr3Svn37zJ07N0899VT+9Kc/ZeONN87PfvazckeF\nQpk0aVJGjhyZ3XbbLR07dlxh35o1a1ZOOOGEPP744+WOCoUzceLE3Hbbbbn44otz8sknZ6ONNmrY\nv5YsWZK99torI0aMyGGHHVbuqFAo3bp1y7bbbpsBAwZk4MCBK/zumjhxYu68884MHz683FGpQK6x\npSLce++9ueWWW/Loo49m1qxZ6dSpU3bZZZd87nOfy3777VfueFBY8+fPz5133rnCvjVgwIC0a9eu\n3PGg8GbMmJF//OMfDfvXzjvvnK5du5Y7FhTWG2+8kcsvvzw333xznnrqqYbxHXfcMZ/73OcydOjQ\nFe5vC4liCwAAVKDFixdn9uzZ6dixo5nGeV+KLQAAAIVm8igAAAAKTbEFAACg0BRbAAAACs3tfgAA\n1tDbb7+d+fPnNxrbaqutypQG4KNHsaViLF++PNdff33+/ve/r/Dh4LrrritTKii+119/Pd///vdX\num9Nnz69TKlg/fCXv/wlRx99dKN9qVQqpaqqKsuWLStjMii+F198Mf/4xz9W+N11/PHHlykRlUyx\npWIMHTo0v/vd77L//vundevW5Y4D641jjz02ixYtysknn2zfgrVs6NChOfzwwzNkyBD7F6xFl19+\neb71rW+lU6dOjfatqqoqxZaVcrsfKkaXLl3y0EMPZdttty13FFivtG/fPq+88kratm1b7iiw3mnb\ntm3mzp2bZs1MWwJr05ZbbplLLrkkn/vc58odhYLwf2EqxoYbbpiePXuWOwasdzbddNNUVVWVOwas\nlz7xiU/kpZdeKncMWO/MnTtXqaVJHLGlYowcOTJt27bNsGHDyh0FCm/58uUNX48bNy7jx4/PBRdc\nkG7dujXazlEmaLp77rmn4et//etfufrqqzN8+PAV9q/99tvvw44G642jjjoqJ554Yj796U+XOwoF\nodhSVv369Ws4klQqlfLwww9niy22SPfu3Rttd//995cjHhRWs2bNGh2lfXcym/9kchtoutX5g5DJ\no6Dp/vu//7vh63nz5mXs2LH5/Oc/v8LnwpEjR37Y0SgAk0dRVv3793/Px8AHc++995Y7Aqy3/v2M\nCGDteeCBBxo97tOnT1544YW88MILDWMurWFVHLEFWM+98sor2WyzzVZ7HFh9N954Y77yla+sMP7r\nX/86X/7yl8uQCOCjycVVVIzevXuvdLxPnz4fbhBYz+y4444rHd9pp50+5CSw/vnGN76x0vGhQ4d+\nyElg/bKqfehb3/rWh5yEolBsqRjTpk1b6bjZJmHNrOzEHKdSwtqxsv1r1qxZJmaDNfTLX/5ypeM3\n3HDDh5yEonCNLWU3ZsyYJO9MYnPNNdc0+pDw7LPPZpNNNilXNCi0Y445JkmydOnShq/f9cILL+Rj\nH/tYOWLBemHzzTdPVVVVamtrs8UWWzRaN2PGjBx22GFlSgbF9uKLLyZ5549GU6dOXeFz4UYbbVSu\naFQ4xZayGzVqVJJkyZIljWa5a9asWTbddNNceuml5YoGhbbBBhskeefDwbtfJ+/sW/vuu29OPPHE\nckWDwjv33HNTKpVy0kknNfweS/7vd5db/cAHs8022zRMELXNNts0jL/7u+x//ud/yhWNCmfyKCrG\nQQcdlNtvv73cMWC9c/755+fMM88sdwxYLz344IPZe++9yx0D1hsvvfRSSqVSevXqlaeeeqphvFmz\nZunatasjtqySYgsA8AFNnz59peMbbbRRNt544w85DcBHl2JLxTj++ONXOr7RRhtlyy23zBFHHJGt\nttrqQ04FxdSzZ8/Vutffu9cyAR9Ms2bNVrmvtWzZMkcddVQuvvjitG3b9kNOBsVz3XXXrdZ2/zlv\nBCSKLRXkS1/6UsaN
G5dPfOITqampyUsvvZRJkybl0EMPzYsvvpinn346t912WwYMGFDuqFDxrr76\n6oavX3755YwePTrHHHNMevbsmalTp+b666/P0KFDM2LEiDKmhOK76qqrMmbMmJx11lmpqanJtGnT\ncv755+eoo45Kt27dMmLEiOy555657LLLyh0VKt7mm2/e6PGbb76ZZcuWpVOnTpk9e3bDNeyrOlOC\njzbFlooxZMiQfOpTn2p05Paaa67JX//611x55ZW55JJLcsMNN+SRRx4pY0oonv79+2fUqFH51Kc+\n1TD2t7/9LWeffXbuvvvuMiaD4vv4xz+eu+++O926dWsYe+2113LAAQfkqaeeyrPPPpsDDjjAB3Fo\nop/85CeZNGlSfvzjH6ddu3aZO3duhg8fnp122imnnHJKueNRgRRbKkanTp0yY8aMRvf+W7ZsWbp2\n7ZpZs2Zl8eLF2WSTTTJ37twypoTiadeuXWbPnt1oZuRly5alY8eOmTdvXhmTQfG1b98+b731Vlq2\nbNkwVltbm0022aRh/2rTpk0WLFhQrohQSJtvvnmee+65VFdXN4wtXLgwO+ywQ15++eUyJqNSuXs4\nFaNt27Z57LHHGo09/vjjadOmTcNjN7yHpqupqcm1117baGzs2LHZcsstyxMI1iO77LJLTjvttCxa\ntCjJOx+8hw8fnl122SVJ8vzzz6dr167ljAiFVFtbmzlz5jQamzt3bsO+Bv/JfWypGCeddFI++9nP\n5utf/3q23HLLvPTSSxkzZkyGDRuWJLntttuy6667ljklFM///u//5rDDDsvll1+enj17Ztq0aXny\nySczbty4ckeDwrviiityyCGHpH379unUqVNmzZqVrbfeOr///e+TJDNnzszFF19c5pRQPIcddlgO\nPfTQjBw5suH69XPOOSeHH354uaNRoZyKTEW57rrrcv311+fVV19Njx498tWvftXMd7AWTJ06NTfe\neGNeeeWVbLbZZvnKV76Snj17ljsWrBeWLVuWhx56KK+99lp69OiRT37yk41O/QeabuHChRk2bFiu\nv/76LFmyJC1btszRRx+dSy65pNHZfPAuxRYAAKhIpVIpb7/9drp27bpat7Hjo0uxpaIsWLAgTz/9\ndObPn99ofL/99itTIiim66+/Pl/96leTJGPGjFnldqu6fzSwehYsWJAf/ehH+fvf/77C767777+/\nTKkAPnoUWyrG+PHjc+yxx67wwaCqqirLli0rUyoopl69emXy5MlJsspTjquqqvLiiy9+mLFgvfPF\nL34xkyZNyuGHH57WrVs3Wuc+0dA0H/vYx/L0008neWdW5FUdoXX7LFZGsaVibLvtthk6dGi+8Y1v\npFWrVuWOAwDvq2PHjnnmmWeyySablDsKFN4NN9yQI488Mkly7bXXrrLYHnvssR9mLApCsaVitGvX\nzj01YR14dzI2YO3r2bNnnn322Wy44YbljgLwkeamoFSMfv365Yknnih3DFjvbLHFFvn4xz+eYcOG\n5Y477nAPQFiLzjzzzJx55plZvnx5uaPAemXIkCG56aabMnv27HJHoSAcsaVijBo1KldffXWGDBmS\nbt26NVpnghv44N58883cddddueuuu3L33Xdn1qxZ+dSnPpWBAwfmu9/9brnjQaFtvvnmeeONN9Ki\nRYt06dKl0TrXAcIHN3To0Nx1112ZNm1adt555xxwwAEZMGBA9txzzzRv3rzc8ahAii0VwwQ3sO4t\nWrQoP/vZz3L++edn3rx5JmaDNTR27NhVrnMdIKy5adOm5c4778ydd96Zu+++O6VSKXPnzi13LCqQ\nP3dQMaZOnVruCLBeevLJJ3PnnXdmwoQJefjhh9OnT5+cfvrpGThwYLmjQeEpr7DuLF++PK+99lpe\neeWVvPzyy1m+fHn23nvvcseiQjliS8UplUp54403VjgdGfhgmjVrlu222y7nnHNODj
rooLRr167c\nkWC9MnXq1Nx444157bXX8rOf/SxTpkxJXV1dPvaxj5U7GhTW5z//+dx7773ZcsstM2DAgAwYMCD9\n+vUzURurZPIoKsaiRYty4oknprq6Ottss02S5NZbb815551X5mRQbCNGjEjnzp1z4okn5otf/GIu\nueSSPPXUU+WOBeuFe+65J717986f//znhtOSX3/99QwfPrzMyaDY7rjjjnTp0iWDBw/O4MGDs99+\n+ym1vCdHbKkYJ598cp5//vmMGDEiBx98cObMmZOXX345n/3sZzN58uRyx4PCmzdvXv70pz/lzjvv\nzG9+85u0bt06L7/8crljQaHttttuOfvss3PYYYelY8eOmT17dmpra7PVVlvl9ddfL3c8KKwlS5bk\n/vvvb7i+9tVXX81nPvOZDBw4MCeccEK541GBFFsqxuabb54nnnginTp1SqdOnTJr1qwkafigAHxw\n/15qJ06cmOnTp2fnnXfO3//+93JHg0Lr0KFD5syZkyR+d8E68sYbb2Ts2LG54IILTHzIKpk8iopR\nV1e3wrV/tbW1qa6uLlMiWD/sueeeefTRR9O9e/cccMABueCCC9K/f/906tSp3NGg8Lp3754pU6Y0\nXEKTJM8880w222yzMqaC4rvrrrsajtZOnjw5O+ywQ4499tgMGDCg3NGoUIotFWO33XbL6NGjc+qp\npzaMXXvttfnkJz9ZxlRQfEceeWSuueaabL/99uWOAuudr3/96/niF7+YCy+8MMuXL8+DDz6YM844\nIyeeeGK5o0GhHXXUUdl///1z2mmnZcCAAenRo0e5I1HhnIpMxXjmmWeyzz77ZNttt82jjz6afv36\n5fHHH89DDz2U7bbbrtzxAGAFy5cvz8iRI/PjH/848+bNS3V1db75zW/moosuSlVVVbnjAXxkKLZU\nlJkzZ+a6667L888/n0033TTHHXdcNt9883LHAoD39dZbb6VDhw5mbgUoA8UWAACAQnONLWU1ZsyY\n1dru+OOPX8dJAGD1bL755qt1mvH06dM/hDQAJI7YUmY9e/Z8322qqqry4osvfghpAOD9jR07drW2\nO/bYY9dxEgDepdgCrIecDQEAfJQotgDrIWdDAFA0TvNnTbjGFmA9NHXq1HJHAIAmOffcc8sdgQJz\nxBYAAIBCc8QW4CPgrrvuyp133pm33nor//73zOuuu66MqQDgvS1cuHCF311bbbVVGRNRqZqVOwAA\n69bo0aNz6KGH5vnnn89vfvObzJs3LzfddFOWLVtW7mgAsFLTpk3LnnvumXbt2mWbbbbJtttu27DA\nyjhiS1mZuRXWvZ/+9KcZN25cPvvZz6Zjx44ZP358fve73+Xee+8tdzQoJBPcwLr3X//1X+natWse\nffTR7Lvvvrnvvvty9tln54tf/GK5o1GhXGNLWZm5Fda9du3aZd68eUmSDh06ZM6cOVm2bFl69OiR\nN954o8zpoHjcxxbWvY033jjPPPNMOnXq1PC767XXXsvBBx+cxx9/vNzxqECO2FJWZm6Fda9du3aZ\nP39+2rZtm0022SRTpkxJ586ds2jRonJHg0JSWGHdq6+vT6dOnZIkrVq1yqJFi9K9e/e88MILZU5G\npVJsAdZze+65Z2655ZYce+yxOfTQQ3PooYemZcuW2WeffcodDdYbJriBtWurrbbKk08+md69e2fH\nHXfM5Zdfng4dOqRz587ljkaFcioyFcXMrbD2LVmyJKVSKRtttFGWLl2aH/3oR5k3b16GDx/uAwKs\noWnTpuXII4/Mww8/vMI6E7TBB3fTTTelXbt2GTBgQO67774ceuihWbx4ca688kpnTbBSii0VY/To\n0Tn99NNz4IEHZsKECTnwwANz5513ZvDgwfnVr35V7ngAsILDDjssSXLOOeesMMHNMcccU+Z0sP6o\nq6vL0qVL07p163JHoUK53Q8V492ZW8ePH5/q6u
qMHz8+Y8eOTfv27csdDQrvxhtvzAEHHJAddtgh\n/fv3zw033FDuSLBeeOihh3LNNdekb9++qaqqSp8+fXLFFVfkkksuKXc0KLSDDz640eMWLVqkdevW\nGTRoUJkSUekcsaVimLkV1o3//d//zQUXXJAhQ4akZ8+emTp1aq666qqcccYZOeOMM8odDwqtU6dO\nmTVrVpKke/fumTJlSlq1atXodxrQdKvah/59n4N/Z/IoKoaZW2Hd+PnPf57bb789e+yxR8PY5z73\nuRxxxBGKLawhE9zA2nXPPfckeeca9XvvvbfRnCvPPvts2rZtW65oVDjFloph5lZYN+bOnZtdd921\n0dguu+ziaBKsBd/73vfy+uuvp3fv3vnBD37QaIIboOn69++fJKmqqsr+++/fMF5VVZVu3brl/PPP\nL1c0KpxTkakYZm6FdWPIkCHZY489csIJJzSMjRkzJn/7299yxRVXlDEZrH9McANrR69evTJ58uRy\nx6BAFFuA9dyXvvSljB8/PjvttFN69uyZadOm5YknnsjgwYOz4YYbNmzntlrQdAcffHD++Mc/rjA+\naNCg3HbbbWVIBPDRZFZkKoqZW2Hta9WqVY488sj06tUrrVu3zsc//vEceeSRqa6uzgYbbNCwAE33\nwAMPrHT8wQcf/JCTwPpl+fLlOf/887Pttts23CFj4sSJTvNnlVxjS8X495lbv/CFL2Tq1Kk59dRT\n88orr5jgBtbANddcU+4IsN4xwQ2sW+ecc05+//vf54c//GGGDh2aJNlmm21y5plnZsiQIWVORyVy\nKjIVo6amJr/5zW8azdz697//PUcccUReeumlMiaD4lu2bFkefvjhvPzyy/nSl76UxYsXp6qqKi1b\ntix3NCikZs3eOemtqqqqUan99wluvvrVr5YrHhRez549c//992fzzTdvuMXP8uXL06VLF7f7YaUU\nWypGx44dM2PGjEanRC5btixdunTJ7Nmzy5gMim3q1Kk55JBDMnXq1FRVVWXhwoW55ZZbMn78eNfV\nwhoywQ2sG126dMnbb7+dqqqqhmJbV1eXHj165K233ip3PCqQa2ypGF/4whdWOGVy7NixOeKII8qU\nCNYPp5xySgYNGpT58+c3TBb1mc98Jvfff3+Zk0HxKbWwbvTu3Ts33XRTo7Fbb701ffv2LVMiKp0j\ntlQMM7fCutG1a9e89tpradGiRcNfvZOkffv2mTt3bpnTQbEtX748F154YcaMGZO33norc+fOzcSJ\nEzN9+nTXAcIaePjhh9O/f/8MGjQo48aNy5FHHpmbbropd911V3bbbbdyx6MCOWJLxTBzK6wbrVu3\nzqJFixqNvf322+4PDWvBOeeck9/+9rf54Q9/mKqqqiTvTHBz2WWXlTkZFNsee+yRRx99NF26dMm+\n++6b5cuX5+6771ZqWSVHbAHWcyeddFIWLlyYyy+/PJtttlnefvvtfPOb30yrVq1y6aWXljseFJoJ\nbgAqg9v9UFHM3Apr3wUXXJDDDz88nTp1Sl1dXdq2bZtevXrlrrvuKnc0KLz58+dns802azS2bNmy\nNG/uIxasqYceeihjxozJyy+/nM022yzHH3989txzz3LHokI5FZmKMXXq1Oy0007p379/jj/++CTJ\n7bff7holWEPt27fPvffem4ceeig33nhjJkyYkL/97W8NN7wHPjgT3MC6cf3112fffffN/Pnz07dv\n3yxcuDD777+/uVZYJaciUzEOOeSQ9O7dO+eee27DLX5mz56dvn37Ztq0aeWOB+uNKVOmZIMNNkjP\nnj3LHQUKzwQ3sG5st912ufTSS/PZz362YWzChAk55ZRT8vzzz5cxGZXKEVsqxsMPP5yRI0dmgw02\naJiAo2PHju5hC2vo+OOPz4MPPpgk+fWvf53tt98+2267bW688cYyJ4PiM8ENrBtvvPFGBg4c2Ghs\nwIABefPNN8
uUiErniC0Vo6amJk888UTat2/fMAHH22+/nT322CMvvvhiueNBYXXr1i0vvPBCWrVq\nlT322CPDhw9Pu3bt8p3vfCf//Oc/yx0PAFbw+c9/Pl//+tdz0EEHNYzdcccdueqqq3LzzTeXMRmV\nSrGlYpi5FdaNd+9XO3/+/GyxxRaZOXNmmjVrlg4dOmTOnDnljgeFZ4IbWPtOOeWUjBkzJgcddFB6\n9uyZadOm5fbbb8/xxx+fDh06NGw3cuTI8oWkopiyj4ph5lZYN7p27Zqnn346kydPzic/+ck0a9Ys\nCxcubDjlH/jgrr/++pxwwgkZPHhww5wQ+++/f37xi1/kmGOOKXc8KKzJkydn9913z4wZMzJjxowk\nyW677ZYnn3yyYRu/x/h3ii0V492ZWx9//PE8//zz2XTTTbP33nunWTOXgsOaOO2007Lrrrsmeeca\n2yS5//778/GPf7ycsWC9MGrUqIwfP36lE9wotvDB3XvvveWOQME4FZmKZeZWWHumTJmS5s2bp6am\nJkny3HPPZenSpenVq1d5g0HBtWvXLnPmzGn0R9jly5enQ4cOmTdvXhmTAXy0OBRGxTBzK6w722yz\nTUOpTd65jYJSC2vugAMOyIQJExqNTZw4MQcccECZEgF8NDliS8UwcysARWOCG4DKoNhSMczcCkDR\nfOYzn3nfbaqqqnLPPfd8CGkAPrpMHkXFMHMrAEVjghuAyqDYUjHM3AoAAHwQTkWmopi5FQAAaCrF\nFgAAgEJzux8AAAAKTbEFAACg0BRbAAAACk2xBQAAoNAUWwAAAApNsQUAAKDQFFsAAAAK7f8DW34+\nl/2DVX0AAAAASUVORK5CYII=\n" } } ], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "clf.fit(X_train, y_train)\n", "# Get feature importances and sort them in descending order\n", "importances = clf.feature_importances_\n", "indices = np.argsort(importances)[::-1]\n", "\n", "# Plot the importances, ordering the tick labels to match the sorted bars\n", "plt.figure(figsize=(10, 6))\n", "plt.title(\"Feature Importance\")\n", "plt.bar(range(X.shape[1]), importances[indices], align=\"center\")\n", "plt.xticks(range(X.shape[1]), np.array(iris.feature_names)[indices], rotation=90)\n", "plt.tight_layout()\n", "plt.show()" ], "id": "cff173f0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The bar chart shows the relative importance of each feature, making it\n", "easier to see which features carry the most predictive power.\n", "\n", "## Out-of-Bag (OOB) Error Estimate\n", "\n", "Random Forest can use **Out-of-Bag (OOB)** samples as a built-in\n", "alternative to cross-validation. Since each tree is trained on a\n", "bootstrap sample, about one-third of the training data (roughly 37%) is\n", "left out of each tree. 
These\n", "“out-of-bag” samples can be used to estimate the model’s performance\n", "without setting aside a separate validation set.\n", "\n", "### Enabling OOB in Python\n", "\n", "You can enable the out-of-bag estimate by setting `oob_score=True` when\n", "constructing a `RandomForestClassifier` or `RandomForestRegressor`." ], "id": "a89b30bc-0003-48d9-a6bb-a8c412b95d64" }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "OOB Score: 0.9428571428571428" ] } ], "source": [ "# oob_score=True scores each training sample using only the trees that did not see it\n", "clf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=42)\n", "clf.fit(X_train, y_train)\n", "\n", "# For a classifier, oob_score_ is the accuracy of the out-of-bag predictions\n", "print(f\"OOB Score: {clf.oob_score_}\")" ], "id": "f4cc577a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The OOB score gives an approximately unbiased estimate of the model’s\n", "generalization performance. This is particularly useful when the dataset\n", "is small and splitting it further into training and validation sets\n", "would leave too little data for training.\n", "\n", "## Dealing with Imbalanced Data\n", "\n", "For imbalanced classification tasks (where one class is much more\n", "frequent than the others), Random Forest may be biased toward predicting\n", "the majority class. Several techniques can help mitigate this issue:\n", "\n", "- **Class Weights:** You can assign higher weights to the minority\n", " class so that the model pays more attention to it. In `scikit-learn`,\n", " `class_weight='balanced'` sets weights inversely proportional to\n", " class frequencies." ], "id": "69a0d4d3-bb52-4277-b994-56bf6ccf0e8b" }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "output_type": "display_data", "metadata": {}, "data": { "text/html": [ "
RandomForestClassifier(class_weight='balanced', random_state=42)
" ] } } ], "source": [ "# 'balanced' weights each class inversely to its frequency in y_train\n", "clf = RandomForestClassifier(class_weight='balanced', random_state=42)\n", "clf.fit(X_train, y_train)" ], "id": "2e483153" }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Resampling:** You can either oversample the minority class (for\n", " example, with SMOTE from `imbalanced-learn`) or undersample the\n", " majority class." ], "id": "9eedf18e-6554-40c3-a2da-08523179aff7" }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "from imblearn.over_sampling import SMOTE\n", "\n", "# SMOTE synthesizes new minority-class samples; resample the training split only\n", "sm = SMOTE(random_state=42)\n", "X_resampled, y_resampled = sm.fit_resample(X_train, y_train)" ], "id": "c769fd9c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Random Forest in Practice: Best Practices\n", "\n", "- **Cross-Validation:** Use cross-validation to check that the model\n", " generalizes well.\n", "- **Parallelization:** Trees are built independently of one another, so\n", " training parallelizes naturally. If using `scikit-learn`, set\n", " `n_jobs=-1` to use all CPU cores.\n", "- **Ensemble Methods:** Random Forest can be combined or compared with\n", " other ensemble methods, such as boosting (e.g., XGBoost or Gradient\n", " Boosting), which may further improve performance.\n", "\n", "Random Forest is a highly flexible, non-parametric machine learning\n", "algorithm that can be used for both classification and regression tasks.\n", "Its ensemble-based approach reduces overfitting relative to a single\n", "decision tree, improves predictive performance, and provides useful\n", "insights such as feature importance. Despite these advantages, Random\n", "Forest can be computationally intensive and may not be the best choice\n", "for real-time applications or datasets with extremely high\n", "dimensionality.\n", "\n", "------------------------------------------------------------------------\n", "\n", "## References\n", "\n", "1. Breiman, L. (2001). “Random Forests”. Machine Learning, 45(1), 5-32.\n", "2. Pedregosa, F., et al. (2011). “Scikit-learn: Machine Learning in\n", " Python”. 
Journal of Machine Learning Research, 12, 2825-2830.\n", "3. Hastie, T., Tibshirani, R., & Friedman, J. (2009). “The Elements of\n", " Statistical Learning”. Springer Series in Statistics." ], "id": "085ef987-dab6-45e4-9ef8-d4cb51df564a" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "" ], "id": "2ffb5617-b5e2-4e37-8ea1-750112d49acf" }, { "cell_type": "markdown", "metadata": {}, "source": [], "id": "a3c13ead-3100-47d7-bc78-3caacec543b7" }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/html" }, "source": [ "" ], "id": "81f43a00-b59f-4f6e-809c-4dfcb5aa1e74" } ], "nbformat": 4, "nbformat_minor": 5, "metadata": { "kernelspec": { "name": "python3", "display_name": "Python 3 (ipykernel)", "language": "python", "path": "/opt/hostedtoolcache/Python/3.10.16/x64/share/jupyter/kernels/python3" }, "language_info": { "name": "python", "codemirror_mode": { "name": "ipython", "version": "3" }, "file_extension": ".py", "mimetype": "text/x-python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.16" } } }