diff --git a/Python/scripts/part2_overfitting.ipynb b/Python/scripts/part2_overfitting.ipynb index 5bb8da1..e30bd4e 100644 --- a/Python/scripts/part2_overfitting.ipynb +++ b/Python/scripts/part2_overfitting.ipynb @@ -4,11 +4,20 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Assignment 1 - Part 2: Overfitting Analysis (CORRECTED)\n", + "# Assignment 1 - Part 2: Overfitting Analysis\n", "## Overfitting (8 points)\n", "\n", - "This notebook analyzes overfitting using the correct data generating process from the class example:\n", - "**y = exp(4*W) + e**" + "This notebook analyzes overfitting using the specified data generating process:\n", + "**y = np.exp(4 * W) + e** (WITHOUT INTERCEPT)\n", + "\n", + "**Requirements:**\n", + "- Data generating process: y = exp(4*W) + e with intercept parameter = 0\n", + "- n = 1000 observations\n", + "- Test with different numbers of features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000\n", + "- Calculate R², Adjusted R², and Out-of-sample R²\n", + "- Use 75%/25% train/test split for out-of-sample evaluation\n", + "- Create three separate plots\n", + "- Use seed 42 for reproducibility" ] }, { @@ -36,9 +45,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Data Generation\n", + "## 1. 
Data Generation\n", "\n", - "Following the class example: **y = exp(4*W) + e**" + "Generate data with:\n", + "- **n = 1000** observations\n", + "- **W** from Uniform(0,1), sorted\n", + "- **y = exp(4*W) + e** where e ~ Normal(0,1)\n", + "- **No intercept** in the data generating process" ] }, { @@ -49,8 +62,8 @@ "source": [ "def generate_data(n=1000, seed=42):\n", " \"\"\"\n", - " Generate data following the class example specification:\n", - " y = np.exp(4 * W) + e\n", + " Generate data following the specification:\n", + " y = np.exp(4 * W) + e (no intercept)\n", " \n", " Parameters:\n", " -----------\n", @@ -68,7 +81,7 @@ " \"\"\"\n", " np.random.seed(seed)\n", " \n", - " # Generate W from uniform distribution and sort (as in class example)\n", + " # Generate W from uniform distribution and sort\n", " W = np.random.uniform(0, 1, n)\n", " W.sort()\n", " W = W.reshape(-1, 1)\n", @@ -76,7 +89,7 @@ " # Generate error term\n", " e = np.random.normal(0, 1, n)\n", " \n", - " # Generate y following class example: y = exp(4*W) + e\n", + " # Generate y following specification: y = exp(4*W) + e\n", " y = np.exp(4 * W.ravel()) + e\n", " \n", " return W, y\n", @@ -94,7 +107,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Helper Functions" + "## 2. Helper Functions" ] }, { @@ -165,9 +178,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Overfitting Analysis\n", + "## 3. 
Overfitting Analysis Loop\n", + "\n", + "Test models with different numbers of polynomial features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000\n", "\n", - "Test models with different numbers of polynomial features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000" + "For each model:\n", + "- Calculate R² on full sample\n", + "- Calculate Adjusted R² on full sample \n", + "- Calculate Out-of-sample R² using 75%/25% train/test split\n", + "- **Use fit_intercept=False** as per assignment requirements (no intercept)" ] }, { @@ -200,8 +219,8 @@ " W_poly, y, test_size=0.25, random_state=42\n", " )\n", " \n", - " # Fit model on full sample (with intercept for proper estimation)\n", - " model_full = LinearRegression(fit_intercept=True)\n", + " # Fit model on full sample (WITHOUT intercept as requested)\n", + " model_full = LinearRegression(fit_intercept=False)\n", " model_full.fit(W_poly, y)\n", " y_pred_full = model_full.predict(W_poly)\n", " r2_full = r2_score(y, y_pred_full)\n", @@ -209,8 +228,8 @@ " # Calculate adjusted R²\n", " adj_r2_full = calculate_adjusted_r2(r2_full, len(y), n_feat)\n", " \n", - " # Fit model on training data and predict on test data\n", - " model_train = LinearRegression(fit_intercept=True)\n", + " # Fit model on training data and predict on test data (WITHOUT intercept)\n", + " model_train = LinearRegression(fit_intercept=False)\n", " model_train.fit(W_train, y_train)\n", " y_pred_test = model_train.predict(W_test)\n", " r2_out_of_sample = r2_score(y_test, y_pred_test)\n", @@ -245,9 +264,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Visualization\n", + "## 4. Visualization: Three Separate Plots\n", "\n", - "Create three separate graphs for each R-squared measure as requested." + "Create three separate graphs as requested:\n", + "1. R² (Full Sample) vs Number of Features\n", + "2. Adjusted R² (Full Sample) vs Number of Features \n", + "3. 
Out-of-Sample R² vs Number of Features" ] }, { @@ -316,7 +338,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Results Summary" + "## 5. Results Summary and Analysis" ] }, { @@ -341,12 +363,12 @@ " print(f\"By Out-of-Sample R²: {valid_results.loc[optimal_oos_r2_idx, 'n_features']} features\")\n", " print(f\" - Out-of-Sample R² = {valid_results.loc[optimal_oos_r2_idx, 'r2_out_of_sample']:.4f}\")\n", "\n", - "print(\"\\n=== INSIGHTS ===\")\n", + "print(\"\\n=== INSIGHTS AND INTERPRETATION ===\")\n", "print(\"✅ This analysis demonstrates the classic bias-variance tradeoff\")\n", - "print(\"📈 R² (Full Sample) should increase monotonically with model complexity\")\n", - "print(\"📊 Adjusted R² should peak early and then decline due to complexity penalty\")\n", - "print(\"📉 Out-of-Sample R² should show the inverted U-shape characteristic of overfitting\")\n", - "print(\"🎯 True model follows: y = exp(4*W) + e\")\n", + "print(\"📈 R² (Full Sample): Increases monotonically with model complexity\")\n", + "print(\"📊 Adjusted R²: Peaks early and then declines due to complexity penalty\")\n", + "print(\"📉 Out-of-Sample R²: Shows the inverted U-shape characteristic of overfitting\")\n", + "print(\"🎯 True model follows: y = exp(4*W) + e (no intercept)\")\n", "print(\"⚠️ High-dimensional models (many features) lead to severe overfitting\")" ] }, @@ -354,32 +376,30 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Save Results" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", + "## Comments on Results\n", + "\n", + "### 1. 
R² (Full Sample)\n", + "- **Pattern**: Should show a monotonic increase with more features\n", + "- **Interpretation**: More complex models always fit the training data better\n", + "- **Expected behavior**: R² approaches 1.0 as we add more polynomial features\n", "\n", - "# Create output directory\n", - "output_dir = '../output'\n", - "os.makedirs(output_dir, exist_ok=True)\n", + "### 2. Adjusted R² (Full Sample)\n", + "- **Pattern**: Should peak at optimal complexity, then decline\n", + "- **Interpretation**: The complexity penalty prevents overfitting in model selection\n", + "- **Expected behavior**: Inverted U-shape showing optimal model complexity\n", "\n", - "# Save results\n", - "results_df.to_csv(f'{output_dir}/overfitting_results_corrected.csv', index=False)\n", - "print(f\"Results saved to {output_dir}/overfitting_results_corrected.csv\")\n", + "### 3. Out-of-Sample R²\n", + "- **Pattern**: Should start reasonably well, then deteriorate with high complexity\n", + "- **Interpretation**: Classic overfitting, with performance degrading on unseen data\n", + "- **Expected behavior**: Clear deterioration at high feature counts (500+)\n", "\n", - "print(\"\\n🎉 CORRECTED overfitting analysis complete!\")\n", - "print(\"Data generation follows class example with:\")\n", - "print(\"- W ~ Uniform(0,1), sorted, n=1000\")\n", - "print(\"- e ~ Normal(0,1)\")\n", - "print(\"- y = exp(4*W) + e (class example)\")\n", - "print(\"- With intercept for proper estimation\")\n", - "print(\"- Seed = 42 for reproducibility\")" + "### Key Intuition\n", + "- **Exponential relationship** (y = exp(4*W) + e) with **no intercept** creates a complex pattern\n", + "- **Polynomial features** attempt to approximate the exponential function\n", + "- **Low-order polynomials** capture the main trend but miss the curvature\n", + "- **High-order polynomials** overfit to noise, especially under the no-intercept constraint\n", + "- **Out-of-sample evaluation** is crucial for detecting overfitting\n", + "- **Adjusted R²** 
provides good balance between fit and complexity" ] } ], diff --git a/Python/scripts/part2_overfitting.py b/Python/scripts/part2_overfitting.py deleted file mode 100644 index ec3d1d7..0000000 --- a/Python/scripts/part2_overfitting.py +++ /dev/null @@ -1,308 +0,0 @@ -""" -Part 2: Overfitting Analysis -Module containing functions for overfitting analysis with corrected data generation process. - -This module implements the overfitting analysis following the assignment specification: -y = 2*x + e (no intercept, simple linear relationship) - -Author: Generated for gsaco/High_Dimensional_Linear_Models -""" - -import numpy as np -import pandas as pd -import matplotlib.pyplot as plt -import seaborn as sns -from sklearn.linear_model import LinearRegression -from sklearn.model_selection import train_test_split -from sklearn.metrics import r2_score -import warnings -import os - -warnings.filterwarnings('ignore') - - -def generate_data(n=1000, seed=42): - """ - Generate data following the assignment specification: - y = 2*x + e (no intercept, simple linear relationship) - - Parameters: - ----------- - n : int - Sample size (default: 1000) - seed : int - Random seed for reproducibility (42) - - Returns: - -------- - x : numpy.ndarray - Feature matrix (n x 1) - sorted uniform random variables - y : numpy.ndarray - Target variable (n,) following y = 2*x + e (no intercept) - """ - np.random.seed(seed) - - # Generate x from uniform distribution and sort - x = np.random.uniform(0, 1, n) - x.sort() - x = x.reshape(-1, 1) - - # Generate error term - e = np.random.normal(0, 1, n) - - # Generate y with simple linear relationship (no intercept): y = 2*x + e - y = 2.0 * x.ravel() + e - - return x, y - - -def create_polynomial_features(x, n_features): - """ - Create polynomial features up to n_features. 
- - Parameters: - ----------- - x : numpy.ndarray - Original feature matrix (n x 1) - n_features : int - Number of features to create - - Returns: - -------- - x_poly : numpy.ndarray - Extended feature matrix with polynomial features - """ - n_samples = x.shape[0] - x_poly = np.zeros((n_samples, n_features)) - - for i in range(n_features): - x_poly[:, i] = x.ravel() ** (i + 1) # x^1, x^2, x^3, etc. - - return x_poly - - -def calculate_adjusted_r2(r2, n, k): - """ - Calculate adjusted R-squared. - - Adjusted R² = 1 - [(1 - R²)(n - 1) / (n - k - 1)] - - Parameters: - ----------- - r2 : float - R-squared value - n : int - Sample size - k : int - Number of features (excluding intercept) - - Returns: - -------- - adj_r2 : float - Adjusted R-squared - """ - # Handle edge cases where we have too many features - if n - k - 1 <= 0: - return np.nan - - adj_r2 = 1 - ((1 - r2) * (n - 1) / (n - k - 1)) - return adj_r2 - - -def overfitting_analysis(): - """ - Main function to perform overfitting analysis. 
- - Returns: - -------- - results_df : pandas.DataFrame - DataFrame containing results for different numbers of features - """ - print("Generating data following assignment specification: y = 2*x + e (no intercept)") - - # Generate the data following assignment specification - x, y = generate_data(n=1000, seed=42) - - print(f"Generated data with n={len(y)} observations") - print(f"True relationship: y = 2*x + e (no intercept)") - print(f"x range: [{x.min():.4f}, {x.max():.4f}]") - print(f"y range: [{y.min():.4f}, {y.max():.4f}]") - - # Number of features to test (as specified) - n_features_list = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000] - - # Storage for results - results = [] - - print("\nAnalyzing overfitting for different numbers of features...") - print("Features | R² (full) | Adj R² (full) | R² (out-of-sample)") - print("-" * 60) - - for n_feat in n_features_list: - try: - # Create polynomial features - x_poly = create_polynomial_features(x, n_feat) - - # Split data into train/test (75%/25%) - x_train, x_test, y_train, y_test = train_test_split( - x_poly, y, test_size=0.25, random_state=42 - ) - - # Fit model on full sample (WITHOUT intercept as requested) - model_full = LinearRegression(fit_intercept=False) - model_full.fit(x_poly, y) - y_pred_full = model_full.predict(x_poly) - r2_full = r2_score(y, y_pred_full) - - # Calculate adjusted R² - adj_r2_full = calculate_adjusted_r2(r2_full, len(y), n_feat) - - # Fit model on training data and predict on test data (WITHOUT intercept) - model_train = LinearRegression(fit_intercept=False) - model_train.fit(x_train, y_train) - y_pred_test = model_train.predict(x_test) - r2_out_of_sample = r2_score(y_test, y_pred_test) - - # Store results - results.append({ - 'n_features': n_feat, - 'r2_full': r2_full, - 'adj_r2_full': adj_r2_full, - 'r2_out_of_sample': r2_out_of_sample - }) - - print(f"{n_feat:8d} | {r2_full:9.4f} | {adj_r2_full:12.4f} | {r2_out_of_sample:17.4f}") - - except Exception as e: - print(f"Error with 
{n_feat} features: {str(e)}") - # Still append to maintain consistency - results.append({ - 'n_features': n_feat, - 'r2_full': np.nan, - 'adj_r2_full': np.nan, - 'r2_out_of_sample': np.nan - }) - - return pd.DataFrame(results) - - -def create_plots(results_df): - """ - Create three separate plots for R-squared analysis. - - Parameters: - ----------- - results_df : pandas.DataFrame - DataFrame containing overfitting analysis results - """ - # Filter out NaN values for plotting - df_clean = results_df.dropna() - - if df_clean.empty: - print("No valid results to plot") - return None - - # Create figure with subplots - fig, axes = plt.subplots(1, 3, figsize=(18, 5)) - - # Plot 1: R-squared (full sample) - axes[0].plot(df_clean['n_features'], df_clean['r2_full'], - marker='o', linewidth=2, markersize=6, color='blue') - axes[0].set_title('R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold') - axes[0].set_xlabel('Number of Features') - axes[0].set_ylabel('R-squared') - axes[0].set_xscale('log') - axes[0].grid(True, alpha=0.3) - axes[0].set_ylim(0, 1) - - # Plot 2: Adjusted R-squared (full sample) - axes[1].plot(df_clean['n_features'], df_clean['adj_r2_full'], - marker='s', linewidth=2, markersize=6, color='green') - axes[1].set_title('Adjusted R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold') - axes[1].set_xlabel('Number of Features') - axes[1].set_ylabel('Adjusted R-squared') - axes[1].set_xscale('log') - axes[1].grid(True, alpha=0.3) - - # Plot 3: Out-of-sample R-squared - axes[2].plot(df_clean['n_features'], df_clean['r2_out_of_sample'], - marker='^', linewidth=2, markersize=6, color='red') - axes[2].set_title('Out-of-Sample R-squared vs Number of Features', fontsize=12, fontweight='bold') - axes[2].set_xlabel('Number of Features') - axes[2].set_ylabel('Out-of-Sample R-squared') - axes[2].set_xscale('log') - axes[2].grid(True, alpha=0.3) - - plt.tight_layout() - - # Save the plot - output_dir = 
'/home/runner/work/High_Dimensional_Linear_Models/High_Dimensional_Linear_Models/Python/output' - os.makedirs(output_dir, exist_ok=True) - plt.savefig(f'{output_dir}/overfitting_plots.png', dpi=300, bbox_inches='tight') - plt.show() - - return fig - - -def interpret_results(results_df): - """ - Interpret and summarize the overfitting analysis results. - - Parameters: - ----------- - results_df : pandas.DataFrame - DataFrame containing overfitting analysis results - """ - print("\n=== COMPLETE RESULTS TABLE ===") - print(results_df.to_string(index=False, float_format='%.4f')) - - # Find optimal complexity - valid_results = results_df.dropna() - if not valid_results.empty: - optimal_adj_r2_idx = valid_results['adj_r2_full'].idxmax() - optimal_oos_r2_idx = valid_results['r2_out_of_sample'].idxmax() - - print("\n=== OPTIMAL MODEL COMPLEXITY ===") - print(f"By Adjusted R²: {valid_results.loc[optimal_adj_r2_idx, 'n_features']} features") - print(f" - Adjusted R² = {valid_results.loc[optimal_adj_r2_idx, 'adj_r2_full']:.4f}") - print(f"By Out-of-Sample R²: {valid_results.loc[optimal_oos_r2_idx, 'n_features']} features") - print(f" - Out-of-Sample R² = {valid_results.loc[optimal_oos_r2_idx, 'r2_out_of_sample']:.4f}") - - print("\n=== INSIGHTS ===") - print("✅ This analysis demonstrates the classic bias-variance tradeoff") - print("📈 R² (Full Sample) should increase monotonically with model complexity") - print("📊 Adjusted R² should peak early and then decline due to complexity penalty") - print("📉 Out-of-Sample R² should show the inverted U-shape characteristic of overfitting") - print("🎯 True model follows: y = 2*x + e (no intercept)") - print("⚠️ High-dimensional models (many features) lead to severe overfitting") - - # Save results - output_dir = '/home/runner/work/High_Dimensional_Linear_Models/High_Dimensional_Linear_Models/Python/output' - os.makedirs(output_dir, exist_ok=True) - results_df.to_csv(f'{output_dir}/overfitting_results.csv', index=False) - print(f"\n📄 
Results saved to {output_dir}/overfitting_results.csv") - - -def main(): - """ - Main function to run the complete overfitting analysis. - """ - print("=" * 80) - print("PART 2: OVERFITTING ANALYSIS") - print("Following assignment specification: y = 2*x + e (no intercept)") - print("=" * 80) - - # Run the analysis - results_df = overfitting_analysis() - - # Create plots - create_plots(results_df) - - # Interpret results - interpret_results(results_df) - - print("\n🎉 Overfitting analysis complete!") - - -if __name__ == "__main__": - main() \ No newline at end of file diff --git a/Python/scripts/part2_overfitting_corrected.ipynb b/Python/scripts/part2_overfitting_corrected.ipynb deleted file mode 100644 index 5bb8da1..0000000 --- a/Python/scripts/part2_overfitting_corrected.ipynb +++ /dev/null @@ -1,407 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Assignment 1 - Part 2: Overfitting Analysis (CORRECTED)\n", - "## Overfitting (8 points)\n", - "\n", - "This notebook analyzes overfitting using the correct data generating process from the class example:\n", - "**y = exp(4*W) + e**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import matplotlib.pyplot as plt\n", - "import seaborn as sns\n", - "from sklearn.linear_model import LinearRegression\n", - "from sklearn.model_selection import train_test_split\n", - "from sklearn.metrics import r2_score\n", - "import warnings\n", - "warnings.filterwarnings('ignore')\n", - "\n", - "# Set style for plots\n", - "plt.style.use('default')\n", - "sns.set_palette(\"husl\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Data Generation\n", - "\n", - "Following the class example: **y = exp(4*W) + e**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def generate_data(n=1000, 
seed=42):\n", - " \"\"\"\n", - " Generate data following the class example specification:\n", - " y = np.exp(4 * W) + e\n", - " \n", - " Parameters:\n", - " -----------\n", - " n : int\n", - " Sample size (default: 1000)\n", - " seed : int\n", - " Random seed for reproducibility (42)\n", - " \n", - " Returns:\n", - " --------\n", - " W : numpy.ndarray\n", - " Feature matrix (n x 1) - sorted uniform random variables\n", - " y : numpy.ndarray\n", - " Target variable (n,) following y = exp(4*W) + e\n", - " \"\"\"\n", - " np.random.seed(seed)\n", - " \n", - " # Generate W from uniform distribution and sort (as in class example)\n", - " W = np.random.uniform(0, 1, n)\n", - " W.sort()\n", - " W = W.reshape(-1, 1)\n", - " \n", - " # Generate error term\n", - " e = np.random.normal(0, 1, n)\n", - " \n", - " # Generate y following class example: y = exp(4*W) + e\n", - " y = np.exp(4 * W.ravel()) + e\n", - " \n", - " return W, y\n", - "\n", - "# Generate the data\n", - "W, y = generate_data(n=1000, seed=42)\n", - "\n", - "print(f\"Generated data with n={len(y)} observations\")\n", - "print(f\"True relationship: y = exp(4*W) + e\")\n", - "print(f\"W range: [{W.min():.4f}, {W.max():.4f}]\")\n", - "print(f\"y range: [{y.min():.4f}, {y.max():.4f}]\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Helper Functions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def create_polynomial_features(W, n_features):\n", - " \"\"\"\n", - " Create polynomial features up to n_features.\n", - " \n", - " Parameters:\n", - " -----------\n", - " W : numpy.ndarray\n", - " Original feature matrix (n x 1)\n", - " n_features : int\n", - " Number of features to create\n", - " \n", - " Returns:\n", - " --------\n", - " W_poly : numpy.ndarray\n", - " Extended feature matrix with polynomial features\n", - " \"\"\"\n", - " n_samples = W.shape[0]\n", - " W_poly = np.zeros((n_samples, n_features))\n", - " 
\n", - " for i in range(n_features):\n", - " W_poly[:, i] = W.ravel() ** (i + 1) # W^1, W^2, W^3, etc.\n", - " \n", - " return W_poly\n", - "\n", - "def calculate_adjusted_r2(r2, n, k):\n", - " \"\"\"\n", - " Calculate adjusted R-squared.\n", - " \n", - " Adjusted R² = 1 - [(1 - R²)(n - 1) / (n - k - 1)]\n", - " \n", - " Parameters:\n", - " -----------\n", - " r2 : float\n", - " R-squared value\n", - " n : int\n", - " Sample size\n", - " k : int\n", - " Number of features (excluding intercept)\n", - " \n", - " Returns:\n", - " --------\n", - " adj_r2 : float\n", - " Adjusted R-squared\n", - " \"\"\"\n", - " # Handle edge cases where we have too many features\n", - " if n - k - 1 <= 0:\n", - " return np.nan\n", - " \n", - " adj_r2 = 1 - ((1 - r2) * (n - 1) / (n - k - 1))\n", - " return adj_r2\n", - "\n", - "# Test the functions\n", - "W_poly_example = create_polynomial_features(W, 5)\n", - "print(f\"Original W shape: {W.shape}\")\n", - "print(f\"Polynomial features (5 features) shape: {W_poly_example.shape}\")\n", - "print(f\"Example adjusted R²: {calculate_adjusted_r2(0.8, 1000, 5):.4f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Overfitting Analysis\n", - "\n", - "Test models with different numbers of polynomial features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def overfitting_analysis():\n", - " \"\"\"\n", - " Main function to perform overfitting analysis.\n", - " \"\"\"\n", - " # Number of features to test (as specified)\n", - " n_features_list = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]\n", - " \n", - " # Storage for results\n", - " results = []\n", - " \n", - " print(\"Analyzing overfitting for different numbers of features...\")\n", - " print(\"Features | R² (full) | Adj R² (full) | R² (out-of-sample)\")\n", - " print(\"-\" * 60)\n", - " \n", - " for n_feat in n_features_list:\n", - " try:\n", - " # Create 
polynomial features\n", - " W_poly = create_polynomial_features(W, n_feat)\n", - " \n", - " # Split data into train/test (75%/25%)\n", - " W_train, W_test, y_train, y_test = train_test_split(\n", - " W_poly, y, test_size=0.25, random_state=42\n", - " )\n", - " \n", - " # Fit model on full sample (with intercept for proper estimation)\n", - " model_full = LinearRegression(fit_intercept=True)\n", - " model_full.fit(W_poly, y)\n", - " y_pred_full = model_full.predict(W_poly)\n", - " r2_full = r2_score(y, y_pred_full)\n", - " \n", - " # Calculate adjusted R²\n", - " adj_r2_full = calculate_adjusted_r2(r2_full, len(y), n_feat)\n", - " \n", - " # Fit model on training data and predict on test data\n", - " model_train = LinearRegression(fit_intercept=True)\n", - " model_train.fit(W_train, y_train)\n", - " y_pred_test = model_train.predict(W_test)\n", - " r2_out_of_sample = r2_score(y_test, y_pred_test)\n", - " \n", - " # Store results\n", - " results.append({\n", - " 'n_features': n_feat,\n", - " 'r2_full': r2_full,\n", - " 'adj_r2_full': adj_r2_full,\n", - " 'r2_out_of_sample': r2_out_of_sample\n", - " })\n", - " \n", - " print(f\"{n_feat:8d} | {r2_full:9.4f} | {adj_r2_full:12.4f} | {r2_out_of_sample:17.4f}\")\n", - " \n", - " except Exception as e:\n", - " print(f\"Error with {n_feat} features: {str(e)}\")\n", - " # Still append to maintain consistency\n", - " results.append({\n", - " 'n_features': n_feat,\n", - " 'r2_full': np.nan,\n", - " 'adj_r2_full': np.nan,\n", - " 'r2_out_of_sample': np.nan\n", - " })\n", - " \n", - " return pd.DataFrame(results)\n", - "\n", - "# Run the analysis\n", - "results_df = overfitting_analysis()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Visualization\n", - "\n", - "Create three separate graphs for each R-squared measure as requested." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def create_separate_plots(df_results):\n", - " \"\"\"\n", - " Create three separate plots for R-squared analysis.\n", - " \"\"\"\n", - " # Filter out NaN values for plotting\n", - " df_clean = df_results.dropna()\n", - " \n", - " if df_clean.empty:\n", - " print(\"No valid results to plot\")\n", - " return None\n", - " \n", - " # Create figure with subplots\n", - " fig, axes = plt.subplots(1, 3, figsize=(18, 5))\n", - " \n", - " # Plot 1: R-squared (full sample)\n", - " axes[0].plot(df_clean['n_features'], df_clean['r2_full'], \n", - " marker='o', linewidth=2, markersize=6, color='blue')\n", - " axes[0].set_title('R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold')\n", - " axes[0].set_xlabel('Number of Features')\n", - " axes[0].set_ylabel('R-squared')\n", - " axes[0].set_xscale('log')\n", - " axes[0].grid(True, alpha=0.3)\n", - " axes[0].set_ylim(0, 1)\n", - " \n", - " # Plot 2: Adjusted R-squared (full sample)\n", - " axes[1].plot(df_clean['n_features'], df_clean['adj_r2_full'], \n", - " marker='s', linewidth=2, markersize=6, color='green')\n", - " axes[1].set_title('Adjusted R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold')\n", - " axes[1].set_xlabel('Number of Features')\n", - " axes[1].set_ylabel('Adjusted R-squared')\n", - " axes[1].set_xscale('log')\n", - " axes[1].grid(True, alpha=0.3)\n", - " \n", - " # Plot 3: Out-of-sample R-squared\n", - " axes[2].plot(df_clean['n_features'], df_clean['r2_out_of_sample'], \n", - " marker='^', linewidth=2, markersize=6, color='red')\n", - " axes[2].set_title('Out-of-Sample R-squared vs Number of Features', fontsize=12, fontweight='bold')\n", - " axes[2].set_xlabel('Number of Features')\n", - " axes[2].set_ylabel('Out-of-Sample R-squared')\n", - " axes[2].set_xscale('log')\n", - " axes[2].grid(True, alpha=0.3)\n", - " \n", - " plt.tight_layout()\n", 
- " plt.show()\n", - " \n", - " return fig\n", - "\n", - "# Create the plots\n", - "fig = create_separate_plots(results_df)\n", - "\n", - "print(\"\\nThree separate plots created showing:\")\n", - "print(\"1. R² (Full Sample): Should show monotonic increase\")\n", - "print(\"2. Adjusted R² (Full Sample): Should show peak and decline due to complexity penalty\")\n", - "print(\"3. R² (Out-of-Sample): Should show the classic overfitting pattern (inverted U-shape)\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Results Summary" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Display complete results\n", - "print(\"\\n=== COMPLETE RESULTS TABLE ===\")\n", - "print(results_df.to_string(index=False, float_format='%.4f'))\n", - "\n", - "# Find optimal complexity\n", - "valid_results = results_df.dropna()\n", - "if not valid_results.empty:\n", - " optimal_adj_r2_idx = valid_results['adj_r2_full'].idxmax()\n", - " optimal_oos_r2_idx = valid_results['r2_out_of_sample'].idxmax()\n", - " \n", - " print(\"\\n=== OPTIMAL MODEL COMPLEXITY ===\")\n", - " print(f\"By Adjusted R²: {valid_results.loc[optimal_adj_r2_idx, 'n_features']} features\")\n", - " print(f\" - Adjusted R² = {valid_results.loc[optimal_adj_r2_idx, 'adj_r2_full']:.4f}\")\n", - " print(f\"By Out-of-Sample R²: {valid_results.loc[optimal_oos_r2_idx, 'n_features']} features\")\n", - " print(f\" - Out-of-Sample R² = {valid_results.loc[optimal_oos_r2_idx, 'r2_out_of_sample']:.4f}\")\n", - "\n", - "print(\"\\n=== INSIGHTS ===\")\n", - "print(\"✅ This analysis demonstrates the classic bias-variance tradeoff\")\n", - "print(\"📈 R² (Full Sample) should increase monotonically with model complexity\")\n", - "print(\"📊 Adjusted R² should peak early and then decline due to complexity penalty\")\n", - "print(\"📉 Out-of-Sample R² should show the inverted U-shape characteristic of overfitting\")\n", - "print(\"🎯 True model 
follows: y = exp(4*W) + e\")\n", - "print(\"⚠️ High-dimensional models (many features) lead to severe overfitting\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Save Results" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "# Create output directory\n", - "output_dir = '../output'\n", - "os.makedirs(output_dir, exist_ok=True)\n", - "\n", - "# Save results\n", - "results_df.to_csv(f'{output_dir}/overfitting_results_corrected.csv', index=False)\n", - "print(f\"Results saved to {output_dir}/overfitting_results_corrected.csv\")\n", - "\n", - "print(\"\\n🎉 CORRECTED overfitting analysis complete!\")\n", - "print(\"Data generation follows class example with:\")\n", - "print(\"- W ~ Uniform(0,1), sorted, n=1000\")\n", - "print(\"- e ~ Normal(0,1)\")\n", - "print(\"- y = exp(4*W) + e (class example)\")\n", - "print(\"- With intercept for proper estimation\")\n", - "print(\"- Seed = 42 for reproducibility\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} \ No newline at end of file diff --git a/Python/scripts/part2_overfitting_corrected_new.ipynb b/Python/scripts/part2_overfitting_corrected_new.ipynb deleted file mode 100644 index 4e64576..0000000 --- a/Python/scripts/part2_overfitting_corrected_new.ipynb +++ /dev/null @@ -1,358 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Assignment 1 - Part 2: Overfitting Analysis (CORRECTED)\n", - "## Overfitting (8 points)\n", - "\n", - "This notebook analyzes overfitting using the assignment 
specification:\n",
-    "**y = 2*x + e (no intercept)**\n",
-    "\n",
-    "**Key requirements:**\n",
-    "- Data generating process with intercept parameter equal to zero\n",
-    "- Do not use intercept in model estimation \n",
-    "- Test with different numbers of features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000\n",
-    "- Calculate R², Adjusted R², and Out-of-sample R²\n",
-    "- Use 75%/25% train/test split for out-of-sample evaluation\n",
-    "- Create three separate plots"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import numpy as np\n",
-    "import pandas as pd\n",
-    "import matplotlib.pyplot as plt\n",
-    "import seaborn as sns\n",
-    "from sklearn.linear_model import LinearRegression\n",
-    "from sklearn.model_selection import train_test_split\n",
-    "from sklearn.metrics import r2_score\n",
-    "import warnings\n",
-    "warnings.filterwarnings('ignore')\n",
-    "\n",
-    "# Set style for plots\n",
-    "plt.style.use('seaborn-v0_8')\n",
-    "sns.set_palette(\"husl\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 1. Data Generation (Following Assignment Specification)\n",
-    "\n",
-    "Generate data with:\n",
-    "- **n = 1000** observations\n",
-    "- **x** from Uniform(0,1), sorted\n",
-    "- **y = 2*x + e** where e ~ Normal(0,1)\n",
-    "- **No intercept** in the data generating process"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def generate_data(n=1000, seed=42):\n",
-    "    \"\"\"\n",
-    "    Generate data following the assignment specification:\n",
-    "    y = 2*x + e (no intercept, simple linear relationship)\n",
-    "    \"\"\"\n",
-    "    np.random.seed(seed)\n",
-    "    \n",
-    "    # Generate x from uniform distribution and sort\n",
-    "    x = np.random.uniform(0, 1, n)\n",
-    "    x.sort()\n",
-    "    x = x.reshape(-1, 1)\n",
-    "    \n",
-    "    # Generate error term\n",
-    "    e = np.random.normal(0, 1, n)\n",
-    "    \n",
-    "    # Generate y with simple linear relationship (no intercept): y = 2*x + e\n",
-    "    y = 2.0 * x.ravel() + e\n",
-    "    \n",
-    "    return x, y\n",
-    "\n",
-    "# Generate the data\n",
-    "x, y = generate_data(n=1000, seed=42)\n",
-    "\n",
-    "print(f\"Generated data with n={len(y)} observations\")\n",
-    "print(f\"True relationship: y = 2*x + e (no intercept)\")\n",
-    "print(f\"x range: [{x.min():.4f}, {x.max():.4f}]\")\n",
-    "print(f\"y range: [{y.min():.4f}, {y.max():.4f}]\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 2. Polynomial Feature Creation\n",
-    "\n",
-    "Create polynomial features x, x², x³, ..., xᵏ for different values of k."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def create_polynomial_features(x, n_features):\n",
-    "    \"\"\"\n",
-    "    Create polynomial features up to n_features.\n",
-    "    \"\"\"\n",
-    "    n_samples = x.shape[0]\n",
-    "    x_poly = np.zeros((n_samples, n_features))\n",
-    "    \n",
-    "    for i in range(n_features):\n",
-    "        x_poly[:, i] = x.ravel() ** (i + 1)  # x^1, x^2, x^3, etc.\n",
-    "    \n",
-    "    return x_poly\n",
-    "\n",
-    "def calculate_adjusted_r2(r2, n, k):\n",
-    "    \"\"\"\n",
-    "    Calculate adjusted R-squared.\n",
-    "    Adjusted R² = 1 - [(1 - R²)(n - 1) / (n - k - 1)]\n",
-    "    \"\"\"\n",
-    "    # Handle edge cases where we have too many features\n",
-    "    if n - k - 1 <= 0:\n",
-    "        return np.nan\n",
-    "    \n",
-    "    adj_r2 = 1 - ((1 - r2) * (n - 1) / (n - k - 1))\n",
-    "    return adj_r2"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 3. Overfitting Analysis\n",
-    "\n",
-    "Test models with different numbers of features: 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000\n",
-    "\n",
-    "For each model:\n",
-    "- Calculate R² on full sample\n",
-    "- Calculate Adjusted R² on full sample \n",
-    "- Calculate Out-of-sample R² using 75%/25% train/test split\n",
-    "- **Use fit_intercept=False** as per assignment requirements"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Number of features to test (as specified)\n",
-    "n_features_list = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]\n",
-    "\n",
-    "# Storage for results\n",
-    "results = []\n",
-    "\n",
-    "print(\"Analyzing overfitting for different numbers of features...\")\n",
-    "print(\"Features | R² (full) | Adj R² (full) | R² (out-of-sample)\")\n",
-    "print(\"-\" * 60)\n",
-    "\n",
-    "for n_feat in n_features_list:\n",
-    "    try:\n",
-    "        # Create polynomial features\n",
-    "        x_poly = create_polynomial_features(x, n_feat)\n",
-    "        \n",
-    "        # Split data into train/test (75%/25%)\n",
-    "        x_train, x_test, y_train, y_test = train_test_split(\n",
-    "            x_poly, y, test_size=0.25, random_state=42\n",
-    "        )\n",
-    "        \n",
-    "        # Fit model on full sample (WITHOUT intercept as requested)\n",
-    "        model_full = LinearRegression(fit_intercept=False)\n",
-    "        model_full.fit(x_poly, y)\n",
-    "        y_pred_full = model_full.predict(x_poly)\n",
-    "        r2_full = r2_score(y, y_pred_full)\n",
-    "        \n",
-    "        # Calculate adjusted R²\n",
-    "        adj_r2_full = calculate_adjusted_r2(r2_full, len(y), n_feat)\n",
-    "        \n",
-    "        # Fit model on training data and predict on test data (WITHOUT intercept)\n",
-    "        model_train = LinearRegression(fit_intercept=False)\n",
-    "        model_train.fit(x_train, y_train)\n",
-    "        y_pred_test = model_train.predict(x_test)\n",
-    "        r2_out_of_sample = r2_score(y_test, y_pred_test)\n",
-    "        \n",
-    "        # Store results\n",
-    "        results.append({\n",
-    "            'n_features': n_feat,\n",
-    "            'r2_full': r2_full,\n",
-    "            'adj_r2_full': adj_r2_full,\n",
-    "            'r2_out_of_sample': r2_out_of_sample\n",
-    "        })\n",
-    "        \n",
-    "        print(f\"{n_feat:8d} | {r2_full:9.4f} | {adj_r2_full:12.4f} | {r2_out_of_sample:17.4f}\")\n",
-    "        \n",
-    "    except Exception as e:\n",
-    "        print(f\"Error with {n_feat} features: {str(e)}\")\n",
-    "        # Still append to maintain consistency\n",
-    "        results.append({\n",
-    "            'n_features': n_feat,\n",
-    "            'r2_full': np.nan,\n",
-    "            'adj_r2_full': np.nan,\n",
-    "            'r2_out_of_sample': np.nan\n",
-    "        })\n",
-    "\n",
-    "# Convert to DataFrame\n",
-    "results_df = pd.DataFrame(results)\n",
-    "print(\"\\n=== COMPLETE RESULTS TABLE ===\")\n",
-    "print(results_df.to_string(index=False, float_format='%.4f'))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 4. Visualization: Three Separate Plots\n",
-    "\n",
-    "Create three separate graphs as requested:\n",
-    "1. R² (Full Sample) vs Number of Features\n",
-    "2. Adjusted R² (Full Sample) vs Number of Features \n",
-    "3. Out-of-Sample R² vs Number of Features"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Filter out NaN values for plotting\n",
-    "df_clean = results_df.dropna()\n",
-    "\n",
-    "# Create figure with subplots\n",
-    "fig, axes = plt.subplots(1, 3, figsize=(18, 5))\n",
-    "\n",
-    "# Plot 1: R-squared (full sample)\n",
-    "axes[0].plot(df_clean['n_features'], df_clean['r2_full'], \n",
-    "             marker='o', linewidth=2, markersize=6, color='blue')\n",
-    "axes[0].set_title('R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold')\n",
-    "axes[0].set_xlabel('Number of Features')\n",
-    "axes[0].set_ylabel('R-squared')\n",
-    "axes[0].set_xscale('log')\n",
-    "axes[0].grid(True, alpha=0.3)\n",
-    "axes[0].set_ylim(0, 1)\n",
-    "\n",
-    "# Plot 2: Adjusted R-squared (full sample)\n",
-    "axes[1].plot(df_clean['n_features'], df_clean['adj_r2_full'], \n",
-    "             marker='s', linewidth=2, markersize=6, color='green')\n",
-    "axes[1].set_title('Adjusted R-squared on Full Sample vs Number of Features', fontsize=12, fontweight='bold')\n",
-    "axes[1].set_xlabel('Number of Features')\n",
-    "axes[1].set_ylabel('Adjusted R-squared')\n",
-    "axes[1].set_xscale('log')\n",
-    "axes[1].grid(True, alpha=0.3)\n",
-    "\n",
-    "# Plot 3: Out-of-sample R-squared\n",
-    "axes[2].plot(df_clean['n_features'], df_clean['r2_out_of_sample'], \n",
-    "             marker='^', linewidth=2, markersize=6, color='red')\n",
-    "axes[2].set_title('Out-of-Sample R-squared vs Number of Features', fontsize=12, fontweight='bold')\n",
-    "axes[2].set_xlabel('Number of Features')\n",
-    "axes[2].set_ylabel('Out-of-Sample R-squared')\n",
-    "axes[2].set_xscale('log')\n",
-    "axes[2].grid(True, alpha=0.3)\n",
-    "\n",
-    "plt.tight_layout()\n",
-    "plt.show()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 5. Results Interpretation\n",
-    "\n",
-    "### Findings:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Find optimal complexity\n",
-    "valid_results = results_df.dropna()\n",
-    "if not valid_results.empty:\n",
-    "    optimal_adj_r2_idx = valid_results['adj_r2_full'].idxmax()\n",
-    "    optimal_oos_r2_idx = valid_results['r2_out_of_sample'].idxmax()\n",
-    "    \n",
-    "    print(\"=== OPTIMAL MODEL COMPLEXITY ===\")\n",
-    "    print(f\"By Adjusted R²: {valid_results.loc[optimal_adj_r2_idx, 'n_features']} features\")\n",
-    "    print(f\"  - Adjusted R² = {valid_results.loc[optimal_adj_r2_idx, 'adj_r2_full']:.4f}\")\n",
-    "    print(f\"By Out-of-Sample R²: {valid_results.loc[optimal_oos_r2_idx, 'n_features']} features\")\n",
-    "    print(f\"  - Out-of-Sample R² = {valid_results.loc[optimal_oos_r2_idx, 'r2_out_of_sample']:.4f}\")\n",
-    "\n",
-    "print(\"\\n=== INSIGHTS ===\")\n",
-    "print(\"✅ This analysis demonstrates the classic bias-variance tradeoff\")\n",
-    "print(\"📈 R² (Full Sample) increases monotonically with model complexity\")\n",
-    "print(\"📊 Adjusted R² peaks early and then declines due to complexity penalty\")\n",
-    "print(\"📉 Out-of-Sample R² shows the inverted U-shape characteristic of overfitting\")\n",
-    "print(\"🎯 True model follows: y = 2*x + e (no intercept)\")\n",
-    "print(\"⚠️ High-dimensional models (many features) lead to severe overfitting\")\n",
-    "print(\"\\n🔹 The simple linear relationship with no intercept produces reasonable R² values\")\n",
-    "print(\"🔹 Using fit_intercept=False follows the assignment specification\")\n",
-    "print(\"🔹 Results clearly show overfitting patterns without extreme values\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Comments on Results\n",
-    "\n",
-    "### 1. R² (Full Sample)\n",
-    "- **Pattern**: Monotonically increases from ~0.24 to ~0.28\n",
-    "- **Interpretation**: More complex models always fit the training data better\n",
-    "- **Expected behavior**: ✅ Confirmed\n",
-    "\n",
-    "### 2. Adjusted R² (Full Sample) \n",
-    "- **Pattern**: Peaks around 10 features (~0.25), then declines\n",
-    "- **Interpretation**: Complexity penalty prevents overfitting in model selection\n",
-    "- **Expected behavior**: ✅ Confirmed - shows inverted U-shape\n",
-    "\n",
-    "### 3. Out-of-Sample R²\n",
-    "- **Pattern**: Starts highest (~0.32), stays stable initially, then severely deteriorates\n",
-    "- **Interpretation**: Classic overfitting - performance degrades on unseen data with high complexity\n",
-    "- **Expected behavior**: ✅ Confirmed - clear overfitting at 500+ features\n",
-    "\n",
-    "### Key Intuition\n",
-    "- **Simple relationship** (y = 2x + e) with **no intercept** produces interpretable results\n",
-    "- **Polynomial features** create overfitting when k >> true model complexity\n",
-    "- **Out-of-sample evaluation** is crucial for detecting overfitting\n",
-    "- **Adjusted R²** provides a good balance between fit and complexity"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.8.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
\ No newline at end of file
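For review purposes, the pipeline this patch moves to — DGP `y = exp(4*W) + e` with no intercept, polynomial features `W, W², …, Wᵏ`, `LinearRegression(fit_intercept=False)`, a 75%/25% split, and the adjusted R² penalty — can be sanity-checked outside the notebook. This is a minimal standalone sketch: the helper name `overfit_metrics` is mine, and it seeds via `np.random.default_rng` rather than the notebook's `np.random.seed`, so exact numbers will differ from the notebook's output.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

def generate_data(n=1000, seed=42):
    """DGP from the corrected notebook: y = exp(4*W) + e, no intercept."""
    rng = np.random.default_rng(seed)  # sketch uses the Generator API
    W = np.sort(rng.uniform(0.0, 1.0, n)).reshape(-1, 1)
    e = rng.normal(0.0, 1.0, n)
    return W, np.exp(4.0 * W.ravel()) + e

def overfit_metrics(W, y, n_feat):
    """Full-sample R2, adjusted R2, and out-of-sample R2 for W, W^2, ..., W^n_feat."""
    X = np.hstack([W ** (i + 1) for i in range(n_feat)])  # no intercept column
    full = LinearRegression(fit_intercept=False).fit(X, y)
    r2 = r2_score(y, full.predict(X))
    n, k = len(y), n_feat
    # Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), undefined when k >= n - 1
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1) if n - k - 1 > 0 else float("nan")
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)
    oos = r2_score(y_te, LinearRegression(fit_intercept=False).fit(X_tr, y_tr).predict(X_te))
    return r2, adj, oos

W, y = generate_data()
r2_1, adj_1, oos_1 = overfit_metrics(W, y, 1)
r2_20, adj_20, oos_20 = overfit_metrics(W, y, 20)
print(f"k=1 : R2={r2_1:.3f}  adjR2={adj_1:.3f}  OOS R2={oos_1:.3f}")
print(f"k=20: R2={r2_20:.3f}  adjR2={adj_20:.3f}  OOS R2={oos_20:.3f}")
```

Because the feature sets are nested, the full-sample R² can only rise as k grows, while out-of-sample R² is free to fall — the contrast the notebook's three plots are built to show.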