Akchiche-Mohamed-Aymen

๐Ÿ› ๏ธ Description of Contribution: Added Feature Normalization Step

🔄 What I Did:

I added a feature normalization step to the dataset preprocessing pipeline for the machine learning algorithm implementations (Linear Regression & Logistic Regression). The original repository did not include this step, which can significantly affect the performance and convergence behavior of many ML models.

📌 Summary of Changes:

  • Introduced a normalization function that scales input features to a standardized range (mean = 0, std = 1) or [0, 1], depending on the model requirement.
  • Updated the training pipeline to apply normalization before model fitting.
  • Ensured that the same transformation is applied during inference/prediction.
  • Kept normalization local to each model rather than extracting it into a shared module, so each model's behavior remains clear and self-contained.
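The fit-on-train, apply-everywhere approach described above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the function names `fit_normalizer` and `normalize` are placeholders, and the z-score variant is shown (the [0, 1] min-max variant is analogous):

```python
import numpy as np

def fit_normalizer(X):
    """Compute per-feature mean and std from the training set only."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant (zero-variance) features
    return mu, sigma

def normalize(X, mu, sigma):
    """Apply the stored training-set statistics; the same mu and sigma
    must be reused at inference/prediction time."""
    return (X - mu) / sigma
```

Computing `mu` and `sigma` on the training set once, then reusing them at prediction time, is what guarantees the train/inference consistency mentioned above.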

📈 Why Normalization Matters:

Feature normalization is a crucial step in machine learning, especially when:

  • Features have different units or scales (e.g., age in years vs. income in dollars).
  • Models rely on gradient-based optimization (e.g., Linear Regression, Logistic Regression, Neural Networks), where unscaled features can lead to slow convergence or divergent gradients.
  • Each feature should contribute fairly, without one dominating simply because of its larger magnitude.

Without normalization:

  • The cost function may not converge efficiently.
  • The model might be biased toward high-magnitude features.
  • Training can become unstable or yield suboptimal results.
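The convergence point above can be demonstrated with a small experiment (an illustrative sketch on synthetic data with hypothetical "age" and "income" features, not code from this repository): plain gradient descent on a two-feature linear regression, where one feature is several orders of magnitude larger than the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 60, n)               # small-scale feature
income = rng.uniform(20_000, 120_000, n)   # large-scale feature
X_raw = np.column_stack([age, income])
y = 0.5 * age + 1e-4 * income + rng.normal(0, 0.1, n)

def gd_mse(X, y, lr, steps=500):
    """Plain gradient descent on MSE; returns the final training MSE."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * (2 / len(y)) * Xb.T @ (Xb @ w - y)
    return float(np.mean((Xb @ w - y) ** 2))

# Raw features: any learning rate large enough to move the small-scale
# weights blows up on the large-scale one, so we are stuck with a tiny lr
# and the cost barely decreases.
mse_raw = gd_mse(X_raw, y, lr=5e-11)

# Normalized features: one moderate learning rate works for every weight.
X_norm = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
mse_norm = gd_mse(X_norm, y, lr=0.1)
```

After the same number of steps, the normalized run reaches an error near the noise floor, while the raw run is still far from converged.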
