akanksha3-3/README.md

Hi there, I'm Akanksha Waghamode! 👋

🧬 Bioinformatics Graduate Student | Machine Learning Enthusiast | Drug Discovery Researcher | NGS

Welcome to my GitHub profile! I'm a Bioinformatics graduate student with a strong foundation in computational biology, machine learning, and data analysis. I am currently pursuing my Master's in Bioinformatics while working on research in drug discovery and predictive modeling.


🎓 Education

🎓 Master of Science in Bioinformatics (2023-2025)
Rajiv Gandhi Institute of IT and Biotechnology, Pune

🎓 Bachelor of Science in Biotechnology (2019-2022)
MES Abasaheb Garware College of Science and Arts, Pune


🔬 Current Research & Work

🧪 Project Intern | MultiTargetAI (Jan 2025 - May 2025)

Currently developing machine learning-based QSAR models and applying molecular docking to discover novel dual inhibitors of the sodium-glucose co-transporters SGLT1 and SGLT2 for Type 2 Diabetes Mellitus. This work involves:

  • Advanced descriptor analysis and activity prediction
  • Virtual screening using Python and ML technologies
  • Integration of computational chemistry with machine learning

💻 Technical Skills

Programming & Databases:

  • 🐍 Python | R | SQL

Specializations:

  • 🤖 Machine Learning | Data Analysis | Linux

Domain Expertise:

  • 💊 Computer-Aided Drug Design (CADD)
  • 🧬 Bioinformatics Pipeline Development

Bioinformatics Techniques:

  • 🧬 RNA-sequencing Analysis
  • 🔍 Genome-Wide Variation Analysis
  • 📊 Genomic Data Processing

Data Visualization:

  • 📊 PowerBI | Tableau | Matplotlib | Seaborn

🚀 Featured Projects

Built an end-to-end ML pipeline for breast cancer biomarker classification (569 samples, 30 features, 90:10 imbalance)

  • Models: Logistic Regression, Random Forest, XGBoost with augmentation (SMOTE, ADASYN, Random Oversampling)
  • Best Performance: Logistic Regression + Random Oversampling (98.89% ROC-AUC, 100% Recall)
  • Impact: Achieved perfect sensitivity for medical diagnosis with comprehensive cross-validation
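As a rough illustration of the random oversampling step named above, here is a minimal standard-library sketch. It is not the project's code: the function name `random_oversample`, the toy data, and the seed are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data with a 90:10 class ratio (illustrative values only).
X = [[float(i)] for i in range(20)]
y = [0] * 18 + [1] * 2
X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
```

In a cross-validated pipeline, resampling is normally applied inside each training fold only, so duplicated rows never leak into the validation data.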

Developed and benchmarked 5 oversampling techniques for highly imbalanced datasets (500 samples, 20 features, 8.4:1 ratio)

  • Methods: SMOTE, Borderline-SMOTE, ADASYN, Statistical Gaussian, Noise Injection with quality validation
  • Best Performance: All methods achieved perfect metrics (100% ROC-AUC, 100% Recall, 100% F1-Score)
  • Impact: Improved baseline recall from 27% to 100% with K-S test validation (p-value > 0.48) and excellent correlation preservation
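The idea of checking synthetic samples against the real distribution with a K-S test can be sketched as follows. This is a simplified illustration using only the standard library, not the benchmark's implementation: `noise_injection`, `ks_statistic`, the noise scale, and the toy feature values are all assumptions. It computes only the K-S statistic; a p-value such as the one quoted above would typically come from a statistics library, for example `scipy.stats.ks_2samp`.

```python
import random


def noise_injection(values, n_new, scale=0.05, seed=0):
    """Create synthetic values by adding small Gaussian noise to randomly
    chosen real values. `scale` is a fraction of the observed range."""
    rng = random.Random(seed)
    spread = max(values) - min(values)
    return [rng.choice(values) + rng.gauss(0.0, scale * spread)
            for _ in range(n_new)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare CDFs.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical minority-class feature values.
real = [0.8, 1.1, 1.3, 1.7, 2.0, 2.4, 2.9, 3.1]
synthetic = noise_injection(real, n_new=50)
d = ks_statistic(real, synthetic)
```

A small statistic suggests the synthetic feature follows the real distribution closely; a value near 1 means the two samples barely overlap.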

Built an end-to-end ML pipeline using Ames Housing Dataset (2,932 records, 82 features)

  • Models: Linear Regression, Random Forest, XGBoost
  • Best Performance: XGBoost with cross-validation
  • Impact: Delivered actionable insights for real estate price prediction

Developed a Python script using BeautifulSoup for web scraping

  • Sources: BBC and NDTV websites
  • Output: Daily HTML email digest
  • Automation: Scheduled news compilation and delivery
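The digest step can be sketched with Python's standard `email` package. This is only an assumed shape for that step: the scraping itself is omitted, the headlines are hard-coded placeholders, and `build_digest`, the subject line, and the addresses are hypothetical names chosen for the example.

```python
from email.message import EmailMessage
from html import escape


def build_digest(headlines, sender, recipient):
    """Build a multipart email with a plain-text part and an HTML part.
    `headlines` is a list of (title, url) pairs."""
    msg = EmailMessage()
    msg["Subject"] = "Daily news digest"
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("\n".join(f"{t}: {u}" for t, u in headlines))
    # Escape titles and URLs so scraped text cannot break the markup.
    items = "".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(t)}</a></li>'
        for t, u in headlines
    )
    msg.add_alternative(
        f"<html><body><ul>{items}</ul></body></html>", subtype="html"
    )
    return msg


digest = build_digest(
    [("Example headline <1>", "https://example.com/story")],
    "me@example.com",
    "you@example.com",
)
html_part = digest.get_body(preferencelist=("html",)).get_content()
```

Sending the message (for example with `smtplib`) and scheduling the daily run are left out of this sketch.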

🧬 Genomic Data Analysis

Executed comprehensive bioinformatics pipelines on Linux

  • Analysis Types: Genome-Wide Variation and RNA-Seq
  • Data: SRA datasets
  • Environment: Linux command-line tools

🏥 Healthcare ML Models

Built predictive models during the LearnToUpgrade AI internship

  • Applications: Cancer and BMI prediction
  • Algorithms: KNN and Naïve Bayes
  • Data: DNA k-mer count analysis
  • Interface: Interactive Streamlit dashboards
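For readers unfamiliar with k-mer counting, the feature extraction behind models like these can be sketched as below. This is a generic illustration, not the internship code: `kmer_counts`, `kmer_vector`, the choice of k = 3, and the example sequence are assumptions.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k in a DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence, k=3):
    """Fixed-length feature vector: one count per possible k-mer over
    the alphabet A, C, G, T, in lexicographic order."""
    counts = kmer_counts(sequence, k)
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


counts = kmer_counts("ATGATGC", k=3)
vector = kmer_vector("ATGATGC", k=3)
```

Because every sequence maps to a vector of the same length (4^k entries), the result can be fed directly to classifiers such as KNN or Naïve Bayes.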

📜 Certifications

  • 🏆 Career Edge - Young Professional | TCS iON | Jun 2025
  • 🧬 Genomics and Bioinformatics | IISER Kolkata | Aug 2025
  • 📊 Data Science & Analytics | hp LIFE | Aug 2025

🌟 Core Competencies

Technical Excellence:

  • Problem-Solving & Critical Thinking
  • Statistical Analysis & Data Modeling
  • Computational Biology Applications

Leadership & Communication:

  • Team Collaboration & Leadership
  • Scientific Communication
  • Active Learning & Adaptability

📈 GitHub Stats

Akanksha's GitHub Stats

Top Languages


🔍 Research Interests

  • 💊 Drug Discovery & Development
  • 🧬 Computational Biology
  • 🤖 Machine Learning in Healthcare
  • 📊 Genomics Data Analysis
  • 🔬 QSAR Modeling
  • 💻 Bioinformatics Tool Development

📫 Let's Connect!


🌱 Currently Learning

  • Advanced Deep Learning techniques for molecular modeling
  • Cloud computing platforms for bioinformatics (AWS, Google Cloud)
  • Advanced statistical methods for genomic data analysis
  • MLOps for deploying ML models in production

💡 Fun Facts

  • 🧬 Passionate about bridging the gap between biology and technology
  • 📊 Love turning complex biological data into actionable insights
  • 🎯 Always excited to collaborate on interdisciplinary projects
  • 🌟 Believe in the power of open-source science

"Combining the power of computation with the complexity of biology to solve real-world problems."

⭐ Feel free to explore my repositories and don't forget to star the ones you find interesting!

Popular repositories

  1. akanksha3-3 (Public)

    Personal GitHub profile README showcasing my interests.

  2. ames-house-price-prediction (Public)

    Jupyter Notebook

  3. automated-email-based-news-aggregator (Public)

    Python

  4. breast-cancer-classification-pipeline (Public)

    Python

  5. synthetic-data-augmentation-for-imbalanced-classification (Public)

    Python