akanksha3-3

Hi there, I'm Akanksha Waghamode! 👋

🧬 Bioinformatics Graduate Student | Machine Learning Enthusiast | Drug Discovery Researcher | NGS

Welcome to my GitHub profile! I'm a Bioinformatics graduate student with a strong foundation in computational biology, machine learning, and data analysis. I'm currently pursuing my Master's in Bioinformatics while working on research in drug discovery and predictive modeling.

🎓 Education

🎓 Master of Science in Bioinformatics (2023-2025)
Rajiv Gandhi Institute of IT and Biotechnology, Pune

🎓 Bachelor of Science in Biotechnology (2019-2022)
MES Abasaheb Garware College of Science and Arts, Pune

🔬 Current Research & Work

🧪 Project Intern | MultiTargetAI (Jan 2025 - May 2025)

Currently developing machine learning-based QSAR models and applying molecular docking to discover novel dual inhibitors of the sodium-glucose co-transporters SGLT1 and SGLT2 for Type 2 Diabetes Mellitus. This work involves:

Advanced descriptor analysis and activity prediction
Virtual screening using Python and ML technologies
Integration of computational chemistry with machine learning

💻 Technical Skills

Programming & Databases:

🐍 Python | R | SQL

Specializations:

🤖 Machine Learning | Data Analysis | Linux

Domain Expertise:

💊 Computer-Aided Drug Design (CADD)
🧬 Bioinformatics Pipeline Development

Bioinformatics Techniques:

🧬 RNA-sequencing Analysis
🔍 Genome-Wide Variation Analysis
📊 Genomic Data Processing

Data Visualization:

📊 PowerBI | Tableau | Matplotlib | Seaborn

🚀 Featured Projects

🧬 Breast Cancer Classification Pipeline

Built an end-to-end ML pipeline for breast cancer biomarker classification (569 samples, 30 features, 90:10 imbalance)

Models: Logistic Regression, Random Forest, XGBoost with augmentation (SMOTE, ADASYN, Random Oversampling)
Best Performance: Logistic Regression + Random Oversampling (98.89% ROC-AUC, 100% Recall)
Impact: Reached 100% sensitivity (recall) under cross-validation, the key metric when a missed diagnosis is the costliest error
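
The random oversampling step named above can be illustrated with a minimal, standard-library-only sketch. This is not the project's code: the function name `random_oversample` and the toy data are hypothetical, and a real pipeline would more likely use a library implementation.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))  # sample with replacement
            y_out.append(label)
    return X_out, y_out


# Toy data with a 90:10 class split (hypothetical values, not the project data)
X = [[float(i)] for i in range(20)]
y = [0] * 18 + [1] * 2
X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Oversampling should be applied only to the training folds during cross-validation; resampling before the split lets duplicated rows leak into the validation data and inflates the scores.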

🔬 Synthetic Data Augmentation for Imbalanced Classification

Developed and benchmarked 5 oversampling techniques for highly imbalanced datasets (500 samples, 20 features, 8.4:1 ratio)

Methods: SMOTE, Borderline-SMOTE, ADASYN, Statistical Gaussian, Noise Injection with quality validation
Best Performance: All methods achieved perfect metrics (100% ROC-AUC, 100% Recall, 100% F1-Score)
Impact: Improved recall from a 27% baseline to 100%; synthetic samples passed K-S test validation (p-value > 0.48) and preserved feature correlations well
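
As a rough illustration of noise injection followed by a distribution check, the sketch below generates synthetic values and computes the two-sample Kolmogorov-Smirnov statistic by hand. The function names, the `scale` parameter, and the sample values are assumptions made for this example; it computes only the statistic, and a p-value like the one quoted above would normally come from a statistics library such as `scipy.stats.ks_2samp`.

```python
import random
from bisect import bisect_right


def noise_injection(values, n_new, scale=0.05, seed=0):
    """Create synthetic values by adding Gaussian noise to randomly
    chosen originals. `scale` is a fraction of the observed range."""
    rng = random.Random(seed)
    spread = (max(values) - min(values)) or 1.0
    return [rng.choice(values) + rng.gauss(0.0, scale * spread)
            for _ in range(n_new)]


def ks_statistic(a, b):
    """Two-sample K-S statistic: the largest gap between the two
    empirical CDFs, evaluated at every observed value."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical minority-class feature values
minority = [2.1, 2.4, 2.2, 2.8, 2.5, 2.3]
synthetic = noise_injection(minority, n_new=50)
d = ks_statistic(minority, synthetic)
print(f"K-S statistic: {d:.3f}")
```

A small statistic means the synthetic values follow a distribution close to the original one; the check is run per feature.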

🏠 House Price Prediction Model

Built an end-to-end ML pipeline using the Ames Housing Dataset (2,932 records, 82 features)

Models: Linear Regression, Random Forest, XGBoost
Best Performance: XGBoost with cross-validation
Impact: Delivered actionable insights for real estate price prediction

📧 Automated News Aggregator

Developed a Python script using BeautifulSoup for web scraping

Sources: BBC and NDTV websites
Output: Daily HTML email digest
Automation: Scheduled news compilation and delivery
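
The digest step can be sketched with the standard library alone. This example assumes the headlines have already been scraped into `(title, url)` pairs; the function name `build_digest`, the addresses, and the headline are hypothetical, and the scraping and sending steps are left out.

```python
from email.message import EmailMessage
from html import escape


def build_digest(headlines, sender, recipient):
    """Build a multipart email whose HTML part lists each headline as a link."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in headlines
    )
    html_body = (
        "<html><body><h2>Daily News Digest</h2>"
        f"<ul>\n{items}\n</ul></body></html>"
    )
    msg = EmailMessage()
    msg["Subject"] = "Daily News Digest"
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for clients that do not render HTML
    msg.set_content("\n".join(f"{t}: {u}" for t, u in headlines))
    msg.add_alternative(html_body, subtype="html")
    return msg


# Hypothetical input; real entries would come from the scraper
headlines = [("Example & headline", "https://example.com/story")]
msg = build_digest(headlines, "digest@example.com", "reader@example.com")
print(msg["Subject"])
```

Escaping titles and URLs keeps scraped text from breaking the HTML. The message could then be handed to `smtplib.SMTP.send_message` by a scheduled job.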

🧬 Genomic Data Analysis

Executed comprehensive bioinformatics pipelines on Linux

Analysis Types: Genome-Wide Variation and RNA-Seq
Data: SRA datasets
Environment: Linux command-line tools

🏥 Healthcare ML Models

Built predictive models during the LearnToUpgrade AI internship

Applications: Cancer and BMI prediction
Algorithms: KNN and Naïve Bayes
Data: DNA k-mer count features
Interface: Interactive Streamlit dashboards
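
For readers unfamiliar with k-mer counting, here is a minimal sketch of how a DNA sequence can be turned into a fixed-length count vector that classifiers such as KNN or Naïve Bayes accept. The function names, the choice of `k=3`, and the example sequence are assumptions for illustration, not the internship code.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k in the sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(sequence, k=3, alphabet="ACGT"):
    """Return counts in a fixed order over all len(alphabet)**k possible k-mers."""
    counts = kmer_counts(sequence, k)
    vocabulary = ("".join(p) for p in product(alphabet, repeat=k))
    return [counts[kmer] for kmer in vocabulary]


counts = kmer_counts("ATGCATGC", k=3)
print(counts["ATG"])  # 2: ATG occurs at positions 0 and 4
vector = kmer_vector("ATGCATGC", k=3)
print(len(vector))    # 64 possible 3-mers over A, C, G, T
```

The fixed ordering means every sequence maps to a vector of the same length, so sequences of different lengths can be compared directly.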

📜 Certifications

🏆 Career Edge - Young Professional | TCS iON | Jun 2025
🧬 Genomics and Bioinformatics | IISER Kolkata | Aug 2025
📊 Data Science & Analytics | hp LIFE | Aug 2025

🌟 Core Competencies

Technical Excellence:

Problem-Solving & Critical Thinking
Statistical Analysis & Data Modeling
Computational Biology Applications

Leadership & Communication:

Team Collaboration & Leadership
Scientific Communication
Active Learning & Adaptability

🔍 Research Interests

💊 Drug Discovery & Development
🧬 Computational Biology
🤖 Machine Learning in Healthcare
📊 Genomics Data Analysis
🔬 QSAR Modeling
💻 Bioinformatics Tool Development

📫 Let's Connect!

📧 Email: akankshawaghamode2001@gmail.com
📱 Phone: +91-8080640427
💼 LinkedIn: Connect with me
🐙 GitHub: You're already here!

🌱 Currently Learning

Advanced Deep Learning techniques for molecular modeling
Cloud computing platforms for bioinformatics (AWS, Google Cloud)
Advanced statistical methods for genomic data analysis
MLOps for deploying ML models in production

💡 Fun Facts

🧬 Passionate about bridging the gap between biology and technology
📊 Love turning complex biological data into actionable insights
🎯 Always excited to collaborate on interdisciplinary projects
🌟 Believe in the power of open-source science

"Combining the power of computation with the complexity of biology to solve real-world problems."

⭐ Feel free to explore my repositories and don't forget to star the ones you find interesting!
