Description
Labels: enhancement, good-first-issue, hacktoberfest, ml-integration, priority-high
Difficulty: ⭐ Intermediate
Estimated Effort: 2–4 hours
Skills Required: Python, Streamlit, scikit-learn, ML model integration
🐛 Problem Description
Current Behavior:
The fare estimator currently uses random selection to display crowd levels:
import random

def crowd_level():
    # Picks a label at random; station, hour, and day are ignored.
    levels = ["Low 🟢", "Moderate 🟡", "High 🔴"]
    return random.choice(levels)
This makes the output unpredictable: the same station can show a different crowd level on every refresh, so passengers cannot rely on it for crowd insights.
Expected Behavior:
Crowd levels should be predicted using the trained RandomForestRegressor model (passenger_flow_model.pkl), which factors in:
⏰ Hour of the day (peak vs off-peak)
📅 Day of the week (weekday/weekend)
🚉 Station-specific characteristics
📊 Historical passenger flow patterns
🎯 Why This Matters
Core Feature Missing: The trained ML model isn’t being used.
User Trust: Random results reduce app credibility.
Real Impact: Commuters rely on accurate crowd predictions to plan travel and avoid congestion.
Learning Value: Great issue for contributors to learn how to integrate ML predictions into a Streamlit app.
💡 Proposed Solution
Add Time & Day Selection in UI
Create Streamlit controls for hour (slider) and day (dropdown).
Load the ML Model Efficiently
Use @st.cache_resource to load the model once for faster predictions.
Implement predict_crowd_level()
Generate predictions using station name, hour, and day of week as inputs.
Convert numeric output into readable crowd levels (Low, Moderate, High).
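One possible shape for this function is sketched below. The station names, the feature order `[station_id, hour, day_of_week]`, and the cut-offs of 100 and 300 passengers are all assumptions made for illustration; the real values depend on how passenger_flow_model.pkl was trained.

```python
# Hypothetical station encoding; it must match the one used in training.
STATION_IDS = {"Central": 0, "Airport": 1}

# Assumed passenger-count cut-offs; tune these against the training data.
LOW_MAX = 100
MODERATE_MAX = 300


def to_crowd_label(passengers):
    """Map a predicted passenger count to a readable crowd level."""
    if passengers < LOW_MAX:
        return "Low 🟢"
    if passengers < MODERATE_MAX:
        return "Moderate 🟡"
    return "High 🔴"


def predict_crowd_level(model, station, hour, day_of_week):
    """Return (label, predicted_passengers) for one station and time.

    `model` is any object with a scikit-learn style predict() method.
    """
    # Assumed feature order: station id, hour, day of week.
    features = [[STATION_IDS[station], hour, day_of_week]]
    passengers = float(model.predict(features)[0])
    return to_crowd_label(passengers), passengers
```

Since a fitted model returns the same value for the same feature row, calling this function twice with identical inputs yields identical results, unlike the `random.choice` version.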
Update the UI
Show predictions for both start and end stations using st.metric(), including passenger count deltas.
Error Handling
Display a Streamlit warning if the model file is missing.
✅ Acceptance Criteria
Model loaded with caching
Time/day selectors added to UI
Predictions use ML model instead of randomness
Results update dynamically with inputs
Deterministic outputs (same inputs → same prediction)
Graceful error message if model file missing
🧪 Testing Checklist
Peak hours show higher crowd predictions
Weekends show lighter crowds
Different stations yield distinct results
UI remains responsive and consistent
🏆 Impact
Before: Random, meaningless crowd levels.
After: Consistent, ML-driven predictions that give commuters data-backed crowd estimates for a chosen station, hour, and day.
This enhancement connects the ML backend to the user-facing interface, so the crowd levels shown in the app come from the trained model rather than a random placeholder.
@Manjushwarofficial please assign this to me