Open
Description
🔴 Priority: HIGH | Type: Enhancement
1. SUMMARY
- Feedback collection exists (`FeedbackStore`), but feedback is not used to improve query generation. Users can submit corrections, but they don't influence future queries.
- Impact: Missed opportunity for continuous improvement. Users correct the same mistakes repeatedly. No learning from production usage patterns.
2. SYSTEM CONTEXT
db/
├── feedback.py # FeedbackStore - collects feedback (EXISTS)
└── examples.py # ExampleStore - few-shot examples (EXISTS)
app/
├── routes_feedback.py # POST /feedback endpoint (EXISTS)
├── agent/engine.py # Agent doesn't use feedback (GAP)
└── text2sql_engine.py # Engine doesn't use feedback (GAP)
models/
├── prompts.py # Prompt templates (no feedback integration)
└── fine_tuning.py # Fine-tuning module (not connected to feedback)
Current flow:
User → Query → SQL → (wrong) → User submits correction → Stored → END
Desired flow:
User → Query → SQL → (wrong) → User submits correction → Stored →
→ Verified → Added to few-shot examples OR fine-tuning dataset
→ Future queries benefit from correction
3. CURRENT STATE (with code)
📄 File: db/feedback.py:82-100
async def submit_feedback(
    self,
    database_id: str,
    natural_query: str,
    generated_sql: str | None = None,
    corrected_sql: str | None = None,
    rating: int | None = None,
    comment: str | None = None,
) -> FeedbackEntry:
    """Submit user feedback for a query."""
    # Stores feedback but doesn't use it for improvement
Feedback is collected but never processed further.
📄 File: db/examples.py (ExampleStore)
class ExampleStore:
    """Store and manage few-shot examples."""
    # Separate from feedback - no automatic promotion
Few-shot examples are manually managed, not populated from feedback.
📄 File: app/agent/engine.py
# No reference to FeedbackStore
# No learning from corrections
Agent doesn't check feedback or learn from corrections.
4. PROPOSED SOLUTION
Create a feedback loop that:
- Allows users to mark queries as correct/incorrect with corrections
- Promotes verified corrections to few-shot examples
- Aggregates feedback for fine-tuning datasets
- Shows feedback status to users (pending, verified, applied)
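The three statuses named in the last bullet suggest a forward-only lifecycle. The sketch below is one way to express it, not the project's actual definition: the proposal imports `FeedbackStatus` from `db.feedback` without showing it, so the enum body, the `PENDING` member, `ALLOWED_TRANSITIONS`, and `can_transition` are all assumptions made for illustration.

```python
from enum import Enum


class FeedbackStatus(str, Enum):
    """Hypothetical definition; only VERIFIED and APPLIED appear in the proposal code."""

    PENDING = "pending"
    VERIFIED = "verified"
    APPLIED = "applied"


# Assumed forward-only lifecycle: pending -> verified -> applied.
ALLOWED_TRANSITIONS = {
    FeedbackStatus.PENDING: {FeedbackStatus.VERIFIED},
    FeedbackStatus.VERIFIED: {FeedbackStatus.APPLIED},
    FeedbackStatus.APPLIED: set(),
}


def can_transition(current: FeedbackStatus, target: FeedbackStatus) -> bool:
    """Return True if moving from `current` to `target` is an allowed step."""
    return target in ALLOWED_TRANSITIONS[current]
```

A guard like this could sit in front of any status update so that unverified feedback can never jump straight to applied.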
📄 File: app/feedback_loop.py (NEW)
from db.feedback import FeedbackStore, FeedbackStatus
from db.examples import ExampleStore


class FeedbackLoopManager:
    """Manages the feedback → improvement pipeline."""

    def __init__(self, feedback_store: FeedbackStore, example_store: ExampleStore):
        self.feedback = feedback_store
        self.examples = example_store

    async def process_verified_feedback(self, feedback_id: str) -> dict:
        """Process verified feedback into improvements."""
        entry = await self.feedback.get_feedback(feedback_id)
        if entry.status != FeedbackStatus.VERIFIED:
            raise ValueError("Feedback must be verified first")
        # Option 1: Add to few-shot examples
        if entry.corrected_sql and entry.rating and entry.rating >= 4:
            await self.examples.add_example(
                database_id=entry.database_id,
                natural_query=entry.natural_query,
                sql=entry.corrected_sql,
                source="user_feedback",
                verified=True,
            )
        # Option 2: Add to fine-tuning queue
        await self._add_to_finetune_queue(entry)
        # Mark as applied
        await self.feedback.update_feedback_status(
            feedback_id, FeedbackStatus.APPLIED
        )
        return {"status": "applied", "feedback_id": feedback_id}

    async def get_similar_corrections(
        self, natural_query: str, database_id: str
    ) -> list[dict]:
        """Find previous corrections for similar queries."""
        # Use embedding similarity to find relevant corrections
        return await self.feedback.find_similar(
            natural_query=natural_query,
            database_id=database_id,
            status=FeedbackStatus.APPLIED,
            limit=3,
        )
📄 File: app/agent/engine.py (ENHANCED)
async def generate_sql(self, natural_query: str, ...):
    # Check for relevant prior corrections
    corrections = await self.feedback_loop.get_similar_corrections(
        natural_query, database_id
    )
    if corrections:
        # Include in prompt as additional context
        context = self._format_corrections_context(corrections)
        prompt = f"{prompt}\n\nPrevious corrections for similar queries:\n{context}"
    # Continue with generation
    ...
5. IMPLEMENTATION CHECKLIST
Phase 1: Feedback Enhancement
- Add `FeedbackStatus.APPLIED` state to track used feedback
- Add embedding column to feedback table for similarity search
- Implement `find_similar()` method in FeedbackStore
- Add API endpoint to verify and approve feedback
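The `find_similar()` item in Phase 1 could work roughly as follows. This is a minimal in-memory sketch using brute-force cosine similarity, not the intended storage-layer implementation: the `StoredCorrection` shape, the `min_score` parameter, and the assumption that embeddings are precomputed lists of floats are all hypothetical. A production version would more likely push the search into the database or an approximate index.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredCorrection:
    """Hypothetical row shape: a feedback entry plus its precomputed embedding."""

    natural_query: str
    corrected_sql: str
    database_id: str
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(
    query_embedding: list[float],
    rows: list[StoredCorrection],
    database_id: str,
    limit: int = 3,
    min_score: float = 0.0,
) -> list[tuple[float, StoredCorrection]]:
    """Return up to `limit` corrections for `database_id`, best match first."""
    scored = [
        (cosine_similarity(query_embedding, row.embedding), row)
        for row in rows
        if row.database_id == database_id  # never mix corrections across databases
    ]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

Filtering by `database_id` before scoring keeps corrections written against one schema from leaking into prompts for another.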
Phase 2: Few-Shot Integration
- Create `FeedbackLoopManager` class
- Implement automatic promotion to few-shot examples
- Add source tracking to examples (manual vs feedback)
- Add quality thresholds (rating >= 4, verified status)
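The quality thresholds in Phase 2 can be collected into one predicate so the promotion rule lives in a single place. The sketch below assumes a simplified `Feedback` record with string statuses; the class, the `MIN_RATING` constant, and `is_promotable` are hypothetical names, and only the `rating >= 4` and verified-status conditions come from the checklist.

```python
from dataclasses import dataclass
from typing import Optional

MIN_RATING = 4  # threshold taken from the checklist (rating >= 4)


@dataclass
class Feedback:
    """Hypothetical subset of FeedbackEntry fields needed for the check."""

    corrected_sql: Optional[str]
    rating: Optional[int]
    status: str  # assumed to be "pending", "verified", or "applied"


def is_promotable(entry: Feedback) -> bool:
    """True if the entry is verified, has a non-empty correction, and is rated highly enough."""
    return (
        entry.status == "verified"
        and bool(entry.corrected_sql and entry.corrected_sql.strip())
        and entry.rating is not None
        and entry.rating >= MIN_RATING
    )
```

Treating a missing rating as not promotable is a conservative choice; whether unrated but verified corrections should also qualify is left open by the proposal.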
Phase 3: Agent Integration
- Inject feedback context into agent prompts
- Add similar-correction lookup before generation
- Track which corrections influenced which queries
- Add metrics for feedback utilization
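For the prompt-injection step in Phase 3, the formatting helper referenced in the proposal might look like the sketch below. It assumes each correction is a dict with `natural_query` and `corrected_sql` keys, which the proposal does not specify; `format_corrections_context`, `build_prompt`, and `max_items` are illustrative names. The heading text matches the f-string used in the proposed `generate_sql` change.

```python
def format_corrections_context(corrections: list[dict], max_items: int = 3) -> str:
    """Render prior corrections as a numbered question/SQL list for the prompt."""
    blocks = []
    for index, item in enumerate(corrections[:max_items], start=1):
        blocks.append(
            f"{index}. Question: {item['natural_query']}\n"
            f"   Corrected SQL: {item['corrected_sql']}"
        )
    return "\n".join(blocks)


def build_prompt(base_prompt: str, corrections: list[dict]) -> str:
    """Append the corrections section only when there is something to show."""
    if not corrections:
        return base_prompt
    context = format_corrections_context(corrections)
    return f"{base_prompt}\n\nPrevious corrections for similar queries:\n{context}"
```

Capping the number of injected corrections keeps prompt growth bounded regardless of how much feedback accumulates.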
Phase 4: Fine-Tuning Pipeline
- Create fine-tuning dataset export from verified feedback
- Add scheduled job to aggregate feedback
- Implement dataset versioning
- Add quality filtering and deduplication
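The export, filtering, and deduplication items in Phase 4 could be combined along these lines. This is a sketch under several assumptions: entries are plain dicts with `status`, `database_id`, `natural_query`, and `corrected_sql` keys, the output uses a `prompt`/`completion` JSONL layout, and both verified and applied entries are eligible (applied entries were verified earlier in the proposed pipeline). None of these details are fixed by the issue.

```python
import json


def normalize_sql(sql: str) -> str:
    """Collapse whitespace and drop a trailing semicolon so trivial variants compare equal."""
    return " ".join(sql.split()).rstrip(";").strip()


def build_finetune_records(
    entries: list[dict], statuses: tuple[str, ...] = ("verified", "applied")
) -> list[dict]:
    """Keep eligible entries with a correction, deduplicated per database and question."""
    seen = set()
    records = []
    for entry in entries:
        sql = entry.get("corrected_sql")
        if entry.get("status") not in statuses or not sql:
            continue
        question = entry["natural_query"].strip()
        # SQL is compared case-sensitively because string literals may differ by case.
        key = (entry["database_id"], question.lower(), normalize_sql(sql))
        if key in seen:
            continue
        seen.add(key)
        records.append(
            {
                "database_id": entry["database_id"],
                "prompt": question,
                "completion": normalize_sql(sql),
            }
        )
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)
```

Dataset versioning is not covered here; one simple option would be to hash the JSONL output and store the digest alongside the export.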
Phase 5: User Experience
- Add feedback status visibility in API responses
- Create feedback review queue endpoint
- Add bulk verification endpoint for admins
- Show "learned from your correction" indicator
6. FILES TO MODIFY TABLE
| File | Lines | Action | Description |
|---|---|---|---|
| `app/feedback_loop.py` | NEW | Create | Feedback loop orchestration |
| `db/feedback.py` | 30-50 | Modify | Add APPLIED status, embedding column |
| `db/feedback.py` | TBD | Add | `find_similar()` method |
| `db/examples.py` | TBD | Modify | Add source tracking field |
| `app/agent/engine.py` | TBD | Modify | Integrate feedback lookup |
| `app/routes_feedback.py` | TBD | Modify | Add verify/apply endpoints |
| `models/prompts.py` | TBD | Modify | Add corrections context template |
| `scripts/export_finetune_data.py` | NEW | Create | Export feedback for fine-tuning |
7. RISK ASSESSMENT
| Risk | Impact | Mitigation |
|---|---|---|
| Bad feedback pollutes examples | 🔴 | Require verification; quality thresholds; admin review |
| Embedding search is slow | 🟡 | Use approximate search; cache results; limit scope |
| Feedback bias (same users) | 🟡 | Track user diversity; weight by uniqueness |
| Privacy concerns | 🟡 | Allow opt-out; anonymize; document data usage |
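One possible reading of the "weight by uniqueness" mitigation for feedback bias is to give every contributing user the same total influence. The sketch below assumes each feedback entry carries a `user_id`, which the current `submit_feedback` signature does not include, so both that field and `user_balanced_weights` are hypothetical.

```python
from collections import Counter


def user_balanced_weights(entries: list[dict]) -> list[float]:
    """Weight each entry by 1 / (number of entries from the same user)."""
    counts = Counter(entry["user_id"] for entry in entries)
    return [1.0 / counts[entry["user_id"]] for entry in entries]
```

With this scheme the weights for each user sum to 1, so a single prolific user cannot dominate a sampled few-shot or fine-tuning set.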
8. RELATED CONTEXT
- Existing feedback store: `db/feedback.py`
- Existing example store: `db/examples.py`
- Feedback API: `app/routes_feedback.py`
- Fine-tuning module: `models/fine_tuning.py` (Issue Phase 5.3: Few-Shot Learning & Fine-Tuning #16)
- Related issue: [HIGH] Benchmarking Suite for Accuracy Tracking (Spider, WikiSQL) #44 (Benchmarking - can use feedback for accuracy tracking)