Skip to content

Refactor backend: modularize query workflow and add local model support#169

Merged
FranardoHuang merged 139 commits intoaugcog:mainfrom
Catrunaround:main
Mar 3, 2026
Merged

Refactor backend: modularize query workflow and add local model support#169
FranardoHuang merged 139 commits intoaugcog:mainfrom
Catrunaround:main

Conversation

@Catrunaround
Copy link
Collaborator

Summary

  • Query workflow modularization: Refactored monolithic rag_generation.py, rag_retriever.py, rag_preprocess.py, and chat_service.py into modular packages under services/generation/, services/query/, services/memory/, and services/audio/
  • Local model support for tutor mode: Added OpenAI-compatible local model (vLLM) integration with new openai_model.py dependency and tutor pipeline support
  • New prompt system: Migrated prompts from app/prompts/ to services/generation/prompts/ with structured textchat/voice modules, including canvas and outline prompts
  • RAG pipeline improvements: Added citation display, course mapping, query reformulation, vector search, and prompt assembly modules

Test plan

  • Verify chat pipeline works end-to-end (text and voice modes)
  • Verify tutor pipeline works with both remote and local models
  • Test streaming responses and citation display
  • Confirm memory synopsis service functions correctly

🤖 Generated with Claude Code

Catrunaround and others added 27 commits January 30, 2026 05:04
…stem

This merge combines:
- vLLM OpenAI-compatible API configuration from main (settings-based URLs)
- 4-mode system (Chat Tutor, Chat Regular, Voice Tutor, Voice Regular) from final_develop
- Composable prompts system from final_develop
- Timer/latency tracking from final_develop
- OpenAI model integration from final_develop
- Dynamic path configuration from main
- TTS server configuration from main

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major changes:
- Implement 4-mode system (Chat Tutor, Chat Regular, Voice Tutor, Voice Regular)
- Add OpenAI model integration alongside vLLM
- Add request timer for latency tracking
- Refactor prompts into separate modules (app/prompts/)
- Add sentence mapping and batch upload services for RAG
- Improve video converter and database utilities
- Add web service for RAG file conversion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…main

Resolved conflicts by accepting incoming changes:
- Remove VLLM guided decoding constants (GUIDED_RESPONSE_BLOCKS, GUIDED_VOICE_TUTOR_BLOCKS)
- Remove json_output parameter (derived from tutor_mode)
- Add dynamic engine initialization
- Simplify markdown spacing logic

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FranardoHuang FranardoHuang merged commit 9d5be74 into augcog:main Mar 3, 2026
1 check failed
FranardoHuang added a commit that referenced this pull request Mar 12, 2026
…rt (#169)

* rebase and bring all change before 0709 into main

* Test commit

* deleted Test commit

* add condition on video paragraph if less than 5

* fixed video page for now

* build a new paeg class

* local change push

* fix problem for ## empty content

* title helper fixed for roar academt

* update index helper

* finish roar academy

* command change

* fixing pdf converter

* fixing (2)

* fix video scraper

* fix cs61a added new chunk create

* add conversion ignore, change chunk from save pkl file to data base, change embedding function to data base, next step is add speaker name into json file.

* changed from pkl to data base and added extra information into db

* added Assessment question in video

* add assessment question and strict order for all kinds of file

* add speaker role into video

* create a fix function for problem table

* create a fix function for problem table

* create a fix function for problem table

* fixed url, uuid and relative path in db

* backup

* add guess speaker function

* temp change need to be clean

* finalize data base next step is create update db function

* add ssplit db for each course and colletive db for all course.

* add Cladue code support for file_conversion_router and refector the api.py since it is too large

* deleted old useless file, added data base mereger function

* change database merger

* code refactor fore title handle.py

* add validator function for db.

* add helper for title handle

* update scraper can try multi times

* add file_rearangement folder

* mvp version of file rearangement

* update readme

* add middle json

* command change

* add the pdf bbox and search structure

* implemented sentence citation service function

* remove test files

* add playlist information to metadata

* fix scraper and implement conversion accurate reference

* finalized new prompt and add file_discription and new key concept

* added new json formate and streaming

* streaming

* restore to origin

* back to depoly

* json response

* enable json

* unadd audio

* rewrite the prompt

* remove unuse file

* add new prompt

* revert the change

* add inline json

* new json prompt

* json streaming and prompt turning can be better

* fuxk

* fix

* Removed redundant code

* fix

* Removed redundant code

* Restore RAG prompt improvements after removing bad commit

This restores the final state of the RAG prompt engineering changes
while keeping the commit history clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* command change

* add web service for rag and modify rag_generation prompt

* add code block

* add code block and muti-level heading

* add thinking in the json

* remove level info

* remove limit

* add openai model for testing

* add a timer and four mode

* prompt fixed

* test for both local model and gpt5.2 model

* add many debug statments

* delete unuse tests

* remove some reduntent prompt

* feat: 4-mode system, RAG improvements, file conversion enhancements

Major changes:
- Implement 4-mode system (Chat Tutor, Chat Regular, Voice Tutor, Voice Regular)
- Add OpenAI model integration alongside vLLM
- Add request timer for latency tracking
- Refactor prompts into separate modules (app/prompts/)
- Add sentence mapping and batch upload services for RAG
- Improve video converter and database utilities
- Add web service for RAG file conversion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* remove unuse functions

* fixed bug between openai format and vllm format.

* remove example in prompt

* remove the load local env function

* refactor prompt

* sync text tutor prompt source with approved export wording

* refactor regular text prompt to template-based addendums

* structure the prompt

* remove unuse code

* remove unuse code

* import error

* add citation show in frontend

* add citation show in frontend

* add citation show in frontend

* add prompt

* refactor and update openai prompt

* add file description

* temp

* add purpose to outline

* add local model to tutor mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* remove .claude/settings.json and add .claude/ to .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Franco <francohuang945@gmail.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants