- Build a complete, up-to-date data crawling and preprocessing pipeline (building the entire service with every feature takes too much time and is exhausting)
- search by keywords
- choose reputable sources
- get raw data
- preprocess
- merge duplicate data
- format into a fixed structure
- convert to LaTeX scripts
- export PDFs
- update data engine setup (most important)
- build a RAG system for tourism in Vietnam (two languages: en-vi) tonight, working until it is done
- must run locally (RTX 3050 4GB)
- fast, accurate, compact, reliable
- estimated time: 5hrs --> 4:15 - 05/06/2025
- normal + real-time (true RAG) mode
- Brainstorming
- Data preparation
- Building model
- Testing
- Launching
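
Rough sketch of the middle of the crawling pipeline (merge duplicates, format into a fixed structure, convert to LaTeX), standard library only. Everything named here is an assumption for illustration: the `Record` fields, the `title`/`body`/`source` keys on raw items, and exact-match deduplication on normalized text (near-duplicate detection would need something stronger). Not the final design, just a starting point.

```python
import hashlib
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical fixed structure for one cleaned item."""
    title: str
    body: str
    source: str


def normalize(text: str) -> str:
    """Unify Unicode form (matters for Vietnamese diacritics) and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def merge_duplicates(raw_items):
    """Keep the first item for each distinct normalized, case-folded body."""
    seen = {}
    for item in raw_items:
        body = normalize(item["body"])
        key = hashlib.sha256(body.casefold().encode("utf-8")).hexdigest()
        if key not in seen:
            seen[key] = Record(normalize(item["title"]), body, item["source"])
    return list(seen.values())


# Characters that have special meaning in LaTeX and their escaped forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape one character at a time so replacements are never re-escaped."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_latex(records) -> str:
    """Render records as a minimal LaTeX document, one section per record."""
    lines = [r"\documentclass{article}", r"\begin{document}"]
    for rec in records:
        lines.append(f"\\section{{{escape_latex(rec.title)}}}")
        lines.append(escape_latex(rec.body))
        lines.append("")
    lines.append(r"\end{document}")
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [
        {"title": "Ha Long Bay", "body": "A  bay in Quang Ninh.", "source": "site-a"},
        {"title": "Ha Long Bay", "body": "a bay in quang ninh. ", "source": "site-b"},
    ]
    print(to_latex(merge_duplicates(raw)))
```

PDF export is left out on purpose: it depends on which LaTeX engine is installed, and Vietnamese text will probably need extra font/encoding setup in the preamble.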