Separate search from LLM #212
Open
Labels: new computation (update that adds a new computation method), p-medium (priority: medium), refactor (code improvements that do not change functionality), topic-python-cli (issues/pull requests related to running the python processing)
It might be worth adding an optional "pre-processing" step that performs all the web scraping, but without LLMs. The result would contain links from search-engine queries, website crawls, and potentially even ordinance databases like amlegal. These links would be stored in some sort of output (JSON?) that the user could provide as an optional input when running COMPASS, letting the main execution bypass the heavy, long-running search steps.
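A minimal sketch of what that link cache could look like, assuming a JSON output. The file layout, field names (`url`, `source`), and the helper functions below are all illustrative, not part of COMPASS today:

```python
import json
from pathlib import Path

def write_link_cache(results, path="link_cache.json"):
    """Persist scraped links so a later COMPASS run can skip searching.

    `results` maps a jurisdiction name to the links found for it,
    each tagged with where it came from (search engine, crawl, amlegal).
    """
    Path(path).write_text(json.dumps(results, indent=2))

def read_link_cache(path="link_cache.json"):
    """Load a previously written link cache for the main COMPASS run."""
    return json.loads(Path(path).read_text())

# Hypothetical example of the cached structure:
results = {
    "Decatur County, Indiana": [
        {"url": "https://example.com/ordinance.pdf", "source": "search_engine"},
        {"url": "https://codelibrary.amlegal.com/codes/example", "source": "amlegal"},
    ]
}
write_link_cache(results)
cache = read_link_cache()
```

The main execution would then check for this file and, if present, skip straight to document retrieval and LLM processing.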
Separating the search step from the LLMs would also be beneficial, since it could then run locally on HPC. Alternatively, if we keep LLMs in the search process, we can scale the search portion with Kubernetes.
Since this would be a new "step" in running COMPASS, setting up a WMS (workflow management system) like Airflow would be strongly encouraged.
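As a rough illustration of the two-step split under Airflow (requires apache-airflow; the DAG id, task ids, and callables here are all hypothetical placeholders, not existing COMPASS code):

```python
# Sketch only: scrape-then-process pipeline as an Airflow DAG.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def scrape_links():
    # Pre-processing step: search engines, website crawls, amlegal,
    # writing results to a link cache (e.g. link_cache.json).
    ...

def run_compass():
    # Main COMPASS execution, reading the link cache instead of
    # repeating the heavy search steps.
    ...

with DAG(
    dag_id="compass_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    scrape = PythonOperator(task_id="scrape_links", python_callable=scrape_links)
    process = PythonOperator(task_id="run_compass", python_callable=run_compass)

    # Search runs first; the LLM step consumes its output.
    scrape >> process
```

Because the LLM-free scrape task is isolated, it could be scheduled on HPC nodes while the LLM step runs wherever model access is available.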