qtu-UBC/LCI-Data-AIPrep

An automated, high-throughput method for extracting relevant information from documents, such as peer-reviewed life cycle assessment (LCA) articles and technical reports, is crucial for advancing life cycle inventory (LCI) modeling. Large Language Models (LLMs) can help curate large datasets from varied sources, including text descriptions, tabulated data, knowledge graphs, and images. This project aims to build an end-to-end, LLM-based LCI data curation framework. The key steps of this framework are:

  • Detect and partition the key elements (e.g., tables, images) in a given PDF
  • Embed and persist the elements into a vector database
  • Apply hybrid search to retrieve the information relevant to (1) system boundary completion, (2) inventory data (flow name and quantity) synthesis, and (3) assumption validation
  • Output the curated LCI data
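
The embed, persist, and hybrid-search steps above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from this repository: the `Element` schema, the toy hashed bag-of-words `embed` function (standing in for a real embedding model), the in-memory list (standing in for a vector database), and the `alpha` weight that blends the dense and keyword scores.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Element:
    """One partitioned piece of a PDF (hypothetical schema)."""
    kind: str  # e.g. "text", "table", "image"
    text: str  # extracted text or caption


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, dim=64):
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        # Deterministic bucket choice (Python's built-in hash() is salted per run).
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def keyword_score(query, text):
    """Fraction of distinct query terms that also appear in the text."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(text))) / len(q_terms)


def hybrid_search(query, store, alpha=0.5, k=3):
    """Rank elements by alpha * cosine similarity + (1 - alpha) * keyword overlap."""
    q_vec = embed(query)
    scored = []
    for element, vec in store:
        dense = sum(a * b for a, b in zip(q_vec, vec))  # vectors are unit length
        sparse = keyword_score(query, element.text)
        scored.append((alpha * dense + (1 - alpha) * sparse, element))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# "Persist" the embedded elements; a real pipeline would write to a vector database.
elements = [
    Element("text", "The system boundary is cradle to gate and excludes transport."),
    Element("table", "Flow name: electricity; quantity: 12.5 kWh per kg product."),
    Element("text", "We assume a plant lifetime of 20 years."),
]
store = [(e, embed(e.text)) for e in elements]

top = hybrid_search("electricity quantity per kg", store, k=1)
print(top[0][1].kind, "-", top[0][1].text)
```

In this sketch the keyword term rewards exact matches on flow names and units, while the dense term is meant to catch paraphrases once a real embedding model replaces the toy one; the balance between the two is controlled by `alpha` and would need tuning on actual LCA documents.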

About

Use AI to prepare life cycle inventory data
