aniket-mish/the-daily-bugle

The Daily 🕸 Bugle

I'm building a comic-nerd app that answers my nerdy questions about superheroes. It keeps track of all the superhero news going on in the world, and you can ask it trivia questions, including the coolest Easter eggs. Along the way, I am also reading about LLMOps and AI engineering.

My project map:

  • Setup and installations
  • Collect data
  • Fine-tune an open-source LLM
  • Set up a vector DB
  • Query the system with prompts
  • Create a simple UI

Index

  1. Data Collection
  2. Feature Engineering
  3. Training/Fine-tuning an LLM
  4. Inference Service
  5. Monitoring
  6. UI/UX

Setup

Environment

I am using conda for creating environments.

conda create -n comic python=3.11

conda activate comic

I am using Poetry for package management. I will use uv in production because it's very fast.

cd comic

poetry init

poetry add numpy pandas
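A quick way to confirm the setup worked is a small sanity check run inside the activated environment. This is only a sketch: the function name `check_environment` and its defaults are my own assumptions, mirroring the Python version and packages installed in the commands here.

```python
import sys
from importlib.util import find_spec


def check_environment(required=("numpy", "pandas"), min_python=(3, 11)):
    """Return (python_ok, missing_packages) for the current interpreter."""
    # Compare only (major, minor) against the minimum version.
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level package cannot be imported.
    missing = [name for name in required if find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} ok: {python_ok}")
    print("Missing packages:", ", ".join(missing) or "none")
```

Save it under any file name and run it with `poetry run python <file>` so that it uses the interpreter Poetry manages.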

Dependencies

Instead of using a Makefile for simple automation, I'm using the Poe the Poet plugin. I can just add the scripts to the pyproject.toml file and execute them. It works well with Poetry.

I'm using ZenML as an orchestrator to manage my pipelines. It glues multiple @step functions together into a @pipeline. Other popular orchestrators include Airflow, Prefect, Argo, and Kubeflow.
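To make the step/pipeline idea concrete, here is a minimal sketch of two steps glued into one pipeline. The function names, the sample pages, and the hero list are all hypothetical and not taken from this repo. The `try/except` fallback is only there so the sketch runs without ZenML installed.

```python
from collections import Counter

try:
    from zenml import pipeline, step
except ImportError:
    # Fallback so the sketch still runs without ZenML installed:
    # the decorators become no-ops and the pipeline is a plain function.
    def step(func):
        return func

    def pipeline(func):
        return func


def load_sample_pages() -> list[str]:
    """Hypothetical stand-in for a scraping step."""
    return [
        "Spider-Man swings through Queens.",
        "Iron Man and Spider-Man team up.",
    ]


def count_hero_mentions(pages: list[str]) -> dict[str, int]:
    """Count how many pages mention each hero in a small hypothetical list."""
    heroes = ["Spider-Man", "Iron Man"]
    counts = Counter()
    for page in pages:
        for hero in heroes:
            if hero in page:
                counts[hero] += 1
    return dict(counts)


# Wrap the plain functions as ZenML steps (equivalent to decorating with @step).
load_pages_step = step(load_sample_pages)
count_mentions_step = step(count_hero_mentions)


@pipeline
def hero_mentions_pipeline():
    """Glue the two steps together: the output of one feeds the next."""
    pages = load_pages_step()
    count_mentions_step(pages)


if __name__ == "__main__":
    hero_mentions_pipeline()
```

Keeping the logic in plain functions and wrapping them with `step(...)` is a design choice of this sketch. It lets the functions be unit-tested without an orchestrator.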

I track experiments with CometML and prompts with Opik.

I use MongoDB to store the scraped data and Qdrant to store the vector representations.
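As a rough illustration of how the two stores relate, the sketch below turns one raw document into an id/vector/payload record, which is the general shape of a point in Qdrant. Everything here is an assumption for illustration: the `ScrapedArticle` fields, the example URL, and especially `toy_embedding`, which is a hash-based placeholder and not a real embedding model. No database client is used.

```python
import hashlib
import uuid
from dataclasses import asdict, dataclass


@dataclass
class ScrapedArticle:
    """Hypothetical shape of one raw document stored in MongoDB."""
    url: str
    title: str
    text: str


def toy_embedding(text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding: the first `dim` SHA-256 bytes scaled to [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 bytes, so dim <= 32
    return [b / 255 for b in digest[:dim]]


def to_vector_point(article: ScrapedArticle) -> dict:
    """Build an id/vector/payload record from a raw article."""
    return {
        # Deterministic UUID derived from the URL, so re-ingesting the same
        # article produces the same point id.
        "id": str(uuid.uuid5(uuid.NAMESPACE_URL, article.url)),
        "vector": toy_embedding(article.text),
        # The payload keeps enough metadata to find the raw document again.
        "payload": {"url": article.url, "title": article.title},
    }


if __name__ == "__main__":
    article = ScrapedArticle(
        url="https://example.com/spider-man",
        title="Spider-Man",
        text="Peter Parker was bitten by a radioactive spider.",
    )
    print(asdict(article))           # the document that would go into MongoDB
    print(to_vector_point(article))  # the record that would go into Qdrant
```

In a real pipeline, the placeholder would be replaced by an actual embedding model and the records written through the MongoDB and Qdrant clients.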

Finally, I'm using AWS SageMaker for training because I'm more familiar with it. You can use AWS Bedrock as well, in which case you don't have to manage the infrastructure.

Data Collection Pipeline
