GitHub - anyscale/batch-llm-inference-reproductions: Reproducing Batch LLM Inference performance numbers

Instructions

Model	Node Type
neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8	g6e.12xlarge
neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8	g6e.xlarge

bash run_70b.sh 
# bash run_8b.sh
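
As a rough sketch of how the two commands above could be chosen programmatically, the snippet below picks the run script from the node type. The pairing of `g6e.12xlarge` with `run_70b.sh` and `g6e.xlarge` with `run_8b.sh` is an assumption inferred from the table and the script names, and `launch_command` and `run` are hypothetical helpers that are not part of the repository.

```python
import subprocess
from pathlib import Path

# Assumed pairing of node type to run script, inferred from the table
# and the script names; the repository does not state it explicitly.
RUN_SCRIPTS = {
    "g6e.12xlarge": "run_70b.sh",
    "g6e.xlarge": "run_8b.sh",
}


def launch_command(node_type: str) -> list[str]:
    """Return the bash command for a node type (hypothetical helper)."""
    if node_type not in RUN_SCRIPTS:
        raise ValueError(f"no run script known for node type {node_type!r}")
    return ["bash", RUN_SCRIPTS[node_type]]


def run(node_type: str, repo_dir: str = ".") -> None:
    """Run the matching script from the repository root (hypothetical helper)."""
    command = launch_command(node_type)
    script_path = Path(repo_dir) / command[1]
    if not script_path.exists():
        # Fail early with a clear error instead of a bash "No such file" message.
        raise FileNotFoundError(script_path)
    subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `run("g6e.12xlarge")` from the repository root would then be equivalent to `bash run_70b.sh`.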

configs
README.md
main.py
run_70b.sh
run_8b.sh