Code for NeurIPS2024 paper: 'Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frames'
The code that generates the figures is in the separate Colab notebooks, and the figures themselves are in the results directory.
Download the model from the Hugging Face Hub:
git-lfs clone https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B
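
The same download can also be scripted, which may be convenient inside a Colab notebook. The block below is a minimal sketch, not part of the original repository: the helper names `clone_command` and `download_model` are hypothetical, and it assumes `git` and `git-lfs` are already installed. It uses plain `git clone` after `git lfs install`, since the Git LFS documentation marks `git lfs clone` as deprecated.

```python
import shutil
import subprocess

# Base URL of the Hugging Face Hub and the model repository named in this README.
HUB_BASE_URL = "https://huggingface.co"
REPO_ID = "teknium/OpenHermes-2.5-Mistral-7B"


def clone_command(repo_id: str = REPO_ID) -> list[str]:
    """Build the git command that clones a Hub repository (hypothetical helper)."""
    return ["git", "clone", f"{HUB_BASE_URL}/{repo_id}"]


def download_model(repo_id: str = REPO_ID) -> None:
    """Clone the model repository, fetching the weights through Git LFS.

    Assumes git and git-lfs are installed; raises if git-lfs is missing.
    """
    if shutil.which("git-lfs") is None:
        raise RuntimeError("git-lfs was not found on PATH; install it first.")
    # Register the LFS filters so that the clone pulls the real weight files
    # instead of small pointer files.
    subprocess.run(["git", "lfs", "install"], check=True)
    subprocess.run(clone_command(repo_id), check=True)


# Uncomment to start the download (the weights are several gigabytes):
# download_model()
```

Building the command in its own function keeps the URL logic easy to inspect before any network traffic starts.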