Code for the paper:
Trott, S. (2024). Large language models and the wisdom of small crowds. Open Mind, 8, 723-738.
This code compares the quality of data generated by GPT-4, an LLM, to the quality of data provided by human samples of various sizes.
All experimental data can be found in the data/ directory.
- data/official/human: contains the original human "ground truth" ratings for each dataset tested.
- data/official/llm: contains the original LLM ratings for each dataset tested.
- data/processed/llm: contains the aggregated human results from each new experiment, as well as the results of the primary analyses.
Additionally, the lists for each dataset can be found in experiment/stimuli/.
The src/analysis folder contains code for running the primary analyses.
- The Jupyter notebooks contain Python code to execute the primary analyses for each dataset.
- The Visualization.Rmd file contains code to reproduce the primary visualizations for the manuscript.
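The core comparison in the analyses (pitting a single LLM rating against human "crowds" of various sizes) can be sketched roughly as follows. This is a minimal illustration with simulated data, not the repository's actual code: the item/rater structure, the split-half "ground truth", and all function names here are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated example data (NOT the paper's datasets): 50 items rated by
# 30 human raters, with true item-level means plus rater noise.
n_items, n_raters = 50, 30
item_means = rng.normal(loc=4.0, scale=1.0, size=n_items)
human = item_means[:, None] + rng.normal(scale=1.0, size=(n_items, n_raters))

# Split raters: one half defines the "ground truth", the other half
# supplies the small crowds being evaluated.
truth = human[:, :15].mean(axis=1)
pool = human[:, 15:]

# A hypothetical LLM rating per item: ground truth plus some noise.
llm = item_means + rng.normal(scale=0.5, size=n_items)

def crowd_correlation(ratings, truth, k, n_boot=200, rng=rng):
    """Mean Pearson correlation between the ground truth and the
    average rating of k randomly sampled raters, over n_boot resamples."""
    rs = []
    for _ in range(n_boot):
        idx = rng.choice(ratings.shape[1], size=k, replace=False)
        crowd_mean = ratings[:, idx].mean(axis=1)
        rs.append(np.corrcoef(crowd_mean, truth)[0, 1])
    return float(np.mean(rs))

llm_r = float(np.corrcoef(llm, truth)[0, 1])
for k in (1, 3, 10):
    print(f"crowd of {k}: r = {crowd_correlation(pool, truth, k):.3f}")
print(f"LLM:        r = {llm_r:.3f}")
```

Larger crowds yield averages that correlate more strongly with the ground truth, and the question the analyses ask is where on that curve a single LLM's ratings fall.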