Releases: RobotStudyCompanion/Benchmark_LM

v0.1

21 Apr 13:17

First tagged release accompanying the paper
"Benchmarking Local Language Models for Social Robots using Edge Devices"
(accepted at IEEE ARSO 2026).

Release summary. A reproducible benchmark suite covering 25 open-source
language models on Raspberry Pi 4, Raspberry Pi 5, and laptop-GPU hosts.
The suite evaluates inference efficiency (tokens per second, TPS;
tokens per joule, TPJ), knowledge (a six-category MMLU subset), and
teaching effectiveness (LLM-rated against eight criteria, validated by
five human raters).
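Both efficiency metrics reduce to simple ratios of tokens generated to
resource spent. A minimal sketch, assuming TPS denotes tokens per second
and TPJ tokens per joule (function names and the example figures are
illustrative, not taken from the benchmark code):

```python
def tokens_per_second(tokens_generated: int, wall_time_s: float) -> float:
    """TPS: raw generation throughput on a given host."""
    return tokens_generated / wall_time_s

def tokens_per_joule(tokens_generated: int, energy_j: float) -> float:
    """TPJ: energy efficiency, the key figure for battery-powered robots."""
    return tokens_generated / energy_j

# Hypothetical run: 512 tokens in 64 s at an average draw of 6 W,
# so energy = 6 W * 64 s = 384 J.
tps = tokens_per_second(512, 64.0)   # 8.0 tokens/s
tpj = tokens_per_joule(512, 384.0)   # ~1.33 tokens/J
```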

Accompanying data record: https://doi.org/10.5281/zenodo.19643021

Highlights since dorian-original:

  • Consolidated the per-platform runners and analysers from the
    development repository (orlandossss/Master_Benchmark, now being
    archived).
  • Added disk-I/O telemetry to the Raspberry Pi runners, matching the
    data published in the Zenodo record.
  • Packaged for Linux only, with a pinned requirements.txt and setup.sh.
  • Added a syntax-check CI workflow on push and pull request.
  • Added an Apache-2.0 licence, CITATION.cff, and a hardened .gitignore.

Known scope: the three benchmark runners and three analysers remain
separate per-platform scripts in v0.1. Consolidating them into a single
platform-aware runner is scoped for v0.2; see future_work/ for the
broader forward-looking roadmap.
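The planned consolidation amounts to detecting the host at startup and
dispatching to the right runner. A minimal sketch of one way such
platform detection could work (all names here are assumptions, not code
from this repository; on a Raspberry Pi, /proc/device-tree/model really
does identify the board):

```python
from pathlib import Path

def detect_platform() -> str:
    """Return which of the three benchmark hosts we are running on.

    Falls back to "laptop-gpu" on any non-Pi machine, since the suite
    targets exactly three platforms.
    """
    model_file = Path("/proc/device-tree/model")
    if model_file.exists():
        # On Raspberry Pi OS this file holds e.g. "Raspberry Pi 5 Model B".
        model = model_file.read_text(errors="ignore")
        if "Raspberry Pi 5" in model:
            return "rpi5"
        if "Raspberry Pi 4" in model:
            return "rpi4"
    return "laptop-gpu"

SUPPORTED_PLATFORMS = {"rpi4", "rpi5", "laptop-gpu"}
```

A single entry point could then look up the matching runner and analyser
by the returned key instead of shipping three separate scripts.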

Full Changelog: dorian-original...v0.1

Dorian's original Benchmarking_LLM suite

21 Apr 11:18
