Artifact for Lean-SMT

Paper submission number: 305
Claimed badges: Available, Functional, Reusable

Artifact Requirements

The artifact is provided as a Docker image. We recommend using a machine with the following specifications:

CPU: 16-core (32-thread) processor for brief experiments
RAM: 128 GB for brief experiments
Storage: At least 60 GB free disk space (image size ~48.5 GB uncompressed)
Operating System: Linux, macOS, or Windows with Docker support
Internet Connection: Required if building the Docker image manually

Estimated runtimes for the benchmark categories:

Benchmark Set	Configuration	Time (approx.)
Seventeen	cvc5+leansmt±compiler, duper	~1 hour
Seventeen	verit+sledgehammer	~8 hours
SMT-LIB	cvc5+leansmt±compiler, cvc5+ethos, verit+smtcoq	~4 hours

Evaluation was performed on a cluster of 48 nodes with:

AMD Ryzen 9 7950X @ 4.50GHz
128 GB RAM

The smoke, minimal, and brief subsets can be run in significantly less time with recommended specifications:

smoke: ~10 minutes
minimal: ~2 hours
brief: ~8 hours

Structure and Content

The artifact includes:

Source Code:
- lean-smt/: Lean-SMT tactic
- lean-cpc-checker/: SMT-LIB v2 frontend (CPC checker)
Benchmarking & Evaluation Scripts:
- run_all_benchmarks.sh: main script for evaluation
- generate_figures_and_tables.sh: generates figures and tables from data in data directory
- cvc5+ethos.sh, cvc5+leansmt±compiler.sh, duper.sh, verit+sledgehammer.sh, verit+smtcoq.sh: wrappers for each configuration
- run_benchmarks.py: runs a solver over benchmark sets
- collect_*_stats.py: parses logs into CSVs
- cactus.py, scatter±compiler.py, tables.py: visualization tools
- data/all/seventeen, data/all/SMT-LIB: data from paper evaluation (slightly modified for artifact scripts) Docker Image Contents (abdoo8080/lean-smt-artifact:v4):
Precompiled versions of all tools
Benchmark datasets (benchmarks/seventeen, benchmarks/SMT-LIB)
Isabelle/AFP versions as used in the Seventeen Provers under the Hammer paper

Getting Started

Pull the Docker image (available for x86 architecture):

docker pull abdoo8080/lean-smt-artifact:v4

Or, build it from source (estimated time: 2-3 hours on recommended hardware):

docker build -t abdoo8080/lean-smt-artifact:v4 .

Then run the container:

docker run -it abdoo8080/lean-smt-artifact:v4

Within the container, use:

./run_all_benchmarks.sh --mode <mode> [--jobs <J>] [--enable-sledgehammer]

Where <mode> is one of:

smoke: quick verification (10 samples per category)
minimal: minimal verification (100 samples per category)
brief: substantial verification (500 samples per category)
all: complete benchmark set (5000 baseline, 24817 SMT-LIB)

Note on Modes and Hardware Requirements: minimal mode is recommended for hardware with specifications that are significantly lower than our suggested configuration. It uses fewer benchmarks and smaller timeouts to accommodate such environments. You can customize the number of benchmarks, the number of parallel jobs (ideally up to the number of physical CPU cores), and the per-benchmark timeout/memory limit using the script's command-line options.

About Isabelle Sledgehammer: The verit+sledgehammer solver is excluded by default due to its high requirements. You can run it by invoking the script with --enable-sledgehammer argument. The higher requirements are due to verit+sledgehammer not directly running on the seventeen benchmark set. Instead, it locates the original Isabelle goals that produced the benchmarks and builds all the Isabelle sessions required for that before running sledgehammer on the goal. Building sessions uses all CPU cores and the time it takes highly depends on the sessions needed, hence the higher requirements below:

16 GB memory per job
Single-threaded mode
Extended timeouts (up to 20 minutes per benchmark)

Visualizing Original Data: The directories data/all/seventeen and data/all/SMT-LIB contain data collected from evaluating all benchmarks for comparison. You can visualize this data by running:

./generate_figures_and_tables.sh all

This script scans the data/all directory, then produces the figures and tables in /home/user/artifact/figures/all and /home/user/artifact/tables/all.

Functional Badge

This artifact reproduces all experimental results shown in the paper, including:

Figures: seventeen.pdf, QF_SMT-LIB.pdf, SMT-LIB.pdf, scatter±compiler.pdf
Tables: seventeen.tex, QF_SMT-LIB.tex, SMT-LIB.tex

These are generated after running the run_all_benchmarks.sh script and are placed in the figures/<mode> and tables/<mode> directories.

To copy them to the host machine:

docker ps        # Get <container-id>
docker cp <container-id>:/home/user/artifact/figures .

Correctness is ensured by Lean's kernel checking the proofs:

For CPC checker: see lean-cpc-checker/Checker.lean, lines 111–119
For tactic-based checking, Lean's frontend automatically invokes the kernel after proof generation

Reusable Badge

The artifact, encompassing the Lean-SMT tactic, its SMT-LIB frontend, and the associated benchmarking scripts, is open source and distributed under the Apache 2.0 License:

Portability & Compatibility

Works with Lean v4.20.0-rc5 (June 2025 toolchain release candidate)
Compatible with Linux and macOS (both x86 and ARM) and Windows (only x86 as Lean does not currently support ARM on Windows)

To use Lean-SMT in another Lean project:

Add it as a dependency in lakefile.toml similar to how it is included in lean-cpc-checker/lakefile.toml
Run lake update and lake build to compile

All dependencies are explicitly listed and version-pinned in the Dockerfile.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
data/all		data/all
lean-cpc-checker @ c513efc		lean-cpc-checker @ c513efc
lean-smt @ 384f556		lean-smt @ 384f556
.gitmodules		.gitmodules
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
cactus.py		cactus.py
collect_duper_stats.py		collect_duper_stats.py
collect_ethos_stats.py		collect_ethos_stats.py
collect_leansmt_stats.py		collect_leansmt_stats.py
collect_sledgehammer_stats.py		collect_sledgehammer_stats.py
collect_smtcoq_stats.py		collect_smtcoq_stats.py
cvc5+ethos.sh		cvc5+ethos.sh
cvc5+leansmt+compiler.sh		cvc5+leansmt+compiler.sh
cvc5+leansmt-compiler.sh		cvc5+leansmt-compiler.sh
duper.sh		duper.sh
generate_figures_and_tables.sh		generate_figures_and_tables.sh
requirements.txt		requirements.txt
run_all_benchmarks.sh		run_all_benchmarks.sh
run_benchmarks.py		run_benchmarks.py
scatter.py		scatter.py
skSorryAx.patch		skSorryAx.patch
tables.py		tables.py
verit+sledgehammer.sh		verit+sledgehammer.sh
verit+smtcoq.sh		verit+smtcoq.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Artifact for Lean-SMT

Artifact Requirements

Structure and Content

Getting Started

Functional Badge

Reusable Badge

Portability & Compatibility

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

cvc5/artifact-cav25-lean

Folders and files

Latest commit

History

Repository files navigation

Artifact for Lean-SMT

Artifact Requirements

Structure and Content

Getting Started

Functional Badge

Reusable Badge

Portability & Compatibility

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages