Perses: Syntax-Directed Program Reduction

Perses is a language-agnostic program reducer that minimizes a program with respect to a set of constraints. It takes two inputs: the program to reduce and a test script that specifies the constraints. It outputs a minimized program that still satisfies those constraints. Unlike Delta Debugging and Hierarchical Delta Debugging, Perses uses the syntax information in an ANTLR grammar to prune the search space, so it avoids generating syntactically invalid programs.

Supported Languages

Currently, Perses supports reduction for the following programming languages:

c: *.c
cpp: *.cc, *.cpp, *.cxx
glsl: *.glsl, *.comp, *.frag, *.vert
go: *.go
java: *.java
javascript: *.javascript, *.js
line: *.line
mysql: *.mysql
onetoken: *.onetoken
php: *.php
python3: *.py, *.py3
ruby: *.rb
rust: *.rs
scala: *.scala, *.sc
smtlibv2: *.smt2
solidity: *.sol
sqlite: *.sqlite
system_verilog: *.v, *.sv
xml: *.xml

Support for other languages is coming soon.

Obtain and Run

There are three ways to obtain Perses.

Download a prebuilt release JAR file from our release page, for example,

wget https://github.com/uw-pluverse/perses/releases/download/v2.1/perses_deploy.jar
java -jar perses_deploy.jar [options]? --test-script <test-script.sh> --input-file <program file>

Clone the repo and build Perses from the source.

git clone https://github.com/perses-project/perses.git
cd perses
bazelisk build //src/org/perses:perses_deploy.jar
java -jar bazel-bin/src/org/perses/perses_deploy.jar [options]? \
    --test-script <test-script.sh> --input-file <program file>

If you want to always use the trunk version of Perses, perses-trunk automatically downloads and builds the latest version.

NOTE: perses-trunk requires Bazelisk to be installed.

wget https://raw.githubusercontent.com/perses-project/perses/master/scripts/perses-trunk
chmod +x perses-trunk
./perses-trunk [options]? --test-script <test-script.sh> --input-file <program file>

Important Flags

--test-script <test-script.sh>: The script that encodes the constraints that both the original program and the reduced version must satisfy. It should exit with code 0 if the constraints are satisfied.
--input-file <program-file>: The program to reduce. Its language must be one of those listed under Supported Languages. Other languages can be added if they can be parsed by an ANTLR parser.
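
To make the exit-code contract concrete, here is a minimal sketch of a test script. It is an illustration, not a script shipped with Perses: the file name small.c, the identifier crash_me, and the assumption that the script starts in a directory holding the current candidate under that name are all hypothetical. The property is a stand-in as well; a real script would more likely run a compiler or interpreter on the candidate and inspect its output.

```shell
#!/bin/sh
# Sketch of a test script for --test-script (hypothetical name: test.sh).
# Assumption: the script starts in a directory that holds the current
# candidate under the hypothetical file name small.c.

# Succeeds (status 0) only while the candidate keeps the property of interest.
# Stand-in property: the file still mentions the hypothetical identifier
# crash_me as a whole word.
still_interesting() {
  [ -f "$1" ] || return 1    # a missing file never satisfies the property
  grep -qw 'crash_me' "$1"   # grep exits 0 on a match, non-zero otherwise
}

# The status of the last command becomes the script's exit status.
still_interesting "${1:-small.c}"
```

A script of this shape should exit with 0 on the original input before any reduction starts; otherwise there is no property to preserve.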

To list all available command-line arguments, run:

java -jar perses_deploy.jar --help

The following is the complete list of command line arguments.

Usage: org.perses.Main [options]

[Inputs]  Options:
  * --test-script, --test, -t
      The test script to specify the property the reducer needs to preserve.
  * --input-file, --input, -i
      The input file to reduce
    --deps
      The dependency files required for running the property test
      Default: []

[Outputs]  Options:
    --output-dir, -o
      The output directory to save the reduced result.

[General Reduction Control]  Options:
    --fixpoint
      iterative reduction till fixpoint
      Default: true
    --threads
      Number of reduction threads: a positive integer, or 'auto'.
      Default: auto
    --code-format
      The format of the reduced program.
      Possible Values: [SINGLE_TOKEN_PER_LINE, ORIG_FORMAT, COMPACT_ORIG_FORMAT, PYTHON3_FORMAT, COMPACT_PYTHON3_FORMAT]
    --script-execution-timeout-in-seconds
      the interval in seconds to timeout the test script executions. the 
      default timeout is 600 seconds.
      Default: 600
    --script-execution-keep-waiting-after-timeout
      keep trying even after the script execution timeouts.
      Default: true

[Output Refining Control]  Options:
    --call-formatter
      call a formatter on the final result
      Default: false
    --format-cmd
      the command to format the reduced source file
      Default: <empty string>
    --call-creduce
      call C-Reduce when Perses is done.
      Default: false
    --creduce-cmd
      the C-Reduce command name or path
      Default: creduce

[Reduction Algorithm Control]  Options:
    --alg
      reduction algorithm: use --list-algs to list all available algorithms
    --list-algs
      list all the reduction algorithms.
    --enable-token-slicer
      Enable token slicer after syntax-guided reduction is done. Maybe slow.
      Default: false
    --enable-tree-slicer
      Enable tree slicer after syntax-guided reduction, and before token 
      slicer 
      Default: false
    --enable-line-slicer
      Enable line slicer after syntax-guided reduction, and before token 
      slicer 
      Default: false
    --default-delta-debugger-for-kleene
      The default delta debugger algorithm to reduce kleene nodes.
      Default: DFS
      Possible Values: [PRISTINE, PERSES_VARIANT_OF_PRISTINE, DFS, BFS, CDD, PROBDD, WDD, WPROBDD]

[Language Control]  Options:
    --list-langs
      List all the supported languages.
    --lang
      Specify the language of the program that is to be reduced.
      Default: <empty string>
    --parser-facade-class-name
      The parser facade to be used to parse the input program
      Default: <empty string>
    --list-parser-facades
      List all the available parser facades.
    --language-ext-jars
      A list of JAR files to support new languages
      Default: []

[Vulcan Reducer Control]  Options:
    --enable-vulcan
      Enable vulcan (using auxiliary reducers to help produce smaller 
      reduction output).
      Default: false
    --non-deletion-iteration-limit
      The maximum number of continuous non-deletion iterations allowed.
      Default: 10
    --window-size
      The window size used to perform local exhaustive pattern reduction.
      Default: 4
    --vulcan-fixpoint
      Enable vulcan fixpoint iteratively using auxiliary reducers until no 
      progress can be made
      Default: false

[T-Rec Reducer Control]  Options:
    --enable-trec
      enable T-Rec (a lexical-syntax guided fine-grained reduction process to 
      reduce and canonicalize each token)
      Default: false

[Profiling]  Options:
    --progress-dump-file
      The file to record the reduction process. The dump file can be large..
    --append-to-progress-dump-file
      Whether to append the reduction progress to the progress dump file
      Default: true
    --stat-dump-file
      The file to save the statistics collected during reduction.
    --profile-query-cache-time
      The file to save the profiling data of the query cache.
    --profile-query-cache-time-csv
      The file to save the profiling data of the query cache in the CSV 
      format. 
    --profile-query-cache-memory
      The file to save the profiling data of the query cache.
    --profile-actionset
      The file to save information of all the created edit action sets.
    --profile-delta-debugger
      The file to save the reduction process of the delta debugger.

[Cache Control]  Options:
    --query-caching
      Enable query caching for test script executions.
      Default: AUTO
      Possible Values: [TRUE, FALSE, AUTO]
    --enable-lightweight-refreshing
      Whether to enable lightweight refreshing
      Default: true

[Experiment Control]  Options:
    --query-cache-type
      the algorithm of the query cache
      Default: COMPACT_QUERY_CACHE
      Possible Values: [AUTO, COMPACT_QUERY_CACHE, COMPACT_QUERY_CACHE_FORMAT_SENSITIVE, PERSES_FAST_LINEAR_SCAN_NO_COMPRESSION, PERSES_LEXEME_ID, CONFIG_BASED, ORIG_CONTENT_STRING_BASED, CONTENT_LEXEME_LIST_BASE, CONTENT_SHA512, CONTENT_SHA512_FORMAT, CONTENT_ZIP, RCC_MEM_LIT]

[Verbosity]  Options:
    --verbosity
      verbosity of logging
      Default: INFO
    --list-verbosity-levels
      list all verbosity levels

[Version]  Options:
    --version
      print the version

[Help]  Options:
    -h, --help
      print help message
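
As a rough illustration of how several of the options above combine, the sketch below composes one invocation and prints it for review instead of running it. The paths perses_deploy.jar, test.sh, small.c, and reduced are hypothetical, and the chosen values (one thread, a 60-second timeout, COMPACT_ORIG_FORMAT) are arbitrary examples taken from the ranges listed in the help text.

```shell
#!/bin/sh
# Sketch: compose a Perses invocation from flags that appear in the help text.
# Assumptions: the four arguments are hypothetical paths without whitespace,
# and the flag values below are examples only.
perses_cmd() {
  printf 'java -jar %s --test-script %s --input-file %s --output-dir %s' \
    "$1" "$2" "$3" "$4"
  printf ' --threads 1 --code-format COMPACT_ORIG_FORMAT'
  printf ' --script-execution-timeout-in-seconds 60\n'
}

# Print the command for review; copy it into a terminal to run it.
perses_cmd perses_deploy.jar test.sh small.c reduced
```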

License

GNU General Public License 3.

Publication

This repository contains the implementations of the techniques proposed in the following papers.

1. Perses: Syntax-Guided Program Reduction (ICSE 2018, pdf)

@inproceedings{perses,
  author = {Sun, Chengnian and Li, Yuanbo and Zhang, Qirun and Gu, Tianxiao and Su, Zhendong},
  title = {Perses: Syntax-Guided Program Reduction},
  year = {2018},
  publisher = {Association for Computing Machinery},
  doi = {10.1145/3180155.3180236},
  booktitle = {Proceedings of the 40th International Conference on Software Engineering},
  pages = {361--371},
}

2. Pushing the Limit of 1-Minimality of Language-Agnostic Program Reduction (OOPSLA 2023, pdf)

@article{perses-vulcan,
  title={Pushing the Limit of 1-Minimality of Language-Agnostic Program Reduction},
  author={Xu, Zhenyang and Tian, Yongqiang and Zhang, Mengxiao and Zhao, Gaosen and Jiang, Yu and Sun, Chengnian},
  journal={Proceedings of the ACM on Programming Languages},
  volume={7},
  number={OOPSLA1},
  pages={636--664},
  year={2023},
  publisher={ACM New York, NY, USA}
}

3. PPR: Pairwise Program Reduction (ESEC/FSE 2023, pdf, doc)

@inproceedings{perses-ppr,
  title={PPR: Pairwise Program Reduction},
  author={Zhang, Mengxiao and Xu, Zhenyang and Tian, Yongqiang and Jiang, Yu and Sun, Chengnian},
  booktitle={Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  pages={338--349},
  year={2023}
}

4. On the Caching Schemes to Speed Up Program Reduction (TOSEM, pdf)

@article{perses-caching,
  title={On the Caching Schemes to Speed Up Program Reduction},
  author={Tian, Yongqiang and Zhang, Xueyan and Dong, Yiwen and Xu, Zhenyang and Zhang, Mengxiao and Jiang, Yu and Cheung, Shing-Chi and Sun, Chengnian},
  journal={ACM Transactions on Software Engineering and Methodology},
  volume={33},
  number={1},
  pages={1--30},
  year={2023},
  publisher={ACM New York, NY, USA}
}

5. LPR: Large Language Models-Aided Program Reduction (ISSTA 2024, pdf)

@inproceedings{perses-lpr,
  title={LPR: Large Language Models-Aided Program Reduction},
  author={Zhang, Mengxiao and Tian, Yongqiang and Xu, Zhenyang and Dong, Yiwen and Tan, Shin Hwei and Sun, Chengnian},
  booktitle={Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis},
  pages={261--273},
  year={2024}
}

6. T-Rec: Fine-Grained Language-Agnostic Program Reduction Guided by Lexical Syntax (TOSEM, pdf)

@article{perses-trec,
  title={T-Rec: Fine-Grained Language-Agnostic Program Reduction Guided by Lexical Syntax},
  author={Xu, Zhenyang and Tian, Yongqiang and Zhang, Mengxiao and Zhang, Jiarui and Liu, Puzhuo and Jiang, Yu and Sun, Chengnian},
  journal={ACM Transactions on Software Engineering and Methodology},
  year={2024},
  publisher={ACM New York, NY}
}
