This is the official PyTorch implementation for the following FSE 2023 paper from Salesforce Research:
Title: RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair
Authors: Weishi Wang*, Yue Wang*, Shafiq Joty, and Steven C.H. Hoi (* indicates equal contribution)
RAP-Gen is a novel Retrieval-Augmented Patch Generation framework built on the encoder-decoder code-aware language model CodeT5 (60M and 220M). It explicitly leverages relevant fix patterns retrieved from a codebase of previous bug-fix pairs to guide the language model, yielding a robust APR system. To retrieve the most relevant bug-fix pattern, we build a hybrid patch retriever that accounts for both lexical and semantic matching based on the raw source code in a language-agnostic manner, without relying on any code-specific features. See the overview of RAP-Gen below.
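To make the hybrid retrieval idea concrete, here is a minimal sketch that combines a lexical (token-overlap) score with a semantic (embedding cosine) score. This is an illustration only, not the retriever shipped in this repository: the function names (`hybrid_score`, `retrieve_top1`), the simple unigram-cosine stand-in for a BM25-style lexical ranker, and the toy embeddings are all assumptions for exposition.

```python
from collections import Counter
import math

def lexical_score(query_tokens, doc_tokens):
    # Unigram cosine over term counts: a simple stand-in for a
    # BM25-style sparse lexical ranker over raw code tokens.
    q, d = Counter(query_tokens), Counter(doc_tokens)
    overlap = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return overlap / norm if norm else 0.0

def semantic_score(query_vec, doc_vec):
    # Cosine similarity of dense encoder embeddings
    # (e.g., produced by a code encoder such as CodeT5's encoder).
    dot = sum(a * b for a, b in zip(query_vec, doc_vec))
    nq = math.sqrt(sum(a * a for a in query_vec))
    nd = math.sqrt(sum(b * b for b in doc_vec))
    return dot / (nq * nd) if nq and nd else 0.0

def hybrid_score(query_tokens, doc_tokens, query_vec, doc_vec, alpha=0.5):
    # Weighted combination of the two signals; alpha trades off
    # lexical vs. semantic matching.
    return (alpha * lexical_score(query_tokens, doc_tokens)
            + (1 - alpha) * semantic_score(query_vec, doc_vec))

def retrieve_top1(query, codebase):
    # codebase: list of (bug_tokens, bug_vec, fix) triples.
    # Returns the fix paired with the highest-scoring previous bug.
    best = max(codebase,
               key=lambda e: hybrid_score(query[0], e[0], query[1], e[1]))
    return best[2]
```

The retrieved fix (and its buggy counterpart) is then prepended to the buggy input as guidance for the patch generator.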
├── evaluator: the source code of evaluation
│
├── sh: the bash shell script files to run experiments
│
├── data
│ ├── refine_small : the dataset used to train and evaluate task code refinement (small)
│ ├── refine_medium : the dataset used to train and evaluate task code refinement (medium)
│ ├── refine_rapgen : the dataset used to train and evaluate task code refinement with RAP-Gen
│ │ ├── small
│ │ ├── medium
│ ├── tfix_rmd : the dataset used to train and evaluate task TFix
│ ├── tfix_rmd_rapgen: the dataset used to train and evaluate task TFix with RAP-Gen
│ ├── selfapr : the SelfAPR dataset used to train and then evaluate for Defects4J
│ ├── selfapr_rapgen : the SelfAPR dataset used to train and then evaluate for Defects4J with RAP-Gen
│ ├── selfapr_rapgen_P1: the SelfAPR dataset used to evaluate Defects4J with RAP-Gen (top-1 bug-fix pattern from P1)
│ ├── ...
│ ├── selfapr_rapgen_P16: the SelfAPR dataset used to evaluate Defects4J with RAP-Gen (top-1 bug-fix pattern from P16)
│
├── Defects4j_result
│ ├──Defects4J v1.2 Results : the correctly generated patches by RAP-Gen (trained with SelfAPR data).
│ ├──Defects4J v2.0 Results : the correctly generated patches by RAP-Gen (trained with SelfAPR data).
│ ├──Defects4J_predicted_patches : the patches generated by RAP-Gen and CodeT5
│ │ ├── CodeT5
│ │ ├── RAP-Gen
│ ├──Defects4J_result_summary_v1p2.csv
│ ├──Defects4J_result_summary_v2p0.csv
│ ├──Defects4J_pattern_analysis : the summary and plotting script of RAP-Gen pattern analysis
└──
Instructions to download:
# pip install gsutil
cd your-cloned-RAP-Gen-path
gsutil -m cp -r "gs://sfr-codet5-data-research/RAP-Gen/data" .
gsutil -m cp -r "gs://sfr-codet5-data-research/RAP-Gen/checkpoints" .
- pickle5
- tqdm
- pandas 1.3.5
- tensorboard 2.4.1
- protobuf 3.15.8
- numpy 1.20.2
- transformers 4.6.1
- packaging 20.9
- tree-sitter 0.2.2
- python 3.7.9
For our A100-SXM4-40GB with CUDA capability sm_80:
- PyTorch 1.7.1
- torchvision 0.8.2
- torchaudio 0.7.2
- cudatoolkit 11.0
Go to the `sh` folder and set `WORKDIR` in `exp_with_args.sh` to your cloned RAP-Gen repository path.

You can use `run_exp.py` to run a broad set of experiments by simply passing the `model_tag`, `task`, and `sub_task` arguments. In total, we support six models (i.e., ['roberta', 'codebert', 'bart_base', 't5_base', 'codet5_small', 'codet5_base']) and seven main tasks (i.e., ['tfix_rmd', 'tfix_rmd_rapgen', 'refine_small', 'refine_medium', 'refine_rapgen', 'selfapr', 'selfapr_rapgen']). For each task, we use the `sub_task` to specify which specific dataset to fine-tune on. Below is the full list:
| --task | --sub_task | Description |
|---|---|---|
| tfix_rmd | none | APR task on TFix data (deduplicated version) |
| tfix_rmd_rapgen | none | RAP-Gen APR task on TFix data (deduplicated version) |
| refine_small | none | code refinement on code repair data with small functions |
| refine_medium | none | code refinement on code repair data with medium functions |
| refine_rapgen | small/medium | RAP-Gen code refinement on code repair data with small/medium functions |
| selfapr | none | Defects4J APR on SelfAPR data |
| selfapr_rapgen | none | RAP-Gen Defects4J APR on SelfAPR data |
| selfapr_rapgen_P1 | none | RAP-Gen Defects4J APR on SelfAPR data with top-1 bug-fix pattern from P1 (test) |
| ... | ... | ... |
| selfapr_rapgen_P16 | none | RAP-Gen Defects4J APR on SelfAPR data with top-1 bug-fix pattern from P16 (test) |
For example, if you want to run CodeT5-base model on the selfapr_rapgen task for Defects4J, you can simply run:
cd sh
python run_exp.py --model_tag codet5_base --task selfapr_rapgen
For example, if you want to run CodeT5-base model on the refine_rapgen task for code refinement medium, you can simply run:
cd sh
python run_exp.py --model_tag codet5_base --task refine_rapgen --sub_task medium
Besides, you can specify:

- `model_dir`: where to save fine-tuning checkpoints
- `res_dir`: where to save the performance results
- `summary_dir`: where to save the training curves
- `data_num`: how many data instances to use; the default -1 uses the full data
- `gpu`: the index of the GPU to use in the cluster
- `load_model_dir`: where to reload the fine-tuned checkpoints
- `tag_suffix`: indicates the purpose of the experiments, e.g., finetune. This is mostly for annotation, except for zeroshot.
You can also revise the suggested arguments here or directly customize the `exp_with_args.sh` bash file. Please refer to the argument flags in `configs.py` for the full available options. The saved training curves in `summary_dir` can be visualized using TensorBoard.
Note that we employ one A100 GPU for all fine-tuning experiments.
Please follow the instructions below to reproduce the results.

- `cd sh`, then run the inference via `run_exp.py`.
- Pass the path of your downloaded fine-tuned checkpoint to load here, e.g., `file = "xx/checkpoints/codet5_base_selfapr_rapgen_pytorch_model.bin"`
- Run the program: `python run_exp.py --model_tag codet5_base --task selfapr_rapgen_P1 --load_model_dir file --tag_suffix reproduce`
- Please refer to our paper for the reproduced results.
We follow SelfAPR to compute all the suspicious buggy lines with two fault localization tools, Gzoltar and Flacoco.
We follow CodeT5+ and CodeBERT to train our retriever using contrastive loss.
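As a minimal sketch of this contrastive objective, the following NumPy implementation computes an InfoNCE-style loss with in-batch negatives over paired bug/fix embeddings. This is an illustration under stated assumptions (L2-normalized embeddings, the diagonal as positives, a hypothetical `temperature` value); the actual training code and batch construction in this repo may differ.

```python
import numpy as np

def info_nce_loss(bug_embs, fix_embs, temperature=0.05):
    # bug_embs, fix_embs: (batch, dim) L2-normalized embeddings of
    # paired buggy code and its fix. For row i, fix i is the positive
    # and every other fix j != i in the batch acts as a negative.
    logits = bug_embs @ fix_embs.T / temperature   # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    labels = np.arange(len(bug_embs))              # positives on the diagonal
    return -log_probs[labels, labels].mean()       # cross-entropy over the batch
```

Minimizing this loss pulls each bug embedding toward its paired fix and pushes it away from the other fixes in the batch, which is what lets the dense retriever score unseen bugs against the bug-fix codebase.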
All perturbation rules generated by SelfAPR:
| Perturb Actions | Explanation |
|---|---|
| P1 replace type | modify declaring type ... |
| P2 replace operator | modify operator, e.g., ==, != |
| P3 replace literal | modify literal, e.g., "STRING", true, false |
| P4 replace constructor | modify constructor |
| P5 replace variable | modify arguments / swap arguments |
| P6 replace condition expression | reduce/expand boolean expression |
| P7 replace invocation | modify invocation |
| P8 compound statement | compound of rules |
| P9 replace with similarity | replace by transplanting a similar donor statement |
| P10 move statement | move a later statement before the target statement |
| P11 transplant statement | transplant a donor statement |
| P12 transplant condition | wrap target statement with an existing conditional block |
| P13 transplant a block | insert an existing block (if, loop, etc.) |
| P14 remove a statement | delete statement |
| P15 unwrap a statement | unwrap block |
| P16 remove a block | remove block |
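To illustrate what such a perturbation looks like, here is a toy, string-level version of P2 (replace operator) that flips a comparison operator to synthesize a "buggy" variant of a correct statement. This is only an approximation for exposition: the actual SelfAPR rules operate on parsed Java source, and the `OPERATOR_FLIPS` table and `perturb_p2` name are hypothetical.

```python
# Hypothetical string-level sketch of SelfAPR's P2 (replace operator):
# flipping one comparison operator turns a correct statement into a
# synthetic buggy training sample, with the original as its "fix".
OPERATOR_FLIPS = {"==": "!=", "!=": "==", "<=": ">", ">=": "<"}

def perturb_p2(statement):
    for op, flipped in OPERATOR_FLIPS.items():
        if op in statement:
            # Perturb only the first occurrence, mimicking a single edit.
            return statement.replace(op, flipped, 1)
    return statement  # no comparison operator: leave unchanged
```

Each perturbed/original pair then serves as a (bug, fix) training sample for the repair model.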
@inproceedings{wang2023rap,
title = {RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair},
booktitle = {Proceedings of FSE},
year = {2023},
author = {Weishi Wang and Yue Wang and Shafiq Joty and Steven C.H. Hoi},
url = {https://arxiv.org/abs/2309.06057},
}