This replication package contains supplementary material for the paper "ACRCoder: Adaptive Context Retrieval through Reinforcement Learning for Repository-level Code Completion".
- Download the original dataset from https://huggingface.co/datasets/nov3630/Data4RLCoder
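For example, the dataset snapshot can be fetched with the `huggingface_hub` client (a minimal sketch; the local target directory is illustrative, not a path required by the scripts):

```python
# Hedged sketch: download the Data4RLCoder dataset snapshot from the Hugging Face Hub.
# The local_dir below is illustrative, not a path required by the scripts.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="nov3630/Data4RLCoder",
    repo_type="dataset",
    local_dir="Data/Data4RLCoder",
)
```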
- Use `DataProcess/PreProcessData.py` to construct the FIM dataset
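A minimal sketch of fill-in-the-middle (FIM) sample construction is shown below; the sentinel tokens are placeholders that must be replaced with the base model's own FIM tokens, and the field names are illustrative rather than the script's actual format:

```python
import random

# Hedged sketch of FIM sample construction: split a file into prefix/middle/suffix
# and build a prompt in prefix-suffix-middle (PSM) order. The sentinel tokens are
# placeholders; use the base model's own special tokens in practice.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def make_fim_sample(code: str, rng: random.Random) -> dict:
    """Pick two split points, hold out the middle span as the completion target."""
    i, j = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return {"prompt": prompt, "target": middle}
```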
- Use `DataProcess/analyzer.py` to calculate the PPL sequence of context combinations
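The following is a minimal sketch of how the PPL of the ground-truth completion given one context combination can be computed; the scoring model is assumed to be deepseek-coder-1.3b-base (as suggested by the training-data filename), and the function interface is illustrative rather than the script's actual API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed scoring model; the training-data filename suggests deepseek-coder-1.3b-base.
MODEL_NAME = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16).eval()

@torch.no_grad()
def completion_ppl(context: str, prefix: str, target: str) -> float:
    """PPL of `target` conditioned on the retrieved `context` and the in-file `prefix`."""
    prompt_ids = tokenizer(context + prefix, return_tensors="pt").input_ids
    target_ids = tokenizer(target, return_tensors="pt", add_special_tokens=False).input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # score only the completion tokens
    loss = model(input_ids=input_ids, labels=labels).loss  # mean NLL over target tokens
    return torch.exp(loss).item()
```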
- Use `DataProcess/GenerateTrainData.py` to generate reinforcement learning training data based on RRL
- Use `Trainer/trainReranker.py` to train the reranker model
- Reference training command:

```bash
deepspeed --include localhost:0,1,2,3 trainRetrieve.py \
  --deepspeed ZeRO_2.json --base_model /nvme1n1/LLM/Qwen3-Reranker-0.6B \
  --cutoff_len 8000 --per_device_train_batch_size 1 --gradient_accumulation_steps 64 \
  --warmup_steps 100 --num_train_epochs 5 --learning_rate 1e-7 --logging_steps 5 \
  --val_set_size 0 --output_dir ./Models \
  --data_path Data/processed_train_datasetV4_deepseek-coder-1.3b-base_balance_min_ppl3.parquet \
  --sft_loss_rate 1.0 --scorer_loss_rate 1.0 --log_scorer_loss False
```

- Use `Evaluate/RetriveCCEval.py` to retrieve context combinations
- Use `Evaluate/evalCCEvalScoreVllm.py` to evaluate EM and ES performance
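A hedged sketch of the two metrics: exact match (EM) checks identity of the predicted and reference completions, and edit similarity (ES) is commonly computed with fuzzywuzzy's `fuzz.ratio` in repository-level completion benchmarks; the actual evaluation script may normalize differently:

```python
from fuzzywuzzy import fuzz

# Hedged sketch of EM and ES; the actual evaluation script may strip comments,
# compare only the first line, or normalize whitespace differently.
def exact_match(prediction: str, reference: str) -> float:
    return float(prediction.strip() == reference.strip())

def edit_similarity(prediction: str, reference: str) -> float:
    return fuzz.ratio(prediction.strip(), reference.strip()) / 100.0
```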
- The implementation of RQ1 is based on RLCoder; its repository is available at https://github.com/DeepSoftwareAnalytics/RLCoder
- Use `RRLPerformance/InferCCEval.py` to generate completion results
- Use `RRLPerformance/inferRRLandPPL.py` to calculate PPL and RRL
- Use `RRLPerformance/calculateCorrelation.py` to compute the point-biserial correlation and the Spearman rank-order correlation
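A minimal sketch of computing both correlation measures with SciPy; the input arrays are illustrative placeholders (point-biserial correlation expects one dichotomous variable), and which quantities the script actually pairs is determined by the script itself:

```python
from scipy.stats import pointbiserialr, spearmanr

# Hedged sketch with illustrative data: a dichotomous outcome (e.g., whether a
# completion is correct) paired with a continuous score (e.g., PPL).
binary_outcome = [1, 0, 1, 1, 0]
continuous_score = [2.1, 3.8, 2.4, 2.0, 3.5]

r_pb, p_pb = pointbiserialr(binary_outcome, continuous_score)  # point-biserial
rho, p_sp = spearmanr(binary_outcome, continuous_score)        # Spearman rank-order
print(f"point-biserial r={r_pb:.3f} (p={p_pb:.3g}); Spearman rho={rho:.3f} (p={p_sp:.3g})")
```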