This replication package contains supplementary material for the paper "ACRCoder: Adaptive Context Retrieval through Reinforcement Learning for Repository-level Code Completion".

aiopsplus/ACRCoder

Supplementary Material

Generating Reinforcement Learning Training Data

  • Download the original dataset from https://huggingface.co/datasets/nov3630/Data4RLCoder
  • Use DataProcess/PreProcessData.py to construct the fill-in-the-middle (FIM) dataset
  • Use DataProcess/analyzer.py to calculate the perplexity (PPL) sequence of context combinations
  • Use DataProcess/GenerateTrainData.py to generate reinforcement learning training data based on RRL
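The perplexity step above can be illustrated with a minimal sketch. It assumes that per-token log-probabilities of the target completion are already available for each candidate context. The function names and the example data are hypothetical and are not taken from DataProcess/analyzer.py.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over the target tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_contexts_by_ppl(logprobs_by_context):
    """Return (context_id, ppl) pairs sorted from lowest to highest PPL."""
    scored = [(cid, perplexity(lps)) for cid, lps in logprobs_by_context.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical natural-log probabilities of the same target completion
# when the model is conditioned on three different candidate contexts.
example = {
    "no_context": [-2.0, -2.0, -2.0],
    "context_a": [-0.5, -0.5, -0.5],
    "context_b": [-1.0, -1.0, -1.0],
}

ranking = rank_contexts_by_ppl(example)
for context_id, ppl in ranking:
    print(f"{context_id}: PPL = {ppl:.3f}")
```

A lower PPL means the model found the target completion more predictable under that context, which is why the sketch sorts in ascending order.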

Training the Retrieval Model

  • Use Trainer/trainReranker.py to train the reranker model
  • Reference training command:
deepspeed --include localhost:0,1,2,3 trainRetrieve.py \
--deepspeed ZeRO_2.json --base_model /nvme1n1/LLM/Qwen3-Reranker-0.6B \
--cutoff_len 8000 --per_device_train_batch_size 1 --gradient_accumulation_steps 64 \
--warmup_steps 100 --num_train_epochs 5 --learning_rate 1e-7 --logging_steps 5 \
--val_set_size 0 --output_dir ./Models \
--data_path Data/processed_train_datasetV4_deepseek-coder-1.3b-base_balance_min_ppl3.parquet \
--sft_loss_rate 1.0 --scorer_loss_rate 1.0 --log_scorer_loss False
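
As a quick sanity check on the reference command, the effective batch size per optimizer step can be worked out from its flags. The sketch below assumes standard data-parallel training, where each GPU listed in --include processes its own micro-batches. The helper name is hypothetical.

```python
def effective_batch_size(num_gpus, per_device_batch, grad_accum_steps):
    """Samples contributing to one optimizer step under data parallelism."""
    return num_gpus * per_device_batch * grad_accum_steps


# Values copied from the reference command.
include = "localhost:0,1,2,3"          # --include
per_device_train_batch_size = 1        # --per_device_train_batch_size
gradient_accumulation_steps = 64       # --gradient_accumulation_steps

# Count the GPU indices listed after the host name.
num_gpus = len(include.split(":")[1].split(","))

ebs = effective_batch_size(
    num_gpus, per_device_train_batch_size, gradient_accumulation_steps
)
print(f"{num_gpus} GPUs -> effective batch size {ebs}")
```

When training on a different number of GPUs, adjusting --gradient_accumulation_steps so that this product stays the same is one way to keep the setup comparable.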

Evaluation Using the CCEval Benchmark

  • Use Evaluate/RetriveCCEval.py to retrieve context combinations
  • Use Evaluate/evalCCEvalScoreVllm.py to evaluate exact match (EM) and edit similarity (ES)
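
For readers unfamiliar with the two metrics, the following is a minimal sketch. It assumes EM compares whitespace-stripped strings and ES is one minus the Levenshtein distance divided by the longer string's length. The scorer in Evaluate/evalCCEvalScoreVllm.py may normalize or compute similarity differently, and all function names here are hypothetical.

```python
def exact_match(prediction, reference):
    """True when the two strings are identical after stripping whitespace."""
    return prediction.strip() == reference.strip()


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(prediction, reference):
    """Similarity in [0, 1]; 1.0 means the stripped strings are identical."""
    p, r = prediction.strip(), reference.strip()
    longest = max(len(p), len(r))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(p, r) / longest


# Hypothetical completion and ground truth.
pred = "return a + b"
gold = "return a - b"
print(exact_match(pred, gold), round(edit_similarity(pred, gold), 3))
```

Benchmark-level EM and ES are then typically reported as averages of these per-sample values.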

Implementation of RQ1

Implementation of RQ2

  • Use RRLPerformance/InferCCEval.py to generate completion results
  • Use RRLPerformance/inferRRLandPPL.py to calculate PPL and RRL
  • Use RRLPerformance/calculateCorrelation.py to compute the point-biserial correlation and the Spearman rank-order correlation
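
The two correlation measures named above can be sketched in plain Python. This is an illustrative sketch, not the code in RRLPerformance/calculateCorrelation.py. It uses the facts that the point-biserial coefficient is the Pearson correlation between a 0/1 variable and a continuous one, and that Spearman's coefficient is the Pearson correlation of average ranks. The sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def point_biserial(binary, continuous):
    """Pearson correlation between a 0/1 variable and a continuous one."""
    if any(b not in (0, 1) for b in binary):
        raise ValueError("binary values must be 0 or 1")
    return pearson([float(b) for b in binary], continuous)


def spearman(xs, ys):
    """Spearman rank-order correlation via Pearson on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: a 0/1 correctness flag and a score per sample.
is_correct = [1, 1, 0, 0]
score = [1.2, 1.5, 3.0, 4.1]
print(point_biserial(is_correct, score), spearman(is_correct, score))
```

If SciPy is available, scipy.stats.pointbiserialr and scipy.stats.spearmanr compute the same coefficients and also report p-values.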
