**Updated (3/4/2024)** Our paper and repository have been updated: the DS-1000 benchmark is now included, and a new baseline (EXP-edit) is provided for reproducing the main results. Experiments with a surrogate model, variable renaming, and detectability@T will be added.
Official repository of the paper:
"Who Wrote this Code? Watermarking for Code Generation" by Taehyun Lee*, Seokhee Hong*, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin', Gunhee Kim'
We ran our main experiments by separating them into a generation phase and a detection phase. To run both phases with a single command, remove the `--generation_only` argument.
For EXP-edit with a high-entropy setting, please set `top_p=1.0` and `temperature=1.0`.
```bash
bash scripts/main/run_{MODEL}_generation.sh
bash scripts/main/run_{MODEL}_detection.sh
bash scripts/main/run_{MODEL}_detection_human.sh
```
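For reference, the high-entropy setting mentioned above simply means sampling from the model's unmodified output distribution. The snippet below is a minimal sketch of those decoding arguments in plain Hugging Face `transformers` terms, not the EXP-edit watermark sampler used by the scripts; the model name and prompt are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model and prompt; the repository's scripts configure the actual code LLM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("def add(a, b):", return_tensors="pt")
# top_p=1.0 and temperature=1.0 leave the output distribution untouched,
# i.e. the high-entropy setting recommended above for EXP-edit.
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_p=1.0,
    temperature=1.0,
    max_new_tokens=64,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```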
With the metric output files for both machine-generated and human-written code, we calculate metrics including AUROC and TPR and update the results in `OUTPUT_DIRECTORY`.
```bash
python calculate_auroc_tpr.py \
    --task {humaneval,mbpp} \
    --human_fname OUTPUT_DIRECTORY_HUMAN \
    --machine_fname OUTPUT_DIRECTORY
```
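For reference, below is a minimal sketch of this metric computation, assuming each metric output file provides one detection score per sample and treating machine-generated code as the positive class; the actual file format and score fields used by `calculate_auroc_tpr.py` may differ.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def auroc_and_tpr(machine_scores, human_scores, target_fpr=0.05):
    """AUROC and TPR at a fixed FPR, with machine-generated code as the positive class."""
    scores = np.concatenate([machine_scores, human_scores])
    labels = np.concatenate([np.ones(len(machine_scores)), np.zeros(len(human_scores))])
    auroc = roc_auc_score(labels, scores)
    fpr, tpr, _ = roc_curve(labels, scores)
    # TPR at the largest threshold whose FPR does not exceed the target.
    mask = fpr <= target_fpr
    tpr_at_fpr = tpr[mask].max() if np.any(mask) else 0.0
    return auroc, tpr_at_fpr

# Toy example with made-up scores; real scores come from the detection-phase outputs.
machine = np.array([0.9, 0.8, 0.75, 0.6])
human = np.array([0.3, 0.2, 0.55, 0.1])
print(auroc_and_tpr(machine, human))
```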
This repository is based on the code of bigcode-evaluation-harness from the BigCode Project.
If you have any questions about our code, feel free to contact us: Taehyun Lee ([email protected]) or Seokhee Hong ([email protected]).