- 🌻 Acknowledgement
- 🌟 Overview
- 🔧 Algorithm
- 🚀 Installation
- 📚 Partial Answer Preview
- 📉 Vanilla Fine-Tuning
- 🧐 OURS
- 🚩 Citation
Our dataset is sourced from the GSM8K dataset, the Samsum dataset, and the HEx-PHI dataset . Our training code and safety evaluation procedures are derived from the LLMs-Finetuning-Safety and the shallow-vs-deep-alignment.
We sincerely appreciate their remarkable contributions to the field.
LookAhead Tuning comprises two data-centric, resource-efficient, simple, and effective methods. The fundamental concept is modifying the training data to preview the answer prefix tokens, thereby reducing the loss associated with these tokens and minimizing disturbances to the model’s initial tokens, which helps maintain the model’s safety.
-
Clone the Repository and Enter the Directory:
git clone https://github.com/zjunlp/LookAheadTuning cd LookAheadTuning
-
Install Dependencies:
pip install -r requirements.txt
-
Download the Model:
Download the LLaMA 2 7B Chat model into the
LookAheadTuning/llama-2-7b-chat-hf
directory. Alternatively, you can use another model and tokenizer.
We utilize a Python script named main.py
that is designed to process datasets in both JSON and JSONL formats using two distinct modes: real and virtual. This tool allows you to modify your training data, enabling a preview of the answer by adjusting specific fields based on your selected mode. Inference data is unchanged.
- Supports Multiple Formats: Handles both JSON and JSONL (JSON Lines) file formats.
- Dual Processing Modes:
- Real Mode: Uses a tokenizer to concatenate a specified number of tokens from the output field to the input field.
- Virtual Mode: Inserts a predefined string into both the input and output fields without tokenization.
- Customizable Fields: Specify which fields in your JSON objects to process.
- Easy Configuration: Simple command-line arguments to tailor the processing to your needs.
Run the main.py
script from the command line with the appropriate arguments.
--input_file
: (Required) Path to the input file.--output_file
: (Required) Path to the output file.--input_format
: (Required) Format of the input file. Choose eitherjson
orjsonl
.--output_format
: (Required) Format of the output file. Choose eitherjson
orjsonl
.--input_field
: (Optional) Field name for the input in the JSON objects. Default isinput
.--output_field
: (Optional) Field name for the output in the JSON objects. Default isoutput
.--mode
: (Required) Processing mode. Choose eitherreal
orvirtual
.- Real Mode:
--m
: Number of tokens to preview.--tokenizer_path
: Path to the model tokenizer.
- Virtual Mode:
--P
: Predefined string to insert.
- Real Mode:
For more detailed parameters, refer to the main.py
file.
Real Mode: Set m=6
and use question
as the input field and answer
as the output field.
python main.py \
--input_file data/tasks/gsm8k/train.json \
--output_file data/tasks/gsm8k/train_real_6.json \
--input_format jsonl \
--output_format jsonl \
--input_field question \
--output_field answer \
--mode real \
--m 6 \
--tokenizer_path llama-2-7b-chat-hf
Virtual Mode: Set P=" Let's solve this problem. "
and use question
as the input field and answer
as the output field.
python main.py \
--input_file data/tasks/gsm8k/train.json \
--output_file data/tasks/gsm8k/train_virtual.json \
--input_format jsonl \
--output_format jsonl \
--input_field question \
--output_field answer \
--mode virtual \
--P " Let's solve this problem. "
- The file extension of
input_file
does not need to match the--input_format
argument. Ensure that the content of the input file corresponds to the specified--input_format
(json
orjsonl
). - The
input_field
andoutput_field
should correspond to the fields in your JSON objects. - In real mode, both
--m
and--tokenizer_path
are required. - In virtual mode,
--P
is required.
First, clone the repository and organize the required files:
git clone https://github.com/Unispac/shallow-vs-deep-alignment
mv data shallow-vs-deep-alignment/finetuning_buckets/datasets
mv Vanilla_FT.sh eval shallow-vs-deep-alignment
cd shallow-vs-deep-alignment
Next, set up the Python environment required to run shallow-vs-deep-alignment
, and ensure you have access to 4 A100/H100 80GB GPUs for execution.
To perform vanilla fine-tuning, run the following command:
bash Vanilla_FT.sh
Please note that the input_file
parameters used here should point to the paths generated as save_path
from steps 2, 3, and 4 in Vanilla_FT.sh
during the training and inference process.
Utility Assessment: Calculate the utility using the specified metrics:
python eval/calculate_utility.py rouge1 --input_file <path>
python eval/calculate_utility.py gsm8k --input_file <path>
Safe Rate Assessment: Provide scores using both keywords and GPT-based methods.
In our paper, we utilize the GPT-based method for Safe Rate Assessment to achieve more accurate evaluations.
Using keywords:
python eval/calculate_safe_rate.py --input_file <path> --mode keywords
Using GPT (where path_2
is where each data score is stored):
python eval/calculate_safe_rate.py --input_file <path_1> --mode gpt --output_file <path_2> --api_key <your_key>
To apply our method, follow these steps:
- Modify the paths in
get_finetuning_data.py
located inshallow-vs-deep-alignment-main/finetuning_buckets/datasets/utils/
:- Update the
load_dataset
calls in theget_gsm8k
andget_samsum
functions to point to paths with your modified data (as specified in your data storage path updates in the Partial Answer Preview section).
- Update the
- Update all
output_dir
andsave_path
occurrences inVanilla_FT.sh
to reflect your desired storage locations. - Then, repeat the Training and Inference and Evaluation steps outlined in the Vanilla Fine-Tuning section.
Please cite our repository if you use LookAhead Tuning in your work. Thanks!
@misc{liu2025lookaheadtuningsaferlanguage,
title={LookAhead Tuning: Safer Language Models via Partial Answer Previews},
author={Kangwei Liu and Mengru Wang and Yujie Luo and Lin Yuan and Mengshu Sun and Ningyu Zhang and Lei Liang and Zhiqiang Zhang and Jun Zhou and Huajun Chen},
year={2025},
eprint={2503.19041},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.19041},
}
We will offer long-term maintenance to fix bugs and solve issues. So if you have any problems, please put issues to us.