VityaVitalich/GIFT_SW

🤗 GIFT-SW with PEFT

GIFT-SW with State-of-the-art Parameter-Efficient Fine-Tuning (PEFT) methods

This repository contains the code for the GIFT-SW method, implemented on top of the PEFT library. It exposes the same interface as the other PEFT methods, so it can be plugged into existing PEFT-based code.

PEFT is integrated with Transformers for easy model training and inference, Diffusers for conveniently managing different adapters, and Accelerate for distributed training and inference for really big models.

Quickstart

Install PEFT directly from this repository:

cd GIFT_SW/
pip install -e .

If PEFT is already installed, uninstall it first and then install the version from this repository:

cd GIFT_SW/
pip uninstall -y peft
pip install -e .
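
A quick way to confirm that the GIFT-SW version of PEFT is the one being imported is to check for GIFTConfig, the class used in the example in this README. This is only a minimal sketch: the helper name has_gift_sw is hypothetical and not part of the repository.

```python
import importlib


def has_gift_sw():
    """Return True if the importable `peft` package exposes GIFTConfig."""
    try:
        peft = importlib.import_module("peft")
    except ImportError:
        # PEFT is not installed at all
        return False
    # A stock PEFT install is assumed not to define GIFTConfig
    return hasattr(peft, "GIFTConfig")


if __name__ == "__main__":
    print("GIFT-SW PEFT installed:", has_gift_sw())
```

If this prints False after installation, another copy of PEFT is probably still shadowing the editable install.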

Get the activation scales used for outlier computation, either from the precomputed scales in the QUIK repository or by collecting them with the script from SmoothQuant.

Prepare a model for training with the GIFT-SW method by wrapping the base model and PEFT configuration with get_peft_model.

from transformers import AutoModelForCausalLM
from peft import get_peft_model, GIFTConfig

model_name_or_path = "facebook/opt-1.3b"
tokenizer_name_or_path = "facebook/opt-1.3b"
# activation scales obtained in the previous step
path_to_act_scales = "./opt-1.3b.pt"

peft_config = GIFTConfig(
    outlier_num=64,
    target_modules=["q_proj", "k_proj"],
    path_to_act_scales=path_to_act_scales,
)

# load the base model and wrap it with the GIFT-SW configuration
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# trainable params: 6,291,456 || all params: 1,517,182,976 || trainable%: 0.4147
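
The trainable parameter count reported above can be reproduced by hand. The sketch below assumes that each target module trains outlier_num full columns of a square hidden_size x hidden_size weight matrix and that nothing else is trainable. This is an assumption, not something stated in this README, but it matches the reported figure. The layer count and hidden size are those of facebook/opt-1.3b.

```python
# Architecture of facebook/opt-1.3b
hidden_size = 2048
num_layers = 24

# Values taken from the GIFTConfig in the example
outlier_num = 64
target_modules = ["q_proj", "k_proj"]

# Assumption: each target module trains `outlier_num` full columns of a
# hidden_size x hidden_size weight matrix, and nothing else is trainable.
per_module = outlier_num * hidden_size
trainable = per_module * len(target_modules) * num_layers

all_params = 1_517_182_976  # total reported by print_trainable_parameters()
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / all_params:.4f}")
```

Under the same assumption, doubling outlier_num or adding two more projection matrices of the same shape to target_modules would double the trainable count.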

To save a GIFT-SW model and run inference with it later, it is highly recommended to call merge_and_unload() on the model, because GIFT-SW is not a regular adapter but a learned subset of the model weights. Further tuning of an already trained GIFT-SW model is equivalent to merging the model and training a new one.
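
The merge-then-save step could look like the sketch below. The helper name merge_and_save and the output directory are hypothetical. The only calls it relies on are merge_and_unload(), as recommended above, and the save_pretrained() method of Transformers models.

```python
from pathlib import Path


def merge_and_save(peft_model, output_dir):
    """Fold the trained GIFT-SW weights into the base model and save it."""
    # merge_and_unload() returns the underlying model with the tuned
    # weights merged in, without the PEFT wrapper
    merged = peft_model.merge_and_unload()
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    merged.save_pretrained(output_dir)
    return merged


# Usage with the `model` from the quickstart (hypothetical output path):
# merged_model = merge_and_save(model, "./opt-1.3b-gift-sw")
```

The saved directory should then load like any regular checkpoint with AutoModelForCausalLM.from_pretrained, with no PEFT configuration needed.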

How to get activation scales with SmoothQuant

To get the activation scales for your model, collect them with the SmoothQuant method, which is simple and easy to use.

git clone https://github.com/mit-han-lab/smoothquant

# make sure git-lfs is installed
# curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
# apt-get install git-lfs
# git lfs install

# clone the calibration data
git clone https://huggingface.co/datasets/mit-han-lab/pile-val-backup

# move to smoothquant and run the script
#   --num-samples: number of calibration samples
#   --seq-len:     max sequence length
cd smoothquant
python examples/generate_act_scales.py \
    --model-name <your model name> \
    --output-path <save path .pt> \
    --num-samples 512 \
    --seq-len 2048 \
    --dataset-path ../pile-val-backup/val.jsonl.zst
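
To give an intuition for how per-channel activation scales could be turned into outlier channels, here is a small sketch in plain Python. It assumes that the outliers are simply the outlier_num channels with the largest scales. This README does not describe the actual selection rule used by GIFT-SW, so the function top_outlier_channels is purely illustrative.

```python
def top_outlier_channels(scales, outlier_num):
    """Return the indices of the `outlier_num` largest scales, in ascending order.

    `scales` is a sequence of per-channel activation scales for one layer.
    Assumption: outliers are the channels with the largest scales.
    """
    ranked = sorted(range(len(scales)), key=lambda i: scales[i], reverse=True)
    return sorted(ranked[:outlier_num])


# Toy example with four channels; channels 1 and 3 have the largest scales
print(top_outlier_channels([0.1, 5.0, 0.3, 9.0], outlier_num=2))
```

With a real scales file, the per-layer values would first be loaded (for example with torch.load on the .pt file) and converted to lists before being passed to such a function.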
