First, obtain the Hugging Face Llama 2 model (request access at https://llama.meta.com/llama2/) and place it in ./models/llama2/llama_hf_converted/7b, along with the tokenizer at ./models/llama2/llama/tokenizer.model.
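A rough sketch of the expected layout is below; the checkpoint file names shown are typical for an HF-converted Llama 2 7B and may differ depending on how the conversion was done (paths are relative, as in the instructions above).
mkdir -p ./models/llama2/llama_hf_converted/7b ./models/llama2/llama
# typical contents after copying the converted checkpoint:
#   ./models/llama2/llama_hf_converted/7b/config.json
#   ./models/llama2/llama_hf_converted/7b/pytorch_model-*.bin   (or *.safetensors shards)
#   ./models/llama2/llama_hf_converted/7b/tokenizer_config.json
#   ./models/llama2/llama/tokenizer.model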
Clone the repo into a folder ./SecureLLM
Then create the conda environment: conda env create -f ./environment.yml
Then activate it: conda activate securellm
Finally, make sure all .sh files are executable: go to ./SecureLLM and run chmod +x *.sh
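Putting the setup steps together, a minimal quick-start might look like this (the repo URL is a placeholder, and environment.yml is assumed to sit at the repo root):
git clone <repo-url> ./SecureLLM   # <repo-url>: placeholder for the SecureLLM repository
cd ./SecureLLM
conda env create -f ./environment.yml
conda activate securellm
chmod +x *.sh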
To get the results for all columns of all 3 tables, run the bash script run_all.sh inside ./SecureLLM.
To print all the results into tables (once run_all.sh finishes), run python print_tables.py.
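For example, assuming run_all.sh writes the same T*.C*.log files as the individual commands listed below, the full reproduction is:
cd ./SecureLLM
./run_all.sh              # runs every experiment for Tables 1-3
python print_tables.py    # collates the resulting logs into tables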
For individual tables, run the following:
For Table 1 results:
First two columns:
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M123E > T1.C1.log
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M123 > T1.C2.log
Columns 3-7:
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraLoraHub > T1.C3.log
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraSum > T1.C4.log
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraMax > T1.C5.log
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraLogit > T1.C6.log
python experiments.py --device 0 --sample_size 120 --mroot "./trained_models/6NF/" --pseudo --models M1 M2 M3 --config SloraLogit > T1.C7.log
For Table 2 results:
First two columns:
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M123E > T2.C1.log
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M123 > T2.C2.log
Columns 3-7:
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraLoraHub > T2.C3.log
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraSum > T2.C4.log
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraMax > T2.C5.log
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/SQL/" --models M1 M2 M3 --config SloraLogit > T2.C6.log
python experiments.py --device 0 --sample_size 120 --gpt --mroot "./trained_models/6NF/" --pseudo --models M1 M2 M3 --config SloraLogit > T2.C7.log
For Table 3 results:
First two columns:
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M123E > T3.C1.log
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M123 > T3.C2.log
Columns 3-7:
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M1 M2 M3 --config SloraLoraHub > T3.C3.log
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M1 M2 M3 --config SloraSum > T3.C4.log
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M1 M2 M3 --config SloraMax > T3.C5.log
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/SQL_obf/" --models M1 M2 M3 --config SloraLogit > T3.C6.log
python experiments.py --device 0 --sample_size 120 --mapping 1 --mroot "./trained_models/6NF_obf/" --pseudo --models M1 M2 M3 --config SloraLogit > T3.C7.log
To train a model, run the following command inside ./SecureLLM (update the TRAIN_STR and SAVE_PATH depending on which model you want to train).
export TRAIN_STR="schema_1:1000,schema_2:1000,schema_3:1000"
export SAVE_PATH="./trained_models/SQL/M123.pt"
python train.py \
--dataset_args \
train_str=${TRAIN_STR} \
val_str="schema_1_val:100,schema_2_val:100,schema_3_val:100" \
max_inp_matrix_size=800 \
--model_args \
name=llama-2-7b \
max_seq_len=1024 \
max_batch_size=32 \
type=lora \
lora_r=8 \
lora_alpha=32 \
lora_dropout=0.1 \
--train_args \
epochs=1 \
lr=2e-4 \
weight_decay=0.002 \
--save_path=${SAVE_PATH} \
--experiment "0.0" \
--world_size 1 \
--seed 2 \
--device cuda:0
Note: the val_str line can be deleted, which makes the command finish faster with no effect on the saved model.
The general structure for train_str is simple: schema_1mapping=1 for column obfuscation, schemapseudo_1 for pseudocode, and schemapseudo_1mapping=1 for both. Look at train_all.sh for all combinations.
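For illustration only, TRAIN_STR/SAVE_PATH pairs for these variants might look like the lines below; the save roots are inferred from the experiment commands above, so check train_all.sh for the exact dataset names and paths.
export TRAIN_STR="schema_1mapping=1:1000"        # schema 1 with column obfuscation
export SAVE_PATH="./trained_models/SQL_obf/M1.pt"
export TRAIN_STR="schemapseudo_1:1000"           # schema 1 as pseudocode
export SAVE_PATH="./trained_models/6NF/M1.pt"
export TRAIN_STR="schemapseudo_1mapping=1:1000"  # schema 1, pseudocode + obfuscation
export SAVE_PATH="./trained_models/6NF_obf/M1.pt"
Each pair is then followed by the same python train.py command shown above.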
The file train_all.sh contains every TRAIN_STR/SAVE_PATH pair needed to reproduce all models used, and running it will train them all.