I'm trying to reproduce the result shown in Figure 4 of the paper on arXiv (see attached image 1). If I understand correctly, it shows the 1-prompt transfer Attack Success Rate (ASR) and is produced by running scripts/experiment.jl.
For the first run, I only changed the Vicuna model path to the Hugging Face directory lmsys/vicuna-7b-v1.5, and the code compiled without error. The blackbox victim (target model) is gpt-3.5-turbo. Each output file gpt3-advbench[i]-adv-mdp-data.bson included 7 suffixes. However, I'm not sure which one I should pick if I'm looking for a "pass@1" ASR statistic. I wonder whether they are ranked by the whitebox victim's reward from top to bottom. If so, the top one is the "best" suffix for each prompt, correct?
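To make the question concrete, here is a minimal Python sketch of what I mean by "pass@1" ASR, assuming the suffixes can be ranked by whitebox reward. The data layout (a dict of `(whitebox_reward, attack_succeeded)` pairs per prompt) and the toy values are purely hypothetical placeholders of mine, not the actual schema or contents of the .bson files.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: prompt id -> list of (whitebox_reward, attack_succeeded).
# This is an assumption for illustration, NOT the schema of the .bson outputs.
Candidates = Dict[str, List[Tuple[float, bool]]]


def pass_at_1_asr(results: Candidates) -> float:
    """Fraction of prompts whose highest-reward suffix succeeds on the target."""
    if not results:
        raise ValueError("no prompts to score")
    hits = 0
    for candidates in results.values():
        # Keep only the single suffix with the highest whitebox reward.
        _, succeeded = max(candidates, key=lambda c: c[0])
        hits += int(succeeded)
    return hits / len(results)


# Made-up placeholder values, only to show how the function is called.
toy = {
    "advbench1": [(0.9, True), (0.4, False)],
    "advbench2": [(0.2, True), (0.7, False)],
}
print(pass_at_1_asr(toy))
```

Under this definition only the top-ranked suffix counts for each prompt, so a prompt where a lower-ranked suffix succeeds is still scored as a failure. If the intended metric instead counts any of the 7 suffixes, the number would be different.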
For the second run, I switched RUN_BEST = true, and errors occurred. There are three output files: gpt3-advbench1-best-data.bson, gpt3-advbench1-best-moderation.bson, gpt3-advbench1-best-data.bson. The suffix in the first file was exactly the same as the top entry in gpt3-advbench[i]-adv-mdp-data.bson, but the iteration stopped. I attach the error logs in the images below. I looked into the lines but I'm not sure how to fix them. I would appreciate it if you could push a fix!
This error showed up 8 times:
This error showed up once towards the end: