My thesis topic: Enhancing transferability to protect against textual adversarial examples.
The steps of my research are as follows:
- Select one NLP task (for example, text classification)
- Select two models for solving the NLP task
- Select one attack method
- Compute the attack success rate for the two models
- Compute the attack transferability from one model to the other
- Enhance the transferability of the attack
- Perform adversarial training with the enhanced transferable attack method
- Selected task: sentiment analysis on the `rotten_tomatoes` dataset
- Selected models: `textattack/xlnet-base-cased-rotten-tomatoes` and `textattack/roberta-base-rotten-tomatoes`
- Selected attack method: `BAEGarg2019`
- :)
- Compute attack transferability from `textattack/xlnet-base-cased-rotten-tomatoes` to `textattack/roberta-base-rotten-tomatoes`
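
The two metrics in the steps above (attack success rate and attack transferability) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis implementation. The definitions are one common choice: success rate is counted only over examples the model classifies correctly before the attack, and transferability is the share of adversarial examples that fool the source model and also fool the target model. The names `attack_success_rate`, `transfer_rate`, `toy_source`, and `toy_target` are hypothetical, and the keyword classifiers are stand-ins for the real XLNet and RoBERTa models.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a classifier maps a text to a label, and each
# example is (original_text, adversarial_text, true_label).
Classifier = Callable[[str], int]
Example = Tuple[str, str, int]


def attack_success_rate(model: Classifier, examples: List[Example]) -> float:
    """Fraction of correctly classified originals whose adversarial
    version is misclassified (assumed definition)."""
    attempted = 0
    succeeded = 0
    for original, adversarial, label in examples:
        if model(original) != label:
            continue  # already misclassified, so the attack is skipped
        attempted += 1
        if model(adversarial) != label:
            succeeded += 1
    return succeeded / attempted if attempted else 0.0


def transfer_rate(source: Classifier, target: Classifier,
                  examples: List[Example]) -> float:
    """Among adversarial examples that fool the source model, the fraction
    that also fool the target model (assumed definition). Examples the
    target misclassifies before the attack are excluded."""
    eligible = 0
    transferred = 0
    for original, adversarial, label in examples:
        fools_source = (source(original) == label
                        and source(adversarial) != label)
        if not fools_source or target(original) != label:
            continue
        eligible += 1
        if target(adversarial) != label:
            transferred += 1
    return transferred / eligible if eligible else 0.0


# Toy stand-ins for the two models (1 = positive, 0 = negative).
def toy_source(text: str) -> int:
    return 1 if "good" in text else 0


def toy_target(text: str) -> int:
    return 1 if ("good" in text or "fine" in text) else 0


# Hand-written pairs standing in for adversarial examples crafted
# against the source model.
examples: List[Example] = [
    ("a good film", "a fine film", 1),
    ("good acting", "decent acting", 1),
    ("a dull plot", "a dull story", 0),
    ("not bad", "not bad at all", 1),
]

if __name__ == "__main__":
    print("source success rate:", attack_success_rate(toy_source, examples))
    print("transfer rate:", transfer_rate(toy_source, toy_target, examples))
```

In the actual experiment, the adversarial texts would come from running the selected attack against the source model, and the two toy functions would be replaced by wrappers around the two fine-tuned models. Whether to exclude examples that the target model already misclassifies is a design choice that should be stated alongside any reported transfer rate.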