How to implement a new model in the adversarial-example model #4

Open · sebastianwgm opened this issue Apr 8, 2023 · 3 comments

@sebastianwgm

Dear authors,

We are currently working on a black-box adversarial attack on the code2vec model, based on this code. We trained a code2vec substitute model based on a different attention method, using the predicted output of code2vec as input.

We are trying to integrate our pretrained substitute model into the adversarial-example code. However, we have the following questions:

  1. Why are you calculating the gradient in the test graph (method build_test_graph_with_loss)?
  2. We want to use our substitute model to get the gradient w.r.t. the inputs and then test your pretrained model (code2vec) with the resulting adversarial code. We are not sure where to make the changes in model.py to include our model.

Thank you

sebastianwgm changed the title from "How to" to "How to implement a new model in the adversarial-example model" on Apr 8, 2023
@urialon
Contributor

urialon commented Apr 10, 2023

Hi Sebastian,
Thank you for your interest in our work!

Note that the main attack we described in our paper was "white-box", in the sense that it required access to the gradients.
Thus:

  1. We calculated the gradients in the test graph because the gradients are specific to the test example we attempted to attack. Mathematically, and in newer libraries such as PyTorch, the train and the test graphs are the same thing. The distinction between the train and test graphs was made only because of limitations of the older version of TensorFlow.
  2. I am not sure I understand the second question. From what I understand, you are assuming that you don't have access to the code2vec model, so you use gradients from the substitute model as a proxy. If so, why can't you find the concrete adversarial perturbation in the substitute model and then feed the perturbed example as a new input to code2vec? (See the sketch after this list.)
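
For illustration, here is a minimal, self-contained TF 1.x sketch of that transfer idea (TF 1.x is what this repository uses). The toy dense models, tensor names, and the FGSM-style step are all hypothetical stand-ins, not this project's actual code; in the real attack, code2vec's inputs are discrete, so the perturbation would be applied to variable names through their embeddings rather than to a continuous input:

```python
import numpy as np
import tensorflow as tf  # assumes TF 1.x, as used by this repository

tf.reset_default_graph()

# Toy stand-ins: "substitute" is white-box (ours); "target" plays code2vec (black-box).
x = tf.placeholder(tf.float32, [1, 4], name="x")
y = tf.placeholder(tf.int32, [1], name="y")

with tf.variable_scope("substitute"):
    sub_logits = tf.layers.dense(x, 3)
with tf.variable_scope("target"):
    tgt_logits = tf.layers.dense(x, 3)

# White-box step: gradient of the substitute's loss w.r.t. the input.
sub_loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=sub_logits)
grad_x = tf.gradients(sub_loss, x)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x0 = np.random.randn(1, 4).astype(np.float32)
    g = sess.run(grad_x, feed_dict={x: x0, y: [0]})
    x_adv = x0 + 0.1 * np.sign(g)  # FGSM-style perturbation found on the substitute
    # Black-box step: only a forward pass through the target model.
    print(sess.run(tf.argmax(tgt_logits, axis=1), feed_dict={x: x_adv}))
```

The point of the sketch is the division of labor: gradients are taken only from the substitute, and the target is queried with nothing but forward passes.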

Best,
Uri

@sebastianwgm
Author

Hi Uri,

Given that your paper presents a "white-box" attack, we are working on a "black-box" attack based on your adversarial method. Below you may find our comments:

  1. Thank you for the clarification.
  2. Yes, that's it. We were thinking of implementing two separate tf.Session() instances inside the method evaluate_and_adverse(): one for build_test_graph_with_loss() using the substitute model, and one for build_test_graph() using your model. This way, we think, the code2vec model will predict on examples perturbed using our substitute model's adversarial gradients.

We are not sure whether our proposal matches your suggestion. In particular, what would be a good approach to compute the adversarial perturbation in the substitute model and then feed the result to code2vec? (A sketch of the two-graph setup we have in mind follows.)
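
For concreteness, a minimal TF 1.x sketch of that two-graph/two-session idea, with a toy builder standing in for both build_test_graph_with_loss() and build_test_graph(); all names here are hypothetical:

```python
import numpy as np
import tensorflow as tf  # assumes TF 1.x

def build_toy_graph():
    # Hypothetical stand-in for build_test_graph_with_loss() / build_test_graph().
    g = tf.Graph()
    with g.as_default():
        x = tf.placeholder(tf.float32, [None, 4], name="x")
        out = tf.layers.dense(x, 3, name="logits")
        init = tf.global_variables_initializer()
    return g, x, out, init

# One graph and one session per model, so the variables (and restored
# checkpoints) of the substitute and of code2vec never collide.
sub_graph, sub_x, sub_out, sub_init = build_toy_graph()  # substitute
tgt_graph, tgt_x, tgt_out, tgt_init = build_toy_graph()  # code2vec stand-in

sub_sess = tf.Session(graph=sub_graph)
tgt_sess = tf.Session(graph=tgt_graph)
sub_sess.run(sub_init)
tgt_sess.run(tgt_init)

x0 = np.zeros((1, 4), np.float32)
print(sub_sess.run(sub_out, {sub_x: x0}))  # query the substitute (white-box)
print(tgt_sess.run(tgt_out, {tgt_x: x0}))  # query code2vec (forward pass only)
```

In the real code, each model's checkpoint would be restored by a tf.train.Saver created inside its own graph rather than initialized randomly as in this toy.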

Thank you again.

@urialon
Contributor

urialon commented Apr 11, 2023

Hi @sebastianwgm ,
I am not sure.

This project was implemented by @noamyft more than 3 years ago, and it is based on the code2vec code that I wrote more than 5 years ago :-)

So I don't remember the exact TensorFlow details, especially those of sessions and graphs, which are practices that have generally disappeared since then, and I don't want to give concrete implementation advice that I'm not sure about.

Sorry if this doesn't help...

Best,
Uri
