@@ -7,7 +7,7 @@ modeling of programs, composed of three major components:
[Learning to Represent Programs with Graphs](https://openreview.net/forum?id=BJOFETxR-).
More precisely, it implements that paper apart from the speculative
dataflow component ("draw dataflow edges as if a variable would be used
- in this place") and an alias analysis to filter equivalent variables.
+ in this place") and the alias analysis to filter equivalent variables.
* A TensorFlow model for program graphs, following ICLR'18 paper
[Learning to Represent Programs with Graphs](https://openreview.net/forum?id=BJOFETxR-).
This is a refactoring/partial rewrite of the original model, incorporating
@@ -52,15 +52,15 @@ paper), please use this bibtex entry:
The released code provides two components:
* Data Extraction: A C# project extracting graphs and expressions from a corpus
of C# projects. The sources for this are in `DataExtraction/`.
- * Modelling: A python project learning model of expressions, conditionally on
+ * Modelling: A Python project learning model of expressions, conditionally on
the program context. The sources for this are in `Models/`.

Note that the code is a research prototype; the documentation is generally
incomplete and code quality is varying.

## Data Extraction
### Building the data extractor
- To build the data extraction, you need a .Net development environment (i.e.,
+ To build the data extraction, you need a .NET development environment (i.e.,
a working `dotnet` executable). Once this is set up, you can build the
extractor as follows:
```
@@ -93,7 +93,8 @@ consisting of a context graph and a target expression in tree form.
`ExpressionDataExtractor.exe --help` provides some information on
additional options.

- *Note*: Building C# projects is often non-trivial (requiring libraries in the
+ *Note*: Building C# projects is often non-trivial (requiring [NuGet](https://www.nuget.org/)
+ and other libraries in the
path, preparing the build by running helper scripts, etc.). Roughly, data
extraction from a solution `Project.sln` will only succeed if running
`MSBuild Project.sln` succeeds as well.
@@ -120,9 +121,9 @@ Data extraction is split into two projects:

## Models
First, run `pip install -r requirements.txt` to download the needed
- dependencies.
+ dependencies. Note that all code is written in Python 3.

- As the preprocessing of graphs into tensorised form is relatively expensive,
+ As the preprocessing of graphs into tensorised form is relatively computationally expensive,
we use a preprocessing step to do this. This computes vocabularies, the
grammar required to produce the observed expressions and so on, and then
transforms node labels from string form into tensorised form, etc.:
@@ -304,4 +305,4 @@ provided by the bot. You will only need to do this once across all repos using o

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
- contact [[email protected]](mailto:[email protected]) with any additional questions or comments.
+ contact [[email protected]](mailto:[email protected]) with any additional questions or comments.