Commit f6f8150

feat: Add Copilot Instructions
1 parent f5c8097 commit f6f8150

3 files changed

+501
-0
lines changed

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
1+
---
2+
applyTo: 'ql/lib/codeql/bicep/frameworks/*.qll'
3+
---
4+
5+
You are a CodeQL expert with extensive knowledge of the CodeQL language and its shared libraries.
6+
Your task is to generate CodeQL libraries to add support for a framework or resource in the Bicep language.
7+
8+
## Bicep Resource URL
9+
10+
Given a Bicep resource URL, look at the `Resource format` section of the Bicep documentation to determine the resource type and version.
11+
For each resource, generate a CodeQL module and class that represents the resource and its properties.
12+
13+
## Framework
14+
15+
Framework support should be added in the `ql/lib/codeql/bicep/frameworks/Microsoft` directory.
16+
A framework is mapped to a Bicep template, which is a collection of resources and their properties.
17+
Check the existing libraries for examples of how to structure the modules and classes.
18+
Check and use the `ql/lib/codeql/bicep/frameworks/Microsoft/General.qll` module to help with framework support.
19+
20+
Framework support should follow these guidelines:
21+
22+
- A framework file should be created for each Bicep template.
23+
- A module called the framework name should be created.
24+
- Each resource in the Bicep template should be represented as a class extending the `AzureResource` class.
25+
- A module for the resource properties should be created
26+
- A class for the core resource property should be created extending the `ResourceProperties` class.
27+
- Each Property of the resource should be represented as either a basic data type or a custom class.
28+
- Check if the class already exists in the `ql/lib/codeql/bicep/frameworks/Microsoft` module.
29+
- `String`, `Number`, `Boolean`, and `Null` should be used for basic data types.
30+
- Each property class should have a `private parent` field that references the parent resource class.
31+
- The class should declare this field using the following pattern: `private <ResourceClass> parent;`.
32+
- Each property class that isn't a basic data type must extend the `Object` class.
33+
- Each property class must have a `toString()` method that returns a string representation of the property.
34+
- Each property class must have the following predicates:
35+
- `get<PropertyName>()`: Returns the name of the resource.
36+
- `<PropertyName>()`: Returns the native codeql type of the property.
37+
- `has<PropertyName>()`: A predicate that holds if the property exists in the resource.
38+
- `exists(this.get<PropertyName>())`: Returns true if the property exists in the resource.
39+
- Predicates should NOT return an Object type, but rather a class representing the property.
40+
41+
**Example Property class**
42+
43+
```codeql
class <PropertyClass> extends Object {
  private <ResourceClass> parent;

  /**
   * Constructor for the property class.
   */
  <PropertyClass>() {
    this = parent.getProperty("<property-name>")
  }

  string toString() { result = "<PropertyClass>" }
  // All predicates for the property
}
```
Lines changed: 315 additions & 0 deletions
@@ -0,0 +1,315 @@
1+
---
2+
applyTo: 'ql/lib/**/*.qll'
3+
---
4+
5+
You are a CodeQL expert with extensive knowledge of the CodeQL language and its shared libraries.
6+
You have knowledge of Bicep's Syntax and control flow.
7+
Your task is to generate CodeQL libraries and queries based on the provided requirements.
8+
9+
The libraries should be efficient, clear, and follow best practices.
10+
The libraries should be written in the CodeQL language and should be suitable for extracting specific information from codebases.
11+
The code should be modular, reusable, and easy to understand.
12+
Reused classes and predicates should be used where appropriate to avoid duplication of code.
13+
Ensure that the libraries are compatible with the latest version of CodeQL.
14+
15+
Libraries should be well-commented to explain their purpose and functionality.
16+
File headers, modules, classes, and predicates should be documented.
17+
Classes and predicates where the functionality is related to Bicep code should contain examples of Bicep code that the class or predicate is related to.
18+
All Bicep code examples should be in the Bicep language and should be relevant to the class or predicate being documented inside of a CodeQL comment block.
19+
20+
## Abstract Syntax Tree (AST)
21+
22+
The Abstract Syntax Tree (AST) is a representation of the structure of Bicep code.
23+
24+
AST SuperTypes refer to the different abstract types of AST nodes that can be used to represent different parts of the Bicep code.
25+
This includes `Expr`, `Stmts`, `Literals`, `Conditionals`, `Loops`, `Calls`, `Callable`, `Types`, etc.
26+
Internal SuperTypes can be found in the `ast/internal/${SuperType}.qll` file.
27+
Read the SuperType implementation to understand the structure and functionality of the SuperType.
28+
29+
All public classes and predicates related to the Abstract Syntax Tree (AST) should be stored in the `ast/*.qll` directory.
30+
Internal classes and predicates related to the AST should be stored in the `ast/internal/*.qll` directory.
31+
32+
### AstNode Types
33+
34+
All AST nodes should extend a super type such as `TExpr`, `TStmts`, `TLiterals`, `TConditionals`, etc.
35+
All AST nodes should append to the super type defined in the `ast/internal/AstNodes.qll` file.
36+
37+
To implement an AST node type, follow the guidelines below:
38+
39+
- Read the `ast/internal/AstNodes.qll` file to understand how the AstNode type is implemented.
40+
- SuperTypes are `TExpr`, `TStmts`, `TCallable`, `TLiterals`, `TConditionals`, etc.
41+
- Add the AST Node Type (e.g. `T${AstNode}`) to the SuperType
42+
- Update the `AstNodes.qll` file to include the new type.
43+
- Ensure that the new type is consistent with the existing types in the `AstNodes.qll` file
44+
45+
**Example:**
46+
47+
```codeql
class TExpr = T${AstNode1} or T${AstNode2} or T${AstNode3} or ...;
```
50+
51+
52+
### Internal Abstract Syntax Tree Implementations
53+
54+
If you are asked to implement any internal AstNode classes or predicates, you should follow the guidelines below.
55+
Internal classes and predicates should be stored in the `ql/lib/codeql/bicep/ast/internal/${AST_NODE}.qll` file.
56+
57+
The following rules should be followed when implementing AST classes and predicates:
58+
59+
- Internal implementations should never return the TreeSitter class directly, always import and use `Impl` types
60+
- All internal classes should extend a SuperType class which can be found in `ast/internal/${SuperType}.qll` file.
61+
- If the SuperType isn't known, check the `internal/AstNodes.qll`
62+
- Core logic should be in the internal class and reflected in the public facing class
63+
- Named predicates of the Tree Sitter class should be used in the internal implementation.
64+
- If only `getChild(i)` is available, look at the Tree Sitter grammar and check which position the field is in.
65+
- Include all of the correct imports for Impl classes
66+
- Convert TreeSitter classes to CodeQL classes by using the `toTreeSitter()` method.
67+
- Example: `toTreeSitter(result) = ast.<TreeSitterPredicate>()`
68+
- Internal classes can call predicates from the `ast` by using the `toTreeSitter(result) = ast.<predicate>()` syntax.
69+
- Update the internal implementation to directly use predicates from the Tree Sitter module by using the `ast` field in the class
70+
- include import statements for Impl classes, excluding the `Impl`
71+
- For example: `private import ${CLASS}`
72+
73+
**Example getting name field in the TreeSitter module:**
74+
75+
```codeql
class ${AstNode}Impl extends ${AstNodeSuperType} {
  private Bicep::${TREESITTER_NODE} ast;

  ${ReturnType}Impl <predicate_name>() {
    toTreeSitter(result) = ast.get<name>()
  }
}
```
84+
85+
### Public Abstract Syntax Tree Implementations
86+
87+
The public user-facing classes and predicates should be implemented in the `ql/lib/codeql/bicep/ast/${AST_NODE}.qll` file.
88+
The public classes and predicates should follow the guidelines below:
89+
90+
- Public classes should extend a base class such as:
91+
- `Expr`: for expressions
92+
- `Literals`: literals in the language
93+
- `Stmts`: statements in the language
94+
- `Calls`: for function / method calls
95+
- `Callable`: for functions, methods, and lambdas definitions
96+
- `Conditionals`: for if, switch, and other conditional statements
97+
- Public classes should use `instanceof ${AstNode}Impl` to check if the internal implementation is used.
98+
- Implement all abstract predicates from the base class
99+
- Predicates that are defined in the internal implementation should be used in the public implementation.
100+
- Using the `${AstNode}Impl.super.${predicate}` syntax.
101+
- Example: `Type getType() { result = TypeImpl.super.getType() }`
102+
- Public classes should be in the base class
103+
- Public classes should define a `getAPrimaryQlClass()` predicate that returns the primary CodeQL class name.
104+
- Public classes should define a `toString()` predicate that returns a string representation of the class.
105+
- All public classes and predicates should be documented with examples and descriptions.
106+
107+
**Example:**
108+
109+
```codeql
class ${AstNode} extends Expr instanceof ${AstNode}Impl {
  /** Returns the name of the AST node. */
  ${ReturnType} <predicate_name>() {
    result = ${AstNode}Impl.super.<predicate_name>()
  }
}
```
118+
119+
### Variables
120+
121+
Variables are a fundamental part of the AST, CFG and DataFlow analysis in Bicep.
122+
Variables are used to represent data in Bicep code and are used for tracking variable declarations, assignments, and usages.
123+
Variable classes and predicates should be stored in the `ql/lib/codeql/bicep/ast/Variables.qll` file.
124+
125+
Ast classes should not be defined in the `Variables.qll` file and should be defined in their super class files such as `Expr.qll`, `Stmts.qll`, `Literals.qll`, etc.
126+
127+
There are the following types of variables:
128+
- `Variables`: Defining a variable
129+
- `VariableAccess`: Accessing a variable (e.g., reading or writing to a variable)
130+
- `VariableWriteAccess`: Writing to a variable (e.g., assigning a value to a variable)
131+
- `VariableReadAccess`: Reading a variable (e.g., accessing the value of a variable)
132+
- `LocalVariable`: A variable defined in a local scope (e.g., within a function or method)
133+
- `LocalVariableAccess`: Accessing a local variable (e.g., reading or writing to a local variable)
134+
- `LocalVariableWriteAccess`: Writing to a local variable (e.g., assigning a value to a local variable)
135+
- `LocalVariableReadAccess`: Reading a local variable (e.g., accessing the value of a local variable)
136+
137+
## Control Flow Graph (CFG)
138+
139+
The Control Flow Graph (CFG) is a representation of the flow of control in Bicep code.
140+
The CFG is used to analyze the flow of control in Bicep code and identify the relationships between different parts of the code.
141+
142+
Control flow graph classes and predicates should be stored in the `ql/lib/codeql/bicep/cfg` directory.
143+
Internal classes and predicates related to the CFG should be stored in the `ql/lib/codeql/bicep/cfg/internal` directory.
144+
145+
146+
### CFG Node Classification
147+
148+
The AST classes should be classified into the following categories based on their structure and relationships:
149+
150+
- **LeafTree**:
151+
- AST nodes that do not have children, such as literals and identifiers.
152+
- **StandardPostOrderTree**:
153+
- AST nodes that are traversed in a post-order manner
154+
- **StandardPreOrderTree**:
155+
- AST nodes that are traversed in a pre-order manner, meaning the node itself is visited before its children.
156+
- **PostOrderTree**:
157+
- AST nodes that are traversed in a post-order manner, meaning the node itself is visited after its children.
158+
159+
Once the classification is done, the appropriate Control Flow Graph (CFG) class should be created.
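
The difference between the pre-order and post-order categories can be illustrated outside CodeQL. The following is a minimal Python sketch, not part of the Bicep libraries: `Node`, `pre_order`, and `post_order` are hypothetical names used only to show the visiting order for a small expression tree.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy AST node; `label` and `children` are illustrative names only."""

    label: str
    children: list[Node] = field(default_factory=list)


def pre_order(node: Node) -> list[str]:
    """Visit the node itself first, then each child from left to right."""
    order = [node.label]
    for child in node.children:
        order.extend(pre_order(child))
    return order


def post_order(node: Node) -> list[str]:
    """Visit each child from left to right, then the node itself."""
    order: list[str] = []
    for child in node.children:
        order.extend(post_order(child))
    order.append(node.label)
    return order


# A binary expression such as `a + b`: two leaves under one operator node.
tree = Node("+", [Node("a"), Node("b")])

print(pre_order(tree))   # ['+', 'a', 'b']
print(post_order(tree))  # ['a', 'b', '+']
```

In this sketch the leaves `a` and `b` correspond to the LeafTree category, while the operator node is visited last in post-order, which matches the usual evaluation order of an expression whose operands must be computed first.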
160+
161+
### CfgNodes
162+
163+
CfgNodes is a collection of classes that represent an AstNode as a control-flow node.
164+
This is used in the dataflow analysis stage.
165+
166+
Check and validate if the `${AstNode}ChildMapping` or `${AstNode}CfgNode` classes are in the `CfgNodes.qll` file.
167+
Exprs and Stmts should be under their modules such as `ExprNodes` and `StmtsNodes`.
168+
All CfgNodes classes either end with `CfgNode` or `ChildMapping`.
169+
170+
For Expr based AST Nodes:
171+
- Create a `ChildMapping` abstract class inheriting both `ExprChildMapping` and `${AstNode}`
172+
- Override the `relevantChild(AstNode n)` predicate
173+
- Create a class called `${AstNode}CfgNode` which extends the `ExprCfgNode`
174+
- override `e` with the `${AstNode}ChildMapping`
175+
- implement `final override ${AstNode} getExpr() { result = super.getExpr() }`
176+
177+
For all Exprs with left and right operands, implement final predicates returning `ExprCfgNode`.
178+
179+
## DataFlow (DF)
180+
181+
Dataflow is used to track the flow of data through Bicep code.
182+
Dataflow is used to identify how data is passed between different parts of the code, such as variables, functions, and classes.
183+
Dataflow is also used to identify how data is transformed and manipulated within the code.
184+
185+
Read the following documentation to understand how Dataflow works in CodeQL:
186+
- [Dataflow in CodeQL](https://github.com/github/codeql/blob/main/docs/ql-libraries/dataflow/dataflow.md)
187+
188+
Dataflow classes and predicates should be stored in the `ql/lib/codeql/bicep/dataflow` directory.
189+
190+
## Static Single Assignment (SSA)
191+
192+
Static Single Assignment (SSA) is a form of intermediate representation where each variable is assigned exactly once and every variable is defined before it is used.
193+
SSA form is used to simplify data flow analysis and optimization by ensuring that each variable has a single definition point.
194+
195+
In the context of Bicep code analysis, SSA is used to track variable definitions and uses across the control flow graph.
196+
This enables more precise analysis of variable flow, dead code detection, and optimization opportunities.
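
As an illustration of the "assigned exactly once" property, here is a minimal Python sketch that renames straight-line assignments into SSA-style versions. It is a simplification under stated assumptions: `to_ssa` and the tuple format are hypothetical, there is no branching, and therefore no phi nodes, which a real SSA implementation over a control flow graph would need.

```python
def to_ssa(assignments):
    """Rename straight-line assignments so each variable version is defined once.

    `assignments` is a list of (target, [operands]) pairs in program order.
    """
    version = {}  # latest version number per variable name
    result = []
    for target, operands in assignments:
        # Each use refers to the most recent definition; names that were
        # never assigned (for example parameters) are kept unchanged.
        renamed = [f"{op}{version[op]}" if op in version else op for op in operands]
        # Each assignment creates a fresh version of the target variable.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}{version[target]}", renamed))
    return result


# x = a; x = x + b; y = x
program = [("x", ["a"]), ("x", ["x", "b"]), ("y", ["x"])]
print(to_ssa(program))
# [('x1', ['a']), ('x2', ['x1', 'b']), ('y1', ['x2'])]
```

After renaming, the use of `x` in the last assignment unambiguously refers to `x2`, which is the single-definition property that makes variable flow easier to track.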
197+
198+
SSA classes and predicates should be stored in the `SsaImpl.qll` and `Ssa.qll` files.
199+
200+
## Type Tracking (TT)
201+
202+
Type tracking is used to track the types of variables and expressions in Bicep code.
203+
Type tracking classes and predicates should be stored in the `ql/lib/codeql/bicep/typetracking` directory.
204+
205+
## Concepts
206+
207+
Concepts are used to define common patterns in code that can be used to identify vulnerabilities or security issues.
208+
Concept classes and predicates should be stored in the `ql/lib/codeql/bicep/Concepts.qll` file.
209+
210+
## Security Modules
211+
212+
Each category of security issues should have its own module.
213+
These modules should be stored in the `ql/lib/codeql/bicep/security` directory.
214+
215+
Security modules should use `Concepts.qll` classes and modules to define the concepts related to the security issue.
216+
217+
Each module should define abstract `Source`, `Sink`, and `Sanitizer` classes.
218+
219+
**Example:**
220+
221+
```codeql
private import bicep
private import codeql.bicep.dataflow.DataFlow

module ${VulnerabilityModuleName} {
  /** A data flow source for the vulnerability. */
  abstract class Source extends DataFlow::Node { }

  /** A data flow sink for the vulnerability. */
  abstract class Sink extends DataFlow::Node { }

  /** A sanitizer for the vulnerabilities. */
  abstract class Sanitizer extends DataFlow::Node { }

  /** A source for the vulnerability that is related to the threat model. */
  private class RemoteSources extends Source, ThreatModelSource { }

  // TODO: Implement different sources, sinks, and sanitizers for SQL injection vulnerabilities.
}
```
241+
242+
## Documentation
243+
244+
All classes and predicates should be documented using CodeQL comment blocks.
245+
Documentation should include a description of the class or predicate, its purpose, and any relevant examples.
246+
Documentation should be clear, concise, and easy to understand.
247+
248+
Predicates such as `toString`, `getAPrimaryQlClass`, and `getAPrimaryQlModule` should NOT be documented.
249+
250+
## Testing
251+
252+
All tests should be stored in the `ql/tests/library-tests/` directory.
253+
AST, CFG, and Dataflow tests should be stored in the `ql/tests/library-tests/ast`, `ql/tests/library-tests/cfg`, and `ql/tests/library-tests/dataflow` directories respectively.
254+
255+
Each test should be in a separate directory named after the test.
256+
Tests should contain the following files:
257+
258+
- `${TestName}.ql`: The test file containing the CodeQL query.
259+
- This contains `query predicates` testing specific functionality of the library.
260+
- `Inline${TestName}.ql`: An inline test file that contains the CodeQL query.
261+
- This file should contain the inline tests for what we are looking for
262+
- `app.bicep`: A sample Bicep application file that contains the code to be tested.
263+
- This file should contain the Bicep code that is relevant to the test.
264+
- There should be multiple tests in the same file, each test should be separated by a comment block.
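
The layout above can be summarized with a small Python sketch. It assumes one reading of the rules, namely that each test directory sits under its category directory (`ast`, `cfg`, or `dataflow`); `expected_test_files` and the test name `Variables` are hypothetical and used only for illustration.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Root of the library tests, as stated in the Testing section.
LIBRARY_TESTS = PurePosixPath("ql/tests/library-tests")


def expected_test_files(category: str, test_name: str) -> list[PurePosixPath]:
    """List the files a test directory is expected to contain."""
    test_dir = LIBRARY_TESTS / category / test_name
    return [
        test_dir / f"{test_name}.ql",        # query predicates for the library
        test_dir / f"Inline{test_name}.ql",  # inline expectations test
        test_dir / "app.bicep",              # Bicep code under test
    ]


for path in expected_test_files("ast", "Variables"):
    print(path)
```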
265+
266+
### Inline Tests
267+
268+
Inline tests are used to test specific functionality of the library.
269+
Inline tests should be stored in the `Inline${TestName}.ql` file.
270+
For testing AST, CFG, or DataFlow, the query should test the functionality being implemented.
271+
For queries, the inline test should be a query that tests that sources, sinks, and sanitizers are in place.
272+
273+
**Example template:**
274+
275+
```codeql
import bicep
import utils.InlineExpectationsTest

module InlineTest implements TestSig {
  string getARelevantTag() { result = ["${Test1}", "${Test2}"] }

  predicate hasActualResult(Location location, string element, string tag, string value) {
    tag = "${Test1}" and
    exists(Variable var |
      element = var.getName() and
      value = var.toString() and
      location = var.getLocation()
    )
    or
    tag = "${Test2}" and
    exists(Variable var |
      element = var.getName() and
      value = var.toString() and
      location = var.getLocation()
    )
    // Add more tests as needed
  }
}

import MakeTest<InlineTest>
```
302+
303+
Check other inline tests in the `ql/tests/library-tests/ast/` directory for examples of how to implement inline tests.
304+
305+
### Testing commands
306+
307+
Run the following command to run the tests:
308+
309+
```bash
./scripts/run-tests.sh ./src/tests/library-tests/${TEST_DIR}
```
312+
313+
Once run, check the output of the command to ensure that all tests have passed.
314+
If the test has failed, check the test file and the implementation of the class to ensure that the test is correct.
315+
Iterate on the implementation of the class and the test until the test passes.
