|
| 1 | +CN Naming Conventions |
| 2 | +--------------------- |
| 3 | + |
| 4 | +This document describes our (Benjamin and Liz's) current best shot at |
| 5 | +a good set of conventions for naming identifiers in CN, based on |
| 6 | +numerous discussions and worked examples. Everything in the tutorial |
| 7 | +(in src/examples) follows these conventions. Future CN coders are |
| 8 | +encouraged to follow suit. |
| 9 | + |
| 10 | +# Principles |
| 11 | + |
| 12 | +- When similar concepts exist in both C and CN, they should be named |
| 13 | + so that the correspondence is immediately obvious. |
| 14 | + - In particular, the C and CN versions of a given data structure |
| 15 | + should have very similar names. |
| 16 | + |
| 17 | +- In text, we use the modifiers _CN-level_ vs. _C-level_ to |
| 18 | + distinguish the two worlds. |
| 19 | + |
| 20 | +# General conventions |
| 21 | + |
| 22 | +## For new code |
| 23 | + |
| 24 | +When writing both C and CN code from scratch (e.g., in the tutorial), |
| 25 | +aim for maximal correspondence between C-level and CN-level names: |
| 26 | + |
| 27 | +- In general, identifiers are written in `snake_case` (or |
| 28 | + `Snake_Case`) rather than `camelCase` (or `CamelCase`). |
| 29 | + |
| 30 | +- C-level identifiers are `lowercase` wherever possible. |
| 31 | + |
| 32 | +- CN-level identifiers are `Uppercase_Consistently_Throughout`. |
| 33 | + |
| 34 | +- A CN identifier that represents the state of a mutable data |
| 35 | + structure after some function returns should be named the same as |
| 36 | + the starting state of the data structure, with an `_post` at the |
| 37 | + end. |
| 38 | + - E.g., the list copy function takes a linked list `l` |
| 39 | + representing a sequence `L` and leaves `l` at the end pointing |
| 40 | + to a final sequence `L_post` such that `L == L_post`. |
| 41 | + (Moreover, it returns a new sequence `Ret` with `L == Ret`.) |
| 42 | + |
| 43 | +- Predicates that extract some structure from the heap should be named |
| 44 | + the same as the structure they extract, plus the suffix `_At`. |
| 45 | + E.g., the predicate whose result type is `Queue` is called |
| 46 | + `Queue_At`. |
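
As an illustration of the conventions above, here is a minimal sketch of a singly linked list written from scratch. The names `struct list`, `List`, `List_At`, and `length` are assumptions chosen for this example, not taken from the tutorial. The CN annotations sit inside `/*@ ... @*/` comments, so a C compiler ignores them; they have not been run through CN, and the exact annotation syntax (e.g., `Owned`) may differ between CN versions.

```c
#include <stddef.h>

/* C-level names: lowercase snake_case. */
struct list {
  int head;
  struct list *tail;
};

/*@
// CN-level names: uppercase-initial, mirroring the C names.
datatype List {
  Nil {},
  Cons { i32 Head, datatype List Tail }
}

// The predicate that extracts a List from the heap is named List_At.
predicate (datatype List) List_At(pointer p) {
  if (is_null(p)) {
    return Nil {};
  } else {
    take H = Owned<struct list>(p);
    take T = List_At(H.tail);
    return Cons { Head: H.head, Tail: T };
  }
}
@*/

/* The C pointer `l` represents the CN sequence `L`; the state of the
   same structure after the call is named `L_post`. */
unsigned int length(struct list *l)
/*@ requires take L = List_At(l);
    ensures  take L_post = List_At(l);
             L == L_post;
@*/
{
  if (l == NULL) {
    return 0;
  }
  return 1 + length(l->tail);
}
```

Reading the specification, the correspondence is visible from the names alone: `l` is the C pointer, `L` the CN value it represents on entry, and `L_post` the value it represents on return.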
| 47 | + |
| 48 | +## For existing code |
| 49 | + |
| 50 | +In existing C codebases, uppercase-initial identifiers are often used |
| 51 | +for typedefs, structs, and enumerations. We should choose a |
| 52 | +recommended standard convention for such cases -- e.g., "prefix |
| 53 | +CN-level identifiers with `CN` when needed to avoid confusion with |
| 54 | +C-level identifiers". Some experimentation will be needed to see |
| 55 | +which convention we like best; this is left for future discussions. |
| 56 | + |
| 57 | +# Built-ins |
| 58 | + |
| 59 | +This proposal may ultimately suggest changing some built-ins for |
| 60 | +consistency. |
| 61 | + |
| 62 | + - `i32` should change to `I32`, `u64` to `U64` |
| 63 | + - `is_null` to `Is_null` (or `Is_Null`) |
| 64 | + |
| 65 | +*Discussion*: One point against this change is that CN currently tries |
| 66 | +to use names reminiscent of Rust (`i32`, `u64`, etc.). I (BCP) do not |
| 67 | +personally find this argument persuasive -- internal consistency seems |
| 68 | +more important than miscellaneous points of similarity with some other |
| 69 | +language. One way or the other, this will require a global decision. |
| 70 | + |
| 71 | +# Polymorphism |
| 72 | + |
| 73 | +One particularly tricky issue is how to name the "monomorphic |
| 74 | +instances" of "morally polymorphic" functions (i.e., whether to write |
| 75 | +`append__Int` or `append__List_Int` rather than just `append`). On |
| 76 | +one hand, `append__Int` is "more correct". On the other hand, these |
| 77 | +extra annotations can get pretty heavy. |
| 78 | + |
| 79 | +We propose a compromise: |
| 80 | + |
| 81 | +1. If a project needs to use two or more instances of some polymorphic |
| 82 | + type, then the names of the C and CN types, the C and CN functions |
| 83 | + operating over them, and the CN predicates describing them are all |
| 84 | + suffixed with `__xxx`, where `xxx` is the appropriate "type |
| 85 | + argument". E.g., if some codebase uses lists of both signed and |
| 86 | + unsigned 32-bit ints, then we would use names like this: |
| 87 | + - `list__int` / `list__uint` |
| 88 | + - `append__int` / `append__uint` |
| 89 | + - `List__I32` / `List__U32` |
| 90 | + - `Cons__I32` / `Cons__U32` |
| 91 | + - etc. |
| 92 | + |
| 93 | +2. However, if, in a given project, a set of "morally polymorphic" |
| 94 | + type definitions and library functions is *only used at one |
| 95 | + monomorphic instance* (e.g., if some codebase only ever uses lists |
| 96 | + of 32-bit signed ints), then the `__int` or `__I32` annotations are |
| 97 | + omitted. |
| 98 | + |
| 99 | + This convention is used in the CN tutorial, for example. |
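
To make case 1 concrete, the following sketch shows a hypothetical codebase that needs lists of both signed and unsigned 32-bit ints. The C names follow the `list__int` / `append__int` pattern from the list above; the suffixed predicate names `List_At__I32` and `List_At__U32` are an assumption made for this example, and their definitions are omitted. The CN annotations are inside comments and have not been checked with CN.

```c
#include <stddef.h>

/* Two monomorphic instances of a "morally polymorphic" list type. */
struct list__int {
  int head;
  struct list__int *tail;
};

struct list__uint {
  unsigned int head;
  struct list__uint *tail;
};

/* In-place append for lists of signed ints. */
struct list__int *append__int(struct list__int *xs, struct list__int *ys)
/*@ requires take Xs = List_At__I32(xs);   // assumed predicate name
             take Ys = List_At__I32(ys);
    ensures  take Ret = List_At__I32(return);
@*/
{
  if (xs == NULL) {
    return ys;
  }
  xs->tail = append__int(xs->tail, ys);
  return xs;
}

/* The same operation for lists of unsigned ints. */
struct list__uint *append__uint(struct list__uint *xs, struct list__uint *ys)
/*@ requires take Xs = List_At__U32(xs);   // assumed predicate name
             take Ys = List_At__U32(ys);
    ensures  take Ret = List_At__U32(return);
@*/
{
  if (xs == NULL) {
    return ys;
  }
  xs->tail = append__uint(xs->tail, ys);
  return xs;
}
```

Under case 2, a project that only ever used the signed instance would instead write `struct list`, `append`, and `List_At`, with no suffixes.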
| 100 | + |
| 101 | +*Discussion*: One downside of this convention is that it might |
| 102 | +sometimes require some after-the-fact renaming: If a project starts |
| 103 | +out using just lists of signed ints and later needs to introduce lists |
| 104 | +of unsigned ints, the old signed operations will need to be renamed. |
| 105 | +This seems like an acceptable cost for keeping things light. |