CableCodingConvention

CABLE Fortran Coding Convention

Version 1.0

Kai Lu, Jhan Srbinovsky, Claire Carouge, Bernard Pak and Mike Rezny

Last Edited: 5 March 2018 (minor revisions)

Update: 5 March 2018

The text below outlines suggested good practice. Please note that we will soon be transitioning to the coding standards outlined by the UK Met Office (see attachment below). All new code that users wish to have incorporated into the trunk will be expected to comply with this protocol.

More soon ...

Acknowledgements:

The following documents were consulted in the preparation of these coding standards.

Introduction

Programming is often a team effort, with the developers located in different states. Often the people involved in the original programming are not available to help with support and maintenance.

A coding convention is particularly needed with a distributed programming, support, and maintenance staff. With a well-defined coding convention, software is easier to read, understand, debug, maintain, distribute, and support than with no coding convention. To date, no conventions have been specified which apply across all CABLE components.

It is recommended that all CABLE components follow a common coding convention. The goals are to

reduce common causes of bugs,
improve overall software quality, and
reduce differences in coding styles that limit legibility.

This document specifies the coding convention to be used when writing new code files for CABLE. When making extensive changes to an existing file, a rewrite of the whole file should be done to ensure that the file meets the CABLE coding standard and style. All code modifications within an existing file should follow these standards.

If you have any corrections or suggestions, please feel free to email the CABLE Technical Support Group on [email protected] so that we can continue to update the coding convention over time. The coding convention in this document will be periodically reviewed, updated, and extended to ensure maximum benefit.

CABLE Coding Convention

This section outlines the coding convention that developers should adhere to when developing code for inclusion within CABLE. The rules set out in this section aim to improve code readability and ensure that CABLE code is compatible with the Fortran 90 standard.

Layout and Formatting

All code should be written using the free source form. Please restrict code to 80 columns, so that the code can be easily viewed on any editor and screen and can be printed easily on A4 paper.
Never put more than one statement on a line.
Line up the statements, where appropriate, to improve readability.
Comment lines are critical to making a program easy to understand. Use good comments liberally. Make sure that comments and code agree; when the code changes, the comments must change too.
A comment line starts with a single "!" at the beginning of the line. Short comments may be included on the same line as executable code using the "!" character followed by the description; an additional, properly aligned line may be used if the comment does not fit.
The use of comments is required for both large DO loops and large IF blocks that span 15 lines or more.
Comment lines with no text may be used to separate groups of code but generally should not be used without an accompanying comment line containing text. If the logic of the code is such that the programmer believes it should be spaced by a blank comment, it probably needs a comment describing what the next block of code is all about.
Comment lines are never inserted into the middle of a statement that is continued on more than one line.
Use spaces and blank lines where appropriate to format the code to improve readability. Single blank lines should be used before comments and before logical sections of code inside a subroutine. A single blank line, then a complete line (80 characters) of "---", followed by another blank line should be inserted between routines. A single blank line, then a complete line (80 characters) of "...", followed by another blank line should be inserted between TYPE definitions. A blank space should be used after a comma or semicolon in a loop or argument list, around assignment operators, and around conditional operators (except "/=").
Each level of indentation shall be three spaces for the code and comment lines to increase readability (except for CONTAINS, which remains at first column).
Never use tabs within code lines as the tab character is not in the Fortran character set. If the editor inserts tabs automatically, this functionality should be configured to switch off when editing Fortran source files.
The only symbol to be used as a continuation line marker is "&" at the end of a line. It is suggested that these continuation markers should be aligned to aid readability.
Code lines that are continuation lines of assignment statements must begin to the right of the column of the assignment operator. Similarly, continuation lines of subroutine calls and dummy argument lists of subroutine declarations must have the arguments aligned to the right of the "(" character.
Short and simple Fortran statements are easier to read and understand than long and complex ones. Where possible, avoid using continuation lines in a statement.
Routines with large argument lists will contain five variables per line. This applies both to the calling routine and the dummy argument list in the routine being called. The purpose is to simplify matching up the arguments between caller and callee. In rare instances in which five variables will not fit on a single line, a number smaller than five may be used but consistency must be maintained between caller and callee.
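
As a rough illustration of several of the layout rules above (three-space indentation, one declaration per line with a trailing comment, aligned "&" continuation markers, five arguments per line, and a full-width separator between routines), a minimal sketch might look like the following. The program, routine and variable names are hypothetical and are not taken from CABLE.

```fortran
! Hypothetical example: all names and values are illustrative only.
PROGRAM layout_demo

   IMPLICIT NONE

   INTEGER :: n_layers   ! number of soil layers
   REAL    :: t_air      ! air temperature (K)
   REAL    :: t_soil     ! soil temperature (K)
   REAL    :: t_veg      ! vegetation temperature (K)
   REAL    :: t_mean     ! mean of the three temperatures (K)
   REAL    :: t_range    ! warmest minus coldest temperature (K)

   n_layers = 6
   t_air    = 290.0
   t_soil   = 285.0
   t_veg    = 288.0

   ! Continuation lines of an assignment begin to the right of "=".
   ! A statement this short would normally stay on one line; it is
   ! split here only to show the alignment of the "&" markers.
   t_mean = (t_air  +                                             &
             t_soil +                                             &
             t_veg) / 3.0

   t_range = MAX(t_air, t_soil, t_veg) - MIN(t_air, t_soil, t_veg)

   ! Five arguments per line, aligned to the right of "(".
   CALL report_state(n_layers, t_air, t_soil, t_veg, t_mean,      &
                     t_range)

CONTAINS

!-------------------------------------------------------------------------------

   SUBROUTINE report_state(n_layers, t_air, t_soil, t_veg, t_mean, &
                           t_range)

      IMPLICIT NONE

      INTEGER, INTENT(IN) :: n_layers   ! number of soil layers
      REAL,    INTENT(IN) :: t_air      ! air temperature (K)
      REAL,    INTENT(IN) :: t_soil     ! soil temperature (K)
      REAL,    INTENT(IN) :: t_veg      ! vegetation temperature (K)
      REAL,    INTENT(IN) :: t_mean     ! mean temperature (K)
      REAL,    INTENT(IN) :: t_range    ! temperature spread (K)

      ! Report the state, one quantity per line.
      PRINT *, 'layers = ', n_layers
      PRINT *, 'air    = ', t_air
      PRINT *, 'soil   = ', t_soil
      PRINT *, 'veg    = ', t_veg
      PRINT *, 'mean   = ', t_mean
      PRINT *, 'range  = ', t_range

   END SUBROUTINE report_state

END PROGRAM layout_demo
```

The caller and the dummy argument list break after the same five arguments, which is the point of the five-per-line rule: the two lists can be compared line by line.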

Style and Fortran features

The code should adhere to the Fortran 90 standard.
CABLE file names should use the *.F90 file extension for source files (to be compatible with the Unified Model). The file name base should be named the same as the subroutine if it only includes a subroutine definition.
Avoid naming the program units and variables with names that match a keyword in a Fortran statement.
Avoid the use of the EQUIVALENCE statement as the use of equivalenced variables often reduces program clarity, making maintenance more difficult.
Never use the PAUSE statement. Execution of a PAUSE statement requires operator or system-specific intervention to resume execution. In most cases, the same functionality can be achieved as effectively and in a more portable way with the use of an appropriate READ statement that awaits some input data.
Avoid the use of GO TO, assigned GO TO, computed GO TO and arithmetic IF statements. The multiple-branching nature of these statements violates the principles of structured top-down programming and makes code more difficult to understand and maintain. The preferred alternative is to use the appropriate modern constructs such as IF, WHERE, SELECT CASE, etc. Where possible, consider using CYCLE, EXIT or a WHERE construct to simplify complicated DO loops.
Write the comments in simple English and name the program units and variables based on sensible English words.
To improve readability, write the code using the ALL CAPS Fortran keywords approach. This is the style used in most of the examples in this document, where Fortran keywords and intrinsic procedures are written in ALL CAPS. The rest of the code may be written in lowercase. This approach has the advantage that Fortran keywords stand out.
The full version of END should be used at all times to improve readability instead of just END, e.g. END SUBROUTINE <name>, END FUNCTION <name>, etc.
Use of the operators <, >, <=, >=, ==, /= is strongly recommended instead of their older counterparts, .lt., .gt., .le., .ge., .eq., and .ne. The motivation is readability. In general, use the notation: < Blank >< Operator >< Blank >
In general, == or /= should not be used to compare two real numbers. An alternative is to check for a very small absolute difference between the two numbers. Machine precision and round off in computations may make equivalent variables different by a very small fraction.
Avoid the use of "magic numbers", that is, numeric constants hard-wired into the code. These are very hard to maintain and obscure the function of the code. It is much better to assign the "magic number" to a variable or constant with a meaningful name and then to use this throughout the code. In many cases the variable will be assigned in a top-level control routine (e.g. cable_define_types.F90) and passed down via a module. This ensures that all subroutines will use the correct value of the numeric constant and that alteration of it in one place will be propagated to all its occurrences. If the value does not need to be altered whilst the program is running, the assignment should be made using a PARAMETER statement and should not be copied to local variables; this avoids using different values for the same constant or changing it accidentally.
All variables must be declared, and commented with a brief description. This increases understandability and reduces errors caused by misspellings of variables. In general variables should be declared one per line, with a comment field expressed with a "!" character followed by the comment text all on the same line as the declaration. Multiple comment lines describing a single variable are acceptable when necessary. Variables of a like function may be grouped together on a single line.
The variables of a given type should be grouped together when they are declared. These groups should be declared in the order INTEGER, REAL, LOGICAL and then CHARACTER.
Positive logic is usually easier to understand. When using an IF-ELSE-END IF construct, positive logic should be used in the IF test, provided that the positive and the negative blocks are about the same length.
Avoid using numeric LABELs in loops. Loops must terminate with an END DO statement. To improve the clarity of program structure, the developer can optionally add labels or comments to the DO and END DO statements. This is especially helpful when using EXIT.
When writing a REAL literal with an integer value, put a 0 after the decimal point (i.e. 1.0 as opposed to 1.) to improve readability.
Usage of the DIMENSION statement or attribute is required in declaration statements.
For array assignments, except in long loops, it is recommended that developers use array notation to improve readability.
For long, complicated loops in array assignment, explicitly indexed loops should be preferred. In general when using this syntax, the order of the loop indices should reflect the scheme: left-most array index should be in the inner-most loop, followed by the next left-most array index in the next inner-most loop, etc, and the last array index in the outer-most loop. This can lead to increased efficiency (and increased speed) as Fortran stores arrays in column-major form.
When accessing sections of arrays, using the triplet notation on the full array is recommended.
Never access arrays outside of their declared bounds - the results are unpredictable and often undesirable.
Where appropriate, use parentheses to improve readability.
The formatting information can be placed explicitly within the READ, WRITE or PRINT statement, or be assigned to a CHARACTER variable in a PARAMETER statement in the header of the routine for later use in I/O statements. Never place output text within the format specifier: i.e. only format information may be placed within the FMT= part of an I/O statement, all variables and literals, including any character literals, must be 'arguments' of the I/O routine itself. This improves readability by clearly separating what is to be read/written from how to read/write it.
When allocating and deallocating, use a separate ALLOCATE and DEALLOCATE statement for each array.
When using the ALLOCATE statement, ensure that any arrays passed to subroutines have been allocated, even if it is anticipated that they will not be used.
To prevent memory fragmentation, ensure that ALLOCATEs and DEALLOCATEs match in reverse order.
Where possible, an ALLOCATE statement for an ALLOCATABLE array (or a POINTER used as a dynamic array) should be coupled with a DEALLOCATE within the same scope. If an ALLOCATABLE array is a PUBLIC MODULE variable, it is highly desirable that its memory allocation and deallocation are only performed in procedures within the MODULE in which it is declared. Developers may consider writing specific SUBROUTINEs within the MODULE to handle this memory management.
Explicit interface blocks are required between routines if optional or keyword arguments are to be used. They also allow the compiler to check that the type, shape and number of arguments specified in the CALL are the same as those specified in the subprogram itself.
The SAVE statement is used to allow local variables within a subprogram to retain their values between calls to the subprogram. SAVE should not be used indiscriminately. Variables to be saved are explicitly declared by type before they are included in the SAVE statement. Any variable that needs to be saved needs a good explanation of how it is used.
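
Several of the style rules above can be seen together in one short program: named constants instead of magic numbers, a tolerance test instead of == on REAL values, the left-most array index in the inner-most loop, a format held in a CHARACTER PARAMETER with the text passed as I/O arguments, and matching ALLOCATE and DEALLOCATE statements in reverse order. This is only a sketch; every name, size and value in it is hypothetical.

```fortran
! Hypothetical example: names, sizes and values are illustrative only.
PROGRAM style_demo

   IMPLICIT NONE

   INTEGER, PARAMETER :: n_lon = 4   ! number of longitudes
   INTEGER, PARAMETER :: n_lat = 3   ! number of latitudes
   INTEGER :: i                      ! longitude index
   INTEGER :: j                      ! latitude index

   REAL, PARAMETER :: t_freeze  = 273.15   ! freezing point of water (K)
   REAL, PARAMETER :: lat_step  = 10.0     ! warming per latitude row (K)
   REAL, PARAMETER :: tolerance = 1.0E-3   ! tolerance for comparisons (K)
   REAL, DIMENSION(:,:), ALLOCATABLE :: temp         ! temperature (K)
   REAL, DIMENSION(:),   ALLOCATABLE :: zonal_mean   ! row means (K)

   ! Format only: the text to print is passed as an I/O argument.
   CHARACTER(LEN=*), PARAMETER :: fmt_row = '(A, I2, A, 4F8.2)'

   ! One ALLOCATE statement per array.
   ALLOCATE(temp(n_lon, n_lat))
   ALLOCATE(zonal_mean(n_lat))

   ! Left-most index in the inner-most loop (column-major storage).
   fill_lat: DO j = 1, n_lat
      fill_lon: DO i = 1, n_lon
         temp(i, j) = t_freeze + REAL(i) + lat_step * REAL(j)
      END DO fill_lon
   END DO fill_lat

   ! Array notation for a short, simple assignment.
   zonal_mean(:) = SUM(temp(:, :), DIM=1) / REAL(n_lon)

   print_rows: DO j = 1, n_lat
      WRITE(*, fmt_row) 'lat ', j, ': ', temp(:, j)
   END DO print_rows

   ! Compare REAL values through a small absolute difference, not ==.
   IF (ABS((zonal_mean(2) - zonal_mean(1)) - lat_step) < tolerance) THEN
      PRINT *, 'row means differ by lat_step, within tolerance'
   ELSE
      PRINT *, 'row means do not differ by lat_step'
   END IF

   ! DEALLOCATE in the reverse order of the ALLOCATE statements.
   DEALLOCATE(zonal_mean)
   DEALLOCATE(temp)

END PROGRAM style_demo
```

The tolerance is deliberately larger than the single-precision rounding error at these magnitudes; a suitable value depends on the size of the numbers being compared.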

Subroutines, Functions and Modules Guidance

Each subroutine, function and module should be placed in a separate file. Modules may be used to group related variables, subroutines and functions. Each separate file within the source tree should be uniquely named.
Use meaningful English names for subroutines, functions and modules; the names should be written in lowercase. The underscore is allowed in the name.
Headers are an immensely important part of any code as they document what it does, and how it does it. You should write as much of the header as possible before writing the code. New CABLE files should include a header in the following format.

!==============================================================================
! This source code is part of the
! Australian Community Atmosphere Biosphere Land Exchange (CABLE) model.
! This work is licensed under the CABLE Academic User Licence Agreement
! (the "Licence").
! You may not use this file except in compliance with the Licence.
! A copy of the Licence and registration form can be obtained from
! http://www.accessimulator.org.au/cable
! You need to register and read the Licence agreement before use.
! Please contact [email protected] for any questions on
! registration and the Licence.
!
! Unless required by applicable law or agreed to in writing,
! software distributed under the Licence is distributed on an "AS IS" BASIS,
! WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
! See the Licence for the specific language governing permissions and
! limitations under the Licence.
!==============================================================================
!
! Purpose:
!
! Contact:
!
! History:
!
!
!==============================================================================

The "Contact" should be the author, or someone who is familiar with the code. The "History" should note any major changes to the file. A detailed revision history will be available through the CABLE Subversion/Trac site.

Use IMPLICIT NONE in all program units. This forces you to declare all your variables explicitly. This helps to reduce bugs in your program that will otherwise be difficult to track.

Subroutines

Well-structured code makes use of subroutines to separate specific subtasks. In particular, all file I/O should be done through subroutines: this greatly facilitates the portability of the code.
Subroutines should be kept reasonably short, where appropriate, say up to about 200 lines of executable code, but there are start up overheads involved in calling an external subroutine so they should do a reasonable amount of work.
All dummy arguments must include the INTENT clause in their declaration. This is extremely valuable to someone reading the code, and can be checked by compilers.
Arguments should be declared separately from local variables.
Subroutine arguments must be declared in the same order as they appear in the subroutine statement. This order is not random but is determined by intent, variable dimensions and variable type. All input arguments come first, followed by all input/output arguments and finally by all output arguments. Within each intent statement, all scalar arguments must come before all array arguments.
When an error condition occurs, a message describing what went wrong will be printed. The name of the program unit in which the error occurred must be included. If the user wishes to terminate execution, a generic termination routine should be called instead of issuing a Fortran "stop".
A subroutine should avoid containing other subroutines and/or functions.
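
The argument rules above (INTENT on every dummy argument, inputs before input/output before outputs, scalars before arrays within each intent) and the error-handling rule can be sketched as follows. The module, the routines scale_flux and abort_run, and all values are hypothetical; abort_run merely stands in for whatever generic termination routine a project provides. The module and the program share one listing only so that the example is self-contained.

```fortran
! Hypothetical example: names and values are illustrative only.
MODULE flux_utils_mod

   IMPLICIT NONE

   PRIVATE
   PUBLIC :: scale_flux

CONTAINS

!-------------------------------------------------------------------------------

   SUBROUTINE scale_flux(n_points, factor, flux, total)

      ! Arguments: inputs, then input/output, then outputs.
      INTEGER, INTENT(IN) :: n_points                  ! number of points
      REAL,    INTENT(IN) :: factor                    ! scaling factor
      REAL, DIMENSION(n_points), INTENT(INOUT) :: flux ! flux to rescale
      REAL,    INTENT(OUT) :: total                    ! sum of scaled flux

      ! Positive logic in the IF test; the error branch comes second.
      IF (factor >= 0.0) THEN
         flux(:) = factor * flux(:)
         total   = SUM(flux(:))
      ELSE
         CALL abort_run('scale_flux', 'factor must not be negative')
      END IF

   END SUBROUTINE scale_flux

!-------------------------------------------------------------------------------

   SUBROUTINE abort_run(unit_name, message)

      CHARACTER(LEN=*), INTENT(IN) :: unit_name   ! where the error arose
      CHARACTER(LEN=*), INTENT(IN) :: message     ! what went wrong

      ! The only STOP lives here, so callers never issue one directly.
      WRITE(*, '(4A)') 'ERROR in ', unit_name, ': ', message
      STOP

   END SUBROUTINE abort_run

END MODULE flux_utils_mod

PROGRAM subroutine_demo

   USE flux_utils_mod, ONLY: scale_flux

   IMPLICIT NONE

   INTEGER, PARAMETER :: n_points = 3      ! number of points
   REAL,    PARAMETER :: factor = 2.0      ! scaling factor
   REAL, DIMENSION(n_points) :: flux       ! flux values
   REAL :: total                           ! sum of scaled flux

   flux(:) = (/ 1.0, 2.0, 3.0 /)

   CALL scale_flux(n_points, factor, flux, total)

   PRINT *, 'scaled flux = ', flux(:)
   PRINT *, 'total       = ', total

END PROGRAM subroutine_demo
```

Keeping the termination logic in one routine means that any later change to how a run is shut down needs to be made in only one place.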

Functions

All FUNCTION statements include an explicit type specification. A FUNCTION returns a single value. FUNCTION arguments are input only. For clarity and maintenance, FUNCTIONs do not modify or output dummy arguments.
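
A minimal sketch of a function that follows these rules is given below: the result type appears in the FUNCTION statement itself, the only argument is INTENT(IN), and nothing is changed apart from the function result. The names kelvin_to_celsius and function_demo, and the values used, are hypothetical.

```fortran
! Hypothetical example: names and values are illustrative only.
PROGRAM function_demo

   IMPLICIT NONE

   REAL, PARAMETER :: t_freeze = 273.15   ! freezing point of water (K)
   REAL, PARAMETER :: t_sample = 300.0    ! sample temperature (K)
   REAL :: t_celsius                      ! converted temperature (deg C)

   t_celsius = kelvin_to_celsius(t_sample)

   PRINT *, 'temperature (deg C) = ', t_celsius

CONTAINS

!-------------------------------------------------------------------------------

   REAL FUNCTION kelvin_to_celsius(t_kelvin)

      IMPLICIT NONE

      REAL, INTENT(IN) :: t_kelvin   ! temperature (K)

      ! The function result is the only value returned.
      kelvin_to_celsius = t_kelvin - t_freeze

   END FUNCTION kelvin_to_celsius

END PROGRAM function_demo
```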

Modules

Use the ONLY clause in every USE <module> statement to declare all imported symbols (i.e. parameters, variables, type definitions, functions, subroutines, etc.). This makes it easier to locate the source of each symbol, and avoids unintentional access to other PUBLIC symbols within the MODULE.
For code portability, be careful not to USE <module> twice in a routine for the same MODULE, especially where using ONLY. This can lead to compiler warning and error messages.
Where possible, module variables and procedures should be declared PRIVATE. This avoids unnecessary export of symbols, promotes data hiding and may also help the compiler to optimize the code. Anything to be used outside the module should be explicitly declared PUBLIC instead.
The use of derived types is encouraged, to group related variables and their use within Modules.
Global type constants (e.g. pi) should be maintained at a high level within the code (see cable_data.F90) and not duplicated within modules at the code section level; USE <global constants module> instead. Only section specific constants should be maintained at the section level.
Embedding multiple routines within a single file and/or module is encouraged, if any of three conditions hold:
1. if routine B is called by routine A and only by routine A, then the two routines may be included in the same file. This construct has the advantage that inlining B into A is often much easier for compilers if both A and B are in the same file. Practical experience with many compilers has shown that inlining when A and B are in different files is often too complicated for most people to consider worthwhile investigating;
2. multiple routines are CONTAINed in a module for the purpose of providing an implicit interface block. This type of construct is strongly encouraged, as it allows the compiler to perform argument consistency checking across routine boundaries; or
3. the scope of the data defined in the module can be limited to only the routines that are also in the module and this is accomplished with the PRIVATE clause.
If none of the above conditions hold, it is not acceptable to simply glue together a bunch of functions or subroutines in a single file.
Modules are sufficiently fundamental that reserving a special suffix to indicate their names is a sensible and common convention. Most communities have opted to use the _mod suffix for this purpose. In CABLE we have used either _module or _mod.
The files in which the modules reside should be named the same as the modules.
Note that by implication multiple modules are not allowed in a single file.
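
The module rules above can be illustrated with a small sketch: a constants module, a second module that is PRIVATE by default and exports only what it declares PUBLIC, a derived type grouping related data, and USE statements with ONLY clauses. All module, type and procedure names are hypothetical. The two modules and the program share one listing only to keep the example self-contained; under this convention each module would live in its own file named after the module.

```fortran
! Hypothetical example: names and values are illustrative only.
MODULE phys_constants_mod

   IMPLICIT NONE

   PRIVATE

   REAL, PARAMETER, PUBLIC :: pi = 3.14159265   ! ratio of circumference
                                                ! to diameter

END MODULE phys_constants_mod

MODULE circle_mod

   USE phys_constants_mod, ONLY: pi

   IMPLICIT NONE

   ! Everything is PRIVATE unless explicitly declared PUBLIC.
   PRIVATE
   PUBLIC :: circle_type
   PUBLIC :: circle_area

   TYPE circle_type
      REAL :: radius   ! radius (m)
   END TYPE circle_type

CONTAINS

!-------------------------------------------------------------------------------

   REAL FUNCTION circle_area(circle)

      TYPE(circle_type), INTENT(IN) :: circle   ! circle to measure

      circle_area = pi * circle%radius**2

   END FUNCTION circle_area

END MODULE circle_mod

PROGRAM module_demo

   USE circle_mod, ONLY: circle_type, circle_area

   IMPLICIT NONE

   TYPE(circle_type) :: unit_circle   ! circle of radius 1 m

   unit_circle%radius = 1.0

   PRINT *, 'area (m2) = ', circle_area(unit_circle)

END PROGRAM module_demo
```

Because circle_mod is PRIVATE by default, pi is not re-exported through it; a program unit that needs pi has to import it from phys_constants_mod directly, which keeps the origin of each symbol easy to trace.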

Program Template

A basic template for a module, which includes one subroutine and one function within it, has been provided and is available on the CABLE Trac site (https://trac.nci.org.au/trac/cable/wiki/CableDocuments).

The basic templates for other types of program units are similar to that of a MODULE.
