Commit 8dea29e

Merge pull request #795 from AnguseZhang/master
Merge devel to master
2 parents 0767dce + 5d5cb2f commit 8dea29e

78 files changed

+2933
-928
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -41,3 +41,4 @@ dbconfig.json
 .idea/*
 _build
 tests/generator/calypso_test_path
+doc/api/

README.md

Lines changed: 19 additions & 28 deletions
@@ -373,7 +373,6 @@ In `PARAM`, you can specialize the task as you expect.

 "_comment": " that's all ",
 "numb_models": 4,
-"train_param": "input.json",
 "default_training_param": {
 "model": {
 "type_map": [
@@ -499,9 +498,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
 | **use_ele_temp** | int | 0 | Currently only support fp_style vasp. 0(default): no electron temperature. 1: electron temperature as frame parameter. 2: electron temperature as atom parameter.
 | *#Data*
 | init_data_prefix | String | "/sharedext4/.../data/" | Prefix of initial data directories
-| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here.
+| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here. Systems will be detected recursively in the directories.
 | ***sys_format*** | String | "vasp/poscar" | Format of initial data. It will be `vasp/poscar` if not set.
-| init_multi_systems | Boolean | false | If set to `true`, `init_data_sys` directories should contain sub-directories of various systems. DP-GEN will regard all of these sub-directories as inital data systems.
 | init_batch_size | String of integer | [8] | Each number is the batch_size of corresponding system for training in `init_data_sys`. One recommended rule for setting the `sys_batch_size` and `init_batch_size` is that `batch_size` multiply number of atoms of the structure should be larger than 32. If set to `auto`, batch size will be 32 divided by number of atoms. |
 | sys_configs_prefix | String | "/sharedext4/.../data/" | Prefix of `sys_configs`
 | **sys_configs** | List of list of string | [<br />["/sharedext4/.../POSCAR"], <br />["....../POSCAR"]<br />] | Containing directories of structures to be explored in iterations.Wildcard characters are supported here. |
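The `auto` rule in the `init_batch_size` row above can be illustrated with a short Python sketch. This is not DP-GEN's implementation: the function name `auto_batch_size` is hypothetical, and rounding up (so that `batch_size` times the number of atoms reaches at least 32) is an assumption, since the table only says "32 divided by number of atoms".

```python
import math


def auto_batch_size(natoms: int, target: int = 32) -> int:
    """Smallest batch size for which batch_size * natoms is at least `target`.

    Rounding up is an assumption; the README row only states
    "32 divided by number of atoms".
    """
    if natoms <= 0:
        raise ValueError("natoms must be positive")
    return max(1, math.ceil(target / natoms))


# Small systems get larger batches; large systems fall back to 1.
for natoms in (5, 8, 64):
    print(natoms, auto_batch_size(natoms))
```

Under this rounding choice, a 5-atom methane frame would be trained with 7 frames per batch, while any system with 32 atoms or more uses a batch size of 1.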
@@ -515,10 +513,10 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
 | *#Exploration*
 | **model_devi_dt** | Float | 0.002 (recommend) | Timestep for MD |
 | **model_devi_skip** | Integer | 0 | Number of structures skipped for fp in each MD
-| **model_devi_f_trust_lo** | Float or List of float | 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
-| **model_devi_f_trust_hi** | Float or List of float | 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
-| **model_devi_v_trust_lo** | Float or List of float | 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
-| **model_devi_v_trust_hi** | Float or List of float | 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
+| **model_devi_f_trust_lo** | Float or List of float or Dict[str, float] | 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
+| **model_devi_f_trust_hi** | Float or List of float or Dict[str, float] | 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
+| **model_devi_v_trust_lo** | Float or List of float or Dict[str, float] | 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
+| **model_devi_v_trust_hi** | Float or List of float or Dict[str, float] | 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
 | model_devi_adapt_trust_lo | Boolean | False | Adaptively determines the lower trust levels of force and virial. This option should be used together with `model_devi_numb_candi_f`, `model_devi_numb_candi_v` and optionally with `model_devi_perc_candi_f` and `model_devi_perc_candi_v`. `dpgen` will make two sets: 1. From the frames with force model deviation lower than `model_devi_f_trust_hi`, select `max(model_devi_numb_candi_f, model_devi_perc_candi_f*n_frames)` frames with largest force model deviation. 2. From the frames with virial model deviation lower than `model_devi_v_trust_hi`, select `max(model_devi_numb_candi_v, model_devi_perc_candi_v*n_frames)` frames with largest virial model deviation. The union of the two sets is made as candidate dataset|
 | model_devi_numb_candi_f | Int | 10 | See `model_devi_adapt_trust_lo`.|
 | model_devi_numb_candi_v | Int | 0 | See `model_devi_adapt_trust_lo`.|
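The two-set selection described in the `model_devi_adapt_trust_lo` row can be sketched as follows. This is a minimal illustration under stated assumptions, not the code DP-GEN runs: the function names are hypothetical, `n_frames` is taken to be the total number of frames, the product `perc_candi * n_frames` is truncated to an integer, and the percentage arguments default to 0.0 here only for brevity.

```python
from typing import Sequence, Set


def pick_largest_below(devi: Sequence[float], trust_hi: float,
                       numb_candi: int, perc_candi: float) -> Set[int]:
    """Indices of the frames with the largest deviation that stay below trust_hi."""
    n_frames = len(devi)  # assumption: n_frames counts all frames
    n_pick = int(max(numb_candi, perc_candi * n_frames))
    below = [i for i, d in enumerate(devi) if d < trust_hi]
    below.sort(key=lambda i: devi[i], reverse=True)  # largest deviation first
    return set(below[:n_pick])


def adaptive_candidates(f_devi, v_devi, f_trust_hi, v_trust_hi,
                        numb_candi_f, numb_candi_v,
                        perc_candi_f=0.0, perc_candi_v=0.0) -> Set[int]:
    """Union of the force-based and virial-based selections."""
    by_force = pick_largest_below(f_devi, f_trust_hi, numb_candi_f, perc_candi_f)
    by_virial = pick_largest_below(v_devi, v_trust_hi, numb_candi_v, perc_candi_v)
    return by_force | by_virial


f_devi = [0.02, 0.08, 0.20, 0.12, 0.05]
v_devi = [0.10, 0.30, 0.20, 0.05, 0.40]
candidates = adaptive_candidates(f_devi, v_devi, f_trust_hi=0.15, v_trust_hi=0.35,
                                 numb_candi_f=2, numb_candi_v=1)
print(sorted(candidates))
```

In this toy input, frame 2 exceeds the force upper bound and frame 4 exceeds the virial upper bound, so neither can be selected by the corresponding criterion.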
@@ -537,7 +535,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
 | **model_devi_jobs["ensembles"]** | String | "nvt" | Determining which ensemble used in MD, **options** include “npt” and “nvt”. |
 | model_devi_jobs["neidelay"] | Integer | "10" | delay building until this many steps since last build |
 | model_devi_jobs["taut"] | Float | "0.1" | Coupling time of thermostat (ps) |
-| model_devi_jobs["taup"] | Float | "0.5" | Coupling time of barostat (ps)
+| model_devi_jobs["taup"] | Float | "0.5" | Coupling time of barostat (ps) |
+| model_devi_jobs["model_devi_f_trust_lo"] <br> model_devi_jobs["model_devi_f_trust_hi"] <br> model_devi_jobs["model_devi_v_trust_lo"] <br> model_devi_jobs["model_devi_v_trust_hi"] | Float or Dict[str, float] | See global model_devi config above like **model_devi_f_trust_lo**. For dict, should be set for each index in sys_idx, respectively. |
 | *#Labeling*
 | **fp_style** | string | "vasp" | Software for First Principles. **Options** include “vasp”, “pwscf”, “siesta” and “gaussian” up to now. |
 | **fp_task_max** | Integer | 20 | Maximum of structures to be calculated in `02.fp` of each iteration. |
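The trust-level keys above now accept a float, a list, or a `Dict[str, float]`, both globally and per entry of `model_devi_jobs`. A hedged sketch of how such a value might be resolved for one system is given below. The helper names are hypothetical, and two points are assumptions rather than facts from the diff: that dict keys are system indices written as strings, and that a per-job value takes precedence over the global one.

```python
from typing import Dict, List, Union

TrustLevel = Union[float, List[float], Dict[str, float]]


def resolve_trust(level: TrustLevel, sys_idx: int) -> float:
    """Pick the trust level that applies to system `sys_idx`."""
    if isinstance(level, dict):
        # Assumption: dict keys are system indices written as strings.
        return float(level[str(sys_idx)])
    if isinstance(level, (list, tuple)):
        # List form: one entry per index in `sys_configs`.
        return float(level[sys_idx])
    return float(level)


def trust_for_job(jdata: dict, job: dict, key: str, sys_idx: int) -> float:
    """Use the per-job value if present (assumed precedence), else the global one."""
    return resolve_trust(job.get(key, jdata[key]), sys_idx)


jdata = {"model_devi_f_trust_lo": [0.05, 0.10]}
job = {"sys_idx": [1], "model_devi_f_trust_lo": {"1": 0.08}}
print(trust_for_job(jdata, job, "model_devi_f_trust_lo", 1))
```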
@@ -571,7 +570,7 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
 | **user_fp_params** | Dict | |Parameters for cp2k calculation. find detail in manual.cp2k.org. only the kind section must be set before use. we assume that you have basic knowledge for cp2k input.
 | **external_input_path** | String | | Conflict with key:user_fp_params, use the template input provided by user, some rules should be followed, read the following text in detail.
 | *fp_style == ABACUS*
-| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. find detail [Here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory.
+| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. find detail [Here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory. You can also set `KPT` file by adding `k_points` that corresponds to a list of six integers in this dictionary.
 | **fp_orb_files** | List | |List of atomic orbital files. The files should be in pseudopotential directory.
 | **fp_dpks_descriptor** | String | |DeePKS descriptor file name. The file should be in pseudopotential directory.

@@ -1016,7 +1015,6 @@ Here is an example of `param.json` for QM7 dataset:
 "auto"
 ],
 "numb_models": 4,
-"train_param": "input.json",
 "default_training_param": {
 "model": {
 "type_map": [
@@ -1086,7 +1084,6 @@ Here is an example of `param.json` for QM7 dataset:
 },
 "_comment": "that's all"
 },
-"use_clusters": true,
 "fp_style": "gaussian",
 "shuffle_poscar": false,
 "fp_task_max": 1000,
@@ -1109,7 +1106,7 @@ Here is an example of `param.json` for QM7 dataset:
 }
 ```

-Here `pick_data` is the data to simplify and currently only supports `MultiSystems` containing `System` with `deepmd/npy` format, and `use_clusters` should always be `true`. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo`, `e_trust_hi` mean the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` mean the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are as the same as those of generator.
+Here `pick_data` is the directory to data to simplify where the program recursively detects systems `System` with `deepmd/npy` format. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo`, `e_trust_hi` mean the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` mean the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are as the same as those of generator.


 ## Set up machine
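Both `init_data_sys` and `pick_data` are now described as directories in which systems are detected recursively. The diff excerpt does not show how DP-GEN performs that detection, so the following is only a sketch of the idea: the function `find_systems` is hypothetical, and it assumes that a `deepmd/npy` system directory can be recognized by the presence of a `type.raw` file.

```python
import os
import tempfile


def find_systems(root: str) -> list:
    """Return the sorted directories under `root` that contain a `type.raw` file.

    Assumption: a `type.raw` file marks a deepmd/npy system directory.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "type.raw" in filenames:
            found.append(dirpath)
    return sorted(found)


# Build a throwaway tree with two nested systems and detect them.
with tempfile.TemporaryDirectory() as root:
    for rel in (os.path.join("CH4", "deepmd"), os.path.join("C2H6", "run1", "deepmd")):
        system_dir = os.path.join(root, rel)
        os.makedirs(system_dir)
        open(os.path.join(system_dir, "type.raw"), "w").close()
    systems = [os.path.relpath(path, root) for path in find_systems(root)]

print(systems)
```

With detection done by walking the tree, a separate switch such as the removed `init_multi_systems` key is no longer needed to tell a single system apart from a directory of systems.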
@@ -1139,7 +1136,7 @@ an example of new dpgen's machine.json
 ```json
 {
 "api_version": "1.0",
-"train": [
+"train":
 {
 "command": "dp",
 "machine": {
@@ -1163,9 +1160,8 @@ an example of new dpgen's machine.json
 "para_deg": 3,
 "source_list": ["/home/user1234/deepmd.1.2.4.env"]
 }
-}
-],
-"model_devi":[
+},
+"model_devi":
 {
 "command": "lmp",
 "machine":{
@@ -1186,9 +1182,8 @@ an example of new dpgen's machine.json
 "group_size": 5,
 "source_list": ["/home/user1234/deepmd.1.2.4.env"]
 }
-}
-],
-"fp":[
+},
+"fp":
 {
 "command": "vasp_std",
 "machine":{
@@ -1210,7 +1205,6 @@ an example of new dpgen's machine.json
 "source_list": ["~/vasp.env"]
 }
 }
-]
 }
 ```
 note1: the key "local_root" in dpgen's machine.json is always `./`
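The README hunks above change the value of `train`, `model_devi` and `fp` in the example `machine.json` from a one-element list to a plain dict. An existing file written in the old style could be migrated with a small script along these lines. This is a sketch: the helper `unwrap_stage_lists` is hypothetical, and whether DP-GEN still accepts the list form after this commit is not shown in the diff.

```python
import json

STAGES = ("train", "model_devi", "fp")


def unwrap_stage_lists(mdata: dict) -> dict:
    """Return a copy of `mdata` where each one-element stage list becomes a dict."""
    converted = dict(mdata)
    for stage in STAGES:
        value = converted.get(stage)
        if isinstance(value, list):
            if len(value) != 1:
                # Several alternative environments cannot be merged automatically.
                raise ValueError(f"{stage!r} has {len(value)} entries; choose one by hand")
            converted[stage] = value[0]
    return converted


old_style = {
    "api_version": "1.0",
    "train": [{"command": "dp"}],
    "model_devi": [{"command": "lmp"}],
    "fp": [{"command": "vasp_std"}],
}
new_style = unwrap_stage_lists(old_style)
print(json.dumps(new_style, indent=2))
```

Raising an error for lists with more than one entry is a deliberate choice here: silently keeping only the first environment would discard configuration the user wrote on purpose.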
@@ -1222,7 +1216,7 @@ When switching into a new machine, you may modifying the `MACHINE`, according to
 An example for `MACHINE` is:
 ```json
 {
-"train": [
+"train":
 {
 "machine": {
 "batch": "slurm",
@@ -1245,9 +1239,8 @@ An example for `MACHINE` is:
 "qos": "data"
 },
 "command": "USERPATH/dp"
-}
-],
-"model_devi": [
+},
+"model_devi":
 {
 "machine": {
 "batch": "slurm",
@@ -1271,9 +1264,8 @@ An example for `MACHINE` is:
 },
 "command": "lmp_serial",
 "group_size": 1
-}
-],
-"fp": [
+},
+"fp":
 {
 "machine": {
 "batch": "slurm",
@@ -1300,7 +1292,6 @@ An example for `MACHINE` is:
 "command": "vasp_gpu",
 "group_size": 1
 }
-]
 }
 ```
 Following table illustrates which key is needed for three types of machine: `train`,`model_devi` and `fp`. Each of them is a list of dicts. Each dict can be considered as an independent environment for calculation.

conda/meta.yaml

Lines changed: 2 additions & 0 deletions
@@ -28,6 +28,7 @@ requirements:
 - ase
 - GromacsWrapper
 - custodian
+- netCDF4

 run:
 - python >=3.6
@@ -40,6 +41,7 @@ requirements:
 - ase
 - GromacsWrapper
 - custodian
+- netCDF4

 test:
 imports:

doc/conf.py

Lines changed: 27 additions & 5 deletions
@@ -40,17 +40,20 @@

 extensions = [
     'deepmodeling_sphinx',
+    'dargs.sphinx',
     "sphinx_rtd_theme",
     'myst_parser',
     'sphinx.ext.autosummary',
+    'sphinx.ext.viewcode',
+    'sphinxarg.ext',
 ]


 # Tell sphinx what the primary language being documented is.
-primary_domain = 'cpp'
+primary_domain = 'py'

 # Tell sphinx what the pygments highlight language should be.
-highlight_language = 'cpp'
+highlight_language = 'py'

 #
 myst_heading_anchors = 4
@@ -81,9 +84,28 @@
 autosummary_generate = True
 master_doc = 'index'

+intersphinx_mapping = {
+    "python": ("https://docs.python.org/", None),
+    "dargs": ("https://docs.deepmodeling.com/projects/dargs/en/latest/", None),
+    "dpdata": ("https://docs.deepmodeling.com/projects/dpdata/en/latest/", None),
+    "dpdispatcher": ("https://docs.deepmodeling.com/projects/dpdispatcher/en/latest/", None),
+    "ase": ("https://wiki.fysik.dtu.dk/ase/", None),
+    "numpy": ("https://docs.scipy.org/doc/numpy/", None),
+    "pamatgen": ("https://pymatgen.org/", None),
+    "monty": ("https://guide.materialsvirtuallab.org/monty/", None),
+    "paramiko": ("https://docs.paramiko.org/en/stable/", None),
+    "custodian": ("https://cloudcustodian.io/docs/", None),
+    "GromacsWrapper": ("https://gromacswrapper.readthedocs.io/en/latest/", None),
+}
+
+
+def run_apidoc(_):
+    from sphinx.ext.apidoc import main
+    sys.path.append(os.path.join(os.path.dirname(__file__), '..'))
+    cur_dir = os.path.abspath(os.path.dirname(__file__))
+    module = os.path.join(cur_dir, "..", "dpgen")
+    main(['-M', '--tocfile', 'api', '-H', 'DP-GEN API', '-o', os.path.join(cur_dir, "api"), module, '--force'])

-def generate_arginfo(app):
-    subprocess.check_output((sys.executable, "gen_arginfo.py"), universal_newlines=True)

 def setup(app):
-    app.connect('builder-inited', generate_arginfo)
+    app.connect('builder-inited', run_apidoc)

doc/gen_arginfo.py

Lines changed: 0 additions & 5 deletions
This file was deleted.

doc/index.rst

Lines changed: 53 additions & 5 deletions
@@ -2,13 +2,60 @@
 DPGEN's documentation
 ==========================

-.. _parameters::
+.. _overview::

 .. toctree::
    :maxdepth: 2
-   :caption: Parameters
+   :caption: Overview
+
+   overview/cli
+
+
+.. _installation::
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Installation
+
+
+.. _run::
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Run
+
+   run/run-process.rst
+   run/param.rst
+   run/mdata.rst
+
+.. _init::
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Init
+
+   init/init-bulk-mdata
+   init/init-surf-mdata
+   init/init-reaction
+   init/init-reaction-jdata
+   init/init-reaction-mdata
+
+.. _autotest::
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Autotest
+
+
+.. _simplify::
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Simplify
+
+   simplify/simplify-jdata
+   simplify/simplify-mdata

-   run-mdata.rst

 .. _tutorial:

@@ -17,16 +64,17 @@ DPGEN's documentation
    :caption: Tutorial
    :glob:

-   toymodels/*
+   Tutorials <https://tutorials.deepmodeling.com/en/latest/Tutorials/DP-GEN/>


 .. _Contribution:

 .. toctree::
    :maxdepth: 2
-   :caption: Contribution Guild
+   :caption: Contribution Guide

    README.md
+   api/api

 * :ref:`genindex`
 * :ref:`modindex`

doc/init/init-bulk-mdata.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+dpgen init_bulk machine parameters
+==================================
+
+.. dargs::
+   :module: dpgen.data.arginfo
+   :func: init_bulk_mdata_arginfo

doc/init/init-reaction-jdata.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+dpgen init_reaction parameters
+======================================
+
+.. dargs::
+   :module: dpgen.data.arginfo
+   :func: init_reaction_jdata_arginfo

doc/init/init-reaction-mdata.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+dpgen init_reaction machine parameters
+======================================
+
+.. dargs::
+   :module: dpgen.data.arginfo
+   :func: init_reaction_mdata_arginfo

doc/init/init-reaction.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# init_reaction
+
+`dpgen init_reaction` is a workflow to initialize data for reactive systems of small gas-phase molecules. The workflow was introduced in the "Initialization" section of [Energy & Fuels, 2021, 35 (1), 762–769](https://10.1021/acs.energyfuels.0c03211).
+
+To start the workflow, one needs a box containing reactive systems. The following packages are required for each of the steps:
+- Exploring: [LAMMPS](https://github.com/lammps/lammps)
+- Sampling: [MDDatasetBuilder](https://github.com/tongzhugroup/mddatasetbuilder)
+- Labeling: [Gaussian](https://gaussian.com/)
+
+The Exploring step uses LAMMPS [pair_style reaxff](https://docs.lammps.org/latest/pair_reaxff.html) to run a short ReaxMD NVT MD simulation. In the Sampling step, molecular clusters are taken and k-means clustering algorithm is applied to remove the redundancy, which is described in [Nature Communications, 11, 5713 (2020)](https://doi.org/10.1038/s41467-020-19497-z). The Labeling step calculates energies and forces using the Gaussian package.
+
+An example of `reaction.json` is given below:
+
+```{literalinclude} ../../examples/init/reaction.json
+:language: json
+:linenos:
+```
+
+For detailed parameters, see [parameters](init-reaction-jdata.rst) and [machine parameters](init-reaction-mdata.rst).
+
+The generated data can be used to continue DP-GEN concurrent learning workflow. Read [Energy & Fuels, 2021, 35 (1), 762–769](https://10.1021/acs.energyfuels.0c03211) for details.
