README.md
Lines changed: 19 additions & 28 deletions
@@ -373,7 +373,6 @@ In `PARAM`, you can specialize the task as you expect.
373
373
374
374
"_comment": " that's all ",
375
375
"numb_models": 4,
376
-
"train_param": "input.json",
377
376
"default_training_param": {
378
377
"model": {
379
378
"type_map": [
@@ -499,9 +498,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
499
498
| **use_ele_temp** | int | 0 | Currently only supported for fp_style vasp. 0 (default): no electron temperature. 1: electron temperature as frame parameter. 2: electron temperature as atom parameter.
500
499
| *#Data*
501
500
| init_data_prefix | String | "/sharedext4/.../data/" | Prefix of initial data directories
502
-
| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here.
501
+
| ***init_data_sys*** | List of string|["CH4.POSCAR.01x01x01/.../deepmd"] |Directories of initial data. You may use either absolute or relative path here. Systems will be detected recursively in the directories.
503
502
| ***sys_format*** | String | "vasp/poscar" | Format of initial data. It will be `vasp/poscar` if not set.
504
-
| init_multi_systems | Boolean | false | If set to `true`, `init_data_sys` directories should contain sub-directories of various systems. DP-GEN will regard all of these sub-directories as initial data systems.
505
503
| init_batch_size | String of integer |[8]| Each number is the batch_size of the corresponding system for training in `init_data_sys`. One recommended rule for setting `sys_batch_size` and `init_batch_size` is that `batch_size` multiplied by the number of atoms of the structure should be larger than 32. If set to `auto`, the batch size will be 32 divided by the number of atoms. |
506
504
| sys_configs_prefix | String | "/sharedext4/.../data/" | Prefix of `sys_configs`
507
505
|**sys_configs**| List of list of string |[<br />["/sharedext4/.../POSCAR"], <br />["....../POSCAR"]<br />]| Contains the directories of structures to be explored in iterations. Wildcard characters are supported here. |
@@ -515,10 +513,10 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
515
513
| *#Exploration*
516
514
|**model_devi_dt**| Float | 0.002 (recommend) | Timestep for MD |
517
515
| **model_devi_skip** | Integer | 0 | Number of structures skipped for fp in each MD
518
-
|**model_devi_f_trust_lo**| Float or List of float | 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
519
-
|**model_devi_f_trust_hi**| Float or List of float | 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
520
-
|**model_devi_v_trust_lo**| Float or List of float | 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
521
-
|**model_devi_v_trust_hi**| Float or List of float | 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
516
+
|**model_devi_f_trust_lo**| Float or List of float or Dict[str, float]| 0.05 | Lower bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
517
+
|**model_devi_f_trust_hi**| Float or List of float or Dict[str, float]| 0.15 | Upper bound of forces for the selection. If List, should be set for each index in `sys_configs`, respectively. |
518
+
|**model_devi_v_trust_lo**| Float or List of float or Dict[str, float]| 1e10 | Lower bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
519
+
|**model_devi_v_trust_hi**| Float or List of float or Dict[str, float]| 1e10 | Upper bound of virial for the selection. If List, should be set for each index in `sys_configs`, respectively. Should be used with DeePMD-kit v2.x. |
522
520
| model_devi_adapt_trust_lo | Boolean | False | Adaptively determines the lower trust levels of force and virial. This option should be used together with `model_devi_numb_candi_f`, `model_devi_numb_candi_v` and optionally with `model_devi_perc_candi_f` and `model_devi_perc_candi_v`. `dpgen` will make two sets: 1. From the frames with force model deviation lower than `model_devi_f_trust_hi`, select the `max(model_devi_numb_candi_f, model_devi_perc_candi_f*n_frames)` frames with the largest force model deviation. 2. From the frames with virial model deviation lower than `model_devi_v_trust_hi`, select the `max(model_devi_numb_candi_v, model_devi_perc_candi_v*n_frames)` frames with the largest virial model deviation. The union of the two sets is used as the candidate dataset. |
523
521
| model_devi_numb_candi_f | Int | 10 | See `model_devi_adapt_trust_lo`.|
524
522
| model_devi_numb_candi_v | Int | 0 | See `model_devi_adapt_trust_lo`.|
@@ -537,7 +535,8 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
537
535
|**model_devi_jobs["ensembles"]**| String | "nvt" | Determines which ensemble is used in MD. **Options** include “npt” and “nvt”. |
538
536
| model_devi_jobs["neidelay"]| Integer | "10" | delay building until this many steps since last build |
539
537
| model_devi_jobs["taut"]| Float | "0.1" | Coupling time of thermostat (ps) |
540
-
| model_devi_jobs["taup"] | Float | "0.5" | Coupling time of barostat (ps)
538
+
| model_devi_jobs["taup"]| Float | "0.5" | Coupling time of barostat (ps) |
539
+
| model_devi_jobs["model_devi_f_trust_lo"] <br> model_devi_jobs["model_devi_f_trust_hi"] <br> model_devi_jobs["model_devi_v_trust_lo"] <br> model_devi_jobs["model_devi_v_trust_hi"]| Float or Dict[str, float]| | See the global model_devi settings above, such as **model_devi_f_trust_lo**. If a dict, it should be set for each index in `sys_idx`, respectively. |
541
540
| *#Labeling*
542
541
|**fp_style**| string | "vasp" | Software for First Principles. **Options** include “vasp”, “pwscf”, “siesta” and “gaussian” up to now. |
543
542
|**fp_task_max**| Integer | 20 | Maximum of structures to be calculated in `02.fp` of each iteration. |
@@ -571,7 +570,7 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
571
570
| **user_fp_params** | Dict | |Parameters for the cp2k calculation. Find details at manual.cp2k.org. Only the kind section must be set before use. We assume that you have basic knowledge of cp2k input.
572
571
| **external_input_path** | String | | Conflicts with the key `user_fp_params`. Uses the template input provided by the user. Some rules should be followed, so read the following text in detail.
573
572
| *fp_style == ABACUS*
574
-
| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. Find details [here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory.
573
+
| **user_fp_params** | Dict | |Parameters for ABACUS INPUT. Find details [here](https://github.com/deepmodeling/abacus-develop/blob/develop/docs/input-main.md#out-descriptor). If `deepks_model` is set, the model file should be in the pseudopotential directory. You can also set the `KPT` file by adding a `k_points` key to this dictionary, whose value is a list of six integers.
575
574
| **fp_orb_files** | List | |List of atomic orbital files. The files should be in pseudopotential directory.
576
575
| **fp_dpks_descriptor** | String | |DeePKS descriptor file name. The file should be in pseudopotential directory.
577
576
@@ -1016,7 +1015,6 @@ Here is an example of `param.json` for QM7 dataset:
1016
1015
"auto"
1017
1016
],
1018
1017
"numb_models": 4,
1019
-
"train_param": "input.json",
1020
1018
"default_training_param": {
1021
1019
"model": {
1022
1020
"type_map": [
@@ -1086,7 +1084,6 @@ Here is an example of `param.json` for QM7 dataset:
1086
1084
},
1087
1085
"_comment": "that's all"
1088
1086
},
1089
-
"use_clusters": true,
1090
1087
"fp_style": "gaussian",
1091
1088
"shuffle_poscar": false,
1092
1089
"fp_task_max": 1000,
@@ -1109,7 +1106,7 @@ Here is an example of `param.json` for QM7 dataset:
1109
1106
}
1110
1107
```
1111
1108
1112
-
Here `pick_data` is the data to simplify and currently only supports `MultiSystems` containing `System` with `deepmd/npy` format, and `use_clusters` should always be `true`. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo` and `e_trust_hi` give the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` give the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are the same as those of generator.
1109
+
Here `pick_data` is the directory of the data to simplify, in which the program recursively detects systems (`System`) in `deepmd/npy` format. `init_pick_number` and `iter_pick_number` are the numbers of picked frames. `e_trust_lo` and `e_trust_hi` give the range of the deviation of the frame energy, and `f_trust_lo` and `f_trust_hi` give the range of the max deviation of atomic forces in a frame. `fp_style` can only be `gaussian` currently. Other parameters are the same as those of generator.
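To picture how the trust ranges could partition frames, here is a minimal Python sketch. It is not dpgen's implementation: the function name `classify_frame`, the labels, the numeric values, and the rule for combining the energy and force criteria (a frame fails if either deviation reaches its upper bound, and is a candidate if either reaches its lower bound) are all assumptions made for illustration.

```python
from typing import Dict


def classify_frame(e_devi: float, f_devi: float, trust: Dict[str, float]) -> str:
    """Label one frame as 'accurate', 'candidate', or 'failed'.

    e_devi: deviation of the frame energy.
    f_devi: max deviation of atomic forces in the frame.
    Assumption (not taken from dpgen): the energy and force criteria
    are combined with a logical OR at each bound.
    """
    if e_devi >= trust["e_trust_hi"] or f_devi >= trust["f_trust_hi"]:
        return "failed"
    if e_devi >= trust["e_trust_lo"] or f_devi >= trust["f_trust_lo"]:
        return "candidate"
    return "accurate"


# Illustrative values only; very large energy bounds effectively
# switch the energy criterion off in this sketch.
trust = {
    "e_trust_lo": 1e10,
    "e_trust_hi": 1e10,
    "f_trust_lo": 0.25,
    "f_trust_hi": 0.45,
}

labels = [classify_frame(0.0, f, trust) for f in (0.1, 0.3, 0.5)]
print(labels)
```

Under these assumptions, only frames labeled `candidate` would be eligible for picking, with `init_pick_number` or `iter_pick_number` capping how many are taken.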
1113
1110
1114
1111
1115
1112
## Set up machine
@@ -1139,7 +1136,7 @@ an example of new dpgen's machine.json
1139
1136
```json
1140
1137
{
1141
1138
"api_version": "1.0",
1142
-
"train": [
1139
+
"train":
1143
1140
{
1144
1141
"command": "dp",
1145
1142
"machine": {
@@ -1163,9 +1160,8 @@ an example of new dpgen's machine.json
@@ -1210,7 +1205,6 @@ an example of new dpgen's machine.json
1210
1205
"source_list": ["~/vasp.env"]
1211
1206
}
1212
1207
}
1213
-
]
1214
1208
}
1215
1209
```
1216
1210
note1: the key "local_root" in dpgen's machine.json is always `./`
@@ -1222,7 +1216,7 @@ When switching into a new machine, you may modifying the `MACHINE`, according to
1222
1216
An example for `MACHINE` is:
1223
1217
```json
1224
1218
{
1225
-
"train": [
1219
+
"train":
1226
1220
{
1227
1221
"machine": {
1228
1222
"batch": "slurm",
@@ -1245,9 +1239,8 @@ An example for `MACHINE` is:
1245
1239
"qos": "data"
1246
1240
},
1247
1241
"command": "USERPATH/dp"
1248
-
}
1249
-
],
1250
-
"model_devi": [
1242
+
},
1243
+
"model_devi":
1251
1244
{
1252
1245
"machine": {
1253
1246
"batch": "slurm",
@@ -1271,9 +1264,8 @@ An example for `MACHINE` is:
1271
1264
},
1272
1265
"command": "lmp_serial",
1273
1266
"group_size": 1
1274
-
}
1275
-
],
1276
-
"fp": [
1267
+
},
1268
+
"fp":
1277
1269
{
1278
1270
"machine": {
1279
1271
"batch": "slurm",
@@ -1300,7 +1292,6 @@ An example for `MACHINE` is:
1300
1292
"command": "vasp_gpu",
1301
1293
"group_size": 1
1302
1294
}
1303
-
]
1304
1295
}
1305
1296
```
1306
1297
The following table illustrates which keys are needed for the three types of machine: `train`, `model_devi` and `fp`. Each of them is a list of dicts. Each dict can be considered an independent environment for calculation.
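The `MACHINE` examples above give each task type as a single dict, while this description speaks of a list of dicts. A reader's own tooling could accept both shapes with a small normalizing helper. The following is a hypothetical sketch, not dpgen's loader: the names `as_env_list` and `count_environments` are invented for illustration.

```python
from typing import Any, Dict, List, Union

MachineEntry = Union[Dict[str, Any], List[Dict[str, Any]]]


def as_env_list(entry: MachineEntry) -> List[Dict[str, Any]]:
    """Return the environments of one task type as a list of dicts."""
    if isinstance(entry, dict):
        # A single dict is treated as one independent environment.
        return [entry]
    return list(entry)


def count_environments(mdata: Dict[str, MachineEntry]) -> Dict[str, int]:
    """Count environments per task type; raise if a task type is missing."""
    counts = {}
    for key in ("train", "model_devi", "fp"):
        if key not in mdata:
            raise KeyError(f"missing task type: {key}")
        counts[key] = len(as_env_list(mdata[key]))
    return counts


# Hypothetical, heavily trimmed machine data mixing both shapes.
mdata = {
    "train": {"command": "USERPATH/dp"},
    "model_devi": [{"command": "lmp_serial", "group_size": 1}],
    "fp": {"command": "vasp_gpu", "group_size": 1},
}
print(count_environments(mdata))
```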
`dpgen init_reaction` is a workflow to initialize data for reactive systems of small gas-phase molecules. The workflow was introduced in the "Initialization" section of [Energy & Fuels, 2021, 35 (1), 762–769](https://doi.org/10.1021/acs.energyfuels.0c03211).
4
+
5
+
To start the workflow, one needs a box containing reactive systems. The following packages are required for each step:
The Exploring step uses LAMMPS [pair_style reaxff](https://docs.lammps.org/latest/pair_reaxff.html) to run a short ReaxMD NVT MD simulation. In the Sampling step, molecular clusters are taken and the k-means clustering algorithm is applied to remove redundancy, as described in [Nature Communications, 11, 5713 (2020)](https://doi.org/10.1038/s41467-020-19497-z). The Labeling step calculates energies and forces using the Gaussian package.
For detailed parameters, see [parameters](init-reaction-jdata.rst) and [machine parameters](init-reaction-mdata.rst).
20
+
21
+
The generated data can be used to continue the DP-GEN concurrent learning workflow. Read [Energy & Fuels, 2021, 35 (1), 762–769](https://doi.org/10.1021/acs.energyfuels.0c03211) for details.