Welcome to HDNNP’s documentation!¶
What is HDNNP?¶
[Ref] https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
How to install HDNNP¶
Python installation¶
(on Linux)
$ git clone https://github.com/yyuu/pyenv.git ~/.pyenv
(on MacOS)
$ brew install pyenv
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
$ echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
$ source ~/.bash_profile
$ pyenv install 3.6.7
Get source code¶
Note
$ git clone https://github.com/ogura-edu/HDNNP.git
Install dependencies and this program¶
Via pipenv¶
$ cd HDNNP/
$ pyenv local 3.6.7
$ pip install pipenv
$ pipenv install --dev
(activate)
$ pipenv shell
(for example:)
(HDNNP) $ hdnnpy train
(deactivate)
(HDNNP) $ exit
Via anaconda¶
Anaconda can also be installed with pyenv.
$ cd HDNNP/
$ pyenv install anaconda3-xxx
$ pyenv local anaconda3-xxx
$ conda env create -n HDNNP --file condaenv.yaml
(activate)
$ conda activate HDNNP
(for example:)
(HDNNP) $ hdnnpy train
(deactivate)
(HDNNP) $ conda deactivate
Via raw pip¶
You can install all dependent packages manually.
The dependent packages are listed in Pipfile, condaenv.yaml, or requirements.txt.
$ cd HDNNP/
$ pip install PKG1 PKG2 ...
$ pip install --editable .
How to use HDNNP¶
Data generation¶
Pre-processing¶
OUTCAR to .xyz format file, but in the same way you can convert the output of other DFT calculation programs to .xyz format files.
Training¶
Configuration¶
A default configuration file for training is located in examples/training_config.py.
training_config.py consists of several subclasses that inherit from traitlets.config.Configurable:
- c.Application.xxx
- c.TrainingApplication.xxx
- c.DatasetConfig.xxx
- c.ModelConfig.xxx
- c.TrainingConfig.xxx
The following configurations are required; the remaining ones are optional.
- c.DatasetConfig.parameters
- c.ModelConfig.layers
- c.TrainingConfig.data_file
- c.TrainingConfig.batch_size
- c.TrainingConfig.epoch
- c.TrainingConfig.order
- c.TrainingConfig.loss_function
- c.TrainingConfig.interval
- c.TrainingConfig.patients
For details of each setting, see training_config.py
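As a quick sanity check before training, the required settings listed above can be compared against the text of a configuration file. The sketch below is not part of HDNNP: `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names, and the check only looks for uncommented assignment lines rather than loading the file through traitlets.

```python
import re

# Required settings, copied from the list above.
REQUIRED_SETTINGS = [
    "c.DatasetConfig.parameters",
    "c.ModelConfig.layers",
    "c.TrainingConfig.data_file",
    "c.TrainingConfig.batch_size",
    "c.TrainingConfig.epoch",
    "c.TrainingConfig.order",
    "c.TrainingConfig.loss_function",
    "c.TrainingConfig.interval",
    "c.TrainingConfig.patients",
]


def missing_settings(config_text):
    """Return the required settings that have no uncommented assignment line."""
    missing = []
    for name in REQUIRED_SETTINGS:
        # Match "<name> =" at the start of a line, allowing leading spaces or tabs.
        pattern = r"^[ \t]*" + re.escape(name) + r"\s*="
        if not re.search(pattern, config_text, flags=re.MULTILINE):
            missing.append(name)
    return missing


# Hypothetical, deliberately incomplete configuration text.
example = """
c.ModelConfig.layers = [(90, 'tanh'), (1, 'identity')]
# c.TrainingConfig.epoch = 1000
c.TrainingConfig.batch_size = 100
"""
result = missing_settings(example)
print(result)
```

A commented-out assignment, such as the `epoch` line in the example text, is reported as missing.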
Command line interface¶
Execute the following command in the directory where training_config.py is located.
$ hdnnpy train
Note
If c.TrainingConfig.out_dir already exists, the existing files in that directory are overwritten. Set a different c.TrainingConfig.out_dir for each execution.
Prediction¶
Configuration¶
A default configuration file for prediction is located in examples/prediction_config.py.
prediction_config.py consists of several subclasses that inherit from traitlets.config.Configurable:
- c.Application.xxx
- c.PredictionApplication.xxx
- c.PredictionConfig.xxx
The following configurations are required; the remaining ones are optional.
- c.PredictionConfig.data_file
- c.PredictionConfig.order
For details of each setting, see prediction_config.py
Command line interface¶
Execute the following command in the directory where prediction_config.py is located.
$ hdnnpy predict
Post-processing¶
HDNNP-LAMMPS interface program
Command line interface¶
Execute the following command.
$ hdnnpy convert
$ hdnnpy convert -h
Execution example¶
GaN interatomic potential¶
Data file¶
Prepare an .xyz format file that contains several structures with energy and force data.
GaN.xyz
32
Lattice="6.46474316 0.0 0.0 -3.23237159 5.5986318 0.0 0.0 0.0 10.53232454" Properties=species:S:1:pos:R:3:forces:R:3 energy=-194.5164333 tag=CrystalGa16N16 pbc="T T T"
Ga 1.61619000 0.93311000 2.62845000 0.00000300 0.00001200 -0.00570900
Ga 3.23237000 3.73242000 2.62845000 0.00003900 -0.00004700 -0.00571500
Ga 4.84856000 0.93311000 2.62845000 0.00000400 -0.00001100 -0.00563600
Ga -0.00000000 3.73242000 7.89461000 -0.00003800 0.00003200 -0.00564200
Ga 1.61619000 0.93311000 7.89461000 0.00006100 -0.00001800 -0.00571100
Ga 3.23237000 3.73242000 7.89461000 0.00002100 -0.00006400 -0.00572000
Ga 4.84856000 0.93311000 7.89461000 -0.00003200 -0.00002300 -0.00565600
Ga -0.00000000 3.73242000 2.62845000 0.00002100 -0.00002000 -0.00565100
Ga -0.00000000 1.86621000 5.26153000 -0.00006900 0.00005900 -0.00572300
Ga 1.61619000 4.66553000 5.26153000 -0.00002700 0.00008200 -0.00571900
Ga 3.23237000 1.86621000 5.26153000 0.00001800 -0.00001400 -0.00566500
Ga -1.61619000 4.66553000 10.52769000 -0.00002700 -0.00002600 -0.00566900
Ga -0.00000000 1.86621000 10.52769000 -0.00002200 0.00008500 -0.00568700
Ga 1.61619000 4.66553000 10.52769000 0.00000600 -0.00002400 -0.00574300
Ga 3.23237000 1.86621000 10.52769000 0.00000100 0.00007600 -0.00564000
Ga -1.61619000 4.66553000 5.26153000 0.00002200 -0.00000200 -0.00568800
N 1.61619000 0.93311000 4.61253000 0.00005500 -0.00002000 -0.00041000
N 3.23237000 3.73242000 4.61253000 0.00003600 -0.00000900 -0.00037900
N 4.84856000 0.93311000 4.61253000 -0.00004100 0.00000700 -0.00041100
N -0.00000000 3.73242000 9.87869000 -0.00001300 -0.00003500 -0.00042500
N 1.61619000 0.93311000 9.87869000 0.00001200 0.00002900 -0.00040900
N 3.23237000 3.73242000 9.87869000 0.00002700 -0.00006200 -0.00041700
N 4.84856000 0.93311000 9.87869000 -0.00000400 0.00002500 -0.00041500
N -0.00000000 3.73242000 4.61253000 -0.00004500 -0.00000400 -0.00041800
N -0.00000000 1.86621000 1.97945000 0.00000000 -0.00000800 -0.00034400
N 1.61619000 4.66553000 1.97945000 -0.00000200 0.00000500 -0.00033700
N 3.23237000 1.86621000 1.97945000 0.00001700 0.00001600 -0.00036100
N -1.61619000 4.66553000 7.24561000 0.00002800 -0.00002300 -0.00036000
N -0.00000000 1.86621000 7.24561000 -0.00008200 0.00001500 -0.00043200
N 1.61619000 4.66553000 7.24561000 -0.00002200 0.00004200 -0.00040100
N 3.23237000 1.86621000 7.24561000 0.00001900 -0.00001200 -0.00039500
N -1.61619000 4.66553000 1.97945000 0.00000400 -0.00001800 -0.00046000
32
Lattice="6.46474316 0.0 0.0 -3.23237159 5.5986318 0.0 0.0 0.0 10.53232454" Properties=species:S:1:pos:R:3:forces:R:3 energy=-169.96635976 tag=CrystalGa16N16 pbc="T T T"
Ga 1.44265000 1.46790000 2.04947000 -0.95595000 -3.56110800 2.54045000
Ga 2.88538000 4.34404000 2.89380000 4.75932000 -2.04809500 -1.43108200
Ga 4.38372000 0.68215000 2.61606000 0.15090500 6.97113700 2.40537400
Ga 0.47836000 3.95213000 7.90284000 -3.31821700 -0.13409600 -0.21437100
Ga 1.82415000 1.43420000 8.18380000 -0.78327100 -2.70531000 -3.50469000
Ga 3.49351000 3.96284000 7.92622000 1.84595600 -0.42627100 -0.16593100
Ga 5.17229000 0.83662000 7.71745000 -0.46937900 1.21688400 1.11923500
Ga -0.04508000 3.95689000 2.71946000 -3.88117900 -1.84159800 0.64959300
Ga -0.96518000 1.98086000 5.22137000 1.12890800 -1.31857500 -0.37168600
Ga 1.18573000 3.20454000 5.22045000 1.58317800 1.58466500 0.77557000
Ga 2.91073000 1.45415000 5.60119000 -0.29420600 -1.79185700 -2.55652100
Ga -0.99634000 4.45389000 0.07004000 -2.39983600 3.43545000 1.27018200
Ga 0.17764000 1.60544000 10.36435000 6.30208700 4.30252400 2.73199900
Ga 2.35420000 4.13573000 0.39168000 -1.28509600 -0.64262000 -3.92936300
...
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.3605335 tag=CrystalGa2N2 pbc="T T T"
Ga 1.60815000 0.92846000 2.61537000 0.00057000 -0.00032400 -0.00131800
Ga 0.00000000 1.85693000 5.23535000 -0.00055000 0.00030900 -0.00128000
N 1.60815000 0.92846000 4.58958000 0.00038300 -0.00020300 0.00049500
N 0.00000000 1.85693000 1.96960000 -0.00030900 0.00021200 0.00050600
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.04284841 tag=CrystalGa2N2 pbc="T T T"
Ga 1.56998000 1.01961000 2.64712000 0.37879200 -0.65345000 -0.84588100
Ga 0.00233000 1.78610000 5.21359000 1.53422400 0.01126800 0.83092200
N 1.80998000 0.78162000 4.55671000 -1.91098000 0.49960800 -0.07141600
N -0.02338000 1.90257000 1.95274000 0.00855700 0.14604000 0.09234500
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.07370026 tag=CrystalGa2N2 pbc="T T T"
Ga 1.68022000 0.78468000 2.59601000 -0.77026300 1.15126700 0.71828100
Ga -0.04831000 1.97869000 0.01593000 -1.05203000 0.42443800 -0.31339000
N 1.47544000 1.12447000 4.57171000 1.50854300 -1.32922700 -0.04524600
N 0.01431000 1.77059000 1.98155000 0.31937700 -0.24596800 -0.35639000
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.06789171 tag=CrystalGa2N2 pbc="T T T"
Ga 1.55216000 1.03346000 2.59780000 1.76477100 -1.33788800 0.62275500
Ga 0.04645000 1.78043000 0.02483000 -0.39888700 -0.84820500 -0.84426800
N 1.59299000 0.75442000 4.54056000 0.36047300 1.45854900 0.51138400
N 0.06265000 1.88907000 1.95951000 -1.73396900 0.72932900 -0.27762300
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.10933618 tag=CrystalGa2N2 pbc="T T T"
Ga 1.62285000 0.92354000 2.56898000 -0.87387700 0.84344000 1.29437700
Ga -0.00655000 1.82730000 0.04373000 0.63633100 1.10065300 -1.07564600
N 1.65007000 1.03662000 4.56438000 -0.83168500 -1.16592600 0.26072300
N -0.08253000 1.92082000 1.98507000 1.07124400 -0.78418500 -0.47994500
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.15961153 tag=CrystalGa2N2 pbc="T T T"
Ga 1.61929000 0.86275000 2.60668000 0.91655600 0.12884500 0.02524600
Ga -0.02746000 1.90759000 0.02534000 -0.00425900 0.48361500 -1.32527900
N 1.57325000 1.05930000 4.54898000 0.29235100 -0.94998800 0.25695700
N 0.11613000 1.80106000 1.90435000 -1.21017800 0.33509300 1.05032200
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-23.90497111 tag=CrystalGa2N2 pbc="T T T"
Ga 1.57753000 1.01962000 2.53889000 -0.58498700 0.38561600 1.95812800
Ga 0.05221000 1.77667000 0.06084000 -0.50913400 -1.39207300 -1.16507600
N 1.60109000 0.71987000 4.62834000 0.25821000 2.35785600 -0.69708500
N -0.10050000 2.01120000 1.98576000 0.83273600 -1.35617800 -0.10520400
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.17936965 tag=CrystalGa2N2 pbc="T T T"
Ga 1.65588000 0.84325000 2.61391000 -0.48280700 0.58352400 -0.06140200
Ga -0.05236000 1.91994000 0.00989000 1.13163900 0.73695700 -0.46324400
N 1.63413000 1.09260000 4.55873000 -1.08709100 -1.30806300 0.05205700
N -0.00295000 1.80336000 1.93549000 0.44154800 -0.01662100 0.47920500
4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-23.82707164 tag=CrystalGa2N2 pbc="T T T"
...
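Each frame in the listing above follows the same layout: an atom count, a comment line of key=value pairs (Lattice, Properties, energy, tag, pbc), then one line per atom with the species, three position components, and three force components. The sketch below reads that layout using only the standard library. `parse_header` and `read_frames` are hypothetical helper names, not HDNNP functions, and in practice a library reader such as the one in ASE would be the usual choice.

```python
import shlex


def parse_header(comment_line):
    """Split an extended-XYZ comment line into a dict of key -> string value."""
    fields = {}
    # shlex keeps quoted values such as Lattice="..." together as one token.
    for token in shlex.split(comment_line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def read_frames(text):
    """Yield (header, atoms); each atom is (symbol, position, force)."""
    lines = [line for line in text.splitlines() if line.strip()]
    i = 0
    while i < len(lines):
        n_atoms = int(lines[i])
        header = parse_header(lines[i + 1])
        atoms = []
        for row in lines[i + 2 : i + 2 + n_atoms]:
            parts = row.split()
            values = [float(v) for v in parts[1:7]]
            atoms.append((parts[0], values[:3], values[3:]))
        yield header, atoms
        i += 2 + n_atoms


# One 4-atom frame copied from the GaN.xyz listing above.
SAMPLE = """4
Lattice="3.21629013 0.0 0.0 -1.60814507 2.78538896 0.0 0.0 0.0 5.23996246" Properties=species:S:1:pos:R:3:forces:R:3 energy=-24.3605335 tag=CrystalGa2N2 pbc="T T T"
Ga 1.60815000 0.92846000 2.61537000 0.00057000 -0.00032400 -0.00131800
Ga 0.00000000 1.85693000 5.23535000 -0.00055000 0.00030900 -0.00128000
N 1.60815000 0.92846000 4.58958000 0.00038300 -0.00020300 0.00049500
N 0.00000000 1.85693000 1.96960000 -0.00030900 0.00021200 0.00050600
"""

frames = list(read_frames(SAMPLE))
header, atoms = frames[0]
print(header["tag"], float(header["energy"]), len(atoms))
```

The `tag` value is what the training log later uses to group structures into sub datasets such as "CrystalGa2N2".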
Config file¶
training_config.py
(only the necessary parts are shown)
c.TrainingApplication.verbose = True
c.DatasetConfig.parameters = {
    'type1': [
        (5.0,),
    ],
    'type2': [
        (5.0, 0.01, 2.0),
        (5.0, 0.01, 3.2),
        (5.0, 0.01, 3.8),
        (5.0, 0.1, 2.0),
        (5.0, 0.1, 3.2),
        (5.0, 0.1, 3.8),
        (5.0, 1.0, 2.0),
        (5.0, 1.0, 3.2),
        (5.0, 1.0, 3.8),
    ],
    'type4': [
        (5.0, 0.01, -1, 1),
        (5.0, 0.01, -1, 2),
        (5.0, 0.01, -1, 4),
        (5.0, 0.01, 1, 1),
        (5.0, 0.01, 1, 2),
        (5.0, 0.01, 1, 4),
        (5.0, 0.1, -1, 1),
        (5.0, 0.1, -1, 2),
        (5.0, 0.1, -1, 4),
        (5.0, 0.1, 1, 1),
        (5.0, 0.1, 1, 2),
        (5.0, 0.1, 1, 4),
        (5.0, 1.0, -1, 1),
        (5.0, 1.0, -1, 2),
        (5.0, 1.0, -1, 4),
        (5.0, 1.0, 1, 1),
        (5.0, 1.0, 1, 2),
        (5.0, 1.0, 1, 4),
    ],
}
c.DatasetConfig.preprocesses = [
    ('pca', (), {}),
]
c.ModelConfig.layers = [
    (90, 'tanh'),
    (90, 'tanh'),
    (1, 'identity'),
]
c.TrainingConfig.batch_size = 100
c.TrainingConfig.data_file = 'data/GaN.xyz'
c.TrainingConfig.epoch = 1000
c.TrainingConfig.interval = 10
c.TrainingConfig.loss_function = (
    'first_only',
    {}
)
c.TrainingConfig.lr_decay = 1.0e-6
c.TrainingConfig.order = 1
c.TrainingConfig.out_dir = 'output'
c.TrainingConfig.patients = 5
c.TrainingConfig.scatter_plot = True
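The 'type2' and 'type4' lists in the configuration above are full grids over a few values per tuple position, so they can be generated rather than typed out. The sketch below rebuilds the same tuples with `itertools.product`. The variable names (`cutoffs`, `widths`, `shifts`, `signs`, `exponents`) are hypothetical labels for the tuple positions; the configuration itself only shows the values.

```python
from itertools import product

# Hypothetical labels for each tuple position; values are taken from the config above.
cutoffs = [5.0]
widths = [0.01, 0.1, 1.0]
shifts = [2.0, 3.2, 3.8]
signs = [-1, 1]
exponents = [1, 2, 4]

parameters = {
    "type1": [(rc,) for rc in cutoffs],
    # product() varies the last position fastest, matching the order in the config.
    "type2": list(product(cutoffs, widths, shifts)),
    "type4": list(product(cutoffs, widths, signs, exponents)),
}

print(len(parameters["type2"]), len(parameters["type4"]))  # 9 18
```

Because a traitlets configuration file is ordinary Python, a dictionary built this way could be assigned to c.DatasetConfig.parameters directly.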
Command line log¶
Once you have edited the configuration file training_config.py, run the single command hdnnpy train.
$ hdnnpy train
Construct sub dataset tagged as "CrystalGa16N16"
Successfully loaded & made needed symmetry_function dataset from <workdir>/data/CrystalGa16N16/symmetry_function.npz
Successfully loaded & made needed interatomic_potential dataset from <workdir>/data/CrystalGa16N16/interatomic_potential.npz
Initialized PCA parameters for Ga
Feature dimension: 74 => 74
Cumulative contribution rate = 0.9999999403953552
Initialized PCA parameters for N
Feature dimension: 74 => 74
Cumulative contribution rate = 1.0000001192092896
Construct sub dataset tagged as "CrystalGa2N2"
Successfully loaded & made needed symmetry_function dataset from <workdir>/data/CrystalGa2N2/symmetry_function.npz
Successfully loaded & made needed interatomic_potential dataset from <workdir>/data/CrystalGa2N2/interatomic_potential.npz
Saved PCA parameters to <workdir>/output/preprocess/pca.npz.
early stopping: operator is less
epoch iteration main/RMSE/force main/RMSE/total val/main/RMSE/force val/main/RMSE/total
1 14 1.20575 1.20575 1.21576 1.21576
2 28 1.08758 1.08758 1.06121 1.06121
3 42 0.895798 0.895798 0.865482 0.865482
4 55 0.685623 0.685623 0.694789 0.694789
5 69 0.560702 0.560702 0.603832 0.603832
6 83 0.509542 0.509542 0.570984 0.570984
7 97 0.486743 0.486743 0.552533 0.552533
8 110 0.468966 0.468966 0.540375 0.540375
9 124 0.458917 0.458917 0.531327 0.531327
10 138 0.448132 0.448132 0.524466 0.524466
...
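One way to account for the "Feature dimension: 74" reported in the log above is a simple count over the example configuration. The counting rule used here is an assumption, not taken from the HDNNP source: each 'type1' and 'type2' parameter tuple is taken to yield one feature per neighbor element, and each 'type4' tuple one feature per unordered pair of neighbor elements. With the two elements Ga and N, this reproduces 74.

```python
from itertools import combinations_with_replacement

elements = ["Ga", "N"]

# Number of parameter tuples per symmetry function type in the example config.
n_parameter_sets = {"type1": 1, "type2": 9, "type4": 18}

# Assumption: two-body types produce one feature per neighbor element.
two_body = (n_parameter_sets["type1"] + n_parameter_sets["type2"]) * len(elements)

# Assumption: the three-body type produces one feature per unordered
# pair of neighbor elements (Ga-Ga, Ga-N, N-N).
neighbor_pairs = list(combinations_with_replacement(elements, 2))
three_body = n_parameter_sets["type4"] * len(neighbor_pairs)

feature_dimension = two_body + three_body
print(feature_dimension)  # 74
```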
Directory tree¶
After training, the directory tree looks as follows:
workdir
├── data/
│ ├── GaN.xyz
│ ...
├── output/
│ ├── CrystalGa16N16/
│ │ ├── energy.png
│ │ ├── force.png
│ │ └── training.log
│ ├── CrystalGa2N2/
│ │ ├── energy.png
│ │ ├── force.png
│ │ └── training.log
│ ├── master_nnp.npz
│ ├── preprocess/
│ │ └── pca.npz
│ ├── training_config.py
│ └── training_result.yaml
└── training_config.py
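To confirm that a run produced the top-level files shown in the tree above, a small check with `pathlib` is enough. This is a sketch, not an HDNNP utility: `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names, and the per-tag subdirectories are left out because their names depend on the tags in the data file. The demonstration uses a temporary directory in place of a real output directory.

```python
import tempfile
from pathlib import Path

# Files expected under the output directory, taken from the tree above.
EXPECTED_OUTPUTS = [
    "master_nnp.npz",
    "preprocess/pca.npz",
    "training_config.py",
    "training_result.yaml",
]


def missing_outputs(out_dir):
    """Return the expected files that are absent from out_dir."""
    root = Path(out_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


# Demonstration on a temporary directory holding only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "preprocess").mkdir()
    (out / "preprocess" / "pca.npz").touch()
    (out / "master_nnp.npz").touch()
    demo_missing = missing_outputs(out)

print(demo_missing)
```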
Modules¶
Dataset tools¶
DatasetGenerator | Deal out datasets as needed.
HDNNPDataset | Combine and preprocess descriptor and property dataset.
Descriptor datasets¶
SymmetryFunctionDataset | Symmetry function dataset for descriptor of HDNNP.
Property datasets¶
InteratomicPotentialDataset | Interatomic potential dataset for property of HDNNP.
Dataset base classes¶
DescriptorDatasetBase | Base class of atomic structure based descriptor dataset.
PropertyDatasetBase | Base class of atomic structure based property dataset.
Atomic structure¶
AtomicStructure | Wrapper class of ase.Atoms.
Neural network potential models¶
HighDimensionalNNP | High dimensional neural network potential.
MasterNNP | Responsible for managing the parameters of each element.
SubNNP | Feed-forward neural network representing one element or atom.
Pre-processing of dataset¶
PCA | Principal component analysis (PCA).
Scaling | Scale all feature values into the certain range.
Standardization | Scale all feature values to be zero-mean and unit-variance.
Pre-processing base class¶
PreprocessBase | Base class of pre-processing.
Chainer-based training tools¶
Custom training extensions¶
ScatterPlot | Trainer extension to output predictions/labels scatter plots.
set_log_scale | Change y axis scale as log scale.
Loss functions¶
Zeroth | Loss function to optimize 0th-order property.
First | Loss function to optimize 0th and 1st-order property.
Potential | Loss function to optimize 0th property as scalar potential.
Loss function base class¶
loss_function.loss_function_base.LossFunctionBase
Training manager¶
Manager | Context manager to take trainer snapshot and decide whether to train or not.
Updater¶
Updater | Updater for HDNNP training using HighDimensionalNNP and MasterNNP.
How to extend HDNNP¶
Dataset¶
An HDNNP dataset consists of a descriptor dataset and a property dataset.
Descriptor dataset¶
hdnnpy.dataset.descriptor.descriptor_dataset_base.DescriptorDatasetBase
In addition, override the following abstract methods.
- generate_feature_keys
hdnnpy.dataset.HDNNPDataset
- calculate_descriptors
Property dataset¶
hdnnpy.dataset.property.property_dataset_base.PropertyDatasetBase
In addition, override the following abstract method.
- calculate_properties
Preprocess¶
- PCA
- Scaling
- Standardization
Loss function¶
Currently, the following loss functions are implemented for HDNNP training.
- Zeroth
- First
Each loss function uses the 0th- or 1st-order error of the property to optimize HDNNP.
First uses both the 0th- and 1st-order errors of the property, weighted by the parameter mixing_beta, to optimize HDNNP.
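The weighting by mixing_beta can be pictured as a blend of two error terms. The sketch below is an illustration under stated assumptions, not the HDNNP implementation: it assumes RMSE as the error measure (suggested by the column names in the training log) and the convention (1 - mixing_beta) * L0 + mixing_beta * L1. `rmse` and `mixed_loss` are hypothetical names.

```python
import math


def rmse(predictions, labels):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, labels)]
    return math.sqrt(sum(squared) / len(squared))


def mixed_loss(energy_pred, energy_true, force_pred, force_true, mixing_beta):
    """Blend 0th-order (energy) and 1st-order (force) errors.

    Assumed convention: (1 - mixing_beta) * L0 + mixing_beta * L1.
    """
    loss0 = rmse(energy_pred, energy_true)
    loss1 = rmse(force_pred, force_true)
    return (1.0 - mixing_beta) * loss0 + mixing_beta * loss1


# Toy numbers: the energy error is 1.0 and the force error is 2.0.
print(mixed_loss([1.0], [0.0], [2.0, 2.0], [0.0, 0.0], mixing_beta=0.5))  # 1.5
```

Under this convention, mixing_beta = 0 reduces to a pure 0th-order loss and mixing_beta = 1 to a pure 1st-order loss.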
- Potential
It uses the 2nd-order derivative of the descriptor dataset to optimize HDNNP so that it satisfies the following condition:
Then, there is a scalar potential \(\varphi\):
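Read in the standard vector-calculus sense, the condition would be that the predicted force field \(\boldsymbol{F}\) is curl-free, in which case a scalar potential exists. The notation and the sign convention below are assumptions, not taken from the HDNNP source:

\[
\nabla \times \boldsymbol{F} = \boldsymbol{0}
\quad\Longrightarrow\quad
\boldsymbol{F} = -\nabla \varphi
\]

Since \(\boldsymbol{F}\) is already a first derivative of the network output with respect to atomic positions, evaluating its curl requires one more derivative, which would be consistent with the use of the 2nd-order derivative of the descriptors.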
hdnnpy.training.loss_function.loss_function_base.LossFunctionBase.