Skip to content

Latest commit

 

History

History
487 lines (363 loc) · 27.4 KB

user_guide.md

File metadata and controls

487 lines (363 loc) · 27.4 KB

User Guide

EnsembleFlex - Flexibility Analysis of Structure Ensembles

Introduction

Welcome to an exploration of the dynamics within protein structure ensembles!

EnsembleFlex serves as a exploration gateway into the field of protein structural flexibility in an automated manner, where tools are streamlined and analysis is easily reproducible and fast.

Why investigating protein flexibility?

Proteins, the fundamental molecular machines exhibit a remarkable diversity in structure and function. 
While classical structural biology has provided invaluable insights into the static three-dimensional 
architectures of proteins, the dynamic nature of these macromolecules is increasingly recognized as a 
critical determinant of their biological activity.  

The inherent flexibility of proteins manifests at various levels, spanning from local fluctuations 
of individual amino acid side chains to large-scale domain movements. Understanding the dynamic 
behavior of proteins is essential for deciphering the mechanisms underlying their diverse functions, 
including enzymatic catalysis, molecular recognition, and signal transduction. Moreover, protein 
flexibility plays a pivotal role in modulating interactions with other biomolecules, enabling 
proteins to adapt to different cellular environments and respond to external stimuli.

Understanding protein structural flexibility is not just about unraveling molecular intricacies; 
it holds the key to deciphering diseases, designing targeted therapies, and engineering proteins 
with tailored functionalities.

Input

  • <Input path>: The primary input is the path to a folder that contains all PDB structure files to be analysed.
  • Calculations can only be performed on monomers!
  • In case your PDB files contain several chains or are multi-model files they need to be split. (Tools to do this are provided - see below.)

Command-line usage of analysis scripts explained

The following analysis scripts can be used independently from the browser-based interface.
Note that script usage follows a general common rule ([Rscript/python3] path/to/script -i <input> -o <output> <more options>) and is specified in the header of each script.

EnsembleFlex  
└── src  
   ├── analyse_conserved_waters.R
   ├── analyse_flex_binding_site_bio3d.R
   ├── analyse_flex_bio3d.R
   ├── analyse_flex_bio3d_reporting.py
   ├── analyse_flex_prody.py
   ├── analyse_flex_prody_reporting.py
   ├── analyse_flex_sasa_biopython.py
   ├── identify_binding_site_bio3d.R
   ├── predict_flex_binding_site_essa_prody.py
   ├── predict_flex_nma_bio3d.R
   ├── predict_flex_nma_prody.py
   ├── superimpose_binding_site_bio3d.R
   ├── superimpose_bio3d.R
   ├── superimpose_prody.py
   └── tools
       ├── data_on_structure.R
       ├── pdb_delhetatm.py
       ├── pdb_delhetatm_ions.py
       ├── pdb_delresname.py
       ├── run_pdb_del_on_directory.sh
       ├── sort_pdbs_from_dataframe.py
       ├── sort_pdbs_from_file.py
       ├── sort_pdbs_has_gap_in_seq.py
       ├── sort_pdbs_has_ligand.R
       ├── split_pdbs_bio3d.R
       ├── subset_alignment_gap_fraction_drop.py
       └── subset_alignment_has_gap_in_seq.sh

Here is an example of how to execute the whole pipeline from the command line.

Go to your desired output directory:

cd <your/output/directory>

Note that execution from inside your output directory is not required, but otherwise you need to specify explicitly the path to your input and output directories in each command.

Don't forget to activate the conda environment:

conda activate EnsembleFlex

Now you can run the scripts as in the example below!
Just make sure that you use the correct paths in your system
a) to the scripts: You can run the scripts from your output directory by specifying the correct path to the EnsembleFlex scripts (replace ~/path/to with the correct path on your system).
b) to your input and output directory: The following commands assume that your input structure files are located in a folder called pdbs and you want to create an output directory called EnsembleFlex besides your input folder where all outputs will be located.

Pre-processing Tools

  • [optional] Tool: PDB splitting on whole directory (if multiple chains are present in your PDB files)

    Rscript ~/path/to/EnsembleFlex/src/tools/split_pdbs_bio3d.R -i pdbs -o split_pdbs
    
  • [optional] Tool: subset PDBs based on gap occurrence (if you want to analyse "structures_with_gaps" and "structures_without_gaps" separately)

    python3 ~/path/to/EnsembleFlex/src/tools/sort_pdbs_has_gap_in_pdb_seq.py -i pdbs/ -o .
    

Superimpositioning

  # with Bio3D (recommended)
  Rscript ~/path/to/EnsembleFlex/src/superimpose_bio3d.R -i pdbs -o EnsembleFlex

  # [OR] with ProDy
  python3 ~/path/to/EnsembleFlex/src/superimpose_prody.py -i pdbs/ -o EnsembleFlex/

Main Analysis

  • Flexibility Analysis

    # Flexibility Analysis with Bio3D
    Rscript ~/path/to/EnsembleFlex/src/analyse_flex_bio3d.R -i EnsembleFlex/superimposed -o EnsembleFlex/Analysis_Bio3D -n 3
      
    # [optional] Flexibility Analysis with ProDy
    python3 ~/path/to/EnsembleFlex/src/analyse_flex_prody.py -i EnsembleFlex/superimposed -o EnsembleFlex/Analysis_ProDy
    
  • SASA Analysis

    # SASA Analysis with Biopython
    python3 ~/path/to/EnsembleFlex/src/analyse_flex_sasa_biopython.py -i EnsembleFlex/superimposed/ -o EnsembleFlex/Analysis_SASA_Biopython
    Rscript ~/path/to/EnsembleFlex/src/tools/data_on_structure.R -i EnsembleFlex/superimposed -o EnsembleFlex/Analysis_SASA_Biopython -d EnsembleFlex/Analysis_SASA_Biopython/SASA_global.csv -c sd
    
  • Subset clusters

    # Subset clusters based on dataframe (where option `-c ` specifies the column name of the data frame)
    python3 ~/path/to/EnsembleFlex/src/tools/pdb_sorter_from_dataframe.py -i EnsembleFlex/superimposed -o EnsembleFlex/Analysis_Bio3D/Consensus_Clusters -d EnsembleFlex/Analysis_Bio3D/cluster_attributions_with_consensus.csv -c Consensus_Cluster
    

Binding Site Analysis

  • Checking for ligands

    Rscript ~/path/to/EnsembleFlex/src/tools/pdb_sorter_has_ligand.R -i EnsembleFlex/superimposed -o EnsembleFlex
    
  • [optional] Removing crystallization factor DMSO (using resname DMS)

    ~/path/to/EnsembleFlex/src/tools/run_pdb_del_on_directory.sh ~/path/to/EnsembleFlex/src/tools/pdb_delresname.py EnsembleFlex/superimposed DMS [EnsembleFlex/superimposed_no_DMS]
    
  • [optional] Removing Ions

    ~/path/to/EnsembleFlex/src/tools/run_pdb_del_on_directory.sh ~/path/to/EnsembleFlex/src/tools/pdb_delhetatm_ions.py EnsembleFlex/superimposed[_no_DMS] ions [EnsembleFlex/superimposed[_no_DMS]_no_ions]
    
  • [if one of the optional "removing" steps has been performed] Checking again for ligands

    Rscript ~/path/to/EnsembleFlex/src/tools/pdb_sorter_has_ligand.R -i EnsembleFlex/superimposed_no_DMS_no_ions -o EnsembleFlex
    
  • Binding site Identification and Analysis

    Rscript ~/path/to/EnsembleFlex/src/identify_binding_site_bio3d.R -i EnsembleFlex/structures_with_ligand -o EnsembleFlex/BindingSite_ident_Bio3D -d 4.0[3.5]
    
  • [optional] Superimpose only on binding site

    Rscript ~/path/to/EnsembleFlex/src/superimpose_binding_site_bio3d.R -i EnsembleFlex/superimposed -o EnsembleFlex/superimposed_only_on_BindingSite -b EnsembleFlex/BindingSite_ident_Bio3D/binding_site_residue_numbers.txt
    
  • Binding site Flexibility Analysis

    Rscript ~/path/to/EnsembleFlex/src/analyse_flex_binding_site_bio3d.R -i EnsembleFlex/structures_with_ligand -o EnsembleFlex/BindingSite_analysis_Bio3D -b EnsembleFlex/BindingSite_ident_Bio3D/binding_site_residue_numbers.txt
    

Conserved Water Analysis

  • Conserved Water Analysis

    Rscript ~/path/to/EnsembleFlex/src/analyse_conserved_waters.R -i EnsembleFlex/superimposed -o EnsembleFlex/Analysis_Waters
    

Flexibility Prediction

  • NMA (coarse-grained, using C-alpha atoms)

    # NMA and eNMA with Bio3D, remove `-e` to avoid ensemble NMA (eNMA)
    # Note that ensemble NMA should only be performed on small ensembles (<=50 structures), as computationally expensive
    Rscript ~/path/to/EnsembleFlex/src/predict_flex_nma_bio3d.R -i EnsembleFlex/superimposed -o EnsembleFlex/Prediction_NMA_Bio3D -e
    
    # NMA with ProDy
    python3 ~/path/to/EnsembleFlex/src/predict_flex_nma_prody.py -i EnsembleFlex/superimposed -o EnsembleFlex/Prediction_NMA_ProDy
    
  • all-atom NMA (using all heavy atoms)

    # aaNMA and aaeNMA with Bio3D, remove `-e` to avoid all-atom ensemble NMA (aaeNMA)
    # Note that all-atom ensemble NMA should only be performed on mini ensembles (<=5 structures), as computationally very expensive
    Rscript ~/path/to/EnsembleFlex/src/predict_flex_aanma_bio3d.R -i EnsembleFlex/superimposed -o EnsembleFlex/Prediction_aaNMA_Bio3D -e
    
  • Essential Site Scanning Analysis (ESSA)

    python3 ~/path/to/EnsembleFlex/src/predict_flex_binding_site_essa_prody.py -i EnsembleFlex/superimposed/ -o EnsembleFlex/Prediction_BindingSite_ESSA_ProDy -b EnsembleFlex/BindingSite_ident_Bio3D/binding_site_residue_numbers.txt
    

Troubleshooting installation

Problems with finding the right path to Rscript

If you have R already installed on your system the execution of Rscript may default back to your previous R installation. In such case just replace Rscript by ${CONDA_PREFIX}/bin/Rscript in all commands. This will ensure usage of the R installation inside the conda environment.

Problems with package versions and dependencies (Python an R)

In case you have errors during installation indicating incompatibilities of Python or R packages, try first using the provided conda-lock.yml file (as indicated in installation instructions, for macOS or Linux), and if this does not work on your system (e.g. for Windows) try using the environment_versioned.yml file instead of environment.yml. This will potentially result in slightly older package versions, but they are compatible.

Output structure explained

Several types of output files will be generated by the analysis tools. This includes image files (png), text files (txt), data tables (csv/tsv/xlsx), structure files (pdb), PyMol scripts (pml).
Most output will be visualised directly in the browser-based user interface when analysis is performed. All generated output will be saved in the respective output directory with potential further information. The output directory will contain the following structure of subdirectories, where uppercase directory names contain analysis output files of diverse format and lowercase directory names only contain .pdb structure files:

EnsembleFlex_output_folder (defined by you)  
├── superimposed  
├── Analysis_Bio3D  
│   └── pymol_pdbs  
├── Analysis_SASA_Biopython  
├── Analysis_Waters  
├── superimposed_no_ions (optional)  
├── superimposed_no_ions_<MOL-ID>  (optional)  
├── structures_with_ligand  
├── structures_without_ligand  
├── BindingSite_ident_Bio3D  
│   └── structures_labeled_binding_site  
├── BindingSite_analysis_Bio3D  
├── Prediction_NMA_Bio3D  
├── Prediction_NMA_ProDy  
├── Prediction_aaNMA_Bio3D  
└── Prediction_ESSA_ProDy  

Output interpretation help

1. Flexibility analysis: How to interpret RMSD, RMSF, PCA & UMAP analysis?

RMSD & RMSF

### RMSD and RMSF analysis for flexibility analysis

RMSD (Root Mean Square Deviation) and RMSF (Root Mean Square Fluctuation) are widely used metrics for analyzing 
the flexibility and dynamics of protein structures, particularly in the context of structural ensembles 
obtained from techniques like NMR spectroscopy, molecular dynamics (MD) simulations, or other computational 
methods.

#### RMSD

RMSD measures the overall structural deviation between two conformations of a protein. It quantifies the 
average distance between the atoms of a target structure and a reference structure after optimal 
superimposition. A higher RMSD value indicates a greater structural difference between the two conformations.

In the context of protein ensembles, RMSD is often used to:

1. Assess the structural heterogeneity within the ensemble by calculating pairwise RMSDs between all 
conformations[4].
2. Monitor the structural deviation of a protein over time in MD simulations by calculating the RMSD from the 
starting structure at each time step[2].
3. Evaluate the convergence of NMR structure calculations by examining the RMSD between the calculated 
models[1].

#### RMSF

RMSF, on the other hand, measures the fluctuation of individual atoms or residues around their average 
positions within an ensemble or trajectory. It provides a residue-level measure of flexibility, highlighting 
the most mobile regions of the protein.

RMSF is commonly used to:

1. Identify flexible loops, hinges, or other dynamic regions in proteins by analysing the RMSF profile along 
the protein sequence[1][3].
2. Compare the flexibility patterns between different ensembles, such as NMR models, MD trajectories, or 
crystal structures[3].
3. Relate the observed flexibility to functional mechanisms, such as conformational changes or binding 
events[1][3].

It's important to note that RMSD and RMSF provide complementary information. While RMSD captures global 
structural changes, RMSF highlights local flexibility. Additionally, the interpretation of these metrics can 
be influenced by factors like the alignment strategy, reference structure, and ensemble size[2][4].

In summary, RMSD and RMSF are invaluable tools for analyzing protein flexibility and dynamics from structural 
ensembles, enabling researchers to gain insights into conformational heterogeneity, identify mobile regions, 
and relate structural dynamics to protein function[1][2][3][4].

Citations:  
[1] https://europepmc.org/article/MED/24735558  
[2] https://europepmc.org/article/MED/25816325  
[3] https://europepmc.org/article/MED/33803249  
[4] https://europepmc.org/article/MED/20197040  

PCA & UMAP

### Dimension reduction techniques (PCA and UMAP) for flexibility analysis

Dimensionality reduction techniques like Principal Component Analysis (PCA) and Uniform Manifold 
Approximation and Projection (UMAP) are valuable tools for visualizing conformational changes in protein 
structure ensembles.

#### PCA for Visualizing Protein Conformations

Principal Component Analysis (PCA) is a linear technique that projects the high-dimensional conformational 
data (e.g., Cartesian coordinates of atoms) onto a lower-dimensional subspace while preserving the maximum 
variance in the original data. The key steps are:  
1.	Calculate the covariance matrix of the conformational ensemble.
2.	Compute the eigenvectors and eigenvalues of the covariance matrix.
3.	Project the conformations onto the eigenvectors corresponding to the largest eigenvalues (principal 
components).[1][2]

In the context of protein conformations, PCA can be applied to the Cartesian coordinates of atoms, or other 
higher dimensional conformational data, such as distance matrices, or torsion angles, to obtain a reduced 
representation that captures the dominant modes of motion by projecting the data onto the principal 
components that capture the largest variance. PCA facilitates the comparison of conformational ensembles 
from different conditions or mutations by projecting them onto a common subspace defined by the principal 
components, enabling the identification of differences in conformational sampling.[1][2]  
The advantage of PCA is that it provides a straightforward way to visualize the conformational space 
sampled by the ensemble by projecting the high-dimensional conformations onto the first two or three 
principal components. However, PCA has limitations in accurately representing the local structure and 
nonlinear relationships in the data, as it assumes linearity.

#### UMAP for Visualizing Protein Conformations

Uniform Manifold Approximation and Projection (UMAP) is a nonlinear dimensionality reduction technique that 
aims to preserve both local and global structure of the high-dimensional data in the lower-dimensional 
representation. UMAP works by:
1.	Constructing a fuzzy topological representation of the high-dimensional data.
2.	Optimizing a low-dimensional representation that preserves the fuzzy topological structure.  

UMAP can capture complex, nonlinear relationships in the data, making it suited for visualizing the 
conformational space of proteins, which often involves nonlinear motions like bond rotations and global 
structural changes. Studies have shown that UMAP can outperform PCA and other linear techniques in 
preserving the structure of protein conformations in the reduced space, as evidenced by higher Pearson 
correlation coefficients between high- and low-dimensional distances.
UMAP's ability to accurately represent the conformational space can aid in understanding free energy 
barriers, conformational transitions, and other biologically relevant information.

In summary, dimensionality reduction techniques like PCA and UMAP are powerful tools for visualizing and 
understanding the conformational changes and dynamics of protein structure ensembles. While PCA offers a 
linear projection and therefore enables a very straightforward molecular visualisation including atom 
contributions to the principal components (when performed on coordinates), UMAP's nonlinear approach may 
better capture the complex motions involved in protein conformational changes, making it a more accurate 
representation of the conformational space.[3][4]

Citations:  
[1] https://europepmc.org/article/MED/24061923  
[2] https://doi.org/10.3390/j5020021  
[3] https://europepmc.org/article/MED/33973773  
[4] https://europepmc.org/article/MED/27179343  
[5] https://www.bioinformatics.babraham.ac.uk/training/10XRNASeq/Dimension%20Reduction.pdf  

2. How to interpret analysis performed on two different scales: Backbone vs. All-atom

### Analysis performed on two different scales: Backbone vs. All-atom

The values of RMSD (Root Mean Square Deviation) and RMSF (Root Mean Square Fluctuation) can vary significantly 
depending on which atoms are included in the computation, such as considering only the backbone atoms versus all 
heavy atoms. The same holds true for principal component analysis (PCA). The choice of atoms included in the PCA 
of protein structures can significantly impact the resulting principal components and the interpretation of 
conformational changes. Here's how these metrics are affected:

RMSD:
- When calculating RMSD between two protein structures, including only the backbone atoms (e.g., Cα, N, C, O) 
generally results in lower RMSD values compared to including all heavy atoms[1].
- Backbone atoms tend to be more structurally conserved, while side chains can exhibit greater conformational 
variability, leading to higher RMSD values when included[1].
- The difference in RMSD values between backbone-only and all-atom calculations can be substantial for proteins 
with flexible loops, termini, or side chains undergoing conformational changes[1].
- For highly rigid proteins or small conformational changes, the difference may be less pronounced[1].

RMSF:
- When calculating RMSF, including only backbone atoms will capture the fluctuations of the protein's main chain, 
but may underestimate the mobility of side chains.
- Including all heavy atoms in the RMSF calculation provides a more comprehensive picture of residue-level 
flexibility, as it accounts for side chain motions[3].
- The RMSF profile along the protein sequence can differ significantly between backbone-only and all-atom 
calculations, especially for regions with flexible side chains or residues involved in conformational changes[3].
- Backbone-only RMSF may be more suitable for identifying flexible loops or hinges, while all-atom RMSF can reveal 
side chain dynamics relevant for binding or functional mechanisms.

PCA:
- PCA performed on just the backbone atoms (N, Cα, C, O) captures the large-scale, global motions of the protein 
backbone.
- It highlights the dominant conformational changes in the overall protein fold and secondary structure elements.
- Backbone-only PCA may miss local conformational changes involving side-chain motions or loop rearrangements.
- Including all non-hydrogen atoms (backbone and side-chains) in the PCA captures both global and local 
conformational changes.
- It can reveal side-chain motions, loop rearrangements, and subtle local structural variations that may be 
functionally relevant.
- The principal components will reflect a combination of backbone and side-chain motions.
- However, the increased number of atoms can make the PCA computationally more expensive and may introduce noise 
from flexible side-chains.

In general, including only backbone atoms in RMSD and RMSF calculations can provide a more conservative estimate of 
structural differences and flexibility, focusing on the core protein fold[5]. Including all heavy atoms captures 
additional conformational variability from side chains, which can be important for understanding functional 
dynamics or comparing structures with significant side chain rearrangements[5][6]. 
Backbone-only PCA captures the dominant global motions but may miss local conformational changes. Including all 
heavy atoms in the PCA provides a more comprehensive representation of the conformational variability, but it may 
also introduce noise from highly flexible regions. 
The choice depends on the specific research question and the level of detail required in the analysis. If the focus 
is on large-scale conformational changes or global protein motions, backbone-only RMSD/RMSF/PCA may be sufficient. 
However, if local structural variations or side-chain motions are of interest, including all heavy atoms in the 
analysis is recommended.
It's also possible to combine both approaches, using backbone-only PCA to identify the dominant global motions and 
then analysing the side-chain conformations within the subspace defined by the principal components.[4]

Citations:  
[1] https://europepmc.org/article/MED/22323224  
[2] https://europepmc.org/article/MED/28267328  
[3] https://europepmc.org/article/MED/25816325  
[4] https://en.wikipedia.org/wiki/Root_mean_square_deviation_of_atomic_positions  
[5] https://europepmc.org/article/MED/11420442  
[6] https://europepmc.org/article/MED/11430756  
[7] http://thegrantlab.org/bio3d_v2/tutorials/principal-component-analysis  

3. Flexibility prediction: How to interpret Normal Mode Analysis (NMA) results?

Normal Mode Analysis (NMA) based on elastic network models like Anisotropic Network Model (ANM) and Gaussian 
Network Model (GNM) is a widely used computational technique for predicting the intrinsic flexibility of protein 
structures. These coarse-grained models represent proteins as a network of nodes (atoms) connected by springs 
(interactions), allowing for efficient calculations of low-frequency normal modes that describe the large-scale 
collective motions.[1][2][3]

## Elastic Network Model NMA

- ANM and GNM treat proteins as elastic networks, capturing the overall shape and topology rather than atomic details.
- They are computationally inexpensive and provide an approximation of the slowest modes that dominate functional motions.
- The modes describe the directions and relative amplitudes of residue fluctuations around an equilibrium state.
- ANM is anisotropic, considering directional preferences, while GNM is isotropic, treating all directions equivalently.
- These modes can reveal mechanisms of conformational changes, functional motions, and allosteric communication pathways.[1][2][3]

### Differences between ANM and GNM 

The Anisotropic Network Model (ANM) and the Gaussian Network Model (GNM) differ in their treatment of protein 
flexibility prediction as follows:

1. Directionality of motions:
 - ANM is anisotropic, meaning it can predict the directionality of residue fluctuations and provide information on 
  the individual x, y, z components of the deformation vectors associated with each mode.[3][4]
 - GNM is isotropic, predicting only the magnitudes of the fluctuation vectors but not their directions. It cannot 
  resolve the individual x, y, z components.[3][5]

2. Accuracy of fluctuation magnitudes:
 - GNM generally provides a more accurate prediction of the magnitudes of residue fluctuations compared to 
  experimental B-factors or mean-square fluctuations.[3][4]
 - The fluctuation magnitudes predicted by GNM tend to correlate better with experimental data than those from ANM.[3][4]

3. Computational cost:
 - GNM operates in an N-dimensional configurational space, where N is the number of residues, making it 
  computationally more efficient than ANM.[4]
 - ANM operates in a 3N-dimensional space, considering all 3N degrees of freedom, resulting in a higher 
  computational cost.[4]

In summary, GNM is preferred when accurately predicting the distribution and magnitudes of residue fluctuations is 
the primary goal, while ANM is better suited for assessing the directionality and mechanisms of motions, albeit 
with a higher computational expense.[3][4] The choice depends on whether the focus is on the amplitudes or the 
directions of the motions.

## All-Atom NMA

In contrast, all-atom NMA considers the detailed atomic interactions within the protein structure, providing a more 
accurate but computationally expensive description of the potential energy surface and corresponding normal modes.[1]

- All-atom force fields explicitly model bonded and non-bonded atomic interactions.
- This level of detail captures finer motions and higher-frequency modes inaccessible to coarse-grained models.
- All-atom NMA is better suited for studying local flexibility, side-chain motions, and subtle conformational changes.
- However, the high computational cost limits its application to smaller systems or lower modes.[1][2]

Both coarse-grained and all-atom NMA offer valuable insights into protein dynamics, with the choice depending on 
the system size, desired accuracy, and the nature of the motions under investigation.[1][2]

Citations:  
[1] https://europepmc.org/article/MED/33349977  
[2] https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-014-0427-6  
[3] https://europepmc.org/article/MED/31510014  
[4] https://www.taylorfrancis.com/books/edit/10.1201/9781420035070/normal-mode-analysis-ivet-bahar-qiang-cui?refId=7b1a66b2-71fe-4a55-b39a-1212bd8da7b5&context=ubx  
[4] https://www.ccbb.pitt.edu/archive/Faculty/bahar/b14.pdf  
[5] https://link.springer.com/article/10.1007/s42493-024-00097-8  
[6] https://www.sciencedirect.com/science/article/pii/S0166128008005435