Analysis methods

The implemented analysis methods can be distinguished into two categories:

Using the external_manipulation_method decorator, new analysis methods can be easily defined and interfaced with Structure and StructureOperations objects.

All analysis methods store two different types of output data:

  • A complete and extensive set of results is stored in the extras property and is returned by caling the function itself.

  • A representative subset of the data is stored in the attributes property.

As an example we identify the space group of the crystal using the spglib Python package as backend:

[1]:
from aim2dat.strct import Structure

strct = Structure(
    label="GaAs",
    elements=["Ga", "As"],
    positions=[
        [0.0, 0.0, 0.0],
        [1.75, 1.75, 1.75],
    ],
    cell=[
        [0.0, 4.066, 4.0660001],
        [4.066, 0.0, 4.066],
        [4.066, 4.066, 0.0],
    ],
    is_cartesian=False,
    wrap=True,
    pbc=[True, True, True],
)
strct.determine_space_group()

print("Attributes:", strct.attributes["space_group"])
print("Extras:", strct.extras["space_group"])
Attributes: 216
Extras: {'equivalent_sites': [{'wyckoff': 'a', 'symmetry': '-43m', 'sites': [0]}, {'wyckoff': 'c', 'symmetry': '-43m', 'sites': [1]}], 'space_group': {'number': 216, 'international_short': 'F-43m', 'international_full': 'F -4 3 m', 'international': 'F -4 3 m', 'schoenflies': 'Td^2', 'hall_number': 512, 'hall_symbol': 'F -4 2 3', 'choice': '', 'pointgroup_international': '-43m', 'pointgroup_schoenflies': 'Td', 'arithmetic_crystal_class_number': 69, 'arithmetic_crystal_class_symbol': '-43mF', 'centrosymmetric': False}}

In case the same analysis function of the instance is called again with the same parameters, the stored data in the extras property is returned instead of performing the same analysis again. This feature is very handy, but may also demand large memory capacities. Thus, it can be switched off via the store_calculated_properties property.

Internal analysis methods

The internal analysis methods are implemented as functions in the Structure and StructureOperations classes. In case of a single Structure object the function can be directly called (in this example the distance between the two sites is calculated via calculate_distance):

[2]:
strct.calculate_distance(0, 1)
[2]:
3.521259277353771

As for a StructureOperations object, the function can be called on one structure or several structures via single key or list of keys, respectively:

[3]:
from aim2dat.strct import StructureOperations

strct_op = StructureOperations(structures=[strct])
strct_op[0].calculate_distance(0, 1)
[3]:
3.521259277353771

Warning

It should to be stressed that a single key, e.g., strct_op[0].calculate_distance(0, 1), and a list of keys, e.g., strct_op[[0]].calculate_distance(0, 1), results in different output.

A list of all internal analysis methods is printed via:

[4]:
strct.list_analysis_methods()
[4]:
['determine_point_group',
 'determine_space_group',
 'calculate_distance',
 'calculate_angle',
 'calculate_dihedral_angle',
 'calculate_voronoi_tessellation',
 'calculate_coordination',
 'calculate_ffingerprint']

Distances and angles

The functions calculate_distance, calculate_angle and calculate_dihedral_angle calculate the distance, angle and dihedral angle between two, three and four atomic sites, respectively. Thus, the first two, three and four arguments are given by the site indices.

The calculate_distance function has a more enhanced interface: If the first parameter is a list and the second parameter is None, all index combinations of the list are returned:

[5]:
strct.calculate_distance([0, 1], None)
[5]:
{(0, 1): 3.521259277353771}

The keyword argument use_supercell returns a list of distances including periodic replica of the atom sites up to a distance defined by the cutoff distance r_max:

[6]:
strct.calculate_distance(0, 1, use_supercell=True, r_max=5.0)
[6]:
(3.521259277353771, 3.5212593062212845, 3.5212593062212845, 3.5212593350887986)

Point group

The function determine_point_group indentifies the point group of non-periodic structures.

Warning

This method has to be considered as experimental. The implementation is not yet complete and may not work for all point groups correctly.

Space group

The function determine_space_group identifies symmetry operations, the point group and the space group of periodic structures making use of the spglib Python package. Additionally, a list of equivalent sites is returned as well. Optional output includes a list of symmetry operations, the primitive and the standardized unit cell, controlled by the keyword arguments return_primitive_structure, return_standardized_structure and return_sym_operations.

[7]:
strct.determine_space_group()
[7]:
{'equivalent_sites': [{'wyckoff': 'a', 'symmetry': '-43m', 'sites': [0]},
  {'wyckoff': 'c', 'symmetry': '-43m', 'sites': [1]}],
 'space_group': {'number': 216,
  'international_short': 'F-43m',
  'international_full': 'F -4 3 m',
  'international': 'F -4 3 m',
  'schoenflies': 'Td^2',
  'hall_number': 512,
  'hall_symbol': 'F -4 2 3',
  'choice': '',
  'pointgroup_international': '-43m',
  'pointgroup_schoenflies': 'Td',
  'arithmetic_crystal_class_number': 69,
  'arithmetic_crystal_class_symbol': '-43mF',
  'centrosymmetric': False}}

Coordination

The calculation of the coordination environment of atom sites in molecules or crystals is relevant for a manifold of use cases, ranging from the parametrization of classical force fields and the generation of graphs for neural networks to rationalizing chemical trends and determining the oxidation state of the site. Unfortunately, the one single algorithm to do the job does not exist, and the coordination environment itself can (in many cases) not be uniquely defined. The function calculate_coordination implements several algorithms selected via the key word argument method (an overview and comparison of a subset of the implemented methods is also given in doi:10.1021/acs.inorgchem.0c02996). The following options are supported:

  • minimum_distance: The method determines the coordination based on the distance \(d_\mathrm{min}\) to the nearest neighbouring atom. A neighbouring atom is coordinated if the condition

    \[d_{ij} < d_\mathrm{min}\cdot (1.0 + \Delta d)\]

    is True, with \(d_{ij}\) being the distance between the two atoms and \(\Delta d\) is a tolerance threshold given by the key word argument distance_delta.

  • n_nearest_neighbours: The method considers the \(n\) closest atoms as coordinated.

  • atomic_radius: The sum of atomic radii \(r_i\) and \(r_j\) (including a tolerance threshold \(\Delta d\)) is taken as condition to consider two atoms coordinated:

    \[d_{ij} < (r_i + r_j)\cdot (1.0 + \Delta d).\]

    The tolerance threshold \(\Delta d\) is defined via the key word argument atomic_radius_delta. Different radius types can be selected via the key word argument radius_type. Supported options are implemente in the get_atomic_radius function. If the radius for an element is not defined, the 'covalent' radius type is used as fallback.

  • econ: The econ algorithm has been introduced in doi:10.1524/zkri.1979.150.14.23, in this case the coordination environment is determined via an iterative scheme.

  • voronoi: This option allows to use a Voronoi tessellation as base to determine coordinated sites. The parameter voronoi_weight_type implements a weight condition with the supported options:

    • 'covalent_atomic_radius' or 'vdw_atomic_radius': The weight is calculated via the difference between the distance between the two sites and the sum of their covalent or van der Waals atomic radii.

    • 'solid_angle': The solid angle of the Voronoi facet is used as weight.

    • 'area': The area of the Voronoi facet is used as weight.

    Adding the prefix 'rel_' to the parameter (e.g. 'rel_solid_angle'), will result in all weight values being divided by the maximum weight. The parameter voronoi_weight_threshold then defines the lower threshold (weight > voronoi_weight_threshold) for the site to be considered coordinated. This way different methods can be assessed, e.g. setting it to rel_solid_angle gives the method introduced by O’Keeffe (doi:10.1107/S0567739479001765) or covalent_atomic_radius with the threshold 0.25 gives the method described in doi:10.1038/ncomms15679.

The final output is given as a dictionary:

[8]:
coord_dict = strct.calculate_coordination(
    method="atomic_radius", atomic_radius_delta=0.5
)
coord_dict.keys()
[8]:
dict_keys(['distance_avg', 'distance_stdev', 'distance_max', 'distance_min', 'nrs_avg', 'nrs_stdev', 'nrs_max', 'nrs_min', 'sites'])

including statistical quantities as well as detailed information of each atomic site within the key 'sites', stored as a list (the order of the sites in Structure object is maintained):

[9]:
coord_dict["sites"][0]
[9]:
{'Ga': 0,
 'As': 4,
 'element': 'Ga',
 'kind': None,
 'position': [0.0, 0.0, 0.0],
 'neighbours': [{'element': 'As',
   'kind': None,
   'site_index': 1,
   'distance': 3.521259277353771,
   'position': [2.0330000000000004, -2.033, 2.032999975]},
  {'element': 'As',
   'kind': None,
   'site_index': 1,
   'distance': 3.5212593062212845,
   'position': [2.033, 2.033, -2.0330000249999998]},
  {'element': 'As',
   'kind': None,
   'site_index': 1,
   'distance': 3.5212593062212845,
   'position': [-2.033, -2.033, -2.0330000249999998]},
  {'element': 'As',
   'kind': None,
   'site_index': 1,
   'distance': 3.5212593350887986,
   'position': [-2.033, 2.0330000000000004, 2.0330000750000004]}],
 'total_cn': 4,
 'min_dist': 3.521259277353771,
 'max_dist': 3.5212593350887986,
 'avg_dist': 3.521259306221285}

F-Fingerprint

The F-Fingerprint has been introduced in doi:10.1063/1.3079326 and aims to compare polymorphs and quantify their similarity, using radial distribution functions. The calculate_ffingerprint function calculates the distribution functions for each element-pair and individual sites.

External analysis methods

The external analysis methods take the Structure as input and return the calculated property (taking the calculate_prdf as example):

[10]:
from aim2dat.strct.ext_analysis import calculate_prdf

prdf = calculate_prdf(strct)

Alternatively, the methods can also be used within a StructureOperations object:

[11]:
prdf = strct_op[0].perform_analysis(calculate_prdf)

Molecular fragments and graphs

The method determine_molecular_fragments aims to determine molecular fragments in organic or hybrid inorganic-organic structures based on the coordination environment of each site calculated with the calculate_coordination function. the parameter exclude_elements or end_points restrict the

The method create_graph creates a graph with the atom sites as nodes and the edges represent coordinated sites. Once again, the calculate_coordination function is used to identify coordinated sites.

Partial radial distribution functions

The implementation of the calculate_prdf function is based on :doi:`` and its output is similar to the F-Fingerprint, however, in this case there is no Gaussian broadening applied and the normalization factor is differs as well.

Order parameters

The library offers two different order parameters via the calculate_prdf and calculate_prdf functions as defined in doi:10.1016/j.cpc.2010.06.007 and doi:10.1103/PhysRevB.89.205118, respectively.

Planes

The calculate_planes method identifies the plane which including all atom sites of the Structure or the sites given by the fragment parameter. It was originally designed to produce the input for the critic2 code.

Dscribe descriptors

Wrapping the Dscribe Python package, different interaction matrices and other descriptors are calculated using the functions calculate_interaction_matrix, calculate_acsf_descriptor, calculate_soap_descriptor and calculate_mbtr_descriptor.

Designing new methods

New external analysis methods which seamlessly interface with the Structure and StructureOperations classes can be implemented by using the external_analysis_method decorator. To work properly, the first argument needs to be named structure with a Structure object as input and the output needs to be a tuple with a length of two consisting of the calculated attributes and extras.

The function calculate_n_element shown below represents an example for an external analysis method. It determines how many atoms of a specific element are present in the Structure and returns it in extras.

[12]:
from aim2dat.strct.ext_analysis.decorator import external_analysis_method
from aim2dat.utils.element_properties import get_element_symbol


@external_analysis_method
def determine_n_element(structure, element="H"):
    element = get_element_symbol(element)
    if element in structure.chem_formula:
        return (None, structure.chem_formula[element])
    else:
        return (None, 0)


determine_n_element(strct, "Ga")
strct.extras["n_element"]
[12]:
1