Utility functions:

This is an overview of various miscellaneous utility functions, useful for different purposes when doing calculations with vitrum. These used to live together in a single vitrum.utility module, which has since been split into focused modules.

Geometry

Low-level distance/PDF helpers used internally by GlassAtoms — useful directly when you have raw distances rather than a full Atoms object.

from vitrum.geometry import pdf, find_min_after_peak

xval, gr = pdf(dist_list, volume, rrange=10, nbin=100)
cutoff = xval[find_min_after_peak(gr)]  # first minimum after the first peak

`find_min_after_peak(padf)`

Find the index of the first local minimum after the first peak in a function. Useful for determining cutoffs from PDFs.

Parameters:

Name	Type	Description	Default
`padf`	`ndarray`	The probability density function or similar array.	required

Returns:

Name	Type	Description
`int`		The index of the minimum.

Source code in src/vitrum/geometry.py

def find_min_after_peak(padf):
    """
    Find the index of the first local minimum after the first peak in a function.
    Useful for determining cutoffs from PDFs.

    Args:
        padf (np.ndarray): The probability density function or similar array.

    Returns:
        int: The index of the minimum.
    """
    mins = argrelextrema(padf, np.less_equal, order=4)[0]
    second_min = [i for ind, i in enumerate(mins) if i != ind][0]
    return second_min

`get_dist(list, cell)`

Calculate the pairwise distance matrix for atoms in a periodic simulation box.

Kept as a plain-numpy (non-numba) reference implementation for backward compatibility; get_dist_numba is the version used internally by GlassAtoms.

Parameters:

Name	Type	Description	Default
`list`	`ndarray`	Atomic positions (N x 3).	required
`cell`	`ndarray`	Cell dimensions (3,). Assumes an orthorhombic cell.	required

Returns:

Type	Description
	np.ndarray: Symmetric distance matrix (N x N) containing distances between all atom pairs.

Source code in src/vitrum/geometry.py

def get_dist(list, cell):
    """
    Calculate the pairwise distance matrix for atoms in a periodic simulation box.

    Kept as a plain-numpy (non-numba) reference implementation for backward
    compatibility; get_dist_numba is the version used internally by GlassAtoms.

    Args:
        list (np.ndarray): Atomic positions (N x 3).
        cell (np.ndarray): Cell dimensions (3,). Assumes an orthorhombic cell.

    Returns:
        np.ndarray: Symmetric distance matrix (N x N) containing distances between all atom pairs.
    """
    dim = [cell[0], cell[1], cell[2]]
    x_dif = np.abs(list[:, 0][np.newaxis, :] - list[:, 0][:, np.newaxis])
    y_dif = np.abs(list[:, 1][np.newaxis, :] - list[:, 1][:, np.newaxis])
    z_dif = np.abs(list[:, 2][np.newaxis, :] - list[:, 2][:, np.newaxis])
    x_dif = np.where(x_dif > 0.5 * dim[0], np.abs(x_dif - dim[0]), x_dif)
    y_dif = np.where(y_dif > 0.5 * dim[1], np.abs(y_dif - dim[1]), y_dif)
    z_dif = np.where(z_dif > 0.5 * dim[2], np.abs(z_dif - dim[2]), z_dif)
    return np.sqrt(x_dif**2 + y_dif**2 + z_dif**2)

`get_dist_numba(pos, cell)`

Calculate the distance matrix between atoms using Numba for performance.

Parameters:

Name	Type	Description	Default
`pos`	`ndarray`	Array of atomic positions (N x 3).	required
`cell`	`ndarray`	Cell dimensions (3,). Assumes an orthorhombic cell (lx, ly, lz).	required

Returns:

Type	Description
	np.ndarray: Symmetric distance matrix (N x N).

Source code in src/vitrum/geometry.py

@njit(parallel=True)
def get_dist_numba(pos, cell):
    """
    Calculate the distance matrix between atoms using Numba for performance.

    Args:
        pos (np.ndarray): Array of atomic positions (N x 3).
        cell (np.ndarray): Cell dimensions (3,). Assumes an orthorhombic cell (lx, ly, lz).

    Returns:
        np.ndarray: Symmetric distance matrix (N x N).
    """
    n = pos.shape[0]
    # Initialize the output matrix
    dist_matrix = np.zeros((n, n))

    # Extract cell dimensions for faster access
    lx, ly, lz = cell[0], cell[1], cell[2]
    half_lx, half_ly, half_lz = lx / 2.0, ly / 2.0, lz / 2.0

    for i in prange(n):
        for j in range(i + 1, n):  # Only calculate the upper triangle
            dx = abs(pos[i, 0] - pos[j, 0])
            dy = abs(pos[i, 1] - pos[j, 1])
            dz = abs(pos[i, 2] - pos[j, 2])

            # Apply Periodic Boundary Conditions (Minimum Image Convention)
            if dx > half_lx: dx -= lx
            if dy > half_ly: dy -= ly
            if dz > half_lz: dz -= lz

            d = np.sqrt(dx**2 + dy**2 + dz**2)

            # Fill both symmetric entries
            dist_matrix[i, j] = d
            dist_matrix[j, i] = d

    return dist_matrix

`pdf(dist_list, volume, rrange=10, nbin=100)`

Calculate the pair distribution function (PDF) of a list of distances.

Parameters:

Name	Type	Description	Default
`dist_list`	`ndarray`	A 2D numpy array of distances.	required
`volume`	`float`	The volume of the system.	required
`rrange`	`float`	The range of the PDF. Defaults to 10.	`10`
`nbin`	`int`	The number of bins. Defaults to 100.	`100`

Returns:

Name	Type	Description
`xval`	`ndarray`	The x values of the PDF.
`pdf`	`ndarray`	The PDF values.

Source code in src/vitrum/geometry.py

def pdf(dist_list, volume, rrange=10, nbin=100):
    """
    Calculate the pair distribution function (PDF) of a list of distances.

    Parameters:
        dist_list (np.ndarray): A 2D numpy array of distances.
        volume (float): The volume of the system.
        rrange (float, optional): The range of the PDF. Defaults to 10.
        nbin (int, optional): The number of bins. Defaults to 100.

    Returns:
        xval (np.ndarray): The x values of the PDF.
        pdf (np.ndarray): The PDF values.
    """

    edges = np.linspace(0, rrange, nbin + 1)
    xval = (edges[1:] + edges[:-1]) / 2
    volbin = (4 / 3) * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    h, bin_edges = np.histogram(dist_list, bins=nbin, range=(0, rrange))
    h[0] = 0
    pdf = (h / volbin) / (dist_list.size / volume)
    return xval, pdf

Trajectory

See Quick start for a worked unwrap_trajectory example.

`get_high_low_displacement_index(initial_state, current_state, target_atom, percentage=0.25)`

Calculates the indices of the atoms with the high and low displacements between an initial and current state.

Parameters:

Name	Type	Description	Default
`initial_state`	`Atoms`	The initial state of the system.	required
`current_state`	`Atoms`	The current state of the system.	required
`target_atom`	`str or int`	The chemical symbol or atomic number of the target atom.	required
`percentage`	`float`	The percentage of the highest and lowest displacements to consider. Defaults to 0.25.	`0.25`

Returns:

Name	Type	Description
`list`		A list of two elements, where the first element is the index of the atoms with the highest displacements and the second element is the index of the atoms with the lowest displacements.

Source code in src/vitrum/trajectory.py

def get_high_low_displacement_index(initial_state, current_state, target_atom, percentage=0.25):
    """
    Calculates the indices of the atoms with the high and low displacements between an initial and current state.

    Parameters:
        initial_state (Atoms): The initial state of the system.
        current_state (Atoms): The current state of the system.
        target_atom (str or int): The chemical symbol or atomic number of the target atom.
        percentage (float, optional): The percentage of the highest and lowest displacements to consider. Defaults to 0.25.

    Returns:
        list: A list of two elements, where the first element is the index of the atoms with the highest displacements
            and the second element is the index of the atoms with the lowest displacements.
    """
    index = np.where(np.array(initial_state.get_chemical_symbols()) == target_atom)[0]
    initial_positions = initial_state.get_positions()[index]
    current_positions = current_state.get_positions()[index]
    displacements = initial_positions - current_positions
    displacements = np.sum(displacements**2, axis=1)
    ind = np.argsort(displacements)
    low_ind = index[ind[: int(len(ind) * percentage)]]
    high_ind = index[ind[int(len(ind) * percentage) :]]
    return [low_ind, high_ind]

`unwrap_trajectory(atoms_list)`

Unwraps a list of Atoms objects to remove periodic boundary crossings.

Parameters:

Name	Type	Description	Default
`atoms_list`	`list of Atoms objects`	The list of Atoms objects to unwrap.	required

Returns:

Name	Type	Description
`unwrapped_atoms_list`	`list of Atoms objects`	The unwrapped list of Atoms objects.

Source code in src/vitrum/trajectory.py

def unwrap_trajectory(atoms_list):
    """
    Unwraps a list of Atoms objects to remove periodic boundary crossings.

    Parameters:
        atoms_list (list of Atoms objects): The list of Atoms objects to unwrap.

    Returns:
        unwrapped_atoms_list (list of Atoms objects): The unwrapped list of Atoms objects.
    """

    if not atoms_list or len(atoms_list) == 0:
        raise ValueError("The input atoms_list must be a non-empty list of ASE Atom objects.")
    cell = np.diagonal(atoms_list[0].get_cell())
    n_atoms = len(atoms_list[0])
    unwrapped_atoms_list = [atoms.copy() for atoms in atoms_list]

    crossings = np.zeros((n_atoms, 3))
    previous_positions = unwrapped_atoms_list[0].get_positions()
    for atoms in unwrapped_atoms_list:
        current_positions = atoms.get_positions()
        difference = current_positions - previous_positions
        crossings = crossings + np.floor_divide(difference + 0.5 * cell, cell).astype(int)
        previous_positions = current_positions
        new_positions = current_positions - cell * crossings
        atoms.set_positions(new_positions)
    return unwrapped_atoms_list

Packing

See Quick start for a worked get_random_packed example.

`apply_strain_to_structure(structure, deformations)`

Apply strain(s) to input structure and return transformation(s) as list.

Parameters:

Name	Type	Description	Default
`structure`	`.Structure`	Input structure to apply strain to	required
`deformations`	`list[.Deformation]`	A list of deformations to apply independently to the input structure, in anticipation of performing an EOS fit. Deformations should be of the form of a 3x3 matrix, e.g., [[1.2, 0., 0.], [0., 1.2, 0.], [0., 0., 1.2]] or ((1.2, 0., 0.), (0., 1.2, 0.), (0., 0., 1.2))	required

Returns:

Name	Type	Description
`list`	`list`	A list of .TransformedStructure objects corresponding to the list of input deformations.

Source code in src/vitrum/packing.py

def apply_strain_to_structure(structure, deformations: list) -> list:
    """
    Apply strain(s) to input structure and return transformation(s) as list.

    Parameters:
        structure (.Structure): Input structure to apply strain to
        deformations (list[.Deformation]): A list of deformations to apply independently to the input
            structure, in anticipation of performing an EOS fit.
            Deformations should be of the form of a 3x3 matrix, e.g.,

            [[1.2, 0., 0.], [0., 1.2, 0.], [0., 0., 1.2]]

            or

            ((1.2, 0., 0.), (0., 1.2, 0.), (0., 0., 1.2))

    Returns:
        list: A list of .TransformedStructure objects corresponding to the
            list of input deformations.
    """
    transformations = []
    for deformation in deformations:
        # deform the structure
        ts = TransformedStructure(
            structure,
            transformations=[DeformStructureTransformation(deformation=deformation)],
        )
        transformations += [ts]
    return transformations

`get_random_packed(composition, target_atoms=100, min_distance=None, radii_scaling=1.0, volume_scaling=1.0, vol_per_atom_source='mp', datatype='ase', db_kwargs=None, density=None, seed=None, side_ratios=[1, 1, 1], **kwargs)`

Generate a random packed structure based on the given composition.

Parameters:

Name	Type	Description	Default
`composition`	`(str, dict or Composition)`	The composition of the structure.	required
`density`	`float`	The target density of the structure (in g/cm^3). If not provided, the volume per atom is estimated using the Materials Project API.	`None`
`target_atoms`	`int`	The target number of atoms in the structure. Defaults to 100.	`100`
`radii_scaling`	`float`	Scaling factor for the covalent radii of the atoms. Defaults to 1.0.	`1.0`
`volume_scaling`	`float`	Scaling factor for the volume of the structure. Defaults to 1.0.	`1.0`
`vol_per_atom_source`	`float or str`	The source for the volume per atom. Can be a float value or one of the following strings: "mp" (Materials Project), "icsd" (Inorganic Crystal Structure Database), "density" (use provided density), "covalent_radius" (estimate from covalent radii), or "convex_hull" (estimate from convex hull). Defaults to "mp".	`'mp'`
`datatype`	`str`	The type of data to return. Can be "ase" for ASE format or "pymatgen" for pymatgen format. Defaults to "ase".	`'ase'`
`db_kwargs`	`dict`	Additional keyword arguments for database access. Defaults to None.	`None`
`seed`	`int`	The seed for random number generation. Defaults to 0.	`None`
`side_ratios`	`list`	The side ratios for the lattice. Defaults to [1, 1, 1].	`[1, 1, 1]`

Returns:

Name	Type	Description
`data`	`Atoms or Structure`	The generated random packed structure.

Source code in src/vitrum/packing.py

def get_random_packed(
    composition: str | dict | Composition,
    target_atoms: int = 100,
    min_distance: float | None = None,
    radii_scaling: float = 1.0,
    volume_scaling: float = 1.0,
    vol_per_atom_source: float | str = "mp",
    datatype: str = "ase",
    db_kwargs: dict | None = None,
    density: float | None = None,
    seed: int | None = None,
    side_ratios: list = [1, 1, 1],
    **kwargs,
):
    """
    Generate a random packed structure based on the given composition.

    Parameters:
        composition (str, dict or pymatgen.core.Composition): The composition of the structure.
        density (float, optional): The target density of the structure (in g/cm^3). If not provided, the volume per atom
                                   is estimated using the Materials Project API.
        target_atoms (int, optional): The target number of atoms in the structure. Defaults to 100.
        radii_scaling (float, optional): Scaling factor for the covalent radii of the atoms. Defaults to 1.0.
        volume_scaling (float, optional): Scaling factor for the volume of the structure. Defaults to 1.0.
        vol_per_atom_source (float or str, optional): The source for the volume per atom. Can be a float value or one of the following strings:
                                                    "mp" (Materials Project), "icsd" (Inorganic Crystal Structure Database),
                                                     "density" (use provided density), "covalent_radius" (estimate from covalent radii),
                                                     or "convex_hull" (estimate from convex hull). Defaults to "mp".
        datatype (str, optional): The type of data to return. Can be "ase" for ASE format or "pymatgen"
                                  for pymatgen format. Defaults to "ase".
        db_kwargs (dict, optional): Additional keyword arguments for database access. Defaults to None.
        seed (int, optional): The seed for random number generation. Defaults to 0.
        side_ratios (list, optional): The side ratios for the lattice. Defaults to [1, 1, 1].

    Returns:
        data (ase.Atoms or pymatgen.core.Structure): The generated random packed structure.
    """

    if isinstance(composition, str):
        composition = Composition(composition)
    elif isinstance(composition, dict):
        comp_string = "".join(mol * (int(composition[mol] * 10)) for mol in composition)
        composition = Composition(comp_string)
    elements, factor = composition.get_integer_formula_and_factor()
    integer_composition = Composition(elements)
    full_cell_composition = integer_composition * np.ceil(target_atoms / integer_composition.num_atoms)

    structure = {}
    for el in full_cell_composition:
        structure[str(el)] = int(full_cell_composition.element_composition.get(el))
    elements = sum([[i] * structure[i] for i in structure], [])
    np.random.seed(seed)

    cell_vol = get_volume(composition, structure, vol_per_atom_source, db_kwargs, density, **kwargs)

    cell_vol *= volume_scaling
    k = (cell_vol / (side_ratios[0] * side_ratios[1] * side_ratios[2])) ** (1 / 3)
    cell = np.array([side_ratios[0] * k, side_ratios[1] * k, side_ratios[2] * k])
    cell = np.diag(cell)

    radii = covalent_radii[symbols2numbers(elements)] * radii_scaling

    if min_distance:
        radii = np.maximum(radii, min_distance / 2)

    nat = len(elements)

    pos = np.random.rand(len(elements), 3) @ cell
    ats = Atoms(elements, cell=cell, pbc=True, positions=pos)
    skin = 0.0
    nl = NeighborList(radii, self_interaction=False, bothways=True, skin=skin)

    for it in range(500):
        nl.update(ats)
        pos = ats.get_positions()
        dx = np.zeros(pos.shape)
        dsum = 0
        for i in range(nat):
            indices, offsets = nl.get_neighbors(i)
            rs = pos[indices, :] + offsets @ cell - pos[i, :]
            # ds is the overlap
            ds = np.linalg.norm(rs, axis=1) - (radii[indices] + radii[i])
            if np.any(ds > 1.e-10):
                raise RuntimeError(
                    f"Overlap relaxation invariant violated: max residual {np.max(ds):.3e} > 1e-10"
                )
            # sum overlaps
            dsum += np.sum(ds)
            ds -= skin
            # move atoms away from each other by overlap amount
            dx[i,:] = np.sum(rs / np.linalg.norm(rs, axis=1)[:, None] * ds[:, None], axis=0)
        # print(it, dsum, np.linalg.norm(dx))
        ats.set_positions(pos + dx)
        if dsum >= -1.e-5:
            break
    else:
        raise RuntimeError('Cell packing not converged')

    ats.wrap()
    if datatype == "pymatgen":
        structure = Structure(
            lattice=ats.get_cell(),
            species=ats.get_chemical_symbols(),
            coords=ats.get_scaled_positions(),
            to_unit_cell=True,
            coords_are_cartesian=False,
        )
    else:
        structure = ats
    return structure

Volume estimation

Used internally by packing.get_random_packed to estimate the cell volume for a target composition; can also be called directly.

from vitrum.volume_estimation import get_volume

# estimate volume purely from covalent radii, no external API required
volume = get_volume("SiO2", {"Si": 1, "O": 2}, vol_per_atom_source="covalent_radius")

`get_average_volume_convex_hull(composition, MP_API_KEY=None)`

Get the average volume per atom from the convex hull on Materials Project.

Parameters:

Name	Type	Description	Default
`composition`	`Composition`	The composition to query.	required
`MP_API_KEY`	`str`	Materials Project API Key.	`None`

Returns:

Name	Type	Description
`float`		Average volume per atom.

Source code in src/vitrum/volume_estimation.py

def get_average_volume_convex_hull(composition, MP_API_KEY=None):
    """
    Get the average volume per atom from the convex hull on Materials Project.

    Args:
        composition (Composition): The composition to query.
        MP_API_KEY (str, optional): Materials Project API Key.

    Returns:
        float: Average volume per atom.
    """
    with MPRester(api_key=MP_API_KEY) as mpr:
        entries = mpr.get_entries_in_chemsys(
            elements=[str(el) for el in composition.elements],
            additional_criteria={"thermo_types": ["GGA_GGA+U"]},
        )
    pd = PhaseDiagram(entries)
    decomp = pd.get_decomposition(composition)
    volume = sum([d.structure.volume / d.composition.num_atoms * decomp[d] for d in decomp])
    return volume

`get_volume(composition, structure, vol_per_atom_source='mp', db_kwargs=None, density=None, MP_API_KEY=None)`

Get the volume of the cell based on the composition and various estimation methods.

Parameters:

Name	Type	Description	Default
`composition`	`Union[Composition, str]`	The composition of the material.	required
`structure`	`dict`	A dictionary mapping element symbols to their count in the structure.	required
`vol_per_atom_source`	`Union[float, str]`	Method to estimate volume per atom. Options: - "mp": Use Materials Project (requires API key). - "icsd": Use ICSD database (if available). - "density": Calculate from provided density. - "covalent_radius": Estimate from covalent radii. - "convex_hull": Estimate from convex hull on Materials Project. - float: Directly provide the volume per atom. Defaults to "mp".	`'mp'`
`db_kwargs`	`dict`	Keyword arguments for database queries. Defaults to None.	`None`
`density`	`float`	Density in g/cm^3. Required if vol_per_atom_source="density". Defaults to None.	`None`
`MP_API_KEY`	`str`	Materials Project API key. Defaults to None.	`None`

Returns:

Name	Type	Description
`float`		The calculated total volume of the cell in Angstrom^3.

Raises:

Type	Description
`ValueError`	If estimating from density but estimates fail, or if an unknown source is provided.

Source code in src/vitrum/volume_estimation.py

def get_volume(
    composition: Composition | str,
    structure: dict,
    vol_per_atom_source: float | str = "mp",
    db_kwargs: dict | None = None,
    density: float | None = None,
    MP_API_KEY: str | None = None,
):
    """
    Get the volume of the cell based on the composition and various estimation methods.

    Args:
        composition (Union[Composition, str]): The composition of the material.
        structure (dict): A dictionary mapping element symbols to their count in the structure.
        vol_per_atom_source (Union[float, str], optional): Method to estimate volume per atom.
            Options:
            - "mp": Use Materials Project (requires API key).
            - "icsd": Use ICSD database (if available).
            - "density": Calculate from provided density.
            - "covalent_radius": Estimate from covalent radii.
            - "convex_hull": Estimate from convex hull on Materials Project.
            - float: Directly provide the volume per atom.
            Defaults to "mp".
        db_kwargs (dict, optional): Keyword arguments for database queries. Defaults to None.
        density (float, optional): Density in g/cm^3. Required if vol_per_atom_source="density". Defaults to None.
        MP_API_KEY (str, optional): Materials Project API key. Defaults to None.

    Returns:
        float: The calculated total volume of the cell in Angstrom^3.

    Raises:
        ValueError: If estimating from density but estimates fail, or if an unknown source is provided.
    """

    struct_db = vol_per_atom_source.lower() if isinstance(vol_per_atom_source, str) else None
    db_kwargs = db_kwargs or ({"use_cached": True} if struct_db == "mp" else {})
    cell_vol = None

    if density:
        if not isinstance(density, (float, int)):
            raise ValueError("Density must be a float or int.")

        struct_db = "density"

    if isinstance(vol_per_atom_source, float | int):
        vol_per_atom = vol_per_atom_source

    elif struct_db == "mp":
        vol_per_atom = get_average_volume_from_mp(composition, **db_kwargs)

    elif struct_db == "icsd":
        vol_per_atom = get_average_volume_from_db_cached(composition, db_name="icsd", **db_kwargs)

    elif struct_db == "density":
        mass = np.sum([Atoms(f"{i}").get_masses()[0] * structure[i] for i in structure])
        cell_vol = ((mass / (6.0221 * (10**23))) / density) * (10**24)

    elif struct_db == "covalent_radius":
        all_radii = np.hstack(
            [np.repeat(covalent_radii[atomic_numbers[key]], structure[key]) for key in structure.keys()]
        )
        cell_vol = np.sum((4 / 3 * np.pi * all_radii**3)) * 3

    elif struct_db == "convex_hull":
        try:
            vol_per_atom = get_average_volume_convex_hull(composition, MP_API_KEY=MP_API_KEY)
        except MPRestError as e:
            raise ValueError(f"Could not retrieve volume from convex hull. Check your MP_API_KEY. Error: {e}")

    else:
        raise ValueError(f"Unknown volume per atom source: {vol_per_atom_source}.")

    if not cell_vol:
        cell_vol = vol_per_atom * sum(structure.values())

    return cell_vol

I/O helpers

See Quick start for worked correct_atom_types/get_LAMMPS_dump_timesteps examples.

`correct_atom_types(atoms_list, atom_to_type_map)`

Correct the atom types in a list of Atoms objects.

Parameters:

Name	Type	Description	Default
`atoms_list`	`list of Atoms objects`	The list of Atoms objects to correct.	required
`atom_to_type_map`	`dict`	A dictionary mapping atomic numbers to atom types.	required

Returns:

Name	Type	Description
`atoms_list`	`list of Atoms objects`	The corrected list of Atoms objects.

Source code in src/vitrum/io_helpers.py

def correct_atom_types(atoms_list, atom_to_type_map):
    """
    Correct the atom types in a list of Atoms objects.

    Parameters:
        atoms_list (list of Atoms objects): The list of Atoms objects to correct.
        atom_to_type_map (dict): A dictionary mapping atomic numbers to atom types.

    Returns:
        atoms_list (list of Atoms objects): The corrected list of Atoms objects.
    """
    #Check if atoms_list is a list of Atoms objects
    if not isinstance(atoms_list, list):
        atoms_list = [atoms_list]

    for atoms in atoms_list:
        corr_symbols = [atom_to_type_map[i] for i in atoms.get_atomic_numbers()]
        atoms.set_chemical_symbols(corr_symbols)

`get_LAMMPS_dump_timesteps(filename)`

Retrieves the timesteps from a LAMMPS dump file.

Parameters:

Name	Type	Description	Default
`filename`	`str`	The path to the LAMMPS dump file.	required

Returns:

Type	Description
	List[int]: A list of timesteps extracted from the file.

Source code in src/vitrum/io_helpers.py

def get_LAMMPS_dump_timesteps(filename: str):
    """
    Retrieves the timesteps from a LAMMPS dump file.

    Parameters:
        filename (str): The path to the LAMMPS dump file.

    Returns:
        List[int]: A list of timesteps extracted from the file.
    """
    with open(filename, encoding="utf-8") as f:
        timesteps = []
        lines = deque(f.readlines())
        if len(lines) == 0:
            return timesteps
        line = lines.popleft()
        while len(lines) > 0:
            if "ITEM: TIMESTEP" in line:
                line = lines.popleft()
                timesteps.append(int(line))
            else:
                line = lines.popleft()
    return timesteps

Structure validation

Sanity checks for generated or simulated structures.

from vitrum.structure_validation import homogeneity_checker, dimer_checker

is_homogeneous = homogeneity_checker(atoms, grid_density=(3, 3, 3))
has_too_many_dimers = dimer_checker(atoms, bond_length=2.0, num_allowed=2)

`dimer_checker(atoms, bond_length=2.0, num_allowed=2)`

Check for the presence of dimers (e.g., O2, N2, F2, Cl2, Br2, I2) in the structure.

Parameters:

Name	Type	Description	Default
`atoms`	`Atoms`	The ASE Atoms object to check.	required
`bond_length`	`float`	The cutoff distance to define a bond. Defaults to 2.0.	`2.0`
`num_allowed`	`int`	The maximum number of allowed dimers before returning True. Defaults to 2.	`2`

Returns:

Name	Type	Description
`bool`		True if the number of dimers exceeds `num_allowed`, False otherwise.

Source code in src/vitrum/structure_validation.py

def dimer_checker(atoms, bond_length=2.0, num_allowed=2):
    """
    Check for the presence of dimers (e.g., O2, N2, F2, Cl2, Br2, I2) in the structure.

    Args:
        atoms (Atoms): The ASE Atoms object to check.
        bond_length (float, optional): The cutoff distance to define a bond. Defaults to 2.0.
        num_allowed (int, optional): The maximum number of allowed dimers before returning True. Defaults to 2.

    Returns:
        bool: True if the number of dimers exceeds `num_allowed`, False otherwise.
    """
    atoms.wrap()
    species = ["O", "N", "F", "Cl", "Br", "I"]
    for spec in species:
        if spec not in atoms.get_chemical_symbols():
            continue
        spec_atoms = atoms[np.array(atoms.get_chemical_symbols()) == spec]
        if len(spec_atoms) < 2:
            continue
        spec_positions = spec_atoms.get_positions()
        distances = np.linalg.norm(spec_positions[:, np.newaxis] - spec_positions, axis=-1)
        np.fill_diagonal(distances, np.inf)  # Ignore self-distances
        num_dimers = np.sum(distances < bond_length)
        if num_dimers > num_allowed:  # Adjust threshold as needed
            return True
    return False

`homogeneity_checker(atoms, grid_density, slide_steps=2, target_species='all', upper_bound=1.5, lower_bound=0.5, box_threshold=0.1, separated_species_threshold=0.5)`

Check the homogeneity of the atomic structure by analyzing atom density in grid boxes.

Parameters:

Name	Type	Description	Default
`atoms`	`Atoms`	The ASE Atoms object to analyze.	required
`grid_density`	`tuple or list`	Number of grid divisions in x, y, z directions (nx, ny, nz).	required
`slide_steps`	`int`	Number of steps to slide the grid for averaging. Defaults to 2.	`2`
`target_species`	`str or list`	Species to check. "all" checks all present species. Defaults to "all".	`'all'`
`upper_bound`	`float`	Multiplier for average density to consider a box over-dense. Defaults to 1.5.	`1.5`
`lower_bound`	`float`	Multiplier for average density to consider a box under-dense. Defaults to 0.5.	`0.5`
`box_threshold`	`float`	Fraction of boxes allowed to be out of bounds before flagging a species. Defaults to 0.1.	`0.1`
`separated_species_threshold`	`float`	Fraction of species allowed to be phase separated before the structure is flagged. Defaults to 0.5.	`0.5`

Returns:

Name	Type	Description
`bool`		True if the structure is considered homogeneous, False otherwise.

Raises:

Type	Description
`ValueError`	If no species are found in the atoms object.

Source code in src/vitrum/structure_validation.py

def homogeneity_checker(
    atoms,
    grid_density,
    slide_steps=2,
    target_species="all",
    upper_bound=1.5,
    lower_bound=0.5,
    box_threshold=0.1,
    separated_species_threshold=0.5,
):
    """
    Check the homogeneity of the atomic structure by analyzing atom density in grid boxes.

    Args:
        atoms (Atoms): The ASE Atoms object to analyze.
        grid_density (tuple or list): Number of grid divisions in x, y, z directions (nx, ny, nz).
        slide_steps (int, optional): Number of steps to slide the grid for averaging. Defaults to 2.
        target_species (str or list, optional): Species to check. "all" checks all present species. Defaults to "all".
        upper_bound (float, optional): Multiplier for average density to consider a box over-dense. Defaults to 1.5.
        lower_bound (float, optional): Multiplier for average density to consider a box under-dense. Defaults to 0.5.
        box_threshold (float, optional): Fraction of boxes allowed to be out of bounds before flagging a species. Defaults to 0.1.
        separated_species_threshold (float, optional): Fraction of species allowed to be phase separated before the structure is flagged. Defaults to 0.5.

    Returns:
        bool: True if the structure is considered homogeneous, False otherwise.

    Raises:
        ValueError: If no species are found in the atoms object.
    """
    atoms.wrap()
    species = np.unique(atoms.get_chemical_symbols()) if target_species == "all" else [target_species]

    if len(species) == 0:
        raise ValueError("No species found in the atoms object.")

    cell_lengths = np.array(atoms.get_cell()).diagonal()
    num_boxes = np.prod(grid_density)

    x_edges = np.linspace(0, cell_lengths[0], grid_density[0] + 1)
    y_edges = np.linspace(0, cell_lengths[1], grid_density[1] + 1)
    z_edges = np.linspace(0, cell_lengths[2], grid_density[2] + 1)

    slide_fractions = np.linspace(0, 1, slide_steps, endpoint=False)

    phase_seperated_species = 0

    for spec in species:
        spec_atoms = atoms[np.array(atoms.get_chemical_symbols()) == spec]
        num_atoms = len(spec_atoms)
        avg_atoms_per_box = num_atoms / num_boxes

        if avg_atoms_per_box < 2:
            continue

        positions = spec_atoms.get_positions()

        out_of_bounds_boxes = []
        for dx, dy, dz in product(slide_fractions, repeat=3):
            # Apply offset in each direction
            x_slide = x_edges + dx * (cell_lengths[0] / grid_density[0])
            y_slide = y_edges + dy * (cell_lengths[1] / grid_density[1])
            z_slide = z_edges + dz * (cell_lengths[2] / grid_density[2])

            x_slide[-1] = cell_lengths[0]
            y_slide[-1] = cell_lengths[1]
            z_slide[-1] = cell_lengths[2]

            x_idx = np.digitize(positions[:, 0], x_slide) - 1
            y_idx = np.digitize(positions[:, 1], y_slide) - 1
            z_idx = np.digitize(positions[:, 2], z_slide) - 1

            x_idx[x_idx == -1] = 2
            y_idx[y_idx == -1] = 2
            z_idx[z_idx == -1] = 2

            counts = np.zeros(grid_density, dtype=int)
            for xi, yi, zi in zip(x_idx, y_idx, z_idx):
                counts[xi, yi, zi] += 1

            too_low = counts < avg_atoms_per_box * lower_bound
            too_high = counts > avg_atoms_per_box * upper_bound
            out_of_bounds_boxes.append(np.sum(too_low | too_high))

        if np.sum(out_of_bounds_boxes) > num_boxes * slide_steps**3 * box_threshold:
            phase_seperated_species += 1

    if phase_seperated_species > separated_species_threshold * len(species):
        return False
    else:
        return True

Comparison

from vitrum.comparison import r_chi

simulated = {"x": sc.xval, "y": rdf}
experimental = {"x": exp_r, "y": exp_gr}
rchi, common_x, y_sim, y_exp = r_chi(simulated, experimental)

`r_chi(function_1, function_2, x_min=0, x_max=np.inf, steps=100)`

Calculate the Wright coefficient (https://doi.org/10.1016/0022-3093(93)90232-M) between two functions

Parameters:

Name	Type	Description	Default
`function_1`	`dict`	Dictionary with keys 'x' and 'y' representing the first function, usually from simulations	required
`function_2`	`dict`	Dictionary with keys 'x' and 'y' representing the second function, usually from experimental meassurements.	required

Returns:

Name	Type	Description
`rchi`	`float`	Wright coefficient, a measure of similarity between the two functions.

Source code in src/vitrum/comparison.py

def r_chi(function_1, function_2, x_min=0, x_max=np.inf, steps=100):
    """
    Calculate the Wright coefficient (https://doi.org/10.1016/0022-3093(93)90232-M) between two functions

    Parameters:
        function_1 (dict): Dictionary with keys 'x' and 'y' representing the first function, usually from simulations
        function_2 (dict): Dictionary with keys 'x' and 'y' representing the second function, usually from experimental meassurements.

    Returns:
        rchi (float): Wright coefficient, a measure of similarity between the two functions.
    """
    # Determine the overlapping x-range
    min_x_val = np.max([np.min(function_1["x"]), np.min(function_2["x"]), x_min])
    max_x_val = np.min([np.max(function_1["x"]), np.max(function_2["x"]), x_max])

    if min_x_val >= max_x_val:
        raise ValueError("No overlapping x-range between the two functions.")

    # Create common x-axis over the overlap region
    common_x = np.linspace(min_x_val, max_x_val, steps)

    # Interpolate both functions onto the common x-axis
    interp_f1 = interp1d(function_1["x"], function_1["y"], kind="linear", bounds_error=False, fill_value=0)
    interp_f2 = interp1d(function_2["x"], function_2["y"], kind="linear", bounds_error=False, fill_value=0)

    y1 = interp_f1(common_x)
    y2 = interp_f2(common_x)

    # Calculate the r-chi (Wright coefficient)
    numerator = np.sum((y1 - y2) ** 2)

    denominator = np.sum(y2**2)
    rchi = np.sqrt(numerator / denominator)

    return rchi, common_x, y1, y2