Skip to content

Objective Function

objective

Objective function for force field optimization.

Wraps the ForceField ↔ MM-engine ↔ reference-data loop into a single callable that :func:scipy.optimize.minimize can drive.

Scoring approach

This module uses raw weighted residuals (modern approach):

.. math:: r_i = w_i (x_{ref,i} - x_{calc,i})

The objective value is sum(r_i**2). This is the standard form expected by scipy.optimize.least_squares and gradient-based minimizers.

The legacy code in q2mm.optimizers.scoring uses a different normalisation inherited from upstream compare.py:

  • Energies are zero-referenced via correlate_energies() before scoring
  • A denominator based on the total count of energy-type data points is applied (see compare_data()).

For migration validation, use :func:q2mm.optimizers.scoring.compare_data directly — it is importable and usable standalone to cross-check scores against the upstream code path.

ReferenceValue dataclass

ReferenceValue(kind: Literal['energy', 'frequency', 'bond_length', 'bond_angle', 'torsion_angle', 'eig_diagonal', 'eig_offdiagonal'], value: float, weight: float = 1.0, label: str = '', molecule_idx: int = 0, data_idx: int = 0, atom_indices: tuple[int, ...] | None = None)

A single reference observation (QM or experimental).

ReferenceData dataclass

ReferenceData(values: list[ReferenceValue] = list())

Complete set of reference data for an optimization.

Each entry describes one observable: an energy, a frequency, or a geometric parameter that the force field should reproduce.

n_observations property

n_observations: int

Total number of reference observations.

Returns:

Name Type Description
int int

Length of the values list.

add_energy

add_energy(value: float, *, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single energy reference value.

Parameters:

Name Type Description Default
value float

Reference energy value.

required
weight float

Weight for this entry.

1.0
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''
Source code in q2mm/optimizers/objective.py
def add_energy(
    self,
    value: float,
    *,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single energy reference value.

    Args:
        value (float): Reference energy value.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="energy",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            label=label,
        )
    )

add_frequency

add_frequency(value: float, *, data_idx: int, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single vibrational frequency reference value.

Parameters:

Name Type Description Default
value float

Reference frequency in cm⁻¹.

required
data_idx int

0-based index of this frequency mode.

required
weight float

Weight for this entry.

1.0
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''
Source code in q2mm/optimizers/objective.py
def add_frequency(
    self,
    value: float,
    *,
    data_idx: int,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single vibrational frequency reference value.

    Args:
        value (float): Reference frequency in cm⁻¹.
        data_idx (int): 0-based index of this frequency mode.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="frequency",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=data_idx,
            label=label,
        )
    )

add_bond_length

add_bond_length(value: float, *, data_idx: int = -1, atom_indices: tuple[int, int] | None = None, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single bond length reference value.

Parameters:

Name Type Description Default
value float

Reference bond length in Ångströms.

required
data_idx int

0-based positional index (fallback if atom_indices is not provided).

-1
atom_indices tuple[int, int] | None

Atom pair for identity-based matching.

None
weight float

Weight for this entry.

1.0
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''

Raises:

Type Description
ValueError

If neither atom_indices nor a non-negative data_idx is provided.

Source code in q2mm/optimizers/objective.py
def add_bond_length(
    self,
    value: float,
    *,
    data_idx: int = -1,
    atom_indices: tuple[int, int] | None = None,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single bond length reference value.

    Args:
        value (float): Reference bond length in Ångströms.
        data_idx (int): 0-based positional index (fallback if
            ``atom_indices`` is not provided).
        atom_indices (tuple[int, int] | None): Atom pair for
            identity-based matching.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.

    Raises:
        ValueError: If neither ``atom_indices`` nor a non-negative
            ``data_idx`` is provided.
    """
    if atom_indices is None and data_idx < 0:
        raise ValueError("Either atom_indices or data_idx must be provided for bond_length.")
    self.values.append(
        ReferenceValue(
            kind="bond_length",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=max(data_idx, 0),
            atom_indices=atom_indices,
            label=label,
        )
    )

add_bond_angle

add_bond_angle(value: float, *, data_idx: int = -1, atom_indices: tuple[int, int, int] | None = None, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single bond angle reference value.

Parameters:

Name Type Description Default
value float

Reference bond angle in degrees.

required
data_idx int

0-based positional index (fallback if atom_indices is not provided).

-1
atom_indices tuple[int, int, int] | None

Atom triple (i, j, k) for identity-based matching.

None
weight float

Weight for this entry.

1.0
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''

Raises:

Type Description
ValueError

If neither atom_indices nor a non-negative data_idx is provided.

Source code in q2mm/optimizers/objective.py
def add_bond_angle(
    self,
    value: float,
    *,
    data_idx: int = -1,
    atom_indices: tuple[int, int, int] | None = None,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single bond angle reference value.

    Args:
        value (float): Reference bond angle in degrees.
        data_idx (int): 0-based positional index (fallback if
            ``atom_indices`` is not provided).
        atom_indices (tuple[int, int, int] | None): Atom triple
            (i, j, k) for identity-based matching.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.

    Raises:
        ValueError: If neither ``atom_indices`` nor a non-negative
            ``data_idx`` is provided.
    """
    if atom_indices is None and data_idx < 0:
        raise ValueError("Either atom_indices or data_idx must be provided for bond_angle.")
    self.values.append(
        ReferenceValue(
            kind="bond_angle",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=max(data_idx, 0),
            atom_indices=atom_indices,
            label=label,
        )
    )

add_torsion_angle

add_torsion_angle(value: float, *, atom_indices: tuple[int, int, int, int], weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single torsion (dihedral) angle reference value.

Parameters:

Name Type Description Default
value float

Reference torsion angle in degrees.

required
atom_indices tuple[int, int, int, int]

Four atom indices defining the dihedral.

required
weight float

Weight for this entry.

1.0
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''
Source code in q2mm/optimizers/objective.py
def add_torsion_angle(
    self,
    value: float,
    *,
    atom_indices: tuple[int, int, int, int],
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single torsion (dihedral) angle reference value.

    Args:
        value (float): Reference torsion angle in degrees.
        atom_indices (tuple[int, int, int, int]): Four atom indices
            defining the dihedral.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="torsion_angle",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            atom_indices=atom_indices,
            label=label,
        )
    )

add_hessian_eigenvalue

add_hessian_eigenvalue(value: float, *, mode_idx: int, weight: float = 0.1, molecule_idx: int = 0, label: str = '')

Add a diagonal element (eigenvalue) of the eigenmatrix.

Parameters:

Name Type Description Default
value float

QM eigenvalue for this mode.

required
mode_idx int

0-based index of the vibrational mode.

required
weight float

Weight for this entry. Legacy defaults: 0.10 for both low- and high-frequency modes, 0.00 for the first (imaginary) mode.

0.1
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''
Source code in q2mm/optimizers/objective.py
def add_hessian_eigenvalue(
    self,
    value: float,
    *,
    mode_idx: int,
    weight: float = 0.1,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a diagonal element (eigenvalue) of the eigenmatrix.

    Args:
        value (float): QM eigenvalue for this mode.
        mode_idx (int): 0-based index of the vibrational mode.
        weight (float): Weight for this entry. Legacy defaults: 0.10
            for both low- and high-frequency modes, 0.00 for the
            first (imaginary) mode.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="eig_diagonal",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=mode_idx,
            label=label,
        )
    )

add_hessian_offdiagonal

add_hessian_offdiagonal(value: float, *, row: int, col: int, weight: float = 0.05, molecule_idx: int = 0, label: str = '')

Add an off-diagonal element of the eigenmatrix.

Off-diagonal elements measure cross-coupling between modes. They should be close to zero when the MM Hessian closely reproduces the QM eigenvector structure.

Parameters:

Name Type Description Default
value float

QM eigenmatrix element (typically 0.0 for the QM self-projection).

required
row int

0-based row index into the eigenmatrix.

required
col int

0-based column index into the eigenmatrix.

required
weight float

Weight for this entry. Legacy default: 0.05.

0.05
molecule_idx int

Index into the molecules list.

0
label str

Human-readable label.

''
Source code in q2mm/optimizers/objective.py
def add_hessian_offdiagonal(
    self,
    value: float,
    *,
    row: int,
    col: int,
    weight: float = 0.05,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add an off-diagonal element of the eigenmatrix.

    Off-diagonal elements measure cross-coupling between modes.
    They should be close to zero when the MM Hessian closely
    reproduces the QM eigenvector structure.

    Args:
        value (float): QM eigenmatrix element (typically 0.0 for
            the QM self-projection).
        row (int): 0-based row index into the eigenmatrix.
        col (int): 0-based column index into the eigenmatrix.
        weight (float): Weight for this entry. Legacy default: 0.05.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="eig_offdiagonal",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            atom_indices=(row, col),
            label=label,
        )
    )

add_eigenmatrix_from_hessian

add_eigenmatrix_from_hessian(hessian: ndarray, *, diagonal_only: bool = False, molecule_idx: int = 0, weights: dict[str, float] | None = None, skip_first: bool = True, eigenvalue_threshold: float = 0.1173) -> int

Bulk-load eigenmatrix training data from a QM Hessian.

Decomposes the Hessian, computes the eigenmatrix, and adds all elements as reference values with the legacy weight scheme.

The Hessian should be in canonical units (Hartree/Bohr²).

Parameters:

Name Type Description Default
hessian ndarray

QM Hessian matrix (3N, 3N) in Hartree/Bohr².

required
diagonal_only bool

If True, add only diagonal elements (eigenvalues). If False, add all lower-triangular elements.

False
molecule_idx int

Index into the molecules list.

0
weights dict[str, float] | None

Weight overrides. Keys: "eig_i" (1st eigenvalue), "eig_d_low" (diagonal, value < threshold), "eig_d_high" (diagonal, value ≥ threshold), "eig_o" (off-diagonal). Defaults match legacy: {eig_i: 0.0, eig_d_low: 0.1, eig_d_high: 0.1, eig_o: 0.05}.

None
skip_first bool

If True, the first eigenvalue gets weight eig_i (default 0.0, effectively skipping it). This is standard for TS fitting where the first mode is imaginary.

True
eigenvalue_threshold float

Eigenvalue threshold (Hartree/Bohr²) separating low/high frequency modes for weight assignment. The default 0.1173 corresponds to the legacy threshold of 1100 kJ/(mol·Å²).

0.1173

Returns:

Name Type Description
int int

Number of entries added.

Source code in q2mm/optimizers/objective.py
def add_eigenmatrix_from_hessian(
    self,
    hessian: np.ndarray,
    *,
    diagonal_only: bool = False,
    molecule_idx: int = 0,
    weights: dict[str, float] | None = None,
    skip_first: bool = True,
    eigenvalue_threshold: float = 0.1173,
) -> int:
    """Bulk-load eigenmatrix training data from a QM Hessian.

    Decomposes the Hessian, computes the eigenmatrix, and adds all
    elements as reference values with the legacy weight scheme.

    The Hessian should be in canonical units (Hartree/Bohr²).

    Args:
        hessian (np.ndarray): QM Hessian matrix ``(3N, 3N)`` in
            Hartree/Bohr².
        diagonal_only (bool): If ``True``, add only diagonal elements
            (eigenvalues). If ``False``, add all lower-triangular
            elements.
        molecule_idx (int): Index into the molecules list.
        weights (dict[str, float] | None): Weight overrides. Keys:
            ``"eig_i"`` (1st eigenvalue), ``"eig_d_low"`` (diagonal,
            value < threshold), ``"eig_d_high"`` (diagonal, value ≥
            threshold), ``"eig_o"`` (off-diagonal). Defaults match
            legacy: ``{eig_i: 0.0, eig_d_low: 0.1, eig_d_high: 0.1,
            eig_o: 0.05}``.
        skip_first (bool): If ``True``, the first eigenvalue gets
            weight ``eig_i`` (default 0.0, effectively skipping it).
            This is standard for TS fitting where the first mode is
            imaginary.
        eigenvalue_threshold (float): Eigenvalue threshold
            (Hartree/Bohr²) separating low/high frequency modes for
            weight assignment. The default 0.1173 corresponds to the
            legacy threshold of 1100 kJ/(mol·Å²).

    Returns:
        int: Number of entries added.
    """
    from q2mm.models.hessian import decompose, transform_to_eigenmatrix, extract_eigenmatrix_data

    w = {"eig_i": 0.0, "eig_d_low": 0.1, "eig_d_high": 0.1, "eig_o": 0.05}
    if weights:
        w.update(weights)

    eigenvalues, eigenvectors = decompose(hessian)
    eigenmatrix = transform_to_eigenmatrix(hessian, eigenvectors)
    elements = extract_eigenmatrix_data(eigenmatrix, diagonal_only=diagonal_only)

    added = 0
    for row, col, value in elements:
        if row == col:
            # Diagonal element
            if row == 0 and skip_first:
                weight = w["eig_i"]
            elif value < eigenvalue_threshold:
                weight = w["eig_d_low"]
            else:
                weight = w["eig_d_high"]
            self.add_hessian_eigenvalue(
                value,
                mode_idx=row,
                weight=weight,
                molecule_idx=molecule_idx,
                label=f"eig[{row}]",
            )
        else:
            # Off-diagonal element
            self.add_hessian_offdiagonal(
                value,
                row=row,
                col=col,
                weight=w["eig_o"],
                molecule_idx=molecule_idx,
                label=f"eig[{row},{col}]",
            )
        added += 1
    return added

add_frequencies_from_array

add_frequencies_from_array(frequencies: ndarray | list[float], *, weight: float = 1.0, molecule_idx: int = 0, skip_imaginary: bool = False) -> int

Add all frequencies from a 1-D array.

Parameters:

Name Type Description Default
frequencies ndarray | list[float]

Vibrational frequencies (cm⁻¹). Imaginary modes should be negative values.

required
weight float

Weight applied to every frequency entry.

1.0
molecule_idx int

Index into the molecules list for multi-structure fits.

0
skip_imaginary bool

If True, negative frequencies (imaginary modes) are skipped.

False

Returns:

Name Type Description
int int

Number of frequency entries added.

Source code in q2mm/optimizers/objective.py
def add_frequencies_from_array(
    self,
    frequencies: np.ndarray | list[float],
    *,
    weight: float = 1.0,
    molecule_idx: int = 0,
    skip_imaginary: bool = False,
) -> int:
    """Add all frequencies from a 1-D array.

    Args:
        frequencies (np.ndarray | list[float]): Vibrational frequencies
            (cm⁻¹). Imaginary modes should be negative values.
        weight (float): Weight applied to every frequency entry.
        molecule_idx (int): Index into the molecules list for
            multi-structure fits.
        skip_imaginary (bool): If ``True``, negative frequencies
            (imaginary modes) are skipped.

    Returns:
        int: Number of frequency entries added.
    """
    freqs = np.asarray(frequencies, dtype=float).ravel()
    added = 0
    for i, freq in enumerate(freqs):
        if skip_imaginary and freq < 0:
            continue
        self.add_frequency(
            float(freq),
            data_idx=i,
            weight=weight,
            molecule_idx=molecule_idx,
            label=f"mode {i}",
        )
        added += 1
    return added

from_molecule classmethod

from_molecule(mol: Q2MMMolecule, *, weights: dict[str, float] | None = None, molecule_idx: int = 0, frequencies: ndarray | list[float] | None = None, skip_imaginary: bool = False, include_eigenmatrix: bool = False, eigenmatrix_diagonal_only: bool = False) -> ReferenceData

Auto-populate reference data from a molecule's detected geometry.

Extracts all auto-detected bond lengths and bond angles from the molecule. Optionally adds vibrational frequencies and/or Hessian eigenmatrix training data.

Parameters:

Name Type Description Default
mol Q2MMMolecule

Molecule with geometry (bonds/angles auto-detected).

required
weights dict[str, float] | None

Weight overrides keyed by data type. Supported keys: "bond_length", "bond_angle", "frequency", and the eigenmatrix keys "eig_i", "eig_d_low", "eig_d_high", "eig_o". Defaults: {"bond_length": 10.0, "bond_angle": 5.0, "frequency": 1.0}.

None
molecule_idx int

Index for multi-molecule fits.

0
frequencies ndarray | list[float] | None

Vibrational frequencies (cm⁻¹) to include.

None
skip_imaginary bool

If True, negative frequencies are skipped.

False
include_eigenmatrix bool

If True and the molecule has a Hessian, add eigenmatrix training data (diagonal and optionally off-diagonal elements).

False
eigenmatrix_diagonal_only bool

If True, only diagonal eigenmatrix elements are added.

False

Returns:

Name Type Description
ReferenceData ReferenceData

Populated with bond lengths, angles, and (optionally) frequencies and eigenmatrix data.

Source code in q2mm/optimizers/objective.py
@classmethod
def from_molecule(
    cls,
    mol: Q2MMMolecule,
    *,
    weights: dict[str, float] | None = None,
    molecule_idx: int = 0,
    frequencies: np.ndarray | list[float] | None = None,
    skip_imaginary: bool = False,
    include_eigenmatrix: bool = False,
    eigenmatrix_diagonal_only: bool = False,
) -> ReferenceData:
    """Auto-populate reference data from a molecule's detected geometry.

    Extracts all auto-detected bond lengths and bond angles from the
    molecule. Optionally adds vibrational frequencies and/or Hessian
    eigenmatrix training data.

    Args:
        mol (Q2MMMolecule): Molecule with geometry (bonds/angles
            auto-detected).
        weights (dict[str, float] | None): Weight overrides keyed by
            data type. Supported keys: ``"bond_length"``,
            ``"bond_angle"``, ``"frequency"``, and the eigenmatrix
            keys ``"eig_i"``, ``"eig_d_low"``, ``"eig_d_high"``,
            ``"eig_o"``. Defaults: ``{"bond_length": 10.0,
            "bond_angle": 5.0, "frequency": 1.0}``.
        molecule_idx (int): Index for multi-molecule fits.
        frequencies (np.ndarray | list[float] | None): Vibrational
            frequencies (cm⁻¹) to include.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        include_eigenmatrix (bool): If ``True`` and the molecule has
            a Hessian, add eigenmatrix training data (diagonal and
            optionally off-diagonal elements).
        eigenmatrix_diagonal_only (bool): If ``True``, only diagonal
            eigenmatrix elements are added.

    Returns:
        ReferenceData: Populated with bond lengths, angles, and
            (optionally) frequencies and eigenmatrix data.
    """
    w = {"bond_length": 10.0, "bond_angle": 5.0, "frequency": 1.0}
    if weights:
        w.update(weights)

    ref = cls()

    for bond in mol.bonds:
        ref.add_bond_length(
            bond.length,
            atom_indices=(bond.atom_i, bond.atom_j),
            weight=w["bond_length"],
            molecule_idx=molecule_idx,
            label=f"{bond.element_pair} bond",
        )

    for angle in mol.angles:
        ref.add_bond_angle(
            angle.value,
            atom_indices=(angle.atom_i, angle.atom_j, angle.atom_k),
            weight=w["bond_angle"],
            molecule_idx=molecule_idx,
            label=f"{angle.elements} angle",
        )

    if frequencies is not None:
        ref.add_frequencies_from_array(
            frequencies,
            weight=w["frequency"],
            molecule_idx=molecule_idx,
            skip_imaginary=skip_imaginary,
        )

    if include_eigenmatrix and mol.hessian is not None:
        eig_weights = {k: w[k] for k in ("eig_i", "eig_d_low", "eig_d_high", "eig_o") if k in w}
        ref.add_eigenmatrix_from_hessian(
            mol.hessian,
            diagonal_only=eigenmatrix_diagonal_only,
            molecule_idx=molecule_idx,
            weights=eig_weights or None,
        )

    return ref

from_molecules classmethod

from_molecules(molecules: list[Q2MMMolecule], *, weights: dict[str, float] | None = None, frequencies_list: list[ndarray | list[float]] | None = None, skip_imaginary: bool = False, include_eigenmatrix: bool = False, eigenmatrix_diagonal_only: bool = False) -> ReferenceData

Auto-populate reference data from multiple molecules.

Each molecule is assigned a sequential molecule_idx starting from 0. Delegates to :meth:from_molecule per molecule.

Parameters:

Name Type Description Default
molecules list[Q2MMMolecule]

Training set molecules.

required
weights dict[str, float] | None

Weight overrides (same keys as :meth:from_molecule).

None
frequencies_list list[ndarray | list[float]] | None

Per-molecule frequencies. Must have the same length as molecules if provided.

None
skip_imaginary bool

If True, negative frequencies are skipped.

False
include_eigenmatrix bool

If True and a molecule has a Hessian, add eigenmatrix data.

False
eigenmatrix_diagonal_only bool

If True, only diagonal eigenmatrix elements are added.

False

Returns:

Name Type Description
ReferenceData ReferenceData

Combined reference data for all molecules.

Raises:

Type Description
ValueError

If frequencies_list length does not match molecules length.

Source code in q2mm/optimizers/objective.py
@classmethod
def from_molecules(
    cls,
    molecules: list[Q2MMMolecule],
    *,
    weights: dict[str, float] | None = None,
    frequencies_list: list[np.ndarray | list[float]] | None = None,
    skip_imaginary: bool = False,
    include_eigenmatrix: bool = False,
    eigenmatrix_diagonal_only: bool = False,
) -> ReferenceData:
    """Auto-populate reference data from multiple molecules.

    Each molecule is assigned a sequential ``molecule_idx`` starting
    from 0.  Delegates to :meth:`from_molecule` per molecule.

    Args:
        molecules (list[Q2MMMolecule]): Training set molecules.
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        frequencies_list (list[np.ndarray | list[float]] | None):
            Per-molecule frequencies. Must have the same length as
            *molecules* if provided.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        include_eigenmatrix (bool): If ``True`` and a molecule has a
            Hessian, add eigenmatrix data.
        eigenmatrix_diagonal_only (bool): If ``True``, only diagonal
            eigenmatrix elements are added.

    Returns:
        ReferenceData: Combined reference data for all molecules.

    Raises:
        ValueError: If ``frequencies_list`` length does not match
            ``molecules`` length.
    """
    if frequencies_list is not None and len(frequencies_list) != len(molecules):
        raise ValueError(
            f"frequencies_list length ({len(frequencies_list)}) must match molecules length ({len(molecules)})."
        )

    ref = cls()
    for idx, mol in enumerate(molecules):
        single = cls.from_molecule(
            mol,
            weights=weights,
            molecule_idx=idx,
            frequencies=frequencies_list[idx] if frequencies_list is not None else None,
            skip_imaginary=skip_imaginary,
            include_eigenmatrix=include_eigenmatrix,
            eigenmatrix_diagonal_only=eigenmatrix_diagonal_only,
        )
        ref.values.extend(single.values)

    return ref

from_gaussian classmethod

from_gaussian(path: str | Path, *, weights: dict[str, float] | None = None, bond_tolerance: float = 1.3, charge: int = 0, multiplicity: int = 1, include_frequencies: bool = True, skip_imaginary: bool = False, au_hessian: bool = True) -> tuple[ReferenceData, Q2MMMolecule]

Build reference data from a Gaussian log file.

Parses the log file for the optimised geometry and vibrational frequencies, then auto-populates bond lengths, angles, and (optionally) frequencies.

Parameters:

Name Type Description Default
path str | Path

Path to the Gaussian .log file (from an opt freq job).

required
weights dict[str, float] | None

Weight overrides (same keys as :meth:from_molecule).

None
bond_tolerance float

Multiplier for covalent-radii bond detection. Use 1.4+ for TS.

1.3
charge int

Molecular charge.

0
multiplicity int

Spin multiplicity.

1
include_frequencies bool

Whether to add frequency data from the log file.

True
skip_imaginary bool

If True, negative frequencies are skipped.

False
au_hessian bool

Keep Hessian in atomic units (Hartree/Bohr²).

True

Returns:

Type Description
tuple[ReferenceData, Q2MMMolecule]

tuple[ReferenceData, Q2MMMolecule]: Populated reference data and the parsed molecule (with Hessian attached if available).

Source code in q2mm/optimizers/objective.py
@classmethod
def from_gaussian(
    cls,
    path: str | Path,
    *,
    weights: dict[str, float] | None = None,
    bond_tolerance: float = 1.3,
    charge: int = 0,
    multiplicity: int = 1,
    include_frequencies: bool = True,
    skip_imaginary: bool = False,
    au_hessian: bool = True,
) -> tuple[ReferenceData, Q2MMMolecule]:
    """Build reference data from a Gaussian log file.

    Parses the log file for the optimised geometry and vibrational
    frequencies, then auto-populates bond lengths, angles, and
    (optionally) frequencies.

    Args:
        path (str | Path): Path to the Gaussian ``.log`` file
            (from an ``opt freq`` job).
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        bond_tolerance (float): Multiplier for covalent-radii bond
            detection. Use 1.4+ for TS.
        charge (int): Molecular charge.
        multiplicity (int): Spin multiplicity.
        include_frequencies (bool): Whether to add frequency data
            from the log file.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        au_hessian (bool): Keep Hessian in atomic units
            (Hartree/Bohr²).

    Returns:
        tuple[ReferenceData, Q2MMMolecule]: Populated reference data
            and the parsed molecule (with Hessian attached if
            available).
    """
    from q2mm.parsers.gaussian import GaussLog
    from q2mm.models.hessian import reform_hessian

    log = GaussLog(str(path), au_hessian=au_hessian)

    # Build molecule from the last (optimised) structure
    mol = log.molecules[-1]
    mol.charge = charge
    mol.multiplicity = multiplicity
    mol.bond_tolerance = bond_tolerance

    # Override hessian with eigenvalue-reconstructed version if available
    if log.evals is not None and log.evecs is not None and log.evals.size and log.evecs.size:
        mol.hessian = reform_hessian(log.evals, log.evecs)

    # Frequencies in cm⁻¹ from the Gaussian log
    # Note: log.evals are eigenvalues (mass-weighted force constants in
    # atomic units), NOT frequencies.  Use log.frequencies for cm⁻¹ values.
    frequencies = None
    if include_frequencies and log.frequencies is not None and len(log.frequencies):
        frequencies = np.array(log.frequencies)

    ref = cls.from_molecule(
        mol,
        weights=weights,
        frequencies=frequencies,
        skip_imaginary=skip_imaginary,
    )

    return ref, mol

from_fchk classmethod

from_fchk(path: str | Path, *, weights: dict[str, float] | None = None, bond_tolerance: float = 1.3, charge: int = 0, multiplicity: int = 1) -> tuple[ReferenceData, Q2MMMolecule]

Build reference data from a Gaussian formatted checkpoint file.

Parses the .fchk file for geometry, Cartesian Force Constants (Hessian), and atom data. Auto-populates bond lengths and angles.

Parameters:

Name Type Description Default
path str | Path

Path to the Gaussian .fchk file.

required
weights dict[str, float] | None

Weight overrides (same keys as :meth:from_molecule).

None
bond_tolerance float

Multiplier for covalent-radii bond detection.

1.3
charge int

Molecular charge (overridden by file values if present).

0
multiplicity int

Spin multiplicity (overridden by file values if present).

1

Returns:

Type Description
tuple[ReferenceData, Q2MMMolecule]

tuple[ReferenceData, Q2MMMolecule]: Populated reference data and the parsed molecule with Hessian.

Source code in q2mm/optimizers/objective.py
@classmethod
def from_fchk(
    cls,
    path: str | Path,
    *,
    weights: dict[str, float] | None = None,
    bond_tolerance: float = 1.3,
    charge: int = 0,
    multiplicity: int = 1,
) -> tuple[ReferenceData, Q2MMMolecule]:
    """Build reference data from a Gaussian formatted checkpoint file.

    Parses the ``.fchk`` file for geometry, Cartesian Force Constants
    (Hessian), and atom data. Auto-populates bond lengths and angles.

    Args:
        path (str | Path): Path to the Gaussian ``.fchk`` file.
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        bond_tolerance (float): Multiplier for covalent-radii bond
            detection.
        charge (int): Molecular charge (overridden by file values
            if present).
        multiplicity (int): Spin multiplicity (overridden by file
            values if present).

    Returns:
        tuple[ReferenceData, Q2MMMolecule]: Populated reference data
            and the parsed molecule with Hessian.
    """
    path = Path(path)
    symbols, coords_ang, hessian, file_charge, file_mult = _parse_fchk(path)

    mol = Q2MMMolecule(
        symbols=symbols,
        geometry=coords_ang,
        charge=file_charge if file_charge is not None else charge,
        multiplicity=file_mult if file_mult is not None else multiplicity,
        name=path.stem,
        bond_tolerance=bond_tolerance,
        hessian=hessian,
    )

    ref = cls.from_molecule(mol, weights=weights)

    return ref, mol

ObjectiveFunction

ObjectiveFunction(forcefield: ForceField, engine: MMEngine, molecules: list[Q2MMMolecule], reference: ReferenceData)

Objective function for scipy-based force field optimization.

Evaluates the weighted sum-of-squares between MM-calculated and reference data for one or more molecules.

Parameters:

Name Type Description Default
forcefield ForceField

The force field whose parameters are being optimized.

required
engine MMEngine

The MM backend (OpenMM, Tinker, etc.).

required
molecules list[Q2MMMolecule]

Training set molecules.

required
reference ReferenceData

QM/experimental reference observations.

required

Initialize the objective function.

Parameters:

Name Type Description Default
forcefield ForceField

The force field whose parameters are being optimized.

required
engine MMEngine

The MM backend (OpenMM, Tinker, etc.).

required
molecules list[Q2MMMolecule]

Training set molecules.

required
reference ReferenceData

QM/experimental reference observations.

required
Source code in q2mm/optimizers/objective.py
def __init__(
    self,
    forcefield: ForceField,
    engine: MMEngine,
    molecules: list[Q2MMMolecule],
    reference: ReferenceData,
):
    """Initialize the objective function.

    Args:
        forcefield (ForceField): The force field whose parameters
            are being optimized.
        engine (MMEngine): The MM backend (OpenMM, Tinker, etc.).
        molecules (list[Q2MMMolecule]): Training set molecules.
        reference (ReferenceData): QM/experimental reference
            observations.
    """
    self.forcefield = forcefield
    self.engine = engine
    self.molecules = molecules
    self.reference = reference
    self.n_eval = 0
    self.history: list[float] = []
    # Reusable engine handles for backends that support runtime parameter
    # updates (e.g., OpenMM). Avoids rebuilding simulation contexts each
    # evaluation — critical for optimization performance.
    self._handles: dict[int, object] = {}
    # Cached QM eigenvectors per molecule (precomputed once for
    # eigenmatrix comparisons — the QM basis is fixed).
    self._qm_eigenvectors: dict[int, np.ndarray] = {}

__call__

__call__(param_vector: ndarray) -> float

Evaluate objective for a given parameter vector.

This is the function signature that :func:scipy.optimize.minimize expects: f(x) -> float.

Parameters:

Name Type Description Default
param_vector ndarray

Flat parameter vector.

required

Returns:

Name Type Description
float float

Sum-of-squared weighted residuals.

Source code in q2mm/optimizers/objective.py
def __call__(self, param_vector: np.ndarray) -> float:
    """Evaluate objective for a given parameter vector.

    This is the function signature that :func:`scipy.optimize.minimize`
    expects: ``f(x) -> float``.

    Args:
        param_vector (np.ndarray): Flat parameter vector.

    Returns:
        float: Sum-of-squared weighted residuals.
    """
    self.forcefield.set_param_vector(param_vector)

    residuals = self._compute_residuals()
    score = float(np.sum(residuals**2))

    self.n_eval += 1
    self.history.append(score)
    return score

residuals

residuals(param_vector: ndarray) -> ndarray

Compute weighted residual vector (for least-squares methods).

Parameters:

Name Type Description Default
param_vector ndarray

Flat parameter vector.

required

Returns:

Type Description
ndarray

np.ndarray: Weighted residuals for each reference observation.

Source code in q2mm/optimizers/objective.py
def residuals(self, param_vector: np.ndarray) -> np.ndarray:
    """Compute weighted residual vector (for least-squares methods).

    Args:
        param_vector (np.ndarray): Flat parameter vector.

    Returns:
        np.ndarray: Weighted residuals for each reference observation.
    """
    self.forcefield.set_param_vector(param_vector)
    r = self._compute_residuals()
    self.n_eval += 1
    self.history.append(float(np.sum(r**2)))
    return r

gradient

gradient(param_vector: ndarray) -> ndarray

Compute analytical gradient of the score w.r.t. parameters.

Uses the engine's energy_and_param_grad() method (available on :class:~q2mm.backends.mm.jax_engine.JaxEngine) to compute exact derivatives for energy reference data. Raises NotImplementedError for reference data types that require Hessians or minimized geometries; use jac=None (finite differences) for mixed reference data.

The score is sum_i (w_i * (ref_i - calc_i))**2, so:

d(score)/d(p) = -2 * sum_i [w_i^2 * (ref_i - calc_i) * d(calc_i)/d(p)]

Note

This method does not increment n_eval or append to history. SciPy's minimize calls fun(x) and jac(x) separately, so tracking state here would double-count evaluations. Evaluation counting is handled exclusively in __call__.

Parameters:

Name Type Description Default
param_vector ndarray

Flat parameter vector (same as :meth:__call__).

required

Returns:

Type Description
ndarray

np.ndarray: Gradient of the score with respect to each parameter.

Raises:

Type Description
TypeError

If the engine does not support energy_and_param_grad().

NotImplementedError

If the reference data contains types other than energy (Hessian/frequency/geometry gradient support is planned).

Source code in q2mm/optimizers/objective.py
def gradient(self, param_vector: np.ndarray) -> np.ndarray:
    """Compute analytical gradient of the score w.r.t. parameters.

    Uses the engine's ``energy_and_param_grad()`` method (available on
    :class:`~q2mm.backends.mm.jax_engine.JaxEngine`) to compute exact
    derivatives for energy reference data.  Raises ``NotImplementedError``
    for reference data types that require Hessians or minimized geometries;
    use ``jac=None`` (finite differences) for mixed reference data.

    The score is ``sum_i (w_i * (ref_i - calc_i))**2``, so:

    ``d(score)/d(p) = -2 * sum_i [w_i^2 * (ref_i - calc_i) * d(calc_i)/d(p)]``

    Note:
        This method does **not** increment ``n_eval`` or append to
        ``history``.  SciPy's ``minimize`` calls ``fun(x)`` and ``jac(x)``
        separately, so tracking state here would double-count evaluations.
        Evaluation counting is handled exclusively in ``__call__``.

    Args:
        param_vector (np.ndarray): Flat parameter vector (same as
            :meth:`__call__`).

    Returns:
        np.ndarray: Gradient of the score with respect to each parameter.

    Raises:
        TypeError: If the engine does not support
            ``energy_and_param_grad()``.
        NotImplementedError: If the reference data contains types
            other than ``energy`` (Hessian/frequency/geometry gradient
            support is planned).
    """
    if not self.engine.supports_analytical_gradients():
        raise TypeError(
            f"{self.engine.name} does not support analytical gradients. "
            "Use a JaxEngine or another differentiable engine."
        )

    self.forcefield.set_param_vector(param_vector)

    # Check which data types are needed
    all_kinds = {ref.kind for ref in self.reference.values}
    unsupported = all_kinds - {"energy"}
    if unsupported:
        raise NotImplementedError(
            f"Analytical gradients not yet supported for data types: {unsupported}. "
            "Only 'energy' references are supported. Use finite differences (jac=None) "
            "for mixed reference data, or contribute Hessian/geometry gradient support."
        )

    # Compute energy + gradient for each molecule
    n_params = len(param_vector)
    total_grad = np.zeros(n_params)
    energy_cache: dict[int, tuple[float, np.ndarray]] = {}

    for ref in self.reference.values:
        mol_idx = ref.molecule_idx
        if mol_idx not in energy_cache:
            structure = self._get_structure(mol_idx)
            energy_cache[mol_idx] = self.engine.energy_and_param_grad(structure, self.forcefield)

        calc_value, calc_grad = energy_cache[mol_idx]
        diff = ref.value - calc_value
        # d(score)/d(p) = -2 * w^2 * (ref - calc) * d(calc)/d(p)
        total_grad += -2.0 * ref.weight**2 * diff * calc_grad

    return total_grad

reset

reset()

Reset evaluation counter, history, and cached engine handles.

Source code in q2mm/optimizers/objective.py
def reset(self):
    """Reset evaluation counter, history, and cached engine handles."""
    self.n_eval = 0
    self.history.clear()
    self._handles.clear()
    self._qm_eigenvectors.clear()