Objective Function¶

objective ¶

Objective function for force field optimization.

Wraps the ForceField ↔ MM-engine ↔ reference-data loop into a single callable that :func:scipy.optimize.minimize can drive.

Scoring approach¶

This module uses raw weighted residuals (modern approach):

.. math:: r_i = w_i (x_{ref,i} - x_{calc,i})

The objective value is sum(r_i**2). This is the standard form expected by scipy.optimize.least_squares and gradient-based minimizers.

The legacy code in q2mm.optimizers.scoring uses a different normalisation inherited from upstream compare.py:

Energies are zero-referenced via correlate_energies() before scoring
A denominator based on the total count of energy-type data points is applied (see compare_data()).

For migration validation, use :func:q2mm.optimizers.scoring.compare_data directly — it is importable and usable standalone to cross-check scores against the upstream code path.

ReferenceValue `dataclass` ¶

ReferenceValue(kind: Literal['energy', 'frequency', 'bond_length', 'bond_angle', 'torsion_angle', 'eig_diagonal', 'eig_offdiagonal'], value: float, weight: float = 1.0, label: str = '', molecule_idx: int = 0, data_idx: int = 0, atom_indices: tuple[int, ...] | None = None)

A single reference observation (QM or experimental).

ReferenceData `dataclass` ¶

ReferenceData(values: list[ReferenceValue] = list())

Complete set of reference data for an optimization.

Each entry describes one observable: an energy, a frequency, or a geometric parameter that the force field should reproduce.

n_observations `property` ¶

n_observations: int

Total number of reference observations.

Returns:

Name	Type	Description
`int`	`int`	Length of the `values` list.

add_energy ¶

add_energy(value: float, *, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single energy reference value.

Parameters:

Name	Type	Description	Default
`value`	`float`	Reference energy value.	required
`weight`	`float`	Weight for this entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Source code in q2mm/optimizers/objective.py

def add_energy(
    self,
    value: float,
    *,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single energy reference value.

    Args:
        value (float): Reference energy value.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="energy",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            label=label,
        )
    )

add_frequency ¶

add_frequency(value: float, *, data_idx: int, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single vibrational frequency reference value.

Parameters:

Name	Type	Description	Default
`value`	`float`	Reference frequency in cm⁻¹.	required
`data_idx`	`int`	0-based index of this frequency mode.	required
`weight`	`float`	Weight for this entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Source code in q2mm/optimizers/objective.py

def add_frequency(
    self,
    value: float,
    *,
    data_idx: int,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single vibrational frequency reference value.

    Args:
        value (float): Reference frequency in cm⁻¹.
        data_idx (int): 0-based index of this frequency mode.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="frequency",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=data_idx,
            label=label,
        )
    )

add_bond_length ¶

add_bond_length(value: float, *, data_idx: int = -1, atom_indices: tuple[int, int] | None = None, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single bond length reference value.

Parameters:

Name	Type	Description	Default
`value`	`float`	Reference bond length in Ångströms.	required
`data_idx`	`int`	0-based positional index (fallback if `atom_indices` is not provided).	`-1`
`atom_indices`	`tuple[int, int] \| None`	Atom pair for identity-based matching.	`None`
`weight`	`float`	Weight for this entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Raises:

Type	Description
`ValueError`	If neither `atom_indices` nor a non-negative `data_idx` is provided.

Source code in q2mm/optimizers/objective.py

def add_bond_length(
    self,
    value: float,
    *,
    data_idx: int = -1,
    atom_indices: tuple[int, int] | None = None,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single bond length reference value.

    Args:
        value (float): Reference bond length in Ångströms.
        data_idx (int): 0-based positional index (fallback if
            ``atom_indices`` is not provided).
        atom_indices (tuple[int, int] | None): Atom pair for
            identity-based matching.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.

    Raises:
        ValueError: If neither ``atom_indices`` nor a non-negative
            ``data_idx`` is provided.
    """
    if atom_indices is None and data_idx < 0:
        raise ValueError("Either atom_indices or data_idx must be provided for bond_length.")
    self.values.append(
        ReferenceValue(
            kind="bond_length",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=max(data_idx, 0),
            atom_indices=atom_indices,
            label=label,
        )
    )

add_bond_angle ¶

add_bond_angle(value: float, *, data_idx: int = -1, atom_indices: tuple[int, int, int] | None = None, weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single bond angle reference value.

Parameters:

Name	Type	Description	Default
`value`	`float`	Reference bond angle in degrees.	required
`data_idx`	`int`	0-based positional index (fallback if `atom_indices` is not provided).	`-1`
`atom_indices`	`tuple[int, int, int] \| None`	Atom triple (i, j, k) for identity-based matching.	`None`
`weight`	`float`	Weight for this entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Raises:

Type	Description
`ValueError`	If neither `atom_indices` nor a non-negative `data_idx` is provided.

Source code in q2mm/optimizers/objective.py

def add_bond_angle(
    self,
    value: float,
    *,
    data_idx: int = -1,
    atom_indices: tuple[int, int, int] | None = None,
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single bond angle reference value.

    Args:
        value (float): Reference bond angle in degrees.
        data_idx (int): 0-based positional index (fallback if
            ``atom_indices`` is not provided).
        atom_indices (tuple[int, int, int] | None): Atom triple
            (i, j, k) for identity-based matching.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.

    Raises:
        ValueError: If neither ``atom_indices`` nor a non-negative
            ``data_idx`` is provided.
    """
    if atom_indices is None and data_idx < 0:
        raise ValueError("Either atom_indices or data_idx must be provided for bond_angle.")
    self.values.append(
        ReferenceValue(
            kind="bond_angle",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=max(data_idx, 0),
            atom_indices=atom_indices,
            label=label,
        )
    )

add_torsion_angle ¶

add_torsion_angle(value: float, *, atom_indices: tuple[int, int, int, int], weight: float = 1.0, molecule_idx: int = 0, label: str = '')

Add a single torsion (dihedral) angle reference value.

Parameters:

Name	Type	Description	Default
`value`	`float`	Reference torsion angle in degrees.	required
`atom_indices`	`tuple[int, int, int, int]`	Four atom indices defining the dihedral.	required
`weight`	`float`	Weight for this entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Source code in q2mm/optimizers/objective.py

def add_torsion_angle(
    self,
    value: float,
    *,
    atom_indices: tuple[int, int, int, int],
    weight: float = 1.0,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a single torsion (dihedral) angle reference value.

    Args:
        value (float): Reference torsion angle in degrees.
        atom_indices (tuple[int, int, int, int]): Four atom indices
            defining the dihedral.
        weight (float): Weight for this entry.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="torsion_angle",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            atom_indices=atom_indices,
            label=label,
        )
    )

add_hessian_eigenvalue ¶

add_hessian_eigenvalue(value: float, *, mode_idx: int, weight: float = 0.1, molecule_idx: int = 0, label: str = '')

Add a diagonal element (eigenvalue) of the eigenmatrix.

Parameters:

Name	Type	Description	Default
`value`	`float`	QM eigenvalue for this mode.	required
`mode_idx`	`int`	0-based index of the vibrational mode.	required
`weight`	`float`	Weight for this entry. Legacy defaults: 0.10 for both low- and high-frequency modes, 0.00 for the first (imaginary) mode.	`0.1`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Source code in q2mm/optimizers/objective.py

def add_hessian_eigenvalue(
    self,
    value: float,
    *,
    mode_idx: int,
    weight: float = 0.1,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add a diagonal element (eigenvalue) of the eigenmatrix.

    Args:
        value (float): QM eigenvalue for this mode.
        mode_idx (int): 0-based index of the vibrational mode.
        weight (float): Weight for this entry. Legacy defaults: 0.10
            for both low- and high-frequency modes, 0.00 for the
            first (imaginary) mode.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="eig_diagonal",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            data_idx=mode_idx,
            label=label,
        )
    )

add_hessian_offdiagonal ¶

add_hessian_offdiagonal(value: float, *, row: int, col: int, weight: float = 0.05, molecule_idx: int = 0, label: str = '')

Add an off-diagonal element of the eigenmatrix.

Off-diagonal elements measure cross-coupling between modes. They should be close to zero when the MM Hessian closely reproduces the QM eigenvector structure.

Parameters:

Name	Type	Description	Default
`value`	`float`	QM eigenmatrix element (typically 0.0 for the QM self-projection).	required
`row`	`int`	0-based row index into the eigenmatrix.	required
`col`	`int`	0-based column index into the eigenmatrix.	required
`weight`	`float`	Weight for this entry. Legacy default: 0.05.	`0.05`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`label`	`str`	Human-readable label.	`''`

Source code in q2mm/optimizers/objective.py

def add_hessian_offdiagonal(
    self,
    value: float,
    *,
    row: int,
    col: int,
    weight: float = 0.05,
    molecule_idx: int = 0,
    label: str = "",
):
    """Add an off-diagonal element of the eigenmatrix.

    Off-diagonal elements measure cross-coupling between modes.
    They should be close to zero when the MM Hessian closely
    reproduces the QM eigenvector structure.

    Args:
        value (float): QM eigenmatrix element (typically 0.0 for
            the QM self-projection).
        row (int): 0-based row index into the eigenmatrix.
        col (int): 0-based column index into the eigenmatrix.
        weight (float): Weight for this entry. Legacy default: 0.05.
        molecule_idx (int): Index into the molecules list.
        label (str): Human-readable label.
    """
    self.values.append(
        ReferenceValue(
            kind="eig_offdiagonal",
            value=value,
            weight=weight,
            molecule_idx=molecule_idx,
            atom_indices=(row, col),
            label=label,
        )
    )

add_eigenmatrix_from_hessian ¶

add_eigenmatrix_from_hessian(hessian: ndarray, *, diagonal_only: bool = False, molecule_idx: int = 0, weights: dict[str, float] | None = None, skip_first: bool = True, eigenvalue_threshold: float = 0.1173) -> int

Bulk-load eigenmatrix training data from a QM Hessian.

Decomposes the Hessian, computes the eigenmatrix, and adds all elements as reference values with the legacy weight scheme.

The Hessian should be in canonical units (Hartree/Bohr²).

Parameters:

Name	Type	Description	Default
`hessian`	`ndarray`	QM Hessian matrix `(3N, 3N)` in Hartree/Bohr².	required
`diagonal_only`	`bool`	If `True`, add only diagonal elements (eigenvalues). If `False`, add all lower-triangular elements.	`False`
`molecule_idx`	`int`	Index into the molecules list.	`0`
`weights`	`dict[str, float] \| None`	Weight overrides. Keys: `"eig_i"` (1st eigenvalue), `"eig_d_low"` (diagonal, value < threshold), `"eig_d_high"` (diagonal, value ≥ threshold), `"eig_o"` (off-diagonal). Defaults match legacy: `{eig_i: 0.0, eig_d_low: 0.1, eig_d_high: 0.1, eig_o: 0.05}`.	`None`
`skip_first`	`bool`	If `True`, the first eigenvalue gets weight `eig_i` (default 0.0, effectively skipping it). This is standard for TS fitting where the first mode is imaginary.	`True`
`eigenvalue_threshold`	`float`	Eigenvalue threshold (Hartree/Bohr²) separating low/high frequency modes for weight assignment. The default 0.1173 corresponds to the legacy threshold of 1100 kJ/(mol·Å²).	`0.1173`

Returns:

Name	Type	Description
`int`	`int`	Number of entries added.

Source code in q2mm/optimizers/objective.py

def add_eigenmatrix_from_hessian(
    self,
    hessian: np.ndarray,
    *,
    diagonal_only: bool = False,
    molecule_idx: int = 0,
    weights: dict[str, float] | None = None,
    skip_first: bool = True,
    eigenvalue_threshold: float = 0.1173,
) -> int:
    """Bulk-load eigenmatrix training data from a QM Hessian.

    Decomposes the Hessian, computes the eigenmatrix, and adds all
    elements as reference values with the legacy weight scheme.

    The Hessian should be in canonical units (Hartree/Bohr²).

    Args:
        hessian (np.ndarray): QM Hessian matrix ``(3N, 3N)`` in
            Hartree/Bohr².
        diagonal_only (bool): If ``True``, add only diagonal elements
            (eigenvalues). If ``False``, add all lower-triangular
            elements.
        molecule_idx (int): Index into the molecules list.
        weights (dict[str, float] | None): Weight overrides. Keys:
            ``"eig_i"`` (1st eigenvalue), ``"eig_d_low"`` (diagonal,
            value < threshold), ``"eig_d_high"`` (diagonal, value ≥
            threshold), ``"eig_o"`` (off-diagonal). Defaults match
            legacy: ``{eig_i: 0.0, eig_d_low: 0.1, eig_d_high: 0.1,
            eig_o: 0.05}``.
        skip_first (bool): If ``True``, the first eigenvalue gets
            weight ``eig_i`` (default 0.0, effectively skipping it).
            This is standard for TS fitting where the first mode is
            imaginary.
        eigenvalue_threshold (float): Eigenvalue threshold
            (Hartree/Bohr²) separating low/high frequency modes for
            weight assignment. The default 0.1173 corresponds to the
            legacy threshold of 1100 kJ/(mol·Å²).

    Returns:
        int: Number of entries added.
    """
    from q2mm.models.hessian import decompose, transform_to_eigenmatrix, extract_eigenmatrix_data

    w = {"eig_i": 0.0, "eig_d_low": 0.1, "eig_d_high": 0.1, "eig_o": 0.05}
    if weights:
        w.update(weights)

    eigenvalues, eigenvectors = decompose(hessian)
    eigenmatrix = transform_to_eigenmatrix(hessian, eigenvectors)
    elements = extract_eigenmatrix_data(eigenmatrix, diagonal_only=diagonal_only)

    added = 0
    for row, col, value in elements:
        if row == col:
            # Diagonal element
            if row == 0 and skip_first:
                weight = w["eig_i"]
            elif value < eigenvalue_threshold:
                weight = w["eig_d_low"]
            else:
                weight = w["eig_d_high"]
            self.add_hessian_eigenvalue(
                value,
                mode_idx=row,
                weight=weight,
                molecule_idx=molecule_idx,
                label=f"eig[{row}]",
            )
        else:
            # Off-diagonal element
            self.add_hessian_offdiagonal(
                value,
                row=row,
                col=col,
                weight=w["eig_o"],
                molecule_idx=molecule_idx,
                label=f"eig[{row},{col}]",
            )
        added += 1
    return added

add_frequencies_from_array ¶

add_frequencies_from_array(frequencies: ndarray | list[float], *, weight: float = 1.0, molecule_idx: int = 0, skip_imaginary: bool = False) -> int

Add all frequencies from a 1-D array.

Parameters:

Name	Type	Description	Default
`frequencies`	`ndarray \| list[float]`	Vibrational frequencies (cm⁻¹). Imaginary modes should be negative values.	required
`weight`	`float`	Weight applied to every frequency entry.	`1.0`
`molecule_idx`	`int`	Index into the molecules list for multi-structure fits.	`0`
`skip_imaginary`	`bool`	If `True`, negative frequencies (imaginary modes) are skipped.	`False`

Returns:

Name	Type	Description
`int`	`int`	Number of frequency entries added.

Source code in q2mm/optimizers/objective.py

def add_frequencies_from_array(
    self,
    frequencies: np.ndarray | list[float],
    *,
    weight: float = 1.0,
    molecule_idx: int = 0,
    skip_imaginary: bool = False,
) -> int:
    """Add all frequencies from a 1-D array.

    Args:
        frequencies (np.ndarray | list[float]): Vibrational frequencies
            (cm⁻¹). Imaginary modes should be negative values.
        weight (float): Weight applied to every frequency entry.
        molecule_idx (int): Index into the molecules list for
            multi-structure fits.
        skip_imaginary (bool): If ``True``, negative frequencies
            (imaginary modes) are skipped.

    Returns:
        int: Number of frequency entries added.
    """
    freqs = np.asarray(frequencies, dtype=float).ravel()
    added = 0
    for i, freq in enumerate(freqs):
        if skip_imaginary and freq < 0:
            continue
        self.add_frequency(
            float(freq),
            data_idx=i,
            weight=weight,
            molecule_idx=molecule_idx,
            label=f"mode {i}",
        )
        added += 1
    return added

from_molecule `classmethod` ¶

from_molecule(mol: Q2MMMolecule, *, weights: dict[str, float] | None = None, molecule_idx: int = 0, frequencies: ndarray | list[float] | None = None, skip_imaginary: bool = False, include_eigenmatrix: bool = False, eigenmatrix_diagonal_only: bool = False) -> ReferenceData

Auto-populate reference data from a molecule's detected geometry.

Extracts all auto-detected bond lengths and bond angles from the molecule. Optionally adds vibrational frequencies and/or Hessian eigenmatrix training data.

Parameters:

Name	Type	Description	Default
`mol`	`Q2MMMolecule`	Molecule with geometry (bonds/angles auto-detected).	required
`weights`	`dict[str, float] \| None`	Weight overrides keyed by data type. Supported keys: `"bond_length"`, `"bond_angle"`, `"frequency"`, and the eigenmatrix keys `"eig_i"`, `"eig_d_low"`, `"eig_d_high"`, `"eig_o"`. Defaults: `{"bond_length": 10.0, "bond_angle": 5.0, "frequency": 1.0}`.	`None`
`molecule_idx`	`int`	Index for multi-molecule fits.	`0`
`frequencies`	`ndarray \| list[float] \| None`	Vibrational frequencies (cm⁻¹) to include.	`None`
`skip_imaginary`	`bool`	If `True`, negative frequencies are skipped.	`False`
`include_eigenmatrix`	`bool`	If `True` and the molecule has a Hessian, add eigenmatrix training data (diagonal and optionally off-diagonal elements).	`False`
`eigenmatrix_diagonal_only`	`bool`	If `True`, only diagonal eigenmatrix elements are added.	`False`

Returns:

Name	Type	Description
`ReferenceData`	`ReferenceData`	Populated with bond lengths, angles, and (optionally) frequencies and eigenmatrix data.

Source code in q2mm/optimizers/objective.py

@classmethod
def from_molecule(
    cls,
    mol: Q2MMMolecule,
    *,
    weights: dict[str, float] | None = None,
    molecule_idx: int = 0,
    frequencies: np.ndarray | list[float] | None = None,
    skip_imaginary: bool = False,
    include_eigenmatrix: bool = False,
    eigenmatrix_diagonal_only: bool = False,
) -> ReferenceData:
    """Auto-populate reference data from a molecule's detected geometry.

    Extracts all auto-detected bond lengths and bond angles from the
    molecule. Optionally adds vibrational frequencies and/or Hessian
    eigenmatrix training data.

    Args:
        mol (Q2MMMolecule): Molecule with geometry (bonds/angles
            auto-detected).
        weights (dict[str, float] | None): Weight overrides keyed by
            data type. Supported keys: ``"bond_length"``,
            ``"bond_angle"``, ``"frequency"``, and the eigenmatrix
            keys ``"eig_i"``, ``"eig_d_low"``, ``"eig_d_high"``,
            ``"eig_o"``. Defaults: ``{"bond_length": 10.0,
            "bond_angle": 5.0, "frequency": 1.0}``.
        molecule_idx (int): Index for multi-molecule fits.
        frequencies (np.ndarray | list[float] | None): Vibrational
            frequencies (cm⁻¹) to include.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        include_eigenmatrix (bool): If ``True`` and the molecule has
            a Hessian, add eigenmatrix training data (diagonal and
            optionally off-diagonal elements).
        eigenmatrix_diagonal_only (bool): If ``True``, only diagonal
            eigenmatrix elements are added.

    Returns:
        ReferenceData: Populated with bond lengths, angles, and
            (optionally) frequencies and eigenmatrix data.
    """
    w = {"bond_length": 10.0, "bond_angle": 5.0, "frequency": 1.0}
    if weights:
        w.update(weights)

    ref = cls()

    for bond in mol.bonds:
        ref.add_bond_length(
            bond.length,
            atom_indices=(bond.atom_i, bond.atom_j),
            weight=w["bond_length"],
            molecule_idx=molecule_idx,
            label=f"{bond.element_pair} bond",
        )

    for angle in mol.angles:
        ref.add_bond_angle(
            angle.value,
            atom_indices=(angle.atom_i, angle.atom_j, angle.atom_k),
            weight=w["bond_angle"],
            molecule_idx=molecule_idx,
            label=f"{angle.elements} angle",
        )

    if frequencies is not None:
        ref.add_frequencies_from_array(
            frequencies,
            weight=w["frequency"],
            molecule_idx=molecule_idx,
            skip_imaginary=skip_imaginary,
        )

    if include_eigenmatrix and mol.hessian is not None:
        eig_weights = {k: w[k] for k in ("eig_i", "eig_d_low", "eig_d_high", "eig_o") if k in w}
        ref.add_eigenmatrix_from_hessian(
            mol.hessian,
            diagonal_only=eigenmatrix_diagonal_only,
            molecule_idx=molecule_idx,
            weights=eig_weights or None,
        )

    return ref

from_molecules `classmethod` ¶

from_molecules(molecules: list[Q2MMMolecule], *, weights: dict[str, float] | None = None, frequencies_list: list[ndarray | list[float]] | None = None, skip_imaginary: bool = False, include_eigenmatrix: bool = False, eigenmatrix_diagonal_only: bool = False) -> ReferenceData

Auto-populate reference data from multiple molecules.

Each molecule is assigned a sequential molecule_idx starting from 0. Delegates to :meth:from_molecule per molecule.

Parameters:

Name	Type	Description	Default
`molecules`	`list[Q2MMMolecule]`	Training set molecules.	required
`weights`	`dict[str, float] \| None`	Weight overrides (same keys as :meth:`from_molecule`).	`None`
`frequencies_list`	`list[ndarray \| list[float]] \| None`	Per-molecule frequencies. Must have the same length as molecules if provided.	`None`
`skip_imaginary`	`bool`	If `True`, negative frequencies are skipped.	`False`
`include_eigenmatrix`	`bool`	If `True` and a molecule has a Hessian, add eigenmatrix data.	`False`
`eigenmatrix_diagonal_only`	`bool`	If `True`, only diagonal eigenmatrix elements are added.	`False`

Returns:

Name	Type	Description
`ReferenceData`	`ReferenceData`	Combined reference data for all molecules.

Raises:

Type	Description
`ValueError`	If `frequencies_list` length does not match `molecules` length.

Source code in q2mm/optimizers/objective.py

@classmethod
def from_molecules(
    cls,
    molecules: list[Q2MMMolecule],
    *,
    weights: dict[str, float] | None = None,
    frequencies_list: list[np.ndarray | list[float]] | None = None,
    skip_imaginary: bool = False,
    include_eigenmatrix: bool = False,
    eigenmatrix_diagonal_only: bool = False,
) -> ReferenceData:
    """Auto-populate reference data from multiple molecules.

    Each molecule is assigned a sequential ``molecule_idx`` starting
    from 0.  Delegates to :meth:`from_molecule` per molecule.

    Args:
        molecules (list[Q2MMMolecule]): Training set molecules.
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        frequencies_list (list[np.ndarray | list[float]] | None):
            Per-molecule frequencies. Must have the same length as
            *molecules* if provided.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        include_eigenmatrix (bool): If ``True`` and a molecule has a
            Hessian, add eigenmatrix data.
        eigenmatrix_diagonal_only (bool): If ``True``, only diagonal
            eigenmatrix elements are added.

    Returns:
        ReferenceData: Combined reference data for all molecules.

    Raises:
        ValueError: If ``frequencies_list`` length does not match
            ``molecules`` length.
    """
    if frequencies_list is not None and len(frequencies_list) != len(molecules):
        raise ValueError(
            f"frequencies_list length ({len(frequencies_list)}) must match molecules length ({len(molecules)})."
        )

    ref = cls()
    for idx, mol in enumerate(molecules):
        single = cls.from_molecule(
            mol,
            weights=weights,
            molecule_idx=idx,
            frequencies=frequencies_list[idx] if frequencies_list is not None else None,
            skip_imaginary=skip_imaginary,
            include_eigenmatrix=include_eigenmatrix,
            eigenmatrix_diagonal_only=eigenmatrix_diagonal_only,
        )
        ref.values.extend(single.values)

    return ref

from_gaussian `classmethod` ¶

from_gaussian(path: str | Path, *, weights: dict[str, float] | None = None, bond_tolerance: float = 1.3, charge: int = 0, multiplicity: int = 1, include_frequencies: bool = True, skip_imaginary: bool = False, au_hessian: bool = True) -> tuple[ReferenceData, Q2MMMolecule]

Build reference data from a Gaussian log file.

Parses the log file for the optimised geometry and vibrational frequencies, then auto-populates bond lengths, angles, and (optionally) frequencies.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Path to the Gaussian `.log` file (from an `opt freq` job).	required
`weights`	`dict[str, float] \| None`	Weight overrides (same keys as :meth:`from_molecule`).	`None`
`bond_tolerance`	`float`	Multiplier for covalent-radii bond detection. Use 1.4+ for TS.	`1.3`
`charge`	`int`	Molecular charge.	`0`
`multiplicity`	`int`	Spin multiplicity.	`1`
`include_frequencies`	`bool`	Whether to add frequency data from the log file.	`True`
`skip_imaginary`	`bool`	If `True`, negative frequencies are skipped.	`False`
`au_hessian`	`bool`	Keep Hessian in atomic units (Hartree/Bohr²).	`True`

Returns:

Type	Description
`tuple[ReferenceData, Q2MMMolecule]`	tuple[ReferenceData, Q2MMMolecule]: Populated reference data and the parsed molecule (with Hessian attached if available).

Source code in q2mm/optimizers/objective.py

@classmethod
def from_gaussian(
    cls,
    path: str | Path,
    *,
    weights: dict[str, float] | None = None,
    bond_tolerance: float = 1.3,
    charge: int = 0,
    multiplicity: int = 1,
    include_frequencies: bool = True,
    skip_imaginary: bool = False,
    au_hessian: bool = True,
) -> tuple[ReferenceData, Q2MMMolecule]:
    """Build reference data from a Gaussian log file.

    Parses the log file for the optimised geometry and vibrational
    frequencies, then auto-populates bond lengths, angles, and
    (optionally) frequencies.

    Args:
        path (str | Path): Path to the Gaussian ``.log`` file
            (from an ``opt freq`` job).
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        bond_tolerance (float): Multiplier for covalent-radii bond
            detection. Use 1.4+ for TS.
        charge (int): Molecular charge.
        multiplicity (int): Spin multiplicity.
        include_frequencies (bool): Whether to add frequency data
            from the log file.
        skip_imaginary (bool): If ``True``, negative frequencies are
            skipped.
        au_hessian (bool): Keep Hessian in atomic units
            (Hartree/Bohr²).

    Returns:
        tuple[ReferenceData, Q2MMMolecule]: Populated reference data
            and the parsed molecule (with Hessian attached if
            available).
    """
    from q2mm.parsers.gaussian import GaussLog
    from q2mm.models.hessian import reform_hessian

    log = GaussLog(str(path), au_hessian=au_hessian)

    # Build molecule from the last (optimised) structure
    mol = log.molecules[-1]
    mol.charge = charge
    mol.multiplicity = multiplicity
    mol.bond_tolerance = bond_tolerance

    # Override hessian with eigenvalue-reconstructed version if available
    if log.evals is not None and log.evecs is not None and log.evals.size and log.evecs.size:
        mol.hessian = reform_hessian(log.evals, log.evecs)

    # Frequencies in cm⁻¹ from the Gaussian log
    # Note: log.evals are eigenvalues (mass-weighted force constants in
    # atomic units), NOT frequencies.  Use log.frequencies for cm⁻¹ values.
    frequencies = None
    if include_frequencies and log.frequencies is not None and len(log.frequencies):
        frequencies = np.array(log.frequencies)

    ref = cls.from_molecule(
        mol,
        weights=weights,
        frequencies=frequencies,
        skip_imaginary=skip_imaginary,
    )

    return ref, mol

from_fchk `classmethod` ¶

from_fchk(path: str | Path, *, weights: dict[str, float] | None = None, bond_tolerance: float = 1.3, charge: int = 0, multiplicity: int = 1) -> tuple[ReferenceData, Q2MMMolecule]

Build reference data from a Gaussian formatted checkpoint file.

Parses the .fchk file for geometry, Cartesian Force Constants (Hessian), and atom data. Auto-populates bond lengths and angles.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path`	Path to the Gaussian `.fchk` file.	required
`weights`	`dict[str, float] \| None`	Weight overrides (same keys as :meth:`from_molecule`).	`None`
`bond_tolerance`	`float`	Multiplier for covalent-radii bond detection.	`1.3`
`charge`	`int`	Molecular charge (overridden by file values if present).	`0`
`multiplicity`	`int`	Spin multiplicity (overridden by file values if present).	`1`

Returns:

Type	Description
`tuple[ReferenceData, Q2MMMolecule]`	tuple[ReferenceData, Q2MMMolecule]: Populated reference data and the parsed molecule with Hessian.

Source code in q2mm/optimizers/objective.py

@classmethod
def from_fchk(
    cls,
    path: str | Path,
    *,
    weights: dict[str, float] | None = None,
    bond_tolerance: float = 1.3,
    charge: int = 0,
    multiplicity: int = 1,
) -> tuple[ReferenceData, Q2MMMolecule]:
    """Build reference data from a Gaussian formatted checkpoint file.

    Parses the ``.fchk`` file for geometry, Cartesian Force Constants
    (Hessian), and atom data. Auto-populates bond lengths and angles.

    Args:
        path (str | Path): Path to the Gaussian ``.fchk`` file.
        weights (dict[str, float] | None): Weight overrides (same
            keys as :meth:`from_molecule`).
        bond_tolerance (float): Multiplier for covalent-radii bond
            detection.
        charge (int): Molecular charge (overridden by file values
            if present).
        multiplicity (int): Spin multiplicity (overridden by file
            values if present).

    Returns:
        tuple[ReferenceData, Q2MMMolecule]: Populated reference data
            and the parsed molecule with Hessian.
    """
    path = Path(path)
    symbols, coords_ang, hessian, file_charge, file_mult = _parse_fchk(path)

    mol = Q2MMMolecule(
        symbols=symbols,
        geometry=coords_ang,
        charge=file_charge if file_charge is not None else charge,
        multiplicity=file_mult if file_mult is not None else multiplicity,
        name=path.stem,
        bond_tolerance=bond_tolerance,
        hessian=hessian,
    )

    ref = cls.from_molecule(mol, weights=weights)

    return ref, mol

ObjectiveFunction ¶

ObjectiveFunction(forcefield: ForceField, engine: MMEngine, molecules: list[Q2MMMolecule], reference: ReferenceData)

Objective function for scipy-based force field optimization.

Evaluates the weighted sum-of-squares between MM-calculated and reference data for one or more molecules.

Parameters:

Name	Type	Description	Default
`forcefield`	`ForceField`	The force field whose parameters are being optimized.	required
`engine`	`MMEngine`	The MM backend (OpenMM, Tinker, etc.).	required
`molecules`	`list[Q2MMMolecule]`	Training set molecules.	required
`reference`	`ReferenceData`	QM/experimental reference observations.	required

Initialize the objective function.

Parameters:

Name	Type	Description	Default
`forcefield`	`ForceField`	The force field whose parameters are being optimized.	required
`engine`	`MMEngine`	The MM backend (OpenMM, Tinker, etc.).	required
`molecules`	`list[Q2MMMolecule]`	Training set molecules.	required
`reference`	`ReferenceData`	QM/experimental reference observations.	required

Source code in q2mm/optimizers/objective.py

def __init__(
    self,
    forcefield: ForceField,
    engine: MMEngine,
    molecules: list[Q2MMMolecule],
    reference: ReferenceData,
):
    """Initialize the objective function.

    Args:
        forcefield (ForceField): The force field whose parameters
            are being optimized.
        engine (MMEngine): The MM backend (OpenMM, Tinker, etc.).
        molecules (list[Q2MMMolecule]): Training set molecules.
        reference (ReferenceData): QM/experimental reference
            observations.
    """
    self.forcefield = forcefield
    self.engine = engine
    self.molecules = molecules
    self.reference = reference
    self.n_eval = 0
    self.history: list[float] = []
    # Reusable engine handles for backends that support runtime parameter
    # updates (e.g., OpenMM). Avoids rebuilding simulation contexts each
    # evaluation — critical for optimization performance.
    self._handles: dict[int, object] = {}
    # Cached QM eigenvectors per molecule (precomputed once for
    # eigenmatrix comparisons — the QM basis is fixed).
    self._qm_eigenvectors: dict[int, np.ndarray] = {}

call ¶

__call__(param_vector: ndarray) -> float

Evaluate objective for a given parameter vector.

This is the function signature that :func:scipy.optimize.minimize expects: f(x) -> float.

Parameters:

Name	Type	Description	Default
`param_vector`	`ndarray`	Flat parameter vector.	required

Returns:

Name	Type	Description
`float`	`float`	Sum-of-squared weighted residuals.

Source code in q2mm/optimizers/objective.py

def __call__(self, param_vector: np.ndarray) -> float:
    """Evaluate objective for a given parameter vector.

    This is the function signature that :func:`scipy.optimize.minimize`
    expects: ``f(x) -> float``.

    Args:
        param_vector (np.ndarray): Flat parameter vector.

    Returns:
        float: Sum-of-squared weighted residuals.
    """
    self.forcefield.set_param_vector(param_vector)

    residuals = self._compute_residuals()
    score = float(np.sum(residuals**2))

    self.n_eval += 1
    self.history.append(score)
    return score

residuals ¶

residuals(param_vector: ndarray) -> ndarray

Compute weighted residual vector (for least-squares methods).

Parameters:

Name	Type	Description	Default
`param_vector`	`ndarray`	Flat parameter vector.	required

Returns:

Type	Description
`ndarray`	np.ndarray: Weighted residuals for each reference observation.

Source code in q2mm/optimizers/objective.py

def residuals(self, param_vector: np.ndarray) -> np.ndarray:
    """Compute weighted residual vector (for least-squares methods).

    Args:
        param_vector (np.ndarray): Flat parameter vector.

    Returns:
        np.ndarray: Weighted residuals for each reference observation.
    """
    self.forcefield.set_param_vector(param_vector)
    r = self._compute_residuals()
    self.n_eval += 1
    self.history.append(float(np.sum(r**2)))
    return r

gradient ¶

gradient(param_vector: ndarray) -> ndarray

Compute analytical gradient of the score w.r.t. parameters.

Uses the engine's energy_and_param_grad() method (available on :class:~q2mm.backends.mm.jax_engine.JaxEngine) to compute exact derivatives for energy reference data. Raises NotImplementedError for reference data types that require Hessians or minimized geometries; use jac=None (finite differences) for mixed reference data.

The score is sum_i (w_i * (ref_i - calc_i))**2, so:

d(score)/d(p) = -2 * sum_i [w_i^2 * (ref_i - calc_i) * d(calc_i)/d(p)]

Note

This method does not increment n_eval or append to history. SciPy's minimize calls fun(x) and jac(x) separately, so tracking state here would double-count evaluations. Evaluation counting is handled exclusively in __call__.

Parameters:

Name	Type	Description	Default
`param_vector`	`ndarray`	Flat parameter vector (same as :meth:`__call__`).	required

Returns:

Type	Description
`ndarray`	np.ndarray: Gradient of the score with respect to each parameter.

Raises:

Type	Description
`TypeError`	If the engine does not support `energy_and_param_grad()`.
`NotImplementedError`	If the reference data contains types other than `energy` (Hessian/frequency/geometry gradient support is planned).

Source code in q2mm/optimizers/objective.py

def gradient(self, param_vector: np.ndarray) -> np.ndarray:
    """Compute analytical gradient of the score w.r.t. parameters.

    Uses the engine's ``energy_and_param_grad()`` method (available on
    :class:`~q2mm.backends.mm.jax_engine.JaxEngine`) to compute exact
    derivatives for energy reference data.  Raises ``NotImplementedError``
    for reference data types that require Hessians or minimized geometries;
    use ``jac=None`` (finite differences) for mixed reference data.

    The score is ``sum_i (w_i * (ref_i - calc_i))**2``, so:

    ``d(score)/d(p) = -2 * sum_i [w_i^2 * (ref_i - calc_i) * d(calc_i)/d(p)]``

    Note:
        This method does **not** increment ``n_eval`` or append to
        ``history``.  SciPy's ``minimize`` calls ``fun(x)`` and ``jac(x)``
        separately, so tracking state here would double-count evaluations.
        Evaluation counting is handled exclusively in ``__call__``.

    Args:
        param_vector (np.ndarray): Flat parameter vector (same as
            :meth:`__call__`).

    Returns:
        np.ndarray: Gradient of the score with respect to each parameter.

    Raises:
        TypeError: If the engine does not support
            ``energy_and_param_grad()``.
        NotImplementedError: If the reference data contains types
            other than ``energy`` (Hessian/frequency/geometry gradient
            support is planned).
    """
    if not self.engine.supports_analytical_gradients():
        raise TypeError(
            f"{self.engine.name} does not support analytical gradients. "
            "Use a JaxEngine or another differentiable engine."
        )

    self.forcefield.set_param_vector(param_vector)

    # Check which data types are needed
    all_kinds = {ref.kind for ref in self.reference.values}
    unsupported = all_kinds - {"energy"}
    if unsupported:
        raise NotImplementedError(
            f"Analytical gradients not yet supported for data types: {unsupported}. "
            "Only 'energy' references are supported. Use finite differences (jac=None) "
            "for mixed reference data, or contribute Hessian/geometry gradient support."
        )

    # Compute energy + gradient for each molecule
    n_params = len(param_vector)
    total_grad = np.zeros(n_params)
    energy_cache: dict[int, tuple[float, np.ndarray]] = {}

    for ref in self.reference.values:
        mol_idx = ref.molecule_idx
        if mol_idx not in energy_cache:
            structure = self._get_structure(mol_idx)
            energy_cache[mol_idx] = self.engine.energy_and_param_grad(structure, self.forcefield)

        calc_value, calc_grad = energy_cache[mol_idx]
        diff = ref.value - calc_value
        # d(score)/d(p) = -2 * w^2 * (ref - calc) * d(calc)/d(p)
        total_grad += -2.0 * ref.weight**2 * diff * calc_grad

    return total_grad

reset ¶

reset()

Reset evaluation counter, history, and cached engine handles.

Source code in q2mm/optimizers/objective.py

def reset(self):
    """Reset evaluation counter, history, and cached engine handles."""
    self.n_eval = 0
    self.history.clear()
    self._handles.clear()
    self._qm_eigenvectors.clear()

Objective Function¶

objective ¶

Scoring approach¶

ReferenceValue dataclass ¶

ReferenceData dataclass ¶

n_observations property ¶

add_energy ¶

add_frequency ¶

add_bond_length ¶

add_bond_angle ¶

add_torsion_angle ¶

add_hessian_eigenvalue ¶

add_hessian_offdiagonal ¶

add_eigenmatrix_from_hessian ¶

add_frequencies_from_array ¶

from_molecule classmethod ¶

from_molecules classmethod ¶

from_gaussian classmethod ¶

from_fchk classmethod ¶

ObjectiveFunction ¶

__call__ ¶

residuals ¶

gradient ¶

reset ¶

ReferenceValue `dataclass` ¶

ReferenceData `dataclass` ¶

n_observations `property` ¶

from_molecule `classmethod` ¶

from_molecules `classmethod` ¶

from_gaussian `classmethod` ¶

from_fchk `classmethod` ¶

call ¶