Rh-Enamide¶
Rh-enamide is the clearest literature-reproduction case in the repo: a complete Rh(I)-diphosphine asymmetric hydrogenation transition-state force field (TSFF) trained on 9 transition-state structures.
Scope¶
- Type: Transition state (Rh-catalyzed asymmetric hydrogenation)
- Molecules: 9 TS structures
- Parameters: 182 (OPT substructure: 8 bonds, 23 angles, 48 torsions)
- QM reference: B3LYP/LACVP**
Publication¶
| Property | Value |
|---|---|
| Paper | Donoghue, P. J. et al. J. Chem. Theory Comput. 2008, 4, 1313–1323 |
| DOI | 10.1021/ct800132a |
| System | Rh(I)-diphosphine asymmetric hydrogenation of enamides |
| Training set | 9 transition-state structures |
| Engine | MacroModel MM3* |
What the paper fitted and reports¶
What the original Q2MM workflow fitted¶
The original Q2MM workflow fit a multi-target penalty function, not just Hessian eigenvalues.[1] The fitted data types were:
- Bond lengths
- Bond angles
- Torsions
- The full Hessian matrix
- Partial charges
- Relative energies
The paper reports penalty-function tolerances of 0.01 Å for bonds, 0.5° for angles, 1° for torsions, and 0.02 e for charges, with Newton-Raphson plus Simplex refinement inside MacroModel MM3*.[1]
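One way to read these tolerances is as per-type scale factors in a sum-of-squares penalty. The sketch below assumes each residual is divided by its tolerance before squaring, so a deviation equal to the tolerance contributes exactly 1. The function name, data layout, and sample residuals are illustrative; they are not taken from the paper or from the q2mm code.

```python
# Tolerances as quoted from the paper; everything else is illustrative.
TOLERANCES = {
    "bond": 0.01,    # Å
    "angle": 0.5,    # degrees
    "torsion": 1.0,  # degrees
    "charge": 0.02,  # e
}


def penalty(residuals_by_type, tolerances=TOLERANCES):
    """Sum of squared residuals, each scaled by its data type's tolerance."""
    total = 0.0
    for data_type, residuals in residuals_by_type.items():
        tol = tolerances[data_type]
        total += sum((r / tol) ** 2 for r in residuals)
    return total


# Hypothetical MM-minus-QM residuals, each exactly one tolerance in size.
example = {"bond": [0.01], "angle": [-0.5], "torsion": [1.0], "charge": [0.02]}
print(penalty(example))
```

With this scaling, a 0.01 Å bond error and a 0.5° angle error count equally, which is what makes a single score across mixed units meaningful.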
The authors also describe two fitted variants:
- RhH — the standard fit
- RhH-E — an energy-emphasized fit
What the paper reports¶
Donoghue et al. report strong structural and energetic agreement between QM and MM for the fitted force field [1]:
- Bond RMSD: ≤ 0.03 Å (Table 5)
- Angle RMSD: < 2° (Table 6)
- Relative-energy RMSD: 0.3–0.5 kcal/mol (Table 7)
- External selectivity validation: MUE = 0.6 kcal/mol across 18 test points
The 2008 paper does not report an eigenvalue R² directly; its agreement metrics are structural and energetic (bond RMSD, angle RMSD, relative-energy RMSD).
Our reproduction¶
| Metric | Value |
|---|---|
| Overall eigenvalue R² | 0.991 |
| Overall slope | 0.986 |
| Aggregate frequency RMSD | 259.9 cm⁻¹ (per-molecule average: 85.5 cm⁻¹) |
Our analytical QFUERZA starting point reproduces 99.1% of the variance in the QM eigenspectrum (R² = 0.991) with slopes near 1.0 across all 9 transition states, without any iterative optimization.
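This page does not spell out how the eigenvalue R² and slope are computed. A common reading, assumed here, is an ordinary least-squares fit of MM eigenvalues against QM eigenvalues, with R² taken as the coefficient of determination of that fit. The function name and the sample eigenvalues below are hypothetical.

```python
def slope_and_r2(qm, mm):
    """Least-squares fit mm ≈ slope * qm + intercept, plus the fit's R²."""
    n = len(qm)
    mean_qm = sum(qm) / n
    mean_mm = sum(mm) / n
    sxx = sum((x - mean_qm) ** 2 for x in qm)
    sxy = sum((x - mean_qm) * (y - mean_mm) for x, y in zip(qm, mm))
    slope = sxy / sxx
    intercept = mean_mm - slope * mean_qm
    # R² = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(qm, mm))
    ss_tot = sum((y - mean_mm) ** 2 for y in mm)
    return slope, 1.0 - ss_res / ss_tot


# Hypothetical eigenvalues, not data from this benchmark.
qm_eigs = [0.1, 0.5, 1.0, 2.0, 4.0]
mm_eigs = [0.12, 0.48, 1.05, 1.9, 4.1]
slope, r2 = slope_and_r2(qm_eigs, mm_eigs)
print(f"slope={slope:.3f}, R2={r2:.3f}")
```

Under this definition a slope below 1 means the MM eigenvalues are systematically smaller than the QM ones, while R² measures scatter about the fitted line.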
| TS | Atoms | Eigenvalue R² | Slope | Frequency RMSD (cm⁻¹) |
|---|---|---|---|---|
| 1 | 36 | 0.990 | 1.000 | 93.9 |
| 2 | 38 | 0.995 | 0.986 | 86.0 |
| 3 | 38 | 0.993 | 0.987 | 87.1 |
| 4 | 62 | 0.985 | 0.975 | 97.5 |
| 5 | 62 | 0.985 | 0.975 | 98.2 |
| 6 | 58 | 0.994 | 0.989 | 76.6 |
| 7 | 58 | 0.993 | 0.989 | 76.3 |
| 8 | 58 | 0.994 | 0.990 | 77.2 |
| 9 | 58 | 0.993 | 0.989 | 76.9 |
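As a quick consistency check, the per-molecule average quoted in the summary table can be recomputed from the rows above. The snippet only averages the tabulated values. The unweighted mean of the per-TS R² values also rounds to 0.991, but whether the reported overall R² is such a mean or a pooled fit over all eigenvalues is not stated here, so treat that match as suggestive only.

```python
# (TS, atoms, eigenvalue R², slope, frequency RMSD in cm⁻¹), copied from the table.
rows = [
    (1, 36, 0.990, 1.000, 93.9),
    (2, 38, 0.995, 0.986, 86.0),
    (3, 38, 0.993, 0.987, 87.1),
    (4, 62, 0.985, 0.975, 97.5),
    (5, 62, 0.985, 0.975, 98.2),
    (6, 58, 0.994, 0.989, 76.6),
    (7, 58, 0.993, 0.989, 76.3),
    (8, 58, 0.994, 0.990, 77.2),
    (9, 58, 0.993, 0.989, 76.9),
]

mean_rmsd = sum(row[4] for row in rows) / len(rows)
mean_r2 = sum(row[2] for row in rows) / len(rows)
print(f"mean frequency RMSD: {mean_rmsd:.1f} cm^-1")  # 85.5
print(f"mean eigenvalue R2:  {mean_r2:.3f}")          # 0.991
```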
Benchmark results¶
Multi-target objective¶
Run configuration: eigenmatrix-diagonal plus geometry references, 182 frozen-scoped active parameters, invert_ts_curvature=True, and SciPy L-BFGS-B with JaxLoss analytical gradients on an RTX 5090 GPU:
| Metric | Value |
|---|---|
| Ratio check | 1.05 (pass) |
| Initial score | 3.92 × 10⁵ |
| Final score | 2.80 × 10⁵ |
| Reduction | 28.68 % |
| Iterations (L-BFGS-B nit) | 6 |
| ObjectiveFunction evaluations | 2 (initial + final re-evaluation; the L-BFGS-B iterations run on the JaxLoss surrogate, not on ObjectiveFunction) |
| Gradient source | jac="auto" resolved to jac_mode="jax_loss" (JaxLoss analytical) |
| Wall time | ~11 min (including per-molecule JIT) |
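The reduction figure can be sanity-checked against the displayed scores. Recomputing from the rounded values gives about 28.57 %, slightly below the reported 28.68 %. The sketch below assumes the reported percentage was computed from unrounded scores and shows that it falls inside the range allowed by three-significant-figure rounding. That assumption is ours, and the variable names are illustrative.

```python
initial, final = 3.92e5, 2.80e5  # scores as displayed in the table

reduction = 100.0 * (initial - final) / initial
print(f"from displayed scores: {reduction:.2f} %")

# Bounds if each displayed score was rounded to three significant figures.
half_step = 0.005e5
low = 100.0 * (1.0 - (final + half_step) / (initial - half_step))
high = 100.0 * (1.0 - (final - half_step) / (initial + half_step))
print(f"rounding-compatible range: {low:.2f} % to {high:.2f} %")
```

The authoritative values are the raw JSON outputs, not this reconstruction.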
These numbers are reproducible from scripts/regenerate_convergence_results.py
(no --skip-optimization); raw JSON output with provenance lives at
q2mm-data/benchmarks/rh-enamide/convergence/.
The ratio check confirms JaxLoss is a reliable surrogate for this system — the Seminario starting FF is good enough (R² > 0.93 in every category) that unconstrained geometry relaxation finds the correct local minima.
Historical frequency-only results¶
Different objective
Earlier benchmarks used a frequency-only objective with all ~2,742 FF parameters (not the paper's multi-target penalty with 182 OPT-substructure params). These numbers remain useful for optimizer comparison on that specific task, but do not represent literature reproduction.
Under the frequency-only objective, JaxOpt L-BFGS lowered frequency RMSD from 259.9 to 187.7 cm⁻¹ and Optax Adam+cosine reached 199.5 cm⁻¹. A lower frequency RMSD can come at the cost of eigenspectrum quality, however: Adam+cosine dropped to R² = 0.843 even as its RMSD improved.[2]
Comparison and gap analysis¶
Comparison¶
- QFUERZA reaches R² = 0.991 with slopes near 1.0 across all 9 TS structures.
- The 2008 paper does not report eigenvalue R², so a direct comparison is not possible.
- The original force field was optimized for MacroModel MM3*; our reproduction uses a different engine (JAX), so cross-engine functional-form differences may account for part of any remaining gap.
What q2mm demonstrates¶
QFUERZA reaches R² = 0.991 in seconds without iteration. The original Q2MM workflow required hours of iterative optimization in MacroModel to reach its final fit.[3]
The multi-target optimization pipeline runs end-to-end on this 9-molecule system (per-molecule JIT compilation, scipy L-BFGS-B with JaxLoss analytical gradients) and achieves 28.68 % loss reduction.
Gap analysis¶
To improve further:
- Type-normalized penalties: divide each data type's contribution by its count, matching the upstream Q2MM weighting. This is the most impactful change for convergence.
- Closer MacroModel MM3* parity: especially any remaining metal-center functional details that do not transfer cleanly to our engine.
- Off-diagonal eigenmatrix elements: the current objective uses only the diagonal, whereas the papers use the full lower triangle (weight 0.05).
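The first item can be sketched as follows. The function assumes a weighted sum of squares in which each data type's sum is divided by the number of points of that type, so a type with many points (for example eigenmatrix elements) cannot dominate one with few (for example bond lengths). The function name, weights, and residuals are hypothetical and do not reflect the actual q2mm or upstream Q2MM code.

```python
def type_normalized_penalty(residuals_by_type, weights):
    """Weighted squared residuals, with each type's sum divided by its count."""
    total = 0.0
    for data_type, residuals in residuals_by_type.items():
        if not residuals:
            continue  # an empty data type contributes nothing
        w = weights[data_type]
        total += sum((w * r) ** 2 for r in residuals) / len(residuals)
    return total


# Hypothetical case: 1000 eigenvalue residuals and 10 bond residuals,
# all the same size and with equal weights.
residuals = {"eigenvalue": [0.1] * 1000, "bond": [0.1] * 10}
weights = {"eigenvalue": 1.0, "bond": 1.0}
print(type_normalized_penalty(residuals, weights))
```

Each type contributes about 0.01 here despite the hundredfold difference in point count; without the division, the eigenvalue term would outweigh the bond term by a factor of 100.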
Reproduce¶
# Multi-target (correct methodology)
python -m q2mm.diagnostics.cli --system rh-enamide --backend jax --optimizer scipy-lbfgsb
Raw data: q2mm-data/benchmarks/ → rh-enamide/.
1. Donoghue, P. J. et al. J. Chem. Theory Comput. 2008, 4, 1313–1323. DOI: 10.1021/ct800132a
2. See the later Q2MM/QFUERZA literature discussion in QFUERZA Validation.
3. QFUERZA Validation summarizes why the analytical start is valuable even before iterative optimization.