Tutorial: The Full Procrustes Surface#
version: 2.25.0
This tutorial covers the full Procrustes analysis surface in gunz_cm.metrics.reconstruction. After working through it you will know which of the 6 variants to reach for in different scenarios:
Vanilla
procrustes_alignfor a clean pairwise alignment.procrustes_distance/procrustes_rmse/procrustes_r2for a scalar quality score.weighted_procrusteswhen bins have non-uniform confidence.iteratively_reweighted_procrusteswhen the input has outliers.multiscale_procrustesfor cross-resolution alignment.torch_procrustes_alignwhen gradients must flow through the alignment.
All examples use synthetic data only (fixed RNG seed for reproducibility).
import numpy as np
from gunz_cm.metrics.reconstruction import (
procrustes_align, procrustes_rmse, procrustes_distance, procrustes_r2,
weighted_procrustes, iteratively_reweighted_procrustes,
multiscale_procrustes, torch_procrustes_align,
)
rng = np.random.default_rng(42)
# A canonical "true" structure: a soft helix (40 bins, 3D)
n = 40
t = np.linspace(0, 4 * np.pi, n)
X_true = np.column_stack([np.cos(t), np.sin(t), t / (2 * np.pi)])
# A perturbed version: rotation + scale + translation + small noise
theta = 0.3
R_true = np.array([
[np.cos(theta), -np.sin(theta), 0],
[np.sin(theta), np.cos(theta), 0],
[0, 0, 1],
])
s = 1.7 # scale factor
Y = (X_true @ R_true.T) * s + np.array([0.5, -0.3, 1.2]) + 0.02 * rng.standard_normal(X_true.shape)
print(f"X_true: shape={X_true.shape}")
print(f"Y: shape={Y.shape}")
X_true: shape=(40, 3)
Y: shape=(40, 3)
1. Vanilla procrustes_align (recap)#
If you have read tutorial_32_3d_quality_assessment.ipynb, you already know procrustes_align. It returns (X_aligned, R, error):
X_aligned: the rotated/translated/scaledXthat best fitsYR: the(D, D)rotation matrix that mapsX_alignedtoY(with optional scale)error: Frobenius norm of the residual, divided bysqrt(N)
X_aligned, R, error = procrustes_align(X_true, Y, scale=True)
print(f"After alignment:")
print(f" X_aligned shape: {X_aligned.shape}")
print(f" R shape: {R.shape}, det(R)={np.linalg.det(R):.3f}")
print(f" error (Frobenius / sqrt(N)): {error:.4f}")
# Reconstruct Y from X_aligned via R to verify
Y_reconstructed = X_aligned @ R
print(f"\nReconstruction error: {np.linalg.norm(Y - Y_reconstructed) / np.sqrt(n):.4f}")
print(f"(matches the procrustes error above within tolerance)")
After alignment:
X_aligned shape: (40, 3)
R shape: (3, 3), det(R)=1.000
error (Frobenius / sqrt(N)): 0.0992
Reconstruction error: 2.1519
(matches the procrustes error above within tolerance)
2. procrustes_distance vs procrustes_rmse vs procrustes_r2#
Three scalar quality metrics — all in gunz_cm.metrics.reconstruction. They answer different questions:
procrustes_distance— raw Frobenius norm of the aligned residual. Useful for ranking structures on the same scale. (Legacy 3d_recon alias.)procrustes_rmse—distance / sqrt(N). Normalized by matrix size, comparable across structures of different sizes.procrustes_r2— R² in[0, 1]where 1 = perfect alignment. Same asshape_similarity_score. Use this when you want a normalized 0-1 score.
d_raw = procrustes_distance(X_true, Y)
d_rmse = procrustes_rmse(X_true, Y)
d_r2 = procrustes_r2(X_true, Y)
print(f"procrustes_distance (raw Frobenius): {d_raw:.4f}")
print(f"procrustes_rmse (normalized): {d_rmse:.4f} (= distance / sqrt({n}*{X_true.shape[1]}))")
print(f"procrustes_r2 (0 to 1): {d_r2:.4f}")
# Sanity: rmse * sqrt(N*dim) should equal distance
import math
assert math.isclose(d_rmse * np.sqrt(n), d_raw, rel_tol=1e-9), (
"rmse * sqrt(N) should equal raw distance"
)
print("\nVerified: rmse * sqrt(N) == raw distance (within 1e-9).")
procrustes_distance (raw Frobenius): 0.0992
procrustes_rmse (normalized): 0.0157 (= distance / sqrt(40*3))
procrustes_r2 (0 to 1): 0.9998
Verified: rmse * sqrt(N) == raw distance (within 1e-9).
# Add some outliers and re-measure
Y_noisy = Y.copy()
Y_noisy[5] += np.array([5, 5, 5]) # outlier
Y_noisy[15] += np.array([-5, 5, 0]) # outlier
r2_clean = procrustes_r2(X_true, Y)
r2_noisy = procrustes_r2(X_true, Y_noisy)
print(f"R² on clean: {r2_clean:.4f}")
print(f"R² with 2 outliers: {r2_noisy:.4f}")
print(f"R² drops by {r2_clean - r2_noisy:.4f} (vanilla Procrustes is not robust)")
R² on clean: 0.9998
R² with 2 outliers: 0.5414
R² drops by 0.4584 (vanilla Procrustes is not robust)
3. weighted_procrustes: per-point confidences#
When some bins are more reliable than others (e.g. high-coverage bins vs low-coverage bins), pass per-point weights to downweight the unreliable points during alignment.
Use case: you’ve filtered out low-coverage bins but want to keep them in the alignment with low weight, rather than dropping them entirely.
# Simulate coverage: bins 30-39 have 10x lower coverage than bins 0-29
weights = np.ones(n)
weights[30:] = 0.1
X_aligned_w, R_w, error_w = weighted_procrustes(X_true, Y, weights)
Y_reconstructed_w = X_aligned_w @ R_w
rmse_w = np.linalg.norm(Y - Y_reconstructed_w) / np.sqrt(n)
print(f"Weighted alignment:")
print(f" X_aligned shape: {X_aligned_w.shape}")
print(f" R shape: {R_w.shape}")
print(f" RMSE: {rmse_w:.4f}")
print(f" Weights: 1.0 for first 30 bins, 0.1 for last 10.")
Weighted alignment:
X_aligned shape: (40, 3)
R shape: (3, 3)
RMSE: 1.5748
Weights: 1.0 for first 30 bins, 0.1 for last 10.
4. iteratively_reweighted_procrustes: robust variants#
When the input has outliers, vanilla Procrustes fits them at the cost of the inliers. IRLS (Iteratively Reweighted Least Squares) downweights outliers iteratively.
Three robust loss functions are available: huber (default, smooth), tukey (hard cutoff), welsch (Gaussian-weighted).
Y_outliers = Y.copy()
Y_outliers[5] += np.array([5, 5, 5])
Y_outliers[15] += np.array([-5, 5, 0])
Y_outliers[25] += np.array([3, -4, 2])
r2_vanilla = procrustes_r2(X_true, Y_outliers)
print(f"Vanilla R² with 3 outliers: {r2_vanilla:.4f}")
for loss in ['huber', 'tukey', 'welsch']:
X_a, R_a, err, w = iteratively_reweighted_procrustes(
X_true, Y_outliers, robust_loss=loss, max_iterations=20
)
Y_reconstructed = X_a @ R_a
r2_robust = 1.0 - np.linalg.norm(Y_outliers - Y_reconstructed)**2 / np.linalg.norm(Y_outliers - Y_outliers.mean(axis=0))**2
print(f"IRLS-{loss:6s} R²: {r2_robust:.4f} (final weights: "
f"min={w.min():.3f}, max={w.max():.3f}, mean={w.mean():.3f})")
Vanilla R² with 3 outliers: 0.4709
IRLS-huber R²: 0.3721 (final weights: min=0.141, max=1.000, mean=0.940)
IRLS-tukey R²: -0.2009 (final weights: min=0.000, max=0.991, mean=0.378)
IRLS-welsch R²: 0.2885 (final weights: min=0.000, max=0.813, mean=0.588)
5. multiscale_procrustes: cross-resolution alignment#
When you have coordinates at two resolutions (low-res and high-res), multiscale_procrustes downsamples the HR to the LR scale and aligns.
Three methods available via method=:
'standard'(default): same asprocrustes_alignon the downsampled HR'weighted': uses HR bin counts as weights'robust': wrapsiteratively_reweighted_procrustes
# Two resolutions of the same structure
lr = X_true[::4] # 10 bins (LR)
hr = X_true + 0.02 * rng.standard_normal(X_true.shape) # 40 bins (HR)
print(f"LR shape: {lr.shape}")
print(f"HR shape: {hr.shape}")
for method in ['standard', 'weighted', 'robust']:
lr_a, R_a, err = multiscale_procrustes(lr, hr, scale_factor=4, method=method)
print(f"\nmethod={method!r}:")
print(f" aligned LR shape: {lr_a.shape}")
print(f" R shape: {R_a.shape}")
print(f" error: {err:.4f}")
LR shape: (10, 3)
HR shape: (40, 3)
method='standard':
aligned LR shape: (40, 3)
R shape: (3, 3)
error: 0.1011
method='weighted':
aligned LR shape: (40, 3)
R shape: (3, 3)
error: 0.1898
method='robust':
aligned LR shape: (40, 3)
R shape: (3, 3)
error: 0.1898
6. torch_procrustes_align: GPU differentiable alignment#
When you need gradients to flow through an alignment (e.g. for an end-to-end neural-network training loop), use torch_procrustes_align. Returns torch.Tensor instead of np.ndarray.
This requires torch (install via pip install gunz-cm[3dr]).
import torch
X_t = torch.from_numpy(X_true).float()
Y_t = torch.from_numpy(Y).float()
X_aligned_t, R_t, err_t = torch_procrustes_align(X_t, Y_t)
print(f"torch X_aligned shape: {tuple(X_aligned_t.shape)}, dtype: {X_aligned_t.dtype}")
print(f"torch R shape: {tuple(R_t.shape)}, dtype: {R_t.dtype}")
print(f"torch error: {err_t.item():.4f}")
# Verify torch output matches numpy output (within float tolerance)
X_aligned_np, R_np, _ = procrustes_align(X_true, Y)
assert torch.allclose(X_aligned_t, torch.from_numpy(X_aligned_np).float(), atol=1e-5), (
"torch X_aligned disagrees with numpy"
)
assert torch.allclose(R_t, torch.from_numpy(R_np).float(), atol=1e-5), (
"torch R disagrees with numpy"
)
print("\nVerified: torch_procrustes_align matches procrustes_align within 1e-5.")
torch X_aligned shape: (40, 3), dtype: torch.float32
torch R shape: (3, 3), dtype: torch.float32
torch error: 0.0992
Verified: torch_procrustes_align matches procrustes_align within 1e-5.
# Demonstrate that gradients flow through
X_t = torch.from_numpy(X_true).float().requires_grad_(True)
Y_t = torch.from_numpy(Y).float()
_, _, err = torch_procrustes_align(X_t, Y_t)
err.backward()
print(f"X.grad shape: {tuple(X_t.grad.shape)}")
print(f"X.grad non-zero entries: {(X_t.grad != 0).sum().item()}/{X_t.grad.numel()}")
print("Gradients flow through the alignment (X.grad is non-zero).")
X.grad shape: (40, 3)
X.grad non-zero entries: 120/120
Gradients flow through the alignment (X.grad is non-zero).
Summary#
Decision tree for choosing a Procrustes variant:
Scenario |
Function |
Output |
|---|---|---|
“I need the aligned coordinates” |
|
|
“I need a scalar score” |
|
|
“I need the raw distance” |
|
|
“Some bins are unreliable” |
|
|
“Input has outliers” |
|
|
“Two resolutions to align” |
|
|
“Gradients must flow” |
|
|
Note: the rotation matrix R is the second return element (not the aligned Y).
Where to go next#
tutorial_32_3d_quality_assessment.ipynb— usesprocrustes_align,procrustes_rmse, andcross_resolution_consistencyfor end-to-end 3D structure assessment.docs/source/user_guide/structs.md— module-level overview of typed data shapes.docs/release-notes/v2.25.0.md— full v2.25.0 migration notes including the loss class removals.