Documentation Quality Pass Implementation Plan
For agentic workers: REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Improve all public and private doc comments across forge to include LaTeX math, literature references with DOIs, usage examples, template constraints, and cross-references.
Architecture: Edit doc comments in C++ headers and implementation files in-place. No code logic changes. Validate with doxide build && mkdocs build after each task. Tier 1 (operators/solvers) pauses for user review; all other tiers are autonomous.
Tech Stack: C++ doc comments (///), Doxide (Tree-sitter), MkDocs Material, MathJax (pymdownx.arithmatex)
Spec: docs/superpowers/specs/2026-03-17-doc-quality-pass-design.md
Conventions for All Tasks
Every task follows the same pattern:
- Read the source file
- Enhance doc comments per the spec (math, refs, examples, cross-refs)
- Convert Unicode math (β, ½, ‖·‖) to LaTeX ($\beta$, $\frac{1}{2}$, $\|\cdot\|$)
- Use $...$ for inline math, $$...$$ for display math
- Use @see with DOI for literature refs
- Add @code/@endcode blocks for examples
- Field maps use \(\omega\) (rad/s), NOT \(2\pi f\) (Hz)
- Do NOT change any code logic — only comments
- Commit after each task
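As an illustration of the conventions above, a doc comment might look like the following sketch. The helper `hz_to_rad_per_s` is hypothetical and is not part of forge; it exists only to show inline LaTeX, an `@code` block, and an `@see` cross-reference in one place.

```cpp
#include <cassert>
#include <cmath>

/// Convert an off-resonance frequency to angular frequency.
///
/// Computes $\omega = 2\pi f$, where $f$ is in Hz and $\omega$ is in rad/s.
/// Field maps passed to the operators are expected in rad/s.
///
/// @param f_hz Off-resonance frequency in Hz.
/// @return Angular frequency in rad/s.
///
/// @code
/// float omega = hz_to_rad_per_s(10.0f); // about 62.83 rad/s
/// @endcode
///
/// @see Gdft for an operator that consumes field maps in rad/s.
inline float hz_to_rad_per_s(float f_hz) {
    assert(std::isfinite(f_hz));  // illustrative precondition only
    constexpr float kTwoPi = 6.28318530717958647692f;
    return kTwoPi * f_hz;
}
```

The comment body carries the math and units; the function body stays trivial, which matches the rule that this pass changes comments and not logic.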
Validation command (run after each task):
source docs/.venv/bin/activate && rm -rf docs/api/*.md && doxide build && mkdocs build
Chunk 1: Tier 1 - Operators (user review checkpoint)
Task 1: Gdft.h - Field-corrected DFT operator
Files:
- Modify: forge/Operators/Gdft.h
- [ ] Step 1: Enhance class-level doc comment
Add to the existing class doc:
- Display math for the forward model: $$d_j = \sum_k x_k \, e^{-i(2\pi \mathbf{k}_j \cdot \mathbf{r}_k + \omega_k t_j)}$$ where \(\omega_k\) = FM[k] in rad/s
- Note that operator* computes the forward transform and operator/ computes the adjoint (conjugate transpose)
- Cross-refs: @see GdftR2 for R2* gradient extension, @see Gnufft for the fast gridding-based alternative
- Ref: @see Fessler & Sutton, "Nonuniform Fast Fourier Transforms Using Min-Max Interpolation," IEEE TSP, 2003. https://doi.org/10.1109/TSP.2002.807005
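The forward model above, and the adjoint that pairs with it, can be sketched in plain standard C++ as follows. This is a minimal 2D illustration and not forge's implementation: the `Sample` and `Voxel` structs and both function names are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cx = std::complex<double>;

// k-space sample: trajectory coordinates and readout time (s).
struct Sample { double kx, ky, t; };
// Voxel: position and off-resonance omega (rad/s).
struct Voxel { double rx, ry, omega; };

// Phase 2*pi*(k . r) + omega*t for one sample/voxel pair.
inline double encoding_phase(const Sample& s, const Voxel& v) {
    const double two_pi = 2.0 * std::acos(-1.0);
    return two_pi * (s.kx * v.rx + s.ky * v.ry) + v.omega * s.t;
}

// Forward: d_j = sum_k x_k exp(-i * phase(j, k)).
std::vector<cx> forward_dft(const std::vector<cx>& x,
                            const std::vector<Voxel>& vox,
                            const std::vector<Sample>& smp) {
    assert(x.size() == vox.size());
    std::vector<cx> d(smp.size(), cx(0.0, 0.0));
    for (std::size_t j = 0; j < smp.size(); ++j)
        for (std::size_t k = 0; k < vox.size(); ++k)
            d[j] += x[k] * std::polar(1.0, -encoding_phase(smp[j], vox[k]));
    return d;
}

// Adjoint (conjugate transpose): x_k = sum_j d_j exp(+i * phase(j, k)).
std::vector<cx> adjoint_dft(const std::vector<cx>& d,
                            const std::vector<Voxel>& vox,
                            const std::vector<Sample>& smp) {
    assert(d.size() == smp.size());
    std::vector<cx> x(vox.size(), cx(0.0, 0.0));
    for (std::size_t k = 0; k < vox.size(); ++k)
        for (std::size_t j = 0; j < smp.size(); ++j)
            x[k] += d[j] * std::polar(1.0, encoding_phase(smp[j], vox[k]));
    return x;
}
```

A quick sanity check for any forward/adjoint pair is the inner-product identity: the inner product of `forward_dft(x)` with `y` should equal the inner product of `x` with `adjoint_dft(y)`, up to rounding.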
- [ ] Step 2: Add usage example
Add a @code/@endcode block showing basic construction and forward/adjoint:
/// @code
/// // 2D field-corrected DFT: 64x64 image, 4096 k-space samples
/// Col<float> kx(4096), ky(4096), kz(4096, fill::zeros);
/// Col<float> ix(4096), iy(4096), iz(4096, fill::zeros);
/// Col<float> FM(4096, fill::zeros); // field map (rad/s)
/// Col<float> t(4096, fill::zeros); // readout times (s)
/// Gdft<float> G(4096, 4096, kx, ky, kz, ix, iy, iz, FM, t);
/// Col<cx_float> kdata = G * image; // forward
/// Col<cx_float> recon = G / kdata; // adjoint
/// @endcode
- [ ] Step 3: Convert existing Unicode math to LaTeX
Replace any inline Unicode symbols in existing comments with LaTeX equivalents.
- [ ] Step 4: Validate and commit
Run: source docs/.venv/bin/activate && rm -rf docs/api/*.md && doxide build && mkdocs build
git add forge/Operators/Gdft.h
git commit -m "docs: enhance Gdft.h with LaTeX math, example, and literature ref"
Task 2: GdftR2.h - DFT with R2* gradients
Files:
- Modify: forge/Operators/GdftR2.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- First-order Taylor expansion formula for the R2*-corrected signal model. The expansion improves accuracy by including the spatial gradients of the field map and R2* decay.
- Explanation of calcGradientMaps and the finite-difference operator Cd
- When to use GdftR2 vs plain Gdft (when R2* decay varies significantly across the FOV)
- Cross-refs: @see Gdft for the base DFT without gradients
- [ ] Step 2: Add usage example
Show construction with gradient maps, similar pattern to Gdft but with the R2* extension.
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Operators/GdftR2.h
git commit -m "docs: enhance GdftR2.h with Taylor expansion formula and example"
Task 3: Gnufft.h - Kaiser-Bessel NUFFT
Files:
- Modify: forge/Operators/Gnufft.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Gridding algorithm description: convolution with KB kernel on oversampled grid, FFT, then crop/deapodize
- KB kernel formula: $$\kappa(u) = \frac{1}{W} I_0\!\left(\beta\sqrt{1 - (2u/W)^2}\right)$$ where \(W\) is kernel width, \(\beta\) is shape parameter, \(I_0\) is modified Bessel function of the first kind
- Guidance: gridos (grid oversampling factor, typically 2.0), kernelWidth (typically 4-6), beta = pi * W * (1 - 0.5/gridos)
- LUT accuracy notes: precomputed lookup table for kernel evaluation
- Refs:
- @see Jackson et al., "Selection of a Convolution Function for Fourier Inversion Using Gridding," IEEE TMI, 1991. https://doi.org/10.1109/42.75611
- @see Pipe & Menon, "Sampling Density Compensation in MRI," MRM, 1999. https://doi.org/10.1002/(SICI)1522-2594(199901)41:1<179::AID-MRM25>3.0.CO;2-V
- @see Beatty et al., "Rapid Gridding Reconstruction with a Minimal Oversampling Ratio," IEEE TMI, 2005. https://doi.org/10.1109/TMI.2005.848376
- Cross-refs: @see Gdft for the brute-force DFT alternative, @see gridding.h for low-level gridding kernels
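The kernel formula and the beta rule of thumb listed above can be checked numerically with a short sketch. The function names here are hypothetical, and the power-series evaluation of \(I_0\) is a stand-in chosen for portability; forge's own helpers (such as the lookup table) may evaluate the kernel differently.

```cpp
#include <cassert>
#include <cmath>

// Modified Bessel function of the first kind, order zero, via its power series:
// I_0(x) = sum_{m >= 0} ((x/2)^2)^m / (m!)^2
double bessel_i0(double x) {
    const double q = 0.25 * x * x;
    double term = 1.0;
    double sum = 1.0;
    for (int m = 1; m < 200; ++m) {
        term *= q / (static_cast<double>(m) * static_cast<double>(m));
        sum += term;
        if (term < 1e-16 * sum) break;  // remaining terms are negligible
    }
    return sum;
}

// Shape parameter from the guidance above: beta = pi * W * (1 - 0.5 / gridos).
double kb_beta(double W, double gridos) {
    const double pi = std::acos(-1.0);
    return pi * W * (1.0 - 0.5 / gridos);
}

// kappa(u) = (1/W) * I_0(beta * sqrt(1 - (2u/W)^2)) for |u| <= W/2, else 0.
double kb_kernel(double u, double W, double beta) {
    assert(W > 0.0);
    const double r = 2.0 * u / W;
    if (std::fabs(r) > 1.0) return 0.0;
    return bessel_i0(beta * std::sqrt(1.0 - r * r)) / W;
}
```

For example, with `W = 4` and `gridos = 2.0` the rule gives `beta = 3 * pi`, the kernel peaks at `u = 0`, and it falls to `1 / W` at the edge of its support.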
- [ ] Step 2: Add usage example
/// @code
/// Gnufft<float> G(dataLength, 2.0f, Nx, Ny, Nz, kx, ky, kz, ix, iy, iz);
/// Col<cx_float> kdata = G * image; // forward: FFT + interpolation off the grid (degridding)
/// Col<cx_float> recon = G / kdata; // adjoint: gridding + FFT
/// @endcode
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Operators/Gnufft.h
git commit -m "docs: enhance Gnufft.h with KB kernel formula, refs, and example"
Task 4: Gfft.h - Uniform Cartesian FFT
Files:
- Modify: forge/Operators/Gfft.h
- [ ] Step 1: Enhance class-level doc comment
Minimal additions:
- Note that this wraps FFTW (CPU) or cuFFT (GPU) for fully-sampled Cartesian data
- Cross-refs: @see Gnufft for non-Cartesian data
- [ ] Step 2: Add usage example
/// @code
/// Gfft<float> F(Nx * Ny);
/// Col<cx_float> kdata = F * image; // forward FFT
/// Col<cx_float> recon = F / kdata; // inverse FFT
/// @endcode
- [ ] Step 3: Validate and commit
git add forge/Operators/Gfft.h
git commit -m "docs: enhance Gfft.h with example and cross-refs"
Task 5: SENSE.h - Multi-coil sensitivity encoding
Files:
- Modify: forge/Operators/SENSE.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Multi-coil model in LaTeX: $$\mathbf{d}_c = G(S_c \circ \mathbf{x}), \quad c = 1, \ldots, N_c$$ where \(S_c\) is the sensitivity map for coil \(c\) and \(\circ\) is element-wise multiplication
- Stacking convention: the full data vector concatenates all coils: \(\mathbf{d} = [\mathbf{d}_1^T, \ldots, \mathbf{d}_{N_c}^T]^T\)
- Template constraint: Tobj must support operator* (forward) and operator/ (adjoint). Compatible types: Gdft, Gnufft, Gfft, TimeSegmentation
- Ref: @see Pruessmann et al., "SENSE: Sensitivity Encoding for Fast MRI," MRM, 1999. https://doi.org/10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S
- Cross-refs: @see pcSENSE for phase-corrected variant, @see TimeSegmentation for field-corrected wrapper
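The stacking convention and the template constraint can be illustrated with a toy wrapper. Everything below is a hypothetical sketch in standard C++: `ToyOp` and `ToySense` are not forge types, and the real SENSE class may store maps and dimensions differently. The only point is that the wrapped type needs `operator*` (forward) and `operator/` (adjoint).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cx = std::complex<double>;
using cvec = std::vector<cx>;

// Toy encoding operator (hypothetical): multiplies every element by a fixed gain.
struct ToyOp {
    cx gain;
    cvec operator*(const cvec& x) const {  // forward
        cvec y(x);
        for (cx& v : y) v *= gain;
        return y;
    }
    cvec operator/(const cvec& y) const {  // adjoint (conjugate transpose)
        cvec x(y);
        for (cx& v : x) v *= std::conj(gain);
        return x;
    }
};

// Multi-coil wrapper: Tobj must provide operator* and operator/ on cvec.
template <typename Tobj>
class ToySense {
public:
    ToySense(Tobj G, std::vector<cvec> smaps)
        : G_(std::move(G)), smaps_(std::move(smaps)) {}

    // Forward: d = [G(S_1 o x); ...; G(S_Nc o x)], coils concatenated in order.
    cvec operator*(const cvec& x) const {
        cvec d;
        for (const cvec& S : smaps_) {
            assert(S.size() == x.size());
            cvec weighted(x.size());
            for (std::size_t i = 0; i < x.size(); ++i) weighted[i] = S[i] * x[i];
            const cvec dc = G_ * weighted;
            d.insert(d.end(), dc.begin(), dc.end());
        }
        return d;
    }

    // Adjoint: x = sum_c conj(S_c) o (G^H d_c).
    cvec operator/(const cvec& d) const {
        const std::size_t nc = smaps_.size();
        assert(nc > 0 && d.size() % nc == 0);
        const std::size_t nd = d.size() / nc;
        cvec x(smaps_[0].size(), cx(0.0, 0.0));
        for (std::size_t c = 0; c < nc; ++c) {
            const auto first = d.begin() + static_cast<std::ptrdiff_t>(c * nd);
            const cvec dc(first, first + static_cast<std::ptrdiff_t>(nd));
            const cvec xc = G_ / dc;
            for (std::size_t i = 0; i < x.size(); ++i)
                x[i] += std::conj(smaps_[c][i]) * xc[i];
        }
        return x;
    }

private:
    Tobj G_;
    std::vector<cvec> smaps_;
};
```

Because the wrapper itself exposes `operator*` and `operator/`, it satisfies the same contract it requires, which is what lets such operators nest.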
- [ ] Step 2: Add usage example
/// @code
/// Gnufft<float> G(dataLength, 2.0f, Nx, Ny, Nz, kx, ky, kz, ix, iy, iz);
/// SENSE<float, Gnufft<float>> S(G, SMap, nc);
/// Col<cx_float> allcoil_data = S * image; // forward: all coils
/// Col<cx_float> recon = S / allcoil_data; // adjoint: combined
/// @endcode
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Operators/SENSE.h
git commit -m "docs: enhance SENSE.h with multi-coil model, template constraints, and ref"
Task 6: pcSENSE.h - Phase-corrected SENSE
Files:
- Modify: forge/Operators/pcSENSE.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Per-shot phase-corrected model formula: for shot \(s\), \(\mathbf{d}_{c,s} = G_s(P_s \circ S_c \circ \mathbf{x})\) where \(P_s = e^{i\phi_s(\mathbf{r})}\) is the shot-specific phase map
- Distinction from SENSE: handles multi-shot acquisitions where each shot has a different motion-induced phase error (e.g., diffusion-weighted imaging with navigator-based correction)
- Cross-refs: @see SENSE for single-phase variant, @see pcSenseTimeSeg for combined phase + time-segmentation
- Refs:
- @see Liu, Moseley & Bammer, "Simultaneous phase correction and SENSE reconstruction for navigated multi-shot DWI," MRM, 2005. https://doi.org/10.1002/mrm.20706
- @see Holtrop & Sutton, "High spatial resolution diffusion weighted imaging on clinical 3T MRI scanners using multislab spiral acquisitions," J Med Imaging, 2016. https://doi.org/10.1117/1.JMI.3.2.023501
- [ ] Step 2: Add usage example
Show construction with per-shot phase maps and sensitivity maps.
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Operators/pcSENSE.h
git commit -m "docs: enhance pcSENSE.h with phase model formula, refs (Liu 2005, Holtrop 2016)"
Task 7: pcSenseTimeSeg.h - pcSENSE + Time Segmentation
Files:
- Modify: forge/Operators/pcSenseTimeSeg.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Combined model: per-shot phase correction + time-segmented field correction applied together
- When to use: multi-shot non-Cartesian acquisitions with significant B0 inhomogeneity (e.g., spiral DWI)
- Cross-refs: @see pcSENSE for phase correction without time segmentation, @see TimeSegmentation for field correction without per-shot phases
- [ ] Step 2: Add usage example
- [ ] Step 3: Validate and commit
git add forge/Operators/pcSenseTimeSeg.h
git commit -m "docs: enhance pcSenseTimeSeg.h with combined model description and example"
Chunk 2: Tier 1 - Penalties, Solvers, Gridding (user review checkpoint)
Task 8: Robject.h - Abstract penalty base class
Files:
- Modify: forge/Penalties/Robject.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Potential function family in LaTeX:
- \(\psi(d)\): potential function (penalty applied to each difference)
- \(\dot\psi(d) = \psi'(d)\): first derivative
- \(\omega(d) = \psi'(d)/d\): weighting function used in surrogate optimization
- Penalty evaluation: $$R(\mathbf{x}) = \sum_j \beta_j \psi([C\mathbf{x}]_j)$$ where \(C\) is the finite-difference operator
- Subclass contract: override wpot(), dpot(), pot() to define a custom penalty. Default implementations are quadratic (\(\psi(d) = \frac{1}{2}d^2\)).
- Explain Cd (forward finite differences) and Ctd (adjoint / transpose finite differences)
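To make the roles of the finite-difference operator and its transpose concrete, here is a 1D sketch in standard C++. It assumes non-periodic first-order differences and a single scalar \(\beta\); the function names are hypothetical, and forge's Cd and Ctd may use a different boundary treatment and dimensionality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using vec = std::vector<double>;

// (C x)_j = x_{j+1} - x_j for j = 0 .. n-2 (non-periodic forward differences).
vec forward_diff(const vec& x) {
    vec d(x.empty() ? 0 : x.size() - 1);
    for (std::size_t j = 0; j + 1 < x.size(); ++j) d[j] = x[j + 1] - x[j];
    return d;
}

// C^T d: each difference d_j contributes -d_j to element j and +d_j to element j+1.
vec forward_diff_transpose(const vec& d) {
    vec x(d.size() + 1, 0.0);
    for (std::size_t j = 0; j < d.size(); ++j) {
        x[j] -= d[j];
        x[j + 1] += d[j];
    }
    return x;
}

// R(x) = beta * sum_j psi([C x]_j) with the default quadratic psi(t) = t^2 / 2.
double quadratic_penalty(const vec& x, double beta) {
    assert(beta >= 0.0);
    double sum = 0.0;
    for (double t : forward_diff(x)) sum += 0.5 * t * t;
    return beta * sum;
}
```

For `x = {0, 1, 3}` the differences are `{1, 2}`, so with `beta = 2` the penalty is `2 * (0.5 + 2.0) = 5`. The transpose is what appears in the penalty gradient, \(C^T \dot\psi(C\mathbf{x})\).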
- [ ] Step 2: Add usage example
Show how a penalty plugs into the PCG solver.
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Penalties/Robject.h
git commit -m "docs: enhance Robject.h with potential function family and subclass contract"
Task 9: QuadPenalty.h - Quadratic (Tikhonov) penalty
Files:
- Modify: forge/Penalties/QuadPenalty.h
- [ ] Step 1: Enhance class-level doc comment
Add/convert:
- Penalty in LaTeX: $$R(\mathbf{x}) = \frac{\beta}{2}\|C\mathbf{x}\|^2$$
- Potential functions: \(\psi(d) = \frac{1}{2}d^2\), \(\dot\psi(d) = d\), \(\omega(d) = 1\)
- \(\beta\) guidance: controls regularization strength. Larger \(\beta\) = smoother images but a larger data residual (weaker data fidelity). Typical range depends on SNR and data scaling.
- Cross-ref: @see TVPenalty for edge-preserving alternative
- [ ] Step 2: Add usage example
/// @code
/// QuadPenalty<float> R(Nx, Ny, Nz, beta);
/// auto xhat = solve_pwls_pcg<float>(x0, G, W, yi, R, niter);
/// @endcode
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Penalties/QuadPenalty.h
git commit -m "docs: enhance QuadPenalty.h with LaTeX formula and beta guidance"
Task 10: TVPenalty.h - Total Variation penalty
Files:
- Modify: forge/Penalties/TVPenalty.h
- [ ] Step 1: Enhance class-level doc comment
Add/convert:
- Hyperbola (Charbonnier) potential in LaTeX: $$\psi(d) = \delta^2\!\left(\sqrt{1 + (d/\delta)^2} - 1\right)$$
- Derivative: $$\dot\psi(d) = \frac{d}{\sqrt{1 + (d/\delta)^2}}$$
- Weight: $$\omega(d) = \frac{1}{\sqrt{1 + (d/\delta)^2}}$$
- Correct the existing header comment: this is NOT the Fair potential |d| - delta*log(1+|d|/delta). It is the hyperbola/Charbonnier penalty. Behaves quadratically near zero, linearly for large differences.
- \(\delta\) guidance: controls the quadratic-to-linear transition. Smaller \(\delta\) = sharper edges, closer to true TV, but harder optimization.
- Ref: @see Rudin, Osher & Fatemi, "Nonlinear Total Variation Based Noise Removal Algorithms," Physica D, 1992. https://doi.org/10.1016/0167-2789(92)90242-F (describes exact TV — this implementation is a smooth approximation)
- Cross-ref: @see QuadPenalty for the simpler quadratic alternative
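The three hyperbola formulas above are easy to cross-check against each other in code. The sketch below uses hypothetical free-function names rather than the pot(), dpot(), and wpot() members of the actual class, and it evaluates the formulas directly without any special handling for very small differences.

```cpp
#include <cassert>
#include <cmath>

// Weight: omega(d) = 1 / sqrt(1 + (d/delta)^2).
double hyper_wpot(double d, double delta) {
    assert(delta > 0.0);
    const double r = d / delta;
    return 1.0 / std::sqrt(1.0 + r * r);
}

// Derivative: psi'(d) = d / sqrt(1 + (d/delta)^2) = d * omega(d).
double hyper_dpot(double d, double delta) {
    return d * hyper_wpot(d, delta);
}

// Potential: psi(d) = delta^2 * (sqrt(1 + (d/delta)^2) - 1).
double hyper_pot(double d, double delta) {
    assert(delta > 0.0);
    const double r = d / delta;
    return delta * delta * (std::sqrt(1.0 + r * r) - 1.0);
}
```

Near zero the potential behaves like \(d^2/2\), and for \(|d| \gg \delta\) it grows like \(\delta |d|\), which is the quadratic-to-linear transition that \(\delta\) controls.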
- [ ] Step 2: Add usage example
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Penalties/TVPenalty.h
git commit -m "docs: enhance TVPenalty.h with LaTeX formulas, delta guidance, and ROF ref"
Task 11: solve_pwls_pcg.hpp - PCG solver
Files:
- Modify: forge/Solvers/solve_pwls_pcg.hpp
- [ ] Step 1: Enhance doc comment on solve_pwls_pcg function
Add/convert:
- PWLS objective in LaTeX: $$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \frac{1}{2}\|\mathbf{W}^{1/2}(\mathbf{y} - A\mathbf{x})\|^2 + R(\mathbf{x})$$ (the factor \(\frac{1}{2}\) makes the gradient below exact)
- PCG algorithm outline: gradient = \(A^H W(A\mathbf{x} - \mathbf{y}) + \nabla R(\mathbf{x})\), Polak-Ribière-Polyak \(\beta\) update, quadratic surrogate step-size
- Convergence thresholds: 1e-10 for zero-gradient detection (normalized gradient magnitude), 1e-20 for denominator guard in step-size computation
- Thread safety note: reads g_should_stop atomic for early termination
- Ref: @see Fessler, "Penalized Weighted Least-Squares Image Reconstruction for Positron Emission Tomography," IEEE TMI, 1994. https://doi.org/10.1109/42.293921
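The algorithm outline can be made concrete on a toy problem. The sketch below is a simplified stand-in, not the forge solver: it is real-valued and 1D, takes \(A = I\), uses a quadratic roughness penalty with first differences, applies no preconditioner, and borrows the 1e-20 guards purely for illustration. For this quadratic objective the surrogate step size coincides with the exact line-search step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using vec = std::vector<double>;

double dot(const vec& a, const vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Gradient of Psi(x) = 1/2 sum_i w_i (y_i - x_i)^2 + beta/2 sum_j (x_{j+1} - x_j)^2,
// i.e. W (x - y) + beta * C^T C x.
vec pwls_gradient(const vec& x, const vec& y, const vec& w, double beta) {
    vec g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) g[i] = w[i] * (x[i] - y[i]);
    for (std::size_t j = 0; j + 1 < x.size(); ++j) {
        const double d = x[j + 1] - x[j];
        g[j] -= beta * d;
        g[j + 1] += beta * d;
    }
    return g;
}

// Curvature dir^T (W + beta * C^T C) dir along a search direction.
double pwls_curvature(const vec& dir, const vec& w, double beta) {
    double c = 0.0;
    for (std::size_t i = 0; i < dir.size(); ++i) c += w[i] * dir[i] * dir[i];
    for (std::size_t j = 0; j + 1 < dir.size(); ++j) {
        const double d = dir[j + 1] - dir[j];
        c += beta * d * d;
    }
    return c;
}

// Conjugate gradient with a Polak-Ribiere update, restarted when it goes negative.
vec pwls_cg_1d(vec x, const vec& y, const vec& w, double beta, int niter) {
    assert(x.size() == y.size() && y.size() == w.size());
    vec g = pwls_gradient(x, y, w, beta);
    vec dir(g.size());
    for (std::size_t i = 0; i < g.size(); ++i) dir[i] = -g[i];
    for (int it = 0; it < niter; ++it) {
        const double gg = dot(g, g);
        if (gg < 1e-20) break;                       // gradient is effectively zero
        const double curv = pwls_curvature(dir, w, beta);
        if (curv < 1e-20) break;                     // guard the step-size denominator
        const double step = -dot(g, dir) / curv;
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += step * dir[i];
        const vec g_new = pwls_gradient(x, y, w, beta);
        double num = 0.0;
        for (std::size_t i = 0; i < g.size(); ++i) num += g_new[i] * (g_new[i] - g[i]);
        const double gamma = std::max(0.0, num / gg);
        for (std::size_t i = 0; i < g.size(); ++i) dir[i] = -g_new[i] + gamma * dir[i];
        g = g_new;
    }
    return x;
}
```

With `beta = 0` and unit weights the first step lands exactly on `y`; with `beta > 0` the result trades data fidelity for smoothness, and the gradient at the returned point should be numerically zero.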
- [ ] Step 2: Enhance norm_grad doc comment
Add LaTeX for the normalized gradient: \(\|\nabla\| = \frac{\|\mathbf{g}\|}{\mathbf{y}^H(W \circ \mathbf{y})}\)
- [ ] Step 3: Add usage example
/// @code
/// Gnufft<float> G(n1, 2.0f, Nx, Ny, 1, kx, ky, kz, ix, iy, iz); // n1 samples per coil, n2 = Nx * Ny voxels
/// SENSE<float, Gnufft<float>> S(G, SMap, nc);
/// QuadPenalty<float> R(Nx, Ny, 1, 0.001f);
/// Col<float> W(n1 * nc, fill::ones);
/// forgeCol<forgeComplex<float>> x0(n2); x0.zeros();
/// auto xhat = solve_pwls_pcg<float>(x0, S, W, yi, R, 50);
/// @endcode
- [ ] Step 4: Convert Unicode math to LaTeX, validate, commit
git add forge/Solvers/solve_pwls_pcg.hpp
git commit -m "docs: enhance solve_pwls_pcg with PWLS objective, algorithm outline, and ref"
Task 12: reconSolve.h - High-level reconstruction helper
Files:
- Modify: forge/Solvers/reconSolve.h
- [ ] Step 1: Enhance doc comments
Add:
- Workflow description: reconSolve initializes image-space coordinates, constructs the encoding operator, and calls solve_pwls_pcg
- Coordinate formula in LaTeX: image coordinates span \([0, (N-1)/N]\) in each dimension
- Cross-ref: @see solve_pwls_pcg for the underlying solver
- [ ] Step 2: Add usage example showing full pipeline
- [ ] Step 3: Validate and commit
git add forge/Solvers/reconSolve.h
git commit -m "docs: enhance reconSolve.h with workflow description and example"
Task 13: TimeSegmentation.h - Time-segmented field correction
Files:
- Modify: forge/Gridding/TimeSegmentation.h
- [ ] Step 1: Enhance class-level doc comment
Add:
- Time-segmentation approximation in LaTeX: $$e^{-i\omega(\mathbf{r})t} \approx \sum_{l=1}^{L} b_l(t)\, e^{-i\omega(\mathbf{r})\tau_l}$$ where \(\omega(\mathbf{r})\) is the field map in rad/s, \(\tau_l\) are segment centers, \(b_l(t)\) are interpolation coefficients
- Interpolation types: Hanning window vs min-max (Fessler) — min-max is more accurate but slower to precompute
- \(L\) selection guidance: more segments = better accuracy but more computation. Typical: 4-8 for moderate inhomogeneity.
- Refs:
- @see Sutton et al., "Fast, Iterative Image Reconstruction for MRI in the Presence of Field Inhomogeneities," IEEE TMI, 2003. https://doi.org/10.1109/TMI.2002.808360
- @see Man et al., "Multifrequency Interpolation for Fast Off-Resonance Correction," MRM, 1997. https://doi.org/10.1002/mrm.1910370523
- Cross-refs: @see Gdft for brute-force field correction (exact but slow)
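The approximation above can be tried out for a single off-resonance value with a few lines of standard C++. This sketch assumes evenly spaced segment centers over \([0, t_{max}]\) and a Hanning-window interpolator whose width equals the segment spacing; both choices and all names are illustrative assumptions, and the min-max interpolator is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

using cx = std::complex<double>;

// Hanning-window interpolation weight b_l(t) for a segment centered at tau_l.
// delta is the spacing between adjacent segment centers.
double hanning_weight(double t, double tau_l, double delta) {
    const double pi = std::acos(-1.0);
    const double u = (t - tau_l) / delta;
    if (std::fabs(u) >= 1.0) return 0.0;
    return 0.5 + 0.5 * std::cos(pi * u);
}

// Approximate exp(-i * omega * t) by sum_l b_l(t) * exp(-i * omega * tau_l)
// using L >= 2 segment centers evenly spaced over [0, t_max].
cx time_seg_phase(double omega, double t, double t_max, int L) {
    assert(L >= 2 && t >= 0.0 && t <= t_max);
    const double delta = t_max / (L - 1);
    cx acc(0.0, 0.0);
    for (int l = 0; l < L; ++l) {
        const double tau = l * delta;
        acc += hanning_weight(t, tau, delta) * std::polar(1.0, -omega * tau);
    }
    return acc;
}
```

Between two neighboring centers the two active Hanning weights sum to one, so the approximation is exact at the segment centers and its error shrinks as \(L\) grows, which is the accuracy versus cost trade-off described in the \(L\) guidance.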
- [ ] Step 2: Add usage example
/// @code
/// Gnufft<float> G(dataLength, gridos, Nx, Ny, Nz, kx, ky, kz, ix, iy, iz);
/// TimeSegmentation<float, Gnufft<float>> Gts(G, FM, t, L, interptype, gridos);
/// SENSE<float, TimeSegmentation<float, Gnufft<float>>> S(Gts, SMap, nc);
/// @endcode
- [ ] Step 3: Convert Unicode math to LaTeX, validate, commit
git add forge/Gridding/TimeSegmentation.h
git commit -m "docs: enhance TimeSegmentation.h with approximation formula, L guidance, and refs"
Task 14: gridding.h - Low-level gridding kernels
Files:
- Modify: forge/Gridding/gridding.h
- [ ] Step 1: Enhance doc comments
Add:
- Gridding algorithm overview: adjoint = scatter k-space samples onto oversampled Cartesian grid via KB convolution; forward = sample from grid at non-Cartesian locations
- KB kernel formula: $$\kappa(u) = \frac{1}{W} I_0\!\left(\beta\sqrt{1 - (2u/W)^2}\right)$$ for \(|u| \leq W/2\), zero otherwise
- Oversampling factor: grid is \(\text{gridos} \times N\) in each dimension (typically 2.0)
- Density compensation: data weights \(W\) compensate for non-uniform k-space sampling density
- Refs: same Jackson, Pipe, Beatty refs as Gnufft
- Cross-ref: @see Gnufft for the high-level operator, @see griddingSupport.h for kernel helpers
- [ ] Step 2: Validate and commit
git add forge/Gridding/gridding.h
git commit -m "docs: enhance gridding.h with KB kernel formula, algorithm overview, and refs"
Task 15: Tier 1 validation and user review checkpoint
- [ ] Step 1: Full rebuild
source docs/.venv/bin/activate && rm -rf docs/api/*.md && doxide build && mkdocs build
- [ ] Step 2: Start dev server for user review
source docs/.venv/bin/activate && mkdocs serve -a 127.0.0.1:8000
- [ ] Step 3: Present diff to user
Run git diff main -- forge/Operators/ forge/Penalties/ forge/Solvers/ forge/Gridding/TimeSegmentation.h forge/Gridding/gridding.h and present for review. PAUSE HERE — wait for user approval before proceeding to Tier 2.
Chunk 3: Tiers 2-3 - Core Types and Utilities (autonomous)
Task 16: forgeCol.hpp - GPU-aware column vector
Files:
- Modify: forge/Core/forgeCol.hpp
- [ ] Step 1: Enhance class-level and method doc comments
Add:
- GPU/CPU sync: isOnGPU flag tracks data location. After GPU operations, data is on device. getArma() triggers memcpy back to host.
- View safety: views are non-owning. The parent must outlive the view. set_size() on a view breaks it (allocates fresh memory).
- Metal dispatch: operations dispatch to Metal GPU when METAL_COMPUTE is defined, T = float, and n_elem >= 4096.
- getArma() cost: non-const version does OpenACC device→host update; const version is zero-copy view.
- @warning getArmaComplex() returns a temporary view — never capture with auto.
- [ ] Step 2: Add usage example
- [ ] Step 3: Validate and commit
git add forge/Core/forgeCol.hpp
git commit -m "docs: enhance forgeCol.hpp with GPU sync semantics, view safety, and example"
Task 17: forgeMat.hpp - GPU-aware matrix
Files:
- Modify: forge/Core/forgeMat.hpp
- [ ] Step 1: Enhance class-level and method doc comments
Add:
- @warning set_size(nCols, nRows) — first argument is COLUMNS, second is ROWS. This is opposite to most matrix APIs. The constructor forgeMat(nRows, nCols) takes rows first (normal order).
- Column-major layout: element \((r, c)\) at index \(r + n\_rows \times c\)
- col() returns non-owning view (fast), col_copy() returns deep copy (safe for long-lived use)
- getArma(): non-const does OpenACC update; const is zero-copy
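The column-major index rule is worth showing once in code, since it also explains why a column view can be non-owning and cheap. The `ColMajor` struct below is a hypothetical stand-in written for this example; it is not forgeMat and does not reproduce its set_size argument order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal column-major matrix (rows first in the constructor).
struct ColMajor {
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<float> mem;

    ColMajor(std::size_t rows, std::size_t cols)
        : n_rows(rows), n_cols(cols), mem(rows * cols, 0.0f) {}

    // Element (r, c) lives at linear index r + n_rows * c.
    std::size_t index(std::size_t r, std::size_t c) const {
        assert(r < n_rows && c < n_cols);
        return r + n_rows * c;
    }

    float& at(std::size_t r, std::size_t c) { return mem[index(r, c)]; }

    // Start of column c: its n_rows elements are contiguous in memory,
    // so a column can be exposed as a pointer without copying.
    const float* col_ptr(std::size_t c) const {
        assert(c < n_cols);
        return mem.data() + n_rows * c;
    }
};
```

In a 3 by 2 matrix, element (1, 1) sits at index `1 + 3 * 1 = 4`, and `col_ptr(1)[1]` refers to the same storage.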
- [ ] Step 2: Add usage example
- [ ] Step 3: Validate and commit
git add forge/Core/forgeMat.hpp
git commit -m "docs: enhance forgeMat.hpp with set_size warning, layout docs, and example"
Task 18: forgeComplex.hpp - GPU-compatible complex type
Files:
- Modify: forge/Core/forgeComplex.hpp
- [ ] Step 1: Enhance doc comments
Add:
- Memory layout note: identical to std::complex<T> (interleaved real, imag). Safe to reinterpret_cast between the two.
- GPU compatibility: no virtual methods, no exceptions — safe for OpenACC parallel regions and Metal shaders.
- Document key operators and free functions (abs, arg, norm, conj, polar)
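The layout claim can be backed by compile-time checks when documenting it. The sketch below uses a hypothetical `PodComplex` struct in place of forgeComplex, so the assertions only demonstrate the pattern; whether the real type passes them has to be verified against the header.

```cpp
#include <cassert>
#include <complex>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a GPU-friendly complex type: two plain members,
// no virtual functions, no exceptions.
template <typename T>
struct PodComplex {
    T re;
    T im;
};

static_assert(std::is_standard_layout<PodComplex<float>>::value,
              "must be standard layout");
static_assert(std::is_trivially_copyable<PodComplex<float>>::value,
              "must be trivially copyable for device transfers");
static_assert(sizeof(PodComplex<float>) == sizeof(std::complex<float>),
              "must match std::complex in size");

// Convert by copying bytes: std::complex<T> stores {real, imag} contiguously.
template <typename T>
PodComplex<T> from_std(const std::complex<T>& z) {
    PodComplex<T> out;
    std::memcpy(&out, &z, sizeof(out));
    return out;
}
```

The conversion here goes through `std::memcpy` instead of `reinterpret_cast`, which gives the same bytes while sidestepping strict-aliasing questions; that is a design choice of this example, not a statement about how forge converts.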
- [ ] Step 2: Add usage example
- [ ] Step 3: Validate and commit
git add forge/Core/forgeComplex.hpp
git commit -m "docs: enhance forgeComplex.hpp with layout notes, GPU compat, and example"
Task 19: Tier 3 utilities - ForgeLog, IO, FFT, griddingSupport, griddingTypes
Files:
- Modify: forge/Core/ForgeLog.hpp
- Modify: forge/IO/processIsmrmrd.hpp
- Modify: forge/IO/processNIFTI.hpp
- Modify: forge/FFT/fftCPU.h
- Modify: forge/FFT/ftCpu.h
- Modify: forge/FFT/ftCpuWithGrads.h
- Modify: forge/FFT/fftAccelerate.h
- Modify: forge/FFT/fftGPU.h
- Modify: forge/Gridding/griddingSupport.h
- Modify: forge/Core/griddingTypes.h
- Modify: forge/IO/directRecon.h
- Modify: forge/IO/acqTracking.h
- Modify: forge/Solvers/solve_grad_desc.hpp
- [ ] Step 1: Enhance each file per spec
For each file, add the items listed in the Tier 3 table of the spec. Key additions:
- ForgeLog.hpp: Progress bar lifecycle (add → tick → done), JSONL message types, example
- processIsmrmrd.hpp: Template function docs, data layout conventions
- processNIFTI.hpp: LPS → RAS coordinate system notes, quaternion conversion ref
- fftCPU.h: FFTW plan notes, memory layout (interleaved real/imag, length 2·N)
- ftCpu.h: LaTeX for DFT kernel formula, cross-ref to Gdft
- ftCpuWithGrads.h: R2* gradient kernel docs, cross-ref to GdftR2
- fftAccelerate.h/fftGPU.h: Minimal — example, cross-refs
- griddingSupport.h: KB kernel helpers (bessi0, calculateLUT), LaTeX for KB formula, cross-ref to gridding.h
- griddingTypes.h: ReconstructionSample<T1> and parameters<T1> field descriptions
- directRecon.h: Density compensation, SoS combination docs
- acqTracking.h: Acquisition tracking data organization
- solve_grad_desc.hpp: Gradient descent objective, cross-ref to solve_pwls_pcg
- [ ] Step 2: Validate all
source docs/.venv/bin/activate && rm -rf docs/api/*.md && doxide build && mkdocs build
- [ ] Step 3: Commit in groups
git add forge/Core/ForgeLog.hpp forge/Core/griddingTypes.h
git commit -m "docs: enhance Core utility headers (ForgeLog, griddingTypes)"
git add forge/FFT/
git commit -m "docs: enhance FFT headers with math, examples, and cross-refs"
git add forge/IO/ forge/Gridding/griddingSupport.h forge/Solvers/solve_grad_desc.hpp
git commit -m "docs: enhance IO, griddingSupport, and solve_grad_desc headers"
Chunk 4: Tiers 4-6 - Metal, Internals, Guides (autonomous)
Task 20: Tier 4 - Metal shader documentation
Files:
- Modify: forge/Metal/vectorops_metal.metal
- Modify: forge/Metal/MetalVectorOps.h
- Modify: forge/Metal/MetalVectorOps.mm
- Modify: forge/Metal/MetalNufftPipeline.h
- Modify: forge/Metal/MetalNufftPipeline.mm
- Modify: forge/Metal/MetalGridding.h (gridding dispatch wrapper)
- Modify: Any other *.metal files in forge/Metal/ (dft_metal.metal, gridding_metal.metal, nufft_support_metal.metal)
- [ ] Step 1: Document Metal compute kernels
For each .metal file:
- Add per-kernel doc comment: inputs, outputs, dispatch semantics
- Data layout: interleaved complex (float2 = {real, imag})
- Threadgroup sizing and workgroup notes
- Cross-ref to the C++ dispatch wrapper
For .h/.mm files:
- Document dispatch strategy (buffer binding, command encoding)
- Error handling patterns
- Cross-ref to forgeCol operator dispatch
- [ ] Step 2: Validate and commit
git add forge/Metal/
git commit -m "docs: add Metal shader and pipeline documentation for CUDA porting reference"
Task 21: Tier 5 - Private/internal method pass
Files:
- All files already modified in Tiers 1-4
- [ ] Step 1: Add one-liner docs to undocumented private methods
Go through each file modified in previous tasks and add brief doc comments to:
- Private helper functions in operator classes
- Internal state management (memory allocation, GPU sync)
- Implementation details in solvers (step-size selection, convergence checks)
- SFINAE/template machinery (detail::has_pgcol_ops, guard functions)
One-liner format: /// Compute the step size via quadratic surrogate approximation.
Do NOT add lengthy documentation — a "what and why" one-liner is sufficient.
- [ ] Step 2: Validate and commit
source docs/.venv/bin/activate && rm -rf docs/api/*.md && doxide build && mkdocs build
git add forge/Operators/ forge/Penalties/ forge/Solvers/ forge/Gridding/ forge/Core/ forge/FFT/ forge/IO/ forge/Metal/
git commit -m "docs: add one-liner docs to private/internal methods across all tiers"
Task 22: Tier 6 - Hand-authored guides refresh
Files:
- Modify: docs/guides/operators-and-solvers.md
- Modify: docs/guides/forge-types.md
- Modify: docs/guides/metal-backend.md
- Modify: docs/guides/forgeview.md
- Modify: docs/guides/mpi.md
- Modify: docs/getting-started/building.md
- Modify: docs/getting-started/installation.md
- Modify: docs/getting-started/first-reconstruction.md
- [ ] Step 1: Verify each guide against current code
For each file:
1. Read the guide
2. Cross-reference against the current source code (header files, CMakeLists.txt, Boost.ProgramOptions definitions)
3. Fix any stale information:
   - Operator tables that don't match current headers
   - CLI flags that have changed
   - Build commands that have changed
   - Docker image names that have changed
   - Dependency versions that have changed
- [ ] Step 2: Update operator table in operators-and-solvers.md
Verify every row in the operator, regularization, and solver tables matches the current headers. Add any missing entries.
- [ ] Step 3: Update forge-types.md
Verify operator tables and free function lists against current forgeCol.hpp and forgeMat.hpp.
- [ ] Step 4: Update metal-backend.md
Verify kernel list against current Metal shader files. Update build commands. Check phase status (1/2/3).
- [ ] Step 5: Validate and commit
source docs/.venv/bin/activate && mkdocs build
git add docs/
git commit -m "docs: refresh hand-authored guides to match current codebase"
Task 23: Final validation
- [ ] Step 1: Clean full rebuild
source docs/.venv/bin/activate && rm -rf docs/api/*.md site/
doxide build && mkdocs build
- [ ] Step 2: Spot-check rendered pages
Start mkdocs serve and verify:
- [ ] LaTeX math renders in API Reference pages (check solve_pwls_pcg, QuadPenalty, Gdft)
- [ ] @see references appear as formatted citations
- [ ] @code examples render as syntax-highlighted code blocks
- [ ] Cross-references between classes work
- [ ] No @brief or other raw Doxygen commands visible
- [ ] Guides content is current and accurate
- [ ] Step 3: Push
git push
Intentionally Deferred Files
These headers exist but are excluded from this pass — their documentation is adequate or they are low-traffic internal files:
- forge/Core/forge.h: aggregate header (just includes)
- forge/Core/ForgeIncludes.h: internal includes/macros
- forge/Core/ForgeExitCodes.hpp: enum definitions, self-documenting
- forge/Core/SignalHandler.hpp: small utility, already has comments
- forge/Core/Tracer.hpp: NVTX/Instruments wrapper, context clear from usage
- forge/Core/AccelerateDispatch.hpp: Apple vDSP dispatch, internal to forgeCol
- forge/Core/forgeSubview_Col.hpp: column subview, internal type
- forge/FFT/fftshift.hpp: fftshift utility, self-explanatory