forge¶
Fast, Optimized Reconstruction via Gridding Engine
Types¶
| Name | Description |
|---|---|
| Robject | Abstract base class for regularization penalty functions. |
| GdftR2 | Non-uniform DFT operator extended with R2* (transverse relaxation) gradient maps. |
| Gnufft | Non-uniform Fast Fourier Transform operator using Kaiser-Bessel gridding. |
| pcSenseTimeSeg | Phase-corrected multi-shot SENSE with time-segmented field-map correction. |
| TimeSegmentation | Off-resonance correction via time-segmented interpolation. |
| Gfft | Uniform Cartesian FFT encoding operator. |
| ReconstructionSample | Single k-space sample with trajectory coordinates and density compensation. |
| SENSE | Multi-coil sensitivity encoding (SENSE) operator. |
| PGProgressManager | Singleton manager for stacked progress bar display. |
| PGImagePreviewContext | Singleton storing image dimensions for preview extraction. |
| pcSENSE | Phase-corrected multi-shot SENSE operator. |
| Tracer | RAII wrapper that pushes/pops an NVTX named range for GPU profiling. |
| SignpostTracer | RAII wrapper that emits os_signpost intervals for Apple Instruments. |
| QuadPenalty | Quadratic (Tikhonov) regularization penalty. |
| NiftiOrientation | Orientation metadata for NIfTI output files. |
| TVPenalty | Approximate total-variation (TV) regularization penalty. |
| Gdft | Non-uniform field-corrected discrete Fourier transform (DFT) operator. |
Variables¶
| Name | Description |
|---|---|
| dest | Query or set the JSONL output destination. |
| PG_NO_PARENT_BAR | Sentinel value indicating no parent progress bar is active. |
Operators¶
| Name | Description |
|---|---|
| operator% | Real-weight element-wise multiply with a complex vector: out[i] = W[i] · X[i]. W is real (forgeCol<T>), X is complex (forgeCol<forgeComplex<T>>). Dispatches to Accelerate vDSP rvec_cmul on Apple platforms for float. T : Floating-point base type. W : Real weight vector. X : Complex vector, same length as W. Return : Complex vector of element-wise products. |
| operator/ | Scalar divided by each element: out[i] = s / A[i]. T : Element type. s : Scalar numerator. A : Denominator vector. Return : New vector of quotients. |
Functions¶
| Name | Description |
|---|---|
| gridCoilImages | Grid all coil images from an ISMRMRD dataset. |
| calcSumOfSquaresImage | Compute a sum-of-squares combination image from individual coil images. |
| metal_dft_create | Create a Metal DFT context for Gdft (no gradient correction). |
| metal_dft_create_with_grads | Create a Metal DFT context for GdftR2 (with gradient correction). |
| metal_dft_forward | Forward DFT: image → k-space. ctx : Active Metal DFT context. idata_r : Input image real part (num_i floats). idata_i : Input image imag part (num_i floats). kdata_r : Output k-space real part (num_k floats). kdata_i : Output k-space imag part (num_k floats). |
| metal_dft_adjoint | Adjoint DFT: k-space → image. ctx : Active Metal DFT context. kdata_r : Input k-space real part (num_k floats). kdata_i : Input k-space imag part (num_k floats). idata_r : Output image real part (num_i floats). idata_i : Output image imag part (num_i floats). |
| metal_dft_destroy | Destroy a Metal DFT context and release resources. |
| metal_nufft_pipeline_create | Create a full GPU NUFFT pipeline context. |
| metal_nufft_pipeline_destroy | Destroy a pipeline context and release all GPU resources. |
| metal_nufft_forward | Full forward NUFFT: image → k-space (all on GPU). ctx : Active pipeline context. imageIn : Input image, interleaved complex (2 * NxNyNz floats). samplesOut : Output k-space, interleaved complex (2 * numSamples floats). |
| metal_nufft_adjoint | Full adjoint NUFFT: k-space → image (all on GPU). ctx : Active pipeline context. samplesIn : Input k-space, interleaved complex (2 * numSamples floats). imageOut : Output image, interleaved complex (2 * NxNyNz floats). |
| calculateLUT | Precompute a Kaiser-Bessel kernel lookup table. |
| kernel_value_LUT | Evaluate the Kaiser-Bessel kernel from a precomputed lookup table. |
| deinterleave_data2d | Deinterleave a 2-D interleaved complex buffer into separate real and imaginary arrays. |
| deinterleave_data3d | Deinterleave a 3-D interleaved complex buffer into separate real and imaginary arrays. |
| deapodization2d | Apply 2-D deapodization correction (divide out the KB convolution kernel in image space). |
| deapodization3d | Apply 3-D deapodization correction (divide out the KB convolution kernel in image space). |
| crop_center_region2d | Crop the centre region of a 2-D oversampled grid to the target image size. |
| crop_center_region3d | Crop the centre region of a 3-D oversampled grid to the target image size. |
| zero_pad2d | Zero-pad a 2-D image onto an oversampled grid (inverse of crop_center_region2d). |
| zero_pad3d | Zero-pad a 3-D image onto an oversampled grid (inverse of crop_center_region3d). |
| circshift2 | 2-D circular shift by (xshift, yshift) elements. |
| fftshift2 | 2-D forward FFT shift (swap quadrants so DC is at centre). |
| fftshift3 | 3-D forward FFT shift (swap octants so DC is at centre). |
| ifftshift2 | 2-D inverse FFT shift (undo fftshift2). |
| ifftshift3 | 3-D inverse FFT shift (undo fftshift3). |
| fft2shift_grid | In-place 2-D FFT shift on a complex grid (std::complex variant). |
| fft3shift_grid | In-place 3-D FFT shift on a complex grid (std::complex variant). |
| normalize_fft2d | Normalise a 2-D interleaved buffer by \(1/(n_x \times n_y)\) after an inverse FFT. |
| normalize_fft3d | Normalise a 3-D interleaved buffer by \(1/(n_x \times n_y \times n_z)\) after an inverse FFT. |
| ifft2dAccelerate | In-place 2D inverse FFT (float, interleaved). |
| fft3dAccelerate | In-place 3D forward FFT (float, interleaved). |
| ifft3dAccelerate | In-place 3D inverse FFT (float, interleaved). |
| fft2dAccelerate | In-place 2D forward FFT (double, interleaved) — delegates to FFTW. |
| ifft2dAccelerate | In-place 2D inverse FFT (double, interleaved) — delegates to FFTW. |
| fft3dAccelerate | In-place 3D forward FFT (double, interleaved) — delegates to FFTW. |
| ifft3dAccelerate | In-place 3D inverse FFT (double, interleaved) — delegates to FFTW. |
| ifft2dCPU | In-place 2-D inverse FFT (single precision). |
| fft2dCPU | In-place 2-D forward FFT (single precision). |
| ifft3dCPU | In-place 3-D inverse FFT (single precision). |
| fft3dCPU | In-place 3-D forward FFT (single precision). |
| ifft2dCPU | In-place 2-D inverse FFT (double precision). |
| fft1dCPU | In-place 1-D forward FFT (double precision). |
| fft2dCPU | In-place 2-D forward FFT (double precision). |
| ifft3dCPU | In-place 3-D inverse FFT (double precision). |
| fft3dCPU | In-place 3-D forward FFT (double precision). |
| gridding_adjoint_2D | 2-D Kaiser-Bessel adjoint gridding (k-space -> oversampled grid). |
| gridding_adjoint_3D | 3-D Kaiser-Bessel adjoint gridding (k-space -> oversampled grid). |
| gridding_forward_2D | 2-D Kaiser-Bessel forward gridding (oversampled grid -> k-space samples). |
| gridding_forward_3D | 3-D Kaiser-Bessel forward gridding (oversampled grid -> k-space samples). |
| computeFH_CPU_Grid | Full adjoint NUFFT pipeline: k-space -> image (grid, IFFT, crop, deapodize). T1 : Floating-point precision type. numK_per_coil : Number of k-space samples per coil. kx : k-space x-coordinates. ky : k-space y-coordinates. kz : k-space z-coordinates. dIn : Input k-space data (interleaved real/imag). Nx : Image size in x. Ny : Image size in y. Nz : Image size in z. gridOS : Grid oversampling factor. kernelWidth : Kaiser-Bessel kernel width. beta : Kaiser-Bessel kernel shape parameter beta. LUT : Precomputed Kaiser-Bessel lookup table. sizeLUT : Number of LUT entries. stream : OpenACC/CUDA stream (NULL for CPU). plan : cuFFT plan pointer (NULL for CPU). pGridData_crop_deAp : Scratch: deapodized cropped image. pGridData_crop_d : Scratch: cropped grid data (GPU). pGridData : Scratch: full oversampled grid. pGridData_d : Scratch: GPU-side oversampled grid. |
| computeFd_CPU_Grid | Full forward NUFFT pipeline: image -> k-space (deapodize, zero-pad, FFT, grid). T1 : Floating-point precision type. numK_per_coil : Number of k-space samples per coil. kx : k-space x-coordinates. ky : k-space y-coordinates. kz : k-space z-coordinates. dIn : Input image data (interleaved real/imag). Nx : Image size in x. Ny : Image size in y. Nz : Image size in z. gridOS : Grid oversampling factor. kernelWidth : Kaiser-Bessel kernel width. beta : Kaiser-Bessel kernel shape parameter beta. LUT : Precomputed Kaiser-Bessel lookup table. sizeLUT : Number of LUT entries. stream : OpenACC/CUDA stream (NULL for CPU). plan : cuFFT plan pointer (NULL for CPU). pGridData : Scratch: full oversampled grid (CPU). pGridData_d : Scratch: GPU-side oversampled grid. pGridData_os : Scratch: zero-padded oversampled grid (CPU). pGridData_os_d : Scratch: GPU-side zero-padded grid. pSamples : Output k-space samples array (interleaved real/imag). |
| sheppLogan2D | Generate a 2D modified Shepp-Logan phantom image. |
| metal_gridding_create | Allocate and initialise a Metal gridding context. |
| metal_gridding_destroy | Destroy a Metal gridding context and release all Metal resources. |
| metal_gridding_adjoint_2D | Adjoint (k-space → image) gridding for a 2D problem. |
| metal_gridding_adjoint_3D | Adjoint (k-space → image) gridding for a 3D problem. |
| metal_gridding_forward_2D | Forward (image → k-space) gridding for a 2D problem. |
| metal_gridding_forward_3D | Forward (image → k-space) gridding for a 3D problem. |
| PG_TUI_SPAWN_MODE | Query or set the forgeview spawn mode. |
| FORGE_LOG_INIT | Initialise the forge logger. |
| FORGE_LOG_REINIT_CLASSIC | Reinitialise the spdlog logger for classic (non-TUI) mode. |
| FORGE_TUI_START | Emit a "start" lifecycle event and optionally auto-spawn forgeview. |
| FORGE_TUI_EXIT | Emit an "exit" lifecycle event and clean up. |
| PG_PROGRESS | Access the singleton progress manager. |
| PG_PROGRESS_ADD | Register a new progress bar with label and max count. |
| FORGE_PROGRESS_TICK | Advance bar at idx, reporting current absolute count. |
| FORGE_PROGRESS_DONE | Mark bar at idx as completed (hidden in classic mode, "done" in TUI). |
| FORGE_METRICS | Emit convergence metrics for a progress bar (TUI mode only). |
| FORGE_SET_IMAGE_DIMS | Set the image dimensions for preview extraction. |
| FORGE_IMAGE_PREVIEW | Emit an image preview for display in forgeview. |
| stackOfSpiralsTimingVector | Generate timing vector for stack-of-spirals readout. |
| imageCoordinates3D | Generate 3D image-space coordinates matching pcSENSE Cube(Nx,Ny,Nz) layout. |
| randomShotPhase3D | Generate random per-shot linear phase ramps (models motion-induced phase). |
| fftshift | 2-D FFT-shift: shift both rows and columns by floor(size/2). Equivalent to calling fftshift(X, 0) then fftshift(X, 1). T1 : Armadillo array type. X : Input 2-D array. Return : 2-D shifted copy of X. |
| reconSolve | High-level iterative MRI image reconstruction via PCG. |
| memptr | Return a raw pointer to the underlying data buffer. |
| reset_mem | Release ownership of internal memory without freeing it (used by move semantics). |
| set_size | Allocate (or reallocate) storage for length elements, freeing any prior allocation. |
| zeros | Set all elements to zero (GPU-aware via OpenACC). |
| ones | Set all elements to one (GPU-aware via OpenACC). |
| subvec | Return a non-owning view of elements [first..last] (inclusive). The view must not outlive this forgeCol (or the underlying memory if this is itself a view). |
| pg_isnan_float | Bit-level NaN detection that works even under -ffast-math / -Ofast. |
| has_nan | Check if any element is NaN. |
| serialize | Boost.Serialization support for MPI communication. |
| sum | Sum all elements of a complex forgeCol (separate real/imag reductions for OpenACC). |
| sum | Sum all elements of a real forgeCol, dispatching to Accelerate vDSP on Apple. |
| cdot | Conjugate dot product of two complex vectors: Σᵢ conj(A[i]) · B[i]. Equivalent to the Hermitian inner product ⟨A, B⟩. Accelerate cblas_cdotc is dispatched on Apple platforms for float (faster than Metal for reductions). T : Floating-point base type. A : First complex vector (conjugated). B : Second complex vector, same length as A. Return : Scalar complex inner product. |
| norm | L2 norm of a complex vector: √(Σᵢ |
| norm | L2 norm of a real vector: √(Σᵢ A[i]²). T : Floating-point type (real only). A : Input real vector. Return : Non-negative scalar norm. |
| conj | Element-wise complex conjugate. |
| abs | Element-wise magnitude of a complex vector: out[i] = |
| real | Extract real parts of a complex vector. |
| imag | Extract imaginary parts of a complex vector. |
| accu_schur | Element-wise dot product without conjugation: Σᵢ A[i] · B[i]. Unlike cdot, A is not conjugated. Used in the norm_grad denominator to compute yᵀ · (W ⊙ y) (the non-Hermitian form). T : Floating-point base type. A : First complex vector (not conjugated). B : Second complex vector, same length as A. Return : Scalar complex dot product. |
| accu_schur | Element-wise dot product for real vectors: Σᵢ A[i] · B[i]. T : Floating-point type (real only; SFINAE-excluded for complex). A : First real vector. B : Second real vector, same length as A. Return : Scalar dot product. |
| norm_grad | Scale-invariant gradient magnitude for PCG convergence testing. |
| to_complex | Lift a real vector to complex with zero imaginary part. |
| log1p | Element-wise log(1 + x) for a real vector. |
| sqrt | Element-wise square root for a real vector. |
| square | Element-wise square of a real vector: out[i] = A[i]². T : Floating-point type (real only). A : Input real vector. Return : New vector of squared values. |
| accu | Sum all elements of a real vector (alias for sum). |
| abs | Element-wise absolute value of a real vector. |
| openISMRMRDData | Open an ISMRMRD dataset and initialise bookkeeping structures. |
| closeISMRMRDData | Close an ISMRMRD dataset and free associated resources. |
| convertFromNDArrayToArma | Convert an ISMRMRD NDArray to an Armadillo column vector (zero-copy view). |
| getISMRMRDFieldMap | Read the off-resonance field map (rad/s) from the "FieldMap" NDArray. |
| getISMRMRDSenseMap | Read the multi-coil SENSE sensitivity maps from the "SENSEMap" NDArray. |
| getISMRMRDPhaseMaps | Read the shot-by-shot phase maps from the "PhaseMaps" NDArray. |
| getISMRMRDTemporalBasis | Read the temporal basis vectors ("v" NDArray) for low-rank reconstruction. |
| getISMRMRDCompletePhaseMap | Extract the phase map sub-volume for a specific acquisition combination. |
| getISMRMRDCompleteSENSEMap | Extract the SENSE map sub-volume for a single slice from the full stacked maps. |
| getISMRMRDCompleteFieldMap | Extract the field map sub-volume for a single slice from the full stacked maps. |
| processISMRMRDInput | One-shot convenience: open dataset, read field map and SENSE map. |
| extractNiftiOrientation | Extract NIfTI orientation metadata from an ISMRMRD dataset. |
| getISMRMRDAcqData | Read a single acquisition's k-space data and trajectory from the dataset. |
| writeISMRMRDImageData | Write a reconstructed complex image back into the ISMRMRD dataset. |
| getCompleteISMRMRDAcqData | Read all acquisitions for a given (slice, rep, avg, echo, phase) combination. |
| metal_vector_get_context | Get or create the process-wide singleton context. |
| metal_rvec_cmul | Real weight * complex: C[i] = W[i] * X[i]. W has n floats; X and C have 2*n floats. |
| metal_cvec_axpy | Complex AXPY: C[i] = A[i] + (alphaRe + i*alphaIm) * B[i]. |
| metal_cvec_mul_scalar | Complex scalar multiply: C[i] = (alphaRe + i*alphaIm) * A[i]. |
| metal_cvec_cdot | Complex dot product: sum(conj(A[i]) * B[i]). |
| metal_cvec_norm2sq | L2 norm squared of complex vector: sum( |
| metal_vec_sum | Sum of real float vector. |
| metal_vecops_dispatch_count | Total number of Metal command buffer commits (vector ops only, not gridding). |
| metal_vecops_wait_seconds | Cumulative time in seconds spent in waitUntilCompleted (vector ops only). |
| metal_vecops_reset_stats | Reset both counters to zero. |
| ftCpu | Non-uniform forward DFT kernel (CPU / OpenACC). |
| iftCpu | Non-uniform adjoint DFT kernel (CPU / OpenACC). |
| brainPhantom3D | Generate a 3D multi-compartment brain-like phantom. |
| spiralTrajectory2D | Generate a 2D Archimedean spiral k-space trajectory. |
| imageCoordinates2D | Generate image-space coordinates matching forge convention. |
| linearTimingVector | Generate a linear timing vector for a single readout. |
| forge_signal_handler | SIGINT handler: sets g_should_stop to true. Signal-safe: only sets an atomic flag; no logging or I/O. |
| forge_install_signal_handlers | Install signal handlers for forge executables. |
| rotationMatrixToQuaternion | Convert a 3x3 rotation matrix to a NIfTI quaternion (a, b, c, d). |
| writeNiftiRealImage | Write a real-valued image volume to a NIfTI-1 (.nii) file. filename : Base filename (without .nii extension, which is appended automatically). imageData : Flat column vector of image voxels (column-major, NxNyNzNslabN_TR). Nx : Image size along readout. Ny : Image size along phase-encode. Nz : Image size along slice/partition (default 1 for 2-D). Nslab : Number of slabs (default 1). N_TR : Number of time frames (default 1). orient : Optional NiftiOrientation with voxel sizes and direction cosines. |
| writeNiftiMagPhsImage | Write a complex-valued image as two NIfTI files: magnitude (_mag) and phase (_phs). |
| reset_mem | Release ownership of internal memory without freeing it (used by move semantics). |
| set_size | Allocate (or reallocate) storage for nCols x nRows elements. |
| zeros | Set all elements to zero (GPU-aware via OpenACC). |
| ones | Set all elements to one (GPU-aware via OpenACC). |
| getArma | Const version — returns a non-owning zero-copy arma view (no OpenACC update). |
| getArmaComplex | Return a zero-copy arma::Mat view for forgeMat |
| col | Return a non-owning view into column colIndx. |
| col | Const version of column view. |
| col_copy | Deep-copy a column (returns an owning forgeCol). |
| set_col | Write a forgeCol into column colIndx (memcpy). |
| serialize | Boost.Serialization support for MPI communication. |
| sum | Sum a complex forgeMat along the given dimension (0=column-wise, 1=row-wise). |
| sum | Sum a real forgeMat along the given dimension (0=column-wise, 1=row-wise). |
| vectorise | Flatten a forgeMat into a single forgeCol in column-major order. |
| conj | Element-wise complex conjugate of a forgeMat. |
| matmul | Matrix-matrix multiply for complex forgeMat (conjugate-transpose NOT applied). |
| matmul | Matrix-vector multiply for complex forgeMat × forgeCol. |
| matmul | Matrix-matrix multiply for real forgeMat. |
| trans | Conjugate transpose of a complex forgeMat. |
| strans | Non-conjugate (simple) transpose of a complex forgeMat. |
| trans | Transpose of a real forgeMat. |
| reshape | Reshape a forgeMat to new dimensions (column-major order preserved). |
| cols | Extract a deep-copy column subrange [first, last] (inclusive). |
| ifft2dGPU | In-place 2-D inverse FFT on GPU (single precision). |
| fft2dGPU | In-place 2-D forward FFT on GPU (single precision). |
| ifft3dGPU | In-place 3-D inverse FFT on GPU (single precision). |
| fft3dGPU | In-place 3-D forward FFT on GPU (single precision). |
| fft1dGPU | In-place 1-D forward FFT on GPU (double precision). |
| ifft2dGPU | In-place 2-D inverse FFT on GPU (double precision). |
| fft2dGPU | In-place 2-D forward FFT on GPU (double precision). |
| ifft3dGPU | In-place 3-D inverse FFT on GPU (double precision). |
| fft3dGPU | In-place 3-D forward FFT on GPU (double precision). |
| dot_double | Element-wise dot product (without conjugation) of two complex vectors. |
| norm_grad | Normalized gradient magnitude for convergence testing. |
| solve_pwls_pcg | Solve a penalized weighted least-squares problem via preconditioned CG. |
| ftCpuWithGrads | Non-uniform forward DFT kernel with R2* field-map gradients (CPU / OpenACC). |
| iftCpuWithGrads | Non-uniform adjoint DFT kernel with R2* field-map gradients (CPU / OpenACC). |
Variable Details¶
PG_NO_PARENT_BAR¶
constexpr size_t PG_NO_PARENT_BAR
Sentinel value indicating no parent progress bar is active.
dest¶
static FILE* dest
Query or set the JSONL output destination.
Default is stderr. Set to stdout or a FILE* opened from a path. All JSONL-emitting code (spdlog sink + direct emitters) uses this.
Operator Details¶
operator%¶
template <typename T> forgeCol<forgeComplex<T>> operator%(const forgeCol<T>& W, const forgeCol<forgeComplex<T>>& X)
Real-weight element-wise multiply with a complex vector: out[i] = W[i] · X[i].
W is real (forgeCol<T>), X is complex (forgeCol<forgeComplex<T>>).
Dispatches to Accelerate vDSP rvec_cmul on Apple platforms for float.
T
: Floating-point base type.
W
: Real weight vector.
X
: Complex vector, same length as W.
Return : Complex vector of element-wise products.
operator/¶
template <typename T> forgeCol<T> operator/(const T& s, const forgeCol<T>& A)
Scalar divided by each element: out[i] = s / A[i].
T
: Element type.
s
: Scalar numerator.
A
: Denominator vector.
Return : New vector of quotients.
Function Details¶
FORGE_IMAGE_PREVIEW¶
inline void FORGE_IMAGE_PREVIEW(size_t bar_id, size_t iter, const std::vector<pg_detail::ImagePlane>& planes)
Emit an image preview for display in forgeview.
In TUI mode: base64-encodes each plane and emits JSONL with a "planes" array. In classic mode: no-op (images aren't displayable on a scrolling terminal).
bar_id
: Bar index from PG_PROGRESS_ADD.
iter
: Current iteration number (1-based).
planes
: Vector of image planes to send.
FORGE_LOG_INIT¶
inline void FORGE_LOG_INIT(const std::string& level_str = "info", const std::string& log_path = "forge.log", bool no_jsonl = false, const std::string& jsonl_output = "stderr")
Initialise the forge logger.
level_str
: Log level string ("debug", "info", "warn", "error").
log_path
: Path to the rotating log file (always human-readable).
no_jsonl
: If true, disable JSONL and use classic spdlog+indicators.
jsonl_output
: Where to send JSONL: "stderr" (default), "stdout", or a file path.
FORGE_LOG_REINIT_CLASSIC¶
inline void FORGE_LOG_REINIT_CLASSIC(const std::string& log_path = "forge.log")
Reinitialise the spdlog logger for classic (non-TUI) mode.
Called when forgeview fails to spawn, to swap the JSONL sink for a colored stderr sink so the user sees human-readable output instead of raw JSON.
FORGE_METRICS¶
inline void FORGE_METRICS(size_t bar_id, size_t iter, double error_norm, double penalty)
Emit convergence metrics for a progress bar (TUI mode only).
bar_id
: Bar index from PG_PROGRESS_ADD.
iter
: Current iteration number (1-based).
error_norm
: Data-fidelity error norm (||yi - Ax||).
penalty
: Roughness penalty value (R.Penalty(x)).
FORGE_PROGRESS_DONE¶
inline void FORGE_PROGRESS_DONE(size_t idx, const std::string& error = "")
Mark bar at `idx` as completed (hidden in classic mode, "done" in TUI).
idx
: Bar index from PG_PROGRESS_ADD.
error
: Optional error/cancellation message to include in the JSONL done event.
FORGE_PROGRESS_TICK¶
inline void FORGE_PROGRESS_TICK(size_t idx, size_t current, const std::string& postfix = "")
Advance bar at idx, reporting current absolute count.
idx
: Bar index from PG_PROGRESS_ADD.
current
: Absolute progress count (1-based, for ETA computation).
postfix
: Optional postfix text.
FORGE_SET_IMAGE_DIMS¶
inline void FORGE_SET_IMAGE_DIMS(size_t nx, size_t ny, size_t nz = 1)
Set the image dimensions for preview extraction.
Call this in the app before entering the solver loop.
FORGE_TUI_EXIT¶
inline void FORGE_TUI_EXIT(int code)
Emit an "exit" lifecycle event and clean up.
Case 1 — !PG_TUI_MODE(): classic info log only.
Cases 2 & 3 — PG_TUI_MODE(): emit JSONL exit to PG_JSONL_DEST(), then if forgeview is active (tui.active), wait for the user to quit the TUI. After exit, if the JSONL destination is a file (not stderr or stdout), shuts down spdlog and fcloses the file.
FORGE_TUI_START¶
inline void FORGE_TUI_START(const std::string& app_name, const std::string& version = FORGE_VERSION_STRING)
Emit a "start" lifecycle event and optionally auto-spawn forgeview.
Three cases depending on PG_TUI_MODE() and PG_TUI_SPAWN_MODE():
Case 1 — !PG_TUI_MODE(): classic info log only (no JSONL, no forgeview).
Case 2 — PG_TUI_MODE() && !PG_TUI_SPAWN_MODE(): emit JSONL start to PG_JSONL_DEST() without spawning forgeview (piped/file output mode).
Case 3 — PG_TUI_MODE() && PG_TUI_SPAWN_MODE() (default): try to locate and fork forgeview, redirect stderr to it, then emit JSONL start. Falls back to classic mode if forgeview is not found or stdout is not a terminal.
PG_PROGRESS¶
inline PGProgressManager& PG_PROGRESS()
Access the singleton progress manager.
PG_PROGRESS_ADD¶
inline size_t PG_PROGRESS_ADD( std::shared_ptr<PGProgressBar> bar, const std::string& label = "", size_t max_progress = 0)
Register a new progress bar with label and max count.
In TUI mode, emits a JSONL "add" message. In classic mode, adds to indicators. Returns an index for use with FORGE_PROGRESS_TICK / FORGE_PROGRESS_DONE.
PG_TUI_SPAWN_MODE¶
inline bool& PG_TUI_SPAWN_MODE(bool set_value = true, bool do_set = false)
Query or set the forgeview spawn mode.
Default is true (spawn enabled). Set to false via --no-tui to disable forgeview spawning while keeping JSONL output active.
abs¶
template <typename T> forgeCol<T> abs(const forgeCol<forgeComplex<T>>& A)
Element-wise magnitude of a complex vector: out[i] = |A[i]|.
T
: Floating-point base type.
A
: Input complex vector.
Return
: Real vector of magnitudes, same length as A.
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> forgeCol<T> abs(const forgeCol<T>& A)
Element-wise absolute value of a real vector.
T
: Floating-point type (real only).
A
: Input real vector.
Return : New vector with std::abs(A[i]) at each position.
accu¶
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> T accu(const forgeCol<T>& A)
Sum all elements of a real vector (alias for `sum`).
T
: Floating-point type (real only).
A
: Input real vector.
Return : Scalar sum of all elements.
accu_schur¶
template <typename T> forgeComplex<T> accu_schur(const forgeCol<forgeComplex<T>>& A, const forgeCol<forgeComplex<T>>& B)
Element-wise dot product without conjugation: Σᵢ A[i] · B[i].
Unlike cdot, A is not conjugated. Used in the norm_grad
denominator to compute yᵀ · (W ⊙ y) (the non-Hermitian form).
T
: Floating-point base type.
A
: First complex vector (not conjugated).
B
: Second complex vector, same length as A.
Return : Scalar complex dot product.
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> T accu_schur(const forgeCol<T>& A, const forgeCol<T>& B)
Element-wise dot product for real vectors: Σᵢ A[i] · B[i].
T
: Floating-point type (real only; SFINAE-excluded for complex).
A
: First real vector.
B
: Second real vector, same length as A.
Return : Scalar dot product.
brainPhantom3D¶
template <typename T1> arma::Col<std::complex<T1>> brainPhantom3D(arma::uword Nx, arma::uword Ny, arma::uword Nz)
Generate a 3D multi-compartment brain-like phantom.
Compartments (painted in order, later overwrites earlier): 1. Skull shell (0.20) 2. Gray matter (0.80) 3. White matter core (1.00) 4. Left lateral ventricle (0.30) 5. Right lateral ventricle (0.30)
Nx
: Image width
Ny
: Image height
Nz
: Number of slices
Return : Vectorized complex column (NxNyNz), matching pcSENSE Cube layout
calcSumOfSquaresImage¶
template <typename T> forgeCol<T> calcSumOfSquaresImage( forgeCol<forgeComplex<T>> coilImages, uword Nx, uword Nz, uword NSliceMax, uword NCoils)
Compute a sum-of-squares combination image from individual coil images.
T
: Floating-point precision type (float or double).
coilImages
: Stacked coil images, length NxNzNSliceMax*NCoils.
Nx
: In-plane image size (pixels per slice).
Nz
: Number of z-partitions.
NSliceMax
: Maximum number of slices.
NCoils
: Number of receiver coils.
Return : Sum-of-squares combined image, length NxNzNSliceMax.
calculateLUT¶
template <typename T1> void calculateLUT(T1 beta, T1 width, T1*& LUT, uword& sizeLUT)
Precompute a Kaiser-Bessel kernel lookup table.
Evaluates the KB kernel on a uniform grid of sizeLUT points over [0, width/2]
and stores the result. Caller takes ownership of the allocated LUT array.
T1
: Floating-point type.
beta
: Kaiser-Bessel shape parameter beta.
width
: Kernel half-width.
LUT
: Output: pointer to newly allocated LUT array.
sizeLUT
: Output: number of entries in the LUT.
cdot¶
template <typename T> forgeComplex<T> cdot(const forgeCol<forgeComplex<T>>& A, const forgeCol<forgeComplex<T>>& B)
Conjugate dot product of two complex vectors: Σᵢ conj(A[i]) · B[i].
Equivalent to the Hermitian inner product ⟨A, B⟩. Accelerate cblas_cdotc is dispatched on Apple platforms for float (faster than Metal for reductions).
T
: Floating-point base type.
A
: First complex vector (conjugated).
B
: Second complex vector, same length as A.
Return : Scalar complex inner product.
circshift2¶
template <typename T> void circshift2(T* __restrict pDst, const T* __restrict pSrc, int xdim, int ydim, int xshift, int yshift)
2-D circular shift by (xshift, yshift) elements.
closeISMRMRDData¶
inline void closeISMRMRDData(ISMRMRD::Dataset*& d, ISMRMRD::IsmrmrdHeader& hdr, acqTracking*& acqTrack)
Close an ISMRMRD dataset and free associated resources.
col¶
forgeCol<T> col(const uword colIndx)
Return a non-owning view into column colIndx.
The returned forgeCol wraps the matrix's memory directly — writes through
the view modify the matrix (e.g., mat.col(ii) %= d).
The view must not outlive the matrix.
forgeCol<T> col(const uword colIndx) const
Const version of column view.
col_copy¶
forgeCol<T> col_copy(const uword colIndx) const
Deep-copy a column (returns an owning forgeCol).
Use this when you need the column to outlive the matrix.
cols¶
template <typename T> forgeMat<T> cols(const forgeMat<T>& A, uword first, uword last)
Extract a deep-copy column subrange [first, last] (inclusive).
computeFH_CPU_Grid¶
template <typename T1> void computeFH_CPU_Grid(int numK_per_coil, const T1* __restrict kx, const T1* __restrict ky, const T1* __restrict kz, const T1* __restrict dIn, int Nx, int Ny, int Nz, T1 gridOS, const T1 kernelWidth, const T1 beta, const T1* LUT, const uword sizeLUT, void* stream, CFTHandle* plan, T1* pGridData_crop_deAp, T1* pGridData_crop_d, T1* pGridData, T1* pGridData_d)
Full adjoint NUFFT pipeline: k-space -> image (grid, IFFT, crop, deapodize).
T1
: Floating-point precision type.
numK_per_coil
: Number of k-space samples per coil.
kx
: k-space x-coordinates.
ky
: k-space y-coordinates.
kz
: k-space z-coordinates.
dIn
: Input k-space data (interleaved real/imag).
Nx
: Image size in x.
Ny
: Image size in y.
Nz
: Image size in z.
gridOS
: Grid oversampling factor.
kernelWidth
: Kaiser-Bessel kernel width.
beta
: Kaiser-Bessel kernel shape parameter beta.
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of LUT entries.
stream
: OpenACC/CUDA stream (NULL for CPU).
plan
: cuFFT plan pointer (NULL for CPU).
pGridData_crop_deAp
: Scratch: deapodized cropped image.
pGridData_crop_d
: Scratch: cropped grid data (GPU).
pGridData
: Scratch: full oversampled grid.
pGridData_d
: Scratch: GPU-side oversampled grid.
computeFd_CPU_Grid¶
template <typename T1> void computeFd_CPU_Grid(int numK_per_coil, const T1* __restrict kx, const T1* __restrict ky, const T1* __restrict kz, const T1* __restrict dIn, int Nx, int Ny, int Nz, T1 gridOS, const T1 kernelWidth, const T1 beta, const T1* LUT, const uword sizeLUT, void* stream, CFTHandle* plan, T1* pGridData, T1* pGridData_d, T1* pGridData_os, T1* pGridData_os_d, T1* pSamples)
Full forward NUFFT pipeline: image -> k-space (deapodize, zero-pad, FFT, grid).
T1
: Floating-point precision type.
numK_per_coil
: Number of k-space samples per coil.
kx
: k-space x-coordinates.
ky
: k-space y-coordinates.
kz
: k-space z-coordinates.
dIn
: Input image data (interleaved real/imag).
Nx
: Image size in x.
Ny
: Image size in y.
Nz
: Image size in z.
gridOS
: Grid oversampling factor.
kernelWidth
: Kaiser-Bessel kernel width.
beta
: Kaiser-Bessel kernel shape parameter beta.
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of LUT entries.
stream
: OpenACC/CUDA stream (NULL for CPU).
plan
: cuFFT plan pointer (NULL for CPU).
pGridData
: Scratch: full oversampled grid (CPU).
pGridData_d
: Scratch: GPU-side oversampled grid.
pGridData_os
: Scratch: zero-padded oversampled grid (CPU).
pGridData_os_d
: Scratch: GPU-side zero-padded grid.
pSamples
: Output k-space samples array (interleaved real/imag).
conj¶
template <typename T> forgeCol<forgeComplex<T>> conj(const forgeCol<forgeComplex<T>>& A)
Element-wise complex conjugate.
T
: Floating-point base type.
A
: Input complex vector.
Return : New vector with conj(A[i]) at each position.
template <typename T> forgeMat<forgeComplex<T>> conj(const forgeMat<forgeComplex<T>>& pgA)
Element-wise complex conjugate of a forgeMat.
convertFromNDArrayToArma¶
template <typename T1> arma::Col<T1> convertFromNDArrayToArma(ISMRMRD::NDArray<T1>& inArray)
Convert an ISMRMRD NDArray to an Armadillo column vector (zero-copy view).
crop_center_region2d¶
template <typename T1> void crop_center_region2d( T1* __restrict pDst, T1* __restrict pSrc, int imageSizeX, int imageSizeY, int gridSizeX, int gridSizeY)
Crop the centre region of a 2-D oversampled grid to the target image size.
crop_center_region3d¶
template <typename T1> void crop_center_region3d(T1* __restrict pDst, T1* __restrict pSrc, int imageSizeX, int imageSizeY, int imageSizeZ, int gridSizeX, int gridSizeY, int gridSizeZ)
Crop the centre region of a 3-D oversampled grid to the target image size.
deapodization2d¶
template <typename T1> void deapodization2d( T1* __restrict pDst, T1* __restrict pSrc, int imageX, int imageY, T1 kernelWidth, T1 beta, T1 gridOS)
Apply 2-D deapodization correction (divide out the KB convolution kernel in image space).
deapodization3d¶
template <typename T1> void deapodization3d( T1* __restrict pDst, T1* __restrict pSrc, int imageX, int imageY, int imageZ, T1 kernelWidth, T1 beta, T1 gridOS)
Apply 3-D deapodization correction (divide out the KB convolution kernel in image space).
deinterleave_data2d¶
template <typename T1> void deinterleave_data2d(T1* __restrict pSrc, T1* __restrict outR_d, T1* __restrict outI_d, int imageX, int imageY)
Deinterleave a 2-D interleaved complex buffer into separate real and imaginary arrays.
deinterleave_data3d¶
template <typename T1> void deinterleave_data3d( T1* __restrict pSrc, T1* __restrict outR_d, T1* __restrict outI_d, int imageX, int imageY, int imageZ)
Deinterleave a 3-D interleaved complex buffer into separate real and imaginary arrays.
dot_double¶
template <typename T1> inline complex<T1> dot_double(const Col<complex<T1>>& A, const Col<complex<T1>>& B)
Element-wise dot product (without conjugation) of two complex vectors.
Computes Σ_k A[k] · B[k] (no complex conjugate on A).
T1
: Floating-point base type.
A
: First complex vector.
B
: Second complex vector, same length as A.
Return : Scalar dot product.
extractNiftiOrientation¶
inline NiftiOrientation extractNiftiOrientation(const ISMRMRD::IsmrmrdHeader& hdr, ISMRMRD::Dataset* d, arma::uword Nx, arma::uword Ny, arma::uword Nz, acqTracking* acqTrack = nullptr, arma::uword sliceIdx = 0)
Extract NIfTI orientation metadata from an ISMRMRD dataset.
Reads voxel sizes from the reconstructed (or encoded) space FOV and image matrix dimensions, and reads direction cosines and position from the first relevant acquisition. Direction cosines are converted from ISMRMRD LPS convention to NIfTI RAS convention by negating rows 0 and 1.
hdr
: Parsed ISMRMRD header.
d
: Pointer to the open ISMRMRD dataset.
Nx
: Image matrix size along readout direction.
Ny
: Image matrix size along phase-encode direction.
Nz
: Image matrix size along slice/partition direction.
acqTrack
: Optional acquisition tracker; when provided, uses the
acquisition for the given sliceIdx.
sliceIdx
: Slice index used when acqTrack is non-null.
Return : A populated NiftiOrientation struct with valid == true on success.
fft1dCPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, uword>::type = 0> void fft1dCPU(T1* d_data, uword nx)
In-place 1-D forward FFT (double precision).
T1
: Must be double.
d_data
: Pointer to interleaved real/imag double array of length 2·nx.
nx
: Number of complex samples.
fft1dGPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, int>::type = 0> void fft1dGPU(T1* d_data, int nx, void* stream, cufftHandle* plan)
In-place 1-D forward FFT on GPU (double precision).
T1
: Must be double.
d_data
: GPU pointer to interleaved real/imag double array, length 2·nx.
nx
: Number of complex samples.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
fft2dAccelerate¶
void fft2dAccelerate(double* d_data, uword nx, uword ny)
In-place 2D forward FFT (double, interleaved) — delegates to FFTW.
fft2dCPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, uword>::type = 0> void fft2dCPU(T1* d_data, uword nx, uword ny)
In-place 2-D forward FFT (single precision).
T1
: Must be float.
d_data
: Pointer to interleaved real/imag float array of length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, uword>::type = 0> void fft2dCPU(T1* d_data, uword nx, uword ny)
In-place 2-D forward FFT (double precision).
T1
: Must be double.
d_data
: Pointer to interleaved real/imag double array of length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
fft2dGPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, int>::type = 0> void fft2dGPU(T1* d_data, int nx, int ny, void* stream, cufftHandle* plan)
In-place 2-D forward FFT on GPU (single precision).
T1
: Must be float.
d_data
: GPU pointer to interleaved real/imag float array, length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, int>::type = 0> void fft2dGPU(T1* d_data, int nx, int ny, void* stream, cufftHandle* plan)
In-place 2-D forward FFT on GPU (double precision).
T1
: Must be double.
d_data
: GPU pointer to interleaved real/imag double array, length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
fft2shift_grid¶
template <typename T1> void fft2shift_grid(std::complex<T1>* __restrict src, int dimY, int dimX)
In-place 2-D FFT shift on a complex grid (std::complex variant).
fft3dAccelerate¶
void fft3dAccelerate(float* d_data, uword nx, uword ny, uword nz)
In-place 3D forward FFT (float, interleaved).
void fft3dAccelerate(double* d_data, uword nx, uword ny, uword nz)
In-place 3D forward FFT (double, interleaved) — delegates to FFTW.
fft3dCPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, uword>::type = 0> void fft3dCPU(T1* d_data, uword nx, uword ny, uword nz)
In-place 3-D forward FFT (single precision).
T1
: Must be float.
d_data
: Pointer to interleaved real/imag float array of length 2·nx·ny·nz.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, uword>::type = 0> void fft3dCPU(T1* d_data, uword nx, uword ny, uword nz)
In-place 3-D forward FFT (double precision).
T1
: Must be double.
d_data
: Pointer to interleaved real/imag double array of length 2·nx·ny·nz.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
fft3dGPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, int>::type = 0> void fft3dGPU(T1* d_data, int nx, int ny, int nz, void* stream, cufftHandle* plan)
In-place 3-D forward FFT on GPU (single precision).
T1
: Must be float.
d_data
: GPU pointer to interleaved real/imag float array.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, int>::type = 0> void fft3dGPU(T1* d_data, int nx, int ny, int nz, void* stream, cufftHandle* plan)
In-place 3-D forward FFT on GPU (double precision).
T1
: Must be double.
d_data
: GPU pointer to interleaved real/imag double array.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
fft3shift_grid¶
template <typename T1> void fft3shift_grid(std::complex<T1>* __restrict src, int dimY, int dimX, int dimZ)
In-place 3-D FFT shift on a complex grid (std::complex variant).
fftshift¶
template <typename T1> arma_inline T1 fftshift(T1 X)
2-D FFT-shift: shift both rows and columns by floor(size/2).
Equivalent to calling fftshift(X, 0) then fftshift(X, 1).
T1
: Armadillo array type.
X
: Input 2-D array.
Return
: 2-D shifted copy of X.
fftshift2¶
template <typename T> void fftshift2(T* __restrict out, const T* __restrict in, int xdim, int ydim)
2-D forward FFT shift (swap quadrants so DC is at centre).
fftshift3¶
template <typename T> void fftshift3(T* __restrict out, const T* __restrict in, int xdim, int ydim, int zdim)
3-D forward FFT shift (swap octants so DC is at centre).
forge_install_signal_handlers¶
inline void forge_install_signal_handlers()
Install signal handlers for forge executables.
Installs forge_signal_handler for SIGINT and ensures SIGPIPE is ignored (SIGPIPE is already SIG_IGN'd by ForgeLog.hpp when TUI mode spawns forgeview, but this is safe to call independently).
forge_signal_handler¶
inline void forge_signal_handler(int /*sig*/)
SIGINT handler: sets g_should_stop to true.
Signal-safe: only sets an atomic flag; no logging or I/O.
ftCpu¶
template <typename T1> void ftCpu(T1* kdata_r, T1* kdata_i, const T1* idata_r, const T1* idata_i, const T1* kx, const T1* ky, const T1* kz, const T1* ix, const T1* iy, const T1* iz, const T1* FM, const T1* t, const int num_k, const int num_i)
Non-uniform forward DFT kernel (CPU / OpenACC).
Computes kdata[j] = Sigma_k (idata_r[k] + iidata_i[k]) * exp(-i2pi(kx[j]ix[k] + ky[j]iy[k] + kz[j]iz[k]) - iFM[k]*t[j]).
T1
: Floating-point precision type.
kdata_r
: Output k-space real part array, length num_k.
kdata_i
: Output k-space imaginary part array, length num_k.
idata_r
: Input image real part array, length num_i.
idata_i
: Input image imaginary part array, length num_i.
kx
: k-space x-coordinates, length num_k.
ky
: k-space y-coordinates, length num_k.
kz
: k-space z-coordinates, length num_k.
ix
: Image-space x-coordinates, length num_i.
iy
: Image-space y-coordinates, length num_i.
iz
: Image-space z-coordinates, length num_i.
FM
: Off-resonance field map (rad/s), length num_i.
t
: Per-sample readout time (s), length num_k.
num_k
: Number of k-space samples.
num_i
: Number of image pixels.
ftCpuWithGrads¶
template <typename T1> void ftCpuWithGrads(T1* kdata_r, T1* kdata_i, const T1* idata_r, const T1* idata_i, const T1* kx, const T1* ky, const T1* kz, const T1* ix, const T1* iy, const T1* iz, const T1* FM, const T1* Gx, const T1* Gy, const T1* Gz, const T1* t, const int num_k, const int num_i, const int num_x, const int num_y, const int num_z)
Non-uniform forward DFT kernel with R2* field-map gradients (CPU / OpenACC).
Extends ftCpu by incorporating spatially-varying R2 decay via gradient field maps Gx, Gy, Gz. Computes: kdata[j] = Sigma_k image[k] * exp(-i2pikx - iFM[k]t[j] - R2(k)t[j]) where R2(k) is interpolated from the gradient maps.
T1
: Floating-point precision type.
kdata_r
: Output k-space real part array, length num_k.
kdata_i
: Output k-space imaginary part array, length num_k.
idata_r
: Input image real part array, length num_i.
idata_i
: Input image imaginary part array, length num_i.
kx
: k-space x-coordinates, length num_k.
ky
: k-space y-coordinates, length num_k.
kz
: k-space z-coordinates, length num_k.
ix
: Image-space x-coordinates, length num_i.
iy
: Image-space y-coordinates, length num_i.
iz
: Image-space z-coordinates, length num_i.
FM
: Off-resonance field map (rad/s), length num_i.
Gx
: R2* gradient in x, length num_x.
Gy
: R2* gradient in y, length num_y.
Gz
: R2* gradient in z, length num_z.
t
: Per-sample readout time (s), length num_k.
num_k
: Number of k-space samples.
num_i
: Number of image pixels.
num_x
: Number of R2* gradient samples in x.
num_y
: Number of R2* gradient samples in y.
num_z
: Number of R2* gradient samples in z.
getArma¶
arma::Mat<T> getArma() const
Const version — returns a non-owning zero-copy arma view (no OpenACC update).
getArmaComplex¶
template <typename U = T, typename std::enable_if< std::is_same<U, forgeComplex<float>>::value || std::is_same<U, forgeComplex<double>>::value, int>::type = 0> arma::Mat<std::complex<typename U::value_type>> getArmaComplex() const
Return a zero-copy arma::Mat view for forgeMat<forgeComplex<T>> types.
Since forgeComplex<T> and std::complex<T> have identical memory layout,
this creates a non-owning arma view of the forgeMat's data.
getCompleteISMRMRDAcqData¶
template <typename T1> void getCompleteISMRMRDAcqData(ISMRMRD::Dataset* d, acqTracking* acqTrack, uword NSlice, uword NRep, uword NAve, uword NEcho, uword NPhase, forgeCol<forgeComplex<T1>>& data, forgeCol<T1>& kx, forgeCol<T1>& ky, forgeCol<T1>& kz, forgeCol<T1>& tvec)
Read all acquisitions for a given (slice, rep, avg, echo, phase) combination.
Iterates over all shots and partitions using the acqTracking lookup array,
collects k-space data and trajectory into flat column vectors ordered
[readout, coil, shot].
getISMRMRDAcqData¶
template <typename T1> void getISMRMRDAcqData(ISMRMRD::Dataset* d, uword Nacq, forgeCol<forgeComplex<T1>>& data, forgeCol<T1>& kx, forgeCol<T1>& ky, forgeCol<T1>& kz, forgeCol<T1>& tvec)
Read a single acquisition's k-space data and trajectory from the dataset.
getISMRMRDCompleteFieldMap¶
template <typename T1> forgeCol<T1> getISMRMRDCompleteFieldMap( ISMRMRD::Dataset* d, const forgeCol<T1>& FieldMaps, uword NSlice, uword imageSize)
Extract the field map sub-volume for a single slice from the full stacked maps.
getISMRMRDCompletePhaseMap¶
template <typename T1> forgeCol<T1> getISMRMRDCompletePhaseMap(ISMRMRD::Dataset* d, uword NSlice, uword NSet, uword NRep, uword NAvg, uword NPhase, uword NEcho, uword NSeg, uword imageSize)
Extract the phase map sub-volume for a specific acquisition combination.
Slices the full phase-map array by (slice, set, rep, avg, phase, echo, seg)
indices to return a single phase-map volume of size imageSize * NShotMax * NParMax.
getISMRMRDCompleteSENSEMap¶
template <typename T1> forgeCol<forgeComplex<T1>> getISMRMRDCompleteSENSEMap( ISMRMRD::Dataset* d, const forgeCol<forgeComplex<T1>>& SENSEMaps, uword NSlice, uword imageSize)
Extract the SENSE map sub-volume for a single slice from the full stacked maps.
getISMRMRDFieldMap¶
template <typename T1> forgeCol<T1> getISMRMRDFieldMap(ISMRMRD::Dataset* d)
Read the off-resonance field map (rad/s) from the "FieldMap" NDArray.
getISMRMRDPhaseMaps¶
template <typename T1> forgeCol<T1> getISMRMRDPhaseMaps(ISMRMRD::Dataset* d)
Read the shot-by-shot phase maps from the "PhaseMaps" NDArray.
getISMRMRDSenseMap¶
template <typename T1> forgeCol<forgeComplex<T1>> getISMRMRDSenseMap(ISMRMRD::Dataset* d)
Read the multi-coil SENSE sensitivity maps from the "SENSEMap" NDArray.
getISMRMRDTemporalBasis¶
template <typename T1> forgeCol<forgeComplex<T1>> getISMRMRDTemporalBasis(ISMRMRD::Dataset* d)
Read the temporal basis vectors ("v" NDArray) for low-rank reconstruction.
gridCoilImages¶
template <typename T> forgeCol<forgeComplex<T>> gridCoilImages(uword Ninplane, uword Nz, ISMRMRD::Dataset* d, ISMRMRD::IsmrmrdHeader* hdr, acqTracking* acqTrack, uword NPhase, uword NEcho, uword NAvg, uword NRep)
Grid all coil images from an ISMRMRD dataset.
Reads raw k-space data from the ISMRMRD dataset, applies density compensation, and grids each coil image using the NUFFT adjoint.
T
: Floating-point precision type (float or double).
Ninplane
: Number of in-plane readout samples.
Nz
: Number of z-partitions / slices.
d
: Pointer to the open ISMRMRD dataset.
hdr
: Pointer to the parsed ISMRMRD header.
acqTrack
: Pointer to acquisition tracking object.
NPhase
: Phase index to reconstruct.
NEcho
: Echo index to reconstruct.
NAvg
: Average index to reconstruct.
NRep
: Repetition index to reconstruct.
Return : Stacked coil images as a flat complex column vector.
gridding_adjoint_2D¶
template <typename T1> int gridding_adjoint_2D(unsigned int n, parameters<T1> params, T1 beta, ReconstructionSample<T1>* __restrict sample, const T1* LUT, const uword sizeLUT, T1* __restrict gridData)
2-D Kaiser-Bessel adjoint gridding (k-space -> oversampled grid).
T1
: Floating-point precision type.
n
: Number of k-space samples.
params
: Gridding parameters (image/grid size, oversampling, kernel).
beta
: Kaiser-Bessel kernel shape parameter beta.
sample
: Array of ReconstructionSample structs (k-space value + coordinates).
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of entries in the LUT.
gridData
: Output oversampled grid (interleaved real/imag), length 2gridXgridY.
Return : 0 on success.
gridding_adjoint_3D¶
template <typename T1> int gridding_adjoint_3D(unsigned int n, parameters<T1> params, T1 beta, ReconstructionSample<T1>* __restrict sample, const T1* LUT, const uword sizeLUT, T1* gridData)
3-D Kaiser-Bessel adjoint gridding (k-space -> oversampled grid).
T1
: Floating-point precision type.
n
: Number of k-space samples.
params
: Gridding parameters (image/grid size, oversampling, kernel).
beta
: Kaiser-Bessel kernel shape parameter beta.
sample
: Array of ReconstructionSample structs.
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of entries in the LUT.
gridData
: Output oversampled grid (interleaved real/imag), length 2gridXgridY*gridZ.
Return : 0 on success.
gridding_forward_2D¶
template <typename T1> int gridding_forward_2D(unsigned int n, parameters<T1> params, const T1* kx, const T1* ky, T1 beta, T1* __restrict pSamples, const T1* LUT, const uword sizeLUT, T1* __restrict pGridData)
2-D Kaiser-Bessel forward gridding (oversampled grid -> k-space samples).
T1
: Floating-point precision type.
n
: Number of k-space samples.
params
: Gridding parameters.
kx
: k-space x-coordinates, length n.
ky
: k-space y-coordinates, length n.
beta
: Kaiser-Bessel kernel shape parameter beta.
pSamples
: Output k-space samples array, length 2*n (interleaved).
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of entries in the LUT.
pGridData
: Input oversampled grid (interleaved real/imag).
Return : 0 on success.
gridding_forward_3D¶
template <typename T1> int gridding_forward_3D(unsigned int n, parameters<T1> params, const T1* kx, const T1* ky, const T1* kz, T1 beta, T1* __restrict pSamples, const T1* LUT, const uword sizeLUT, T1* __restrict pGridData)
3-D Kaiser-Bessel forward gridding (oversampled grid -> k-space samples).
T1
: Floating-point precision type.
n
: Number of k-space samples.
params
: Gridding parameters.
kx
: k-space x-coordinates, length n.
ky
: k-space y-coordinates, length n.
kz
: k-space z-coordinates, length n.
beta
: Kaiser-Bessel kernel shape parameter beta.
pSamples
: Output k-space samples array, length 2*n (interleaved).
LUT
: Precomputed Kaiser-Bessel lookup table.
sizeLUT
: Number of entries in the LUT.
pGridData
: Input oversampled grid (interleaved real/imag).
Return : 0 on success.
has_nan¶
template <typename U = T, typename std::enable_if<std::is_same<U, float>::value, int>::type = 0> bool has_nan() const
Check if any element is NaN. Works for float, double, and forgeComplex types.
ifft2dAccelerate¶
void ifft2dAccelerate(float* d_data, uword nx, uword ny)
In-place 2D inverse FFT (float, interleaved). Normalises by 1/(nx*ny).
void ifft2dAccelerate(double* d_data, uword nx, uword ny)
In-place 2D inverse FFT (double, interleaved) — delegates to FFTW.
ifft2dCPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, uword>::type = 0> void ifft2dCPU(T1* d_data, uword nx, uword ny)
In-place 2-D inverse FFT (single precision).
T1
: Must be float.
d_data
: Pointer to interleaved real/imag float array of length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, uword>::type = 0> void ifft2dCPU(T1* d_data, uword nx, uword ny)
In-place 2-D inverse FFT (double precision).
T1
: Must be double.
d_data
: Pointer to interleaved real/imag double array of length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
ifft2dGPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, int>::type = 0> void ifft2dGPU(T1* d_data, int nx, int ny, void* stream, cufftHandle* plan)
In-place 2-D inverse FFT on GPU (single precision).
T1
: Must be float.
d_data
: GPU pointer to interleaved real/imag float array, length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, int>::type = 0> void ifft2dGPU(T1* d_data, int nx, int ny, void* stream, cufftHandle* plan)
In-place 2-D inverse FFT on GPU (double precision).
T1
: Must be double.
d_data
: GPU pointer to interleaved real/imag double array, length 2·nx·ny.
nx
: Transform size in x.
ny
: Transform size in y.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
ifft3dAccelerate¶
void ifft3dAccelerate(float* d_data, uword nx, uword ny, uword nz)
In-place 3D inverse FFT (float, interleaved). Normalises by 1/(nx*ny*nz).
void ifft3dAccelerate(double* d_data, uword nx, uword ny, uword nz)
In-place 3D inverse FFT (double, interleaved) — delegates to FFTW.
ifft3dCPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, uword>::type = 0> void ifft3dCPU(T1* d_data, uword nx, uword ny, uword nz)
In-place 3-D inverse FFT (single precision).
T1
: Must be float.
d_data
: Pointer to interleaved real/imag float array of length 2·nx·ny·nz.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, uword>::type = 0> void ifft3dCPU(T1* d_data, uword nx, uword ny, uword nz)
In-place 3-D inverse FFT (double precision).
T1
: Must be double.
d_data
: Pointer to interleaved real/imag double array of length 2·nx·ny·nz.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
ifft3dGPU¶
template <typename T1, typename std::enable_if<std::is_same<T1, float>::value, int>::type = 0> void ifft3dGPU(T1* d_data, int nx, int ny, int nz, void* stream, cufftHandle* plan)
In-place 3-D inverse FFT on GPU (single precision).
T1
: Must be float.
d_data
: GPU pointer to interleaved real/imag float array.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
template <typename T1, typename std::enable_if<std::is_same<T1, double>::value, int>::type = 0> void ifft3dGPU(T1* d_data, int nx, int ny, int nz, void* stream, cufftHandle* plan)
In-place 3-D inverse FFT on GPU (double precision).
T1
: Must be double.
d_data
: GPU pointer to interleaved real/imag double array.
nx
: Transform size in x.
ny
: Transform size in y.
nz
: Transform size in z.
stream
: OpenACC/CUDA stream pointer.
plan
: Pointer to a pre-allocated cuFFT plan.
ifftshift2¶
template <typename T> void ifftshift2(T* __restrict out, const T* __restrict in, int xdim, int ydim)
2-D inverse FFT shift (undo fftshift2).
ifftshift3¶
template <typename T> void ifftshift3(T* __restrict out, const T* __restrict in, int xdim, int ydim, int zdim)
3-D inverse FFT shift (undo fftshift3).
iftCpu¶
template <typename T1> void iftCpu(T1* idata_r, T1* idata_i, const T1* kdata_r, const T1* kdata_i, const T1* kx, const T1* ky, const T1* kz, const T1* ix, const T1* iy, const T1* iz, const T1* FM, const T1* t, const int num_k, const int num_i)
Non-uniform adjoint DFT kernel (CPU / OpenACC).
Computes idata[k] = Sigma_j (kdata_r[j] + ikdata_i[j]) * exp(+i2pi(kx[j]ix[k] + ky[j]iy[k] + kz[j]iz[k]) + iFM[k]*t[j]).
T1
: Floating-point precision type.
idata_r
: Output image real part array, length num_i.
idata_i
: Output image imaginary part array, length num_i.
kdata_r
: Input k-space real part array, length num_k.
kdata_i
: Input k-space imaginary part array, length num_k.
kx
: k-space x-coordinates, length num_k.
ky
: k-space y-coordinates, length num_k.
kz
: k-space z-coordinates, length num_k.
ix
: Image-space x-coordinates, length num_i.
iy
: Image-space y-coordinates, length num_i.
iz
: Image-space z-coordinates, length num_i.
FM
: Off-resonance field map (rad/s), length num_i.
t
: Per-sample readout time (s), length num_k.
num_k
: Number of k-space samples.
num_i
: Number of image pixels.
iftCpuWithGrads¶
template <typename T1> void iftCpuWithGrads(T1* idata_r, T1* idata_i, const T1* kdata_r, const T1* kdata_i, const T1* kx, const T1* ky, const T1* kz, const T1* ix, const T1* iy, const T1* iz, const T1* FM, const T1* Gx, const T1* Gy, const T1* Gz, const T1* t, const int num_k, const int num_i, const int num_x, const int num_y, const int num_z)
Non-uniform adjoint DFT kernel with R2* field-map gradients (CPU / OpenACC).
Adjoint of ftCpuWithGrads; applies conjugate phase and R2* weighting.
T1
: Floating-point precision type.
idata_r
: Output image real part array, length num_i.
idata_i
: Output image imaginary part array, length num_i.
kdata_r
: Input k-space real part array, length num_k.
kdata_i
: Input k-space imaginary part array, length num_k.
kx
: k-space x-coordinates, length num_k.
ky
: k-space y-coordinates, length num_k.
kz
: k-space z-coordinates, length num_k.
ix
: Image-space x-coordinates, length num_i.
iy
: Image-space y-coordinates, length num_i.
iz
: Image-space z-coordinates, length num_i.
FM
: Off-resonance field map (rad/s), length num_i.
Gx
: R2* gradient in x, length num_x.
Gy
: R2* gradient in y, length num_y.
Gz
: R2* gradient in z, length num_z.
t
: Per-sample readout time (s), length num_k.
num_k
: Number of k-space samples.
num_i
: Number of image pixels.
num_x
: Number of R2* gradient samples in x.
num_y
: Number of R2* gradient samples in y.
num_z
: Number of R2* gradient samples in z.
imag¶
template <typename T> forgeCol<T> imag(const forgeCol<forgeComplex<T>>& A)
Extract imaginary parts of a complex vector.
T
: Floating-point base type.
A
: Input complex vector.
Return
: Real vector of imaginary parts, same length as A.
imageCoordinates2D¶
template <typename T1> void imageCoordinates2D(arma::uword Nx, arma::uword Ny, arma::Col<T1>& ix, arma::Col<T1>& iy, arma::Col<T1>& iz)
Generate image-space coordinates matching forge convention.
For a 2D image of size Nx x Ny, returns ix, iy, iz vectors where: ix(j,i,k) = (i - Nx/2) / Nx iy(j,i,k) = (j - Ny/2) / Ny iz = 0
Nx
: Image width
Ny
: Image height
ix
: x image coordinates (Nx*Ny elements)
iy
: y image coordinates
iz
: z image coordinates (all zeros)
imageCoordinates3D¶
template <typename T1> void imageCoordinates3D( arma::uword Nx, arma::uword Ny, arma::uword Nz, arma::Col<T1>& ix, arma::Col<T1>& iy, arma::Col<T1>& iz)
Generate 3D image-space coordinates matching pcSENSE Cube(Nx,Ny,Nz) layout.
Nx,
: Ny, Nz Image dimensions
ix,
: iy, iz Normalized coordinates in [-0.5, 0.5)
kernel_value_LUT¶
template <typename T1> T1 kernel_value_LUT(T1 dist, const T1* LUT, uword sizeLUT, T1 width)
Evaluate the Kaiser-Bessel kernel from a precomputed lookup table.
Linearly interpolates between LUT entries.
T1
: Floating-point type.
dist
: Distance from kernel centre (must be < width/2).
LUT
: Precomputed LUT array.
sizeLUT
: Number of LUT entries.
width
: Kernel half-width (same units as dist).
Return
: Interpolated kernel value at dist.
linearTimingVector¶
template <typename T1> arma::Col<T1> linearTimingVector(arma::uword nSamples, T1 readoutTime)
Generate a linear timing vector for a single readout.
nSamples
: Number of samples
readoutTime
: Total readout duration in seconds (e.g. 0.01 for 10ms)
Return : Timing vector from 0 to readoutTime
log1p¶
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> forgeCol<T> log1p(const forgeCol<T>& A)
Element-wise log(1 + x) for a real vector.
T
: Floating-point type (real only).
A
: Input real vector.
Return : New vector with std::log1p(A[i]) at each position.
matmul¶
template <typename T> forgeMat<forgeComplex<T>> matmul(const forgeMat<forgeComplex<T>>& A, const forgeMat<forgeComplex<T>>& B)
Matrix-matrix multiply for complex forgeMat (conjugate-transpose NOT applied).
template <typename T> forgeCol<forgeComplex<T>> matmul(const forgeMat<forgeComplex<T>>& A, const forgeCol<forgeComplex<T>>& x)
Matrix-vector multiply for complex forgeMat × forgeCol.
template <typename T> forgeMat<T> matmul(const forgeMat<T>& A, const forgeMat<T>& B)
Matrix-matrix multiply for real forgeMat.
memptr¶
T* memptr() const
Return a raw pointer to the underlying data buffer.
metal_cvec_axpy¶
void metal_cvec_axpy( MetalVectorContext* ctx, const float* A, const float* B, float* C, float alphaRe, float alphaIm, size_t n)
Complex AXPY: C[i] = A[i] + (alphaRe + i*alphaIm) * B[i].
metal_cvec_cdot¶
void metal_cvec_cdot(MetalVectorContext* ctx, const float* A, const float* B, float* outRe, float* outIm, size_t n)
Complex dot product: sum(conj(A[i]) * B[i]).
metal_cvec_mul_scalar¶
void metal_cvec_mul_scalar(MetalVectorContext* ctx, const float* A, float alphaRe, float alphaIm, float* C, size_t n)
Complex scalar multiply: C[i] = (alphaRe + i*alphaIm) * A[i].
metal_cvec_norm2sq¶
float metal_cvec_norm2sq(MetalVectorContext* ctx, const float* A, size_t n)
L2 norm squared of complex vector: sum(|A[i]|^2).
metal_dft_adjoint¶
void metal_dft_adjoint( MetalDFTContext* ctx, const float* kdata_r, const float* kdata_i, float* idata_r, float* idata_i)
Adjoint DFT: k-space → image.
ctx
: Active Metal DFT context.
kdata_r
: Input k-space real part (num_k floats).
kdata_i
: Input k-space imag part (num_k floats).
idata_r
: Output image real part (num_i floats).
idata_i
: Output image imag part (num_i floats).
metal_dft_create¶
MetalDFTContext* metal_dft_create(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, unsigned int num_k, unsigned int num_i)
Create a Metal DFT context for Gdft (no gradient correction).
Uploads k-space trajectory, image coordinates, field map, and timing vector to persistent Metal buffers.
kx,ky,kz
: k-space coordinates (num_k elements each)
ix,iy,iz
: image-space coordinates (num_i elements each)
FM
: field map in radians/sec (num_i elements)
t
: timing vector in seconds (num_k elements)
num_k
: number of k-space samples
num_i
: number of image pixels
Return : Pointer to new context, or nullptr on failure.
metal_dft_create_with_grads¶
MetalDFTContext* metal_dft_create_with_grads(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, const float* Gx, const float* Gy, const float* Gz, unsigned int num_k, unsigned int num_i, unsigned int num_x, unsigned int num_y, unsigned int num_z)
Create a Metal DFT context for GdftR2 (with gradient correction).
Same as metal_dft_create() but also uploads gradient maps and grid dims.
metal_dft_destroy¶
void metal_dft_destroy(MetalDFTContext* ctx)
Destroy a Metal DFT context and release resources.
metal_dft_forward¶
void metal_dft_forward( MetalDFTContext* ctx, const float* idata_r, const float* idata_i, float* kdata_r, float* kdata_i)
Forward DFT: image → k-space.
ctx
: Active Metal DFT context.
idata_r
: Input image real part (num_i floats).
idata_i
: Input image imag part (num_i floats).
kdata_r
: Output k-space real part (num_k floats).
kdata_i
: Output k-space imag part (num_k floats).
metal_gridding_adjoint_2D¶
void metal_gridding_adjoint_2D(MetalGriddingContext* ctx, const float* dIn, float* pGridOut)
Adjoint (k-space → image) gridding for a 2D problem.
Zeroes the grid buffer, runs the adjoint scatter kernel on the GPU,
and leaves the result in pGridOut (interleaved real/imag floats,
gridNx * gridNy complex values).
ctx
: Active Metal context.
dIn
: Input k-space data, interleaved real/imag (2*numSamples floats).
pGridOut
: Output oversampled grid buffer (2 * gridNx * gridNy floats).
metal_gridding_adjoint_3D¶
void metal_gridding_adjoint_3D(MetalGriddingContext* ctx, const float* dIn, float* pGridOut)
Adjoint (k-space → image) gridding for a 3D problem.
ctx
: Active Metal context.
dIn
: Input k-space data, interleaved real/imag (2*numSamples floats).
pGridOut
: Output oversampled grid buffer (2 * gridNx * gridNy * gridNz floats).
metal_gridding_create¶
MetalGriddingContext* metal_gridding_create(int gridNx, int gridNy, int gridNz, int imageNx, int imageNy, int imageNz, float gridOS, float kernelWidth, const float* LUT, int sizeLUT, const float* kx, const float* ky, const float* kz, int numSamples)
Allocate and initialise a Metal gridding context.
Upload the Kaiser-Bessel LUT and k-space trajectory to Metal buffers once here so they need not be transferred on every operator call.
gridNx
: Oversampled grid size in X (= ceil(gridOS * Nx))
gridNy
: Oversampled grid size in Y (= ceil(gridOS * Ny))
gridNz
: Oversampled grid size in Z (1 for 2D)
imageNx
: Image size in X
imageNy
: Image size in Y
imageNz
: Image size in Z (1 for 2D)
gridOS
: Grid oversampling factor
kernelWidth
: Kaiser-Bessel kernel width in grid cells
LUT
: Kaiser-Bessel lookup table (host pointer)
sizeLUT
: Number of entries in LUT
kx
: k-space coordinates in X (host pointer, numSamples elements)
ky
: k-space coordinates in Y (host pointer, numSamples elements)
kz
: k-space coordinates in Z (host pointer, numSamples elements; may be all-zero for 2D)
numSamples
: Number of k-space samples per coil
Return : Pointer to a new context, or nullptr if Metal is unavailable (e.g. device does not support MTLGPUFamilyApple6).
metal_gridding_destroy¶
void metal_gridding_destroy(MetalGriddingContext* ctx)
Destroy a Metal gridding context and release all Metal resources.
ctx
: Context previously created by metal_gridding_create().
metal_gridding_forward_2D¶
void metal_gridding_forward_2D(MetalGriddingContext* ctx, const float* pGridIn, float* pSamplesOut)
Forward (image → k-space) gridding for a 2D problem.
Reads from the oversampled grid pGridIn and accumulates into
pSamplesOut (one complex value per k-space sample).
ctx
: Active Metal context.
pGridIn
: Input oversampled grid (2 * gridNx * gridNy floats).
pSamplesOut
: Output k-space samples, interleaved real/imag (2*numSamples floats).
metal_gridding_forward_3D¶
void metal_gridding_forward_3D(MetalGriddingContext* ctx, const float* pGridIn, float* pSamplesOut)
Forward (image → k-space) gridding for a 3D problem.
ctx
: Active Metal context.
pGridIn
: Input oversampled grid (2 * gridNx * gridNy * gridNz floats).
pSamplesOut
: Output k-space samples, interleaved real/imag (2*numSamples floats).
metal_nufft_adjoint¶
void metal_nufft_adjoint(MetalNufftPipelineContext* ctx, const float* samplesIn, float* imageOut)
Full adjoint NUFFT: k-space → image (all on GPU).
ctx
: Active pipeline context.
samplesIn
: Input k-space, interleaved complex (2 * numSamples floats).
imageOut
: Output image, interleaved complex (2 * NxNyNz floats).
metal_nufft_forward¶
void metal_nufft_forward(MetalNufftPipelineContext* ctx, const float* imageIn, float* samplesOut)
Full forward NUFFT: image → k-space (all on GPU).
ctx
: Active pipeline context.
imageIn
: Input image, interleaved complex (2 * NxNyNz floats).
samplesOut
: Output k-space, interleaved complex (2 * numSamples floats).
metal_nufft_pipeline_create¶
MetalNufftPipelineContext* metal_nufft_pipeline_create(int gridNx, int gridNy, int gridNz, int imageNx, int imageNy, int imageNz, float gridOS, float kernelWidth, float beta, const float* LUT, int sizeLUT, const float* kx, const float* ky, const float* kz, int numSamples)
Create a full GPU NUFFT pipeline context.
Uploads the Kaiser-Bessel LUT and k-space trajectory to Metal shared buffers, allocates working buffers for all intermediate stages, and pre-compiles MPSGraph FFT executables for the oversampled grid dimensions.
gridNx
: Oversampled grid size in X (= ceil(gridOS * Nx), made even)
gridNy
: Oversampled grid size in Y
gridNz
: Oversampled grid size in Z (1 for 2D)
imageNx
: Image size in X
imageNy
: Image size in Y
imageNz
: Image size in Z (1 for 2D)
gridOS
: Grid oversampling factor
kernelWidth
: Kaiser-Bessel kernel width in grid cells
beta
: Kaiser-Bessel beta parameter
LUT
: Kaiser-Bessel lookup table (host pointer)
sizeLUT
: Number of entries in LUT
kx
: k-space X coordinates (host, numSamples elements)
ky
: k-space Y coordinates (host, numSamples elements)
kz
: k-space Z coordinates (host, numSamples elements)
numSamples
: Number of k-space samples per coil
Return : Pointer to a new context, or nullptr on failure.
metal_nufft_pipeline_destroy¶
void metal_nufft_pipeline_destroy(MetalNufftPipelineContext* ctx)
Destroy a pipeline context and release all GPU resources.
metal_rvec_cmul¶
void metal_rvec_cmul(MetalVectorContext* ctx, const float* W, const float* X, float* C, size_t n)
Real weight * complex: C[i] = W[i] * X[i]. W has n floats; X and C have 2*n floats.
metal_vec_sum¶
float metal_vec_sum(MetalVectorContext* ctx, const float* A, size_t n)
Sum of real float vector.
metal_vecops_dispatch_count¶
uint64_t metal_vecops_dispatch_count()
Total number of Metal command buffer commits (vector ops only, not gridding).
metal_vecops_reset_stats¶
void metal_vecops_reset_stats()
Reset both counters to zero.
metal_vecops_wait_seconds¶
double metal_vecops_wait_seconds()
Cumulative time in seconds spent in waitUntilCompleted (vector ops only).
metal_vector_get_context¶
MetalVectorContext* metal_vector_get_context()
Get or create the process-wide singleton context. Thread-safe (uses std::call_once internally). Returns nullptr if no Apple GPU is available.
norm¶
template <typename T> T norm(const forgeCol<forgeComplex<T>>& A)
L2 norm of a complex vector: √(Σᵢ |A[i]|²).
T
: Floating-point base type.
A
: Input complex vector.
Return : Non-negative scalar norm.
template <typename T> T norm(const forgeCol<T>& A)
L2 norm of a real vector: √(Σᵢ A[i]²).
T
: Floating-point type (real only).
A
: Input real vector.
Return : Non-negative scalar norm.
norm_grad¶
template <typename T> T norm_grad(const forgeCol<forgeComplex<T>>& g, const forgeCol<forgeComplex<T>>& yi, const forgeCol<T>& W)
Scale-invariant gradient magnitude for PCG convergence testing.
Returns ‖g‖ / |Re(yᵀ · (W ⊙ y))|, where yᵀ is the non-conjugate transpose. This normalisation makes the convergence criterion independent of the data scale.
T
: Floating-point base type.
g
: Current gradient vector.
yi
: Measured k-space data vector.
W
: Non-negative data weight vector, same length as yi.
Return : Normalised gradient magnitude (returns 0 if denominator is ~0).
template <typename T1> inline T1 norm_grad(const Col<complex<T1>>& g, const Col<complex<T1>>& yi, const Col<T1>& W)
Normalized gradient magnitude for convergence testing.
Returns \(\|\mathbf{g}\| / (\mathbf{y}_i^H \mathbf{W} \mathbf{y}_i)\), providing a scale-invariant stopping criterion.
T1
: Floating-point base type.
g
: Current gradient vector.
yi
: Measured k-space data vector.
W
: Non-negative data weights vector, same length as yi.
Return : Normalized gradient magnitude.
normalize_fft2d¶
template <typename T1> void normalize_fft2d(T1* __restrict pDst, T1* __restrict pSrc, int gridSizeX, int gridSizeY)
Normalise a 2-D interleaved buffer by $1/(n_x \times n_y)$ after an inverse FFT.
normalize_fft3d¶
template <typename T1> void normalize_fft3d(T1* __restrict pDst, T1* __restrict pSrc, int gridSizeX, int gridSizeY, int gridSizeZ)
Normalise a 3-D interleaved buffer by $1/(n_x \times n_y \times n_z)$ after an inverse FFT.
ones¶
void ones()
Set all elements to one (GPU-aware via OpenACC).
void ones()
Set all elements to one (GPU-aware via OpenACC).
openISMRMRDData¶
inline void openISMRMRDData( std::string inputDataFile, ISMRMRD::Dataset*& d, ISMRMRD::IsmrmrdHeader& hdr, acqTracking*& acqTrack)
Open an ISMRMRD dataset and initialise bookkeeping structures.
Allocates and opens the ISMRMRD::Dataset object, reads and deserialises the XML header, and constructs an acqTracking object for the file.
inputDataFile
: Path to the ISMRMRD HDF5 file.
d
: Output: pointer to the newly opened dataset.
hdr
: Output: parsed ISMRMRD header.
acqTrack
: Output: pointer to the newly created acqTracking object.
pg_isnan_float¶
static inline bool pg_isnan_float(float x)
Bit-level NaN detection that works even under -ffast-math / -Ofast.
IEEE 754: NaN has all exponent bits set and non-zero mantissa.
processISMRMRDInput¶
template <typename T1> void processISMRMRDInput(std::string inputDataFile, ISMRMRD::Dataset*& d, ISMRMRD::IsmrmrdHeader& hdr, forgeCol<T1>& FM, forgeCol<forgeComplex<T1>>& sen, acqTracking*& acqTrack)
One-shot convenience: open dataset, read field map and SENSE map.
randomShotPhase3D¶
template <typename T1> arma::Col<T1> randomShotPhase3D( arma::uword Nx, arma::uword Ny, arma::uword Nz, arma::uword Ns, T1 maxPhase, unsigned int seed = 42)
Generate random per-shot linear phase ramps (models motion-induced phase).
Each shot gets a random linear ramp: PMap(x,y,z,ss) = gxx + gyy + gz*z with random coefficients in [-maxPhase, maxPhase].
Nx,
: Ny, Nz Image dimensions
Ns
: Number of shots
maxPhase
: Maximum phase gradient magnitude (radians)
seed
: RNG seed for reproducibility
Return : Vectorized phase map (Ni*Ns elements, radians)
real¶
template <typename T> forgeCol<T> real(const forgeCol<forgeComplex<T>>& A)
Extract real parts of a complex vector.
T
: Floating-point base type.
A
: Input complex vector.
Return
: Real vector of real parts, same length as A.
reconSolve¶
template <typename T1, typename TObj, typename RObj> forgeCol<forgeComplex<T1>> reconSolve(forgeCol<forgeComplex<T1>> data, TObj& Sg, RObj R, Col<T1> kx, Col<T1> ky, Col<T1> kz, uword Nx, uword Ny, uword Nz, Col<T1> tvec, uword niter, size_t parent_bar_idx = PG_NO_PARENT_BAR, size_t parent_bar_offset = 0)
High-level iterative MRI image reconstruction via PCG.
Workflow:
1. Constructs uniform data weights \(\mathbf{W} = \mathbf{1}\) (length equal
to the data vector).
2. Creates a zero-filled initial image estimate \(\mathbf{x}_0\) of size
\(N_x \times N_y \times N_z\).
3. Calls solve_pwls_pcg to solve the PWLS objective.
Image-space coordinates (set up by initImageSpaceCoords) span
\([0, (N-1)/N]\) in each dimension with column-major (x-fastest) ordering.
// Full reconstruction pipeline
forgeCol<float> ix, iy, iz;
initImageSpaceCoords<float>(ix, iy, iz, Nx, Ny, Nz);
Gnufft<float> G(n1, 2.0f, Nx, Ny, Nz, kx, ky, kz, ix, iy, iz);
SENSE<float, Gnufft<float>> S(G, SMap, nc);
QuadPenalty<float> R(Nx, Ny, Nz, beta);
auto xhat = reconSolve<float>(data, S, R, kx, ky, kz, Nx, Ny, Nz, tvec, niter);
See : solve_pwls_pcg
See : initImageSpaceCoords
T1
: Floating-point precision type (float or double).
TObj
: Encoding operator type (e.g., SENSE<T1, Gnufft<T1>>).
RObj
: Regularization object type (e.g., QuadPenalty<T1>).
data
: Measured k-space data (stacked multi-coil if using SENSE), length n1.
Sg
: Encoding operator.
R
: Regularization penalty object.
kx
: k-space x-coordinates, length n1 (unused by this function; provided for context).
ky
: k-space y-coordinates, length n1.
kz
: k-space z-coordinates, length n1.
Nx
: Image size in x.
Ny
: Image size in y.
Nz
: Image size in z (use 1 for 2-D).
tvec
: Per-sample readout time vector in seconds, length n1.
niter
: Maximum number of PCG iterations.
Return : Reconstructed image vector, length \(N_x \cdot N_y \cdot N_z\).
reset_mem¶
void reset_mem()
Release ownership of internal memory without freeing it (used by move semantics).
void reset_mem()
Release ownership of internal memory without freeing it (used by move semantics).
reshape¶
template <typename T> forgeMat<T> reshape(const forgeMat<T>& A, uword rows, uword cols)
Reshape a forgeMat to new dimensions (column-major order preserved).
rotationMatrixToQuaternion¶
inline void rotationMatrixToQuaternion(const float R[3][3], float q[4])
Convert a 3x3 rotation matrix to a NIfTI quaternion (a, b, c, d).
Follows Shepperd's method as described in the NIfTI-1 specification. The output quaternion satisfies q[0] >= 0 (canonical form). Only valid for proper rotations (det = +1). The caller must apply qfac to the slice column before calling this function.
R
: 3x3 rotation matrix.
q
: Output quaternion: q[0]=a (w), q[1]=b, q[2]=c, q[3]=d.
serialize¶
template <class Archive> void serialize(Archive& ar, const unsigned int /*version*/)
Boost.Serialization support for MPI communication.
template <class Archive> void serialize(Archive& ar, const unsigned int /*version*/)
Boost.Serialization support for MPI communication.
set_col¶
void set_col(uword colIndx, const forgeCol<T>& src)
Write a forgeCol into column colIndx (memcpy).
set_size¶
void set_size(uword length)
Allocate (or reallocate) storage for `length` elements, freeing any prior allocation.
void set_size(uword nCols, uword nRows)
Allocate (or reallocate) storage for nCols x nRows elements. Note: first arg is COLUMNS.
sheppLogan2D¶
template <typename T1> arma::Col<std::complex<T1>> sheppLogan2D(arma::uword Nx, arma::uword Ny)
Generate a 2D modified Shepp-Logan phantom image.
Nx
: Number of pixels in x direction
Ny
: Number of pixels in y direction
Return : Nx*Ny x 1 complex column vector (vectorized image, column-major)
solve_pwls_pcg¶
template <typename T1, typename Tobj, typename Robj> forgeCol<forgeComplex<T1>> solve_pwls_pcg(const forgeCol<forgeComplex<T1>>& xInitial, Tobj const& A, forgeCol<T1> const& W_pg, forgeCol<forgeComplex<T1>> const& yi_pg, Robj const& R, uword niter, size_t parent_bar_idx = PG_NO_PARENT_BAR, size_t parent_bar_offset = 0)
Solve a penalized weighted least-squares problem via preconditioned CG.
The gradient is \(\nabla = A^H W(A\mathbf{x} - \mathbf{y}) + \nabla R(\mathbf{x})\). Uses Polak-Ribiere-Polyak conjugate direction update with a quadratic surrogate step-size selection.
This is the primary overload accepting and returning forgeCol types.
Convergence thresholds:
- 1e-10: normalized gradient magnitude below this triggers early stop
- 1e-20: denominator guard in step-size computation to avoid division by zero
Thread safety: reads the g_should_stop atomic for cooperative early
termination (e.g., on SIGINT).
Gnufft<float> G(n1, 2.0f, Nx, Ny, Nz, kx, ky, kz, ix, iy, iz);
SENSE<float, Gnufft<float>> S(G, SMap, nc);
QuadPenalty<float> R(Nx, Ny, 1, 0.001f);
forgeCol<float> W(n1 * nc); W.ones();
forgeCol<forgeComplex<float>> x0(Nx*Ny); x0.zeros();
auto xhat = solve_pwls_pcg<float>(x0, S, W, yi, R, 50);
See : Fessler, "Penalized Weighted Least-Squares Image Reconstruction for Positron Emission Tomography," IEEE TMI, 1994. https://doi.org/10.1109/42.363108
T1
: Floating-point precision type (float or double).
Tobj
: Encoding operator type (must support A * x and A / y).
Robj
: Regularization object type (must support Gradient, Denom).
xInitial
: Initial image estimate, length n2.
A
: Encoding operator (forward: image->data, adjoint: data->image).
W_pg
: Non-negative data weights, length n1.
yi_pg
: Measured k-space data, length n1.
R
: Regularization penalty object.
niter
: Maximum number of CG iterations.
parent_bar_idx
: Index of an enclosing forgeview progress bar to tick
after each iteration, or PG_NO_PARENT_BAR (default).
parent_bar_offset
: Tick offset within the parent bar (default 0).
Return : Reconstructed image estimate as forgeCol, length n2.
spiralTrajectory2D¶
template <typename T1> void spiralTrajectory2D(arma::uword nInterleaves, arma::uword nSamplesPerArm, arma::uword Nx, arma::Col<T1>& kx, arma::Col<T1>& ky, arma::Col<T1>& kz)
Generate a 2D Archimedean spiral k-space trajectory.
nInterleaves
: Number of spiral interleaves
nSamplesPerArm
: Samples per spiral arm
Nx
: Image dimension (determines kmax = Nx/2)
kx
: k-space x coordinates
ky
: k-space y coordinates
kz
: k-space z coordinates (all zeros)
sqrt¶
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> forgeCol<T> sqrt(const forgeCol<T>& A)
Element-wise square root for a real vector.
T
: Floating-point type (real only).
A
: Non-negative real input vector.
Return : New vector with std::sqrt(A[i]) at each position.
square¶
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> forgeCol<T> square(const forgeCol<T>& A)
Element-wise square of a real vector: out[i] = A[i]².
T
: Floating-point type (real only).
A
: Input real vector.
Return : New vector of squared values.
stackOfSpiralsTimingVector¶
template <typename T1> arma::Col<T1> stackOfSpiralsTimingVector( arma::uword nSamplesPerArm, arma::uword nKzAcq, arma::uword Ns, T1 readoutTimePerArm)
Generate timing vector for stack-of-spirals readout.
Each shot acquires kz encodes sequentially, each taking readoutTimePerArm. Timing resets to 0 for each shot (pcSENSE reshapes to Mat(Nd,Ns) columns).
nSamplesPerArm
: Samples per spiral arm
nKzAcq
: Acquired kz encodes per shot
Ns
: Number of shots
readoutTimePerArm
: Readout duration per spiral arm (seconds)
Return : Timing vector of length Nd_per_shot * Ns
strans¶
template <typename T> forgeMat<forgeComplex<T>> strans(const forgeMat<forgeComplex<T>>& A)
Non-conjugate (simple) transpose of a complex forgeMat.
subvec¶
forgeCol<T> subvec(uword first, uword last)
Return a non-owning view of elements [first..last] (inclusive).
The view must not outlive this forgeCol (or the underlying memory if this is itself a view).
sum¶
template <typename T> const forgeComplex<T> sum(const forgeCol<forgeComplex<T>>& pgA)
Sum all elements of a complex forgeCol (separate real/imag reductions for OpenACC).
template <typename T> const T sum(const forgeCol<T>& pgA)
Sum all elements of a real forgeCol, dispatching to Accelerate vDSP on Apple.
template <typename T> const forgeCol<forgeComplex<T>> sum(const forgeMat<forgeComplex<T>>& pgA, const uword dim = 0)
Sum a complex forgeMat along the given dimension (0=column-wise, 1=row-wise).
template <typename T> const forgeCol<T> sum(const forgeMat<T>& pgA, const uword dim = 0)
Sum a real forgeMat along the given dimension (0=column-wise, 1=row-wise).
to_complex¶
template <typename T, typename std::enable_if< !std::is_same<T, forgeComplex<float>>::value && !std::is_same<T, forgeComplex<double>>::value, int>::type = 0> forgeCol<forgeComplex<T>> to_complex(const forgeCol<T>& A)
Lift a real vector to complex with zero imaginary part.
T
: Floating-point type (real only).
A
: Real input vector.
Return : Complex vector with A[i] as real part and 0 as imaginary part.
trans¶
template <typename T> forgeMat<forgeComplex<T>> trans(const forgeMat<forgeComplex<T>>& A)
Conjugate transpose of a complex forgeMat.
template <typename T> forgeMat<T> trans(const forgeMat<T>& A)
Transpose of a real forgeMat.
vectorise¶
template <typename T> const forgeCol<T> vectorise(const forgeMat<T>& pgA)
Flatten a forgeMat into a single forgeCol in column-major order.
writeISMRMRDImageData¶
template <typename T1> void writeISMRMRDImageData(ISMRMRD::Dataset* d, forgeCol<forgeComplex<T1>>& image, uword Nx, uword Ny, uword Nz)
Write a reconstructed complex image back into the ISMRMRD dataset.
writeNiftiMagPhsImage¶
template <typename T1> void writeNiftiMagPhsImage(std::string filename, forgeCol<forgeComplex<T1>> imageData, uword Nx, uword Ny, uword Nz, uword Nslab = 1, uword N_TR = 1, const NiftiOrientation& orient = NiftiOrientation { })
Write a complex-valued image as two NIfTI files: magnitude (`_mag`) and phase (`_phs`).
writeNiftiRealImage¶
template <typename T1> void writeNiftiRealImage(std::string filename, forgeCol<T1> imageData, uword Nx, uword Ny, uword Nz = 1, uword Nslab = 1, uword N_TR = 1, const NiftiOrientation& orient = NiftiOrientation { })
Write a real-valued image volume to a NIfTI-1 (.nii) file.
filename
: Base filename (without .nii extension, which is appended automatically).
imageData
: Flat column vector of image voxels (column-major, NxNyNzNslabN_TR).
Nx
: Image size along readout.
Ny
: Image size along phase-encode.
Nz
: Image size along slice/partition (default 1 for 2-D).
Nslab
: Number of slabs (default 1).
N_TR
: Number of time frames (default 1).
orient
: Optional NiftiOrientation with voxel sizes and direction cosines.
zero_pad2d¶
template <typename T1> void zero_pad2d(T1* __restrict pDst, T1* __restrict pSrc, int imageSizeX, int imageSizeY, T1 gridOS)
Zero-pad a 2-D image onto an oversampled grid (inverse of crop_center_region2d).
zero_pad3d¶
template <typename T1> void zero_pad3d(T1* __restrict pDst, T1* __restrict pSrc, int imageSizeX, int imageSizeY, int imageSizeZ, T1 gridOS)
Zero-pad a 3-D image onto an oversampled grid (inverse of crop_center_region3d).
zeros¶
void zeros()
Set all elements to zero (GPU-aware via OpenACC).
void zeros()
Set all elements to zero (GPU-aware via OpenACC).