CudaDftPipeline¶

class CudaDftPipeline

GPU-accelerated field-corrected DFT pipeline.

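Computes the forward DFT:

s_j = Σ_k m_k · exp(-i·(2π(k_j·r_k) + Δω_k·t_j))

and its adjoint (conjugate phase, summed over samples j instead of pixels k). Here j indexes the numSamples k-space samples and k indexes the numPixels image pixels: k_j = (kx, ky, kz) is the k-space coordinate of sample j, r_k = (ix, iy, iz) is the position of pixel k, m_k is the complex image value, Δω_k is the field map value FM in rad/s, and t_j is the readout time in seconds. Optionally includes R₂* gradient correction via sinc factors (GdftR2 variant).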
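The forward sum and its adjoint can be illustrated with a small CPU reference. This is a minimal sketch for checking results on tiny problems, not forge's GPU kernel: `DftGeometry`, `dft_phase`, `reference_forward`, and `reference_adjoint` are hypothetical names, the R₂* sinc correction is omitted, and any scaling the library may apply is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring the constructor arguments; not a forge type.
struct DftGeometry {
    std::vector<float> kx, ky, kz;  // k-space coordinates, length numSamples
    std::vector<float> ix, iy, iz;  // image-space coordinates, length numPixels
    std::vector<float> FM;          // field map in rad/s, length numPixels
    std::vector<float> t;           // readout time in seconds, length numSamples
};

// Phase 2*pi*(k_j . r_k) + FM_k * t_j for sample j and pixel k.
inline double dft_phase(const DftGeometry& g, std::size_t j, std::size_t k) {
    const double twoPi = 6.283185307179586;
    const double kr = double(g.kx[j]) * g.ix[k] + double(g.ky[j]) * g.iy[k] +
                      double(g.kz[j]) * g.iz[k];
    return twoPi * kr + double(g.FM[k]) * g.t[j];
}

// s_j = sum_k m_k * exp(-i * phase). Input and output are interleaved complex.
std::vector<float> reference_forward(const DftGeometry& g,
                                     const std::vector<float>& image) {
    const std::size_t numSamples = g.kx.size(), numPixels = g.ix.size();
    assert(image.size() == 2 * numPixels);
    std::vector<float> kdata(2 * numSamples);
    for (std::size_t j = 0; j < numSamples; ++j) {
        double re = 0.0, im = 0.0;
        for (std::size_t k = 0; k < numPixels; ++k) {
            const double p = dft_phase(g, j, k);
            const double c = std::cos(p), s = std::sin(p);
            // (a + i b) * (c - i s) = (a c + b s) + i (b c - a s)
            re += image[2 * k] * c + image[2 * k + 1] * s;
            im += image[2 * k + 1] * c - image[2 * k] * s;
        }
        kdata[2 * j] = float(re);
        kdata[2 * j + 1] = float(im);
    }
    return kdata;
}

// m_k = sum_j s_j * exp(+i * phase): the conjugate phase, summed over samples.
std::vector<float> reference_adjoint(const DftGeometry& g,
                                     const std::vector<float>& kdata) {
    const std::size_t numSamples = g.kx.size(), numPixels = g.ix.size();
    assert(kdata.size() == 2 * numSamples);
    std::vector<float> image(2 * numPixels);
    for (std::size_t k = 0; k < numPixels; ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t j = 0; j < numSamples; ++j) {
            const double p = dft_phase(g, j, k);
            const double c = std::cos(p), s = std::sin(p);
            // (a + i b) * (c + i s) = (a c - b s) + i (a s + b c)
            re += kdata[2 * j] * c - kdata[2 * j + 1] * s;
            im += kdata[2 * j] * s + kdata[2 * j + 1] * c;
        }
        image[2 * k] = float(re);
        image[2 * k + 1] = float(im);
    }
    return image;
}
```

The cost is O(numSamples · numPixels), so a reference like this is only practical for small validation cases.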

All trajectory, field map, and gradient data are uploaded to GPU once at construction. The forward_device/adjoint_device methods operate on interleaved complex device pointers [re0,im0,re1,im1,...], matching the forgeComplex memory layout used throughout forge.
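To make the interleaved layout concrete, the sketch below converts between `std::complex<float>` values and a flat `[re0,im0,re1,im1,...]` float buffer on the host. The helper names `to_interleaved` and `from_interleaved` are hypothetical and not part of forge; an explicit copy is used so the example does not rely on any pointer cast.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: pack complex values as [re0, im0, re1, im1, ...].
std::vector<float> to_interleaved(const std::vector<std::complex<float>>& z) {
    std::vector<float> out(2 * z.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        out[2 * i] = z[i].real();
        out[2 * i + 1] = z[i].imag();
    }
    return out;
}

// Hypothetical helper: unpack an interleaved buffer; a trailing odd float is ignored.
std::vector<std::complex<float>> from_interleaved(const std::vector<float>& v) {
    std::vector<std::complex<float>> z(v.size() / 2);
    for (std::size_t i = 0; i < z.size(); ++i) {
        z[i] = std::complex<float>(v[2 * i], v[2 * i + 1]);
    }
    return z;
}
```

A buffer for an image of numPixels complex values therefore holds 2*numPixels floats, which is the size the device methods expect.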

Follows the same API pattern as CudaNufftPipeline.

Functions¶

Name	Description
CudaDftPipeline	Construct a basic Gdft pipeline (no gradient correction).
CudaDftPipeline	Construct a GdftR2 pipeline with R₂* gradient correction.
forward_device	Forward DFT on device pointers (no host transfers).
adjoint_device	Adjoint DFT on device pointers (no host transfers).
forward_device_batched	Batched forward DFT: compute numBatch independent DFTs in one kernel launch. All batches share the same trajectory/field data but have different input images. `d_images` : Concatenated interleaved complex images, 2*numPixels*numBatch floats (batch b starts at offset b * 2*numPixels). `d_kdata` : Concatenated interleaved complex k-space output, 2*numSamples*numBatch floats. `numBatch` : Number of independent DFTs to compute in parallel.
adjoint_device_batched	Batched adjoint DFT: compute numBatch independent adjoint DFTs in one kernel launch. `d_kdata` : Concatenated interleaved complex k-space, 2*numSamples*numBatch floats. `d_images` : Concatenated interleaved complex image output, 2*numPixels*numBatch floats. `numBatch` : Number of independent adjoint DFTs to compute in parallel.
forward	Forward DFT with host pointers (H2D, compute, D2H, sync).
adjoint	Adjoint DFT with host pointers (H2D, compute, D2H, sync).

Function Details¶

CudaDftPipeline ¶

CudaDftPipeline(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, int numSamples, int numPixels)

Construct a basic Gdft pipeline (no gradient correction).

kx,ky,kz : k-space coordinates, length numSamples (host pointers)

ix,iy,iz : Image-space coordinates, length numPixels (host pointers)

FM : Off-resonance field map in rad/s, length numPixels

t : Per-sample readout time in seconds, length numSamples

numSamples : Number of k-space points (n1)

numPixels : Number of image pixels (n2)

CudaDftPipeline(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, const float* Gx, const float* Gy, const float* Gz, int numSamples, int numPixels, int numX, int numY, int numZ)

Construct a GdftR2 pipeline with R₂* gradient correction.

Gx,Gy,Gz : Gradient maps of the field map, length numPixels

numX,numY,numZ : Image grid dimensions (for sinc normalization)

Other parameters are the same as in the basic constructor.

adjoint ¶

void adjoint(const float* h_kdata, float* h_image)

Adjoint DFT with host pointers (H2D, compute, D2H, sync).

adjoint_device ¶

void adjoint_device(const float* d_kdata, float* d_image)

Adjoint DFT on device pointers (no host transfers).

d_kdata : Input: interleaved complex k-space, 2*numSamples floats

d_image : Output: interleaved complex image, 2*numPixels floats
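A common way to check that an adjoint matches its forward operator is the dot-product test: for arbitrary x and y, ⟨A x, y⟩ should equal ⟨x, Aᴴ y⟩ up to rounding. The sketch below is host-side and library-independent; `Interleaved`, `Op`, `inner`, and `adjoint_mismatch` are hypothetical names, and the callables stand in for whatever wrappers one writes around the forward and adjoint methods. The toy operators `times_i` and `times_minus_i` are included only so the example runs on its own.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <functional>
#include <vector>

using Interleaved = std::vector<float>;  // [re0, im0, re1, im1, ...]
using Op = std::function<Interleaved(const Interleaved&)>;

// Complex inner product <a, b> = sum_k a_k * conj(b_k) over interleaved data.
std::complex<double> inner(const Interleaved& a, const Interleaved& b) {
    std::complex<double> acc(0.0, 0.0);
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t k = 0; k + 1 < n; k += 2) {
        const std::complex<double> ak(a[k], a[k + 1]);
        const std::complex<double> bk(b[k], b[k + 1]);
        acc += ak * std::conj(bk);
    }
    return acc;
}

// Relative mismatch |<A x, y> - <x, AH y>| / |<A x, y>| (absolute if the denominator is 0).
double adjoint_mismatch(const Op& A, const Op& AH,
                        const Interleaved& x, const Interleaved& y) {
    const std::complex<double> lhs = inner(A(x), y);
    const std::complex<double> rhs = inner(x, AH(y));
    const double scale = std::abs(lhs) > 0.0 ? std::abs(lhs) : 1.0;
    return std::abs(lhs - rhs) / scale;
}

// Toy operator: multiply every complex entry by i.
Interleaved times_i(const Interleaved& v) {
    Interleaved out(v.size());
    for (std::size_t k = 0; k + 1 < v.size(); k += 2) {
        out[k] = -v[k + 1];
        out[k + 1] = v[k];
    }
    return out;
}

// Its adjoint: multiply every complex entry by -i.
Interleaved times_minus_i(const Interleaved& v) {
    Interleaved out(v.size());
    for (std::size_t k = 0; k + 1 < v.size(); k += 2) {
        out[k] = v[k + 1];
        out[k + 1] = -v[k];
    }
    return out;
}
```

With single-precision kernels the mismatch will not be exactly zero; an acceptable tolerance depends on problem size and is left to the caller.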

adjoint_device_batched ¶

void adjoint_device_batched(const float* d_kdata, float* d_images, int numBatch)

Batched adjoint DFT: compute numBatch independent adjoint DFTs in one kernel launch.

d_kdata : Concatenated interleaved complex k-space, 2*numSamples*numBatch floats

d_images : Concatenated interleaved complex image output, 2*numPixels*numBatch floats

numBatch : Number of independent adjoint DFTs to compute in parallel

forward ¶

void forward(const float* h_image, float* h_kdata)

Forward DFT with host pointers (H2D, compute, D2H, sync).

forward_device ¶

void forward_device(const float* d_image, float* d_kdata)

Forward DFT on device pointers (no host transfers).

d_image : Input: interleaved complex image, 2*numPixels floats

d_kdata : Output: interleaved complex k-space, 2*numSamples floats

forward_device_batched ¶

void forward_device_batched(const float* d_images, float* d_kdata, int numBatch)

Batched forward DFT: compute numBatch independent DFTs in one kernel launch.
All batches share the same trajectory/field data but have different input images.

d_images : Concatenated interleaved complex images, 2*numPixels*numBatch floats (batch b starts at offset b * 2*numPixels)

d_kdata : Concatenated interleaved complex k-space output, 2*numSamples*numBatch floats

numBatch : Number of independent DFTs to compute in parallel
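The batched buffer layout can be sketched with a few host-side helpers. The image offset follows the rule stated above (batch b starts at b * 2*numPixels); the k-space offset b * 2*numSamples is assumed by analogy with the stated buffer size. All helper names are hypothetical and not part of forge.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Float offset of batch b in the concatenated image buffer.
inline std::size_t image_batch_offset(std::size_t b, std::size_t numPixels) {
    return b * 2 * numPixels;
}

// Float offset of batch b in the concatenated k-space buffer
// (assumed to mirror the image layout).
inline std::size_t kdata_batch_offset(std::size_t b, std::size_t numSamples) {
    return b * 2 * numSamples;
}

// Concatenate per-batch interleaved images into one host buffer of
// 2*numPixels*numBatch floats, ready for a single upload.
std::vector<float> concat_batches(const std::vector<std::vector<float>>& images,
                                  std::size_t numPixels) {
    std::vector<float> out(images.size() * 2 * numPixels, 0.0f);
    for (std::size_t b = 0; b < images.size(); ++b) {
        if (images[b].size() != 2 * numPixels) {
            throw std::invalid_argument("each image must hold 2*numPixels floats");
        }
        const auto dst = out.begin() +
            static_cast<std::ptrdiff_t>(image_batch_offset(b, numPixels));
        std::copy(images[b].begin(), images[b].end(), dst);
    }
    return out;
}
```

Using `std::size_t` for the offsets avoids overflow when 2*numPixels*numBatch exceeds the range of a 32-bit `int`.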
