
CudaDftPipeline

class CudaDftPipeline

GPU-accelerated field-corrected DFT pipeline.

Computes the forward DFT: s_j = Σ_k m_k · exp(-i·(2π(k_j·r_k) + Δω_k·t_j))

and its adjoint (conjugate phase). Optionally includes R₂* gradient correction via sinc factors (GdftR2 variant).
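Written out in full, the forward/adjoint pair might look like the following. This is a sketch derived from the formula above: N_s and N_p are shorthand introduced here for numSamples and numPixels, and the adjoint is shown without any scaling factor, which is an assumption rather than something this page states. The R₂* sinc factors of the GdftR2 variant are not shown.

```latex
% Forward: image m (pixel k at position r_k, off-resonance \Delta\omega_k) to k-space samples s
s_j = \sum_{k=1}^{N_p} m_k \,
      e^{-i\left(2\pi\,\mathbf{k}_j\cdot\mathbf{r}_k + \Delta\omega_k\, t_j\right)},
\qquad j = 1,\dots,N_s

% Adjoint: same phase with the sign flipped, summed over samples instead of pixels
\hat{m}_k = \sum_{j=1}^{N_s} s_j \,
      e^{+i\left(2\pi\,\mathbf{k}_j\cdot\mathbf{r}_k + \Delta\omega_k\, t_j\right)},
\qquad k = 1,\dots,N_p
```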

All trajectory, field map, and gradient data are uploaded to GPU once at construction. The forward_device/adjoint_device methods operate on interleaved complex device pointers [re0,im0,re1,im1,...], matching the forgeComplex memory layout used throughout forge.
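To make the interleaved layout concrete, here is a minimal host-side sketch that converts between std::complex<float> values and a flat [re0,im0,re1,im1,...] float buffer. The helper names to_interleaved and from_interleaved are hypothetical and are not part of forge.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Pack complex values into an interleaved float buffer [re0, im0, re1, im1, ...].
// Hypothetical helper, for illustration only.
std::vector<float> to_interleaved(const std::vector<std::complex<float>>& z) {
    std::vector<float> out(2 * z.size());
    for (std::size_t n = 0; n < z.size(); ++n) {
        out[2 * n]     = z[n].real();
        out[2 * n + 1] = z[n].imag();
    }
    return out;
}

// Inverse of to_interleaved: rebuild complex values from an interleaved buffer.
std::vector<std::complex<float>> from_interleaved(const std::vector<float>& buf) {
    assert(buf.size() % 2 == 0);  // one real and one imaginary float per element
    std::vector<std::complex<float>> z(buf.size() / 2);
    for (std::size_t n = 0; n < z.size(); ++n) {
        z[n] = std::complex<float>(buf[2 * n], buf[2 * n + 1]);
    }
    return z;
}
```

A buffer of N complex values therefore occupies 2*N floats, which is why the device methods document their sizes as 2*numPixels and 2*numSamples.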

Follows the same API pattern as CudaNufftPipeline.

Functions

Name Description
CudaDftPipeline Construct a basic Gdft pipeline (no gradient correction).
CudaDftPipeline Construct a GdftR2 pipeline with R₂* gradient correction.
forward_device Forward DFT on device pointers (no host transfers).
adjoint_device Adjoint DFT on device pointers (no host transfers).
forward_device_batched Batched forward DFT: compute numBatch independent DFTs in one kernel launch. All batches share the same trajectory/field data but have different input images. d_images: concatenated interleaved complex images, 2*numPixels*numBatch floats (batch b starts at offset b*2*numPixels). d_kdata: concatenated interleaved complex k-space output, 2*numSamples*numBatch floats. numBatch: number of independent DFTs to compute in parallel.
adjoint_device_batched Batched adjoint DFT: compute numBatch independent adjoint DFTs in one kernel launch. d_kdata: concatenated interleaved complex k-space, 2*numSamples*numBatch floats. d_images: concatenated interleaved complex image output, 2*numPixels*numBatch floats. numBatch: number of independent adjoint DFTs to compute in parallel.
forward Forward DFT with host pointers (H2D, compute, D2H, sync).
adjoint Adjoint DFT with host pointers (H2D, compute, D2H, sync).

Function Details

CudaDftPipeline

CudaDftPipeline(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, int numSamples, int numPixels)

Construct a basic Gdft pipeline (no gradient correction).

kx,ky,kz : k-space coordinates, length numSamples (host pointers)

ix,iy,iz : Image-space coordinates, length numPixels (host pointers)

FM : Off-resonance field map in rad/s, length numPixels

t : Per-sample readout time in seconds, length numSamples

numSamples : Number of k-space points (n1)

numPixels : Number of image pixels (n2)

CudaDftPipeline(const float* kx, const float* ky, const float* kz, const float* ix, const float* iy, const float* iz, const float* FM, const float* t, const float* Gx, const float* Gy, const float* Gz, int numSamples, int numPixels, int numX, int numY, int numZ)

Construct a GdftR2 pipeline with R₂* gradient correction.

Gx,Gy,Gz : Gradient maps of the field map, length numPixels

numX,numY,numZ : Image grid dimensions (for sinc normalization)

Other parameters are the same as in the basic constructor.

adjoint

void adjoint(const float* h_kdata, float* h_image)

Adjoint DFT with host pointers (H2D, compute, D2H, sync).

adjoint_device

void adjoint_device(const float* d_kdata, float* d_image)

Adjoint DFT on device pointers (no host transfers).

d_kdata : Input: interleaved complex k-space, 2*numSamples floats

d_image : Output: interleaved complex image, 2*numPixels floats
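For checking GPU results on small problems, a CPU reference of the adjoint can be useful. The sketch below is a hypothetical helper (adjoint_reference is not part of forge) that applies the conjugate phase exp(+i·(2π(k_j·r_k) + Δω_k·t_j)) and sums over samples. It assumes no scaling factor and no R₂* gradient correction, and it uses std::vector inputs rather than raw pointers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the adjoint:
//   m_k = sum_j s_j * exp(+i * (2*pi*(k_j . r_k) + FM_k * t_j))
// kdata holds 2*numSamples interleaved floats; the result holds 2*numPixels.
std::vector<float> adjoint_reference(
    const std::vector<float>& kx, const std::vector<float>& ky, const std::vector<float>& kz,
    const std::vector<float>& ix, const std::vector<float>& iy, const std::vector<float>& iz,
    const std::vector<float>& FM, const std::vector<float>& t,
    const std::vector<float>& kdata)
{
    const std::size_t numSamples = kx.size();
    const std::size_t numPixels = ix.size();
    assert(kdata.size() == 2 * numSamples);
    assert(t.size() == numSamples && FM.size() == numPixels);

    const double twoPi = 6.283185307179586;
    std::vector<float> image(2 * numPixels, 0.0f);
    for (std::size_t k = 0; k < numPixels; ++k) {
        double re = 0.0, im = 0.0;  // accumulate in double to limit round-off
        for (std::size_t j = 0; j < numSamples; ++j) {
            const double phase =
                twoPi * (static_cast<double>(kx[j]) * ix[k] +
                         static_cast<double>(ky[j]) * iy[k] +
                         static_cast<double>(kz[j]) * iz[k]) +
                static_cast<double>(FM[k]) * t[j];
            const double c = std::cos(phase), s = std::sin(phase);
            const double sr = kdata[2 * j], si = kdata[2 * j + 1];
            // (sr + i*si) * (c + i*s)
            re += sr * c - si * s;
            im += sr * s + si * c;
        }
        image[2 * k]     = static_cast<float>(re);
        image[2 * k + 1] = static_cast<float>(im);
    }
    return image;
}
```

The cost is O(numSamples × numPixels), so this is only practical as a correctness check on small inputs.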

adjoint_device_batched

void adjoint_device_batched(const float* d_kdata, float* d_images, int numBatch)

Batched adjoint DFT: compute numBatch independent adjoint DFTs in one kernel launch.

d_kdata : Concatenated interleaved complex k-space, 2*numSamples*numBatch floats

d_images : Concatenated interleaved complex image output, 2*numPixels*numBatch floats

numBatch : Number of independent adjoint DFTs to compute in parallel

forward

void forward(const float* h_image, float* h_kdata)

Forward DFT with host pointers (H2D, compute, D2H, sync).

forward_device

void forward_device(const float* d_image, float* d_kdata)

Forward DFT on device pointers (no host transfers).

d_image : Input: interleaved complex image, 2*numPixels floats

d_kdata : Output: interleaved complex k-space, 2*numSamples floats
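A small CPU reference of the forward model can help validate device output on toy inputs. The following sketch is a hypothetical helper (forward_reference is not part of forge) that evaluates s_j = Σ_k m_k · exp(-i·(2π(k_j·r_k) + Δω_k·t_j)) directly on interleaved buffers. It assumes the basic Gdft model without R₂* gradient correction and takes std::vector inputs rather than raw pointers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the forward model:
//   s_j = sum_k m_k * exp(-i * (2*pi*(k_j . r_k) + FM_k * t_j))
// image holds 2*numPixels interleaved floats; the result holds 2*numSamples.
std::vector<float> forward_reference(
    const std::vector<float>& kx, const std::vector<float>& ky, const std::vector<float>& kz,
    const std::vector<float>& ix, const std::vector<float>& iy, const std::vector<float>& iz,
    const std::vector<float>& FM, const std::vector<float>& t,
    const std::vector<float>& image)
{
    const std::size_t numSamples = kx.size();
    const std::size_t numPixels = ix.size();
    assert(image.size() == 2 * numPixels);
    assert(t.size() == numSamples && FM.size() == numPixels);

    const double twoPi = 6.283185307179586;
    std::vector<float> kdata(2 * numSamples, 0.0f);
    for (std::size_t j = 0; j < numSamples; ++j) {
        double re = 0.0, im = 0.0;  // accumulate in double to limit round-off
        for (std::size_t k = 0; k < numPixels; ++k) {
            const double phase =
                twoPi * (static_cast<double>(kx[j]) * ix[k] +
                         static_cast<double>(ky[j]) * iy[k] +
                         static_cast<double>(kz[j]) * iz[k]) +
                static_cast<double>(FM[k]) * t[j];
            const double c = std::cos(phase), s = std::sin(phase);
            const double mr = image[2 * k], mi = image[2 * k + 1];
            // (mr + i*mi) * (c - i*s)
            re += mr * c + mi * s;
            im += mi * c - mr * s;
        }
        kdata[2 * j]     = static_cast<float>(re);
        kdata[2 * j + 1] = static_cast<float>(im);
    }
    return kdata;
}
```

For example, a single unit pixel at position (1,0,0) with a zero field map yields 1 at the sample with k = 0 and -i at the sample with kx = 0.25, since the phase there is π/2.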

forward_device_batched

void forward_device_batched(const float* d_images, float* d_kdata, int numBatch)

Batched forward DFT: compute numBatch independent DFTs in one kernel launch.
All batches share the same trajectory/field data but have different input images.

d_images : Concatenated interleaved complex images, 2*numPixels*numBatch floats (batch b starts at offset b * 2*numPixels)

d_kdata : Concatenated interleaved complex k-space output, 2*numSamples*numBatch floats

numBatch : Number of independent DFTs to compute in parallel
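The batched buffer layout can be sketched on the host as follows. Both helpers are hypothetical (not part of forge) and only illustrate the offset arithmetic described above: each batch occupies 2*numElements floats, where numElements is numPixels for images and numSamples for k-space, and batches are stored back to back.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset, in floats, of batch b inside a concatenated interleaved buffer
// whose batches each hold numElements complex values. Hypothetical helper.
std::size_t batch_offset(std::size_t b, std::size_t numElements) {
    return b * 2 * numElements;
}

// Concatenate per-batch interleaved buffers into one contiguous buffer,
// in batch order. Hypothetical helper.
std::vector<float> concat_batches(const std::vector<std::vector<float>>& batches,
                                  std::size_t numElements) {
    std::vector<float> out;
    out.reserve(batches.size() * 2 * numElements);
    for (const auto& batch : batches) {
        assert(batch.size() == 2 * numElements);  // every batch must be full size
        out.insert(out.end(), batch.begin(), batch.end());
    }
    return out;
}
```

With numPixels = 2 and three images, the concatenated buffer holds 12 floats and the third image starts at float index batch_offset(2, 2) = 8.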