pcg_debug

Debug helpers for diagnosing GPU vs CPU numerical discrepancies in PCG. These functions copy GPU data to host and recompute on CPU to cross-check.

Functions

Name | Description
compare_cdot | Sync both vectors to host, compute conjugate dot product via CPU loop, compare with the GPU result, print both values and relative error.
sync_and_norm | Copy a GPU forgeCol to host, compute its L2 norm via CPU loop, and print it.

Function Details

compare_cdot

template <typename T> void compare_cdot(const forgeCol<forgeComplex<T>>& A, const forgeCol<forgeComplex<T>>& B, forgeComplex<T> gpu_result, const char* label)

Syncs both vectors to the host, computes their conjugate dot product in a CPU loop, and compares it with the GPU result, printing both values and the relative error. Warns if the relative error exceeds 1e-3.
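The host-side check described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `std::vector<std::complex<T>>` stands in for a `forgeCol<forgeComplex<T>>` that has already been copied to the host, and the names `cpu_cdot`, `relative_error`, and `check_cdot` are hypothetical. It also assumes that the first argument is the conjugated one and that the error is normalized by the CPU value; the source does not state either convention.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the CPU loop: sum over conj(a[i]) * b[i].
// Assumption: the first argument is conjugated.
template <typename T>
std::complex<T> cpu_cdot(const std::vector<std::complex<T>>& a,
                         const std::vector<std::complex<T>>& b) {
    assert(a.size() == b.size());
    std::complex<T> acc(T(0), T(0));
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += std::conj(a[i]) * b[i];
    }
    return acc;
}

// Relative error of the GPU value against the CPU reference.
// Falls back to the absolute error when the reference is exactly zero.
template <typename T>
T relative_error(std::complex<T> gpu, std::complex<T> cpu) {
    const T diff = std::abs(gpu - cpu);
    const T denom = std::abs(cpu);
    return denom > T(0) ? diff / denom : diff;
}

// Prints both values and the relative error; returns false (and warns)
// when the error exceeds tol. The 1e-3 default mirrors the documented threshold.
template <typename T>
bool check_cdot(const std::vector<std::complex<T>>& a,
                const std::vector<std::complex<T>>& b,
                std::complex<T> gpu_result, const char* label,
                T tol = T(1e-3)) {
    const std::complex<T> cpu = cpu_cdot(a, b);
    const T rel = relative_error(gpu_result, cpu);
    std::printf("[%s] gpu=(%g, %g) cpu=(%g, %g) rel_err=%g\n", label,
                static_cast<double>(gpu_result.real()),
                static_cast<double>(gpu_result.imag()),
                static_cast<double>(cpu.real()),
                static_cast<double>(cpu.imag()),
                static_cast<double>(rel));
    const bool ok = rel <= tol;
    if (!ok) {
        std::printf("[%s] WARNING: relative error exceeds %g\n", label,
                    static_cast<double>(tol));
    }
    return ok;
}
```

In a PCG loop, a check like this would typically be called right after the GPU dot product, with a label naming the iteration and the quantity being checked, so the first diverging step is easy to spot in the log.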

sync_and_norm

template <typename T> void sync_and_norm(const forgeCol<forgeComplex<T>>& v, const char* label)

Copies a GPU forgeCol to the host, computes its L2 norm in a CPU loop, and prints it. Warns if the norm is NaN or Inf.
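The norm check can be sketched in the same way. Again, this is an illustration under stated assumptions: `std::vector<std::complex<T>>` stands in for the host copy of a `forgeCol<forgeComplex<T>>`, and `cpu_l2_norm` and `report_norm` are hypothetical names, not part of the library.

```cpp
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the CPU loop: sqrt of the sum of |v[i]|^2.
template <typename T>
T cpu_l2_norm(const std::vector<std::complex<T>>& v) {
    T sum = T(0);
    for (const auto& x : v) {
        sum += std::norm(x);  // std::norm returns the squared magnitude
    }
    return std::sqrt(sum);
}

// Prints the norm; returns false (and warns) when it is NaN or Inf.
template <typename T>
bool report_norm(const std::vector<std::complex<T>>& v, const char* label) {
    const T n = cpu_l2_norm(v);
    std::printf("[%s] ||v||_2 = %g\n", label, static_cast<double>(n));
    const bool finite = std::isfinite(n);
    if (!finite) {
        std::printf("[%s] WARNING: norm is NaN or Inf\n", label);
    }
    return finite;
}
```

Because a single NaN or Inf element propagates into the sum, a non-finite norm is a cheap signal that at least one entry of the vector has gone bad, without having to inspect every element.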