pcg_debug¶
Debug helpers for diagnosing numerical discrepancies between GPU and CPU results in PCG. Each function copies GPU data to the host and recomputes the quantity on the CPU as a cross-check.
Functions¶
| Name | Description |
|---|---|
| compare_cdot | Sync both vectors to host, compute conjugate dot product via CPU loop, compare with the GPU result, print both values and relative error. |
| sync_and_norm | Copy a GPU forgeCol to host, compute its L2 norm via CPU loop, and print it. |
Function Details¶
compare_cdot¶
template <typename T> void compare_cdot(const forgeCol<forgeComplex<T>>& A, const forgeCol<forgeComplex<T>>& B, forgeComplex<T> gpu_result, const char* label)
Sync both vectors to host, compute conjugate dot product via CPU loop, compare with the GPU result, print both values and relative error. Warns if relative error exceeds 1e-3.
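The host-side check described above can be sketched in standard C++. This is a minimal illustration, not the library's implementation: `std::vector<std::complex<T>>` stands in for a `forgeCol<forgeComplex<T>>` that has already been synced to the host, and the names `cpu_cdot`, `relative_error`, and `report_cdot` are hypothetical. Two details are assumptions because the source does not state them: the first operand is the one conjugated, and the relative error is taken as |gpu - cpu| / |cpu|.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the CPU loop: sum over i of conj(a[i]) * b[i].
// Conjugating the first operand is an assumption; the source does not say which.
template <typename T>
std::complex<T> cpu_cdot(const std::vector<std::complex<T>>& a,
                         const std::vector<std::complex<T>>& b)
{
    assert(a.size() == b.size());
    std::complex<T> acc(0, 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += std::conj(a[i]) * b[i];
    }
    return acc;
}

// Assumed definition: |gpu - cpu| / |cpu|, falling back to the absolute
// difference when the CPU reference is exactly zero.
template <typename T>
T relative_error(std::complex<T> gpu, std::complex<T> cpu)
{
    const T diff = std::abs(gpu - cpu);
    const T denom = std::abs(cpu);
    return denom > T(0) ? diff / denom : diff;
}

// Print both values and the relative error; warn above the 1e-3 threshold
// named in the description. Returns true when the two results agree.
template <typename T>
bool report_cdot(std::complex<T> gpu, std::complex<T> cpu, const char* label)
{
    const T rel = relative_error(gpu, cpu);
    std::printf("[%s] gpu=(%g, %g) cpu=(%g, %g) rel_err=%g\n", label,
                static_cast<double>(gpu.real()), static_cast<double>(gpu.imag()),
                static_cast<double>(cpu.real()), static_cast<double>(cpu.imag()),
                static_cast<double>(rel));
    const bool ok = rel <= T(1e-3);
    if (!ok) {
        std::printf("[%s] WARNING: relative error exceeds 1e-3\n", label);
    }
    return ok;
}
```

In a real debugging session the `gpu` argument would be the value returned by the GPU dot product, and the two vectors would be the host copies of `A` and `B`.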
sync_and_norm¶
template <typename T> void sync_and_norm(const forgeCol<forgeComplex<T>>& v, const char* label)
Copy a GPU forgeCol to host, compute its L2 norm via CPU loop, and print it. Warns if the norm is NaN or Inf.
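The norm check can be sketched the same way. Again this is an illustration under stated assumptions rather than the library's code: `std::vector<std::complex<T>>` stands in for the host copy of the `forgeCol`, and `cpu_l2_norm` and `report_norm` are hypothetical names.

```cpp
#include <cmath>
#include <complex>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the CPU loop: sqrt of the sum of |v[i]|^2.
template <typename T>
T cpu_l2_norm(const std::vector<std::complex<T>>& v)
{
    T sum_sq = T(0);
    for (const auto& z : v) {
        sum_sq += std::norm(z);  // std::norm returns the squared magnitude
    }
    return std::sqrt(sum_sq);
}

// Print the norm and warn if it is NaN or Inf.
// Returns true when the norm is finite.
template <typename T>
bool report_norm(T norm, const char* label)
{
    std::printf("[%s] L2 norm = %g\n", label, static_cast<double>(norm));
    const bool finite = std::isfinite(norm);
    if (!finite) {
        std::printf("[%s] WARNING: norm is NaN or Inf\n", label);
    }
    return finite;
}
```

Calling such a check after each solver step with a distinct `label` makes it easier to find the first iteration at which a vector stops being finite.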