Spicy take: @nim_lang will be the best language for code-generating high-performance GPU kernels for AMD, Metal, Nvidia, OpenCL, Vulkan. And those macros ☄️🔥 forum.nim-lang.org/t/12868 Only ~1000 LOC for a compile-time, macro-based CUDA code generator to compile Nim to CUDA
Nim + NVRTC = 5.3x Faster Than ICICLE for GPU-Accelerated Poseidon2 Merkle Trees. A Deep Dive: lita.foundation/blog/nvrtc-c…

May 6, 2025 · 9:23 AM UTC
