day 67/100 of GPU Programming
- practiced writing my very fast dot product kernel and an fp16 GEMM kernel
day 66/100 of GPU Programming
- started reading the NVIDIA CUTLASS documentation
- learnt how to write a CuTe DSL vector add kernel, which is also currently the fastest on all available GPUs on leetgpu
Oct 6, 2025 · 6:29 PM UTC






