CUDAnet post draft
Mathis 2024-03-06 10:24:00 +00:00
parent b8a62de72d
commit 5c8e4c91e3
1 changed files with 17 additions and 0 deletions

content/blog/cuda_net.md Normal file

@@ -0,0 +1,17 @@
---
title: Writing a Convolutional Neural Network Library with CUDA Support
draft: true
---
"Just use cuBLAS, it'll be easier. You don't have to implement custom CUDA kernels," they said. Actually, no one said that. I just assumed it because I didn't do enough research.
Why not combine several challenging things into one project (C++, CMake, CUDA, CNNs)?
I quickly discovered that without writing custom kernels, you can't really make progress.
- cuBLAS column-major layout, macro
- CMake woes (FindCUDA)
- GoogleTest
- padding kernel
- column-major / row-major headache
- removing cuBLAS -> just a row-major representation
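
To make the layout headache concrete, here is a minimal sketch in plain C++ (CPU only, so it runs without a GPU). The macro names `IDX_COL_MAJOR` and `IDX_ROW_MAJOR` and the helpers `to_col_major` and `pad_zero` are hypothetical, made up for this illustration; they are not the library's actual code. The sketch assumes a single-channel matrix stored in a flat `std::vector<float>`. A CUDA padding kernel could use the same index mapping with one thread per element.

```cpp
#include <cstddef>
#include <vector>

// Column-major index, the layout cuBLAS assumes: element (row, col) of a
// matrix with `rows` rows is stored at col * rows + row.
#define IDX_COL_MAJOR(row, col, rows) ((col) * (rows) + (row))

// Row-major index, the natural layout for C and C++ arrays: element
// (row, col) of a matrix with `cols` columns is stored at row * cols + col.
#define IDX_ROW_MAJOR(row, col, cols) ((row) * (cols) + (col))

// Hypothetical helper: copy a row-major matrix into column-major storage,
// which is the conversion needed before handing data to a column-major API.
std::vector<float> to_col_major(const std::vector<float>& in,
                                std::size_t rows, std::size_t cols) {
    std::vector<float> out(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[IDX_COL_MAJOR(r, c, rows)] = in[IDX_ROW_MAJOR(r, c, cols)];
        }
    }
    return out;
}

// Hypothetical CPU reference for zero padding: surround an h x w row-major
// matrix with `pad` zeros on every side, giving (h + 2*pad) x (w + 2*pad).
std::vector<float> pad_zero(const std::vector<float>& in,
                            std::size_t h, std::size_t w, std::size_t pad) {
    const std::size_t out_h = h + 2 * pad;
    const std::size_t out_w = w + 2 * pad;
    std::vector<float> out(out_h * out_w, 0.0f);  // border stays zero
    for (std::size_t r = 0; r < h; ++r) {
        for (std::size_t c = 0; c < w; ++c) {
            // Shift every input element by `pad` rows and `pad` columns.
            out[IDX_ROW_MAJOR(r + pad, c + pad, out_w)] =
                in[IDX_ROW_MAJOR(r, c, w)];
        }
    }
    return out;
}
```

With everything kept row-major, only `IDX_ROW_MAJOR` is needed and the conversion step disappears, which is roughly the appeal of dropping the column-major dependency.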