---
title: Writing a Convolutional Neural Network library with CUDA Support
draft: true
---

"Just use cuBLAS, it'll be easier. You don't have to implement custom CUDA kernels," they said. Actually, no one said that. I just assumed it because I hadn't done enough research.

Why not combine multiple challenging things into one project (C++, CMake, CUDA, CNNs)?

I quickly discovered that without writing custom kernels, you can't really make progress.

- cuBLAS column major layout, macro
- CMake woes (`FindCUDA`)
- Google Test
- padding kernel
- column major / row major headache
- removing cuBLAS -> just row major representation
- naive conv2d
- learning 3D memory representation
- optimizing conv2d
- softmax sum reduce
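
To make the layout headache from the list above concrete, here is a minimal CPU-side sketch in C++, not the library's actual code. `idx_col_major` uses the same formula as the `IDX2C` macro shown in the cuBLAS documentation, and `idx_row_major` is its row-major counterpart. `conv2d_naive` is a hypothetical reference implementation (single channel, stride 1, zero padding) of the kind you could compare a CUDA kernel against in a test. All function names and parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major index of element (row, col); `ld` is the number of rows.
// Same formula as the IDX2C macro from the cuBLAS documentation.
constexpr std::size_t idx_col_major(std::size_t row, std::size_t col, std::size_t ld) {
    return col * ld + row;
}

// Row-major index of element (row, col); `ld` is the number of columns.
constexpr std::size_t idx_row_major(std::size_t row, std::size_t col, std::size_t ld) {
    return row * ld + col;
}

// Hypothetical naive 2D convolution (cross-correlation, as is usual in CNNs):
// one channel, stride 1, square k x k kernel, `pad` zeros on every side.
// Input, kernel, and output are all stored row-major.
std::vector<float> conv2d_naive(const std::vector<float>& input,
                                std::size_t in_h, std::size_t in_w,
                                const std::vector<float>& kernel,
                                std::size_t k, std::size_t pad) {
    assert(input.size() == in_h * in_w);
    assert(kernel.size() == k * k);
    assert(in_h + 2 * pad >= k && in_w + 2 * pad >= k);

    const std::size_t out_h = in_h + 2 * pad - k + 1;
    const std::size_t out_w = in_w + 2 * pad - k + 1;
    std::vector<float> output(out_h * out_w, 0.0f);

    for (std::size_t oy = 0; oy < out_h; ++oy) {
        for (std::size_t ox = 0; ox < out_w; ++ox) {
            float acc = 0.0f;
            for (std::size_t ky = 0; ky < k; ++ky) {
                for (std::size_t kx = 0; kx < k; ++kx) {
                    // Coordinates in the unpadded input; they can land in the zero border.
                    const std::ptrdiff_t iy = static_cast<std::ptrdiff_t>(oy + ky) -
                                              static_cast<std::ptrdiff_t>(pad);
                    const std::ptrdiff_t ix = static_cast<std::ptrdiff_t>(ox + kx) -
                                              static_cast<std::ptrdiff_t>(pad);
                    if (iy < 0 || ix < 0 ||
                        iy >= static_cast<std::ptrdiff_t>(in_h) ||
                        ix >= static_cast<std::ptrdiff_t>(in_w)) {
                        continue;  // padded zero contributes nothing
                    }
                    acc += input[idx_row_major(static_cast<std::size_t>(iy),
                                               static_cast<std::size_t>(ix), in_w)] *
                           kernel[idx_row_major(ky, kx, k)];
                }
            }
            output[idx_row_major(oy, ox, out_w)] = acc;
        }
    }
    return output;
}
```

The bounds check handles padding implicitly, so this version needs no separate padding pass. A GPU version would typically map each `(oy, ox)` pair to one thread and keep the two inner loops.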