Update content/blog/cuda_net.md
Build website container / Build image (push) Successful in 3m7s
This commit is contained in:
parent
b8a160bac8
commit
501c92444b
|
@ -3,6 +3,8 @@ title: Writing a Convolutional Neural Network library with CUDA Support
|
||||||
draft: true
|
draft: true
|
||||||
---
|
---
|
||||||
|
|
||||||
|
Straightforward project, learned a lot more than I expected.
|
||||||
|
|
||||||
"Just use cuBLAS, it'll be easier. You don't have to implement custom CUDA kernels.", they said. Actually, no one said that. I just thought that because I didn't do enough research.
|
"Just use cuBLAS, it'll be easier. You don't have to implement custom CUDA kernels.", they said. Actually, no one said that. I just thought that because I didn't do enough research.
|
||||||
|
|
||||||
Why not combine multiple challenging things into one (C++, CMake, CUDA, CNN)?
|
Why not combine multiple challenging things into one (C++, CMake, CUDA, CNN)?
|
||||||
|
@ -18,4 +20,10 @@ Quickly discovering that without writing custom kernels, you can't really progre
|
||||||
- naive conv2d
|
- naive conv2d
|
||||||
- learning 3D memory representation
|
- learning 3D memory representation
|
||||||
- optimizing conv2d
|
- optimizing conv2d
|
||||||
- softmax sum reduce
|
- softmax sum reduce
|
||||||
|
- softmax numerical stability - max reduce
|
||||||
|
- custom binary weights file - (safetensors - JSON parser vs CSV) values overwritten by header
|
||||||
|
- tests passing -> implement AlexNet
|
||||||
|
- AlexNet CMake, OpenCV
|
||||||
|
- AlexNet crashing -> add CUDA error checking to tests -> test crashing
|
||||||
|
- compute-sanitizer memcheck
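
The "softmax sum reduce" and "softmax numerical stability - max reduce" items above can be sketched on the CPU roughly like this. It is only a minimal reference under my own assumptions, not the CUDA kernels from the library: `stable_softmax` is a hypothetical name, and the plain loops stand in for what would be parallel reductions on the GPU. The idea is to subtract the maximum before exponentiating, so `exp` never sees a large positive argument and cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for a numerically stable softmax.
// On the GPU, pass 1 and the sum in pass 2 would each be a reduction.
std::vector<float> stable_softmax(const std::vector<float>& logits) {
    assert(!logits.empty());

    // Pass 1 (max reduce): find the largest logit.
    const float max_val = *std::max_element(logits.begin(), logits.end());

    // Pass 2 (sum reduce): exponentiate the shifted logits and accumulate.
    // Every shifted value is <= 0, so exp() stays in (0, 1].
    std::vector<float> out(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - max_val);
        sum += out[i];
    }

    // Pass 3: normalize so the outputs sum to 1.
    for (float& v : out) {
        v /= sum;
    }
    return out;
}
```

Without the max subtraction, an input such as `{1000.0f, 1000.0f}` would overflow `exp` in single precision and produce NaN after the division; with it, both outputs come out as 0.5.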
|