---
title: Writing a Convolutional Neural Network library with CUDA Support
draft: true
---
A straightforward project on paper; I ended up learning a lot more than I expected.
"Just use cuBLAS, it'll be easier. You don't have to implement custom CUDA kernels.", they said. Actually, noone said that. I just thought that because I didn't do enough research.
Why not combine multiple challenging things into one: C++, CMake, CUDA, and CNNs?
I quickly discovered that without writing custom kernels, you can't really make progress.
- cuBLAS column-major layout, indexing macro
- CMake woes (FindCUDA)
- Google Test
- padding kernel
- column-major / row-major headache
- removing cuBLAS -> a single row-major representation
- naive conv2d
- learning 3D memory representation
- optimizing conv2d
- softmax sum reduce
- softmax numerical stability - max reduce
- custom binary weights file - (safetensors - JSON parser vs CSV) - values overwritten by the header
- tests passing -> implement AlexNet
- AlexNet: CMake, OpenCV
- AlexNet crashing -> add CUDA error checking to tests -> tests crashing
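CUDA runtime calls fail silently unless you inspect their return codes, so a crash often surfaces far from its cause. The usual fix is a check-everything macro; a sketch of the standard pattern (requires a CUDA toolchain to compile):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Wrap every CUDA runtime call so errors surface at the call site
// instead of as a mysterious crash several kernels later.
#define CUDA_CHECK(call)                                                 \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",            \
                         cudaGetErrorString(err_), __FILE__, __LINE__);  \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

// Usage: CUDA_CHECK(cudaMemcpy(dst, src, n, cudaMemcpyDeviceToHost));
// Kernel launches return no error directly, so follow each launch with:
// CUDA_CHECK(cudaGetLastError());
```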
- compute-sanitizer memcheck
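When the error checks only tell you *that* something failed, NVIDIA's compute-sanitizer can tell you *where*: its memcheck tool reports out-of-bounds and misaligned device memory accesses per thread. Invocation is straightforward (the binary name here is a placeholder):

```shell
# memcheck is the default tool; --tool makes it explicit.
compute-sanitizer --tool memcheck ./my_tests
```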