Part I - Intro to GPUs

Part II - CUDA Kernel Optimization Tips

Part III - Profiling a PyTorch Forward Pass

Part IV - 1D Convolution in CUDA (Naive)

Part V - 1D Convolution in CUDA (Optimized)

Part VI - Kernel Fusion in CUDA