This is a GPU Mode lecture (but more of a motivational video for people who feel intimidated by the depth of low-level engineering).
https://www.youtube.com/watch?v=4jQTb6sRGLg&t=722s
I recommend watching the full video on YouTube, but if you want to skim it quickly (or are not comfortable with English), check out the notes I created using Lilys AI below!
Lecture 50: A learning journey CUDA, Triton, Flash Attention