I’m working through this course at the beginning of 2026 so that I can make deep modifications to deep learning models (and, if I have time, play with shaders in game development). It is made up of 4 courses, each of which has its own tag: Introduction to Concurrent Programming with GPUs, Introduction to Parallel Programming with CUDA, CUDA at Scale for the Enterprise, and CUDA Advanced Libraries.

•
The last two modules of Introduction to Parallel Programming with CUDA go over the other major types of memory and how to interact with them. I’m also going to add some of my own commentary and cover things that weren’t fully explored in the course. Memory Overview: Let’s look at a bird’s-eye view…

•
While module 3 of Introduction to Parallel Programming with CUDA has a really helpful section on using the nvidia-smi command to inspect your GPU and its current load, the main focus is sharing data between the GPU and the CPU. It’s a helpful bird’s-eye view and it gestures at the…

•
I’m combining the final module of Introduction to Concurrent Programming with the first two modules of Introduction to Parallel Programming with CUDA, since all three of these modules focus on the same type of material: what does the most basic CUDA programming look like? How does it flow? What are the first questions you need…

•
Module 4 of Introduction to Concurrent Programming is where the course starts talking about CUDA! To be usable by the widest range of people, it doesn’t walk through installing CUDA, but that step is important to me. I already have CUDA drivers installed, so I wanted to be able to use the code from the…

•
Module 3 of Introduction to Concurrent Programming is a primer on parallel programming in Python and C++. How useful this module is will really depend on how much the next two modules build on it. What I liked most about it is that I didn’t know about barriers before; it…

•
To really modify deep learning models, I’ve decided to take up Coursera’s Introduction To Concurrent Programming course. This will hopefully be a solid introduction to CUDA and give me a better sense of what operations I can do efficiently. I’ve finished the first two modules (“Course Overview” and “Core Principles of Parallel Programming on…