Module 4 of Introduction To Concurrent Programming is where the course starts talking about CUDA! To stay usable by as many people as possible it doesn’t walk through installing CUDA, but that part matters to me. I already have CUDA drivers installed, so I wanted to be able to use the code from the course on my Windows machine. The rest of this post covers two things: (1) my CUDA setup, and (2) everything else I needed to do to actually compile the hello_world example.
CUDA Installation (Windows)
I’m running a Windows 10 machine with a GeForce GTX 1080 Ti. The GPU dictates the best CUDA version, which dictates everything else (compatible library versions, Visual Studio, and of course accessible APIs). To get enough CUDA on my machine to run normal deep learning, I did the following:
- Pick the best CUDA version for the GPU, keeping your library in mind. I landed on 12.1 since, at the time I originally set up my machine, that was the version that worked with the most up-to-date version of torchvision.
- Download & install the chosen CUDA version. So for me it’s the download page for 12.1. The installer is pretty simple to follow.
There were other steps I followed at the time to work with torchvision, but for our purposes I think this should be enough. This should get the nvcc tool that is shown in the module. Here’s the version that I see:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Feb__8_05:53:42_Coordinated_Universal_Time_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
(My) Setup to Run CUDA Code
However, having nvcc installed doesn’t mean you can actually compile anything with it. From what I can tell, on Windows the essential libraries are packaged with Visual Studio, and particular versions of CUDA are only compatible with particular versions of Visual Studio and its libraries. Here are my steps to get something working:
- Install Visual Studio 2022 with Desktop development with C++ if the option is available. According to errors I got when running nvcc, 2022 is the most up-to-date version that’s compatible with CUDA 12.1.
- However, not all versions of Visual Studio 2022 work; I had to downgrade to 17.6.5. To check the version and downgrade, open the Visual Studio Installer app. The option to downgrade might appear under the “More” button on the right. I didn’t manually pick 17.6.5 though, so there might be other versions that work.
- Click Modify. This will show a new window with the available add-ons.
- Select Desktop development with C++.
- At the top, click Individual Components.
- Install MSVC v141 - VS 2017 C++ x64/x86 build tools (v14.16). Do not uninstall later versions of MSVC; as you’ll see in my build command, there seem to be some interactions that work.
- On the bottom right, click install.
- Reboot the computer
Once you’ve followed these steps, create a solution and copy the hello_world.cu file from the site. Here’s the command that I currently use to create files for use in other C++ projects:
nvcc -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.36.32532\bin\HostX64\x64" .\hello_world.cu -MD
Notes:
- -MD is very different from the ptx and fatbin commands from the course. I prefer -MD since this gives me lib files to directly include in a make build.
- -ccbin helps since the current setup isn’t actually enough for my libraries to point correctly. Without the flag telling it which cl.exe to use, there are both missing and duplicate definitions. On a clean build, or with a bit more time, this could probably be removed.
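For anyone without the course file handy, here is a minimal sketch of the kind of kernel file the command above can compile. It is not the course’s hello_world.cu: the file name hello_sketch.cu, the kernel write_thread_ids, and the helper collect_thread_ids are names made up for illustration, and the block size of 4 is arbitrary.

```cuda
// hello_sketch.cu: a hypothetical stand-in, NOT the course's hello_world.cu.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each GPU thread writes its own global index into the output array.
__global__ void write_thread_ids(int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

// Launches the kernel for n elements and copies the result back to the host.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t collect_thread_ids(int n, std::vector<int>& host_out) {
    host_out.assign(n, -1);
    int* dev_out = nullptr;
    cudaError_t err = cudaMalloc(&dev_out, n * sizeof(int));
    if (err != cudaSuccess) {
        return err;
    }

    const int threads_per_block = 4;  // arbitrary choice for the sketch
    const int blocks = (n + threads_per_block - 1) / threads_per_block;
    write_thread_ids<<<blocks, threads_per_block>>>(dev_out, n);

    err = cudaGetLastError();  // catches launch configuration errors
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(host_out.data(), dev_out, n * sizeof(int),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(dev_out);
    return err;
}

int main() {
    std::vector<int> ids;
    cudaError_t err = collect_thread_ids(8, ids);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int id : ids) {
        std::printf("Hello from thread %d\n", id);
    }
    return 0;
}
```

Swapping .\hello_sketch.cu in for .\hello_world.cu in the nvcc command above should be all that is needed to build it, assuming the same Visual Studio setup.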
Here it is running in Visual Studio’s PowerShell.

And that means I can locally compile GPU code!
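Since the GPU and CUDA version drive everything else in this setup, a quick sanity check once compilation works is to ask the CUDA runtime what it actually sees. The sketch below is an assumption-laden example rather than anything from the course: format_cuda_version is a helper name made up here, and it relies on CUDA encoding versions as 1000 * major + 10 * minor (so 12.1 is reported as 12010).

```cuda
// device_check.cu: hypothetical file name, illustrative only.
#include <cstdio>
#include <string>
#include <cuda_runtime.h>

// CUDA reports versions as 1000 * major + 10 * minor, e.g. 12010 for 12.1.
std::string format_cuda_version(int v) {
    return std::to_string(v / 1000) + "." + std::to_string((v % 1000) / 10);
}

int main() {
    int runtime_version = 0;
    int driver_version = 0;
    cudaRuntimeGetVersion(&runtime_version);  // toolkit runtime linked into this binary
    cudaDriverGetVersion(&driver_version);    // newest CUDA version the driver supports

    std::printf("Runtime: %s\n", format_cuda_version(runtime_version).c_str());
    std::printf("Driver:  %s\n", format_cuda_version(driver_version).c_str());

    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print the name and compute capability of every visible GPU.
    for (int d = 0; d < device_count; ++d) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, d) == cudaSuccess) {
            std::printf("Device %d: %s (compute capability %d.%d)\n",
                        d, prop.name, prop.major, prop.minor);
        }
    }
    return 0;
}
```

If the runtime version printed here matches what nvcc --version reports, the compiler and the installed toolkit are at least talking about the same CUDA release.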