Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch

BenoitKAO/cuda_programming

GPGPU Programming with CUDA

This repository contains all code from the YouTube series "GPGPU Programming with CUDA" by CoffeeBeforeArch.

Contact

Suggestions for specific content can be sent to: [email protected]

An up-to-date list of all series is available at: Google Sheets

Environment

Operating System: Windows 10 & Ubuntu 18.04

IDE: Visual Studio 2017

Text Editor: VIM

GPU: NVIDIA GTX 1050 Ti

CUDA version: 10.0, 9.1

Concepts covered in each video

Environment Setup

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: Visual Studio 2017 Environment Setup | Setup, Linker, Visual Studio, Environment, Build Paths | vs_setup.cu |
| CUDA Crash Course: Programming in Linux | NVCC, NVprof, Vector Addition | vector_add.cu |

Vector Addition

| Video | Concepts | Files |
| --- | --- | --- |
| GPGPU Programming with CUDA: Vector Add | GPU Threads, Memory Allocation, Memory Copy, GPU Kernels, Running Kernels | vector_add.cu |
| GPGPU Programming with CUDA: Vector Add with Unified Memory | Unified Memory, Prefetching | vector_add_um.cu |

Matrix Multiplication

| Video | Concepts | Files |
| --- | --- | --- |
| GPGPU Programming with CUDA: Matrix Multiplication | 2-D Threadblocks, Aligned Memory Accesses | matrix_mul.cu |
| GPGPU Programming with CUDA: Tiled Matrix Multiplication | Shared Memory, Cache Tiling, Performance Analysis, Optimization | tiled_matrix_mul.cu |
| CUDA Crash Course: Why Coalescing Matters | Transposing Matrices, Coalescing Techniques | alignment_matrix_mul.cu |

CUDA Libraries

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: cuBLAS for Vector Add | cuBLAS, SAXPY | simple_cublas.cu |
| CUDA Crash Course: cuBLAS for Matrix Multiplication | Column-Major Order, SGEMM, cuRAND | cublas_matrix_mul.cu |

Sum Reduction

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: Sum Reduction Part 1 | Sum Reduction, Warp Divergence | sum_reduction_diverged.cu |
| CUDA Crash Course: Sum Reduction Part 2 | Expensive Operations, Optimization, Warp Divergence | sum_reduction_bank_conflicts.cu |
| CUDA Crash Course: Sum Reduction Part 3 | Optimization, Shared Memory Bank Conflicts | sum_reduction_no_conflicts.cu |
| CUDA Crash Course: Sum Reduction Part 4 | Optimization, Idle Threads | sum_reduction_reduce_idle_threads.cu |
| CUDA Crash Course: Sum Reduction Part 5 | Optimization, Device Function, Loop Unrolling | sum_reduction_device_function.cu |
| CUDA Crash Course: Sum Reduction Part 6 | Cooperative Groups, Synchronization, Atomic Instructions | sum_reduction_cooperative_groups.cu |

Convolution

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: Naive 1-D Convolution | 1-D Convolution | convolution.cu |
| CUDA Crash Course: 1-D Convolution with Constant Memory | Constant Memory, Constant Cache | convolution.cu |
| CUDA Crash Course: Tiled 1-D Convolution | Shared Memory, Tiling | convolution.cu |
| CUDA Crash Course: 1-D Convolution Cache Simplification | Shared Memory, Tiling, Programmability | convolution.cu |
| CUDA Crash Course: 2-D Convolution | 2-D Convolution, Multi-Dimensional Thread Blocks | convolution.cu |

Histogram

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: Optimizing Histogram Kernels | Global Atomics, Shared Memory Atomics, Histograms, GNU Plot | histogram.cu, histogram.cu |

Misc. Topics

| Video | Concepts | Files |
| --- | --- | --- |
| CUDA Crash Course: Video Corrections | TB Calculations, Verification | vector_add.cu, matrix_mul.cu |
