Updated CUDA basics (markdown)
## TensorIterator kernels
* You can use `torch.cuda.set_sync_debug_mode` to warn or error out on CUDA synchronizations if you are trying to understand where synchronizations come from in your workload, or to catch operations that synchronize accidentally (a minimal sketch follows this list).
* Use the PyTorch built-in profiler (Kineto) or Nsight Systems (`nsys`) to get information on GPU utilization and the most time-consuming kernels (see the profiling example after this list).
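
A minimal sketch of the sync debug mode, assuming a CUDA-capable device is available; the `.item()` call here stands in for whatever accidental synchronization you are hunting down:

```python
import torch

# Warn on every operation that forces a CUDA synchronization
# (use "error" to raise instead, or "default" to turn the check off).
torch.cuda.set_sync_debug_mode("warn")

x = torch.randn(1000, device="cuda")

# .item() copies a scalar back to the host, which synchronizes the device,
# so this line should emit a warning pointing at the offending call.
total = x.sum().item()

# Restore normal behaviour once the synchronization has been found.
torch.cuda.set_sync_debug_mode("default")
```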
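A minimal profiling sketch with the built-in (Kineto-based) profiler, assuming a CUDA device; the linear layer and tensor sizes are placeholders for your own workload:

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(1024, 1024).cuda()
inp = torch.randn(64, 1024, device="cuda")

# Record CPU and CUDA activity for a few iterations.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    for _ in range(10):
        model(inp)
    torch.cuda.synchronize()  # make sure all kernels finish inside the profiled region

# Show the most time-consuming kernels by total CUDA time.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```

For a whole-program timeline, the same script can instead be run under Nsight Systems, e.g. `nsys profile -o trace python your_script.py` (the output name and script are placeholders).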
See the linked slides in https://www.nvidia.com/en-us/on-demand/session/gtc24-s62191/ for a high-level overview of CUDA programming.
Go to N1023015 to do the TensorIterator CUDA perf lab.
## Next
Unit 7: Data (Optional) - [[Data Basics|Data-Basics]]