CUDA syntax
https://docs.nvidia.com/cuda/cuda-c-programming-guide/#execution-configuration
https://blog.csdn.net/qq_39575835/article/details/83027440
kernel_func<<< Dg, Db, Ns, S >>>(...);
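In the execution configuration, Dg (type dim3) is the grid size in blocks, Db (type dim3) is the block size in threads, Ns (type size_t, optional, default 0) is the number of bytes of dynamic shared memory per block, and S (type cudaStream_t, optional, default 0) is the stream the kernel is launched on. A minimal sketch of a launch that uses all four follows; the kernel fill_index and the helpers blocks_for and launch_fill are hypothetical names made up for this note, and the block size of 256 is an arbitrary choice.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread stages its global 1D index in dynamic
// shared memory, then stores it to out[] if it is in range.
__global__ void fill_index(int *out, int n)
{
    extern __shared__ int scratch[];               // sized by Ns at launch
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    scratch[threadIdx.x] = i;
    if (i < n) out[i] = scratch[threadIdx.x];
}

// Number of blocks needed to cover n elements, rounded up.
inline unsigned int blocks_for(unsigned int n, unsigned int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Launch fill_index on a caller-provided stream and report any launch error.
// d_out is assumed to point to at least n ints of device memory.
inline cudaError_t launch_fill(int *d_out, int n, cudaStream_t S)
{
    dim3 Db(256);                                            // threads per block
    dim3 Dg(blocks_for(static_cast<unsigned int>(n), Db.x)); // blocks per grid
    size_t Ns = Db.x * sizeof(int);                          // shared bytes per block
    fill_index<<<Dg, Db, Ns, S>>>(d_out, n);
    return cudaGetLastError();
}
```

Components of a dim3 that are not given default to 1, so dim3 Db(256) means a 256 x 1 x 1 block. Passing 0 for S launches on the default stream.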
dim3 index
calc index
int idx = (gridDim.x * gridDim.y * blockIdx.z + gridDim.x * blockIdx.y + blockIdx.x) * blockDim.x * blockDim.y * blockDim.z + blockDim.x * blockDim.y * threadIdx.z + blockDim.x * threadIdx.y + threadIdx.x;
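As you can see, blockIdx is the block's coordinate within the grid (whose extent is gridDim), and threadIdx is the thread's coordinate within its block (whose extent is blockDim). gridDim and blockDim are of type dim3, blockIdx and threadIdx are of type uint3; both types have three unsigned int components x, y, z.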
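The index formula above can be read as three pieces: the linear block id, the number of threads per block, and the linear thread id inside the block. The sketch below splits it that way; flat_index and write_index are hypothetical names introduced only for illustration, and the helper is marked __host__ __device__ so the arithmetic can also be checked on the CPU.

```cuda
#include <cuda_runtime.h>

// Flatten a 3D block coordinate and a 3D thread coordinate into one index.
// grid / blockSize play the role of gridDim / blockDim,
// block / thread play the role of blockIdx / threadIdx.
__host__ __device__ inline unsigned int flat_index(dim3 grid, uint3 block,
                                                   dim3 blockSize, uint3 thread)
{
    // Linear id of the block within the grid (x varies fastest).
    unsigned int blockId = grid.x * grid.y * block.z
                         + grid.x * block.y
                         + block.x;
    // Total threads in one block.
    unsigned int threadsPerBlock = blockSize.x * blockSize.y * blockSize.z;
    // Linear id of the thread within its block (x varies fastest).
    unsigned int threadInBlock = blockSize.x * blockSize.y * thread.z
                               + blockSize.x * thread.y
                               + thread.x;
    return blockId * threadsPerBlock + threadInBlock;
}

// Hypothetical kernel: every thread writes its own flattened index.
// out is assumed to point to at least n unsigned ints of device memory.
__global__ void write_index(unsigned int *out, unsigned int n)
{
    unsigned int idx = flat_index(
        dim3(gridDim.x, gridDim.y, gridDim.z),
        make_uint3(blockIdx.x, blockIdx.y, blockIdx.z),
        dim3(blockDim.x, blockDim.y, blockDim.z),
        make_uint3(threadIdx.x, threadIdx.y, threadIdx.z));
    if (idx < n) out[idx] = idx;
}
```

As a check, a 2 x 3 x 4 grid of 4 x 2 x 2 blocks holds 24 * 16 = 384 threads, and the last thread (block (1,2,3), thread (3,1,1)) gets 23 * 16 + 15 = 383.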