xiAPI GPUDirect
The XIMEA API can use NVIDIA's GPUDirect RDMA feature on supported configurations.
Requirements
- x86_64 Linux system with an NVIDIA Quadro or Tesla GPU
- PCIe cameras (xiB, xiB-64, xiX, xiT families)
- XIMEA API package version 4.11.13 or later, installed with the -pcie option
- proprietary NVIDIA video drivers
- CUDA toolkit version 6 or later (tested with versions 7.5 and 9.1)
How to enable
First, enable GPUDirect support in the XIMEA kernel driver:
sudo /opt/XIMEA/src/ximea_cam_pcie/enable_gpudirect.sh
Then add the following lines to /etc/rc.local or an analogous file and reboot the system:
modprobe ximea_cam_pcie
major="$(grep ximea_cam_pcie /proc/devices | cut -f1 -d\ )"
for i in $(seq "$(lspci -nd deda:|wc -l)")
do
    minor="$((i - 1))"
    devname="/dev/ximea$(printf "%02d" ${minor})"
    mknod -m 660 "${devname}" c "${major}" "${minor}"
    chgrp plugdev "${devname}"
done
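After the reboot it can be useful to confirm that the device nodes created by the rc.local loop exist and are accessible before opening a camera. The following is a minimal sketch in C and is not part of xiAPI: the helper names ximea_node_path and ximea_node_accessible are hypothetical, and the path format simply mirrors the printf "%02d" pattern from the script. It assumes the application needs read and write access, which is what mode 660 grants to members of the plugdev group.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of xiAPI): build the device node path that
 * the rc.local loop creates for a given minor number, e.g. /dev/ximea00.
 * Returns 0 on success, -1 if the buffer is too small. */
int ximea_node_path(int minor, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "/dev/ximea%02d", minor);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: returns 1 if the node for this minor number exists
 * and the calling user can read and write it, 0 otherwise. */
int ximea_node_accessible(int minor)
{
    char path[32];
    if (ximea_node_path(minor, path, sizeof path) != 0)
        return 0;
    return access(path, R_OK | W_OK) == 0;
}
```

For example, ximea_node_accessible(0) checks /dev/ximea00, the node with minor number 0. A result of 0 suggests that the rc.local lines did not run or that the user is not in the plugdev group.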
Finally, set the relevant xiAPI parameters in your code:
xiSetParamInt(handle, XI_PRM_BUFFER_POLICY, XI_BP_UNSAFE);
xiSetParamInt(handle, XI_PRM_IMAGE_DATA_FORMAT, XI_FRM_TRANSPORT_DATA);
xiSetParamInt(handle, XI_PRM_TRANSPORT_DATA_TARGET, XI_TRANSPORT_DATA_TARGET_GPU_RAM);
// Lower the size of acquisition buffer because it's limited by GPU's BAR size, e.g. (256 - 32) MB
xiSetParamInt(handle, XI_PRM_ACQ_BUFFER_SIZE, 200000000);
Note that GPUDirect works only with the unsafe buffer policy and the transport data image format; the safe buffer policy and other image formats cannot be used.
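The acquisition buffer limit mentioned in the code comment above, (256 - 32) MB, can be turned into a value for XI_PRM_ACQ_BUFFER_SIZE with a small helper. This is only a sketch under stated assumptions: acq_buffer_limit_bytes is a hypothetical name and not part of xiAPI, "MB" is interpreted as MiB, and the 32 MB reserve is taken from the comment as-is, since the source does not say how to choose it for a particular GPU. The result is clamped to INT_MAX because xiSetParamInt takes an int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not part of xiAPI): upper bound in bytes for the
 * acquisition buffer, given the GPU BAR size and a reserve, both in MiB.
 * Returns 0 if the reserve leaves no room, and clamps to INT_MAX so the
 * value can be passed to xiSetParamInt. */
int acq_buffer_limit_bytes(int bar_mib, int reserved_mib)
{
    assert(bar_mib >= 0 && reserved_mib >= 0);
    if (reserved_mib >= bar_mib)
        return 0;
    int64_t bytes = (int64_t)(bar_mib - reserved_mib) * 1024 * 1024;
    return bytes > INT_MAX ? INT_MAX : (int)bytes;
}
```

With a 256 MiB BAR and a 32 MiB reserve this gives 234881024 bytes, so the 200000000 used above stays within the limit. Any value at or below the computed bound could be passed to xiSetParamInt(handle, XI_PRM_ACQ_BUFFER_SIZE, ...).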
How to use
When XI_PRM_TRANSPORT_DATA_TARGET is set to XI_TRANSPORT_DATA_TARGET_GPU_RAM, xiGetImage returns a pointer to GPU memory in the bp field of the XI_IMG structure.
This saves one cudaMemcpy operation per acquired frame, because the data no longer has to be copied from CPU memory to device memory.
GPU memory is allocated using cudaMalloc in the xiStartAcquisition function and deallocated (using cudaFree) in xiStopAcquisition.
Select the CUDA device you want to use with cudaSetDevice before calling xiStartAcquisition.
Sample application
Attached is an example demonstrating the xiAPI GPUDirect feature: xiCUDASample.tar.bz2.
It is based on 3_Imaging/histogram from the CUDA samples, which computes a 64-bin histogram on the GPU.
Results are displayed on the terminal as ASCII art.
The application also prints time measurements, so you can compare the time needed to run the computation with and without GPUDirect enabled.
Please refer to readme.txt, included in the tarball, for build instructions.
Further reading
More information about GPUDirect RDMA technology can be found on the NVIDIA website:
http://docs.nvidia.com/cuda/gpudirect-rdma/