## Saturday, June 3, 2017

### First look into pycuda

In my last post I failed to get a sample working, so today I thought I'd give pycuda a try. So what is pycuda?

PyCUDA gives you easy, Pythonic access to Nvidia's CUDA parallel computation API.

With that said, I'm gonna give the sample code a try. Let's install the python3 pycuda module.

 user@localhost:~/Downloads$ sudo apt-get install python3-pycuda
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libgl1-nvidia-glx:i386 libgl1-nvidia-glx-i386:i386 libllvm3.5v5 libnvidia-glcore:i386 linux-image-4.1.0-2-amd64 linux-image-4.2.0-1-amd64 linux-source-4.3 python3-ecdsa syslinux unetbootin-translations
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  fonts-mathjax libboost-python1.61.0 libboost-system1.61.0 libboost-thread1.61.0 libjs-mathjax python-pycuda-doc python3-appdirs python3-decorator python3-pytools
Suggested packages:
  fonts-mathjax-extras fonts-stix libjs-mathjax-doc python-pycuda python3-pytest python3-opengl python3-pycuda-dbg
The following NEW packages will be installed:
  fonts-mathjax libboost-python1.61.0 libboost-system1.61.0 libboost-thread1.61.0 libjs-mathjax python-pycuda-doc python3-appdirs python3-decorator python3-pycuda python3-pytools
0 upgraded, 10 newly installed, 0 to remove and 508 not upgraded.
Need to get 7,150 kB of archives.
After this operation, 47.8 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://ftp.us.debian.org/debian testing/main amd64 fonts-mathjax all 2.6.1-1 [959 kB]
Get:2 http://ftp.us.debian.org/debian testing/main amd64 libboost-python1.61.0 amd64 1.61.0+dfsg-2.1 [137 kB]
Get:3 http://ftp.us.debian.org/debian testing/main amd64 libboost-system1.61.0 amd64 1.61.0+dfsg-2.1 [32.1 kB]
Get:4 http://ftp.us.debian.org/debian testing/main amd64 libboost-thread1.61.0 amd64 1.61.0+dfsg-2.1 [71.2 kB]
Get:5 http://ftp.us.debian.org/debian testing/main amd64 libjs-mathjax all 2.6.1-1 [5,473 kB]
Get:6 http://ftp.us.debian.org/debian testing/contrib amd64 python-pycuda-doc all 2016.1-1 [122 kB]
Get:7 http://ftp.us.debian.org/debian testing/main amd64 python3-appdirs all 1.4.0-2 [11.1 kB]
Get:8 http://ftp.us.debian.org/debian testing/main amd64 python3-decorator all 4.0.6-1 [12.8 kB]
Get:9 http://ftp.us.debian.org/debian testing/main amd64 python3-pytools all 2016.2.1-1 [33.9 kB]
Get:10 http://ftp.us.debian.org/debian testing/contrib amd64 python3-pycuda amd64 2016.1-1+b2 [298 kB]
Fetched 7,150 kB in 1min 23s (85.4 kB/s)
Selecting previously unselected package fonts-mathjax.
(Reading database ... 281119 files and directories currently installed.)
Preparing to unpack .../fonts-mathjax_2.6.1-1_all.deb ...
Unpacking fonts-mathjax (2.6.1-1) ...
Selecting previously unselected package libboost-python1.61.0.
Preparing to unpack .../libboost-python1.61.0_1.61.0+dfsg-2.1_amd64.deb ...
Unpacking libboost-python1.61.0 (1.61.0+dfsg-2.1) ...
Selecting previously unselected package libboost-system1.61.0:amd64.
Preparing to unpack .../libboost-system1.61.0_1.61.0+dfsg-2.1_amd64.deb ...
Unpacking libboost-system1.61.0:amd64 (1.61.0+dfsg-2.1) ...
Selecting previously unselected package libboost-thread1.61.0:amd64.
Preparing to unpack .../libboost-thread1.61.0_1.61.0+dfsg-2.1_amd64.deb ...
Unpacking libboost-thread1.61.0:amd64 (1.61.0+dfsg-2.1) ...
Selecting previously unselected package libjs-mathjax.
Preparing to unpack .../libjs-mathjax_2.6.1-1_all.deb ...
Unpacking libjs-mathjax (2.6.1-1) ...
Selecting previously unselected package python-pycuda-doc.
Preparing to unpack .../python-pycuda-doc_2016.1-1_all.deb ...
Unpacking python-pycuda-doc (2016.1-1) ...
Selecting previously unselected package python3-appdirs.
Preparing to unpack .../python3-appdirs_1.4.0-2_all.deb ...
Unpacking python3-appdirs (1.4.0-2) ...
Selecting previously unselected package python3-decorator.
Preparing to unpack .../python3-decorator_4.0.6-1_all.deb ...
Unpacking python3-decorator (4.0.6-1) ...
Selecting previously unselected package python3-pytools.
Preparing to unpack .../python3-pytools_2016.2.1-1_all.deb ...
Unpacking python3-pytools (2016.2.1-1) ...
Selecting previously unselected package python3-pycuda.
Preparing to unpack .../python3-pycuda_2016.1-1+b2_amd64.deb ...
Unpacking python3-pycuda (2016.1-1+b2) ...
Processing triggers for fontconfig (2.11.0-6.5) ...
Processing triggers for libc-bin (2.19-22) ...
Setting up fonts-mathjax (2.6.1-1) ...
Setting up libboost-python1.61.0 (1.61.0+dfsg-2.1) ...
Setting up libboost-system1.61.0:amd64 (1.61.0+dfsg-2.1) ...
Setting up libboost-thread1.61.0:amd64 (1.61.0+dfsg-2.1) ...
Setting up libjs-mathjax (2.6.1-1) ...
Setting up python-pycuda-doc (2016.1-1) ...
Setting up python3-appdirs (1.4.0-2) ...
Setting up python3-decorator (4.0.6-1) ...
Setting up python3-pytools (2016.2.1-1) ...
Setting up python3-pycuda (2016.1-1+b2) ...
Processing triggers for libc-bin (2.19-22) ...

Okay, we are all good. Let's start the python3 interpreter. By the way, I'm using Python 3.5.

 user@localhost:~$ python3
Python 3.5.2+ (default, Aug 5 2016, 08:07:14)
[GCC 6.1.1 20160724] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycuda.autoinit
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pycuda/autoinit.py", line 5, in <module>
    cuda.init()
pycuda._driver.RuntimeError: cuInit failed: no CUDA-capable device is detected



Ah crap. You would think that something is wrong with the library, but it simply did not detect a CUDA-capable GPU. For your information, my workstation has two GPUs, an Intel one and an NVIDIA one. It normally runs on the Intel GPU, which draws less power, and I have to explicitly enable the NVIDIA GPU whenever I need it. With that said, let's try it again.
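
A small aside before the retry: the error comes from `cuInit`, which `pycuda.autoinit` calls at import time. Here is a minimal sketch of a check you could run first. It assumes only `pycuda.driver.init()` and `pycuda.driver.Device.count()`; the helper name `cuda_device_available` is my own and not part of PyCUDA.

```python
def cuda_device_available():
    """Return True if PyCUDA imports and reports at least one CUDA device."""
    try:
        import pycuda.driver as drv

        drv.init()  # raises when no CUDA-capable device is detected
        return drv.Device.count() > 0
    except Exception:
        # Covers a missing pycuda install as well as a failed cuInit.
        return False


if __name__ == "__main__":
    print("CUDA device available:", cuda_device_available())
```

On a dual-GPU laptop like mine I would expect this to report `False` under plain `python3` and `True` once the NVIDIA GPU is enabled, though I have only verified the failing import shown above.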

 user@localhost:~$ optirun python3
Python 3.5.2+ (default, Aug 5 2016, 08:07:14)
[GCC 6.1.1 20160724] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycuda.autoinit
>>> import pycuda.driver as drv
>>> import numpy
>>>
>>> from pycuda.compiler import SourceModule
>>> mod = SourceModule("""
... __global__ void multiply_them(float *dest, float *a, float *b)
... {
...   const int i = threadIdx.x;
...   dest[i] = a[i] * b[i];
... }
... """)
>>> multiply_them = mod.get_function("multiply_them")
>>> a = numpy.random.randn(400).astype(numpy.float32)
>>> b = numpy.random.randn(400).astype(numpy.float32)
>>>
>>> dest = numpy.zeros_like(a)
>>> multiply_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400,1,1), grid=(1,1))
>>> print(dest-a*b)
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
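
To see what the kernel in this session is doing, here is a plain-Python sketch of the same launch: one block of 400 threads, each using its own `threadIdx.x` as the array index. The names `multiply_them_thread` and `launch_single_block` are hypothetical helpers for illustration, not PyCUDA API, and the loop runs the "threads" one after another on the CPU instead of in parallel.

```python
import random


def multiply_them_thread(thread_idx_x, dest, a, b):
    """Kernel body for a single thread: one element-wise product."""
    i = thread_idx_x
    dest[i] = a[i] * b[i]


def launch_single_block(kernel, block_dim_x, *args):
    """Emulate block=(block_dim_x,1,1), grid=(1,1) by looping over thread indices."""
    for thread_idx_x in range(block_dim_x):
        kernel(thread_idx_x, *args)


n = 400
a = [random.gauss(0.0, 1.0) for _ in range(n)]
b = [random.gauss(0.0, 1.0) for _ in range(n)]
dest = [0.0] * n

launch_single_block(multiply_them_thread, n, dest, a, b)

# Same idea as print(dest-a*b), condensed to a single number.
max_abs_diff = max(abs(d - x * y) for d, x, y in zip(dest, a, b))
print(max_abs_diff)
```

Because the index comes only from `threadIdx.x`, this pattern covers a single block, so it works here only because 400 elements fit within the per-block thread limit (1024 on this generation of GPU).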


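optirun is a command that runs a program on the discrete NVIDIA GPU on Debian. With it, the import succeeds and the sample works: the printed difference is all zeros, so the GPU result matches what NumPy computes. Brilliant. By the way, I'm using an NVIDIA 960M GPU. That's it for today.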