Computational Mathematics . org


Calling CUDA Functions from Fortran

Author: Austen C. Duffy, Florida State University

CUDA functions can be called directly from Fortran programs by using a kernel wrapper, as long as some simple rules are followed.

1. Data Types: Make sure you use equivalent data types; these basically follow the usual Fortran --> C conventions. Declare your Fortran integers and reals with explicit sizes. Note that integer*2 is a short int in C; I have had a lot of problems trying to use these, so I would suggest using integer*4 instead.

integer*4 --> int
real*4 --> float
real*8 --> double
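
The table above assumes the usual sizes for the C types. As a minimal sketch, a check like the following can be compiled alongside the wrapper to confirm that on a given machine; the function name fortran_sizes_match is made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUDA or Fortran): returns 1 when the C
   types have the byte sizes that integer*4, real*4 and real*8 expect. */
int fortran_sizes_match(void)
{
    return sizeof(int) == 4       /* integer*4 */
        && sizeof(float) == 4     /* real*4    */
        && sizeof(double) == 8;   /* real*8    */
}
```

If this returns 0, the mapping above does not hold for that compiler and the declarations on one side need to change.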

2. Function Names: Fortran compilers such as gfortran append an underscore to external procedure names, so you need to account for this in your CUDA code. E.g. a call to 'kernel_wrapper( )' in Fortran is changed to 'kernel_wrapper_( )' when the Fortran code is compiled, so your CUDA wrapper function should be named 'kernel_wrapper_( )' instead. This does not apply to the CUDA kernels themselves, since they are not called from the Fortran code.
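
A rough sketch of what the wrapper signature might look like follows, assuming gfortran's default naming and a Fortran call of the form 'call kernel_wrapper(a, n)'. The argument list is an assumption for illustration. Fortran passes arguments by reference, so every parameter is a pointer, and since nvcc compiles .cu files as C++, the extern "C" guard keeps the symbol name unmangled. The loop is only a host-side stand-in for the device copy and kernel launch that a real wrapper would perform.

```c
#include <stddef.h>

#ifdef __cplusplus
extern "C" {   /* nvcc treats .cu files as C++; keep the plain C symbol name */
#endif

/* Assumed Fortran side: call kernel_wrapper(a, n)
   with real*4 a(n) and integer*4 n. Note the trailing underscore. */
void kernel_wrapper_(float *a, const int *n)
{
    /* Stand-in for cudaMemcpy + kernel launch: double each element. */
    for (int i = 0; i < *n; ++i) {
        a[i] *= 2.0f;
    }
}

#ifdef __cplusplus
}
#endif
```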

3. Arrays: Fortran and C use opposite storage orders: Fortran is column-major and C is row-major, i.e. Fortran array(i,j,k) is equivalent to C array[k][j][i]. Fortran arrays are stored in contiguous linear memory, however, and so will be passed as 1-D arrays to CUDA. With 1-based Fortran indices and 0-based C indices:
 Fortran array A(i,j,k) -> CUDA array A[(k-1)*NX*NY+(j-1)*NX+(i-1)]

Where NX, NY and NZ are the sizes of the x, y and z dimensions respectively. The arrays will be returned to Fortran as their original 3D versions.
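
The index mapping can be sketched as a small helper, assuming the Fortran array is dimensioned A(NX,NY,NZ) with the default lower bound of 1. The name fortran_index is hypothetical; the same arithmetic could be used inside a kernel.

```c
#include <stddef.h>

/* Hypothetical helper: map 1-based Fortran indices (i, j, k) of an array
   dimensioned A(NX, NY, NZ) to the 0-based offset in the flat 1-D buffer.
   Column-major order: i varies fastest, then j, then k. */
size_t fortran_index(int i, int j, int k, int nx, int ny)
{
    return (size_t)(k - 1) * nx * ny
         + (size_t)(j - 1) * nx
         + (size_t)(i - 1);
}
```

For example, with NX = 4 and NY = 3, A(1,2,1) sits at offset 4, directly after the four elements A(1,1,1) through A(4,1,1).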

4. Compilation: To compile, first use the nvcc compiler to create an object file from the .cu file using the -c option, e.g. 'nvcc -c cudatest.cu' will create a cudatest.o file. Then compile your Fortran code, making sure to link to the CUDA libraries (-L) and includes (-I) on your machine, e.g.

nvcc -c cudatest.cu
gfortran fortest.f95 cudatest.o -L /usr/local/cuda/lib -I /usr/local/cuda/include -lcudart -lcuda

The CUDA libraries and includes may be in a different location on your machine. Note that if your code runs in double precision, you will need to add the nvcc compiler option -arch sm_13, which requires a GPU of compute capability 1.3 or higher.

A sample code set, complete with makefile, demonstrating items 1, 2 and 4 above is on the next page.

Code Example with Makefile