Usage

pyClesperanto is a GPU-accelerated image processing library for Python. To get started, import the library:

import pyclesperanto as cle

Warning

If you encounter an error at this stage, it is likely that the OpenCL driver is not installed or that you do not have an OpenCL-compatible device. Please check the installation of OpenCL and drivers, as well as the compatibility of your system devices with OpenCL.

Next, you can explore the devices available on your computer using the list_available_devices() function or get more detailed information with the info() function:

# Return the name of all available devices on the computer
print(cle.list_available_devices())

# Return the name, index, and information on all the devices
print(cle.info())

To work on a specific device, you need to select it. By default, the last device found will be automatically selected for convenience at import time. You can check which device you are currently working on with the get_device() function, and select another one using the select_device() function:

# Query the current device
print(cle.get_device())

# Select a device by name, substring, or index
cle.select_device("NVIDIA RTX 4090") # full name
cle.select_device("TX")              # substring
cle.select_device(0)                 # device index

Memory transfer

GPU devices have their own memory, which is separate from the computer’s memory. Therefore, you need to transfer your data to the device memory to process it and back to the computer memory to read the results. This is commonly known as copy from host to device and copy from device to host. In pyclesperanto, we use the functions push, pull, and create to manage memory transfer:

push is used to transfer/copy data from the host to the device.
pull is used to transfer/copy data from the device to the host.
create is used to allocate empty space on the device, which will be used, for example, to store a result.

These copy operations are costly in terms of time as they scale with the data size. Therefore, it is good practice to avoid them as much as possible once you are optimizing your code.

Create

As there is no data transfer, we only need to specify the size of the image we want to create. The image will be created on the GPU and will be empty. The size of the image is specified as a tuple of integers following the numpy convention zyx.

# Create an empty image on the GPU of size 100x100
gpu_image = cle.create((100, 100))

By default, this will create a 32-bit float space. You can specify the type of the image by passing a dtype argument:

# Create an empty image on the GPU of size 100x100 with uint8 data type
gpu_image = cle.create((100, 100), dtype=np.uint8)

It is also possible to use another image as a template to create the new image. This will copy the size and the data type of the template image.

# Create an empty image on the GPU with the same size and data type as the template image
gpu_image = cle.create_like(template_image)

Push

The push function will create a memory space on the GPU like create and then transfer the data array from the host to this new memory space on the device. The data array is expected to be a numpy array or share the same interface as a numpy array (e.g., dask array).

arr = np.random.random((100, 100)).astype(np.float32)
# Push arr to the GPU
gpu_image = cle.push(arr)

The data pushed will keep the same data type as the array. Hence, if you push a uint8 array, the data will be stored as uint8 on the GPU. The array will then use 4 times less memory than if it was stored as float32. This is a good practice to keep in mind when working with GPUs as their memory can be limited.

Warning

pyclesperanto does not support 64-bit data types such as int64 or float64, which are the default data types in Python. This is to ensure full compatibility with most GPU devices. Hence, precision might be lost when converting the data type to 32-bit.

Pull

The pull function transfers data from the GPU back to the host. It will be returned as a numpy array.

# Pull gpu_image to the host
arr = cle.pull(gpu_image)

The data type of the array will be the same as the data type of the image on the GPU.

Free memory

Because memory on the GPU can be limited, it is beneficial to free memory when it is no longer needed. In pyclesperanto, you can free memory similarly to Python using the del keyword.

# Free the memory of the image on the GPU
del gpu_image

Apply operations on images

In pyclesperanto, most functions are filters or mathematical operations on images. We tried to keep the API as simple as possible with a standard convention for all the functions.

cle.function_name(input, output, arg0, arg1, ...)

The output memory is part of the function signature because the GPU cannot allocate memory by itself; you need to specify the output memory space in which it will write the result. For example, to apply a filter such as a Gaussian blur, you need to specify the following code:

# Push an image to the GPU
gpu_input = cle.push(cpu_image)
# Create an output of the same size as the input
gpu_output = cle.create(cpu_image.shape)
# Apply a Gaussian blur with sigma_x=2 and sigma_y=2
cle.gaussian_blur(gpu_input, gpu_output, sigma_x=2, sigma_y=2)
# Pull back the result to the host memory
result = cle.pull(gpu_output)

Even though it can be a bit tedious, this code provides total control over the data and memory being processed. Now, it is also possible to let pyclesperanto manage some of the memory operations, like the push and create of the input and output, making your code shorter. However, we will rely here on the default behavior of the functions, which might not be the most efficient in terms of memory usage in some cases.

# Apply a Gaussian blur directly on a numpy array and save the result in a pyclesperanto array
gpu_output = cle.gaussian_blur(cpu_image, sigma_x=2, sigma_y=2)
# Pull back the result to the host
result = cle.pull(gpu_output)

Here, the cpu_image is pushed to the GPU and the output GPU space is created automatically when calling the operation gaussian_blur. The function will return a pyclesperanto array. Memory transfers are still applied in the background, but the user does not have to worry about it.

Pipeline of operations

Most operations in pyclesperanto are filters. This means that you can chain them together to create a pipeline of operations. For example, to apply a Gaussian blur followed by a threshold, you can write the following code:

# Apply a Gaussian blur
gpu_input = cle.push(cpu_image)
gpu_output = cle.create(cpu_image.shape)
cle.gaussian_blur(gpu_input, gpu_output, sigma_x=2, sigma_y=2)
blurred = cle.pull(gpu_output)

# Apply a threshold
gpu_input = cle.push(blurred)
gpu_output = cle.create(blurred.shape)
cle.greater_constant(gpu_output, gpu_output, constant=0.5)
binarized = cle.pull(gpu_output)

Although this code is correct, it is not optimal due to the push and pull in between the two operations. This code is good for prototyping as it allows you to inspect the result of each operation. But in the final version of the code, it is better to chain the operations together to avoid the memory transfers (e.g., push and pull).

# Apply a Gaussian blur
gpu_blurred = cle.gaussian_blur(cpu_image, sigma_x=2, sigma_y=2)
# Apply a threshold
gpu_binarized = cle.greater_constant(gpu_blurred, constant=0.5)
# Read the output on host
binarized = cle.pull(gpu_binarized)

Here we only use push at the beginning, inside the gaussian_blur operation, and pull at the end of the pipeline when we need to access the data from the CPU. The create operation for output data is done automatically inside each operation.