Usage Guide

Abracudabra is a Python library that simplifies conversions between arrays, dataframes, series, and tensors across CPU (NumPy/pandas/PyTorch) and CUDA (CuPy/cuDF/PyTorch) environments.

Supported Data Types and Libraries

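| Data Object | CPU              | CUDA           |
|-------------|------------------|----------------|
| Array       | numpy.ndarray    | cupy.ndarray   |
| Series      | pandas.Series    | cudf.Series    |
| DataFrame   | pandas.DataFrame | cudf.DataFrame |
| Index       | pandas.Index     | cudf.Index     |
| Tensor      | torch.Tensor     | torch.Tensor   |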

Device Management

Abracudabra represents a device with the Device object, which has two fields:

  • type ("cpu" or "cuda")

  • idx (optional integer, e.g., 0 for cuda:0)

[1]:
from abracudabra import Device

cpu_device = Device(type="cpu")
cuda_device = Device(type="cuda", idx=0)

print(f"CPU device: {cpu_device}")
print(f"CUDA device: {cuda_device}")
CPU device: cpu
CUDA device: cuda:0
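
The examples in this guide also pass devices as strings such as "cuda:0". The sketch below shows one way such a string could map onto a type and an optional index. It is a standalone illustration only: `DeviceSpec` and `parse_device` are hypothetical names and are not part of Abracudabra, whose own parsing may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceSpec:
    """Hypothetical stand-in for a device descriptor (not abracudabra.Device)."""

    type: str
    idx: Optional[int] = None

    def __str__(self) -> str:
        # "cpu" when there is no index, "cuda:0" when there is one.
        return self.type if self.idx is None else f"{self.type}:{self.idx}"


def parse_device(text: str) -> DeviceSpec:
    """Parse "cpu", "cuda", or "cuda:<idx>" into a DeviceSpec."""
    device_type, sep, index = text.partition(":")
    if device_type not in ("cpu", "cuda"):
        raise ValueError(f"unknown device type: {device_type!r}")
    if not sep:
        return DeviceSpec(device_type)
    return DeviceSpec(device_type, int(index))


print(parse_device("cpu"))     # cpu
print(parse_device("cuda:0"))  # cuda:0
```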

Conversion Functions

Abracudabra provides high-level functions for data conversion:

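| Function     | Converts From                            | Converts To |
|--------------|------------------------------------------|-------------|
| to_array     | array, series, dataframe, tensor         | array       |
| to_tensor    | array, series, dataframe, tensor         | tensor      |
| to_series    | array, tensor                            | series      |
| to_dataframe | array, tensor, mapping of arrays/tensors | dataframe   |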

All functions accept an optional device parameter:

  • If specified, output data is moved to that device.

  • If not specified, data stays on its original device.
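
The device rule above, combined with the table of supported types, determines which class a conversion returns. The lookup below is a sketch of that reasoning, not Abracudabra code: `RESULT_TYPES` and `expected_result` are hypothetical names, and the class names are copied from the table of supported data types.

```python
from typing import Optional, Tuple

# (data object, device type) -> class name, copied from the supported types table.
RESULT_TYPES = {
    ("array", "cpu"): "numpy.ndarray",
    ("array", "cuda"): "cupy.ndarray",
    ("series", "cpu"): "pandas.Series",
    ("series", "cuda"): "cudf.Series",
    ("dataframe", "cpu"): "pandas.DataFrame",
    ("dataframe", "cuda"): "cudf.DataFrame",
    ("tensor", "cpu"): "torch.Tensor",
    ("tensor", "cuda"): "torch.Tensor",
}


def expected_result(
    kind: str, source_device: str, device: Optional[str] = None
) -> Tuple[str, str]:
    """Return (class name, device) expected for a conversion to `kind`."""
    # No explicit device: the data stays where it already is.
    target = source_device if device is None else device
    device_type = target.split(":")[0]
    return RESULT_TYPES[(kind, device_type)], target


# A CUDA tensor converted to an array without a device stays on CUDA.
print(expected_result("array", "cuda:0"))
# A CPU array converted to a series with device="cuda:0" moves to CUDA.
print(expected_result("series", "cpu", "cuda:0"))
```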

Example Usage

Convert a torch tensor to an array:

[2]:
import torch

from abracudabra import to_array

tensor = torch.rand(2, 3, device="cuda:0")
array = to_array(tensor)

print("type:", type(array))
type: <class 'cupy.ndarray'>

Convert an array to a series:

[3]:
import numpy as np

from abracudabra import to_series

array = np.ones((4,), dtype=np.float32)

series = to_series(array, device="cuda:0")
print(series)
print("type:", type(series))
0    1.0
1    1.0
2    1.0
3    1.0
dtype: float32
type: <class 'cudf.core.series.Series'>

Build a dataframe from mixed data types and devices:

[4]:
import cupy as cp
import numpy as np
import torch

from abracudabra import to_dataframe

numpy_array = np.full((5,), 1, dtype=np.float32)
cupy_array = cp.full((5,), 2, dtype=cp.int8)
torch_tensor = torch.full((5,), 3, dtype=torch.float32, device="cuda:0")

dataframe = to_dataframe(
    {"numpy": numpy_array, "cupy": cupy_array, "torch": torch_tensor}, device="cuda:0"
)

print(dataframe)
print("type:", type(dataframe))
   numpy  cupy  torch
0    1.0     2    3.0
1    1.0     2    3.0
2    1.0     2    3.0
3    1.0     2    3.0
4    1.0     2    3.0
type: <class 'cudf.core.dataframe.DataFrame'>

Device Management

Check the Device of an Object

Use get_device to determine the current device of an object:

[5]:
import numpy as np

from abracudabra import get_device

numpy_array = np.ones((4,), dtype=np.float32)

get_device(numpy_array)  # CPU device
[5]:
Device(type='cpu', idx=None)
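
To make the idea of device detection concrete, the sketch below guesses a device type from the library an object's class belongs to. This is an assumed heuristic for illustration, not how get_device is implemented: `guess_device_type` and the two library sets are hypothetical, and the sketch ignores device indices.

```python
# Assumed grouping of libraries by the device they store data on.
CPU_LIBRARIES = {"numpy", "pandas"}
CUDA_LIBRARIES = {"cupy", "cudf"}


def guess_device_type(obj: object) -> str:
    """Guess "cpu" or "cuda" from the top-level package of the object's class."""
    root = type(obj).__module__.split(".")[0]
    if root in CPU_LIBRARIES:
        return "cpu"
    if root in CUDA_LIBRARIES:
        return "cuda"
    if root == "torch":
        # Torch tensors can live on either device, so ask the tensor itself.
        return obj.device.type
    raise TypeError(f"unsupported object type: {type(obj).__name__}")
```

With NumPy installed, `guess_device_type(np.ones(4))` would take the first branch, because `numpy.ndarray` is defined in the `numpy` package.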

Move Data Between Devices

Use to_device to move data between devices:

[6]:
import numpy as np

from abracudabra import to_device

# Start from a CPU (NumPy) array so that the move to CUDA is meaningful
numpy_array = np.ones((4,), dtype=np.float32)

cupy_array = to_device(numpy_array, device="cuda:0")  # Move to CUDA device

print(cupy_array)
print("type:", type(cupy_array))
[1. 1. 1. 1.]
type: <class 'cupy.ndarray'>

Library Selection Helpers

Helper functions such as get_np_or_cp return the library that matches the target device.

This is particularly useful because these libraries intentionally share a common API, so the same code can run on either device.

Example:

[7]:
from abracudabra import get_np_or_cp

device_type = "cuda"

# Get numpy or cupy (here: cupy)
np_or_cp = get_np_or_cp(device_type)

# Create a numpy or cupy array (here: cupy)
array = np_or_cp.ones((4,), dtype=np_or_cp.float32)
print(type(array))
<class 'cupy.ndarray'>
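
A helper of this kind can be pictured as a small dispatcher that imports a module by device type. The sketch below is a guess at the general pattern, not Abracudabra's implementation: `select_library` and `ARRAY_LIBRARIES` are hypothetical names, and the lazy import is an assumption made so that CUDA-only packages are needed only when requested.

```python
import importlib
from types import ModuleType
from typing import Mapping

# Assumed mapping from device type to the array library's module name.
ARRAY_LIBRARIES = {"cpu": "numpy", "cuda": "cupy"}


def select_library(
    device_type: str, libraries: Mapping[str, str] = ARRAY_LIBRARIES
) -> ModuleType:
    """Import and return the module registered for `device_type`."""
    try:
        module_name = libraries[device_type]
    except KeyError:
        raise ValueError(f"unsupported device type: {device_type!r}") from None
    # Imported on demand, so a CPU-only machine never tries to import cupy.
    return importlib.import_module(module_name)
```

On a machine with NumPy installed, `select_library("cpu").ones((4,))` would build a NumPy array. Passing a different mapping, for example one naming pandas and cudf, would reuse the same dispatcher for dataframe libraries.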