abracudabra

Convert dataframes, arrays, and tensors to CPU/CUDA.

Classes

Device

A device with a name and index.

Functions

to_array(sequence, /[, device, strict])

Convert an array, series, or dataframe to a NumPy or CuPy array.

to_dataframe(data, /[, index, device, strict])

Convert to a Pandas/cuDF dataframe.

to_series(sequence, /[, index, device, strict])

Convert an array or tensor to a Pandas/cuDF series.

to_tensor(sequence, /[, device, strict])

Convert an array, series, or dataframe to a Torch tensor.

to_device(…)

Move an array, series, or tensor to a device.

get_np_or_cp([device_type])

Get the numpy or cupy library based on the device type.

get_pd_or_cudf([device_type])

Get the pandas or cudf library based on the device type.

get_device(…)

Get the device of a NumPy/CuPy array or series.

Package Contents

abracudabra.to_array(sequence, /, device=None, *, strict=False)

Convert an array, series, or dataframe to a NumPy or CuPy array.

Parameters:
  • sequence (abracudabra._annotations.Array | abracudabra._annotations.Series | abracudabra._annotations.DataFrame | torch.Tensor) – The sequence to convert.

  • device (str | abracudabra.device.base.Device | None) – The device to convert the sequence to. If None, the sequence stays on the same device.

  • strict (bool) – Whether to raise an error if the sequence is not a valid type. Valid types are NumPy/CuPy arrays, Pandas/cuDF series and dataframes, and Torch tensors. If False, a sequence of another type is converted to a NumPy/CuPy array if possible; an error may still be raised if the conversion fails.

Returns:

A NumPy/CuPy array.

Raises:

TypeError – If the sequence is not a valid type and strict is True.

Return type:

abracudabra._annotations.Array
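
The effect of strict can be pictured with a minimal sketch. It does not use abracudabra: the standard-library array.array stands in for the valid input types, and the names VALID_TYPES and to_int_array are hypothetical. It only illustrates the documented contract, namely that a strict call rejects other types with TypeError while a non-strict call attempts a conversion that may itself fail.

```python
from array import array

# Hypothetical stand-in for the valid input types (arrays, series, tensors).
VALID_TYPES = (array,)


def to_int_array(sequence, *, strict=False):
    """Return `sequence` as an array of signed integers (illustrative only)."""
    if isinstance(sequence, VALID_TYPES):
        # Already a valid type: nothing to convert.
        return sequence
    if strict:
        # Strict mode rejects anything that is not a valid type.
        raise TypeError(f"Invalid type: {type(sequence).__name__}")
    # Non-strict mode: best-effort conversion, which may raise its own error.
    return array("q", sequence)


print(to_int_array([1, 2, 3]))  # the list is converted
try:
    to_int_array([1, 2, 3], strict=True)
except TypeError as error:
    print(f"TypeError: {error}")
```

In this sketch a list passes in non-strict mode and is rejected in strict mode, which matches the first to_array example, where a plain list is accepted with the default strict=False.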

Examples

Build a CuPy array from a sequence

>>> import cupy as cp
>>> from abracudabra import to_array
>>> cupy_array = to_array([1, 2, 3], "cuda:0")
>>> print(type(cupy_array))
<class 'cupy.ndarray'>

Build a NumPy array from a cuDF series (the device must be given, otherwise the data stays on the GPU)

>>> import cudf
>>> cudf_series = cudf.Series([1, 2, 3])
>>> numpy_array = to_array(cudf_series, "cpu")
>>> print(type(numpy_array))
<class 'numpy.ndarray'>

abracudabra.to_dataframe(data, /, index=None, device=None, *, strict=False, **kwargs)

Convert to a Pandas/cuDF dataframe.

Parameters:
  • data (collections.abc.Mapping[str, abracudabra._annotations.Array | torch.Tensor] | torch.Tensor | abracudabra._annotations.Array) – The data to convert. If a mapping, the keys will be used as column names.

  • index (abracudabra._annotations.Array | torch.Tensor | None) – The optional index for the dataframe.

  • device (str | abracudabra.device.base.Device | None) – The device to use for the dataframe. If not provided, the type is guessed from the data.

  • strict (bool) – Whether to raise an error if the provided data does not consist of NumPy/CuPy arrays or Torch tensors.

  • **kwargs (Any) – Additional keyword arguments for the dataframe.

Returns:

The converted dataframe.

Return type:

abracudabra._annotations.DataFrame

Examples

Build a dataframe from mixed data types

>>> import cupy as cp
>>> import numpy as np
>>> import torch
>>> from abracudabra import to_dataframe
>>> numpy_array = np.full((5,), 1, dtype=np.float32)
>>> cupy_array = cp.full((5,), 2, dtype=cp.int8)
>>> torch_tensor = torch.full((5,), 3, dtype=torch.float32, device="cuda:0")
>>> dataframe = to_dataframe(
...     {"numpy": numpy_array, "cupy": cupy_array, "torch": torch_tensor},
...     device="cuda:0",
... )
>>> print(dataframe)
   numpy  cupy  torch
0    1.0     2    3.0
1    1.0     2    3.0
2    1.0     2    3.0
3    1.0     2    3.0
4    1.0     2    3.0
>>> print(type(dataframe))
<class 'cudf.core.dataframe.DataFrame'>

abracudabra.to_series(sequence, /, index=None, device=None, *, strict=False, **kwargs)

Convert an array or tensor to a Pandas/cuDF series.

Parameters:
  • sequence (object) – The array or tensor to convert.

  • index (abracudabra._annotations.Array | torch.Tensor | None) – The optional index for the series.

  • device (str | abracudabra.device.base.Device | None) – The device to use for the series. If not provided, the array stays on the same device.

  • strict (bool) – Whether to raise an error if the sequence is not a NumPy/CuPy array or Torch tensor.

  • **kwargs (Any) – Additional keyword arguments for the series.

Returns:

The converted series.

Return type:

abracudabra._annotations.Series

Examples

Convert a list to a cuDF series

>>> from abracudabra import to_series
>>> series = to_series([10, 20, 30], device="cuda")
>>> print(type(series))
<class 'cudf.core.series.Series'>

Convert a CuPy array to a cuDF series

>>> import cupy as cp
>>> cupy_array = cp.array([40, 50, 60])
>>> series = to_series(cupy_array)
>>> print(type(series))
<class 'cudf.core.series.Series'>

abracudabra.to_tensor(sequence, /, device=None, *, strict=False)

Convert an array, series, or dataframe to a Torch tensor.

Parameters:
  • sequence (abracudabra._annotations.Array | abracudabra._annotations.Series | torch.Tensor) – The sequence to convert.

  • device (abracudabra.device.base.Device | str | None) – The device to convert the sequence to. If None, the sequence stays on the same device.

  • strict (bool) – Whether to raise an error if the sequence is not a valid type. Valid types are NumPy/CuPy arrays, Pandas/cuDF series and dataframes, and Torch tensors. If False, a sequence of another type is converted to a Torch tensor if possible; an error may still be raised if the conversion fails.

Returns:

A Torch tensor.

Raises:

TypeError – If the sequence is not a valid type and strict is True.

Return type:

torch.Tensor

Examples

Build a Torch tensor from a sequence

>>> import torch
>>> from abracudabra import to_tensor
>>> to_tensor([1, 2, 3])
tensor([1, 2, 3])

Build a Torch tensor from a CuPy array

>>> import cupy as cp
>>> cupy_array = cp.array([4, 5, 6])
>>> torch_tensor = to_tensor(cupy_array)
>>> print(torch_tensor)
tensor([4, 5, 6], device='cuda:0')

class abracudabra.Device

Bases: NamedTuple

A device with a name and index.

type: DeviceType

The device type, e.g., "cpu" or "cuda".

idx: int | None = None

The device index, e.g., 0 or None.

__str__()

Return the device name.

Return type:

str

classmethod validate(device, idx=None)

Return a device, validating the device type and index.

Parameters:
  • device (object) – The device type.

  • idx (object | None) – The optional device index.

Returns:

The device.

Return type:

Device

classmethod from_str(device, /)

Return a device from a string.

The string should be in the format "device[:idx]".

Examples

>>> from abracudabra import Device
>>> Device.from_str("cpu")
Device(type='cpu', idx=None)
>>> Device.from_str("cuda:1")
Device(type='cuda', idx=1)

Parameters:

device (str)

Return type:

Device

classmethod parse(device, /)

Return a device from a string or device.

If the input is already a device, it is returned as is. Otherwise, the input is parsed as a string.

Parameters:

device (str | Device | torch.device) – The device or device string (e.g., "cpu" or "cuda:1").

Returns:

The device.

Return type:

Device
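
The "device[:idx]" string format and the pass-through behaviour of parse can be sketched in a few lines. This is a minimal stand-in, not the library's implementation: the class name SketchDevice, the accepted type names, and the choice of ValueError for invalid input are all assumptions, and the torch.device input that parse also accepts is left out.

```python
from typing import NamedTuple, Optional, Union


class SketchDevice(NamedTuple):
    """Hypothetical stand-in for abracudabra.Device (not the library's code)."""

    type: str
    idx: Optional[int] = None

    def __str__(self) -> str:
        # Assumed name format: "cpu" without an index, "cuda:1" with one.
        return self.type if self.idx is None else f"{self.type}:{self.idx}"

    @classmethod
    def from_str(cls, device: str) -> "SketchDevice":
        # Split "device[:idx]" at the first colon; the index part is optional.
        name, separator, raw_idx = device.partition(":")
        if name not in ("cpu", "cuda"):
            raise ValueError(f"Unknown device type: {name!r}")
        if not separator:
            return cls(name, None)
        if not raw_idx.isdecimal():
            raise ValueError(f"Invalid device index: {raw_idx!r}")
        return cls(name, int(raw_idx))

    @classmethod
    def parse(cls, device: Union[str, "SketchDevice"]) -> "SketchDevice":
        # An existing device is returned as is; anything else is parsed as a string.
        if isinstance(device, cls):
            return device
        return cls.from_str(device)


device = SketchDevice.parse("cuda:1")
print(device.type, device.idx, str(device))
```

Because parse returns an existing device unchanged, functions that accept either a string or a device can call it once at the top and work with a single type afterwards.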

to_torch()

Return a torch device.

Examples

>>> from abracudabra import Device
>>> Device("cpu", None).to_torch()
device(type='cpu')
>>> Device("cuda", 1).to_torch()
device(type='cuda', index=1)

Return type:

torch.device

abracudabra.to_device(sequence: abracudabra._annotations.Series, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Series
abracudabra.to_device(sequence: torch.Tensor, /, device: abracudabra.device.base.Device | str) → torch.Tensor
abracudabra.to_device(sequence: abracudabra._annotations.DataFrame, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.DataFrame
abracudabra.to_device(sequence: abracudabra._annotations.Index, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Index
abracudabra.to_device(sequence: abracudabra._annotations.Array, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Array
abracudabra.to_device(sequence: abracudabra._annotations.Array | abracudabra._annotations.Series | torch.Tensor, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Index | abracudabra._annotations.Series | abracudabra._annotations.DataFrame | abracudabra._annotations.Array | torch.Tensor

Move an array, series, or tensor to a device.

The appropriate function is called for the type of the element to move it to the device.

Parameters:
  • sequence – The sequence to move to the device.

  • device – The device to move the sequence to.

Returns:

The sequence on the specified device.

Raises:

TypeError – If the sequence is not a NumPy/CuPy array, Pandas/cuDF index/series/dataframe or Torch tensor.
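
Dispatching on the type of the input, with a TypeError for anything unsupported, can be sketched as follows. How abracudabra selects the function internally is not shown in this reference; functools.singledispatch is used here only as one possible technique, and CpuArray, GpuArray, and move are hypothetical stand-ins that need no GPU.

```python
from functools import singledispatch


class CpuArray(list):
    """Hypothetical CPU container standing in for a NumPy array."""


class GpuArray(list):
    """Hypothetical GPU container standing in for a CuPy array."""


@singledispatch
def move(sequence, device: str):
    # Fallback for unsupported types, mirroring the documented TypeError.
    raise TypeError(f"Unsupported type: {type(sequence).__name__}")


@move.register
def _(sequence: CpuArray, device: str):
    # A CPU container is copied into the GPU stand-in when "cuda" is requested.
    return GpuArray(sequence) if device.startswith("cuda") else sequence


@move.register
def _(sequence: GpuArray, device: str):
    # A GPU container is copied into the CPU stand-in when "cpu" is requested.
    return CpuArray(sequence) if device == "cpu" else sequence


on_gpu = move(CpuArray([1, 2, 3]), "cuda")
print(type(on_gpu).__name__)
```

Each registered function returns the counterpart of its input type, which is the relationship the overloads above express: an array in gives an array out, a series in gives a series out, and so on.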

Examples

Move a Pandas dataframe to the GPU (cuDF dataframe):

>>> import pandas as pd
>>> from abracudabra import to_device
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
>>> df_gpu = to_device(df, "cuda")
>>> print(type(df_gpu))
<class 'cudf.core.dataframe.DataFrame'>

Move a cuDF dataframe to the CPU (Pandas dataframe):

>>> df_cpu = to_device(df_gpu, "cpu")
>>> print(type(df_cpu))
<class 'pandas.core.frame.DataFrame'>

Move a NumPy array to the GPU (CuPy array):

>>> import numpy as np
>>> arr = np.array([1, 2, 3])
>>> arr_gpu = to_device(arr, "cuda")
>>> print(type(arr_gpu))
<class 'cupy.ndarray'>

abracudabra.get_np_or_cp(device_type=None)

Get the numpy or cupy library based on the device type.

  • if device_type is "cpu", return the numpy library

  • if device_type is "cuda", return the cupy library

If device_type is not specified, return the numpy library (default).

Examples

>>> from abracudabra import get_np_or_cp
>>> device_type = "cuda"  # in some configuration for example
>>> np_or_cp = get_np_or_cp(device_type)
>>> np_or_cp.random.choice([1, 2, 3], size=1)  # returns a cupy array
array([3])

Parameters:

device_type (abracudabra.device.base.DeviceType | None)

Return type:

types.ModuleType
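
The selection rule above (NumPy for "cpu" or an unspecified device type, CuPy for "cuda") can be sketched with a lookup table and a lazy import. This is an illustration under assumptions, not the library's code: ARRAY_LIBRARIES, array_library_name, and load_array_library are hypothetical names, and raising ValueError for an unknown device type is a guess.

```python
import importlib
from types import ModuleType
from typing import Optional

# Assumed mapping from device type to the name of the array library.
ARRAY_LIBRARIES = {"cpu": "numpy", "cuda": "cupy"}


def array_library_name(device_type: Optional[str] = None) -> str:
    """Return the library name for a device type, defaulting to NumPy."""
    if device_type is None:
        return ARRAY_LIBRARIES["cpu"]
    try:
        return ARRAY_LIBRARIES[device_type]
    except KeyError:
        raise ValueError(f"Unknown device type: {device_type!r}") from None


def load_array_library(device_type: Optional[str] = None) -> ModuleType:
    """Import the selected library lazily, so CuPy is only needed for "cuda"."""
    return importlib.import_module(array_library_name(device_type))


print(array_library_name(), array_library_name("cuda"))
```

Importing inside the function rather than at module level means that a CPU-only installation never has to import CuPy, which is the main reason to route library access through a selector of this kind.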

abracudabra.get_pd_or_cudf(device_type=None)

Get the pandas or cudf library based on the device type.

  • if device_type is "cpu", return the pandas library

  • if device_type is "cuda", return the cudf library

If device_type is not specified, return the pandas library (default).

Examples

>>> from abracudabra import get_pd_or_cudf
>>> pd_or_cudf = get_pd_or_cudf("cpu")
>>> pd_or_cudf.Series([1, 2, 3])  # returns a pandas series
0    1
1    2
2    3
dtype: int64

Parameters:

device_type (abracudabra.device.base.DeviceType | None)

Return type:

types.ModuleType

abracudabra.get_device(element: abracudabra._annotations.Array | torch.Tensor, /, *, raise_if_unknown: Literal[True] = ...) → abracudabra.device.base.Device
abracudabra.get_device(element: abracudabra._annotations.Array | torch.Tensor, /, *, raise_if_unknown: bool = ...) → abracudabra.device.base.Device | None

Get the device of a NumPy/CuPy array or series.

Parameters:
  • element – The element to check.

  • raise_if_unknown – Whether to raise an error if the element is not a known array or tensor.

Returns:

The device of the element.
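
The two overloads say that the result is always a device when raise_if_unknown is True and may be None otherwise. A minimal sketch of that contract follows. It is hypothetical throughout: guess_device returns a plain string instead of a Device, it guesses the device from the package that defines the element's type (an assumed technique, not the library's documented one), it assumes the flag defaults to True, and it ignores Torch tensors, whose device would be read from the tensor itself.

```python
from typing import Literal, Optional, overload


@overload
def guess_device(element: object, *, raise_if_unknown: Literal[True] = ...) -> str: ...


@overload
def guess_device(element: object, *, raise_if_unknown: bool = ...) -> Optional[str]: ...


def guess_device(element, *, raise_if_unknown=True):
    """Guess "cpu" or "cuda" from the top-level package that defines the type."""
    package = type(element).__module__.split(".")[0]
    if package in ("numpy", "pandas"):
        return "cpu"
    if package in ("cupy", "cudf"):
        return "cuda"
    if raise_if_unknown:
        # Unknown type and the caller asked for an error.
        raise TypeError(f"Unknown element type: {type(element).__name__}")
    # Unknown type, but the caller prefers None over an exception.
    return None


print(guess_device([1, 2, 3], raise_if_unknown=False))
```

Passing raise_if_unknown=False suits code that accepts arbitrary inputs and falls back to a default device, while the raising variant lets a type checker treat the result as never being None.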

Examples

>>> import numpy as np
>>> from abracudabra import get_device
>>> array = np.random.rand(3)
>>> get_device(array)
Device(type='cpu', idx=None)
>>> import torch
>>> tensor = torch.rand(3, device="cuda")
>>> get_device(tensor)
Device(type='cuda', idx=0)