abracudabra#
Convert dataframes, arrays, and tensors to CPU/CUDA.
Submodules#
Classes#
Device | A device with a name and index.
Functions#
to_array | Convert an array, series, or dataframe to a NumPy or CuPy array.
to_dataframe | Convert to a Pandas/cuDF dataframe.
to_series | Convert an array or tensor to a Pandas/cuDF series.
to_tensor | Convert an array, series, or dataframe to a Torch tensor.
to_device | Move an array, series, or tensor to a device.
get_np_or_cp | Get the NumPy or CuPy library based on the device type.
get_pd_or_cudf | Get the Pandas or cuDF library based on the device type.
get_device | Get the device of a NumPy/CuPy array or series.
Package Contents#
- abracudabra.to_array(sequence, /, device=None, *, strict=False)[source]#
Convert an array, series, or dataframe to a NumPy or CuPy array.
- Parameters:
sequence (abracudabra._annotations.Array | abracudabra._annotations.Series | abracudabra._annotations.DataFrame | torch.Tensor) – The sequence to convert.
device (str | abracudabra.device.base.Device | None) – The device to convert the sequence to. If None, the sequence stays on the same device.
strict (bool) – Whether to raise an error if the sequence is not a valid type. A NumPy/CuPy array, Pandas/cuDF series or dataframe, or Torch tensor are valid types. If False, the sequence is converted to a NumPy/CuPy array if possible, but it might raise an error if the conversion is not possible.
- Returns:
A NumPy/CuPy array.
- Raises:
TypeError – If the sequence is not a valid type and strict is True.
- Return type:
abracudabra._annotations.Array
Examples
Build a CuPy array from a sequence
>>> import cupy as cp
>>> from abracudabra import to_array
>>> cupy_array = to_array([1, 2, 3], "cuda:0")
>>> print(type(cupy_array))
<class 'cupy.ndarray'>
Build a NumPy array from a cuDF series
>>> import cudf
>>> cudf_series = cudf.Series([1, 2, 3])
>>> numpy_array = to_array(cudf_series, "cpu")
>>> print(type(numpy_array))
<class 'numpy.ndarray'>
- abracudabra.to_dataframe(data, /, index=None, device=None, *, strict=False, **kwargs)[source]#
Convert to a Pandas/cuDF dataframe.
- Parameters:
data (collections.abc.Mapping[str, abracudabra._annotations.Array | torch.Tensor] | torch.Tensor | abracudabra._annotations.Array) – The data to convert. If a mapping, the keys will be used as column names.
index (abracudabra._annotations.Array | torch.Tensor | None) – The optional index for the dataframe.
device (str | abracudabra.device.base.Device | None) – The device to use for the dataframe. If not provided, the type is guessed from the data.
strict (bool) – Whether to raise an error if the provided data does not consist of NumPy/CuPy arrays or Torch tensors.
**kwargs (Any) – Additional keyword arguments for the dataframe.
- Returns:
The converted dataframe.
- Return type:
abracudabra._annotations.DataFrame
Examples
Build a dataframe from mixed data types
>>> import cupy as cp
>>> import numpy as np
>>> import torch
>>> from abracudabra import to_dataframe
>>> numpy_array = np.full((5,), 1, dtype=np.float32)
>>> cupy_array = cp.full((5,), 2, dtype=cp.int8)
>>> torch_tensor = torch.full((5,), 3, dtype=torch.float32, device="cuda:0")
>>> dataframe = to_dataframe(
...     {"numpy": numpy_array, "cupy": cupy_array, "torch": torch_tensor},
...     device="cuda:0",
... )
>>> print(dataframe)
   numpy  cupy  torch
0    1.0     2    3.0
1    1.0     2    3.0
2    1.0     2    3.0
3    1.0     2    3.0
4    1.0     2    3.0
>>> print(type(dataframe))
<class 'cudf.core.dataframe.DataFrame'>
- abracudabra.to_series(sequence, /, index=None, device=None, *, strict=False, **kwargs)[source]#
Convert an array or tensor to a Pandas/cuDF series.
- Parameters:
sequence (object) – The array or tensor to convert.
index (abracudabra._annotations.Array | torch.Tensor | None) – The optional index for the series.
device (str | abracudabra.device.base.Device | None) – The device to use for the series. If not provided, the array stays on the same device.
strict (bool) – Whether to raise an error if the sequence is not a NumPy/CuPy array or Torch tensor.
**kwargs (Any) – Additional keyword arguments for the series.
- Returns:
The converted series.
- Return type:
abracudabra._annotations.Series
Examples
Convert a list to a cuDF series
>>> from abracudabra import to_series
>>> series = to_series([10, 20, 30], device="cuda")
>>> print(type(series))
<class 'cudf.core.series.Series'>
Convert a CuPy array to a cuDF series
>>> import cupy as cp
>>> cupy_array = cp.array([40, 50, 60])
>>> series = to_series(cupy_array)
>>> print(type(series))
<class 'cudf.core.series.Series'>
- abracudabra.to_tensor(sequence, /, device=None, *, strict=False)[source]#
Convert an array, series, or dataframe to a Torch tensor.
- Parameters:
sequence (abracudabra._annotations.Array | abracudabra._annotations.Series | torch.Tensor) – The sequence to convert.
device (abracudabra.device.base.Device | str | None) – The device to convert the sequence to. If None, the sequence stays on the same device.
strict (bool) – Whether to raise an error if the sequence is not a valid type. A NumPy/CuPy array, Pandas/cuDF series or dataframe, or Torch tensor are valid types. If False, the sequence is converted to a Torch tensor if possible, but it might raise an error if the conversion is not possible.
- Returns:
A Torch tensor.
- Raises:
TypeError – If the sequence is not a valid type and strict is True.
- Return type:
torch.Tensor
Examples
Build a Torch tensor from a sequence
>>> import torch
>>> from abracudabra import to_tensor
>>> to_tensor([1, 2, 3])
tensor([1, 2, 3])
Build a Torch tensor from a CuPy array
>>> import cupy as cp
>>> cupy_array = cp.array([4, 5, 6])
>>> torch_tensor = to_tensor(cupy_array)
>>> print(torch_tensor)
tensor([4, 5, 6], device='cuda:0')
- class abracudabra.Device[source]#
Bases: NamedTuple
A device with a name and index.
- type: DeviceType#
The device type, e.g., "cpu" or "cuda".
- idx: int | None = None#
The device index, e.g., 0 or None.
- classmethod validate(device, idx=None)[source]#
Return a device, validating the device type and index.
- Parameters:
device (object) – The device type.
idx (object | None) – The optional device index.
- Returns:
The device.
- Return type:
Device
- classmethod from_str(device, /)[source]#
Return a device from a string.
The string should be in the format "device[:idx]".
Examples
>>> from abracudabra import Device
>>> Device.from_str("cpu")
Device(type="cpu", idx=None)
>>> Device.from_str("cuda:1")
Device(type="cuda", idx=1)
- Parameters:
device (str)
- Return type:
Device
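The "device[:idx]" format can be parsed with a single split on the colon. The sketch below is a stand-alone illustration rather than the library's implementation: `SketchDevice` and `parse_device` are hypothetical names, and the accepted device types ("cpu" and "cuda") are assumed from the field description above.

```python
from typing import NamedTuple, Optional


class SketchDevice(NamedTuple):
    """Hypothetical stand-in for a (type, idx) device record."""

    type: str
    idx: Optional[int] = None


def parse_device(text: str) -> SketchDevice:
    """Parse "device[:idx]" into a SketchDevice (illustrative only)."""
    # Split on the first colon; `sep` is empty when no index was given.
    device_type, sep, raw_idx = text.partition(":")
    if device_type not in ("cpu", "cuda"):  # assumed set of valid types
        raise ValueError(f"unknown device type: {device_type!r}")
    if not sep:
        return SketchDevice(device_type)
    # int() raises ValueError for a non-numeric index such as "cuda:x".
    return SketchDevice(device_type, int(raw_idx))


print(parse_device("cpu"))
print(parse_device("cuda:1"))
```

Keeping the index optional mirrors the `idx: int | None = None` field, so "cuda" and "cuda:0" stay distinguishable after parsing.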
- abracudabra.to_device(sequence: abracudabra._annotations.Series, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Series [source]#
- abracudabra.to_device(sequence: torch.Tensor, /, device: abracudabra.device.base.Device | str) → torch.Tensor
- abracudabra.to_device(sequence: abracudabra._annotations.DataFrame, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.DataFrame
- abracudabra.to_device(sequence: abracudabra._annotations.Index, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Index
- abracudabra.to_device(sequence: abracudabra._annotations.Array, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Array
- abracudabra.to_device(sequence: abracudabra._annotations.Array | abracudabra._annotations.Series | torch.Tensor, /, device: abracudabra.device.base.Device | str) → abracudabra._annotations.Index | abracudabra._annotations.Series | abracudabra._annotations.DataFrame | abracudabra._annotations.Array | torch.Tensor
Move an array, series, or tensor to a device.
Call the appropriate function to move the element to the device:
abracudabra.device.conversion.array_to_device() for NumPy/CuPy arrays.
abracudabra.device.conversion.frame_to_device() for Pandas/cuDF index/series/dataframes.
abracudabra.device.conversion.tensor_to_device() for Torch tensors.
- Parameters:
sequence – The sequence to move to the device.
device – The device to move the sequence to.
- Returns:
The sequence on the specified device.
- Raises:
TypeError – If the sequence is not a NumPy/CuPy array, Pandas/cuDF index/series/dataframe or Torch tensor.
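The routing described above (arrays, frames, and tensors each go to their own conversion function, and anything else raises TypeError) can be sketched without importing any GPU library by inspecting the package a type comes from. This is only one possible approach, not necessarily how the library dispatches; `handler_family` and the mapping are hypothetical names introduced for illustration.

```python
from typing import Any

# Assumed mapping from a type's top-level package to a handler family.
_FAMILY_BY_PACKAGE = {
    "numpy": "array",
    "cupy": "array",
    "pandas": "frame",
    "cudf": "frame",
    "torch": "tensor",
}


def handler_family(obj: Any) -> str:
    """Return which conversion family would handle `obj` (illustrative)."""
    # e.g. "pandas.core.frame" -> "pandas"
    package = type(obj).__module__.split(".")[0]
    try:
        return _FAMILY_BY_PACKAGE[package]
    except KeyError:
        raise TypeError(f"unsupported type: {type(obj).__qualname__}") from None


# A plain list is none of the supported containers, so it is rejected.
try:
    handler_family([1, 2, 3])
except TypeError as error:
    print(error)
```

Checking the package name instead of calling `isinstance` means CuPy, cuDF, and Torch never have to be imported just to reject an unsupported object.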
Examples
Move a Pandas dataframe to the GPU (cuDF dataframe):
>>> import pandas as pd
>>> from abracudabra import to_device
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
>>> df_gpu = to_device(df, "cuda")
>>> print(type(df_gpu))
<class 'cudf.core.dataframe.DataFrame'>
Move a cuDF dataframe to the CPU (Pandas dataframe):
>>> df_cpu = to_device(df_gpu, "cpu")
>>> print(type(df_cpu))
<class 'pandas.core.frame.DataFrame'>
Move a NumPy array to the GPU (CuPy array):
>>> import numpy as np
>>> arr = np.array([1, 2, 3])
>>> arr_gpu = to_device(arr, "cuda")
>>> print(type(arr_gpu))
<class 'cupy.ndarray'>
- abracudabra.get_np_or_cp(device_type=None)[source]#
Get the numpy or cupy library based on the device type.
If device_type is "cpu", return the NumPy library.
If device_type is "cuda", return the CuPy library.
If device_type is not specified, return the NumPy library (default).
Examples
>>> device_type = "cuda"  # e.g., read from a configuration
>>> np_or_cp = get_np_or_cp(device_type)
>>> np_or_cp.random.choice([1, 2, 3], size=1)  # returns a CuPy array
array([3])
- Parameters:
device_type (abracudabra.device.base.DeviceType | None)
- Return type:
types.ModuleType
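A selector of this kind is often written as a small lookup followed by a lazy import, so that the GPU library is only loaded when it is actually requested. The sketch below illustrates that pattern under stated assumptions; `array_module_name` and `load_array_module` are hypothetical helpers, not part of the package.

```python
import importlib
from types import ModuleType
from typing import Optional

# Assumed mapping from device type to the array library's import name.
_ARRAY_MODULES = {"cpu": "numpy", "cuda": "cupy"}


def array_module_name(device_type: Optional[str] = None) -> str:
    """Return the import name for a device type; None falls back to CPU."""
    key = "cpu" if device_type is None else device_type
    try:
        return _ARRAY_MODULES[key]
    except KeyError:
        raise ValueError(f"unknown device type: {device_type!r}") from None


def load_array_module(device_type: Optional[str] = None) -> ModuleType:
    """Import the selected library only when it is requested."""
    return importlib.import_module(array_module_name(device_type))


print(array_module_name())        # numpy
print(array_module_name("cuda"))  # cupy
```

Code that receives the returned module can then call shared functions such as `asarray` or `zeros` on it without branching on the device type.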
- abracudabra.get_pd_or_cudf(device_type=None)[source]#
Get the pandas or cudf library based on the device type.
If device_type is "cpu", return the Pandas library.
If device_type is "cuda", return the cuDF library.
If device_type is not specified, return the Pandas library (default).
Examples
>>> pd_or_cudf = get_pd_or_cudf("cpu")
>>> pd_or_cudf.Series([1, 2, 3])  # returns a Pandas series
0    1
1    2
2    3
dtype: int64
- Parameters:
device_type (abracudabra.device.base.DeviceType | None)
- Return type:
types.ModuleType
- abracudabra.get_device(element: abracudabra._annotations.Array | torch.Tensor, /, *, raise_if_unknown: Literal[True] = ...) → abracudabra.device.base.Device [source]#
- abracudabra.get_device(element: abracudabra._annotations.Array | torch.Tensor, /, *, raise_if_unknown: bool = ...) → abracudabra.device.base.Device | None
Get the device of a NumPy/CuPy array or series.
- Parameters:
element – The element to check.
raise_if_unknown – Whether to raise an error if the element is not a known array or tensor.
- Returns:
The device of the element.
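The two signatures above differ only in how `raise_if_unknown` is typed, which lets a type checker narrow the return type: a literal `True` never yields `None`. A minimal sketch of that pattern with `typing.overload` follows. The names are hypothetical (`find_device`, a plain string as the device label), the default of `True` is assumed from the first signature, and treating any object with a `device` attribute as "known" is a simplification, not the library's detection logic.

```python
from typing import Any, Literal, Optional, overload


@overload
def find_device(element: Any, *, raise_if_unknown: Literal[True] = ...) -> str: ...
@overload
def find_device(element: Any, *, raise_if_unknown: bool = ...) -> Optional[str]: ...


def find_device(element: Any, *, raise_if_unknown: bool = True) -> Optional[str]:
    """Return a device label, or raise/return None when the element is unknown."""
    # Assumption for the sketch: "known" means the object exposes `.device`.
    device = getattr(element, "device", None)
    if device is not None:
        return str(device)
    if raise_if_unknown:
        raise TypeError(f"unknown element type: {type(element).__qualname__}")
    return None


print(find_device(object(), raise_if_unknown=False))  # None
```

Callers that pass `raise_if_unknown=False` have to handle the `None` case, while the default call can use the result directly.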
Examples
>>> import numpy as np
>>> array = np.random.rand(3)
>>> get_device(array)
Device(type="cpu", idx=None)
>>> import torch
>>> tensor = torch.rand(3, device="cuda")
>>> get_device(tensor)
Device(type="cuda", idx=0)