ray.data.preprocessors.Chain
- class ray.data.preprocessors.Chain(*preprocessors: ray.data.preprocessor.Preprocessor)
Bases: ray.data.preprocessor.Preprocessor
Combine multiple preprocessors into a single Preprocessor.
When you call fit, each preprocessor is fit on the dataset produced by the preceding preprocessor's fit_transform.
Example
>>> import pandas as pd
>>> import ray
>>> from ray.data.preprocessors import *
>>>
>>> df = pd.DataFrame({
...     "X0": [0, 1, 2],
...     "X1": [3, 4, 5],
...     "Y": ["orange", "blue", "orange"],
... })
>>> ds = ray.data.from_pandas(df)
>>>
>>> preprocessor = Chain(
...     StandardScaler(columns=["X0", "X1"]),
...     Concatenator(include=["X0", "X1"], output_column_name="X"),
...     LabelEncoder(label_column="Y")
... )
>>> preprocessor.fit_transform(ds).to_pandas()
   Y                                         X
0  1  [-1.224744871391589, -1.224744871391589]
1  0                                [0.0, 0.0]
2  1    [1.224744871391589, 1.224744871391589]
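The numbers in the example output can be reproduced by hand, which may help when checking what each stage contributes. The sketch below uses only the standard library and does not call Ray. It assumes the scaling step divides by the population standard deviation (ddof=0) and that labels are numbered in sorted order; both assumptions are inferred from the output shown above rather than taken from the library source. The helper names `standardize` and `encode_labels` are hypothetical.

```python
import math


def standardize(values):
    """Center on the mean and divide by the population std (assumed ddof=0)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def encode_labels(labels):
    """Map each distinct label to its index in sorted order (assumption)."""
    mapping = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels]


# Same inputs as the example DataFrame.
x0 = standardize([0, 1, 2])
x1 = standardize([3, 4, 5])

# Concatenating X0 and X1 row by row gives the "X" column.
x = [list(pair) for pair in zip(x0, x1)]

# "blue" sorts before "orange", so blue -> 0 and orange -> 1.
y = encode_labels(["orange", "blue", "orange"])
```

Under these assumptions the first row of `x` is approximately `[-1.2247, -1.2247]` and `y` is `[1, 0, 1]`, consistent with the table above. A sample standard deviation (ddof=1) would instead give exactly -1, 0, and 1 for each column.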
- Parameters
preprocessors – The preprocessors to sequentially compose.
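The sequential fitting behavior described above can be sketched in plain Python. This is an illustration of the idea only, not Ray's implementation: `ToyCenter`, `ToyMaxScale`, and `ToyChain` are hypothetical classes that operate on lists of floats instead of Datasets.

```python
class ToyCenter:
    """Hypothetical stage: subtract the mean learned during fit."""

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return self

    def transform(self, data):
        return [v - self.mean_ for v in data]

    def fit_transform(self, data):
        return self.fit(data).transform(data)


class ToyMaxScale:
    """Hypothetical stage: divide by the largest absolute value seen in fit."""

    def fit(self, data):
        self.max_abs_ = max(abs(v) for v in data)
        return self

    def transform(self, data):
        return [v / self.max_abs_ for v in data]

    def fit_transform(self, data):
        return self.fit(data).transform(data)


class ToyChain:
    """Hypothetical chain: each stage is fit on the previous stage's output."""

    def __init__(self, *preprocessors):
        self.preprocessors = preprocessors

    def fit(self, data):
        # Pass the transformed data forward so later stages see it.
        for p in self.preprocessors:
            data = p.fit_transform(data)
        return self

    def transform(self, data):
        # Reuse the state learned in fit; nothing is re-fit here.
        for p in self.preprocessors:
            data = p.transform(data)
        return data

    def fit_transform(self, data):
        return self.fit(data).transform(data)


chain = ToyChain(ToyCenter(), ToyMaxScale())
out = chain.fit_transform([0, 5, 10])
new = chain.transform([20])
```

In this sketch the second stage learns a maximum of 5 (from the centered values -5, 0, 5) rather than 10 (from the raw values), which is the practical consequence of fitting each stage on the preceding stage's output. Calling `transform` on new data afterwards reuses those learned values.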
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
Methods
fit(ds): Fit this Preprocessor to the Dataset.
Batch format hint for upstream producers to try yielding best block format.
transform(ds): Transform the given dataset.
transform_batch(data): Transform a single batch of data.
Return Dataset stats for the most recent transform call, if any.