ray.data.Dataset.sort

Dataset.sort(key: Optional[str] = None, descending: bool = False) → ray.data.dataset.Dataset

Sort the Dataset by the specified key column.

Examples

>>> import ray
>>> ds = ray.data.range(100)
>>> ds.sort("id", descending=True).take(3)
[{'id': 99}, {'id': 98}, {'id': 97}]

Time complexity: O(dataset size * log(dataset size / parallelism))

Parameters
  • key – The column to sort by. To sort by multiple columns, first call Dataset.map() to generate a single new column that combines them, then sort by that column.

  • descending – Whether to sort in descending order.
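
The multi-column approach described for key might look like the following minimal sketch. The column names "group" and "sort_key", the helper add_sort_key, and the bound STRIDE are hypothetical and chosen only for illustration. The sketch assumes both columns hold non-negative integers and that every "id" value is smaller than STRIDE, so the combined integer orders rows by "group" first and by "id" second.

```python
from typing import Any, Dict

# Hypothetical bound: every value of the secondary column must lie in [0, STRIDE).
STRIDE = 1_000


def add_sort_key(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with a scalar "sort_key" column added.

    The key packs the primary column ("group") and the secondary column
    ("id") into one integer, so sorting by it orders rows by "group"
    and breaks ties by "id".
    """
    return {**row, "sort_key": int(row["group"]) * STRIDE + int(row["id"])}


def main() -> None:
    # Imported here so add_sort_key can be reused without Ray installed.
    import ray

    # Build a small dataset with an illustrative "group" column.
    ds = ray.data.range(100).map(
        lambda row: {"id": row["id"], "group": row["id"] % 3}
    )
    # Generate the combined column, then sort by it.
    ds = ds.map(add_sort_key).sort("sort_key")
    print(ds.take(3))


if __name__ == "__main__":
    main()
```

With these assumptions, the sorted rows are ordered by "group" and, within each group, by "id". For columns that are not bounded non-negative integers, a different encoding of the combined key would be needed.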

Returns

A new, sorted Dataset.