Handle Dependencies#

Add a runtime environment#

The import path (e.g., fruit:deployment_graph) must be importable by Serve at runtime. When running locally, this path might be in your current working directory. However, when running on a cluster you also need to make sure the path is importable. Build the code into the cluster’s container image (see Cluster Configuration for more details) or use a runtime_env with a remote URI that hosts the code in remote storage.

As an example, we have pushed a copy of the FruitStand deployment graph to GitHub. You can use this config file to deploy the FruitStand deployment graph to your own Ray cluster even if you don’t have the code locally:

import_path: fruit:deployment_graph

runtime_env:
    working_dir: "https://github.com/ray-project/serve_config_examples/archive/HEAD.zip"

Note

You can also package a deployment graph into a standalone Python package that you can import using a PYTHONPATH to provide location independence on your local machine. However, the best practice is to use a runtime_env, to ensure consistency across all machines in your cluster.

Dependencies per deployment#

Ray Serve also supports serving deployments with different (and possibly conflicting) Python dependencies. For example, you can simultaneously serve one deployment that uses legacy Tensorflow 1 and another that uses Tensorflow 2.

This is supported on Mac OS and Linux using Ray’s Runtime environments feature. As with all other Ray actor options, pass the runtime environment in via ray_actor_options in your deployment. Be sure to first run pip install "ray[default]" to ensure the Runtime Environments feature is installed.

Example:

import requests
from ray import serve

serve.start()


@serve.deployment
def requests_version(request):
    return requests.__version__


requests_version.options(
    name="25",
    ray_actor_options={"runtime_env": {"pip": ["requests==2.25.1"]}},
).deploy()
requests_version.options(
    name="26",
    ray_actor_options={"runtime_env": {"pip": ["requests==2.26.0"]}},
).deploy()

assert requests.get("http://127.0.0.1:8000/25").text == "2.25.1"
assert requests.get("http://127.0.0.1:8000/26").text == "2.26.0"

Tip

Avoid dynamically installing packages that install from source: these can be slow and use up all resources while installing, leading to problems with the Ray cluster. Consider precompiling such packages in a private repository or Docker image.

The dependencies required in the deployment may be different than the dependencies installed in the driver program (the one running Serve API calls). In this case, you should use a delayed import within the class to avoid importing unavailable packages in the driver. This applies even when not using runtime environments.

Example:

from ray import serve

serve.start()


@serve.deployment
class MyDeployment:
    def __call__(self, model_path):
        from my_module import my_model

        self.model = my_model.load(model_path)


MyDeployment.deploy("/model_path.pkl")