Serve Config Files (serve build)

This section should help you:

  • understand the Serve config file format.

  • understand how to generate and update a config file for a Serve application.

This config file can be used with the serve deploy CLI command or embedded in a RayService custom resource in Kubernetes to deploy and update your application in production. The file is written in YAML and has the following format:

http_options:
  host: ...
  port: ...
  request_timeout_s: ...

applications:
- name: ...
  route_prefix: ...
  import_path: ...
  runtime_env: ...
  deployments:
  - name: ...
    num_replicas: ...
    ...
  - name:
    ...
    ...

The file contains http_options and applications. These are the http_options:

  • host and port are HTTP options that determine the host IP address and the port for your Serve application’s HTTP proxies. These are optional settings and can be omitted. By default, the host will be set to 0.0.0.0 to expose your deployments publicly, and the port will be set to 8000. If you’re using Kubernetes, setting host to 0.0.0.0 is necessary to expose your deployments outside the cluster.

  • request_timeout_s sets the end-to-end timeout for a request, after which the HTTP proxy terminates it and retries at another replica. This setting is global to your Ray cluster and cannot be updated at runtime. By default, there is no request timeout. Also by default, the Serve HTTP proxy retries up to 10 times when a response is not received due to failures (e.g. network disconnect, request timeout, etc.).
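The defaults described above can be pictured as a simple fill-in step. The following is a minimal sketch, not Serve's implementation: it assumes the config has already been parsed into a Python dict, and the helper name resolve_http_options is hypothetical.

```python
# Defaults as stated in the text: host 0.0.0.0, port 8000, no request timeout.
DEFAULT_HTTP_OPTIONS = {"host": "0.0.0.0", "port": 8000, "request_timeout_s": None}


def resolve_http_options(config: dict) -> dict:
    """Return http_options with omitted fields filled in from the documented defaults.

    Hypothetical helper for illustration; it is not part of the Serve API.
    """
    resolved = dict(DEFAULT_HTTP_OPTIONS)
    resolved.update(config.get("http_options") or {})
    return resolved


# A config that omits http_options entirely.
minimal = {"applications": [{"import_path": "fruit:deployment_graph"}]}

# A config that overrides the port and sets a 30-second request timeout.
custom = {
    "http_options": {"port": 9000, "request_timeout_s": 30},
    "applications": [{"import_path": "fruit:deployment_graph"}],
}

print(resolve_http_options(minimal))
print(resolve_http_options(custom))
```

Fields that a config leaves out keep their default values, so the second config still resolves host to 0.0.0.0.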

These are the fields per application:

  • name - A unique name for the application. serve build auto-generates the name of each application.

  • route_prefix - The route prefix at which the application can be called via HTTP. It defaults to /. The route prefix for each application must be unique.

  • import_path - The path to your top-level Serve deployment (the same path passed to serve run). The most minimal config file consists of only an import_path.

  • runtime_env - The environment that the application runs in. It is used to package application dependencies such as pip packages (see Runtime Environments for supported fields). If a runtime_env is specified, the import_path must be available within it. The Serve config’s runtime_env can only use remote URIs in its working_dir and py_modules; it cannot use local zip files or directories.

  • deployments - An optional list that lets you override the @serve.deployment settings specified in the deployment graph code. Each entry in this list must include the deployment name, which must match one in the code. If this section is omitted, Serve launches all deployments in the graph with the settings specified in the code.
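The constraints listed above (unique names, unique route prefixes, a required import_path, and remote-only URIs in runtime_env) can be checked before deploying. The sketch below is an illustration under stated assumptions: the applications list has already been parsed into Python dicts, the helpers check_applications and is_remote_uri are hypothetical, and "remote" is approximated as "has a URL scheme and a host", which may be looser or stricter than what Serve accepts.

```python
from urllib.parse import urlparse


def is_remote_uri(value: str) -> bool:
    """Approximate check: a remote URI has both a scheme and a network location."""
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc)


def check_applications(applications: list) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen_names, seen_prefixes = set(), set()
    for app in applications:
        name = app.get("name")
        prefix = app.get("route_prefix", "/")  # route_prefix defaults to /
        if "import_path" not in app:
            problems.append(f"{name}: missing import_path")
        if name in seen_names:
            problems.append(f"duplicate application name: {name}")
        if prefix in seen_prefixes:
            problems.append(f"duplicate route_prefix: {prefix}")
        seen_names.add(name)
        seen_prefixes.add(prefix)

        # Only working_dir and py_modules are checked, as described in the text.
        env = app.get("runtime_env") or {}
        paths = [env["working_dir"]] if "working_dir" in env else []
        paths += env.get("py_modules", [])
        for path in paths:
            if not is_remote_uri(path):
                problems.append(f"{name}: runtime_env path is not a remote URI: {path}")
    return problems


# app2 and its import path are made up for this example.
apps = [
    {"name": "app1", "route_prefix": "/", "import_path": "fruit:deployment_graph"},
    {
        "name": "app2",
        "route_prefix": "/",
        "import_path": "other:graph",
        "runtime_env": {"working_dir": "./local_dir"},
    },
]

for problem in check_applications(apps):
    print(problem)
```

Here the second application is flagged twice: once for reusing the / route prefix and once for pointing working_dir at a local directory.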

Below is an equivalent config for the FruitStand example:

applications:
- name: app1
  route_prefix: /
  import_path: fruit:deployment_graph
  runtime_env: {}
  deployments:
  - name: MangoStand
    user_config:
      price: 3
  - name: OrangeStand
    user_config:
      price: 2
  - name: PearStand
    user_config:
      price: 4
  - name: FruitMarket
    num_replicas: 2
  - name: DAGDriver

The file uses the same fruit:deployment_graph import path that was used with serve run, and it has five entries in the deployments list, one for each deployment. Every entry contains a name setting, and most also contain other configuration options such as num_replicas or user_config.

Tip

Each individual entry in the deployments list is optional. In the example config file above, we could omit the PearStand entry, including its name and user_config, and the file would still be valid. When we deploy the file, PearStand is still deployed, using the configuration set in the @serve.deployment decorator in the deployment graph’s code.
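To make the fallback behavior concrete, here is a simplified model of how config entries might combine with the settings in the code. It is a sketch, not Serve's actual merge logic: the values in code_settings are hypothetical stand-ins for whatever the @serve.deployment decorators in fruit.py declare, and the helper effective_settings is invented for this illustration.

```python
# Hypothetical settings standing in for the @serve.deployment decorators in fruit.py.
code_settings = {
    "MangoStand": {"num_replicas": 1, "user_config": {"price": 1}},
    "OrangeStand": {"num_replicas": 1, "user_config": {"price": 1}},
    "PearStand": {"num_replicas": 1, "user_config": {"price": 1}},
    "FruitMarket": {"num_replicas": 1},
    "DAGDriver": {"num_replicas": 1},
}

# The deployments list from the example config, with the PearStand entry omitted.
config_deployments = [
    {"name": "MangoStand", "user_config": {"price": 3}},
    {"name": "OrangeStand", "user_config": {"price": 2}},
    {"name": "FruitMarket", "num_replicas": 2},
    {"name": "DAGDriver"},
]


def effective_settings(code_settings: dict, config_deployments: list) -> dict:
    """Overlay config entries on code settings; omitted deployments keep code settings."""
    overrides = {entry["name"]: entry for entry in config_deployments}

    # Every name in the config must match a deployment defined in the code.
    unknown = sorted(set(overrides) - set(code_settings))
    if unknown:
        raise ValueError(f"deployments not found in code: {unknown}")

    result = {}
    for name, defaults in code_settings.items():
        merged = dict(defaults)
        for key, value in overrides.get(name, {}).items():
            if key != "name":
                merged[key] = value
        result[name] = merged
    return result


effective = effective_settings(code_settings, config_deployments)
print(effective["PearStand"])    # unchanged: no entry in the config
print(effective["MangoStand"])   # user_config overridden by the config
print(effective["FruitMarket"])  # num_replicas overridden by the config
```

In this model, PearStand still appears in the result with its code-defined settings, while a config entry naming a deployment that does not exist in the code raises an error.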

We can also auto-generate this config file from the code. The serve build command takes an import path to your deployment graph and it creates a config file containing all the deployments and their settings from the graph. You can tweak these settings to manage your deployments in production.

Using the FruitStand deployment graph example:

$ ls
fruit.py

$ serve build fruit:deployment_graph -o fruit_config.yaml

$ ls
fruit.py
fruit_config.yaml

The fruit_config.yaml file contains:

http_options:
  host: 0.0.0.0
  port: 8000

applications:
- name: app1
  route_prefix: /
  import_path: fruit:deployment_graph
  runtime_env: {}
  deployments:
  - name: MangoStand
    user_config:
      price: 3
  - name: OrangeStand
    user_config:
      price: 2
  - name: PearStand
    user_config:
      price: 4
  - name: FruitMarket
    num_replicas: 2
  - name: DAGDriver

Note that the runtime_env field will always be empty when using serve build and must be set manually.

Additionally, serve build includes the default host and port in its autogenerated files. You can modify these parameters to select a different host and port.
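The two manual adjustments mentioned above, filling in runtime_env and changing the port, might look like the following when applied to the generated config. This is a sketch under assumptions: the config is represented as a Python dict (abbreviated to two deployments) rather than read from fruit_config.yaml, the helper customize is hypothetical, and the working_dir URL is a placeholder rather than a real location.

```python
import copy

# The generated config as a Python dict, abbreviated to two deployments.
generated = {
    "http_options": {"host": "0.0.0.0", "port": 8000},
    "applications": [
        {
            "name": "app1",
            "route_prefix": "/",
            "import_path": "fruit:deployment_graph",
            "runtime_env": {},  # always empty in serve build output
            "deployments": [
                {"name": "MangoStand", "user_config": {"price": 3}},
                {"name": "FruitMarket", "num_replicas": 2},
            ],
        }
    ],
}


def customize(config: dict, app_name: str, runtime_env: dict, port=None) -> dict:
    """Return a copy of config with runtime_env set for one app and, optionally, a new port."""
    updated = copy.deepcopy(config)  # leave the generated config untouched
    if port is not None:
        updated.setdefault("http_options", {})["port"] = port
    for app in updated["applications"]:
        if app["name"] == app_name:
            app["runtime_env"] = runtime_env
            break
    else:
        raise KeyError(f"no application named {app_name!r}")
    return updated


# Placeholder remote URI; replace it with wherever your code is actually hosted.
production = customize(
    generated,
    "app1",
    {"working_dir": "https://example.com/fruit.zip"},
    port=9000,
)

print(production["http_options"])
print(production["applications"][0]["runtime_env"])
```

Working on a deep copy keeps the serve build output intact, which makes it easier to regenerate and re-apply the same changes later.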