Serve Config Files (serve build)
This section should help you:
understand the Serve config file format.
understand how to generate and update a config file for a Serve application.
This config file can be used with the serve deploy CLI command, or embedded in a RayService custom resource in Kubernetes, to deploy and update your application in production. The file is written in YAML and has the following format:
http_options:
  host: ...
  port: ...
  request_timeout_s: ...
applications:
  - name: ...
    route_prefix: ...
    import_path: ...
    runtime_env: ...
    deployments:
      - name: ...
        num_replicas: ...
        ...
      - name:
        ...
      ...
The file contains http_options and applications. These are the http_options:
host and port are HTTP options that determine the host IP address and the port for your Serve application’s HTTP proxies. These are optional settings and can be omitted. By default, the host will be set to 0.0.0.0 to expose your deployments publicly, and the port will be set to 8000. If you’re using Kubernetes, setting host to 0.0.0.0 is necessary to expose your deployments outside the cluster.
request_timeout_s is a field in the http_options that allows you to set the end-to-end timeout for a request before terminating and retrying at another replica. This config is global to your Ray cluster, and it cannot be updated during runtime. By default, the Serve HTTP proxy retries up to 10 times when a response is not received due to failures (e.g. network disconnect, request timeout, etc.). By default, there is no request timeout.
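As an illustration, an http_options block that sets all three fields might look like the sketch below. The specific values (127.0.0.1, 8080, and 30) are assumptions chosen for the example, not recommended settings.

```yaml
http_options:
  host: 127.0.0.1        # assumed value: loopback address, reachable only from the local machine
  port: 8080             # assumed value: replaces the default port 8000
  request_timeout_s: 30  # assumed value: end-to-end request timeout in seconds
```

Omitting the whole http_options block keeps the defaults described above.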
These are the fields per application:
name - The names for each application are auto-generated by serve build. Each application’s name must be unique.
route_prefix - An application can be called via HTTP at the specified route prefix. It defaults to /. The route prefix for each application must be unique.
import_path - The path to your top-level Serve deployment (the same path passed to serve run). The most minimal config file consists of only an import_path.
runtime_env - Defines the environment that the application will run in. This is used to package application dependencies such as pip packages (see Runtime Environments for supported fields and more details). The import_path must be available within the runtime_env if it’s specified. The Serve config’s runtime_env can only use remote URIs in its working_dir and py_modules; it cannot use local zip files or directories.
deployments - A list of deployments. This is optional and allows you to override the @serve.deployment settings specified in the deployment graph code. Each entry in this list must include the deployment name, which must match one in the code. If this section is omitted, Serve launches all deployments in the graph with the settings specified in the code.
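To make the per-application fields concrete, here is a sketch of a config with two applications. The second application, its import path, the /fruit and /other route prefixes, the example.com URI, and the requests dependency are all hypothetical and exist only to illustrate the constraints described above.

```yaml
applications:
  - name: app1                             # must be unique across applications
    route_prefix: /fruit                   # must be unique across applications
    import_path: fruit:deployment_graph
    runtime_env:
      # hypothetical remote URI; local zip files or directories are not allowed here
      working_dir: "https://example.com/fruit_app.zip"
      pip:
        - requests                         # illustrative dependency
    deployments:
      - name: FruitMarket                  # must match a deployment name in the code
        num_replicas: 2
  - name: app2                             # hypothetical second application
    route_prefix: /other
    import_path: other_module:app          # hypothetical import path
```

The second application omits deployments entirely, so its deployments would run with the settings from the code.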
Below is an equivalent config for the FruitStand example:
applications:
  - name: app1
    route_prefix: /
    import_path: fruit:deployment_graph
    runtime_env: {}
    deployments:
      - name: MangoStand
        user_config:
          price: 3
      - name: OrangeStand
        user_config:
          price: 2
      - name: PearStand
        user_config:
          price: 4
      - name: FruitMarket
        num_replicas: 2
      - name: DAGDriver
The file uses the same fruit:deployment_graph import path that was used with serve run, and it has five entries in the deployments list, one for each deployment. Every entry contains a name setting, and most also set other configuration options such as num_replicas or user_config.
Tip
Each individual entry in the deployments list is optional. In the example config file above, we could omit the PearStand entry, including its name and user_config, and the file would still be valid. When we deploy the file, the PearStand deployment will still be deployed, using the configurations set in the @serve.deployment decorator from the deployment graph’s code.
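For instance, the FruitStand config with the PearStand entry left out would look like this sketch. The comment marks where the entry used to be.

```yaml
applications:
  - name: app1
    route_prefix: /
    import_path: fruit:deployment_graph
    runtime_env: {}
    deployments:
      - name: MangoStand
        user_config:
          price: 3
      - name: OrangeStand
        user_config:
          price: 2
      # PearStand entry omitted: it is still deployed, using its @serve.deployment settings
      - name: FruitMarket
        num_replicas: 2
      - name: DAGDriver
```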
We can also auto-generate this config file from the code. The serve build command takes an import path to your deployment graph and it creates a config file containing all the deployments and their settings from the graph. You can tweak these settings to manage your deployments in production.
Using the FruitStand deployment graph example:
$ ls
fruit.py
$ serve build fruit:deployment_graph -o fruit_config.yaml
$ ls
fruit.py
fruit_config.yaml
The fruit_config.yaml file contains:
http_options:
  host: 0.0.0.0
  port: 8000
applications:
  - name: app1
    route_prefix: /
    import_path: fruit:deployment_graph
    runtime_env: {}
    deployments:
      - name: MangoStand
        user_config:
          price: 3
      - name: OrangeStand
        user_config:
          price: 2
      - name: PearStand
        user_config:
          price: 4
      - name: FruitMarket
        num_replicas: 2
      - name: DAGDriver
Note that the runtime_env field will always be empty when using serve build and must be set manually.
Additionally, serve build includes the default host and port in its
autogenerated files. You can modify these parameters to select a different host
and port.