ray.serve.schema.ServeDeploySchema
- pydantic model ray.serve.schema.ServeDeploySchema
Multi-application config for deploying a list of Serve applications to the Ray cluster.
This is the request JSON schema for the v2 REST API PUT "/api/serve/applications/".
PublicAPI (alpha): This API is in alpha and may change before becoming stable.
JSON schema:
{ "title": "ServeDeploySchema", "description": "Multi-application config for deploying a list of Serve applications to the Ray\ncluster.\n\nThis is the request JSON schema for the v2 REST API\n`PUT \"/api/serve/applications/\"`.\n\n**PublicAPI (alpha):** This API is in alpha and may change before becoming stable.", "type": "object", "properties": { "proxy_location": { "description": "The location of HTTP servers.\n- \"EveryNode\" (default): start one HTTP server per node.\n- \"HeadOnly\": start one HTTP server on the head node.\n- \"NoServer\": disable HTTP server.", "default": "EveryNode", "allOf": [ { "$ref": "#/definitions/DeploymentMode" } ] }, "http_options": { "title": "Http Options", "description": "Options to start the HTTP Proxy with.", "default": { "host": "0.0.0.0", "port": 8000, "root_path": "", "request_timeout_s": null }, "allOf": [ { "$ref": "#/definitions/HTTPOptionsSchema" } ] }, "applications": { "title": "Applications", "description": "The set of Serve applications to run on the Ray cluster.", "type": "array", "items": { "$ref": "#/definitions/ServeApplicationSchema" } } }, "required": [ "applications" ], "additionalProperties": false, "definitions": { "DeploymentMode": { "title": "DeploymentMode", "description": "An enumeration.\n\n**DeveloperAPI:** This API may change across minor Ray releases.", "enum": [ "NoServer", "HeadOnly", "EveryNode", "FixedNumber" ], "type": "string" }, "HTTPOptionsSchema": { "title": "HTTPOptionsSchema", "description": "Options to start the HTTP Proxy with.\n\n**PublicAPI (alpha):** This API is in alpha and may change before becoming stable.", "type": "object", "properties": { "host": { "title": "Host", "description": "Host for HTTP servers to listen on. Defaults to \"0.0.0.0\", which exposes Serve publicly. Cannot be updated once Serve has started running. 
Serve must be shut down and restarted with the new host instead.", "default": "0.0.0.0", "type": "string" }, "port": { "title": "Port", "description": "Port for HTTP server. Defaults to 8000. Cannot be updated once Serve has started running. Serve must be shut down and restarted with the new port instead.", "default": 8000, "type": "integer" }, "root_path": { "title": "Root Path", "description": "Root path to mount the serve application (for example, \"/serve\"). All deployment routes will be prefixed with this path. Defaults to \"\".", "default": "", "type": "string" }, "request_timeout_s": { "title": "Request Timeout S", "description": "The timeout for HTTP requests. Defaults to no timeout.", "type": "number" } }, "additionalProperties": false }, "RayActorOptionsSchema": { "title": "RayActorOptionsSchema", "description": "Options with which to start a replica actor.\n\n**PublicAPI (beta):** This API is in beta and may change before becoming stable.", "type": "object", "properties": { "runtime_env": { "title": "Runtime Env", "description": "This deployment's runtime_env. working_dir and py_modules may contain only remote URIs.", "default": {}, "type": "object" }, "num_cpus": { "title": "Num Cpus", "description": "The number of CPUs required by the deployment's application per replica. This is the same as a ray actor's num_cpus. Uses a default if null.", "minimum": 0, "type": "number" }, "num_gpus": { "title": "Num Gpus", "description": "The number of GPUs required by the deployment's application per replica. This is the same as a ray actor's num_gpus. Uses a default if null.", "minimum": 0, "type": "number" }, "memory": { "title": "Memory", "description": "Restrict the heap memory usage of each replica. Uses a default if null.", "minimum": 0, "type": "number" }, "object_store_memory": { "title": "Object Store Memory", "description": "Restrict the object store memory used per replica when creating objects. 
Uses a default if null.", "minimum": 0, "type": "number" }, "resources": { "title": "Resources", "description": "The custom resources required by each replica.", "default": {}, "type": "object" }, "accelerator_type": { "title": "Accelerator Type", "description": "Forces replicas to run on nodes with the specified accelerator type.", "type": "string" } }, "additionalProperties": false }, "DeploymentSchema": { "title": "DeploymentSchema", "description": "Specifies options for one deployment within a Serve application. For each deployment\nthis can optionally be included in `ServeApplicationSchema` to override deployment\noptions specified in code.\n\n**PublicAPI (beta):** This API is in beta and may change before becoming stable.", "type": "object", "properties": { "name": { "title": "Name", "description": "Globally-unique name identifying this deployment.", "type": "string" }, "num_replicas": { "title": "Num Replicas", "description": "The number of processes that handle requests to this deployment. Uses a default if null.", "default": 1, "exclusiveMinimum": 0, "type": "integer" }, "route_prefix": { "title": "Route Prefix", "description": "Requests to paths under this HTTP path prefix will be routed to this deployment. When null, no HTTP endpoint will be created. When omitted, defaults to the deployment's name. Routing is done based on longest-prefix match, so if you have deployment A with a prefix of \"/a\" and deployment B with a prefix of \"/a/b\", requests to \"/a\", \"/a/\", and \"/a/c\" go to A and requests to \"/a/b\", \"/a/b/\", and \"/a/b/c\" go to B. Routes must not end with a \"/\" unless they're the root (just \"/\"), which acts as a catch-all.", "default": 1, "type": "string" }, "max_concurrent_queries": { "title": "Max Concurrent Queries", "description": "The max number of pending queries in a single replica. 
Uses a default if null.", "default": 1, "exclusiveMinimum": 0, "type": "integer" }, "user_config": { "title": "User Config", "description": "Config to pass into this deployment's reconfigure method. This can be updated dynamically without restarting replicas", "default": 1, "type": "object" }, "autoscaling_config": { "title": "Autoscaling Config", "description": "Config specifying autoscaling parameters for the deployment's number of replicas. If null, the deployment won't autoscale its number of replicas; the number of replicas will be fixed at num_replicas.", "default": 1, "type": "object" }, "graceful_shutdown_wait_loop_s": { "title": "Graceful Shutdown Wait Loop S", "description": "Duration that deployment replicas will wait until there is no more work to be done before shutting down. Uses a default if null.", "default": 1, "minimum": 0, "type": "number" }, "graceful_shutdown_timeout_s": { "title": "Graceful Shutdown Timeout S", "description": "Serve controller waits for this duration before forcefully killing the replica for shutdown. Uses a default if null.", "default": 1, "minimum": 0, "type": "number" }, "health_check_period_s": { "title": "Health Check Period S", "description": "Frequency at which the controller will health check replicas. Uses a default if null.", "default": 1, "exclusiveMinimum": 0, "type": "number" }, "health_check_timeout_s": { "title": "Health Check Timeout S", "description": "Timeout that the controller will wait for a response from the replica's health check before marking it unhealthy. 
Uses a default if null.", "default": 1, "exclusiveMinimum": 0, "type": "number" }, "ray_actor_options": { "title": "Ray Actor Options", "description": "Options set for each replica actor.", "default": 1, "allOf": [ { "$ref": "#/definitions/RayActorOptionsSchema" } ] }, "is_driver_deployment": { "title": "Is Driver Deployment", "description": "Indicates whether the deployment is a driver deployment. Driver deployments are spawned one per node.", "default": 1, "type": "boolean" } }, "required": [ "name" ], "additionalProperties": false }, "ServeApplicationSchema": { "title": "ServeApplicationSchema", "description": "Describes one Serve application, and currently can also be used as a standalone\nconfig to deploy a single application to a Ray cluster.\n\n\nThis is the request JSON schema for the v1 REST API `PUT \"/api/serve/deployments/\"`.\n\n**PublicAPI (beta):** This API is in beta and may change before becoming stable.", "type": "object", "properties": { "name": { "title": "Name", "description": "Application name. The name should be unique within the Serve instance.", "default": "default", "type": "string" }, "route_prefix": { "title": "Route Prefix", "description": "Route prefix for HTTP requests. If not provided, it will use the route_prefix of the ingress deployment. By default, the ingress route prefix is '/'.", "default": "/", "type": "string" }, "import_path": { "title": "Import Path", "description": "An import path to a bound deployment node. Should be of the form \"module.submodule_1...submodule_n.dag_node\". This is equivalent to \"from module.submodule_1...submodule_n import dag_node\". Only works with Python applications. This field is REQUIRED when deploying Serve config to a Ray cluster.", "type": "string" }, "runtime_env": { "title": "Runtime Env", "description": "The runtime_env that the deployment graph will be run in. Per-deployment runtime_envs will inherit from this. 
working_dir and py_modules may contain only remote URIs.", "default": {}, "type": "object" }, "host": { "title": "Host", "description": "Host for HTTP servers to listen on. Defaults to \"0.0.0.0\", which exposes Serve publicly. Cannot be updated once your Serve application has started running. The Serve application must be shut down and restarted with the new host instead.", "default": "0.0.0.0", "type": "string" }, "port": { "title": "Port", "description": "Port for HTTP server. Defaults to 8000. Cannot be updated once your Serve application has started running. The Serve application must be shut down and restarted with the new port instead.", "default": 8000, "type": "integer" }, "deployments": { "title": "Deployments", "description": "Deployment options that override options specified in the code.", "default": [], "type": "array", "items": { "$ref": "#/definitions/DeploymentSchema" } }, "args": { "title": "Args", "description": "Arguments that will be passed to the application builder.", "default": {}, "type": "object" } }, "required": [ "import_path" ], "additionalProperties": false } } }
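The `route_prefix` description in the schema above says that routing uses longest-prefix match. The following is a minimal standalone sketch of that rule, not Ray's implementation. The function name `match_route` and the assumption that a prefix only matches on whole path segments are illustrative.

```python
from typing import Dict, Optional


def match_route(path: str, routes: Dict[str, str]) -> Optional[str]:
    """Return the deployment whose prefix is the longest match for `path`.

    `routes` maps a route prefix to a deployment name. This helper is
    hypothetical and only mirrors the rule described in the schema text.
    """
    best_prefix = None
    for prefix in routes:
        if prefix == "/":
            # The root prefix acts as a catch-all.
            matches = True
        else:
            # Assumption: a prefix matches the exact path or a path that
            # continues with "/" right after the prefix.
            matches = path == prefix or path.startswith(prefix + "/")
        if matches and (best_prefix is None or len(prefix) > len(best_prefix)):
            best_prefix = prefix
    return routes[best_prefix] if best_prefix is not None else None


# The example from the schema: deployment A at "/a", deployment B at "/a/b".
routes = {"/a": "A", "/a/b": "B"}
for path in ["/a", "/a/", "/a/c", "/a/b", "/a/b/", "/a/b/c"]:
    print(path, "->", match_route(path, routes))
```

With these two prefixes, the first three paths resolve to A and the last three to B, which agrees with the example given in the description.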
- Fields
- Validators
- field applications: List[ray.serve.schema.ServeApplicationSchema] [Required]
The set of Serve applications to run on the Ray cluster.
- field http_options: ray.serve.schema.HTTPOptionsSchema = HTTPOptionsSchema(host='0.0.0.0', port=8000, root_path='', request_timeout_s=None)
Options to start the HTTP Proxy with.
- Validated by
- field proxy_location: ray.serve.config.DeploymentMode = DeploymentMode.EveryNode
The location of HTTP servers.
  - “EveryNode” (default): start one HTTP server per node.
  - “HeadOnly”: start one HTTP server on the head node.
  - “NoServer”: disable HTTP server.
- Validated by
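As an illustration of how the three fields fit together, the sketch below builds a request body using only field names from the schema and wraps it in a PUT request to the endpoint named above. The dashboard address `http://localhost:8265`, the application name, and the import path `my_package.my_module.app` are assumptions for the example, and the request is constructed but not sent.

```python
import json
import urllib.request

# Request body following ServeDeploySchema. Only "applications" is required;
# the other two fields are shown with their documented defaults.
config = {
    "proxy_location": "EveryNode",
    "http_options": {"host": "0.0.0.0", "port": 8000},
    "applications": [
        {
            "name": "app1",  # hypothetical application name
            "route_prefix": "/app1",
            # Hypothetical import path; "import_path" is the only required
            # field of ServeApplicationSchema.
            "import_path": "my_package.my_module.app",
        },
    ],
}

# Assumption: the REST API is reachable at this address on your cluster.
DASHBOARD_URL = "http://localhost:8265"

request = urllib.request.Request(
    url=DASHBOARD_URL + "/api/serve/applications/",
    data=json.dumps(config).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# Sending is left to the caller, for example: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

Because the schema sets `additionalProperties` to false, any key outside the three documented fields would be rejected by validation.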
- validator application_names_nonempty » applications
- validator application_names_unique » applications
- validator application_routes_unique » applications
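The validator names suggest three checks on `applications`: names are nonempty, names are unique, and route prefixes are unique. The sketch below reproduces those checks on plain dictionaries so a config can be sanity-checked before submission. It is an approximation under that reading, not Ray's validator code, and the function name `check_applications` and the error messages are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List


def check_applications(applications: List[Dict[str, Any]]) -> None:
    """Raise ValueError if the application list breaks one of the three rules."""
    # Defaults taken from ServeApplicationSchema: name "default", route "/".
    names = [app.get("name", "default") for app in applications]
    routes = [app.get("route_prefix", "/") for app in applications]

    # Rule 1: every application name must be nonempty.
    if any(name == "" for name in names):
        raise ValueError("Application names must be nonempty.")

    # Rule 2: application names must be unique.
    duplicate_names = sorted(n for n, c in Counter(names).items() if c > 1)
    if duplicate_names:
        raise ValueError(f"Duplicate application names: {duplicate_names}")

    # Rule 3: route prefixes must be unique.
    duplicate_routes = sorted(r for r, c in Counter(routes).items() if c > 1)
    if duplicate_routes:
        raise ValueError(f"Duplicate route prefixes: {duplicate_routes}")


# Passes silently: distinct names and distinct route prefixes.
check_applications(
    [
        {"name": "app1", "route_prefix": "/app1"},
        {"name": "app2", "route_prefix": "/app2"},
    ]
)
```

Note that two applications which both omit `name` or `route_prefix` would collide on the defaults in this sketch, so multi-application configs should set both explicitly.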