Split Runners Registry

1. Usage

The Split Runners Registry manages and deploys split runner servers. It supports registering, querying, updating, and deleting split runner instances, which serve model split APIs on Kubernetes clusters.

Steps to Use the Registry:

1. Create a Split Runner
   Use the `POST /split-runner` endpoint to create a new Split Runner instance. This will:
   - Deploy a new split runner server to the Kubernetes cluster using the provided `cluster_k8s_config`.
   - Register the split runner instance in the registry.
2. Get a Split Runner
   Use the `GET /split-runner/{runner_id}` endpoint to retrieve details of an existing split runner instance by its ID.
3. Update a Split Runner
   Use the `PUT /split-runner/{runner_id}` endpoint to update an existing split runner's details.
4. Delete a Split Runner
   Use the `DELETE /split-runner/{runner_id}` endpoint to delete a split runner instance from both the registry and the Kubernetes cluster.
5. Query Split Runners
   Use the `POST /split-runners` endpoint to query and filter split runners based on specific criteria.
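
The five steps above can be sketched as a small Python helper that builds one HTTP request per endpoint. This is a minimal sketch, not an official client: the function names and `BASE_URL` are hypothetical, and the port 8001 is taken from the cURL examples in this document. The helpers only construct requests; nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Assumption: hypothetical host; port 8001 follows the cURL examples.
BASE_URL = "http://localhost:8001"


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build (but do not send) a request for a registry endpoint."""
    data = None
    headers = {}
    if body is not None:
        # JSON-encode the body and declare its content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


def create_runner(payload):
    return build_request("POST", "/split-runner", payload)


def get_runner(runner_id):
    return build_request("GET", f"/split-runner/{runner_id}")


def update_runner(runner_id, update):
    return build_request("PUT", f"/split-runner/{runner_id}", update)


def delete_runner(runner_id):
    return build_request("DELETE", f"/split-runner/{runner_id}")


def query_runners(query_filter):
    return build_request("POST", "/split-runners", query_filter)
```

To actually call the registry, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(get_runner("runner-id"))`.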

2. API Documentation

2.1 `POST /split-runner`

Description:
Creates a new Split Runner instance and deploys it to the Kubernetes cluster.

Request Body:

{
  "cluster_k8s_config": { /* Kubernetes config dict */ },
  "split_runner_public_host": "<server-url>",
  "split_runner_metadata": { "key": "value" },
  "split_runner_tags": ["tag1", "tag2"]
}

Response:

{
  "success": true,
  "data": {
    "message": "SplitRunner created",
    "id": "runner-id"
  }
}

cURL Command:

curl -X POST http://<server-url>:8001/split-runner \
  -H "Content-Type: application/json" \
  -d '{
    "cluster_k8s_config": { /* Kubernetes config dict */ },
    "split_runner_public_host": "192.168.0.106",
    "split_runner_metadata": {"key": "value"},
    "split_runner_tags": ["tag1", "tag2"]
  }'

2.2 `GET /split-runner/{runner_id}`

Description:
Retrieves a Split Runner instance by its ID.

Response:

{
  "success": true,
  "data": {
    "split_runner_id": "runner-id",
    "split_runner_public_url": "http://split-runner-url",
    "split_runner_metadata": { "key": "value" },
    "split_runner_public_host": "<server-url>",
    "split_runner_tags": ["tag1", "tag2"]
  }
}

cURL Command:

curl -X GET http://<server-url>:8001/split-runner/runner-id
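
Every response in this document wraps its payload in a `success`/`data` envelope. A caller might unwrap it along the following lines. This is a sketch under assumptions: the `unwrap` helper and `RegistryError` are hypothetical names, and the shape of a failure response is not documented here, so the code only assumes that `success` is false or missing in that case.

```python
import json


class RegistryError(Exception):
    """Raised when the envelope does not report success (hypothetical)."""


def unwrap(response_text):
    """Return the `data` part of a registry response envelope."""
    envelope = json.loads(response_text)
    # Assumption: a failed call sets "success" to false or omits it.
    if not envelope.get("success"):
        raise RegistryError(envelope)
    return envelope["data"]


# Example with the GET response shown above.
sample = """
{
  "success": true,
  "data": {
    "split_runner_id": "runner-id",
    "split_runner_public_url": "http://split-runner-url",
    "split_runner_metadata": { "key": "value" },
    "split_runner_public_host": "<server-url>",
    "split_runner_tags": ["tag1", "tag2"]
  }
}
"""
runner = unwrap(sample)
```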

2.3 `PUT /split-runner/{runner_id}`

Description:
Updates the details of a Split Runner instance.

Request Body:

{
  "$set": {
    "split_runner_metadata.version": "2.0"
  },
  "$addToSet": {
    "split_runner_tags": "auto-scaled"
  }
}

Response:

{
  "success": true,
  "data": {
    "message": "SplitRunner updated"
  }
}

cURL Command:

curl -X PUT http://<server-url>:8001/split-runner/runner-id \
  -H "Content-Type: application/json" \
  -d '{
    "$set": {
      "split_runner_metadata.version": "2.0"
    },
    "$addToSet": {
      "split_runner_tags": "auto-scaled"
    }
  }'
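
The `$set` and `$addToSet` keys in the request body resemble MongoDB-style update operators. This document does not say how the server applies them, so the following is only a local illustration of the usual semantics: `$set` writes a value at a dotted path, and `$addToSet` appends to a list only if the value is not already present. The `apply_update` function is a hypothetical helper, not part of the registry.

```python
import copy


def apply_update(doc, update):
    """Return a copy of `doc` with $set and $addToSet applied."""
    result = copy.deepcopy(doc)

    # $set: walk the dotted path, creating nested dicts as needed.
    for dotted_key, value in update.get("$set", {}).items():
        *parents, leaf = dotted_key.split(".")
        target = result
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value

    # $addToSet: append only when the value is not already in the list.
    # For brevity this sketch handles top-level keys only.
    for key, value in update.get("$addToSet", {}).items():
        items = result.setdefault(key, [])
        if value not in items:
            items.append(value)

    return result


runner = {
    "split_runner_metadata": {"key": "value"},
    "split_runner_tags": ["tag1", "tag2"],
}
update = {
    "$set": {"split_runner_metadata.version": "2.0"},
    "$addToSet": {"split_runner_tags": "auto-scaled"},
}
updated = apply_update(runner, update)
```

Under these semantics, applying the same update twice leaves the tag list unchanged after the first application.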

2.4 `DELETE /split-runner/{runner_id}`

Description:
Deletes a Split Runner instance by its ID.

Response:

{
  "success": true,
  "data": {
    "message": "SplitRunner deleted"
  }
}

cURL Command:

curl -X DELETE http://<server-url>:8001/split-runner/runner-id

2.5 `POST /split-runners`

Description:
Queries split runners based on a filter.

Request Body:

{
  "split_runner_metadata.framework": "transformers",
  "split_runner_tags": { "$in": ["llm"] }
}

Response:

{
  "success": true,
  "data": [
    {
      "split_runner_id": "runner-id",
      "split_runner_public_url": "http://host:32286",
      "split_runner_metadata": { "framework": "transformers" },
      "split_runner_public_host": "<server-url>",
      "split_runner_tags": ["llm", "split"]
    }
  ]
}

cURL Command:

curl -X POST http://<server-url>:8001/split-runners \
  -H "Content-Type: application/json" \
  -d '{
    "split_runner_metadata.framework": "transformers",
    "split_runner_tags": { "$in": ["llm"] }
  }'
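
The filter above combines a dotted-path equality check with an `$in` condition on a list field. As a rough illustration of how such a filter might select runners, the sketch below evaluates it against plain dictionaries. It assumes MongoDB-like matching rules, which this document does not confirm, and `get_path` and `matches` are hypothetical helpers.

```python
def get_path(doc, dotted_key):
    """Follow a dotted path through nested dicts; None if missing."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(doc, query):
    """Return True if `doc` satisfies every condition in `query`."""
    for key, condition in query.items():
        value = get_path(doc, key)
        if isinstance(condition, dict) and "$in" in condition:
            # A list field matches if any element is in the $in list.
            candidates = value if isinstance(value, list) else [value]
            if not any(c in condition["$in"] for c in candidates):
                return False
        elif isinstance(value, list):
            # Plain value against a list field: membership test.
            if condition not in value:
                return False
        elif value != condition:
            return False
    return True


runners = [
    {
        "split_runner_id": "runner-a",
        "split_runner_metadata": {"framework": "transformers"},
        "split_runner_tags": ["llm", "split"],
    },
    {
        "split_runner_id": "runner-b",
        "split_runner_metadata": {"framework": "other"},
        "split_runner_tags": ["vision"],
    },
]
query = {
    "split_runner_metadata.framework": "transformers",
    "split_runner_tags": {"$in": ["llm"]},
}
selected = [r["split_runner_id"] for r in runners if matches(r, query)]
```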

Model Layers Registry

The Model Layers Registry is used to store information about individual model layers obtained as a result of model splitting. This registry enables reusability by tracking layer hashes that can be referenced as metadata within blocks. By doing so, the system can detect whether a block is already running with an existing split, allowing for intelligent sharing and avoiding redundant layer creation.

To ensure uniqueness and consistency, the MD5 hash of each model layer must be computed and stored as the model_layer_hash. This hash serves as the primary key in the model layers registry and is the basis for identifying and matching reusable layers across different model instantiations.
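
A minimal sketch of computing `model_layer_hash` with Python's standard `hashlib` follows. The source does not specify how a layer is serialized, so the functions take raw bytes or a file path and leave serialization to the caller; both function names are hypothetical. The hash is only useful for matching if the same layer always serializes to the same bytes.

```python
import hashlib


def compute_model_layer_hash(layer_bytes):
    """Return the MD5 hex digest of an already-serialized layer."""
    return hashlib.md5(layer_bytes).hexdigest()


def compute_model_layer_hash_from_file(path, chunk_size=1 << 20):
    """Hash a serialized layer file in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until the file is exhausted.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Identical bytes always produce the same 32-character hex digest, which is what allows the hash to serve as a primary key.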


Model Layers Registry schema:

from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class ModelLayerObject:
    model_layer_hash: str = ''  # primary key: MD5 hash of the serialized layer
    model_asset_id: str = ''
    model_component_registry_uri: str = ''
    model_layer_public_url: str = ''
    model_layer_metadata: List[Dict[str, Any]] = field(default_factory=list)
    model_layer_rank: int = 0
    model_world_size: int = 0

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `model_layer_hash` | `string` | yes | Primary key. MD5 hash of the serialized model layer. Used for uniqueness. |
| `model_asset_id` | `string` | yes | ID of the original model asset from which this layer was generated. |
| `model_component_registry_uri` | `string` | yes | URI pointing to the component spec or metadata used for this model layer. |
| `model_layer_public_url` | `string` | yes | Public URL where the layer artifact is hosted (e.g., S3, HTTP). |
| `model_layer_metadata` | `array[dict]` | no | Arbitrary metadata attached to this layer (e.g., shape, precision, config). |
| `model_layer_rank` | `integer` | yes | Rank/index of this layer in the full model pipeline (used in splits). |
| `model_world_size` | `integer` | yes | Total number of model splits (parallel components) this layer belongs to. |
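
To illustrate how the hash-keyed schema supports reuse, the sketch below keeps layers in an in-memory dictionary keyed by `model_layer_hash`. The `InMemoryLayerRegistry` class is hypothetical and stands in for whatever storage the real registry uses; the dataclass is repeated so the example is self-contained, and the field values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ModelLayerObject:
    model_layer_hash: str = ''
    model_asset_id: str = ''
    model_component_registry_uri: str = ''
    model_layer_public_url: str = ''
    model_layer_metadata: List[Dict[str, Any]] = field(default_factory=list)
    model_layer_rank: int = 0
    model_world_size: int = 0


class InMemoryLayerRegistry:
    """Hypothetical stand-in for the registry, keyed by layer hash."""

    def __init__(self):
        self._layers: Dict[str, ModelLayerObject] = {}

    def register(self, layer: ModelLayerObject) -> bool:
        """Store the layer; return False if its hash already exists."""
        if layer.model_layer_hash in self._layers:
            return False
        self._layers[layer.model_layer_hash] = layer
        return True

    def find(self, layer_hash: str) -> Optional[ModelLayerObject]:
        """Return the registered layer for this hash, or None."""
        return self._layers.get(layer_hash)


registry = InMemoryLayerRegistry()
layer = ModelLayerObject(
    model_layer_hash="5d41402abc4b2a76b9719d911017c592",  # illustrative value
    model_asset_id="example-asset",                       # illustrative value
    model_layer_rank=0,
    model_world_size=2,
)
first = registry.register(layer)   # new hash: stored
second = registry.register(layer)  # same hash: existing layer is reused
```

A caller that gets `False` from `register` can call `find` with the same hash to reference the existing layer instead of creating a new one.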
