TensorFusionWorkload
TensorFusionWorkload is the Schema for the tensorfusionworkloads API.
Kubernetes Resource Information
Field | Value |
---|---|
API Version | tensor-fusion.ai/v1 |
Kind | TensorFusionWorkload |
Scope | Namespaced |
Spec
WorkloadProfileSpec defines the desired state of WorkloadProfile.
Property | Type | Constraints | Description |
---|---|---|---|
autoScalingConfig | object | | AutoScalingConfig set here overrides the pool's schedulingConfig. This field cannot be fully expressed through annotations; to enable auto-scaling via annotations, set `tensor-fusion.ai/auto-limits\|requests\|replicas: 'true'`. |
gpuCount | integer&lt;int32&gt; | | Number of GPUs used by the workload. Defaults to 1. |
gpuModel | string | | Required GPU model (e.g., "A100", "H100"). |
isLocalGPU | boolean | | Schedule the workload onto the same GPU server that runs its vGPU worker, for best performance. Defaults to false. |
nodeAffinity | object | | Node affinity requirements for the workload. |
poolName | string | | |
qos | string | low, medium, high, critical | Quality-of-service level for the client. |
replicas | integer&lt;int32&gt; | | If unset, replicas are determined dynamically from pending Pods. If isLocalGPU is true, replicas must be dynamic and this field is ignored. |
resources | object | | |
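Putting the spec fields above together, a minimal manifest might look like the sketch below. The metadata name, namespace, and pool name are hypothetical placeholders, not values defined by this API.

```yaml
apiVersion: tensor-fusion.ai/v1
kind: TensorFusionWorkload
metadata:
  name: example-workload        # hypothetical name
  namespace: default
spec:
  poolName: shared-gpu-pool     # hypothetical pool name
  gpuCount: 1                   # default is 1; shown for clarity
  gpuModel: "A100"
  qos: medium                   # one of: low, medium, high, critical
  replicas: 2                   # omit to let replicas scale with pending Pods
  isLocalGPU: false             # if true, replicas above would be ignored
```

Note that `replicas` and `isLocalGPU: true` are mutually exclusive in practice: with a local GPU placement, the replica count is always dynamic.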
Status
TensorFusionWorkloadStatus defines the observed state of TensorFusionWorkload.
Property | Type | Constraints | Description |
---|---|---|---|
conditions | array | | Latest available observations of the workload's current state. |
phase | string | Pending, Running, Failed, Unknown | Default: Pending |
podTemplateHash | string | | Hash of the pod template used to create worker pods. |
readyWorkers | integer&lt;int32&gt; | | Number of vGPU workers that are ready. |
workerCount | integer&lt;int32&gt; | | Total number of vGPU workers. |
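For reference, a populated status on a healthy workload could look like the following sketch. The hash value and the condition `type` are illustrative assumptions; the source only specifies that `conditions` holds the latest observations.

```yaml
status:
  phase: Running
  podTemplateHash: 5d8f9c7b6a     # illustrative hash value
  workerCount: 2
  readyWorkers: 2
  conditions:
    - type: Ready                  # hypothetical condition type
      status: "True"
```

A controller or user can compare `readyWorkers` against `workerCount` to decide whether all vGPU workers have come up.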