
SchedulingConfigTemplate

SchedulingConfigTemplate is the Schema for the schedulingconfigtemplates API.

Kubernetes Resource Information

| Field | Value |
| --- | --- |
| API Version | tensor-fusion.ai/v1 |
| Kind | SchedulingConfigTemplate |
| Scope | Cluster |


Spec

Places the workload on the right nodes and scales it smartly.

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoScaling | object | | Scales the workload based on usage and traffic. |
| hypervisor | object | | Single-GPU-device multi-process queuing and fair scheduling with QoS constraints. |
| placement | object | | Places the client or worker on the best-matched nodes. |
| reBalancer | object | | Avoids hot GPU devices and continuously rebalances the workload. Implemented by triggering a simulated scheduling run and advising better GPU nodes to the scheduler. |
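
For orientation, here is a minimal manifest sketch. The apiVersion, kind, and spec block names come from this page; the template name and all values are illustrative assumptions, not documented defaults:

```yaml
apiVersion: tensor-fusion.ai/v1
kind: SchedulingConfigTemplate
metadata:
  name: example-template    # cluster-scoped, so no namespace
spec:
  autoScaling: {}           # usage- and traffic-based scaling (see below)
  hypervisor: {}            # per-GPU queuing and QoS (see below)
  placement:
    mode: CompactFirst      # documented default
  reBalancer:
    enable: true
```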

autoScaling

Scales the workload based on usage and traffic.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoSetLimits | object | | Layer 1, vertical auto-scaling: turbo-bursts onto existing GPU cards quickly. VPA-like; aggregates metrics over windows shorter than 1 minute. |
| autoSetReplicas | object | | Layer 2, horizontal auto-scaling: scales out to more GPU cards when the max-limits threshold is hit. HPA-like; aggregates metrics over 1 minute to 1 hour. When a tf-worker scales up, the client pod's owner (Deployment, etc.) should also increase its replicas; check whether KNative works for this. |
| autoSetRequests | object | | Layer 3 adjustment: matches baseline requests to actual usage over the long run. Only for N:M remote vGPU mode; not implemented yet. Adjusts requests over longer periods, such as 1 day to 2 weeks. |

autoSetLimits

Layer 1, vertical auto-scaling: turbo-bursts onto existing GPU cards quickly. VPA-like; aggregates metrics over windows shorter than 1 minute.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| evaluationPeriod | string | | |
| extraTFlopsBufferRatio | string | | |
| ignoredDeltaRange | string | | |
| maxRatioToRequests | string | | Multiplier of requests that caps how high limits can be set, e.g. 5.0. |
| prediction | object | | |
| scaleUpStep | string | | |
| targetResource | string | | Target resource to scale limits: "tflops", "vram", or "all" (default). |
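
A hedged sketch of an autoSetLimits block under spec.autoScaling. The field names come from the table above, but every value, the duration format, and the prediction model name are assumptions:

```yaml
spec:
  autoScaling:
    autoSetLimits:
      enable: true
      evaluationPeriod: 30s        # assumed Go-style duration string
      extraTFlopsBufferRatio: "0.2"
      ignoredDeltaRange: "0.05"
      maxRatioToRequests: "5.0"    # cap limits at 5x requests
      scaleUpStep: "0.1"
      targetResource: all          # "tflops", "vram", or "all" (default)
      prediction:
        enable: true
        historyDataPeriod: 24h
        model: example-model       # hypothetical model name
        predictionPeriod: 5m
```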

prediction

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |

autoSetReplicas

Layer 2, horizontal auto-scaling: scales out to more GPU cards when the max-limits threshold is hit. HPA-like; aggregates metrics over 1 minute to 1 hour. When a tf-worker scales up, the client pod's owner (Deployment, etc.) should also increase its replicas; check whether KNative works for this.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| evaluationPeriod | string | | |
| scaleDownCoolDownTime | string | | |
| scaleDownStep | string | | |
| scaleUpCoolDownTime | string | | |
| scaleUpStep | string | | |
| targetTFlopsOfLimits | string | | |
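
A sketch of autoSetReplicas with assumed values; in particular, reading targetTFlopsOfLimits as a utilization threshold relative to limits is a guess from its name, not documented behavior:

```yaml
spec:
  autoScaling:
    autoSetReplicas:
      enable: true
      evaluationPeriod: 5m
      targetTFlopsOfLimits: "0.8"   # assumed: scale out when TFLOPS usage nears 80% of limits
      scaleUpStep: "1"
      scaleUpCoolDownTime: 2m
      scaleDownStep: "1"
      scaleDownCoolDownTime: 10m
```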

autoSetRequests

Layer 3 adjustment: matches baseline requests to actual usage over the long run. Only for N:M remote vGPU mode; not implemented yet. Adjusts requests over longer periods, such as 1 day to 2 weeks.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| aggregationPeriod | string | | |
| enable | boolean | | |
| evaluationPeriod | string | | |
| extraBufferRatio | string | | Request buffer ratio; for example, if actual usage is 1.0, a 10% buffer yields 1.1 as the final preferred requests. |
| percentileForAutoRequests | string | | |
| prediction | object | | |
| targetResource | string | | Target resource to scale requests: "tflops", "vram", or "all" (default). |
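
A sketch of autoSetRequests (not implemented yet, per the table above). The comment shows the buffer arithmetic from the extraBufferRatio description; all other values are assumptions:

```yaml
spec:
  autoScaling:
    autoSetRequests:
      enable: true
      aggregationPeriod: 1h
      evaluationPeriod: 24h            # long-run window, e.g. 1 day to 2 weeks
      percentileForAutoRequests: "95"  # assumed: percentile of observed usage
      extraBufferRatio: "0.1"          # usage 1.0 -> requests 1.0 * (1 + 0.1) = 1.1
      targetResource: all
      prediction:
        enable: false
```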

prediction

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |

hypervisor

Single-GPU-device multi-process queuing and fair scheduling with QoS constraints.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoFreezeAndResume | object | | Additional layer to save VRAM: automatically freezes GPU memory and cools it down to RAM and disk. The hypervisor monitors inactive workers and triggers their freeze; the Operator should mark them as scaled-to-zero and release their GPU pool resources. The CPU client part is not scaled down, so it can keep serving traffic or be scaled down by other auto-scaling solutions such as KEDA/KNative. |
| multiProcessQueuing | object | | The hypervisor moves low-priority jobs to a pending queue when the GPU is full. This config adjusts the hypervisor's queuing behavior to balance co-scheduled CUDA calls. |

autoFreezeAndResume

Additional layer to save VRAM: automatically freezes GPU memory and cools it down to RAM and disk. The hypervisor monitors inactive workers and triggers their freeze; the Operator should mark them as scaled-to-zero and release their GPU pool resources. The CPU client part is not scaled down, so it can keep serving traffic or be scaled down by other auto-scaling solutions such as KEDA/KNative.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoFreeze | array | | |
| intelligenceWarmup | object | | |

autoFreeze (items)

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| freezeToDiskTTL | string | | |
| freezeToMemTTL | string | | |
| qos | string | Enum: low, medium, high, critical | |

intelligenceWarmup

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |
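
A sketch of autoFreezeAndResume covering both sub-blocks. The qos values are the documented enum; the TTL semantics (idle time before each freeze tier), all durations, and the model name are assumptions:

```yaml
spec:
  hypervisor:
    autoFreezeAndResume:
      autoFreeze:
        - qos: low
          enable: true
          freezeToMemTTL: 5m     # assumed: idle time before freezing VRAM to RAM
          freezeToDiskTTL: 30m   # assumed: further idle time before spilling to disk
        - qos: critical
          enable: false          # never freeze critical workloads
      intelligenceWarmup:
        enable: true
        historyDataPeriod: 7d
        model: example-model     # hypothetical model name
        predictionPeriod: 1h
```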

multiProcessQueuing

The hypervisor moves low-priority jobs to a pending queue when the GPU is full. This config adjusts the hypervisor's queuing behavior to balance co-scheduled CUDA calls.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| interval | string | | |
| queueLevelTimeSlices | array | | |
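
A sketch of multiProcessQueuing. The schema only says queueLevelTimeSlices is an array, so treating its entries as one time slice per priority level is an assumption:

```yaml
spec:
  hypervisor:
    multiProcessQueuing:
      enable: true
      interval: 100ms           # assumed queue-evaluation cadence
      queueLevelTimeSlices:     # assumed: one time slice per priority level
        - 10ms
        - 50ms
        - 200ms
```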

placement

Places the client or worker on the best-matched nodes.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| allowUsingLocalGPU | boolean | Default: true | |
| gpuFilters | array | | |
| mode | string | Enum: CompactFirst, LowLoadFirst. Default: CompactFirst | |

gpuFilters (items)

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| params | object | | |
| type | string | | |
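
A sketch of placement. The mode and allowUsingLocalGPU values are documented; the filter type and params are hypothetical, since the schema leaves them as an open object:

```yaml
spec:
  placement:
    mode: LowLoadFirst        # or CompactFirst (default)
    allowUsingLocalGPU: true  # default: true
    gpuFilters:
      - type: example-filter  # hypothetical filter type
        params:               # free-form parameters for the filter
          key: value
```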

reBalancer

Avoids hot GPU devices and continuously rebalances the workload. Implemented by triggering a simulated scheduling run and advising better GPU nodes to the scheduler.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| interval | string | | |
| reBalanceCoolDownTime | string | | |
| threshold | object | | |

threshold

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| matchAny | object | | |
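
A sketch of reBalancer. Since matchAny is an open object in the schema, the threshold key shown is hypothetical:

```yaml
spec:
  reBalancer:
    enable: true
    interval: 5m
    reBalanceCoolDownTime: 30m
    threshold:
      matchAny:
        gpuUtilizationAbove: "0.9"   # hypothetical threshold key
```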

Status

SchedulingConfigTemplateStatus defines the observed state of SchedulingConfigTemplate.