SchedulingConfigTemplate

SchedulingConfigTemplate is the Schema for the schedulingconfigtemplates API.

Kubernetes Resource Information

| Field | Value |
| --- | --- |
| API Version | tensor-fusion.ai/v1 |
| Kind | SchedulingConfigTemplate |
| Scope | Cluster |

Spec

Place workloads on the right nodes and scale them intelligently.

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoScaling ↓ | object | | Scales the workload based on usage and traffic. |
| hypervisor ↓ | object | | Queues multiple processes on a single GPU device and schedules them fairly under QoS constraints. |
| placement ↓ | object | | Places the client or worker on the best-matched nodes. |
| reBalancer ↓ | object | | Avoids hot GPU devices and continuously balances the workload. Implemented by triggering a simulated scheduling run that advises the scheduler of better GPU nodes. |
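
The four top-level blocks could sit in a manifest roughly as sketched below. The `apiVersion`, `kind`, scope, and field names come from this page. The resource name `example-scheduling-config` is a hypothetical placeholder, and the empty objects stand in for the settings described in the sections that follow.

```yaml
# Minimal sketch, not a recommended configuration.
apiVersion: tensor-fusion.ai/v1
kind: SchedulingConfigTemplate
metadata:
  name: example-scheduling-config   # hypothetical name; cluster-scoped, so no namespace
spec:
  placement: {}     # where clients and workers are placed
  autoScaling: {}   # limits, replicas, requests, and scale-to-zero settings
  reBalancer: {}    # continuous balancing away from hot GPU devices
  hypervisor: {}    # multi-process queuing on a single GPU device
```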

autoScaling

Scales the workload based on usage and traffic.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoSetLimits ↓ | object | | Layer 1: vertical auto-scaling that bursts quickly onto the existing GPU cards. |
| autoSetReplicas ↓ | object | | Layer 2: horizontal auto-scaling that scales out to more GPU cards when the maximum limits threshold is hit. |
| autoSetRequests ↓ | object | | Layer 3: adjusts requests to match actual usage over the long run. |
| scaleToZero ↓ | object | | Additional layer that saves VRAM by auto-freezing memory and cooling it down to RAM and disk. |

autoSetLimits

Layer 1: vertical auto-scaling that bursts quickly onto the existing GPU cards.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| evaluationPeriod | string | | |
| extraTFlopsBufferRatio | string | | |
| ignoredDeltaRange | string | | |
| maxRatioToRequests | string | | Multiplier applied to requests that keeps the limit from being set too high, for example 5.0. |
| prediction ↓ | object | | |
| scaleUpStep | string | | |

prediction

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |
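
An `autoSetLimits` block, including its `prediction` sub-object, might look like the fragment below. Only the field names and the `5.0` example are taken from this page. Every other value, and the duration and ratio string formats, are assumptions made for illustration. `<model-name>` is a placeholder because the supported models are not listed here.

```yaml
# Fragment of spec.autoScaling; values are illustrative assumptions.
autoSetLimits:
  evaluationPeriod: "30s"         # assumed duration format
  extraTFlopsBufferRatio: "0.1"   # assumed ratio format
  ignoredDeltaRange: "0.05"       # assumed ratio format
  scaleUpStep: "0.2"              # assumed ratio format
  maxRatioToRequests: "5.0"       # read here as: limits stay within 5x requests
  prediction:
    enable: false
    model: "<model-name>"         # placeholder, not a real model name
    historyDataPeriod: "1h"       # assumed duration format
    predictionPeriod: "5m"        # assumed duration format
```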

autoSetReplicas

Layer 2: horizontal auto-scaling that scales out to more GPU cards when the maximum limits threshold is hit.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| evaluationPeriod | string | | |
| scaleDownCoolDownTime | string | | |
| scaleDownStep | string | | |
| scaleUpCoolDownTime | string | | |
| scaleUpStep | string | | |
| targetTFlopsOfLimits | string | | |
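
As a rough sketch, an `autoSetReplicas` block could be written as follows. The field names come from the table above. All values and their string formats are assumptions chosen for illustration, not documented defaults.

```yaml
# Fragment of spec.autoScaling; values are illustrative assumptions.
autoSetReplicas:
  enable: true
  evaluationPeriod: "1m"         # assumed duration format
  targetTFlopsOfLimits: "0.7"    # assumed ratio format
  scaleUpStep: "1"               # assumed step format
  scaleUpCoolDownTime: "2m"      # assumed duration format
  scaleDownStep: "1"             # assumed step format
  scaleDownCoolDownTime: "10m"   # assumed duration format
```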

autoSetRequests

Layer 3: adjusts requests to match actual usage over the long run.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| aggregationPeriod | string | | |
| evaluationPeriod | string | | |
| extraBufferRatio | string | | Buffer ratio added to requests. For example, with an actual usage of 1.0, a 10% buffer gives 1.1 as the final preferred requests. |
| percentileForAutoRequests | string | | |
| prediction ↓ | object | | |

prediction

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |
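
The `autoSetRequests` settings, with the 10% buffer example from the description of `extraBufferRatio`, might be expressed as in the fragment below. Whether a 10% buffer is written as `"0.1"` is an assumption, as are all other values and string formats. `<model-name>` is a placeholder.

```yaml
# Fragment of spec.autoScaling; values are illustrative assumptions.
autoSetRequests:
  evaluationPeriod: "1h"             # assumed duration format
  aggregationPeriod: "24h"           # assumed duration format
  percentileForAutoRequests: "0.95"  # assumed percentile format
  extraBufferRatio: "0.1"            # assumed encoding of 10%: usage 1.0 -> requests 1.1
  prediction:
    enable: true
    model: "<model-name>"            # placeholder, not a real model name
    historyDataPeriod: "168h"        # assumed duration format
    predictionPeriod: "1h"           # assumed duration format
```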

scaleToZero

Additional layer that saves VRAM by auto-freezing memory and cooling it down to RAM and disk.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| autoFreeze ↓ | array | | |
| intelligenceWarmup ↓ | object | | |

autoFreeze (items)

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| freezeToDiskTTL | string | | |
| freezeToMemTTL | string | | |
| qos | string | Allowed values: low, medium, high, critical | |

intelligenceWarmup

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| historyDataPeriod | string | | |
| model | string | | |
| predictionPeriod | string | | |
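
Because `autoFreeze` is an array whose items each carry a `qos` value, one plausible reading is that each entry configures freezing for one QoS level. The fragment below is a sketch under that assumption. The `qos` values are the ones listed above. The TTL values and their duration format are illustrative guesses.

```yaml
# Fragment of spec.autoScaling; values are illustrative assumptions.
scaleToZero:
  autoFreeze:
    - qos: low
      enable: true
      freezeToMemTTL: "10m"    # assumed duration format
      freezeToDiskTTL: "1h"    # assumed duration format
    - qos: critical
      enable: false            # assumed: entries can opt a QoS level out
  intelligenceWarmup:
    enable: false
```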

hypervisor

Queues multiple processes on a single GPU device and schedules them fairly under QoS constraints.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| multiProcessQueuing ↓ | object | | |

multiProcessQueuing

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| enable | boolean | | |
| interval | string | | |
| queueLevelTimeSlices | array | | |

placement

Places the client or worker on the best-matched nodes.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| allowUsingLocalGPU | boolean | Default: true | |
| gpuFilters ↓ | array | | |
| mode | string | Allowed values: CompactFirst, LowLoadFirst. Default: CompactFirst | |

gpuFilters (items)

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| params | object | | |
| type | string | | |
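
A `placement` block that overrides the default mode could look like the sketch below. The `mode` values and the defaults come from the table above. The filter entry uses the placeholder `<filter-type>` and an empty `params` object because the valid filter types and their parameters are not listed on this page.

```yaml
# Fragment of spec; the gpuFilters entry is a placeholder.
placement:
  mode: LowLoadFirst          # the other listed value is CompactFirst (the default)
  allowUsingLocalGPU: true    # matches the documented default
  gpuFilters:
    - type: "<filter-type>"   # placeholder, not a real filter type
      params: {}
```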

reBalancer

Avoids hot GPU devices and continuously balances the workload. It is implemented by triggering a simulated scheduling run that advises the scheduler of better GPU nodes.

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| internal | string | | |
| reBalanceCoolDownTime | string | | |
| threshold ↓ | object | | |

threshold

Properties

| Property | Type | Constraints | Description |
| --- | --- | --- | --- |
| matchAny | object | | |

Status

SchedulingConfigTemplateStatus defines the observed state of SchedulingConfigTemplate.