TensorFusion Docs

Config QoS and Billing

Configure QoS levels for different workloads, and set up virtual usage-based billing for each tenant and workload.

Configure Scheduling Priority of Each QoS Level

🚧 Under Construction

Configure Unit Price of GPU

kubectl edit configmap tensor-fusion-sys-public-gpu-info -n tensor-fusion-sys
# Refer:
#  - https://www.techpowerup.com/gpu-specs
#  - https://getdeploying.com/reference/cloud-gpu

# Field Definitions:
# - 'model' identifies the GPU, in the form `GPUModel_BoardSlotType`
# - 'costPerHour' is the average hourly cost across a few cloud/serverless GPU vendors
# - 'fp16TFlops' is the peak FP16 TFLOPs of the GPU. For NVIDIA GPUs, this is the dense (non-sparse) Tensor Core figure

# Note: this sheet lists only TFLOPs, not VRAM. Variants of the same GPU can share a TFLOPs figure while differing in VRAM, and VRAM can be detected at runtime through the NVML library
# TODO: make this dynamic once users can enter their cloud vendor and discount info; for example, Azure and AWS prices are much higher than the values in this sheet

# Turing Architecture Series
- model: T4
  fullModelName: "Tesla T4"
  vendor: NVIDIA
  costPerHour: 0.53
  fp16TFlops: 65
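
To see how the two numeric fields relate, the sketch below derives a per-TFLOP hourly price from the T4 entry above and estimates a charge for a partial GPU allocation. It assumes cost scales linearly with the requested share of FP16 TFLOPs; that linear model, the `GpuInfo` class, and both helper functions are illustrative assumptions, not TensorFusion's actual billing formula or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuInfo:
    """One GPU info entry, mirroring the fields of the T4 entry (hypothetical class)."""
    model: str
    full_model_name: str
    vendor: str
    cost_per_hour: float  # average hourly cost of the whole GPU
    fp16_tflops: float    # peak dense FP16 Tensor Core TFLOPs


def price_per_tflop_hour(gpu: GpuInfo) -> float:
    """Hourly price of one TFLOP, ASSUMING cost is proportional to compute."""
    return gpu.cost_per_hour / gpu.fp16_tflops


def estimate_cost(gpu: GpuInfo, requested_tflops: float, hours: float) -> float:
    """Estimated charge for holding `requested_tflops` for `hours` (assumed linear model)."""
    if requested_tflops < 0 or hours < 0:
        raise ValueError("requested_tflops and hours must be non-negative")
    return price_per_tflop_hour(gpu) * requested_tflops * hours


# Values copied from the T4 entry; the currency is whatever costPerHour is expressed in.
T4 = GpuInfo(model="T4", full_model_name="Tesla T4", vendor="NVIDIA",
             cost_per_hour=0.53, fp16_tflops=65)

if __name__ == "__main__":
    print(f"{price_per_tflop_hour(T4):.5f} per TFLOP-hour")
    print(f"{estimate_cost(T4, requested_tflops=10, hours=2):.4f} for 10 TFLOPs over 2 hours")
```

Under this assumption, raising `costPerHour` or lowering `fp16TFlops` for a model raises the price of every TFLOP requested on that GPU, so the two fields are worth reviewing together when editing the ConfigMap.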
