Support CPU request override for containerized deployments#1500
Conversation
Media processing pods intentionally omit CPU limits to avoid CFS throttling on real-time media workloads. Without a limit, NumCPU() falls back to runtime.NumCPU(), which returns the full node CPU count. This can inflate the admission budget: on a 64-core node with a 15-CPU request, the monitor thinks it has 64 CPUs available and accepts too many jobs. GOMAXPROCS will also be set too high, which can hurt Go runtime goroutine scheduling (frequent context switches).

CPU requests are a scheduler-level concept with no cgroup representation, so cgroup-based approaches (automaxprocs, Go 1.25 container awareness) can't help. The Kubernetes Downward API can expose requests.cpu as an env var, which this change reads.

EffectiveCPURequest() in cpu.go reads the env var once at startup, and NumCPU() returns it when set, falling back to platform detection otherwise; monitorProcesses picks this up automatically. A new maxprocs package caps GOMAXPROCS down to ceil(request) via init(), never increasing it beyond the current value, so explicit settings and cgroup quotas are respected.
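The Downward API wiring the description refers to could look like the sketch below. The env var name and the millicore divisor are assumptions for illustration, not taken from this PR:

```yaml
env:
  - name: CPU_REQUEST_MILLIS   # hypothetical name; must match what the Go code reads
    valueFrom:
      resourceFieldRef:
        resource: requests.cpu
        divisor: 1m            # expose the request in millicores, e.g. "15000" for 15 CPUs
```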
Do we still need this?

As far as I can see, while useful, we don't: that new default value only takes effect when a CPU limit is set, which we don't do on media processing pods. If a service wants to use it, I think there is nothing explicit to be done beyond making sure the Go version is updated and a limit is set (for workloads where it makes sense).
```go
	return effectiveCPURequest
}

func parseCPURequestEnvOnce() {
```
:nit: call this parseCPURequestEnv; we don't need "Once" in the function name?