Kubernetes v1.36: A New Era of Workload Scheduling with Separate PodGroup API

Introduction

Kubernetes v1.36 marks a significant milestone in the evolution of workload-aware scheduling, building on the foundation laid in v1.35. AI, ML, and batch workloads often demand scheduling logic that goes far beyond simple per-Pod placement. To address these challenges, the Kubernetes community has introduced a streamlined architecture that cleanly separates the static template definition from runtime state management. This release not only refines the Workload and PodGroup APIs but also debuts topology-aware scheduling, workload-aware preemption, and deeper integration with the Job controller. Let’s dive into the key improvements.

Workload and PodGroup API: A Clean Separation

From v1alpha1 to v1alpha2

In Kubernetes v1.35, both the Pod group definition and its runtime state were bundled within the same Workload resource. This design worked but limited scalability and clarity. Kubernetes v1.36 introduces the scheduling.k8s.io/v1alpha2 API version, which replaces v1alpha1 entirely. The Workload API now serves as a static template, while the new PodGroup API handles all runtime aspects. This separation simplifies the scheduler’s logic and improves performance by allowing per-replica sharding of status updates.

How the New Model Works

The kube-scheduler can now read the PodGroup object directly, without needing to parse the Workload resource itself. The scheduler only cares about the runtime state: the group’s current membership, scheduling conditions, and policy. This decoupling makes the scheduling cycle more efficient and paves the way for future enhancements like atomic workload processing.

Configuration Example

A Workload controller (such as the Job controller) defines a Workload object that acts as a template. For instance, a training job might define a template for worker pods:

apiVersion: scheduling.k8s.io/v1alpha2
kind: Workload
metadata:
  name: training-job-workload
  namespace: some-ns
spec:
  podGroupTemplates:
  - name: workers
    schedulingPolicy:
      gang:
        minCount: 4

Controllers then stamp out runtime PodGroup instances based on these templates. Each PodGroup object holds the actual scheduling policy and a reference to the template it was created from. It also includes status conditions that reflect the scheduling state of all member Pods, enabling the scheduler to make informed decisions.

apiVersion: scheduling.k8s.io/v1alpha2
kind: PodGroup
metadata:
  name: training-job-workers-xyz
  namespace: some-ns
spec:
  # ... policy fields inherited from template
status:
  conditions:
  - type: Scheduled
    status: "True"
    lastTransitionTime: "..."

This example demonstrates how the runtime PodGroup carries the necessary information for the scheduler to work with, without requiring access to the original Workload object.
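As a rough illustration of the "stamping" step, the sketch below copies a template's scheduling policy into a new PodGroup manifest, using plain Python dictionaries. It is not how any real controller is implemented. The function name, the generated object name, and the `templateRef` field with its keys are hypothetical and were chosen only for this example; the actual v1alpha2 field names may differ.

```python
import copy

# The Workload from the manifest above, written as a plain dict.
WORKLOAD = {
    "apiVersion": "scheduling.k8s.io/v1alpha2",
    "kind": "Workload",
    "metadata": {"name": "training-job-workload", "namespace": "some-ns"},
    "spec": {
        "podGroupTemplates": [
            {"name": "workers", "schedulingPolicy": {"gang": {"minCount": 4}}},
        ]
    },
}


def stamp_pod_group(workload, template_name, suffix):
    """Build a PodGroup manifest from one named template of a Workload."""
    templates = workload["spec"]["podGroupTemplates"]
    template = next((t for t in templates if t["name"] == template_name), None)
    if template is None:
        raise KeyError(f"no podGroupTemplate named {template_name!r}")

    workload_name = workload["metadata"]["name"]
    return {
        "apiVersion": workload["apiVersion"],
        "kind": "PodGroup",
        "metadata": {
            # Hypothetical naming scheme, for illustration only.
            "name": f"{workload_name}-{template_name}-{suffix}",
            "namespace": workload["metadata"]["namespace"],
        },
        "spec": {
            # Hypothetical field: a back-reference to the source template.
            "templateRef": {
                "workloadName": workload_name,
                "podGroupTemplateName": template_name,
            },
            # Deep copy so the runtime object does not alias the template.
            "schedulingPolicy": copy.deepcopy(template["schedulingPolicy"]),
        },
    }
```

Calling `stamp_pod_group(WORKLOAD, "workers", "xyz")` yields a manifest that carries its own copy of the gang policy, so a consumer of the PodGroup does not need to look up the Workload.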

Enhanced Scheduling Capabilities

PodGroup Scheduling Cycle

Kubernetes v1.36 introduces a dedicated PodGroup scheduling cycle in the kube-scheduler. This cycle enables atomic processing of workloads: the scheduler evaluates all Pods in a PodGroup together, ensuring that either the entire group is scheduled or none are (gang scheduling semantics). This is critical for batch workloads where all workers must be available simultaneously.
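The all-or-nothing behavior can be illustrated with a toy placement routine. This is a simplified sketch, not the kube-scheduler's algorithm: it assumes a single resource (CPU, in whole cores), uses a first-fit heuristic over largest requests first, and the function and parameter names are invented for the example. The key point is that placement is tried against a scratch copy of node capacity and a result is returned only if every Pod fits.

```python
from typing import Dict, Optional


def place_group(
    pod_cpu_requests: Dict[str, int],
    node_free_cpu: Dict[str, int],
) -> Optional[Dict[str, str]]:
    """Return a pod -> node assignment for the whole group, or None.

    Nothing is committed unless every Pod in the group can be placed.
    """
    free = dict(node_free_cpu)  # scratch copy; the caller's view is untouched
    assignment: Dict[str, str] = {}

    # Place the largest requests first (first-fit decreasing heuristic).
    ordered = sorted(pod_cpu_requests.items(), key=lambda item: -item[1])
    for pod, request in ordered:
        node = next((n for n, cpu in free.items() if cpu >= request), None)
        if node is None:
            return None  # one Pod does not fit, so the group is not placed
        free[node] -= request
        assignment[pod] = node

    return assignment
```

With two nodes that each have 4 free cores, a group of four 2-core workers is placed in full, while a group of five such workers gets no placement at all. Because first-fit is a heuristic, this sketch can reject a group that a smarter packing would accept.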

Topology-Aware Scheduling and Workload-Aware Preemption

The release also debuts the first iterations of topology-aware scheduling and workload-aware preemption. Topology-aware scheduling allows the scheduler to place Pods from a PodGroup in a way that respects node topology (e.g., same rack or availability zone), reducing latency for tightly coupled workloads. Workload-aware preemption ensures that preemption decisions consider the entire PodGroup. For example, the scheduler avoids preempting a single Pod from a group when doing so would prevent the rest from running.
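To make the topology idea concrete, the following sketch picks a single topology domain that can hold an entire group. It covers only the topology half of this section and is an assumption-laden toy: capacity is modeled as a count of free Pod "slots" per node, the tightest-fit tie-break is an arbitrary choice, and the function name and the `example.com/rack` label are hypothetical. Only `topology.kubernetes.io/zone` is a real well-known node label.

```python
from collections import defaultdict
from typing import Dict, Optional


def pick_domain(
    node_labels: Dict[str, Dict[str, str]],
    node_free_slots: Dict[str, int],
    group_size: int,
    topology_key: str,
) -> Optional[str]:
    """Pick one topology domain with room for the whole group, or None."""
    slots_per_domain: Dict[str, int] = defaultdict(int)
    for node, labels in node_labels.items():
        domain = labels.get(topology_key)
        if domain is not None:  # nodes without the label are ignored
            slots_per_domain[domain] += node_free_slots.get(node, 0)

    candidates = [d for d, s in slots_per_domain.items() if s >= group_size]
    if not candidates:
        return None
    # Prefer the tightest fit; break ties by name for determinism.
    return min(candidates, key=lambda d: (slots_per_domain[d], d))


# Hypothetical cluster: two racks, labeled with a made-up rack key.
NODE_LABELS = {
    "n1": {"example.com/rack": "r1"},
    "n2": {"example.com/rack": "r1"},
    "n3": {"example.com/rack": "r2"},
    "n4": {"example.com/rack": "r2"},
}
NODE_FREE_SLOTS = {"n1": 2, "n2": 1, "n3": 4, "n4": 2}
```

Here rack `r1` has 3 free slots and `r2` has 6, so a group of 4 can only land in `r2`, a group of 3 goes to the tighter `r1`, and a group of 7 fits in no single rack.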

Dynamic Resource Allocation via ResourceClaims

Another highlight is the support for ResourceClaim objects within workloads. This unlocks Dynamic Resource Allocation (DRA) for PodGroups, allowing workloads to request specialized hardware (e.g., GPUs, FPGAs) on a per-PodGroup basis. The scheduler can then coordinate the allocation of these resources across all members of the group.

Integration with Job Controller

To demonstrate real-world readiness, v1.36 delivers the first phase of integration between the Job controller and the new Workload/PodGroup APIs. This integration allows existing batch jobs to seamlessly leverage the improved scheduling capabilities without requiring manual configuration of Workload resources. The Job controller can automatically create the appropriate Workload template and PodGroup instances, enabling users to benefit from gang scheduling and topology awareness with minimal changes to their workflows.

Conclusion

Kubernetes v1.36 represents a leap forward in workload-aware scheduling. By cleanly separating static templates from runtime state, introducing a dedicated PodGroup scheduling cycle, and adding support for topology-aware scheduling and DRA, the release provides a robust foundation for AI/ML and batch workloads. The integration with the Job controller ensures that these features are practical and easy to adopt. As the community continues to refine these APIs in future releases, users can expect even more powerful scheduling capabilities for complex workloads.
