16-17 June

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Japan 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Japan Standard Time (UTC+9:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
Type: AI + ML
Monday, June 16
 

11:30 JST

A Journey and Lessons Learned To Enable IBM AIU Accelerators in Kubernetes - Takuya Mishina & Tatsuhiro Chiba, IBM Research
Monday June 16, 2025 11:30 - 12:00 JST
In this presentation, we will unveil the secrets of, and the journey behind, enabling IBM's AI Accelerator (AIU) in Kubernetes. We utilize a wide range of tools and frameworks - a device plugin, a custom scheduler, a metrics exporter, webhooks, and custom resources - together to satisfy real requirements from various stakeholders: general users, cluster administrators, and driver/runtime developers. Our device plugin and custom scheduler can accept special preferences such as topology-aware allocation to enable RDMA, and a webhook-based validator guides them through specification changes. To keep allocation status in sync among the components, we carefully defined a custom resource after performance estimation. From a developer's perspective, we provide various debugging capabilities: for example, device allocation by PCI address for inspection, and a pseudo-device mode for testing without real devices. In addition, multi-architecture support gives all participants freedom of platform choice.
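
For readers unfamiliar with the device-plugin path mentioned above: a device plugin advertises the accelerator to the kubelet as an extended resource, and workloads then consume it through ordinary resource limits. Below is a minimal sketch of the consuming side only, assuming a hypothetical extended-resource name ibm.com/aiu and a placeholder image; the real resource name is whatever the vendor's device plugin registers.

```python
# Minimal sketch: how a workload consumes a device advertised by a device plugin.
# "ibm.com/aiu" and the image are assumptions for illustration only.
import yaml

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "aiu-test"},
    "spec": {
        "containers": [{
            "name": "workload",
            "image": "registry.example.com/aiu-workload:latest",  # placeholder image
            "resources": {
                # Request one accelerator from whatever the device plugin registered.
                "limits": {"ibm.com/aiu": 1}
            },
        }],
    },
}

print(yaml.safe_dump(pod, sort_keys=False))
```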
Speakers
Takuya Mishina

Researcher, IBM Research
Takuya Mishina is a Staff Research Scientist at IBM Research - Tokyo. He has been working on enhancing cloud infrastructure lifecycle management, such as security and compliance posture management. His recent interests include extending the automation mechanism to provide usable AI hardware...
Tatsuhiro Chiba

Senior Technical Staff Member, IBM Research
Tatsuhiro Chiba is an STSM and Manager at IBM Research, specializing in performance optimization and acceleration of large-scale AI and HPC workloads on Hybrid Cloud. He is leading a project to enhance OpenShift performance and sustainability for AI and HPC by exploiting various cloud...
Level 1 | Pegasus A-B1
  AI + ML

12:10 JST

Beyond Stock Outs: Scaling Inference on Mixed GPU Hardware With DRA - John Belamaric & Bo Fu, Google
Monday June 16, 2025 12:10 - 12:40 JST
Pods are PENDING?!? Ugh, none of the latest GPUs are available. But there are tons of older ones! If only you could tell Kubernetes “use the best GPU available, as long as it has 20GB+”... (enter: DRA).

Kubernetes’ Dynamic Resource Allocation (DRA) system, beta since 1.32, allows variations on which GPUs get allocated to Pods. You can write a flexible spec so that when the Deployment scales, Pods can land on whatever nodes have available GPUs. DRA works even if you need different numbers of devices for different hardware! This enables a new level of utilization and efficiency, saving your organization real money.

Combined with an advanced Node Autoscaler like Google’s Custom Compute Classes or Karpenter, you can spin up more VMs with whatever GPUs are available - or the most economical - all for a single Deployment. Scaling is simpler and more reliable, and your workload can scale even when your preferred type of GPU is stocked out.

Come learn how, and see it in action with a demo!
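
As a rough illustration of the "flexible spec" idea described above, here is a minimal sketch of a DRA ResourceClaimTemplate (resource.k8s.io/v1beta1) that asks for any GPU with at least 20Gi of memory. The device class name and the capacity key in the CEL expression are assumptions for illustration; real names come from the GPU driver's DRA integration.

```python
# Hedged sketch of a DRA claim template: "any GPU with at least 20Gi of memory".
# Device class and capacity key names are driver-specific assumptions.
import yaml

claim_template = {
    "apiVersion": "resource.k8s.io/v1beta1",
    "kind": "ResourceClaimTemplate",
    "metadata": {"name": "any-gpu-20gi"},
    "spec": {
        "spec": {
            "devices": {
                "requests": [{
                    "name": "gpu",
                    "deviceClassName": "gpu.example.com",  # assumed device class
                    "selectors": [{
                        "cel": {
                            # CEL over the device's published capacity; the domain and
                            # "memory" key are placeholders for the real driver's names.
                            "expression": "device.capacity['gpu.example.com'].memory"
                                          ".compareTo(quantity('20Gi')) >= 0"
                        }
                    }],
                }]
            }
        }
    },
}

print(yaml.safe_dump(claim_template, sort_keys=False))
```

A Deployment's pod template would then reference this template under spec.resourceClaims, with containers opting in via resources.claims, so replicas can land on whichever nodes have a matching device free.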
Speakers
John Belamaric

Senior Staff Software Engineer, Google
John is a Sr Staff SWE, co-chair of K8s SIG Architecture and of K8s WG Device Management, helping lead efforts to improve how GPUs, TPUs, NICs and other devices are selected, shared, and configured in Kubernetes. He is also co-founder of Nephio, an LF project for K8s-based automation...
Bo Fu

Senior Product Manager, Google
Level 1 | Pegasus A-B1
  AI + ML

14:10 JST

Access AI Models Anywhere: Scaling AI Traffic With Envoy AI Gateway - Dan Sun, Bloomberg & Takeshi Yoneda, Tetrate.io
Monday June 16, 2025 14:10 - 14:40 JST
As Generative AI adoption increases, organizations face accelerating challenges in deploying, scaling, and managing access to diverse AI models across cloud and on-prem environments. Envoy AI Gateway utilizes Envoy Proxy’s powerful filter architecture and extensibility through ext-proc to deliver key features such as centralized credential management, intelligent model routing, and LLM token usage control.
As the first CNCF-backed open source AI gateway, Envoy AI Gateway is built on top of the robust, high-performance Envoy Gateway to help democratize AI infrastructure for organizations of all sizes.

In this talk, we will dive into the architecture of Envoy AI Gateway to learn how it extends Envoy’s capabilities to efficiently manage AI-driven workloads for enterprise needs, while providing robustness, scalability, and adaptability in the rapidly-changing generative AI landscape. We will also showcase a demo of an AI agent seamlessly accessing models anywhere through a unified API.
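
In practice, the "unified API" typically looks like a single OpenAI-compatible endpoint served by the gateway. A hedged sketch of a client or agent using it, with the gateway URL, credential handling, and model name as placeholders:

```python
# Hedged sketch: one client talking to models "anywhere" through a single
# gateway endpoint. URL, API key handling, and model name are placeholders;
# the gateway routes to whichever backend serves the requested model and
# applies credential and token-usage policies centrally.
from openai import OpenAI

client = OpenAI(
    base_url="http://ai-gateway.example.internal/v1",  # assumed gateway address
    api_key="gateway-issued-or-unused",                # backend credentials live in the gateway
)

resp = client.chat.completions.create(
    model="llama-3-70b",  # logical model name; routing decided by gateway config
    messages=[{"role": "user", "content": "Summarize today's cluster alerts."}],
)
print(resp.choices[0].message.content)
```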
Speakers
Dan Sun

Team Lead, Bloomberg
Dan Sun is a software engineer team lead at Bloomberg. He is a co-founder and maintainer of KServe, an open source serverless AI inference platform. He is also a co-founder of the Envoy AI Gateway project.
Takeshi Yoneda

Open Source Software Engineer, Tetrate.io
Takeshi Yoneda is a software engineer at Tetrate.io who has contributed to numerous open source projects, including compilers and network proxies. He is a co-founder of the Envoy AI Gateway project as well as a maintainer of the Envoy Proxy project.
Level 1 | Orion
  AI + ML
  • Content Experience Level Any
  • Presentation Language English

14:50 JST

Zero-Extraction Cold Starts: How FUSE-Streaming Slashed ComfyUI Cold Starts by 10x - Fog Dong, BentoML
Monday June 16, 2025 14:50 - 15:20 JST
Cold-start delays for GPU-heavy GenAI apps like ComfyUI aren’t just about speed—they’re architectural failures. While others optimize incremental steps, we eliminate entire phases: no image downloads, no layer extraction, no redundant model copies.

We introduce a radical Kubernetes-native pattern: Direct-to-GPU streaming via FUSE-mounted object storage (S3/GCS), bypassing legacy container workflows. By rearchitecting the snapshotter to support seekable, on-demand FUSE streaming, we enable:

- Instant container boot: Models/CUDA dependencies mount directly from object storage, avoiding registry bottlenecks (40MB/s → 900MB/s throughput)
- Zero-extraction overhead: Layers load incrementally via range-optimized fetches, eliminating Zstd unpack/copy latency
- True cold start elimination: ComfyUI pods activate in 90s (vs. 8+ mins) by co-locating model mounting and inference prep

In this session, we'll dissect a live ComfyUI deployment that uses 100% OSS primitives to hack container internals.
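
The key mechanism behind those numbers is replacing pull-and-unpack with seekable, range-based reads from object storage. The sketch below is not the snapshotter itself; it only illustrates that access pattern with fsspec/s3fs, using a placeholder bucket and object key.

```python
# Not the snapshotter -- just the underlying access pattern: seekable,
# on-demand range reads from object storage instead of downloading and
# unpacking whole layers. Bucket and key are placeholders.
import fsspec

fs = fsspec.filesystem("s3")  # needs s3fs and credentials configured
path = "example-bucket/models/sdxl/unet.safetensors"

with fs.open(path, "rb") as f:
    f.seek(512 * 1024 * 1024)        # jump straight to the byte range we need
    chunk = f.read(8 * 1024 * 1024)  # fetch only ~8 MiB, not the whole object
    print(f"fetched {len(chunk)} bytes without downloading the full object")
```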
Speakers
Fog Dong

Senior Software Engineer, BentoML
Fog Dong, a Senior Engineer at BentoML, KubeVela maintainer, CNCF Ambassador, and LFAPAC Evangelist, has a rich background in cloud native and AI infrastructure. Previously instrumental in developing Alibaba's large-scale serverless workflows and ByteDance's cloud-native CI/CD platform, she...
Level 1 | Orion
  AI + ML

15:50 JST

Scaling AI Responsibly: Building Ethical, Sustainable, and Cloud Native AI Systems - Amita Sharma & Vincent Caldeira, Red Hat; Mohit Suman, Salesforce; Shamsher Ansari, Platform9; Anusha Hegde, Nirmata
Monday June 16, 2025 15:50 - 16:20 JST
Panel Discussion - As AI continues to reshape industries, organizations face mounting pressure to scale AI systems responsibly while addressing challenges in efficiency, sustainability, and trust. This panel convenes leading experts to discuss how cloud-native technologies and CNCF projects are paving the way for scalable, ethical, and resource-efficient AI. Attendees will gain actionable insights into optimizing AI workflows, reducing environmental impact, and ensuring transparency in AI decision-making. From leveraging open-source tools to implementing cost-effective and ethical AI practices, this session will equip you with the knowledge to build AI systems that are both innovative and responsible. Discover how to harness the power of cloud-native ecosystems to drive AI transformation without compromising on sustainability or trust.
Intended audience: AI/ML engineers and data scientists looking to scale AI systems in cloud-native environments.
Speakers
Amita Sharma

AI/ML Engineering Manager, Red Hat
Amita is an Engineering Manager at Red Hat, leading Kubeflow Training and Feature Store. With 20 years of industry experience, including 14 years at Red Hat, she has held various roles. She is an active open-source contributor. Since 2011, she has contributed to the Fedora Project and...
Vincent Caldeira

CTO APAC, Red Hat
Vincent Caldeira, CTO of Red Hat in APAC, is responsible for strategic partnerships and technology strategy. Named a top CTO in APAC in 2023, he has 20+ years in IT, excelling in technology transformation in finance. An authority in open source and cloud-native technologies, Vincent...
Shamsher Ansari

Technical Product Manager, Platform9
Shamsher Ansari is a Technical Product Manager at Platform9, driving cloud-native infrastructure and Kubernetes solutions. With extensive experience in cloud, edge computing, and open-source technologies, he focuses on delivering scalable and cost-efficient products. Previously, at...
Anusha Hegde

Senior Technical Product Manager, Nirmata
Anusha Hegde is a Senior Technical Product Manager at Nirmata, focusing on cloud security, Kubernetes policy management, policy-as-code automation, and building AI-first products while analyzing AI’s impact on her product and customers. Previously, she was a Tech Lead at VMware...
Mohit Suman

Senior Product Manager, Salesforce
Mohit Suman is a Product Management Leader at Salesforce, driving AI Observability, MLOps, and AI App Dev. With 12+ years in product strategy, engineering, and architecture, he builds scalable solutions for developer productivity. A passionate advocate for open source and public speaking...
Level 1 | Pegasus A-B1
  AI + ML
 
