16-17 June

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon Japan 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Japan Standard Time (UTC+9:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
Monday, June 16
 

10:11 JST

Keynote: From ECS To Kubernetes (and Sometimes Back Again): A Pragmatist's Guide To Migration - Marc Hildenbrand, Canva
Monday June 16, 2025 10:11 - 10:21 JST
Migrating to Kubernetes offers numerous benefits, but few organizations start from "greenfield" circumstances. Many have critical services running on existing platforms, each leveraging unique capabilities. Transitioning these services to Kubernetes can be a complex and non-trivial challenge.

At Canva—a leading design and collaboration platform serving 220 million monthly active users—we recently undertook an ambitious project to migrate over 400 services from Amazon Elastic Container Service (ECS) to Kubernetes (EKS). These services now span 15,000 pods across two clusters on a typical day.

In this talk, we’ll share the strategies, tools, and processes Canva developed to achieve a (mostly) seamless migration with minimal disruption to our teams and users. Whether you're planning your own migration or just curious about what it takes to migrate at this scale, this session will provide practical insights and lessons learned.
Speakers
Marc Hildenbrand

Staff Engineer, Canva
Marc is a software engineer with 20+ years of experience spanning gaming, consulting, finance, and cloud technologies. He’s served as CTO of a cloud-native consultancy, Lead Gamesystems Engineer for the MMO Lord of the Rings Online, and Developer Advocate for Kubernetes/OpenShift...
Level 1 | Pegasus A-B1
  Keynote Sessions, Platform Engineering

11:30 JST

A Journey and Lessons Learned To Enable IBM AIU Accelerators in Kubernetes - Takuya Mishina & Tatsuhiro Chiba, IBM Research
Monday June 16, 2025 11:30 - 12:00 JST
In this presentation, we will unveil the secrets of, and our journey toward, enabling IBM's AI accelerator in Kubernetes. We use a wide range of tools and frameworks, such as a device plugin, custom scheduler, metrics exporter, webhooks, and custom resources, together to satisfy real requirements from various stakeholders: general users, cluster administrators, and driver/runtime developers. Our device plugin and custom scheduler can accept special preferences, such as topology-aware allocation to enable RDMA, and a webhook-based validator guides users to follow specification changes. To sync allocation status among the components, we carefully defined a custom resource after performance estimation. From a developer's perspective, we provide various debugging capabilities: for example, device allocation by PCI address for inspection, and a pseudo-device mode for testing without real devices. In addition, multi-architecture support gives all participants freedom of platform choice.
Speakers
Takuya Mishina

Researcher, IBM Research
Takuya Mishina is a Staff Research Scientist at IBM Research - Tokyo. He has been working on enhancing cloud infrastructure lifecycle management, such as security and compliance posture management. Recent interests include extending the automation mechanism to provide usable AI hardware...
Tatsuhiro Chiba

Senior Technical Staff Member, IBM Research
Tatsuhiro Chiba is an STSM and Manager at IBM Research, specializing in performance optimization and acceleration of large-scale AI and HPC workloads on Hybrid Cloud. He is leading a project to enhance OpenShift performance and sustainability for AI and HPC by exploiting various cloud...
Level 1 | Pegasus A-B1
  AI + ML

11:30 JST

Never Underestimate Memory Architecture - Bryan Boreham, Grafana Labs
Monday June 16, 2025 11:30 - 12:00 JST
Modern cloud servers are built on NUMA (Non-Uniform Memory Access) and SMT (Simultaneous Multi-Threading, aka Hyperthreading), but few engineers realize how much these technologies impact application performance.

NUMA means that your cloud server’s memory might be significantly slower to access, because it’s connected to a different CPU, while Hyperthreading makes a single CPU core pretend to be two, but not at twice the speed.

This talk will:
• Demystify NUMA & Hyperthreading: what they are, how they work, and why they matter.
• Explore Kubernetes integration: the (limited) ways Kubernetes interacts with NUMA.
• Show real-world performance impact: illustrated with measurements on AWS and Google Cloud.
• Give you visibility: how to use Prometheus metrics and Linux commands to view your servers’ NUMA and SMT configurations.

By the end of the session, you'll have an understanding of the issues, and the tools to measure their impact on the performance of your workloads.
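
As a rough illustration of the "visibility" point (not taken from the talk itself), Linux exposes NUMA and SMT topology through sysfs, which is the same information tools such as `lscpu` summarize. The sketch below assumes the standard files `/sys/devices/system/node/node*/cpulist` and `/sys/devices/system/cpu/cpu*/topology/thread_siblings_list` are present; the function names are hypothetical.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def numa_nodes(sys_root="/sys"):
    """Return {node_id: [cpu, ...]} read from sysfs; empty if no node info exists."""
    nodes = {}
    pattern = os.path.join(sys_root, "devices/system/node/node[0-9]*/cpulist")
    for path in glob.glob(pattern):
        node_dir = os.path.basename(os.path.dirname(path))  # e.g. "node0"
        with open(path) as f:
            nodes[int(node_dir[len("node"):])] = parse_cpu_list(f.read())
    return nodes


def smt_sibling_groups(sys_root="/sys"):
    """Return the distinct sets of logical CPUs that share one physical core."""
    groups = set()
    pattern = os.path.join(
        sys_root, "devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    )
    for path in glob.glob(pattern):
        with open(path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)


if __name__ == "__main__":
    for node, cpus in sorted(numa_nodes().items()):
        print(f"NUMA node {node}: CPUs {cpus}")
    for group in smt_sibling_groups():
        print(f"SMT siblings: {list(group)}")
```

A sibling group with more than one CPU indicates SMT is enabled on that core; more than one NUMA node means memory access cost can depend on which CPU a thread runs on.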
Speakers
Bryan Boreham

Distinguished Engineer, Grafana Labs
Bryan Boreham is a Distinguished Engineer at Grafana Labs, working on highly scalable storage for metrics, logs and traces. Bryan's career has ranged from charting pie sales at a bakery to real-time pricing of billion-dollar bond trades. A contributor to many Open Source projects...
Level 1 | Pegasus B2-C
  Operations + Performance

12:10 JST

Beyond Stock Outs: Scaling Inference on Mixed GPU Hardware With DRA - John Belamaric & Bo Fu, Google
Monday June 16, 2025 12:10 - 12:40 JST
Pods are PENDING?!? Ugh, none of the latest GPUs are available. But there are tons of older ones! If only you could tell Kubernetes “use the best GPU available, as long as it has 20GB+”...(enter: DRA).

Kubernetes’ Dynamic Resource Allocation (DRA) system, beta since 1.32, allows variations on which GPUs get allocated to Pods. You can write a flexible spec so that when the Deployment scales, Pods can land on whatever nodes have available GPUs. DRA works even if you need different numbers of devices for different hardware! This enables a new level of utilization and efficiency, saving your organization real money.
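
The selection idea described above ("use the best GPU available, as long as it has 20GB+", possibly with a different device count per hardware type) can be sketched in a few lines. This is only an illustration of the logic, not the DRA API: DRA expresses such requests declaratively in Kubernetes objects, and the `Gpu` and `Alternative` types, the `allocate` function, and the model names below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    model: str
    memory_gib: int


@dataclass(frozen=True)
class Alternative:
    """One acceptable way to satisfy the request (hypothetical type)."""
    model: str
    count: int
    min_memory_gib: int = 0


def allocate(alternatives, free_gpus):
    """Return GPUs for the first alternative that can be fully satisfied, else None."""
    for alt in alternatives:  # ordered from most to least preferred
        matching = [
            gpu for gpu in free_gpus
            if gpu.model == alt.model and gpu.memory_gib >= alt.min_memory_gib
        ]
        if len(matching) >= alt.count:
            return matching[:alt.count]
    return None  # nothing fits: the Pod would stay Pending


# Prefer one new GPU; otherwise accept two older GPUs with at least 20 GiB each.
preferences = [
    Alternative(model="new-gpu", count=1, min_memory_gib=20),
    Alternative(model="old-gpu", count=2, min_memory_gib=20),
]
free = [
    Gpu("gpu-a", "old-gpu", 24),
    Gpu("gpu-b", "old-gpu", 24),
    Gpu("gpu-c", "old-gpu", 16),
]
print([gpu.name for gpu in allocate(preferences, free)])
```

With no `new-gpu` devices free, the request falls through to the second alternative and receives the two 24 GiB older GPUs instead of waiting.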

Combined with an advanced Node Autoscaler like Google’s Custom Compute Classes or Karpenter, you can spin up more VMs with whatever GPUs are available - or the most economical - all for a single Deployment. Scaling is simpler and more reliable, and your workload can scale even when your preferred type of GPU is stocked out.

Come learn how, and see it in action with a demo!
Speakers
John Belamaric

Senior Staff Software Engineer, Google
John is a Sr Staff SWE, co-chair of K8s SIG Architecture and of K8s WG Device Management, helping lead efforts to improve how GPUs, TPUs, NICs and other devices are selected, shared, and configured in Kubernetes. He is also co-founder of Nephio, an LF project for K8s-based automation...
Bo Fu

Senior Product Manager, Google
Level 1 | Pegasus A-B1
  AI + ML

14:24 JST

⚡ Lightning Talk: From Kernel To Kubernetes: Mapping eBPF-Detected Processes To Pods - Yuki Nakamura, mapbox
Monday June 16, 2025 14:24 - 14:29 JST
When a process like /usr/bin/curl runs in a pod (e.g., xwing in the default namespace), Tetragon detects it as shown below:

```
🚀 process default/xwing /bin/bash -c "curl https://ebpf.io/applications/#tetragon"
🚀 process default/xwing /usr/bin/curl https://ebpf.io/applications/#tetragon
💥 exit default/xwing /usr/bin/curl https://ebpf.io/applications/#tetragon 60
```

But how does it map that process to its pod?

This Lightning Talk explores how Tetragon connects the Linux kernel to Kubernetes by enriching eBPF-detected process data with Kubernetes metadata. I’ll break down how it extracts cgroup information from task_struct in kernel space and maps it to pod details using the Kubernetes API.
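
To make the cgroup-to-pod idea concrete, here is a minimal user-space sketch. It is not Tetragon's implementation (which works in eBPF and Go); it only assumes the common Kubernetes cgroup path layouts for the systemd and cgroupfs drivers, and that a UID-to-pod index has already been built from the Kubernetes API. The function names, the sample UID, and the index are hypothetical.

```python
import re

# Pod UID as it appears in cgroup paths: dashes (cgroupfs driver) or
# underscores (systemd driver) between the UUID groups.
_POD_UID = re.compile(
    r"pod([0-9a-f]{8}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{4}[-_][0-9a-f]{12})"
)
# Container runtimes typically use a 64-character hex container ID.
_CONTAINER_ID = re.compile(r"([0-9a-f]{64})")


def parse_cgroup_path(path):
    """Return (pod_uid, container_id or None), or None if this is not a pod cgroup."""
    pod = _POD_UID.search(path)
    if pod is None:
        return None
    uid = pod.group(1).replace("_", "-")  # normalize to the API's UID format
    container = _CONTAINER_ID.search(path)
    return uid, (container.group(1) if container else None)


def pod_for_cgroup(path, pods_by_uid):
    """Look up (namespace, name) for a cgroup path in a UID-keyed pod index."""
    parsed = parse_cgroup_path(path)
    if parsed is None:
        return None
    return pods_by_uid.get(parsed[0])


# Hypothetical index, as it might be built from a watch on the Kubernetes API.
pods_by_uid = {"0f3a1b2c-1111-2222-3333-444455556666": ("default", "xwing")}

systemd_path = (
    "/kubepods.slice/kubepods-burstable.slice/"
    "kubepods-burstable-pod0f3a1b2c_1111_2222_3333_444455556666.slice/"
    "cri-containerd-" + "a" * 64 + ".scope"
)
print(pod_for_cgroup(systemd_path, pods_by_uid))
```

The kernel side only needs to report which cgroup a process belongs to; everything Kubernetes-specific (namespace, pod name, labels) can then be joined in user space through the pod UID.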
Speakers
Yuki Nakamura

Platform Engineer, mapbox
- Master’s degree in Computer Science at the University of Tokyo
- IBM
- Mapbox
Blog: https://yuki-nakamura.com/
Level 1 | Pegasus A-B1
  ⚡ Lightning Talks, Security

14:38 JST

⚡ Lightning Talk: Practical Monitoring for Knative Serving - Kazuki Higashiguchi, Autify
Monday June 16, 2025 14:38 - 14:43 JST
Knative is widely adopted, CNCF-hosted software for running serverless applications on Kubernetes. Knative Serving consists of many system components, such as Activator, Autoscaler, Controller, Webhook, and Istio or Kourier as an ingress gateway. End users therefore need to implement monitors for common error patterns and pick the most useful metrics from the many available. However, there is relatively little knowledge and few resources available for Knative end users.

This talk will present a production case study of monitoring Knative Serving. Specifically, it will explain how we can monitor Knative control plane efficiency, reconciliation operations, pod scaling health, concurrency observation, HTTP request success rate, and more.
That includes how Knative components implement Prometheus metrics, metrics pipelines (on Google Kubernetes Engine), dashboards and alerts.

This case study will benefit existing Knative users and potential users considering employing Knative in their Kubernetes clusters.
Speakers
Kazuki Higashiguchi

Senior Site Reliability Engineer, Autify
Kazuki Higashiguchi has more than 9 years of industrial experience, focusing on SRE and operational excellence. He is a Senior Site Reliability Engineer at Autify, a company serving AI-powered quality assurance platforms. He is mainly responsible for designing business infrastructure...
Level 1 | Pegasus A-B1
  ⚡ Lightning Talks, Observability

14:45 JST

⚡ Lightning Talk: Providing Sufficient PVCs for Your StatefulSets: Creating New Volumes Larger Than the PVCTemplate - Kaoru Esashika, Cybozu, Inc.
Monday June 16, 2025 14:45 - 14:50 JST
Have you ever used automatic PVC resizing tools in Kubernetes? Have you ever encountered a situation where newly created PVCs remained at the small, template-defined size, causing capacity shortages during restores or clones?

To address this challenge, this session introduces a new approach in pvc-autoresizer, one of the tools for automatically resizing PVCs. By aligning newly created PVCs with the largest capacity in the same group, it ensures that PVCs have sufficient space right from creation time. Using annotations and a Mutating Webhook, we avoid capacity shortages during restores by provisioning enough volume capacity immediately after a PVC is generated. We will also explain why we chose a Webhook-based method over a Controller-based one, along with the design trade-offs involved.
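
The sizing rule described above (give a new PVC at least the largest capacity already present in its group) can be sketched as follows. This is an illustration only, not pvc-autoresizer's code: the helper names are hypothetical, and only plain byte counts and the binary suffixes Ki, Mi, Gi, and Ti are handled. In a real mutating webhook, the chosen value would be patched into the PVC's `spec.resources.requests.storage` before the object is persisted.

```python
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity):
    """Convert a quantity such as "10Gi" (or a plain byte count) to an int."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def initial_size(template_size, group_sizes):
    """Pick the storage request for a new PVC.

    template_size: size from the StatefulSet's volumeClaimTemplate.
    group_sizes:   current sizes of existing PVCs in the same group.
    """
    candidates = [template_size, *group_sizes]
    return max(candidates, key=to_bytes)  # ties keep the template value


# Existing replicas were grown over time; a new replica starts at the largest size.
print(initial_size("10Gi", ["10Gi", "50Gi", "20Gi"]))
```

Deciding the size at admission time means the volume is provisioned large enough from the start, rather than being created small and expanded afterwards.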

Anyone seeking to enhance reliability for stateful workloads should not miss this session. Learn how a new PVC management strategy can deliver more stable Kubernetes storage.
Speakers
Kaoru Esashika

Software Engineer, Cybozu, Inc.
He works at Cybozu, Inc. For the past three years, he has focused on the operation and development of the storage area for a new infrastructure using Kubernetes. His work includes developing and operating distributed storage with Rook and Ceph, and Cybozu's own CSI plugin, TopoLVM.
Level 1 | Pegasus A-B1
  ⚡ Lightning Talks, Data Processing + Storage

14:50 JST

Streamlined Baremetal Deployment: A Journey of Custom Controllers Integrated With OpenStack - Mitsuhiro Tanino & Masanori Kuroha, LY Corporation
Monday June 16, 2025 14:50 - 15:20 JST
As LY Corporation transitioned to developing a new private cloud infrastructure, we confronted significant challenges in managing over 6,000 baremetal servers through Kubernetes integrated with OpenStack, resulting in increased complexity, extended deployment times, and excessive resource consumption.

This session describes how our custom controllers enhanced our approach, automating complex configurations and addressing disruptions in ArgoCD and Ansible, Kubernetes resource shortages, and scalability constraints.

We will focus on:

- Challenges in Existing Baremetal Provisioning: Explore the operational complexities and inefficiencies caused by scale, including deployment delays.

- Implementation of Custom Controllers: How we automated configurations, leading to faster, more reliable deployments with Helm.

- Enhancements in Resource Management: Techniques that streamlined processes and enhanced OpenStack integration, ultimately boosting operational simplicity and efficiency.
Speakers
Mitsuhiro Tanino

Senior software engineer, LY Corporation
Mitsuhiro Tanino is a senior software engineer who has been working for LY Corporation since 2019. He contributed to the OpenStack Cinder project for several years and has also contributed to Kubernetes sig-storage for several years. His current working area is operating hyper-scale...
Masanori Kuroha

Software Engineer, LY Corporation
Masanori Kuroha has been working as a Cloud Software Engineer at LY Corporation since 2021. He is responsible for managing private clusters using OpenStack and operating Kubernetes clusters on physical machines. Recently, he has focused on customizing OpenStack Nova for internal purposes...
Level 1 | Pegasus B2-C
  Operations + Performance

14:52 JST

⚡ Lightning Talk: What's New in Prometheus-Operator? - Ashwin Sriram, IIT-BHU
Monday June 16, 2025 14:52 - 14:57 JST
Join us for an exciting overview of the latest developments in Prometheus-Operator! This lightning talk will explore game-changing features, including the new ScrapeClasses for simplified monitoring configuration, DaemonSet Mode in PrometheusAgent for enhanced deployment flexibility, and expanded Service Discovery support across multiple cloud providers.

We'll explore how these additions, along with our improved documentation and new Scale Subresource capabilities, are making Prometheus-Operator more powerful and user-friendly than ever. You'll also learn about "poctl", our new command-line tool that is designed to handle Prometheus-Operator custom resources.

Plus, get a sneak peek into our roadmap and upcoming features that will shape the future of Kubernetes monitoring. Perfect for anyone interested in Kubernetes monitoring, this talk will equip you with knowledge about the latest tools to enhance your observability stack.
Speakers
Ashwin Sriram

Student
I'm an engineering student in my final year at IIT-BHU, majoring in metallurgy, but my heart truly lies in software engineering. I'm a maintainer for the Prometheus-Operator website repository, a role I earned after completing my GSoC mentorship. Through this experience, I've developed...
Level 1 | Pegasus A-B1
  ⚡ Lightning Talks, Observability

15:50 JST

Scaling AI Responsibly: Building Ethical, Sustainable, and Cloud Native AI Systems - Amita Sharma & Vincent Caldeira, Red Hat; Mohit Suman, Salesforce; Shamsher Ansari, Platform9; Anusha Hegde, Nirmata
Monday June 16, 2025 15:50 - 16:20 JST
Panel Discussion - As AI continues to reshape industries, organizations face mounting pressure to scale AI systems responsibly while addressing challenges in efficiency, sustainability, and trust. This panel convenes leading experts to discuss how cloud-native technologies and CNCF projects are paving the way for scalable, ethical, and resource-efficient AI. Attendees will gain actionable insights into optimizing AI workflows, reducing environmental impact, and ensuring transparency in AI decision-making. From leveraging open-source tools to implementing cost-effective and ethical AI practices, this session will equip you with the knowledge to build AI systems that are both innovative and responsible. Discover how to harness the power of cloud-native ecosystems to drive AI transformation without compromising on sustainability or trust.
Intended audience: AI/ML engineers and data scientists looking to scale AI systems in cloud-native environments.
Speakers
Amita Sharma

AI/ML Engineering Manager, Red Hat
Amita is an Engineering Manager at Red Hat, leading Kubeflow Training and Feature Store. With 20 years of industry experience, including 14 years at Red Hat, she has held various roles. She is an active open-source contributor. Since 2011, she has contributed to the Fedora Project and...
Vincent Caldeira

CTO APAC, Red Hat
Vincent Caldeira, CTO of Red Hat in APAC, is responsible for strategic partnerships and technology strategy. Named a top CTO in APAC in 2023, he has 20+ years in IT, excelling in technology transformation in finance. An authority in open source and cloud-native technologies, Vincent...
Shamsher Ansari

Technical Product Manager, Platform9
Shamsher Ansari is a Technical Product Manager at Platform9, driving cloud-native infrastructure and Kubernetes solutions. With extensive experience in cloud, edge computing, and open-source technologies, he focuses on delivering scalable and cost-efficient products. Previously, at...
Anusha Hegde

Senior Technical Product Manager, Nirmata
Anusha Hegde is a Senior Technical Product Manager at Nirmata, focusing on cloud security, Kubernetes policy management, policy-as-code automation, and building AI-first products while analyzing AI’s impact on her product and customers. Previously, she was a Tech Lead at VMware...
Mohit Suman

Senior Product Manager, Salesforce
Mohit Suman is a Product Management Leader at Salesforce, driving AI Observability, MLOps, and AI App Dev. With 12+ years in product strategy, engineering, and architecture, he builds scalable solutions for developer productivity. A passionate advocate for open source and public speaking...
Level 1 | Pegasus A-B1
  AI + ML

16:30 JST

Optimizing Data Locality and GPU Utilization for Training Workloads in Kubernetes - Haoyuan Li & Bin Fan, Alluxio
Monday June 16, 2025 16:30 - 17:00 JST
As organizations scale their model training workloads in cloud-native environments, they face significant data processing and storage challenges: managing massive training datasets across distributed storage systems while ensuring optimal I/O performance. While Kubernetes excels at compute orchestration, the increasing distribution of data across multiple storage backends creates bottlenecks that impact training performance and infrastructure costs.

This presentation introduces a Kubernetes-native distributed caching system that utilizes NVMe storage to overcome data locality challenges. Haoyuan Li will also share real-world, large-scale production use cases to show how this architecture lowers data infrastructure costs, increases GPU utilization, and enables workload portability to navigate GPU scarcity challenges.
Speakers
Bin Fan

Alluxio
Bin Fan is VP of open source at Alluxio and the PMC maintainer of Alluxio open source. Prior to joining Alluxio as a founding engineer, he worked for Google to build the next-generation storage infrastructure. Bin received his Ph.D. in computer science from Carnegie Mellon University...
Haoyuan Li

Founder and CEO, Alluxio
Haoyuan Li is the Founder and CEO of Alluxio. He graduated with a Computer Science Ph.D. from the AMPLab at UC Berkeley. At the AMPLab, he co-created and led Alluxio (formerly Tachyon), an open-source virtual distributed file system. Before UC Berkeley, he got an M.S. from Cornell...
Level 1 | Pegasus A-B1
  Data Processing + Storage

16:30 JST

Breaking Limits: Highly-Isolated and Low-Overhead Wasm Container - Soichiro Ueda, Kyoto University & Ai Nozaki, The University of Tokyo
Monday June 16, 2025 16:30 - 17:00 JST
Wasm is touted as the next generation of containers, offering a smaller, more secure, and more portable application format. However, challenges remain, particularly in achieving enough isolation for public clouds where multi-tenancy exists. This is because Wasm workloads, like containers, share the host kernel. This problem has not yet received enough discussion for Wasm to reach its full potential.

To address the issue, we've developed a new Wasm runtime, Mewz. It runs a single Wasm module within a dedicated VM while also having a lightweight and specialized kernel (unikernel). This revolutionary execution model enables more secure and low-overhead Wasm containers. We've open-sourced the implementation, and Mewz is listed in the CNCF cloud native landscape! In this session, we'll explain the architecture of Mewz and why it's more isolated and low-overhead than ordinary Wasm runtimes. Building on this presentation, let’s discuss the future of cloud workloads powered by Wasm!
Speakers
Ai Nozaki

Master Student, The University of Tokyo
Ai Nozaki is a Master's student at The University of Tokyo. She is a member of the Mewz project. Her interests lie in WebAssembly, systems software, and GPUs.
Soichiro Ueda

Student, Kyoto University
Master's Student in Computer Science at Kyoto University. Working on the Mewz project. Love cloud-native technologies and system software.
Level 1 | Pegasus B2-C
  Emerging + Advanced

17:10 JST

Standardizing on Multi-Cluster App Topologies for Platforms With Linkerd - William Rizzo, Mirantis
Monday June 16, 2025 17:10 - 17:40 JST
As organizations scale their Kubernetes adoption, multi-cluster architectures are becoming the backbone of resilience, scalability, and compliance. However, building a unified developer experience across these clusters while abstracting operational complexities is a significant challenge.
In this session we’ll demonstrate how Cluster-API (CAPI), a declarative tool for Kubernetes lifecycle management, and Linkerd, the powerful yet lightweight service mesh, can work together to simplify multi-cluster topologies for Internal Developer Platforms (IDP). By combining CAPI's robust cluster management with Linkerd’s seamless cross-cluster service communication, platform teams can deliver a streamlined and intuitive experience for developers, enabling them to focus on building and deploying applications without worrying about underlying infrastructure.
Speakers
William Rizzo

Consulting Architect, Mirantis
William is a CNCF and Linkerd Ambassador, working at Mirantis as a Consulting Architect. He’s focused on helping customers design, build, and run Developer Platform and Edge systems. He has worn many hats, including Engineering, Product Owner, and Consulting, from HPC, Storage to Distributed...
Level 1 | Orion
  Platform Engineering

17:10 JST

Addons Need Love Too: Maintaining Addons for Better Cluster Security - Stevie Caldwell & Andy Suderman, Fairwinds
Monday June 16, 2025 17:10 - 17:40 JST
Projects both within and outside of the CNCF ecosystem provide additional capabilities for Kubernetes clusters. These "addons" become integral to the functioning of our clusters, but we don't often talk about their impact as a whole or managing them holistically as first-class citizens.

We know there are barriers to keeping things like addons up-to-date and that it can be difficult to get buy-in for allocating the time and resources for updating something that is working just fine (for now), especially if you’re multiple major versions behind. In this session we will help you understand and articulate the benefits of catching up and keeping addons updated and how to be proactive moving forward. You will walk away with some tools and strategies for navigating the complexity of the addon ecosystem and making the process as painless as possible. You will be able to create an action plan for improving the stability and security of your clusters and share that with stakeholders.
Speakers
Stevie Caldwell

Senior Tech Lead, Fairwinds
Stevie Caldwell is a Senior Site Reliability Engineering Technical Lead at Fairwinds. Stevie also participates in the R&D arm of Fairwinds where she contributes to Fairwinds’s open source projects. She has worked with Kubernetes for 6+ years, has presented at a number of webinars...
Andy Suderman

CTO, Fairwinds
Andy Suderman is CTO at Fairwinds, a managed Kubernetes-as-a-Service provider. Andy has worked with cloud native technologies for the last eight years helping organizations adopt and manage Kubernetes. Andy is the creator and primary developer of Goldilocks, an open source tool that...
Level 1 | Pegasus A-B1
  Security
 
Tuesday, June 17
 

11:30 JST

Multi Cluster Magics With Argo CD and Cluster Inventory or Don't Get Lost in the Clusterverse: Navig - Nick Eberts, Google
Tuesday June 17, 2025 11:30 - 12:00 JST
You probably have more than one cluster and there is a decent chance you are using Argo CD. Additionally, it is quite likely that you have a few other variations of Kubernetes cluster lists. We posit that writing glue code to stitch together these cluster lists is not an awesome use of your time. Thankfully the good folks in SIG-Multicluster built this super cool API for cluster lists, cluster profile/cluster inventory! We are going to show you how to use said fancy new list with Argo CD along with other multi-cluster tools across Kubernetes clusters hosted by different providers. There will be demos. Possibly Mustaches. And a decent amount of awful puns. So come on down to bear witness to some sweet multi-cluster abstractions that will surely get your heart rate up.
Speakers
Nick Eberts

Product Manager, Google
Nick is currently the product manager for GKE Fleets & Teams, focusing on multi-cluster capabilities that streamline GCP customers' experience while building platforms on GKE. He is also a Kubernetes contributor, participates in SIG-Multicluster, and has been part of the community since...
Level 1 | Orion
  Platform Engineering

12:10 JST

Cloud Native Scalability for Internal Developer Platforms - Hiroshi Hayakawa, LY Corporation
Tuesday June 17, 2025 12:10 - 12:40 JST
Platform Engineering enables developers to focus on business value-aligned tasks by providing internal developer platforms (IDPs) that automate non-essential tasks. Kubernetes is widely used as a foundation for IDPs thanks to its scalability and flexibility.

However, Kubernetes was designed as a general workload orchestrator, not a platform component. As a result, IDP builders must integrate additional Cloud Native technologies and customizations, which can create scalability bottlenecks. At LY Corporation, Hiroshi Hayakawa's team has developed a Kubernetes-based, multi-tenant IDP running over 140K pods, and they faced such scalability challenges.

In this session, he will discuss scalability bottlenecks faced in the IDP, including observability pipelines, access control, etc. He will also explore scaling strategies for IDPs and how they address real-world scalability issues. By the end of this session, you will gain deeper insights into scalability challenges from a platform builder’s perspective.
Speakers
Hiroshi Hayakawa

Platform Engineer, LY Corporation
Hiroshi is a lead engineer for Kubernetes-based application platforms in LY Corporation's Private Cloud Division. The company operates numerous large-scale applications on its Kubernetes-based platform, and he excels in ensuring stable operations at scale on Kubernetes and driving...
Level 1 | Orion
  Platform Engineering

14:10 JST

BGP Peering Patterns for Kubernetes Networking at Preferred Networks - Sho Shimizu, Preferred Networks, Inc. & Yutaro Hayakawa, Isovalent at Cisco
Tuesday June 17, 2025 14:10 - 14:40 JST
BGP (Border Gateway Protocol) is increasingly being used to connect Kubernetes networking with the rest of the IT estate, especially in large-scale and on-premises environments. However, the complexity of many network architectures requires users to have more flexibility and control over how they deploy BGP. Based on the experience at Preferred Networks, this session introduces key BGP peering patterns that enhance Kubernetes networking while maintaining operational simplicity, including:

1. The Sidecar BGP Peering Pattern: A method of running a dedicated BGP speaker alongside Kubernetes networking components, balancing automation with fine-grained control.
2. Native Routing over IP Clos Networks: A tunneling-free approach that integrates Kubernetes with large-scale BGP-based datacenter fabrics for better performance.

Based on real-world experience, we will share best practices and lessons learned, helping attendees design scalable and reliable Kubernetes networking with BGP.
Speakers
Sho Shimizu

Software Engineer, Preferred Networks, Inc.
Sho Shimizu, software engineer at Preferred Networks, Inc., specializes in Kubernetes networking for AI/ML workloads. Since joining in 2019, he has developed a custom CNI plugin and is responsible for container networking architecture across the company's AI/ML infrastructure. Previously...
Yutaro Hayakawa

Software Engineering Technical Leader, Isovalent at Cisco
Working for Cilium at Isovalent. Linux Networking & BPF enthusiast.
Level 1 | Pegasus A-B1
  Connectivity

14:10 JST

Navigating Millions of Kafka Events in Real Time With OTel - Siddharth Vijay, Baazi Games & Shivay Lamba, Couchbase
Tuesday June 17, 2025 14:10 - 14:40 JST
How can real-time event streaming platforms, handling millions of events and complex data processing, maintain peak performance and reliability? Managing this has historically been complex. The latest agent changes and the addition of semantic conventions in OpenTelemetry make it well suited to monitoring highly distributed event-driven architectures (EDA) like those built on Kafka. In this session we will discuss how these changes help standardize telemetry and explain how to use span links to capture the several traces that make up a transaction in an EDA.

The talk will also cover how OTel enables automatic anomaly detection, which is particularly useful for identifying issues like consumer lag, increased latency in event processing, and partition failures. By leveraging context propagation, OTel tracks end-to-end latency across the entire Kafka ecosystem, including producers, brokers, and consumers.

The talk covers real-world examples from gaming platforms and data systems that have enabled OTel for Kafka monitoring.
Speakers
Shivay Lamba

Senior Engineer, Couchbase
Shivay Lamba is a software developer specializing in DevOps, Machine Learning and Full Stack Development. He is an Open Source Enthusiast, has been part of various programs like Google Code-in and Google Summer of Code as a Mentor, and is currently an MLH Fellow. He has also worked...

Siddharth Vijay

AVP Engineering & Head of DevOps, Baazi Games
Siddharth Vijay, AVP at Pokerbaazi and KubeCon India speaker, brings over 12 years of experience driving impactful projects in AI, Security, and Cloud. A firm advocate of open-source technologies, he has a proven track record of delivering practical solutions with real-world value... Read More →
Tuesday June 17, 2025 14:10 - 14:40 JST
Level 1 | Orion
  Observability

14:50 JST

Green OpenTelemetry: Have Your Cake and Eat It Too - Adriana Villela, Dynatrace & Nancy Chauhan, Student
Tuesday June 17, 2025 14:50 - 15:20 JST
It’s a not-so-dirty little secret that the technology that we so heavily rely on comes at an environmental cost. As technology becomes more complex, we need Observability to better understand it, and yet this too contributes to an increasing global tech carbon footprint.

Luckily, we have tools at our disposal, such as Kepler, Kube-Green, and green reviews, that can help us understand our carbon footprint and take mitigating actions.

In this talk, attendees will learn about Kepler, Kube-Green, and green reviews, and how to use these tools to make tweaks to their OpenTelemetry Collectors and other Kubernetes infrastructure. The goal is to keep systems observable while keeping the environment in mind.
Speakers

Nancy Chauhan

CNCF Ambassador, Engineer
I am Nancy Chauhan, a software engineer passionate about solving complex problems and enhancing software reliability. As a CNCF Ambassador, I engage with a global cloud-native community, contributing to open-source projects and fostering collaboration. I also founded the Women in... Read More →

Adriana Villela

Principal Developer Advocate, Dynatrace
Adriana Villela is a Principal Developer Advocate, helping companies achieve reliability greatness through Observability, SRE, & DevOps practices. Previously, she managed a Platform Engineering team & an Observability Practices team at Tucows. Adriana has worked at various large-scale... Read More →
Tuesday June 17, 2025 14:50 - 15:20 JST
Level 1 | Pegasus A-B1
  Observability

14:50 JST

Mastering Authorization: Integrating Authentication and Authorization Data in Cloud Native Apps - Yoshiyuki Tabata, Hitachi, Ltd.
Tuesday June 17, 2025 14:50 - 15:20 JST
Authorization is one of the most important considerations for cloud-native applications, as highlighted by the OWASP Top 10. For a long time, there was no clear standard, making authorization a significant challenge for many implementers. The OpenID Foundation AuthZEN WG is now working on standards, focusing on interfaces between PEP (Policy Enforcement Point) and PDP (Policy Decision Point), which provides some hope.
However, managing authorization data remains challenging. Since this data is closely related to authentication data, architects often struggle with how the OP (OpenID Provider) and PDP should manage and integrate it. There are multiple methods, and the best approach varies by use case.
In this session, Yoshiyuki Tabata will explain various methods for managing and integrating authentication and authorization data. He will also describe implementation using Keycloak for OP and Topaz for PDP, providing valuable insights into effective data management.
Speakers

Yoshiyuki Tabata

Senior OSS Consultant, Hitachi, Ltd.
He's a Senior OSS Consultant at Hitachi, Ltd. As an expert in IAM and APIs, he has provided numerous consultations over the past decade, including designing API and Authn/Authz platforms. He has actively contributed to CNCF TAG Security and has added significant functionalities to... Read More →
Tuesday June 17, 2025 14:50 - 15:20 JST
Level 1 | Pegasus B2-C
  Security

15:50 JST

Reimagining Cloud Native Networks: The Critical Role of DRA - Lionel Jouin, Ericsson Software Technology & Sunyanan Choochotkaew, IBM Research
Tuesday June 17, 2025 15:50 - 16:20 JST
As AI/ML, high-performance, and telecom workloads progress in their cloud-native journey, their unique platform requirements are exposing the limitations of existing solutions such as Multus and device plugins. Dynamic Resource Allocation (DRA) offers a fresh approach that overcomes these challenges with better resource management for non-homogeneous platforms, topology-aware use cases, and beyond. By leveraging the latest Kubernetes features, DRA drivers are redefining network interface configuration and enhancing capabilities for multi-network deployments. This talk explores the evolving cloud-native networking landscape and the trade-offs between extending Kubernetes and leveraging add-on components. We will delve into recent advancements including network device status with KEP-4817, virtual device allocation with KEP-5075, and the role of the CNI-DRA-Driver in shaping the future of cloud-native networking infrastructure.
Speakers

Lionel Jouin

Software Engineer, Ericsson Software Technology
Lionel Jouin is a Software Engineer at Ericsson Software Technology, based in Stockholm, Sweden. He actively contributes to Kubernetes with a focus on bringing native support for secondary networks and its ecosystem, including services and policies. His contributions span SIG Network... Read More →

Sunyanan Choochotkaew

Staff Research Scientist, IBM Research
Sunyanan Choochotkaew is a staff research scientist at IBM Research, specializing in distributed computing and performance acceleration on cloud platforms. She holds the role of maintainer of Kepler. She has made contributions to Environmental Sustainability TAG, operator framework... Read More →
Tuesday June 17, 2025 15:50 - 16:20 JST
Level 1 | Orion
  Connectivity

15:50 JST

Practical Cloud Native Compliance Automation With OSCAL Compass - Chris Butler, Red Hat & Takumi Yanagawa, IBM Research
Tuesday June 17, 2025 15:50 - 16:20 JST
Cloud presents many advantages to users in terms of flexibility, scalability, and innovation. Unfortunately, compliance has become more complex as standards and regulations are used by end consumers as a proxy for the security of underlying platforms whose operations are opaque. Consequently, platform providers have ever-increasing compliance obligations.

Compliance-as-code encompasses many activities, such as automation of system configuration and general DevSecOps approaches. One perpetual challenge is how to provide machine-readable workflows that span from standard to audit and allow automation in a way that scales.

OSCAL Compass, a CNCF sandbox project, provides tooling to manage compliance artefacts as code and link those artefacts to executable policies. This talk will provide a practical introduction to using OSCAL Compass to document and enforce compliance controls using two of its tools, Compliance Trestle and C2P (compliance2policy), in the context of Kubernetes clusters.
Speakers

Takumi Yanagawa

Advisory Software Developer, IBM Research
Takumi is an advisory software developer working in IBM Research - Tokyo on AI for Code and Security. He has a strong background in DevOps engineering and AI Governance product development using cloud-native technologies. With several years of experience, he has worked on building and... Read More →

Chris Butler

Senior Principal Chief Architect, Red Hat
Dr. Chris Butler is a Chief Architect in the APAC Field CTO Office at Red Hat. Chris’ focus is working with regulated clients who are building infrastructure, application and AI platforms. Chris facilitates co-innovation engagements with our clients and partners with our product... Read More →
Tuesday June 17, 2025 15:50 - 16:20 JST
Level 1 | Pegasus B2-C
  Security

16:30 JST

Dynamic Provisioning and Capacity-Aware Scheduling for Local Storage - Yuma Ogami, Cybozu, Inc.
Tuesday June 17, 2025 16:30 - 17:00 JST
In this session, the speaker presents TopoLVM, a CSI plugin for local storage, and introduces an upcoming Kubernetes feature for local storage that he and his team are working on.

Local storage is promising for applications that require high I/O performance, like Elasticsearch and MySQL. TopoLVM provides many features like raw block volumes, resizing, and dynamic provisioning to manage local storage in Kubernetes easily. It also includes a capacity-aware pod scheduling feature that considers each node's local storage capacity.

Currently, this capacity-aware feature is achieved by a scheduler extender, which has two main issues:

1. Many admins don't have the right to install scheduler extenders.
2. The scheduler extender is TopoLVM specific.

To address these issues, he will introduce a KEP titled "KEP-4049: Storage Capacity Scoring of Nodes for Dynamic Provisioning", which aims to make TopoLVM's scheduling logic available to all CSI drivers without using scheduler extenders.
Speakers

Yuma Ogami

Software Engineer, Cybozu, Inc.
He works at Cybozu, Inc. and spent four years involved in the operation and development of a server infrastructure using a custom system with VMs. For the past three years, he has focused on the operation and development of the storage area for a new infrastructure using Kubernetes... Read More →
Tuesday June 17, 2025 16:30 - 17:00 JST
Level 1 | Orion
  Data Processing + Storage
 