Training Machine Learning (ML) models on Kubernetes

Kubernetes Bytes

Content provided by Kubernetes Bytes, Ryan Wallner, and Bhavin Shah. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Kubernetes Bytes, Ryan Wallner, and Bhavin Shah or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://el.player.fm/legal.

1+ y ago 55:29

In this episode of the Kubernetes Bytes podcast, Bhavin sits down with Bernie Wu, VP Strategic Partnerships and AI/CXL/Kubernetes Initiatives at MemVerge. They discuss how Kubernetes is the most popular platform for running AI model training and model inferencing jobs. The discussion dives into model training, covering the different phases of a DAG, and then turns to how MemVerge can help users with efficient and cost-effective model checkpoints. It also covers saving costs by using spot instances, hot restarts of training jobs, reclaiming unused GPU resources, and more.

Check out our website at https://kubernetesbytes.com/

Episode Sponsor: Nethopper

Learn more about KAOPS: @nethopper.io
For a supported-demo: [email protected]
Try the free version of KAOPS now! https://mynethopper.com/auth

Cloud Native News:

https://www.aquasec.com/blog/linguistic-lumberjack-understanding-cve-2024-4323-in-fluent-bit/
https://kubernetes.io/blog/2024/05/20/completing-cloud-provider-migration/
https://thenewstack.io/introducing-aks-automatic-managed-kubernetes-for-developers/
https://www.harness.io/blog/harness-to-acquire-split

Show Links:

https://www.linkedin.com/in/berniewu/
https://criu.org/Main_Page
https://memverge.com/
https://youtu.be/tY8YOMRuqWI?si=yB3hHqLUpYPZ-KWN
https://youtu.be/ND4seSKpJHI?si=shh0iuA9qC-dO6eb

Timestamps:

01:04 Cloud Native News
08:47 Interview with Bernie
51:40 Key takeaways
