AWS re:Invent 2021 - Large-scale distributed training of media ML models with Amazon FSx
In this session, learn about the challenges of scaling distributed training of media machine learning models on the multi-GPU nodes used by Netflix, and how Amazon FSx is used to resolve data loader performance bottlenecks in the training system. See the performance and throughput improvements achieved on multi-node GPUs, and how Amazon FSx scales.