𝗖𝗼𝗿𝗲𝗪𝗲𝗮𝘃𝗲 𝗦𝗲𝘁𝘀 𝗡𝗲𝘄 𝗥𝗲𝗰𝗼𝗿𝗱 𝗪𝗶𝘁𝗵 𝗗𝗲𝗲𝗽𝗦𝗲𝗲𝗸-𝗩𝟯
CoreWeave trained DeepSeek-V3 in 2 minutes.
CoreWeave claims this as a new MLPerf v6.0 record. It cuts the previous record time of 3.5 minutes, set by AWS on the same task, by 43 percent.
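Quick check on the 43 percent figure: it follows from the two times quoted in this post. A minimal sketch, assuming both times are the rounded values given here (2 and 3.5 minutes). The function names are made up for illustration.

```python
def percent_time_reduction(previous_minutes: float, new_minutes: float) -> float:
    """Return how much shorter the new time is, as a percentage of the previous time."""
    if previous_minutes <= 0:
        raise ValueError("previous_minutes must be positive")
    return (previous_minutes - new_minutes) / previous_minutes * 100


def speedup_factor(previous_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the previous one."""
    if new_minutes <= 0:
        raise ValueError("new_minutes must be positive")
    return previous_minutes / new_minutes


# Rounded times quoted in the post (assumption: the exact benchmark times may differ).
AWS_MINUTES = 3.5
COREWEAVE_MINUTES = 2.0

reduction = percent_time_reduction(AWS_MINUTES, COREWEAVE_MINUTES)
speedup = speedup_factor(AWS_MINUTES, COREWEAVE_MINUTES)
print(f"{reduction:.0f}% less time, {speedup:.2f}x speedup")
# prints "43% less time, 1.75x speedup"
```

So "43 percent" here means 43 percent less wall-clock time, which is the same as a 1.75x speedup.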
How they did it:
- Used over 11,000 NVIDIA H100 GPUs.
- Spread workload across 4 data centers.
- Used a custom orchestration layer to manage compute.
CoreWeave also validated NVIDIA Vera Rubin NVL72 at rack scale, making it the first cloud provider to do so.
Why this matters for AI:
- Training costs drop when you use thousands of GPUs across sites.
- Specialized infrastructure beats general cloud services for AI workloads.
- DeepSeek-V3 shows high performance at a fraction of GPT-4 costs.
CoreWeave is now building a 1.2 GW data center campus in Texas to expand this capacity.
Watch how big players like AWS and Google Cloud respond to these speeds.
Source: https://dev.to/gentic_news/coreweave-trains-deepseek-v3-in-2-minutes-claims-mlperf-v60-record-3dp4