XDOF Emerges to Solve the Critical Data Bottleneck in Physical AI

As the race for physical intelligence heats up with OpenAI relaunching its robotics program, a new challenge has surfaced: the lack of high-fidelity training data. While Large Language Models (LLMs) thrived on the vast expanse of the public internet, robotics requires precise, physical interaction data that current datasets simply cannot provide.

The Data Gap: Why LLMs Won't Solve Robotics

The primary hurdle in developing capable robots isn't just compute or model architecture; it is the absence of a "data moat" comparable to the internet text used to train GPT models. Current alternatives, such as YouTube videos or low-fidelity footage captured by gig workers, are hard to map onto the physical realities of robotic movement. The result is a chicken-and-egg problem: training models requires data, but collecting data efficiently requires capable models. That circular dependency has become the industry's main bottleneck.

XDOF, a startup emerging from stealth, is positioning itself as the infrastructure layer to solve this. Having raised $70 million from heavyweights including Thrive Capital, Spark Capital, a16z, Lux, and WndrCo, the company is building the pipelines, collection tools, and annotation systems that frontier AI labs are struggling to build in-house.

Building the ABC Dataset and the Data Pyramid

To jumpstart the ecosystem, XDOF is partnering with UC Berkeley’s AI Research lab to release "ABC," a massive collection of high-quality robot training data. This dataset includes:

130,000 trajectories of robot manipulation data.
300 hours of simulation data.
100 hours of evaluations.

Using this data, teams have already successfully trained robots on granular tasks such as folding T-shirts, flattening boxes, and performing delicate operations like loading AirPods into their cases.

XDOF’s strategy follows a three-tier "data pyramid" to ensure comprehensive learning. The most valuable tier involves teleoperation data collected directly on the target robot. This is followed by general data gathered via devices like GELLO (a low-cost teleoperation system developed by XDOF co-founders Philippe Wu and Fred Shentu). The final tier involves "egocentric" data, where humans perform everyday tasks while wearing XDOF’s proprietary sensors to capture first-person physical movement.

Outscaling the Frontier Labs

A critical question for investors is why the big AI labs don't build these data factories themselves. According to CEO Philippe Wu, the operational complexity is enormous. Running a data collection operation requires hundreds of thousands of square feet of warehouse space, hundreds of calibrated robots, and a large workforce of trained teleoperators.

By specializing in this "unglamorous" work, including data cleaning and hardware-specific calibration, XDOF lets AI labs focus on model architecture while it handles the logistical burden of producing physical data at massive scale. The company's name, a play on "degrees of freedom," reflects its goal of providing data for motion of any complexity, from the seven degrees of freedom of a human arm to the 30 of a humanoid.

Key Takeaways

Infrastructure over Models: XDOF addresses the physical AI bottleneck by providing dedicated data pipelines and annotation tools that LLM-centric labs lack.
High-Fidelity Dataset: The release of the ABC dataset gives the industry unprecedented scale, featuring 130,000 manipulation trajectories.
Operational Outsourcing: XDOF lets frontier labs bypass the massive capital and logistical requirements of running large-scale physical data warehouses and teleoperation fleets.
