<p>Nvidia has announced the launch of Cosmos 3, an open AI world model aimed at enhancing the capabilities of robots, autonomous vehicles, and other physical systems in understanding and predicting real-world environments.</p><p><strong>Details:</strong> The model was trained on 20 trillion tokens of multimodal data, which includes nearly one billion images, 400 million real and synthetic videos, ambient audio, text, and action data from both humans and robots.</p><ul><li>According to Ming-Yu Liu, Vice President of Nvidia's Cosmos Lab, the action data differentiates Cosmos from typical video generators, as it focuses on modeling machine movements rather than just visual scenes.</li><li>Developers can utilize Cosmos 3 to simulate actions in physical environments and create task-specific models for robots and other machines.</li><li>The model is capable of generating action data, such as robot joint angles and gripper positions, which can assist in training machines for navigation and manipulation in the physical world.</li></ul><p><strong>Model Customization:</strong> Cosmos is designed as an open model, allowing hardware manufacturers to tailor it to their specific requirements, according to Liu.</p><ul><li>Nvidia is forming a coalition of companies to support this initiative, with initial partners including Agile Robots, Black Forest Labs, and Runway.</li><li>The model can also simulate rare or hazardous scenarios, such as robot collisions or unusual road events, which are typically challenging and costly to replicate.</li></ul><p><strong>Product Versions:</strong> Nvidia is releasing two versions of Cosmos 3: a “super” model for tasks that require high physics accuracy, and a “nano” model that can produce results in fractions of a second. 
An “edge” model for local operation is expected to be released soon.</p><p><strong>Industry Context:</strong> World models are increasingly recognized as a vital area of growth in AI, as businesses seek to extend the capabilities of chatbots and agents to perform real-world tasks. Notable startups in this domain include World Labs and AMI Labs.</p><p><strong>Conclusion:</strong> Nvidia aims to position itself as a foundational platform for the next generation of AI, which will focus on predicting, simulating, and acting within the physical world.</p>
Nvidia Introduces Cosmos 3 AI Model for Robotics and Autonomous Systems
Nvidia has launched Cosmos 3, an open AI world model designed to improve how robots and autonomous vehicles understand and navigate real-world environments. The model utilizes extensive multimodal data and allows developers to create tailored solutions for various applications. Nvidia aims to establish itself as a key player in the evolving landscape of AI that integrates physical actions and predictions.