Unveiled at SIGGRAPH 2025, Cosmos Reason is a 7-billion-parameter reasoning model built for physical AI and robotics. The idea is to let bots analyse situations, plan their actions and use prior knowledge and physical understanding to get jobs done, rather than standing there waiting for line-by-line instructions.
According to Nvidia, the tech acts like a robot’s brain, letting vision language action (VLA) models split complex orders into smaller tasks and carry them out using common sense, even when dumped in an unfamiliar environment. The company claims the model is fully open and customisable, ready for developers to tinker with.
While vision language models have been around since OpenAI’s CLIP, they have mostly been good at spotting objects or patterns, not chaining together multi-step tasks or navigating ambiguous situations. Nvidia says Cosmos Reason closes that gap by giving AI agents proper planning and reasoning.
Potential uses include robot planning, data management, annotation, and video analysis. Industry watchers reckon the tech could give a boost to big robotics players like Hon Hai and Quanta, as well as spur demand for AI servers to power these more capable mechanical helpers.
It is, of course, all very clever on paper. Now we just have to see if the robots remember to open the door before trying to walk through it.