Designing an inference chip for robots is genuinely difficult.
In a data center, each chip is bathed in a jacuzzi and babysat by nannies. If one dies, it is hot-swapped with one of its clones.
The fault rate of GPUs in data centers is actually quite high. The industry-average annual failure rate of an H100 is around 9%. Ideal conditions can push it down to around 2%, but it never drops out of the single-digit range.
Fault recovery for a GPU node can also take a while, anywhere from minutes to hours. It is not instantaneous.
On a robot, the chip is out in the cold and needs rapid self-recovery. The fault-tolerance requirements are in a different league. It is not uncommon for robotics companies to struggle to keep a chip running for more than a few hours without a reboot.
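To make "rapid self-recovery" concrete, here is a minimal sketch of one common pattern: a heartbeat watchdog that restarts the inference worker when it stops checking in. Everything here is hypothetical (the `Watchdog` class, the `restart_fn` callback, the timeout value); real robot stacks typically push this down into a hardware watchdog timer, but the logic is the same.

```python
import time


class Watchdog:
    """Hypothetical software watchdog: if the inference worker stops sending
    heartbeats within `timeout_s`, invoke a recovery callback (e.g. restart
    the worker process or power-cycle the accelerator)."""

    def __init__(self, timeout_s, restart_fn, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.restart_fn = restart_fn  # recovery action, supplied by the caller
        self.clock = clock            # injectable clock, eases testing
        self.last_beat = clock()
        self.restarts = 0

    def heartbeat(self):
        # Called by the inference worker on every successful loop iteration.
        self.last_beat = self.clock()

    def poll(self):
        # Called periodically by a supervisor; fires recovery on a missed deadline.
        if self.clock() - self.last_beat > self.timeout_s:
            self.restarts += 1
            self.restart_fn()
            self.last_beat = self.clock()  # give the restarted worker a fresh deadline
```

A supervisor thread would call `poll()` every few hundred milliseconds while the worker calls `heartbeat()` per inference step; the point is that recovery is automatic and takes seconds, not the minutes-to-hours of a data-center node replacement.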
For chip vendors, this is great news: they can simply tell robotics companies to buy extra chips for hot swapping.
For robotics companies, it is bad news: buying spares is obviously not a scalable solution, yet they are stuck in endless back-and-forth JIRA tickets with their vendors.