Nvidia has no plans to add RISC-V support to CUDA, its proprietary GPU software platform, a company representative told HPCwire in response to a question during the CUDA 12 session at the GTC summit.
CUDA is crucial to Nvidia’s pivot to becoming a software company. The parallel-programming framework is behind the software and service offerings in markets like robotics, automotive and healthcare, which are designed to run only on Nvidia’s GPUs.
Nvidia is moving its software stack heavily in the direction of Arm, which is behind the company’s homegrown CPUs. CUDA already supports x86, but there’s no RISC-V support on the roadmap.
The upcoming version of CUDA, version 12, which is expected fairly soon, has many optimizations for Nvidia's Arm-based CPUs, called Grace. The chipmaker is pairing its latest Hopper-based GPUs with Grace CPUs via a proprietary interconnect called NVLink, which offers five times the bandwidth of PCIe Gen 5, the link that will connect x86 CPUs and Nvidia GPUs in other systems.
Nvidia was an early adopter of RISC-V for controllers in its GPUs, but that’s the best use of the architecture for now, said Jensen Huang, CEO at Nvidia, during a press briefing with the Asia-Pacific press.
“We like RISC-V because it’s open source… but more importantly, it’s adaptable. We can use it for all kinds of interesting configurations of CPUs. However, RISC-V is not appropriate yet and not for some time for external third-party software,” Huang said.
By comparison, the x86 and Arm architectures have a large software ecosystem that is not fragmented and is stable, regardless of the supplier it comes from, Huang said.
The benefit of RISC-V being open source and adaptable could also have its disadvantages, Huang said.
The RISC-V architecture is more like a chip version of Linux, and is free to license and modify. The goal is for companies to design and manufacture their own chips at a low cost, while cutting reliance on the proprietary x86 and Arm architectures, which have to be bought or licensed.
The RISC-V architecture has a base instruction set that companies can customize with their own proprietary extensions. For example, Nvidia competitor Imagination has made its own RISC-V CPU called Catapult, with which it can bundle its compatible GPU for graphics and AI. Imagination offers full software and debug support. Similarly, others offer RISC-V AI chips with vector extensions and their own software stacks.
Therein lies the problem. Huang views that incoherent software ecosystem, with different software offerings tuned to different chips, as a disadvantage for RISC-V. He indicated that contributing to a fragmented ecosystem won't be healthy for the development of RISC-V.
“We’ll see how the world evolves in the long term. But building an ecosystem that is software compatible, that is architecturally compatible, it’s very, very hard to do,” Huang said, adding “can you make a RISC-V that is like an ecosystem, like Arm and x86? Of course, but it will probably take a decade or two.”
Huang’s view could be a reflection of how Apple views RISC-V. Apple is replacing Arm controllers with RISC-V cores in non-user-facing parts, semiconductor analyst Dylan Patel said in a newsletter post earlier this month. Those parts typically rely less on system software.
RISC-V International, which drives the development of the architecture and extensions, is mostly focused on hardware extensions. Open-source developers and companies backing RISC-V are developing and upstreaming Linux 6.0 support for the new extensions, which is being documented by Michael Larabel at Phoronix.
While software remains an issue, hardware adoption of the RISC-V architecture is growing. Intel is creating a RISC-V chip with the Barcelona Supercomputing Center, and Google is collaborating with SiFive on chips for artificial intelligence applications.
Asked to comment on Huang’s remarks and on Nvidia’s stance on CUDA for RISC-V, RISC-V International CEO Calista Redmond didn’t directly address the topic.
“We’re seeing increased momentum and investment across the full spectrum of computing, from datacenter to mobile. The ecosystem is growing and moving fast as well. What may have taken decades in the past is now well given the demand for design flexibility is ushering in diversity on a shared and open set of standards such as our single hypervisor approach,” said Redmond via email.