There’s no denying that we’re making huge strides in technology right now, not just in large language models, but also in the infrastructure underneath them.
In fact, I’ve seen a lot of the community’s focus shift toward GPUs and hardware standards. Models still evolve at lightning speed, of course, but the bare-metal side is worth a look as well.
With that in mind, there’s a lot to unpack from a recent episode of the No Priors podcast, where Nvidia CEO Jensen Huang offers some insight.
And it is very much Nvidia’s moment, as the company eclipses Apple and Microsoft for the title of the world’s most valuable tech company. The major data centers being built now run on Nvidia’s high-end, off-the-shelf GPUs, and Huang has a lot to say about this transformation.
“The world has changed,” he said, speaking of cluster parallelism and advances in co-design. “The scale has changed.”
The evolution of Moore’s Law
Huang shared his take on the history of hardware evolution with the hosts, recalling how a maxim known in the community as Moore’s Law held true for many years: the prediction that transistor counts, and the processing power they deliver, would roughly double every two years or so.
For reference, here’s how ChatGPT explains Moore’s Law:
“Moore’s Law is an observation made by Gordon Moore, co-founder of Intel, in 1965. It predicts that the number of transistors on a microchip will roughly double every two years, leading to a corresponding increase in computing power and a decrease in relative cost. This trend has fueled rapid advances in computing power and has been a fundamental principle guiding the semiconductor industry for decades.”
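To make that doubling concrete, here’s a quick back-of-the-envelope in Python. The Intel 4004’s roughly 2,300 transistors in 1971 is a well-known historical baseline, but the choice to use it here is mine, not the podcast’s:

```python
# A minimal sketch of the Moore's Law trend described above:
# transistor counts doubling roughly every two years, starting from
# the Intel 4004 (~2,300 transistors, 1971) as an illustrative baseline.

def transistors(year: int, base_year: int = 1971, base_count: int = 2_300,
                doubling_period_years: float = 2.0) -> float:
    """Projected transistor count under an idealized Moore's Law curve."""
    periods = (year - base_year) / doubling_period_years
    return base_count * 2 ** periods

for y in (1971, 1981, 1991, 2001, 2011, 2021):
    print(f"{y}: ~{transistors(y):,.0f} transistors")
```

Run it and the curve lands in the right ballpark: tens of thousands of transistors by the early 1980s, billions by the 2010s, which is roughly what shipping chips actually did.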
It’s a bit ironic, because among the companies Nvidia has overtaken is none other than Intel itself. But I digress…
Now, Huang said, with even more rapid change, we’re looking at what he called a kind of “hyper Moore’s Law.”
To get there, he suggested, planners need to look at the architecture and the system together in a “whole stack approach.”
“You can treat the network as a computational fabric and push a lot of work to the network, push a lot of work to the fabric,” he said. “And as a result, you’re compressing … on very large scales.”
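Huang doesn’t walk through the mechanics on air, but one way to picture “pushing work to the fabric” is in-network aggregation, where the switch sums gradients so each node exchanges one message instead of many (switch-level schemes like Nvidia’s SHARP work in this spirit). Here’s a toy sketch; every name in it is mine, invented for illustration, not a real API:

```python
# Toy illustration of in-network reduction: the "fabric" (modeled as a
# plain function) aggregates per-node gradients in one pass, so each
# node sends and receives a single message instead of N - 1.

from typing import List

def in_network_reduce(gradients: List[List[float]]) -> List[float]:
    """Sum per-node gradient vectors, as a reduction-offload switch might."""
    return [sum(vals) for vals in zip(*gradients)]

# Four nodes each contribute a small gradient vector.
node_grads = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.2], [0.4, 0.5]]
print(in_network_reduce(node_grads))  # [1.0, 1.0]
```

The point isn’t the arithmetic, which is trivial, but where it happens: moving the reduction into the network is what lets the traffic compress “on very large scales.”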
Inference and Latency: Making Real-Time Systems Smarter
Huang also discussed the work of adapting to language models and neural networks that exploit inference-time scaling, generating chains of thought and reasoning on the fly.
“We have to invent something new,” he said, noting that low latency and high throughput are fundamentally at odds. He also raised the possibility of the industry moving into a diverse era of all kinds of language model sizes, including small language models, or SLMs.
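To see why low latency and high throughput pull against each other, consider a simple batching model: bigger batches amortize fixed overhead (better throughput) but make every request wait longer (worse latency). The service-time model and all the numbers below are my illustrative assumptions, not Nvidia figures:

```python
# Back-of-the-envelope for the latency/throughput tension: a batch pays
# a fixed overhead plus an incremental cost per request it carries.

FIXED_MS = 40.0    # per-batch overhead (kernel launches, weight reads)
PER_REQ_MS = 2.0   # incremental cost of each extra request in the batch

for batch in (1, 8, 64):
    batch_ms = FIXED_MS + PER_REQ_MS * batch
    throughput = batch / (batch_ms / 1000)  # requests served per second
    print(f"batch={batch:3d}  latency={batch_ms:6.1f} ms  "
          f"throughput={throughput:7.1f} req/s")
```

Even in this crude model, going from a batch of 1 to 64 multiplies throughput roughly sixteen-fold while quadrupling per-request latency, which is exactly the trade-off that real-time reasoning systems have to design around.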
“You’re still going to create these incredible frontier models,” he said. “They will be (used for) groundbreaking work. You will use (them) for generating synthetic data. You’re going to use models, big models, to teach smaller models and distill (down to) the smaller models.”
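The distillation idea he’s gesturing at can be sketched in a few lines: the big model’s softened output distribution becomes the training target for the small one. This is a minimal illustration with made-up logits, not anyone’s production recipe:

```python
# Minimal knowledge-distillation sketch: the student is trained to match
# the teacher's temperature-softened probability distribution.

import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's soft targets."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

teacher = [4.0, 1.0, 0.5]  # confident frontier-model output (illustrative)
student = [2.5, 1.2, 0.8]  # smaller model still learning the same shape
print(f"loss: {distillation_loss(teacher, student):.4f}")
```

The temperature parameter is the interesting design choice: softening the teacher’s distribution exposes how it ranks the wrong answers too, which carries more signal than hard labels alone.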
The Big Customer: The xAI Project
Later, Huang revealed some very interesting details about building the xAI data center, which Nvidia worked on with Elon Musk.
He gave Musk a lot of credit for the quick decision-making and execution that stood up this supercluster, with many people working in parallel to get everything done quickly.
“It’s really a testament to his will and how he’s able to think through mechanical things, electrical things and overcome what seem like tremendous obstacles,” Huang said.
He also noted that the teams used a digital twinning process to help stand up the systems.
“We have simulated all the network configurations, we have pre-staged everything as a digital twin. We pre-staged the entire supply chain. We pre-staged all the network wiring. We even created a small version of it, kind of, you know, just a first example of it … so by the time it all arrived, it was all staged. All the exercises were done, all the simulations were done. And then the massive integration, … monumental teams of humanity crawling over each other, installing everything, 24/7, and within a few weeks, the clusters were up.”
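We don’t get details about xAI’s actual digital twin, but the spirit of pre-staging is easy to sketch: validate the plan in software before anyone touches hardware. Here’s a toy wiring-plan check, with the topology, node names, and port counts entirely invented for illustration:

```python
# Toy "digital twin" pre-staging check: verify a planned cabling layout
# against per-device port budgets before any physical installation.

planned_links = [("gpu-rack-01", "leaf-sw-01"), ("gpu-rack-02", "leaf-sw-01"),
                 ("leaf-sw-01", "spine-sw-01"), ("leaf-sw-02", "spine-sw-01")]
port_capacity = {"leaf-sw-01": 2, "leaf-sw-02": 4, "spine-sw-01": 2,
                 "gpu-rack-01": 1, "gpu-rack-02": 1}

used: dict[str, int] = {}
for a, b in planned_links:
    for node in (a, b):
        used[node] = used.get(node, 0) + 1

for node, count in used.items():
    status = "OK" if count <= port_capacity[node] else "OVERSUBSCRIBED"
    print(f"{node}: {count}/{port_capacity[node]} ports -> {status}")
```

In this tiny example the check flags one oversubscribed switch, which is precisely the kind of mistake you’d rather catch in simulation than mid-install with crews working 24/7.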
What was special about the project? With literally tons of equipment, he said, the pace of the project was “abnormal.”
AI chip designers?
Huang confirmed that the company uses AI agents as chip designers and software engineers.
“We couldn’t have built Hopper without (them),” he said. “They can explore a much larger space than we can. They have endless time to explore the space.”
Companies and Change
Reflecting on the past few years, in which Nvidia’s market cap has skyrocketed, Huang talked about what the effect has been inside the company.
“A company can’t change as fast as the stock price,” he said, citing the value of discussion, of knowing what’s really happening within the industry to drive change.
What he’s realized, he said, is that Nvidia has essentially reinvented computing for the first time in about 60 years, driving down the marginal cost of computing until it makes sense for computers to do the tasks themselves.
This is a game changer, and that’s an understatement! I’ve been looking at Anthropic’s Claude and OpenAI’s o1 and Orion, and one thing is for sure: when the market effects arrive, things will never be the same.
I may cover more of Huang’s comments elsewhere, as he dives deeper into the new ability of systems to do things on their own, with minimal supervision, and into AI taking a bigger role in companies’ processes.
This is where you really start to see the effects of agentic AI: AI agents taking on engineering and design roles and being judged on the results.
It is, without a doubt, a time of incredible change.