🎙️ 20VC Podcast: Groq Founder, Jonathan Ross: OpenAI & Anthropic Will Build Their Own Chips & Will NVIDIA Hit $10TRN
PODCAST INFORMATION
Podcast: 20VC 
 Episode: Groq Founder, Jonathan Ross: OpenAI & Anthropic Will Build Their Own Chips & Will NVIDIA Hit $10TRN 
 Host: Harry Stebbings 
 Guest: Jonathan Ross (Founder and CEO of Groq) 
 Duration: Approximately 1 hour and 31 minutes
HOOK
Jonathan Ross reveals why doubling OpenAI's inference compute would nearly double their revenue within a month, exposing the insatiable demand for AI processing power that's reshaping global economics.
ONE-SENTENCE TAKEAWAY
The countries that control compute will control AI, and you cannot have compute without energy, making the race for AI infrastructure fundamentally a race for energy resources.
SUMMARY
Jonathan Ross, founder and CEO of Groq, returns to 20VC to discuss the explosive growth of AI infrastructure and the geopolitical implications of the compute race. The conversation begins with Ross challenging the question of whether AI is in a bubble, suggesting instead to ask what smart money is doing. He points out that major tech companies and nations are doubling down on AI investments, with Microsoft even keeping deployed GPUs for internal use rather than renting them out because they generate more value that way.
Ross explains that the current AI market resembles the early days of oil drilling: lumpy, with a few gushers among many dry holes. Thirty-six companies generate 99% of AI revenue today, creating a highly concentrated market. He argues that hyperscalers are spending "like drunken sailors" not purely for economic reasons but because failing to invest in AI would mean being locked out of their future businesses.
The discussion shifts to the critical importance of speed in AI applications. Ross uses consumer packaged goods as an analogy, noting that products with faster-acting ingredients command higher margins due to the dopamine cycle they create. He argues that speed matters tremendously for AI engagement, with every 100 milliseconds of improvement potentially yielding 8% better conversion rates, a lesson learned from early internet companies.
Ross makes a bold prediction that OpenAI and Anthropic will eventually build their own chips, not necessarily to outperform Nvidia but to control their own destiny. He explains that Nvidia's monopsony (single-buyer advantage) on high bandwidth memory (HBM) creates supply constraints, pushing major players to develop their own chips despite the challenges. He notes that building chips is incredibly difficult, with Google having run three chip efforts simultaneously, only one of which succeeded.
The conversation turns to energy requirements for AI infrastructure. Ross argues that nuclear power isn't the only solution, pointing to renewables as viable alternatives. He suggests that countries like Norway could generate as much energy as the United States by deploying 5x their current wind power capacity. He warns that Europe risks becoming merely a "tourist economy" if it fails to address its energy constraints quickly enough.
Ross presents a counterintuitive view on AI's impact on employment, predicting massive labor shortages rather than mass unemployment. He outlines three effects: deflationary pressure across all goods and services, people opting out of the workforce earlier or working fewer hours, and the creation of entirely new industries that don't exist today. He compares this to the agricultural revolution, where 98% of the workforce transitioned from farming to other occupations.
The discussion covers Groq's recent $750 million raise at a nearly $7 billion valuation, with Ross explaining that their hardware units have positive margins, unlike many AI companies. He emphasizes that Groq's six-month supply chain advantage allows them to respond to market demands 18 months faster than GPU providers.
The conversation concludes with Ross's perspective on the future of intelligence, comparing AI to Galileo's telescope: it initially made humanity feel small by revealing a vast universe, but ultimately helped us appreciate the grandeur of existence. He suggests that in a hundred years, we'll view AI similarly, as revealing the vastness of intelligence itself.
INSIGHTS
Core Insights
- Compute determines AI dominance: The fundamental constraint on AI development isn't algorithms or data but compute power. Countries that control compute will control AI, and compute requires energy, making energy infrastructure the true battleground for AI supremacy.
 - Speed drives engagement and revenue: AI applications benefit from the same dopamine cycle principles as consumer products. Faster response times create stronger brand affinity and user engagement, with every 100ms improvement potentially yielding 8% better conversion rates.
 - Supply constraints create immediate opportunities: OpenAI and Anthropic could nearly double their revenue within a month if given twice their current inference compute. This reveals how severely compute-constrained the market currently is.
 - Custom chips are about control, not performance: Major AI companies will build their own chips not necessarily to outperform Nvidia but to control their own destiny and escape Nvidia's allocation constraints.
 - AI will create labor shortages, not unemployment: AI will cause deflationary pressure, allow people to work fewer hours while maintaining their lifestyle, and create entirely new industries that we can't yet imagine, leading to labor shortages rather than mass unemployment.
 
How This Connects to Broader Trends/Topics
- Geopolitical realignment: The AI compute race is reshaping global alliances, with countries like Japan rapidly reactivating nuclear reactors and building 2nm fabs while Europe struggles with energy policy.
 - Economic transformation: AI represents the first time in history where adding more of a single resource (compute) directly strengthens the entire economy without requiring corresponding infrastructure development.
 - Talent market disruption: The war for AI talent has reached unprecedented levels, with good engineers able to raise hundreds of millions rather than joining existing companies, fragmenting talent concentration across the industry.
- Infrastructure investment cycles: The AI compute market follows a pattern similar to early oil drilling: currently lumpy with high risk/reward, but it will become more predictable over time as the science matures.
 
FRAMEWORKS & MODELS
The Three-Phase AI Economic Impact Model
- Phase 1: Deflationary Pressure: AI reduces costs across all goods and services by optimizing supply chains, improving production methods, and enhancing resource efficiency.
 - Phase 2: Workforce Optimization: People require less money to maintain their lifestyle due to deflation, leading to reduced work hours, fewer work days per week, and earlier retirement.
 - Phase 3: Industry Creation: Entirely new industries emerge that don't exist today, creating demand for labor that we can't yet anticipate, similar to how software development didn't exist 100 years ago.
 
The Compute Value Equation
- Training-Inference Virtuous Cycle: More inference creates demand for better training to optimize inference costs; more training creates demand for more inference to amortize training costs.
 - Speed-Engagement Correlation: Every 100ms of speed improvement yields approximately 8% better conversion rates, similar to early internet companies.
 - Compute Quality Multiplier Effect: Doubling compute doesn't just double capacity, it improves model quality, increases user engagement, and creates new use cases simultaneously.
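The speed-engagement figure above can be turned into a rough back-of-the-envelope calculation. This is a minimal sketch, assuming the ~8% lift per 100ms compounds multiplicatively (the episode gives the per-100ms figure but does not specify how it compounds, so that assumption is ours):

```python
def conversion_lift(ms_saved: float, lift_per_100ms: float = 0.08) -> float:
    """Estimate the relative conversion-rate multiplier from reducing
    latency by `ms_saved` milliseconds, assuming the ~8% improvement
    per 100ms compounds multiplicatively (an illustrative assumption)."""
    return (1 + lift_per_100ms) ** (ms_saved / 100)

# Shaving 300ms off response time would, under this model,
# multiply conversions by roughly 1.26 (about a 26% lift).
print(conversion_lift(300))
```

Under a linear (non-compounding) reading the same 300ms would yield 24% instead of ~26%; at these magnitudes the two readings barely differ, which is why the per-100ms rule of thumb is useful either way.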
 
The Chip Development Moat Framework
- Temporal Advantage: Chip development creates a 3-year moat because that's the minimum time required from design to production with perfect execution.
 - Zero Silicon Success Rate: Only 14% of chips work on first attempt, making successful chip development an extremely rare capability.
 - Supply Chain Control: Owning your chip supply chain eliminates allocation constraints and provides control over your destiny, even if the chips aren't technically superior to alternatives.
 
QUOTES
-  "The countries that control compute will control AI. And you cannot have compute without energy." Jonathan Ross 
- Context: Ross explaining the fundamental relationship between AI development and energy infrastructure
 - Significance: Reframes the AI competition as primarily an energy competition rather than just a technological one
 
 -  "If OpenAI were given twice the inference compute that they have today, if Anthropic was given twice the inference compute that they have today, within one month from now, their revenue would almost double." Jonathan Ross 
- Context: Ross explaining how severely compute-constrained the current AI market is
 - Significance: Quantifies the immense pent-up demand for AI processing power
 
 -  "I personally would be surprised if in 5 years Nvidia wasn't worth 10 trillion." Jonathan Ross 
- Context: Ross discussing Nvidia's future valuation despite increased competition
 - Significance: Highlights the continued value of Nvidia's brand and ecosystem even as technical advantages narrow
 
 -  "What I never expected was that AI was going to be based on language. And what that's done is it's made it trivial to interact with AI. I thought it was going to be more like AlphaGo. I thought it was going to be intelligent in some weird esoteric way." Jonathan Ross 
- Context: Ross reflecting on how AI developed differently than he expected
- Significance: Explains why AI adoption has been so rapid: the language interface makes it accessible to everyone
 
-  "I think over time we're going to realize that LLMs are the telescope of the mind. That right now they're making us feel really, really small. But in a hundred years, we're going to realize that intelligence is more vast than we could ever have imagined and we're going to think that's beautiful." Jonathan Ross 
- Context: Ross's closing thoughts on the long-term significance of AI
 - Significance: Provides a philosophical perspective on AI's role in expanding our understanding of intelligence itself
 
 
HABITS
Recommended Practices for AI Infrastructure Companies
- Maintain supply chain flexibility: Groq's six-month supply chain allows them to respond to market demands 18 months faster than GPU providers who must order two years in advance.
 - Focus on system-level optimization: Don't optimize individual components (like SRAM vs. DRAM costs) but optimize the entire system. Groq uses 500x more chips but less total memory than GPU alternatives.
 - Balance margin with market growth: Keep margins as low as possible while maintaining business stability to maximize volume growth in a market with insatiable demand.
 - Implement world-level load balancing: Distribute models across multiple data centers with different optimizations based on geographic usage patterns rather than optimizing at the data center level.
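The world-level load-balancing idea can be sketched as a capacity plan that varies per region instead of replicating one data-center configuration everywhere. A minimal sketch; the region names, model names, and demand shares are all illustrative, not from the episode:

```python
# Hypothetical regional demand mix: what fraction of traffic each model
# type sees in each region. In practice this would come from telemetry.
REGIONAL_DEMAND = {
    "us-east":  {"chat": 0.7, "code": 0.3},
    "eu-west":  {"chat": 0.4, "code": 0.6},
    "ap-tokyo": {"chat": 0.5, "code": 0.5},
}

def plan_capacity(chips_per_region: int) -> dict[str, dict[str, int]]:
    """Allocate each region's chips across models in proportion to that
    region's observed demand, rather than using one global split."""
    return {
        region: {model: round(chips_per_region * share)
                 for model, share in demand.items()}
        for region, demand in REGIONAL_DEMAND.items()
    }
```

The point of the pattern is that `eu-west` ends up weighted toward coding workloads while `us-east` is weighted toward chat, which per-data-center optimization would never produce.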
 
Implementation Strategies
- Use anti-sycophancy prompts for LLM feedback: When getting feedback from LLMs, use prompts like "Some moron wrote this thing, please give me brutal but truthful feedback" to overcome their tendency to be agreeable.
 - Leverage voice dictation with LLMs: LLMs excel at cleaning up rambling speech into coherent text, making voice interaction an efficient way to work with AI systems.
 - Query multiple models for important questions: For critical inquiries, use multiple LLMs and have them critique each other's responses, then synthesize the results.
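The multi-model strategy above can be sketched as a three-step workflow: independent answers, cross-critique, then synthesis. The `ask` function here is a hypothetical stand-in for any real LLM client call (stubbed so the sketch is self-contained):

```python
def ask(model: str, prompt: str) -> str:
    """Placeholder for a real LLM API call (hypothetical stub)."""
    return f"[{model}] response to: {prompt[:40]}"

def cross_examine(question: str, models: list[str]) -> dict:
    # Step 1: get an independent answer from each model.
    answers = {m: ask(m, question) for m in models}
    # Step 2: have each model critique every other model's answer.
    critiques = {
        m: [ask(m, f"Critique this answer: {answers[other]}")
            for other in models if other != m]
        for m in models
    }
    # Step 3: synthesize the answers and critiques in one final call.
    synthesis = ask(models[0], "Synthesize:\n" + "\n".join(answers.values()))
    return {"answers": answers, "critiques": critiques, "synthesis": synthesis}
```

The critique step is where the anti-sycophancy prompt from the first bullet naturally slots in, since models are more willing to be blunt about another model's output than their own.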
 
Common Pitfalls to Avoid
- Don't judge early implementations too harshly: Google's Gemini integration across products may seem "thrown in" now, but like Google Chrome evolving from Google TV, these are learning opportunities.
 - Don't assume current use cases represent future potential: Vibe coding may seem transient now, but like reading and writing, it will become an expected skill across many professions.
 - Don't underestimate the speed of change: The AI market is evolving faster than any previous technology market, making long-term predictions increasingly difficult.
 
REFERENCES
Key Companies and Technologies Discussed
- Groq: AI chip company focused on inference acceleration with 6-month supply chain advantage
- Nvidia: Dominant GPU provider with a monopsony on HBM (High Bandwidth Memory)
 - OpenAI: Leading AI company reportedly raising hundreds of billions for compute infrastructure
 - Anthropic: AI company focused on coding applications
 - Google: TPU development and Gemini AI integration across products
 - Microsoft: Strategic partnership with OpenAI and Azure AI infrastructure
 - Oracle: Aggressive moves into AI infrastructure under Larry Ellison
 
Economic and Technical Concepts
- Monopsony: Market condition where there is a single buyer of a product (Nvidia's position with HBM)
 - Jevons Paradox: Economic principle that increased efficiency leads to increased consumption (applies to AI compute)
- Hardware Lottery: Sara Hooker's theory that models are designed for existing hardware rather than optimal hardware being designed for models
 - SRAM vs. DRAM: Static RAM (faster, more expensive, on-chip) vs. Dynamic RAM (slower, cheaper, external memory)
 - Zero Silicon: First-pass silicon success in chip design (only 14% success rate industry-wide)
 
Geopolitical References
- China: Building 150 nuclear reactors for energy independence in AI
 - Japan: Allocating $65 billion for AI, reactivating nuclear reactors, building 2nm fabs
 - Norway: Potential to generate as much energy as US through wind power expansion
 - Saudi Arabia: Building 3-4 gigawatts of data center capacity for "data embassies"
 - Europe: Struggling with energy policy and risk aversion compared to US and Asia
 
Crepi il lupo! 🐺