🎥  How I use LLMs: Andrej Karpathy's Practical Guide


Introduction

In an era where large language models have transitioned from academic curiosity to indispensable tools, Andrej Karpathy's latest video stands as a definitive guide to navigating the complex and rapidly evolving LLM landscape.

As a follow-up to his previous deep dive into LLM fundamentals, this comprehensive 2+ hour tour de force demonstrates not just theoretical understanding but practical, real-world application of these transformative technologies. Andrej Karpathy, drawing on his perspective as a founding member of OpenAI and former Director of AI at Tesla, delivers what might be the most thorough examination of LLM capabilities available to the general public today.


The Vision: From Theory to Practice

Karpathy establishes his vision early: this is not just another technical overview but an "example-driven, practical walkthrough" designed to show viewers exactly how he integrates LLMs into his daily life and work. The video represents a crucial bridge between understanding how LLMs work (covered in his previous video) and actually using them effectively. This practical focus, combined with Karpathy's insider perspective and hands-on demonstrations, creates an invaluable resource for anyone looking to move beyond casual experimentation to serious LLM utilization.


Methodology and Presentation Style

What sets this video apart is Karpathy's commitment to showing rather than telling. Throughout the extensive runtime, he maintains a consistent pattern of explaining concepts followed immediately by live demonstrations. This approach transforms abstract capabilities into tangible skills viewers can replicate. His presentation style balances technical precision with accessibility, using clear analogies (like comparing LLMs to "zip files" containing compressed knowledge) while never shying away from the complexities that serious users need to understand.

The video's structure follows a logical progression from basic interactions to advanced features, making it suitable for viewers at various expertise levels. Karpathy's candid and conversational tone, complete with personal anecdotes and honest assessments of tool limitations, creates an authentic learning experience that feels more like a mentorship session than a formal tutorial.


Core Themes and Key Insights

1. The LLM Ecosystem Landscape

Karpathy begins by mapping the current LLM ecosystem, showing how the field has evolved since ChatGPT's 2022 debut. He identifies ChatGPT as the "Original Gangster incumbent" but acknowledges the emergence of numerous competitors including Anthropic's Claude, Google's Gemini, xAI's Grok, and international players like China's DeepSeek and France's Mistral. This overview provides crucial context for understanding the competitive dynamics driving rapid innovation in the space.

A particularly valuable insight is his introduction of leaderboards like Chatbot Arena and Scale's SEAL leaderboard as tools for tracking model performance. This empowers viewers to make informed decisions about which models to use for specific tasks, rather than relying solely on marketing claims or brand recognition.

2. Understanding Model Tiers and Economics

One of the most practical sections covers the pricing tiers and model availability across different platforms. Karpathy demonstrates how the same query can yield different results depending on whether you're using GPT-4o, GPT-4o mini, or other model variants. His explanation of the economics, in which larger models require more computational resources and therefore command higher prices, helps viewers understand the trade-offs between cost and capability.

His personal approach of subscribing to multiple services (ChatGPT Pro, Claude Pro, etc.) and maintaining what he calls an "LLM Council" for important decisions reveals how serious users leverage the strengths of different models. Putting the same question to several LLMs and comparing their answers provides cross-verification and diverse perspectives, which is particularly insightful for professionals relying on these tools for critical work.
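To make the "LLM Council" idea concrete, here is a minimal Python sketch, not Karpathy's actual setup. It assumes each council member is just a function that takes a question and returns a text answer; the names ask_council and summarize_council and the stub members are hypothetical, and no real provider API is called.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical type: a council member is any function mapping a question to an answer.
Member = Callable[[str], str]


def ask_council(question: str, members: Dict[str, Member]) -> Dict[str, str]:
    """Put the same question to every member and collect the answers by name."""
    return {name: ask(question) for name, ask in members.items()}


def summarize_council(answers: Dict[str, str]) -> dict:
    """Report the most common answer (after trimming and lowercasing) and whether all agree."""
    normalized = [answer.strip().lower() for answer in answers.values()]
    top_answer, votes = Counter(normalized).most_common(1)[0]
    return {
        "majority_answer": top_answer,
        "votes": votes,
        "unanimous": votes == len(normalized),
    }


# Stub members standing in for real model clients (assumption: replace with real calls).
stub_members = {
    "model_a": lambda q: "Yes",
    "model_b": lambda q: "yes ",
    "model_c": lambda q: "No",
}

council_answers = ask_council("Is this plan sound?", stub_members)
council_summary = summarize_council(council_answers)
print(council_answers)
print(council_summary)
```

Exact string matching only works for short, closed-form answers. For open-ended questions the comparison step is usually done by reading the answers side by side, which is closer to how the council is described in the video.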

3. The Revolution of Thinking Models

Perhaps the most technically significant portion covers "thinking models" - those trained with reinforcement learning to develop internal reasoning processes. Karpathy's demonstration of models like OpenAI's o1 series and DeepSeek R1 shows how these newer models approach complex problems differently, often taking more time but delivering superior results on mathematical, coding, and reasoning tasks.

His live comparison of GPT-4o and o1 pro on a gradient-check problem provides compelling evidence of the practical benefits. While the standard model offers generic debugging suggestions, the thinking model identifies the specific parameter mismatch, showing when and why users should opt for reasoning-capable models despite their slower response times.
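For readers unfamiliar with gradient checking, the following Python sketch shows the general technique: compare a hand-written gradient against a finite-difference estimate. It is a toy illustration under stated assumptions, not the code from the video; the loss function, the tolerance, and the deliberately buggy gradient (which returns its entries in the wrong parameter order) are all invented for this example.

```python
def numerical_gradient(f, params, eps=1e-6):
    """Estimate df/dparams[i] with central differences."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


def loss(params):
    """Toy loss (assumption): squared error of a one-input linear model."""
    w, b = params
    return (3.0 * w + b - 1.0) ** 2


def analytic_gradient(params):
    """Correct hand-derived gradient, ordered as [d/dw, d/db]."""
    w, b = params
    residual = 3.0 * w + b - 1.0
    return [2 * residual * 3.0, 2 * residual]


def mismatched_gradient(params):
    """Buggy variant: same values, but returned in the wrong parameter order."""
    dw, db = analytic_gradient(params)
    return [db, dw]


def gradient_check(f, grad_f, params, tol=1e-5):
    """Return True if the analytic and numerical gradients agree within a relative tolerance."""
    numeric = numerical_gradient(f, params)
    analytic = grad_f(params)
    return all(
        abs(n - a) <= tol * max(1.0, abs(n), abs(a))
        for n, a in zip(numeric, analytic)
    )


point = [0.5, -0.2]
print(gradient_check(loss, analytic_gradient, point))
print(gradient_check(loss, mismatched_gradient, point))
```

A failed check says only that the two gradients disagree. Locating the cause, such as parameters paired in the wrong order, is the part of the task where the review credits the thinking model with the more specific diagnosis.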

4. Tool Use: Expanding LLM Capabilities

Karpathy dedicates substantial time to tool use, demonstrating how LLMs have evolved from simple text generators to versatile systems capable of interacting with external resources. His coverage includes:

Internet Search: Showing how models can automatically search the web for recent information, with practical examples like finding release dates for TV shows. His comparison of search capabilities across different platforms reveals significant variations in implementation and effectiveness.

Deep Research: Perhaps the most impressive demonstration shows advanced models spending tens of minutes conducting comprehensive research on complex topics like supplement ingredients. Karpathy's example of researching Ca-AKG (calcium alpha-ketoglutarate) showcases how these systems can synthesize information from multiple sources, though he wisely cautions about potential hallucinations and the need for verification.

File Uploads and Document Analysis: His practical examples of analyzing nutrition labels, blood test results, and academic papers demonstrate the transformative potential of LLMs as research assistants. The technique of combining OCR capabilities with document analysis for everything from ingredient analysis to technical paper understanding provides a blueprint for knowledge workers.

5. Programming and Development Integration

As a programmer himself, Karpathy covers development tools with particular authority. His demonstration of Cursor's Composer feature shows how AI-assisted coding has evolved from simple code completion to autonomous development. The live building of a Tic-Tac-Toe game with confetti effects and sound illustrates the current state of AI-assisted development, a style he calls "vibe coding."

His explanation of the progression from simple code snippets (Command-K) to full autonomous development (Command-I) provides context for understanding where this technology is heading. His honest admission that these tools still require oversight, and sometimes a fallback to traditional programming, offers a balanced perspective on their current capabilities and limitations.

6. Multimodal Capabilities

The video extensively covers the expansion of LLMs beyond text to handle multiple modalities:

Audio Integration: Karpathy demonstrates both "fake audio" (speech-to-text conversion) and "true audio" (models that process audio natively). His comparison shows the advantages of true audio models like OpenAI's Advanced Voice Mode, which can handle nuances like timing, emotion, and even animal sounds that text-based systems miss.

Image Processing: Practical examples include analyzing nutrition labels, blood test results, and even memes. His technique of screenshotting and uploading information for analysis provides a workflow that viewers can immediately adopt. The demonstration of OCR capabilities combined with contextual understanding shows how these systems can extract meaningful information from visual data.

Video Understanding: The brief but impressive demonstration of video input capabilities through mobile apps hints at future possibilities. Karpathy notes that current implementations may still process video as sequences of images rather than true video streams, but the practical applications for education, entertainment, and documentation are clear.

7. Quality of Life Features

Karpathy doesn't overlook the smaller features that significantly impact user experience:

Memory Functionality: His demonstration of ChatGPT's memory feature shows how LLMs can maintain context across conversations, learning user preferences over time. This capability transforms these tools from stateless utilities into personalized assistants that improve with continued use.

Custom Instructions: The ability to customize how models respond, by setting preferences for formality, educational approach, or language learning needs, shows how users can tailor these tools to their specific requirements.

Custom GPTs: His examples of specialized GPTs for language learning demonstrate the power of creating customized tools for specific tasks. The detailed translator and vocabulary extractor provide practical templates that viewers can adapt for their own needs.


Production Quality and Technical Execution

The video represents a high standard for educational content about AI technology. Karpathy's screen recordings are clear and well-organized, with effective use of highlighting and annotation to draw attention to important elements. The pacing, while necessarily slow due to the technical depth, holds attention through Karpathy's lively delivery and the constant introduction of new, practical information.

The inclusion of timestamps and chapter markers in the description makes the extensive content navigable, allowing viewers to focus on specific areas of interest. This consideration for the viewer experience demonstrates Karpathy's understanding of his audience's needs and the practical challenges of consuming such comprehensive content.


Critical Assessment and Limitations

While overwhelmingly positive, a balanced assessment should note some limitations:

The video's sheer length (2+ hours) may be intimidating for casual viewers, though the chapter structure helps mitigate this issue. Additionally, Karpathy's position as an industry insider with access to premium tiers of multiple services may not reflect the reality of average users working with free or limited-access versions of these tools.

Some advanced features, like Deep Research, which at the time required the $200/month ChatGPT Pro tier, are impressive but may be beyond the reach of many viewers. However, Karpathy is transparent about these limitations and often shows alternative approaches for users with different budget constraints.


Broader Implications and Future Directions

Beyond the practical tutorials, this video offers insights into the broader trajectory of AI development and adoption. Karpathy's demonstrations reveal several key trends:

Integration: The increasing integration of LLMs into existing workflows and tools suggests a future where AI capabilities are seamlessly embedded rather than separate applications.

Specialization: The emergence of specialized models for different tasks (reasoning, creative writing, code generation, etc.) indicates a trend toward more targeted AI solutions.

Accessibility: The progression from command-line interfaces to natural language interactions and multimodal inputs shows how these technologies are becoming more accessible to non-technical users.

Democratization: Despite the advanced capabilities shown, Karpathy's emphasis on practical, replicable techniques demonstrates how these powerful tools are becoming available to increasingly broad audiences.


Conclusion

Andrej Karpathy's "How I use LLMs," an example-driven, practical walkthrough of large language models, stands as a landmark educational resource in the AI space. It successfully bridges the gap between theoretical understanding and practical application, providing viewers with not just knowledge but actionable skills they can immediately implement.

The video's greatest strength lies in its authenticity. Karpathy doesn't just demonstrate what these tools can do in theory; he shows how he actually uses them in his daily work and life. This practical wisdom, combined with his technical expertise and clear communication style, creates an educational experience that is both comprehensive and immediately applicable.

For anyone serious about understanding and utilizing LLMs effectively, from casual users to professionals, this video provides an essential foundation. As these technologies continue to evolve and integrate into every aspect of digital life, Karpathy's practical guide serves as both a comprehensive introduction to the current state of the art and a framework for adapting to future developments.

In a field characterized by rapid change and sometimes exaggerated claims, Karpathy's measured, practical, and thoroughly demonstrated approach offers exactly what users need: not hype, but helpful guidance on how to actually use these transformative tools to enhance their work, learning, and daily life.



Crepi il lupo! 🐺