Karachi Chronicle

Deep Dive into LLMs like ChatGPT by Andrej Karpathy

By Adnan Mahar, February 9, 2025 (3 min read)

I think everything Andrej Karpathy shares on X or YouTube is a gold mine of information, and his latest video, Deep Dive into LLMs like ChatGPT, is no exception. It provides a detailed breakdown of the key mechanisms behind LLMs. It runs a whopping 3 hours and 30 minutes and covers a lot of ground, so I highly recommend watching it. However, if you have been putting it off because of its length, here is a TL;DW version.

LLMs are trained in two major stages: pre-training (learning from a vast dataset) and post-training (fine-tuning with human feedback).

Pre-training involves “downloading and preprocessing the internet,” using datasets such as FineWeb (44TB, 15 trillion tokens) from Hugging Face.

Post-training improves the model through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).

LLMs suffer from hallucinations, but mitigations include model interrogation, web search integration, and structured reasoning techniques.

Prompt engineering plays a key role in optimizing LLM output, and in-context learning allows adaptation without retraining.

LLMs still have their challenges, such as struggling with counting and spelling and being brittle in certain scenarios, but improvements are ongoing.

The journey to building an LLM begins with pre-training on a large text dataset. Karpathy aptly describes this step as “downloading and preprocessing the internet.” Large datasets such as Hugging Face’s FineWeb (44TB, 15 trillion tokens) play an important role in training the model to predict the next word in a sentence.

Key pre-training processes include:

However, the base model at this stage is essentially an “internet document simulator.” It can predict the next token, but it is not optimized for helpful conversation.
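To make "predicting the next token" concrete, here is a minimal sketch using simple bigram counts. This is only a toy stand-in and not how the models in the video work: real LLMs use neural networks trained on trillions of tokens. The corpus, the whitespace "tokenizer", and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each token, which tokens follow it in the training text."""
    tokens = text.split()  # whitespace split stands in for a real tokenizer
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_tokens=5):
    """Greedily extend `start` one predicted token at a time."""
    output = [start]
    for _ in range(max_tokens):
        nxt = predict_next(counts, output[-1])
        if nxt is None:
            break
        output.append(nxt)
    return " ".join(output)

# Hypothetical miniature "internet" used as training data.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))
print(generate(model, "on", max_tokens=2))
```

Like a base model, this toy only continues text in the style of what it has seen. Nothing in it makes the output helpful or conversational, which is the gap post-training is meant to close.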

I really like the idea of the “psychology of a base model” as explained here.

Once the base model is trained, two important post-training steps follow: supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).

Reinforcement learning from human feedback (RLHF): Karpathy compares this stage to a student working through practice problems after learning the theory. RLHF helps shape the model’s behavior so that it is more aligned with human preferences, reducing undesirable responses and improving consistency.

A key issue with LLMs is hallucination, where the model confidently generates incorrect or nonsensical information. Karpathy highlights the following mitigation techniques:

Model interrogation – Teach the model to recognize the gaps in its own knowledge.

Tool use and web search – Allow LLMs to query external sources to verify facts.

Encourage reasoning – Ask the model to “think step by step” before answering complex questions.
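The first two mitigations above can be pictured as a simple decision flow: answer when the fact is known, consult an external source when it is not, and otherwise admit uncertainty. The sketch below hard-codes that flow purely for illustration. In a real LLM this behavior is learned during training rather than written as `if` statements, and `known_facts`, `answer`, and `fake_search` are hypothetical names, not part of any real API.

```python
def answer(question, known_facts, search_tool=None):
    """Answer from known facts, fall back to a tool, or admit uncertainty."""
    key = question.strip().lower()
    if key in known_facts:
        return known_facts[key]
    if search_tool is not None:
        result = search_tool(question)
        if result is not None:
            return f"(from search) {result}"
    # Refusing is preferable to inventing a confident but wrong answer.
    return "I don't know."

# Hypothetical stand-ins for the model's knowledge and a web search tool.
known_facts = {"who created fineweb?": "Hugging Face"}

def fake_search(question):
    """Pretend search tool: only 'finds' one hard-coded fact."""
    if "fineweb" in question.lower() and "tokens" in question.lower():
        return "FineWeb contains about 15 trillion tokens."
    return None

print(answer("Who created FineWeb?", known_facts))
print(answer("How many tokens are in FineWeb?", known_facts, fake_search))
print(answer("Who is Orson Kovacs?", known_facts, fake_search))
```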

Karpathy also delves into prompt engineering and explains how structured prompts can greatly improve the model’s output. He also discusses in-context learning, where LLMs learn from the structure of the prompt without any weight updates.
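As a rough illustration of in-context learning, a few-shot prompt places worked examples directly in the prompt so the model can pick up the pattern without any retraining. The sketch below only builds the prompt text; the `Input:`/`Output:` layout and the function name are assumptions for illustration, and sending the prompt to an actual model is left out.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Assemble a few-shot prompt: optional instruction, worked examples, then the query."""
    lines = []
    if instruction:
        lines.append(instruction)
        lines.append("")
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final Output is left blank for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

# Hypothetical translation examples used as demonstrations.
examples = [("sea otter", "loutre de mer"), ("peppermint", "menthe poivrée")]
prompt = build_few_shot_prompt(examples, "cheese", "Translate English to French.")
print(prompt)
```

The model's weights never change here. All of the "learning" comes from the examples sitting in the context window.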

Despite their incredible abilities, Karpathy acknowledges that LLMs still face fundamental limitations, such as:

Struggling with counting and spelling.

Brittleness in specific use cases (the “Swiss cheese” model of capabilities).

The challenge of extending structured reasoning into open-ended, creative tasks.
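One common explanation for the counting and spelling weakness is that the model operates on tokens rather than individual characters. The sketch below uses an invented two-entry vocabulary and a greedy longest-match splitter to show the idea. Real tokenizers are learned from data and split text differently, so treat the specific pieces and IDs here as hypothetical.

```python
def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first; fall back to a single character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces

# Hypothetical vocabulary mapping text pieces to token IDs.
vocab = {"straw": 0, "berry": 1}

pieces = toy_tokenize("strawberry", vocab)
token_ids = [vocab[p] for p in pieces]

print(pieces)     # what a character-level view would still let us inspect
print(token_ids)  # what the model actually receives
print("strawberry".count("r"))  # trivial with characters, not visible in the IDs
```

Counting letters is easy when you can see the characters, but the two token IDs carry no direct information about how many times "r" appears inside them.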

However, the field is moving rapidly, and we are only scratching the surface of what is possible with AI-driven reasoning and problem solving.

I don’t think LLMs have been compared to Swiss cheese before (or at least I’ve never come across this analogy), but I like it.

Karpathy’s presentation provides a non-technical, clear, and engaging deep dive into how LLMs are trained, fine-tuned, and improved. Whether you’re interested in building AI applications, want to understand AI tuning, or are fascinated by the inner workings of ChatGPT, this talk is an essential watch.



