Karachi Chronicle
AI

Deep Dive into LLMs like ChatGPT by Andrej Karpathy

By Adnan Mahar · February 9, 2025


I think everything Andrej Karpathy shares on X or YouTube is a gold mine of information, and his latest video, Deep Dive into LLMs like ChatGPT, is no exception. It provides a detailed breakdown of the key mechanisms behind LLMs. At a whopping 3 hours and 30 minutes it covers a lot of ground, so I highly recommend watching it. But if the length has put you off, here is a TL;DW version.

  • LLMs are trained in two major stages: pre-training (learning from a vast dataset) and post-training (fine-tuning with human feedback).
  • Pre-training amounts to “downloading and processing the internet,” using datasets such as FineWeb (44TB, 15 trillion tokens) from Hugging Face.
  • Post-training improves the model through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
  • LLMs suffer from hallucinations, but mitigations include model interrogation, web search integration, and structured reasoning techniques.
  • Prompt engineering plays a key role in optimizing LLM output, and in-context learning allows adaptation without retraining.
  • LLMs still have challenges: work is ongoing on counting, spelling, and brittleness in certain scenarios.

The journey to building an LLM begins with pre-training on a massive text dataset. Karpathy memorably describes this step as “downloading and preprocessing the internet.” Large datasets such as Hugging Face’s FineWeb (44TB, 15 trillion tokens) play an important role in training the model to predict the next word in a sentence.

Key pre-training processes include:

However, the base model at this stage is basically an “internet document simulator”: it predicts the next token, but it is not yet optimized for useful conversation.
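The next-token objective that pre-training optimizes can be illustrated with a toy model. This is my own sketch, not code from the talk: a counting bigram model stands in for a neural network, and whitespace splitting stands in for a real subword tokenizer.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str):
    """Count, for each token, which token follows it and how often."""
    counts = defaultdict(Counter)
    tokens = corpus.split()  # real LLMs use subword tokenizers, not whitespace
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token: str) -> str:
    # Greedy decoding: pick the most frequent continuation.
    return counts[token].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # "cat" -- it follows "the" most often
```

The same idea, predict the next token from everything seen so far, is what pre-training optimizes at internet scale with billions of parameters instead of a frequency table.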

I really like the idea of the “psychology of base models” as explained here.

Once the base model is trained, two important post-training steps follow.

Reinforcement learning from human feedback (RLHF): Karpathy compares this stage to a student practicing problems after learning the theory. RLHF shapes the model’s behavior to be more consistent with human preferences, reducing harmful responses and improving coherence.
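A small piece of the RLHF pipeline can be sketched in a few lines. This is my own illustration, not code from the video: it shows the Bradley-Terry formulation commonly used when fitting a reward model on human preference pairs, which turns two scalar reward scores into the probability that response A is preferred over response B.

```python
import math

def preference_prob(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry: P(A preferred over B) from scalar reward scores."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

# Equal rewards -> a coin flip; a 2-point gap -> A is strongly preferred.
print(preference_prob(1.0, 1.0))              # 0.5
print(round(preference_prob(2.0, 0.0), 3))    # 0.881
```

The reward model trained this way then scores candidate responses during the reinforcement-learning step, standing in for a human rater.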

A key issue with LLMs is hallucination: the model confidently generates incorrect or meaningless information. Karpathy highlights the following mitigation techniques:

  • Model interrogation – teaching the model to recognize its own knowledge gaps.
  • Tool use and web search – allowing the LLM to query external sources to verify facts.
  • Encouraging reasoning – asking the model to “think step by step” before answering complex questions.
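The last two mitigations are applied at the prompt level. A hypothetical chat payload (the content is made up for illustration; no real model or API call is shown) combining “admit uncertainty” with step-by-step reasoning might look like:

```python
# Illustrative chat-completion payload: the system message tells the model to
# admit uncertainty rather than hallucinate; the user message requests
# step-by-step reasoning before the final answer.
messages = [
    {"role": "system",
     "content": "If you are not sure of a fact, say so instead of guessing."},
    {"role": "user",
     "content": "A train leaves at 9:40 and arrives at 11:05. "
                "Think step by step: how long is the trip?"},
]
for m in messages:
    print(m["role"], "->", m["content"][:40])
```

Passing a structure like this to any chat-style API steers the model toward showing its work instead of jumping to a confident but unchecked answer.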

Karpathy then dives into prompt engineering, explaining how structured prompts can greatly improve model output. He also discusses in-context learning, where the LLM learns from the structure of the prompt itself without any weight updates.
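In-context learning is easy to see in a few-shot prompt. The translation pairs below are my own illustrative example: the “training data” lives entirely in the prompt, so the model adapts its behavior without any retraining.

```python
# Few-shot prompt: demonstrations in the prompt teach the task; the model is
# expected to continue the pattern ("pain") with no weight updates at all.
few_shot_prompt = "\n".join([
    "Translate English to French:",
    "sea otter -> loutre de mer",
    "cheese -> fromage",
    "bread ->",
])
print(few_shot_prompt)
```

Swapping the demonstrations swaps the task, which is why in-context learning is sometimes described as programming the model in natural language.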

Despite their incredible abilities, Karpathy acknowledges that LLMs still face fundamental limitations, such as:

  • Struggling with counting and spelling.
  • Brittleness in specific use cases (the “Swiss cheese” model of capability).
  • The challenge of extending structured reasoning to open-ended, creative tasks.
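The counting-and-spelling weakness is usually attributed to tokenization: the model sees subword chunks, not individual letters. A toy fixed-width “tokenizer” (my own illustration, standing in for a real BPE tokenizer) makes the point:

```python
def fake_tokenize(word: str, chunk: int = 4) -> list[str]:
    """Crude stand-in for a subword tokenizer: split into fixed-width chunks."""
    return [word[i:i + chunk] for i in range(0, len(word), chunk)]

tokens = fake_tokenize("strawberry")
print(tokens)  # ['stra', 'wber', 'ry'] -- the r's are scattered across tokens
# Trivial in Python, but a model that only sees whole tokens never "sees"
# the individual letters it is being asked to count.
print(sum(t.count("r") for t in tokens))  # 3
```

Character-level questions force the model to reason about units below its own input granularity, which is why “how many r’s in strawberry?” trips up systems that otherwise write fluent essays.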

However, this field is moving rapidly, and we are only scratching the surface of what is possible with AI-driven reasoning and problem solving.

I don’t think LLMs have been compared to Swiss cheese before (or at least I had never come across the analogy), but I like it.

Karpathy’s presentation offers a clear, engaging, and accessible deep dive into how LLMs are trained, tuned, and improved. Whether you’re interested in building AI applications, understanding AI alignment, or simply fascinated by the internals of ChatGPT, this talk is essential viewing.



Source link

Adnan Mahar

Adnan is a passionate doctor from Pakistan with a keen interest in exploring the world of politics, sports, and international affairs. As an avid reader and lifelong learner, he is deeply committed to sharing insights, perspectives, and thought-provoking ideas. His journey combines a love for knowledge with an analytical approach to current events, aiming to inspire meaningful conversations and broaden understanding across a wide range of topics.
