Karachi Chronicle

OpenAI’s o3 model aces an AI reasoning test but is not yet AGI

By Adnan Mahar | December 20, 2024

OpenAI announces breakthrough of new o3 AI model

Locas Tennis / Alamy

OpenAI’s new o3 artificial intelligence model has achieved a breakthrough high score on a prestigious AI reasoning test called the ARC Challenge, leading some AI fans to speculate that o3 has achieved artificial general intelligence (AGI). However, while the organizers of the ARC Challenge described o3’s achievement as a major milestone, they noted that it did not win the competition’s grand prize. They also cautioned that this is just one step on the path to AGI, a term for hypothetical AI with human-level intelligence.

The o3 model is the latest in a series of AI releases following on from the large language models that power ChatGPT. “This is a surprising and significant step-function increase in AI capabilities, demonstrating novel task adaptation capabilities not previously seen in GPT family models,” said François Chollet, a Google engineer and the main creator of the ARC Challenge, in a blog post.

What did OpenAI’s o3 model actually do?

In 2019, Chollet designed the Abstraction and Reasoning Corpus (ARC) Challenge to test how well an AI could find the correct pattern connecting pairs of colored grids. These visual puzzles are intended to make an AI demonstrate a form of general intelligence with basic reasoning abilities. However, given enough computing power, a non-reasoning program could simply solve the puzzles by brute force. To prevent this, the contest also requires that official score submissions stay within certain computing power limits.
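To make the brute-force concern concrete, here is a minimal toy sketch in Python. It is not the ARC Challenge's actual task format or any competitor's method: the grids, the four candidate transformations, and the names `CANDIDATES` and `find_rule` are all assumptions made up for illustration. The program does no reasoning at all. It simply tries every rule on its list until one reproduces all the example pairs.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]  # each integer stands for one cell colour


def identity(g: Grid) -> Grid:
    return [row[:] for row in g]


def flip_horizontal(g: Grid) -> Grid:
    # Mirror each row left to right.
    return [row[::-1] for row in g]


def flip_vertical(g: Grid) -> Grid:
    # Reverse the order of the rows.
    return [row[:] for row in g[::-1]]


def transpose(g: Grid) -> Grid:
    # Swap rows and columns.
    return [list(col) for col in zip(*g)]


# Hypothetical, hand-picked rule list. A real brute-force solver would
# search a vastly larger space, which is why compute limits matter.
CANDIDATES: Dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def find_rule(pairs: List[Tuple[Grid, Grid]]) -> Optional[str]:
    """Return the first candidate that maps every input to its output."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in pairs):
            return name
    return None


# Two made-up example pairs that share one hidden pattern.
train_pairs = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0], [0, 4, 0]], [[0, 3, 3], [0, 4, 0]]),
]

rule = find_rule(train_pairs)
prediction = CANDIDATES[rule]([[5, 0, 0], [0, 6, 0]])
print(rule, prediction)  # flip_horizontal [[0, 0, 5], [0, 6, 0]]
```

The search cost grows with the number of candidate rules tried, so capping the compute available per task is one way to stop this kind of exhaustive guessing from standing in for reasoning.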

OpenAI’s newly announced o3 model (scheduled for release in early 2025) achieved an official breakthrough score of 75.7 percent on the ARC Challenge’s “semi-private” test, which is used to rank competitors on a public leaderboard. The computational cost of this result was approximately $20 per visual puzzle task, meeting the competition’s limit of less than $10,000 total. However, the more difficult “private” test used to determine grand prize winners has an even stricter computing power limit, equivalent to spending just 10 cents on each task, which OpenAI did not meet.

The o3 model also achieved an unofficial score of 87.5 percent by applying approximately 172 times more computing power than it used for the official score. For comparison, a typical human score is 84 percent, and a score of 85 percent is enough to win the $600,000 grand prize in the ARC Challenge, provided the model also keeps its computing costs within the required limits.

However, to achieve the unofficial score, o3’s cost rose to thousands of dollars per task. OpenAI asked the challenge organizers not to publish the exact computing costs.
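The cost figures above can be checked with some back-of-the-envelope arithmetic. The sketch below uses only the approximate numbers reported in this article and assumes, purely for illustration, that cost scales linearly with computing power. The exact costs were not published, so the result is a rough estimate, not a reported figure. All variable names are hypothetical.

```python
# Figures as reported in the article, held in whole cents to avoid
# floating-point rounding.
OFFICIAL_COST_CENTS = 20 * 100          # about $20 per task (official run)
OFFICIAL_BUDGET_CENTS = 10_000 * 100    # total limit of under $10,000
PRIVATE_LIMIT_CENTS = 10                # about 10 cents per task (private test)
COMPUTE_MULTIPLIER = 172                # unofficial run used ~172x the compute

# Upper bound on tasks the official budget could cover at $20 each.
# This is only the ceiling implied by the budget, not the test-set size.
tasks_affordable = OFFICIAL_BUDGET_CENTS // OFFICIAL_COST_CENTS

# How far the official per-task cost exceeds the private-test limit.
overshoot_factor = OFFICIAL_COST_CENTS // PRIVATE_LIMIT_CENTS

# ASSUMPTION: cost grows in proportion to compute. Under that assumption
# the unofficial run would cost roughly this many dollars per task.
unofficial_cost_dollars = OFFICIAL_COST_CENTS * COMPUTE_MULTIPLIER // 100

print(tasks_affordable)         # 500
print(overshoot_factor)         # 200
print(unofficial_cost_dollars)  # 3440
```

Under the linear-scaling assumption, the estimate of about $3,440 per task is consistent with the “thousands of dollars” described above. It also shows that even the official run cost about 200 times the per-task limit for the private test.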

Does this o3 achievement show that AGI has been reached?

No, the organizers of the ARC Challenge have clearly stated that they do not believe that beating this competition benchmark is an indicator of achieving AGI.

The o3 model still failed to solve more than 100 of the visual puzzle tasks, even though OpenAI applied a very large amount of computing power to produce the unofficial score, said Mike Knoop, an ARC Challenge organizer at software company Zapier, in a social media post on X.

In a social media post to Bluesky, Melanie Mitchell of the Santa Fe Institute in New Mexico said of o3’s progress in the ARC benchmark: “I think solving these tasks through brute force computing defeats the purpose.”

“While the new model is very impressive and represents a major milestone towards AGI, I do not believe it is AGI,” Chollet said in another post on X, adding that there are still quite a few easy tasks that o3 cannot solve.

But Chollet explained how we will know when human-level intelligence has been demonstrated by some form of AGI. “You’ll see AGI emerge when the task of creating a task that is easy for a normal human but difficult for an AI becomes completely impossible,” he said in a blog post.

Thomas Dietterich of Oregon State University proposes another way to recognize AGI. “These architectures are claimed to contain all the functional components necessary for human cognition,” he says. “By this measure, commercial AI systems lack episodic memory, planning, logical reasoning, and most importantly, metacognition.”

So what does o3’s high score actually mean?

The o3 model’s high score comes as the tech industry and AI researchers have been reckoning with a slower pace of progress in the latest AI models in 2024, compared with the initial explosion of development in 2023.

Although it did not win the ARC Challenge, o3’s high score indicates that AI models could beat the competition benchmark in the near future. Beyond o3’s unofficial high score, many of the official low-compute submissions have already scored above 81 percent on the unofficial assessment test set, Chollet said.

Dietterich agrees: “This is a very impressive jump in performance.” However, he cautions that it is impossible to assess how impressive this high score is without knowing more about how OpenAI’s o1 and o3 models work. For example, if o3 was able to practice the ARC problems beforehand, the score would have been easier to achieve. “We’ll have to wait for open source replication to fully understand the significance of this,” says Dietterich.

ARC Challenge organizers are already considering launching a second, more difficult benchmark test sometime in 2025. They also plan to continue the ARC Prize 2025 challenge until someone wins the grand prize and open-sources their solution.

topic:

artificial intelligence/A.I.



Adnan Mahar
Adnan is a passionate doctor from Pakistan with a keen interest in exploring the world of politics, sports, and international affairs. As an avid reader and lifelong learner, he is deeply committed to sharing insights, perspectives, and thought-provoking ideas. His journey combines a love for knowledge with an analytical approach to current events, aiming to inspire meaningful conversations and broaden understanding across a wide range of topics.
