Karachi Chronicle
AI

Everything you need to know

By Adnan Mahar · September 29, 2021 · 8 Mins Read


Google Gemini was first launched in December 2023, but recently underwent a major upgrade with the release of Gemini 2.0 in early December. It’s built for what Google calls the “age of agents,” with the ability to run complex, multi-step processes more independently.

Other key improvements include native image and audio processing, faster response times, improved coding capabilities, and new integrations with the Google apps and services that power your Android smartphone, computer, and other connected devices.

Mobile phone showing conversation with Google Gemini


A dizzying onslaught of new Gemini models

Screenshot of Gemini's model dropdown highlighting 2.0 Experimental Advanced.

Google has been developing a ton of different AI models lately, with multiple new versions released in the past few weeks. With 2.0 Flash, improvements in certain areas, such as speed, are immediately noticeable. Other models target more specialized fields, such as coding. 2.0 Pro, on the other hand, is still in development.

The new 2.0 model is available on desktop and, more recently, in the Gemini mobile app, where a selector lets you choose the model. And let's not forget the on-device Nano model, which already powers certain Google Pixel features such as call summarization. It is also worth noting that in recent days yet another model, 2.0 Experimental Advanced, has appeared on desktop.

But as Taylor Kearns points out, the Gemini lineup is becoming increasingly complex, making it difficult to keep track of all its variants. There isn't much information available about Experimental Advanced yet, so I'm sticking to the other two in the comparison below.

| Feature | Gemini 1.5 Pro | Gemini 2.0 Flash Experimental |
| --- | --- | --- |
| Context window | 1 million tokens (approximately 750,000 words or 1,500 pages of text) | 1 million tokens (approximately 750,000 words or 1,500 pages of text) |
| Speed | Response within seconds | Approximately 2x faster |
| Power consumption | High | Low |
| Reasoning/logic | Powerful reasoning and collaboration | Claimed improved inference, with agent capabilities added |
| Multimodal processing | Converts images and audio to text; image creation supported | Native image and audio processing; you can now "speak" using an AI voice; image creation has been paused |
| Coding | Can generate code | Can generate and run code, parse API responses, and integrate data into external applications |
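The word and page figures in the table are simple conversions from the token budget. A minimal sketch of that arithmetic, assuming the common rules of thumb of roughly 0.75 words per token and 500 words per page (neither figure comes from Google's documentation):

```python
# Back-of-the-envelope arithmetic behind the context-window figures above.
# The 0.75 words-per-token ratio and 500 words-per-page figure are common
# rules of thumb, not exact values.

def context_window_estimates(tokens: int,
                             words_per_token: float = 0.75,
                             words_per_page: int = 500) -> dict:
    """Convert a token budget into rough word and page equivalents."""
    words = int(tokens * words_per_token)
    pages = words // words_per_page
    return {"tokens": tokens, "words": words, "pages": pages}

est = context_window_estimates(1_000_000)
print(est)  # {'tokens': 1000000, 'words': 750000, 'pages': 1500}
```

Running it on the 1-million-token window reproduces the "approximately 750,000 words or 1,500 pages" estimate in the table.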

Gemini 2.0 Flash is all about speed and efficiency

Gemini 2.0 image (from Google Blog)

Source: Google

As the name suggests, Gemini 2.0 Flash is designed for speed. Google claims it’s twice as fast as the previous version. As a user of both 1.5 Pro and 2.0 Flash Experimental, I can attest to its agility.

2.0 delivers near-instantaneous responses to the same queries that could take seconds in 1.5 Pro. That may not sound like much, but instantaneous responses unlock new possibilities for real-time applications such as voice interaction, and they make the overall user experience feel more polished. Despite its increased power, Gemini 2.0 Flash is designed to be more energy efficient, which can translate directly into improved battery life for your smartphone.

Gemini 2.0 Flash brings enhancements to other core areas. Google says it performs better than the Gemini 1.5 Pro on complex tasks like coding, math, and logical reasoning. Additionally, Gemini 2.0 Flash can now directly execute code, autonomously process API responses, and call user-defined functions. 2.0 is starting to look more like an end-to-end development solution than a simple code generator.
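The function-calling pattern described above can be sketched without touching the real API: the model requests a call to a user-defined function, and the client executes it and returns the result. In this self-contained mock, `fake_model`, `get_weather`, and the JSON call format are all hypothetical stand-ins for what a Gemini API integration would return:

```python
# A minimal sketch of the function-calling loop, with the model stubbed out.
# In a real integration the "model" would be a Gemini API call returning a
# structured function-call request; a canned JSON response stands in here.

import json

def get_weather(city: str) -> dict:
    """A user-defined function the model is allowed to call (hypothetical)."""
    return {"city": city, "forecast": "sunny", "high_c": 24}

# Registry of functions the model may request.
TOOLS = {"get_weather": get_weather}

def fake_model(prompt: str) -> str:
    # Stand-in for the model: it answers by requesting a tool call as JSON.
    return json.dumps({"function": "get_weather", "args": {"city": "Karachi"}})

def run_agent_turn(prompt: str) -> dict:
    """One turn: ask the model, then execute any function it requests."""
    call = json.loads(fake_model(prompt))
    fn = TOOLS[call["function"]]   # look up the registered function
    return fn(**call["args"])      # execute it and return the result

result = run_agent_turn("What's the weather in Karachi?")
print(result)  # {'city': 'Karachi', 'forecast': 'sunny', 'high_c': 24}
```

In a full loop, the function's result would be sent back to the model so it can compose a natural-language answer.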

Gemini wants to be your AI agent

Gemini 2.0 aggregates travel planning details

Agentic AI brings proactive assistance to Gemini. This means that Gemini can act as an agent and perform multi-step tasks on your behalf. Future applications will include everything from gaming and robotics to travel planning.

Suppose you are planning a trip to Tokyo. Instead of just asking Gemini for sightseeing suggestions, you can say, “Create me a detailed itinerary for a 5-day trip to Tokyo, including must-see attractions, recommended local restaurants, and estimated costs.” I tried this exact prompt, and the platform generated an attractive daily itinerary for me. However, some components are still missing.

In theory, Gemini can do much more, such as booking flights, accommodation, and reserving a table at a restaurant. In fact, 2.0 Flash integrates with Google Flights and can display hotel availability at your destination, but the final step to automate the entire process is still to come. It’s easy to see how this can be a difficult problem, as booking the wrong flight, for example, can literally cost you a lot of money. Imagine an AI booking the wrong trip to Springfield.
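One way the risk of a wrongly booked flight is typically handled is a confirmation gate: the agent plans and searches freely, but any step with real-world cost refuses to run without explicit user sign-off. A toy sketch of that design, with all step names, prices, and data purely illustrative:

```python
# Sketch of an agent pipeline that plans autonomously but gates the costly,
# irreversible step behind human confirmation. Everything here is a stub.

def plan_itinerary(city: str, days: int) -> list[str]:
    # Safe step: pure text generation, runs autonomously.
    return [f"Day {d}: explore {city}" for d in range(1, days + 1)]

def search_flights(city: str) -> dict:
    # Safe step: a read-only lookup (stubbed result).
    return {"destination": city, "price_usd": 820}

def book_flight(offer: dict, confirmed: bool) -> str:
    # Risky step: refuses to spend money without user sign-off.
    if not confirmed:
        return "PENDING: awaiting user confirmation"
    return f"BOOKED flight to {offer['destination']}"

itinerary = plan_itinerary("Tokyo", 5)        # runs freely
offer = search_flights("Tokyo")               # runs freely
status = book_flight(offer, confirmed=False)  # gated: no money moves yet
print(status)  # PENDING: awaiting user confirmation
```

The gate is exactly the "final step" Gemini currently leaves to the user: showing flight and hotel options is cheap to get wrong, while booking them is not.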

Gemini 2.0 can see, hear and speak

Voice chat with Gemini 2.0

Advances in multimodal input and output are another important feature of Gemini 2.0. By seamlessly integrating information from various sources such as text, images, video, and audio, Gemini 2.0 can interpret the world more the way we do. This paves the way for more human-like communication.

With Gemini 2.0, you can now hold conversations using AI voices. The mobile app offers several voices to choose from; I selected my favorite and had a surprisingly natural, fluid conversation asking the AI questions about a city I wanted to visit. The effort involved is clearly lower than typing a query and reading the response. While this feature is not new to the industry (think AI “companion” apps), it is new to Gemini.

Native image and audio processing provides noticeable improvements

A major improvement in Gemini 2.0 is the ability to process images and audio directly. Previous versions converted these inputs to text, losing information along the way. Direct processing allows for a deeper understanding of the input: Gemini 2.0 can not only identify elements in images and audio, but also understand their relationships to each other and to the scene as a whole.

During my test, I took a photo looking out from my office and sent it to Gemini 2.0 Flash. There is a window screen in the foreground, with shrubs and other objects in the midground. The AI quickly recognized that the photo was taken through a screen and detailed the other elements in the scene. Overall, I found that the 2.0 model provided more nuanced and detailed image analysis than previous versions.
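The difference between caption-first and native processing can be illustrated with a toy model. Here the "image" is a hand-built structure (purely hypothetical data); the point is that collapsing it to a caption before reasoning discards the spatial relationships, such as the photo being taken through a screen, that a natively multimodal model can still use:

```python
# Toy illustration: a caption-first pipeline flattens an image to text and
# loses relations; native multimodal processing keeps the structure. The
# "image" dict is a mock, not real model output.

image = {
    "elements": ["window screen", "shrubs", "parked car"],
    "relations": {"window screen": "in front of everything else"},
}

def caption_then_reason(img: dict) -> dict:
    # Old approach: collapse to a flat caption first; relations are lost.
    caption = ", ".join(img["elements"])
    return {"elements": caption.split(", "), "relations": {}}

def native_multimodal(img: dict) -> dict:
    # New approach: the model sees the structured input directly.
    return {"elements": img["elements"], "relations": img["relations"]}

print(caption_then_reason(image)["relations"])  # {} -- relations lost
print(native_multimodal(image)["relations"])    # relations preserved
```

Both pipelines name the same elements; only the native path can still answer "was this shot through a screen?"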

Gemini image generation is back, but who cares?


Despite all the hype around Gemini 2.0’s improvements, the return of Imagen image generation was a bit of a letdown. After the initial controversy, and the feature being disabled over bias and inaccuracy, the re-release felt uninteresting. Perhaps Imagen has been watered down to avoid further controversy, or perhaps the novelty of AI image generation has simply worn off during Google’s long hiatus.

Gemini 2.0 example Imagen image

The image above was created by Gemini 2.0 Flash Experimental when asked to create the most interesting image imaginable. I understand it’s a subjective prompt, but I can still say the results are underwhelming. At best, it looks like a scene from a video game.

After further experimentation, I asked 2.0 Flash Experimental to simply “create an image of a person,” and it refused. When I went back to 1.5 Pro and tried the same prompt, I was greeted with a brightly colored, stock-photo-like image of a group of friends. Imagen lets you see through the eyes of Google’s AI, but the perspective isn’t very exciting.

New integrations portend the future

A large screen at Google I/O 2024 displaying "Project Astra"

Source: Google

Google aims to provide a more unified user experience by incorporating Gemini functionality into core services such as Search, Maps, and Workspace.

In the future, search queries on Google will generate dynamic AI-powered responses, leveraging information from emails, documents, and even location history to provide more personalized results. Google is already experimenting with AI search summaries that feature audio overviews in the style of its sister product, NotebookLM.

Early efforts like Project Astra and Project Mariner are finally seeing the light of day with the latest Gemini models. Astra is Google’s prototype for a universal multimodal assistant, while related experiments such as Jules explore AI-powered coding agents. Mariner, meanwhile, may enable tasks such as autofilling forms and summarizing web pages. These projects are essentially the philosophical pillars on which Google builds its AI applications and services.


Google is building a strong AI foundation with Gemini

Gemini 2.0 is a significant step forward for Google AI, delivering faster speeds, enhanced inference, and seamless multimodal integration. The lackluster image generation and confusing model variations highlight the complexity of this rapidly changing category.

However, advances in agent AI, new coding, voice and image capabilities, and deeper integration with core Google services portend good things to come in 2025.



Adnan Mahar

Adnan is a passionate doctor from Pakistan with a keen interest in exploring the world of politics, sports, and international affairs. As an avid reader and lifelong learner, he is deeply committed to sharing insights, perspectives, and thought-provoking ideas. His journey combines a love for knowledge with an analytical approach to current events, aiming to inspire meaningful conversations and broaden understanding across a wide range of topics.
