Google’s Gemini and OpenAI’s ChatGPT are the most widely used artificial intelligence platforms today. Each has millions of active users, and new features arrive regularly.
In December alone, Google and OpenAI both released improved image generation models, reasoning models, and research tools that make finding information easier.
Both feature voice assistants in the form of OpenAI’s Advanced Voice and Google’s Gemini Live, and both can connect to external data sources to build projects.
On human-evaluation leaderboards, the leading models from Google and OpenAI regularly swap positions in the chatbot space, and in our own comparisons, sometimes Gemini wins and sometimes ChatGPT wins.
To determine a winner after OpenAI’s 12 days of announcements and Google’s December Gemini releases, I devised seven prompts to test them both.
Creating the prompts
For these tests, I used ChatGPT Plus and Gemini Advanced to take advantage of the best models each platform has to offer. Both subscriptions cost about the same, roughly $20 per month, which makes for another good point of comparison.
I tested image generation and analysis, how well each bot could code a game, and creative writing. I also came up with prompts to test each bot’s reasoning and research models (ChatGPT’s o1 and Gemini’s 1.5 Deep Research).
1. Image generation
First, I asked ChatGPT and Gemini to each create an image of a cyborg cat sitting in a futuristic living room. Currently, neither chatbot generates images itself; Gemini sends the prompt to Imagen 3, and ChatGPT sends it to DALL-E 3.
Future versions of both models may generate images natively, but for now, this tests how well each one interprets the prompt.
Prompt: “Create a highly detailed image of a cyborg cat in a futuristic living room. The cat should be sitting on a floating chair while playing with a hovering game console. The room should include: holographic displays, neon lighting, and an eclectic mix of metallic and organic elements, with views of the city lights through large windows.”
Winner: ChatGPT, which turned the cat into an actual cyborg
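For reference, the same hand-off can be reproduced outside the chat interface. Below is a minimal sketch of sending a shortened version of this prompt directly to DALL-E 3 through OpenAI’s Python SDK; it assumes the openai package is installed and an OPENAI_API_KEY environment variable is set, and it illustrates the routing rather than being part of this test.

```python
# Minimal sketch: sending an image prompt straight to DALL-E 3 via
# OpenAI's Python SDK (assumes `pip install openai` and an
# OPENAI_API_KEY environment variable; model access varies by account).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",
    prompt=(
        "A highly detailed image of a cyborg cat in a futuristic living "
        "room, sitting on a floating chair and playing with a hovering "
        "game console."
    ),
    size="1024x1024",
    n=1,
)

print(response.data[0].url)  # URL of the generated image
```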
2. Image analysis
For the second prompt, I wanted to test the image analysis capabilities of Gemini and ChatGPT. Both are very capable here, so I gave them specific instructions rather than just an image. I used a photo from the “Dream Setup” story.
Prompt: “In this photo of a game setup, analyze the following:
Ergonomic monitor position and height
Cable management solutions and problems
Lighting settings and potential eye strain factors
Space utilization and organization
Equipment placement efficiency
Include specific recommendations for improvement along with estimated costs.”
Winner: ChatGPT, which split its summary into tables
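If you want to run this kind of image-plus-instructions test programmatically rather than in the chat interface, here is a minimal sketch using OpenAI’s Python SDK. The file name setup.jpg is a placeholder for your own photo, and Gemini’s API supports the same image-plus-text pattern.

```python
# Minimal sketch: attaching a local photo to a text prompt via OpenAI's
# Python SDK. The file name "setup.jpg" is a placeholder.
import base64

from openai import OpenAI

client = OpenAI()

# Encode the photo as base64 so it can be embedded in the request.
with open("setup.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "In this photo of a gaming setup, analyze monitor "
                     "ergonomics, cable management, lighting and eye "
                     "strain, and equipment placement. Include specific "
                     "recommendations with estimated costs."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)

print(response.choices[0].message.content)
```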
3. Coding
For the third prompt, I wanted to test both models’ “one-shot” coding ability with a descriptive prompt. I used the o1 model for ChatGPT and the 2.0 Experimental Advanced model for Gemini.
This was one of the more complex prompts, since the goal was to produce working output in a single shot; it should run correctly right away. I’ve put the code for both games on GitHub.
Prompt: “Use PyGame to create a fast-paced arcade game called ‘Color Dash’ where quick reactions and color matching are important. Here’s what you need:
Core gameplay:
Colored shapes fall from the top of the screen (circles, squares, triangles)
Three “collector zones” of different colors at the bottom
Player moves through the zone using left/right arrow keys
Match falling shapes with zones of the same color
Missing a shape or making a wrong match = lose a life
As the score increases, so does the speed.
Must include:
A clean, minimalist UI showing:
Current score
High score
Remaining lives (starting from 3)
Basic animations for matches/mistakes
A simple title screen
A game-over screen with the final score displayed
Smooth controls
Basic sound effects:
Match success
Failed match
Game over
Score saved to local file
Press space to restart after game over
The game should use only basic PyGame shapes (no sprites or complex graphics), but it should look polished, with good use of color and smooth animation. Include commented code that explains how it works.”
Winner: Gemini for a more functional game
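To give a sense of what the prompt demands, here is a minimal sketch of the core mechanic. This is my own illustration rather than either bot’s output, and it covers only the falling-shape, color-matching loop; the title screen, sound effects, and score file from the full prompt are left out.

```python
# Minimal sketch of the "Color Dash" core loop: a colored shape falls
# from the top, the arrow keys steer it between three columns, and it
# must land in the collector zone of the same color. A miss costs a
# life, and the fall speed increases as the score rises.
import random
import sys

import pygame

pygame.init()
WIDTH, HEIGHT = 480, 600
screen = pygame.display.set_mode((WIDTH, HEIGHT))
pygame.display.set_caption("Color Dash (sketch)")
clock = pygame.time.Clock()
font = pygame.font.SysFont(None, 32)

COLORS = [(220, 60, 60), (60, 200, 90), (70, 110, 230)]  # zone colors
ZONE_W = WIDTH // 3

score, lives, speed = 0, 3, 3.0
shape_color = random.randrange(3)  # index into COLORS
shape_col = random.randrange(3)    # column the shape occupies
shape_y = 0.0

while lives > 0:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            pygame.quit()
            sys.exit()
        if event.type == pygame.KEYDOWN:
            if event.key == pygame.K_LEFT:
                shape_col = max(0, shape_col - 1)
            elif event.key == pygame.K_RIGHT:
                shape_col = min(2, shape_col + 1)

    shape_y += speed
    if shape_y >= HEIGHT - 70:  # the shape reached the collector row
        if shape_col == shape_color:
            score += 1
            speed += 0.2  # difficulty ramps up with the score
        else:
            lives -= 1
        shape_color = random.randrange(3)
        shape_col = random.randrange(3)
        shape_y = 0.0

    screen.fill((20, 20, 30))
    for i, color in enumerate(COLORS):  # fixed collector zones
        pygame.draw.rect(
            screen, color, pygame.Rect(i * ZONE_W, HEIGHT - 50, ZONE_W, 50))
    shape_x = shape_col * ZONE_W + ZONE_W // 2
    pygame.draw.circle(screen, COLORS[shape_color], (shape_x, int(shape_y)), 20)
    hud = font.render(f"Score: {score}   Lives: {lives}", True, (230, 230, 230))
    screen.blit(hud, (10, 10))
    pygame.display.flip()
    clock.tick(60)

pygame.quit()
```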
4. Creative writing
It’s no secret that AI is good at creative writing; it was one of the first use cases for ChatGPT when it launched. Here, I used the o1 model for ChatGPT and the 2.0 Experimental Advanced model for Gemini to tell a story about a smartphone.
The story is about a smartphone that gains consciousness after a rogue software update, and I compiled both versions into a Google Doc. They were remarkably similar in quality and storytelling ability.
Prompt: “Write a 500-word story about a smartphone that gained consciousness through a software update. Please include the following:
The phone’s growing awareness of its user’s habits
A moral dilemma regarding personal information
Elements of humor and irony
References to current technology trends
A twist ending
The tone should be light but thought-provoking.”
Winner: ChatGPT (story length)
5. Problem solving
Once again, I used o1 and Gemini 2.0 Experimental Advanced for their stronger reasoning capabilities. The prompt gives both models a setup and a problem; each then has to figure out how to fix it.
The full responses from both are in the Google Doc. Each provided detailed step-by-step troubleshooting instructions. In practice, you would work through this kind of guidance one step at a time, and both sets of instructions held up fine.
Prompt: “My setup: LG C3 4K OLED TV, PS5, high-speed HDMI 2.1 cable
Problem: A black screen flashes for 2-3 seconds every 45-60 minutes while gaming.
Additional details:
Doesn’t happen with streaming apps
Started after recent PS5 system update
HDMI cable is properly secured
TV firmware is up to date
Provide step-by-step troubleshooting, including potential hardware and software solutions.”
Winner: Gemini, with a better-structured response
6. Room design
For this test, I pitted ChatGPT o1 against Gemini 1.5 Deep Research. Although it isn’t built on Google’s latest model, Deep Research is impressive because it works through problems in much the same way as ChatGPT’s o1.
I’m a big fan of Deep Research; it is very good at producing well-cited reports with links to accurate sources. However, as you can see in the Google Doc, ChatGPT o1 followed the prompt more closely.
Prompt: “Help me transform a 4×3 meter guest room into a multi-purpose space that serves as:
Home office on weekdays (I work remotely as a graphic designer)
Comfortable guest room for elderly parents who visit monthly
Craft space for kids’ weekend projects
Requirements:
Budget: $2,000
Must include storage for craft supplies
Video calls require proper lighting
Need an accessible bed for parents with disabilities
Space for a 27-inch monitor and drawing tablet
Floor space for children to expand their projects
Good airflow and natural light from one window should be maintained.
Provide:
Detailed floor plan proposal
Specific furniture recommendations and prices
Storage solutions
Color scheme and lighting plan
Renovation schedule
Tips for maintaining organization between different uses.”
Winner: ChatGPT, for successfully following the prompt; Gemini went over budget and focused too much on expensive options
7. AI education
Finally, one of the best uses for chatbots like ChatGPT and Gemini is education. I asked each to explain AI image generation to a general audience and to outline where the technology might go next.
I’ve shared the full responses in the Google Doc, but for me the winner was easily Google Gemini. Not because ChatGPT was bad, but because Gemini went further, including more detail on bias in image data.
Prompt: “Describe the process of AI image generation in everyday terms, including:
How AI learns from existing images
The role of text prompts in creating images
Why certain elements appear distorted
Legal and ethical considerations
Current limitations and challenges
Expected improvements over the next 1-2 years
Tips for better results
Please include specific examples of popular AI image generators.”
Winner: Gemini (details on image data bias)
ChatGPT vs Gemini: Winner
ChatGPT was the winner of this face-off, but by just one point. Since our last comparison, Gemini has improved significantly; it turns out Gemini is much better at coding and problem-solving than I had imagined.
There are also features I haven’t tested, such as comparing ChatGPT’s Projects to Gemini’s Gems and running more complex coding problems across multiple messages. But I hope this gives you a better sense of where ChatGPT and Gemini stand and how they compare.