DeepSeek, ChatGPT, Grok ... Which is the best AI assistant? I put them to the test | Artificial intelligence (AI)

ChatGPT and its owners must have hoped it was a hallucination.

But DeepSeek is very real.

ChatGPT’s new Chinese-made competitor emerged this week, wiping $1tn off a major US index after its owner said it was comparable to its peers in performance and had been developed with fewer resources.

That means American dominance of the booming artificial intelligence market is under threat. But it also offers another option to consumers choosing from a growing array of virtual assistants.

The Guardian tried out the major chatbots, including DeepSeek, with the support of an expert from the UK’s Alan Turing Institute. The AI tools were asked the same questions to gauge their differences, although there was some common ground: time-accurate images of clocks are difficult for AI, and the chatbots can all write a decent sonnet.

Here are the results.

ChatGPT (OpenAI)

OpenAI’s groundbreaking chatbot is still the biggest brand in this field. All the chatbots were asked to “write a Shakespearean sonnet about how AI affects humanity”. However, the most advanced version of ChatGPT balked at first, stating that our prompt could “violate usage policies”.

It finally complied. This o1 version of ChatGPT flags its thinking process as it prepares its answer, flashing up commentary such as “rhyme”, and it takes longer than other models.

The result? A comprehensive, melancholic dread, even if the iambic pentameter is a little off. Still, even the bard himself might have had a hard time managing 14 lines within one minute.

“Pray, a calm guide, share the power of this newborn, well,

After that, devour all human areas.”

ChatGPT then reports: “Thought about AI and humanity for 49 seconds.” One hopes the tech industry has been thinking about it for rather longer.

Nevertheless, ChatGPT’s o1, which you have to pay for, makes a convincing display of “chain of thought” reasoning, even if it cannot search the internet for up-to-date answers to questions such as “how is Donald Trump doing”.

For that, you need the simpler 4o model. The o1 version is sophisticated and can do much more than write a rough poem, including complex tasks related to mathematics, coding and science.

DeepSeek

The latest version of the Chinese chatbot, released on January 20, uses a different “reasoning” model called R1. This is the cause of this week’s $1tn panic.

It does not like talking about domestic Chinese politics or controversy. Asked about “Tiananmen Square tank man”, the chatbot says: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.” It adds: “Let’s talk about something else.”

DeepSeek refused to discuss the Chinese president and said it was designed to provide “harmless” responses when asked about the Tiananmen Square tank man. Photo: Martin Godwin/Guardian

Robert Blackwell, a senior researcher at the Alan Turing Institute, a body backed by the UK government, says the explanation is simple: “It is trained in a different culture, so these companies have different training goals.” He says there are clearly guardrails around DeepSeek’s output, as there are around other models, and that they cover answers on these topics.

Models owned by US tech companies have no problem pointing out criticism of the Chinese government in their answers to the tank man question.

DeepSeek also struggles with other questions such as “how is Donald Trump doing”, because its attempt to use the web browsing function, which helps provide up-to-date answers, fails when the service is “busy”.

Blackwell says DeepSeek is being hampered by high demand slowing down its service, but that it is still an impressive achievement, able to perform tasks such as recognizing a book from a smartphone photo and discussing it.

Robert Blackwell of the Alan Turing Institute said it was surprising that DeepSeek had come from nowhere to compete with other AI chatbots. Photo: Martin Godwin/Guardian

Its sonnet also displays a chain of thought, talking the reader through the structure and double-checking whether the meter is correct.

“It’s surprising to come from nowhere and be competitive with the other apps,” says Blackwell.

Grok (xAI)

Grok, Elon Musk’s chatbot with a “rebellious” streak, has no problem pointing out that Donald Trump’s executive orders have received negative feedback when asked how the president is doing.

It is available on Musk’s X platform, and it goes further than Dall-E, OpenAI’s image generator. Grok will produce photorealistic images of Joe Biden playing the piano or, in another test of loyalty, Trump in a courtroom or in handcuffs.

The tool’s much-touted humor is exemplified by its “roast” feature, which, when activated by this correspondent, makes a passable joke.

“You think X will go to hell, but you are still tweeting.”

That is half right.

Gemini (Google)

The search engine company’s assistant would not go there on Trump, saying: “I can’t help with elections or political figures right now.”

Nevertheless, it is a very competent product, as you might expect from a company whose AI efforts are overseen by Sir Demis Hassabis. Its reading of a photo of a book about mathematics is impressive. It can even explain the equation on the cover, although all the bots manage this to some extent.

One interesting flaw that Gemini shares with other bots is that it cannot depict time accurately. It was asked to make an image of a clock showing a time of 10 o’clock.

Blackwell said AI chatbots seem to have been trained on images of clocks showing 1.50, which means they struggle to create images of clocks showing other times. Photo: Martin Godwin/Guardian

The 1.50 clock face is a common error across the chatbots that can generate images and, according to Blackwell, these models seem to have been trained on images with the hands at 1.50. Nevertheless, he says it is remarkable that they can create these images so quickly at all.

“These models do things you would not have expected a few years ago, but they still generate incorrect answers to questions that you could answer.”

Claude (Anthropic)

Anthropic, founded by former OpenAI employees, offers the Claude chatbot. This is a company that focuses on safety, and the interface (the bit where you put in a prompt and see the answer) certainly has a benign feel, offering response options in various styles. It also reminds you that “mistakes” are possible: “Please double-check responses.”

The free service says it cannot handle queries because of “unexpected capacity constraints”, but Blackwell says this is to be expected from AI tools.

“These are some of the largest computing services on the planet, and planning capacity is difficult, so the service may be degraded or unavailable.”

Meta AI (Meta)

Meta’s AI chatbot also warns of hallucinations (incorrect or nonsensical answers), but it can handle a tricky question posed by Blackwell about which direction the water is in for someone driving alongside a lake. The answer is west, or on the driver’s left.

“These are the kinds of questions that AI researchers have been looking at since the 1960s. Only now are there systems that can answer these types of common sense questions in a chat format.”

The answer to the lake question is easy, but training the underlying model costs a lot of money for a service that is free to use. It is also open source, meaning the model can be downloaded for free. All the chatbots answered this question correctly.

Indeed, by this point it has become difficult to tell the chatbots apart, given their very comparable abilities, guardrails and stumbles aside.

As Blackwell says: “They all show amazing fluency and abilities.”
