Even as ChatGPT maker OpenAI faces a barrage of copyright infringement lawsuits in several countries, the company believes its own artificial intelligence (AI) technology may have been copied by Chinese rival DeepSeek. Not only OpenAI but also one of US President Donald Trump's top advisers has levelled this claim, without presenting much evidence.
DeepSeek's entry into the AI space has caused drastic changes in the technology industry, with its models advertised as open source, as accurate as those of its US competitors, and built at a fraction of the cost. Since its models were reportedly trained on graphics processing units (GPUs) inferior to those used by the likes of OpenAI, Nvidia's stock was sent into a downward spiral, and DeepSeek's entry reignited conversations around stricter export controls.
In this context, OpenAI has said that DeepSeek may have used a technique called "distillation". While DeepSeek has been accused of intellectual property theft since gaining mainstream attention, some industry experts have rejected these claims, saying they stem from an inadequate understanding of how models such as DeepSeek's are trained.
OpenAI's suspicions about DeepSeek
OpenAI's terms of use prohibit training new AI models by repeatedly querying its more advanced, already-trained models, a technique generally known as distillation. The company suspects DeepSeek may have done exactly that.
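In broad terms, distillation means training a smaller "student" model to reproduce the outputs of a larger "teacher" model. Below is a minimal sketch of that idea in PyTorch; the toy networks, dimensions, and random data are hypothetical stand-ins chosen to illustrate the mechanics, not a description of how any production system is actually trained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy setup: a larger "teacher" and a smaller "student".
# In the scenario alleged above, the teacher would be a proprietary model
# queried remotely; here both networks are local for illustration.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's output distribution

for step in range(100):
    prompts = torch.randn(16, 32)  # stand-in for real input queries

    with torch.no_grad():  # the teacher is only queried, never updated
        teacher_logits = teacher(prompts)

    student_logits = student(prompts)

    # KL divergence nudges the student's output distribution
    # toward the teacher's, answer by answer.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

With a model that is only accessible through an API, raw logits like those above would not be available; distillation in that setting typically means fine-tuning the student on prompt-response pairs generated by the teacher.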
"We know that PRC (China) based groups are actively working to replicate advanced US AI models using a method known as distillation," an OpenAI spokesperson said. "We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more."
Trump's AI adviser David Sacks has said there is "substantial evidence" that DeepSeek "distilled" knowledge out of OpenAI's models.
Industry players contest OpenAI's claims
However, some have pushed back on the claim that DeepSeek copied technology from OpenAI and others.
"There is a lot of misunderstanding that China has 'cloned' [existing models]. This is far from the truth and reflects an incomplete understanding of how these models are trained in the first place…" Aravind Srinivas, the CEO of Perplexity, said in a post on X.
"DeepSeek R1 has figured out RL (reinforcement learning) finetuning. They even wrote an entire paper on this topic, with a model called DeepSeek R1 Zero that did not use any SFT (supervised fine-tuning), and then combined it with some SFT (aka filtering). The main reason it reasons so well across domains is that it is not imitating other humans and models, but reasoning from scratch," he added.
The idea of using reinforcement learning (RL) became a focus for AI companies in 2024. "This new paradigm starts with the ordinary type of pretrained model and then uses RL to add reasoning skills," Anthropic CEO Dario Amodei said in a blog post.
Supervised fine-tuning (SFT) is a machine learning process in which a pretrained model is further trained (fine-tuned) on a dataset labelled for a specific task. This approach adapts the general knowledge the model has already acquired during its initial training so that it performs well on a more specialised task.
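As a rough illustration, SFT can be sketched in a few lines with the Hugging Face Transformers library; the model name, dataset, and hyperparameters below are placeholders chosen for brevity, not a description of how any of the companies mentioned here actually train their systems.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a small pretrained model (placeholder choice).
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=2)

# A small labelled dataset for a specific task: sentiment classification.
dataset = load_dataset("imdb", split="train[:2000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length"),
    batched=True,
)

# Further train (fine-tune) the pretrained weights on the labelled data.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-demo", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset,
)
trainer.train()
```

The key point is that SFT needs explicit labelled examples of the desired behaviour, which is exactly what DeepSeek says its R1-Zero model did without.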
According to an overview on the GitHub page for the DeepSeek model, the company applied reinforcement learning directly to its base model without relying on supervised fine-tuning as a preliminary step.
"This approach allows the model to explore chain-of-thought (CoT) for solving complex problems, resulting in the development of DeepSeek-R1-Zero. DeepSeek-R1-Zero demonstrates capabilities such as self-verification, reflection, and generating long CoTs, marking a significant milestone for the research community. Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivised purely through RL, without the need for SFT. This breakthrough paves the way for future advancements in this area," the page reads.
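To make the distinction concrete, here is a heavily simplified sketch of reward-driven fine-tuning in the spirit of what the page describes: the model is updated from a reward signal on its own sampled outputs rather than from labelled demonstrations. The REINFORCE-style policy gradient, toy reward, and tiny network are all illustrative assumptions, not DeepSeek's actual large-scale RL recipe.

```python
import torch
import torch.nn as nn

# Toy "policy": picks one of 10 candidate answers for each input.
policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reward_fn(actions, correct):
    # Verifiable reward: 1 if the sampled answer matches, else 0.
    return (actions == correct).float()

for step in range(200):
    inputs = torch.randn(16, 32)           # stand-in for problem statements
    correct = torch.randint(0, 10, (16,))  # stand-in for checkable answers

    logits = policy(inputs)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                # the model explores its own outputs

    rewards = reward_fn(actions, correct)
    # REINFORCE: raise the log-probability of sampled answers in
    # proportion to the reward they earned; no labelled demonstrations.
    loss = -(dist.log_prob(actions) * rewards).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

For a reasoning model, the reward would come from automatically checkable signals, such as whether a maths answer or a piece of code is correct, rather than from the random placeholders used here.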
OpenAI's own copyright troubles
Meanwhile, news publishers, especially in countries such as the United States and India, are increasingly alleging that copyright-protected material such as news reports was used, without permission or payment, to train foundational models like OpenAI's.
In November last year, news agency ANI sued OpenAI in the Delhi High Court, accusing it of using Indian copyrighted material to train its AI models. Earlier this week, several digital news publishers, including The Indian Express, filed intervention applications in the case.
The claim is that OpenAI developed its large language models (LLMs) by "training" them on a huge volume of text, including copyrighted works, without licence or permission. "The unlawful use of copyrighted material benefits OpenAI and their investors to the detriment of creative works across the Indian industry," the Digital News Publishers Association (DNPA) said in a statement.
OpenAI faces a number of similar lawsuits in other jurisdictions. In December 2023, The New York Times sued the company and Microsoft, citing the "unlawful" use of its copyrighted content. The publication claims that the large language models powering OpenAI's ChatGPT and Microsoft's Copilot can "generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style". At the same time, the Times alleges that this use deprives it of "subscription, licensing, advertising, and affiliate revenue".