San Francisco, USA:
Suchir Balaji, a 26-year-old OpenAI researcher turned whistleblower, was found dead in his San Francisco apartment last month. His death on November 26 was ruled a suicide by the San Francisco Medical Examiner’s Office after police found no evidence of wrongdoing.
Balaji, who left OpenAI in August, had spoken out against the artificial intelligence company’s practice of training chatbots on copyrighted material, a practice that has come under public scrutiny in recent months. The artificial intelligence (AI) giant is fighting several lawsuits related to its data collection practices.
About Suchir Balaji
An Indian-American, Suchir Balaji grew up in Cupertino, California. A sharp student, he excelled in programming contests, placing 31st at the ACM ICPC 2018 World Finals and 1st at the 2017 Pacific Northwest Regional and Berkeley Programming Contest.
Balaji won 7th place and $100,000 in Kaggle’s TSA-sponsored Passenger Screening Algorithm Challenge. According to his LinkedIn profile, he was a 2016 national champion and USACO finalist.
Like most others in his field, Balaji was fascinated by the promise of artificial intelligence from an early age. In an interview given to The New York Times in October, he said his interest in AI began as a teenager, when he came across a news article about the technology and imagined that neural networks could solve humanity’s biggest problems.
“I thought AI was something that could be used to solve unsolvable problems, like treating diseases or stopping aging…I thought we could invent some kind of scientist to help solve them,” he said.
Even before graduating, he worked at Scale AI and Helia, and was a software engineer at Quora. In 2020, Balaji joined other Berkeley graduates who went to work at OpenAI.
Balaji’s time at OpenAI
He worked at OpenAI for four years, during which time he spent a year and a half helping collect and organize the vast amounts of internet data that the company used to build ChatGPT, an online chatbot.
Balaji told the NYT that in his early days at OpenAI, he did not carefully consider whether the company had the legal right to use both copyrighted data and open internet data to build its products. It was only after the release of ChatGPT in late 2022 that he began to ponder the issue. He came to believe that technologies like ChatGPT were harming the internet by using copyrighted data and violating the law in the process.
By 2024, Balaji said, he had realized that he no longer wanted to contribute to a technology that he believed would do society more harm than good. He left the company in August of this year without a new job lined up and began working on what he called a “personal project.”
He died a day after he was named in a court filing as someone whose files OpenAI would search as part of a lawsuit against the AI giant.
Balaji’s allegations against OpenAI
After leaving OpenAI, Suchir Balaji spoke out publicly against how AI companies are using copyrighted data to create their technology. He argued that AI models depend too heavily on the labor of others because they are trained on copyrighted material scraped from the internet without permission.
“This is not a sustainable model for the entire internet ecosystem,” he told the NYT.
He also explained his concerns on his personal website. There, he stated that although generative models rarely produce outputs that are identical to their training data, the act of reproducing copyrighted material during training could be a violation of the law if it is not protected by “fair use.”
“Fair use is determined on a case-by-case basis, so we cannot make broad statements about when generative AI qualifies as fair use,” he noted.
Balaji argued that in some cases chatbots compete directly with the copyrighted works they studied. “Generative models are designed to mimic online data, so they can replace ‘basically anything’ on the internet, from news articles to online forums,” he said.
The biggest problem, he said, is that as AI technology gradually replaces existing internet services, it generates false and sometimes completely fabricated information, which researchers call “hallucinations.”
The internet, he said, is getting worse.
Claims against AI companies
Balaji was not alone in his concerns that AI companies were misusing significant amounts of data to train chatbots. News publishers in the U.S. and Canada, including The New York Times, have filed lawsuits against OpenAI and its major partner Microsoft, alleging that the companies used millions of articles to build chatbots that compete with news outlets as a source of authoritative information.
Many best-selling authors, including John Grisham, also filed lawsuits against the company.
OpenAI disputes claims
OpenAI disputes Balaji’s claims and maintains that its use of data complies with fair use principles and legal precedent.
“We build AI models using publicly available data in a manner that is protected by fair use and related principles and supported by long-standing and widely accepted legal precedent. We believe this principle is fair to creators, necessary to innovators, and important to America’s competitiveness,” OpenAI said in a statement.
The company told the BBC in November that its software is “based on fair use and related international copyright principles.”
Reacting to Balaji’s death, an OpenAI spokesperson said: “We are devastated to learn of this incredibly sad news today and our hearts go out to Suchir’s loved ones at this difficult time.”