This week, the tech world saw what could only be described as a drama straight out of an HBO boardroom. The whole OpenAI saga unfolded right before our eyes, with the firing and subsequent return of co-founder Sam Altman as the company’s CEO becoming a public spectacle.
It all started last Friday when the OpenAI board ousted Sam Altman, stating that he “was not consistently candid in his communications with the board.” The actual reasons behind Altman’s dismissal are still wrapped in mystery, but social media is buzzing with rumors and speculation.
Among the speculations circulating is the notion that Altman’s firing might be linked to a secret artificial intelligence breakthrough at OpenAI known as Q* (possibly built on Q-learning), seen as a precursor to achieving artificial general intelligence (AGI). The intrigue continues to fuel conversations and curiosity about the events unfolding at OpenAI.
In a post on X, Rowan Cheung said that Altman’s firing was driven in part by “OpenAI’s secret breakthrough AI named Q* (possibly Q-learning).” Citing a report from Reuters, Cheung wrote:
“According to a report by Reuters, OpenAI’s secret breakthrough called Q* (pronounced Q-Star) precipitated the ousting of Sam Altman. Ahead of Sam’s firing, researchers sent the board a letter warning of a new AI discovery that could ‘threaten humanity.’”
— Rowan Cheung (@rowancheung) November 23, 2023
Per the report, Sam Altman’s departure from OpenAI was triggered by a letter to the board concerning a significant AI breakthrough. According to two individuals familiar with the situation, a group of staff researchers penned a letter cautioning the board about a potent artificial intelligence discovery, expressing concerns about its potential threat to humanity.
“Ahead of OpenAI CEO Sam Altman’s four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity,” Reuters reported.
This undisclosed letter and the AI algorithm were pivotal in the lead-up to Altman’s removal, the sources noted, given his prominent role in generative AI. Before Altman’s return on Tuesday, more than 700 employees had reportedly threatened to leave and join Microsoft in solidarity with their ousted leader.
The sources highlighted the letter as just one factor in a series of grievances that contributed to Altman’s dismissal. Concerns included worries about advancing commercially without a full understanding of the potential consequences. Unfortunately, Reuters was unable to obtain a copy of the letter, and the staff members who authored it did not respond to requests for comment.
OpenAI declined to comment when contacted by Reuters, but it reportedly acknowledged the existence of a project called Q* and of a letter to the board in an internal message to staff sent before the recent events. An OpenAI spokesperson said the message, sent by long-time executive Mira Murati, alerted staff to certain media stories without confirming their accuracy.
What is Q-Learning?
In the world of artificial intelligence and machine learning, Q-learning is a reinforcement learning algorithm that helps computers learn by trial and error, enabling them to make strategic decisions and maximize rewards. Think of it as a method for teaching a computer how to excel at a game.
Here’s how it works: the computer is given a set of rules and rewards, and then it dives into playing the game over and over. While playing, it keeps track of which actions lead to the biggest rewards in each situation, storing those estimates as so-called Q-values. After every move, it nudges the Q-value for the action it just tried toward the reward it received plus the best value it expects from the next situation. Over time, it gets the hang of the game and becomes quite the expert, consistently picking the actions with the highest Q-values.
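To make that concrete, below is a minimal sketch of tabular Q-learning in Python on a toy one-dimensional grid world. The environment, reward scheme, and hyperparameters here are illustrative assumptions chosen for the example; they have nothing to do with the details of OpenAI’s Q*, which remain unknown.

```python
import random

# Toy 1-D grid world: states 0..4, with state 4 as the goal.
# The agent moves left (-1) or right (+1); reaching the goal pays reward 1.
N_STATES = 5
ACTIONS = [-1, +1]

ALPHA = 0.1    # learning rate: how far each update moves the estimate
GAMMA = 0.9    # discount factor: how much future rewards count
EPSILON = 0.1  # exploration rate: chance of trying a random action
EPISODES = 500

# The Q-table: one learned value per (state, action) pair, starting at zero.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def choose_action(state):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for _ in range(EPISODES):
    state = 0
    while state != N_STATES - 1:
        action = choose_action(state)
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # The core Q-learning update:
        # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# After training, the greedy policy should walk right toward the goal.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print("Best action in each non-goal state:", policy)
```

After a few hundred episodes the Q-values settle, and the learned policy consistently moves right toward the goal. The same trial-and-error principle, scaled up enormously, is what the speculation around Q* is gesturing at.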
Did OpenAI Achieve a Q-Learning Breakthrough?
According to Reuters, insiders at OpenAI believe Q* (pronounced Q-Star) could be a breakthrough in the startup’s quest for artificial general intelligence (AGI), which refers to autonomous systems surpassing humans in most economically valuable tasks, as defined by OpenAI.
Another source, who preferred to remain anonymous, told Reuters that with substantial computing resources, this new model demonstrated the ability to tackle specific mathematical problems. While the math it handled was at the level of grade-school students, the fact that it aced these tests left researchers feeling quite optimistic about Q*’s potential success.
The researchers see math as the next frontier for advancing generative AI. Current generative AI excels in tasks like writing and language translation by predicting the next word based on statistical patterns. Unlike those tasks, where many answers can be acceptable, math has a single clear-cut right answer; conquering it would suggest that AI could develop stronger reasoning capabilities, closer to human intelligence.
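To see what “predicting the next word based on statistical patterns” means in miniature, here is a deliberately tiny, hypothetical illustration: a hand-built bigram model that always picks whichever word most often followed the previous one. The word counts below are made up for the example.

```python
from collections import Counter

# Made-up corpus statistics: how often each word follows a given word.
follows = {
    "the": Counter({"cat": 10, "dog": 7, "equation": 1}),
    "cat": Counter({"sat": 5, "ran": 3}),
}

def predict_next(word):
    """Return the statistically most likely next word, if we have data for it."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "cat": plausible, but hardly the only valid continuation
```

A prediction like this can be perfectly plausible without being uniquely correct, which is exactly why a math problem, where only one answer is right, is a stiffer test of reasoning.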
Experts in AI believe this capability could be a game-changer, particularly in novel scientific research applications. Unlike a calculator that solves predefined operations, AGI possesses the ability to generalize, learn, and comprehend a wide range of tasks.
In their letter to the board, researchers highlighted the prowess of AI and its potential risks, although the exact safety concerns weren’t specified. There has been ongoing discourse in the tech community about the potential dangers associated with highly intelligent machines, considering scenarios where they might decide that humanity’s destruction is in their best interest.
Furthermore, researchers have drawn attention to the work of an “AI scientist” team, a combination of the earlier “Code Gen” and “Math Gen” teams. This group is delving into ways to optimize existing AI models, enhancing their reasoning capabilities and potentially enabling them to contribute to scientific endeavors, another source told Reuters.
Meanwhile, a day before he was fired by the OpenAI board, Altman said this at the Asia-Pacific Economic Cooperation summit:
“Four times now in the history of OpenAI, the most recent time was just in the last couple weeks, I’ve gotten to be in the room when we sort of push the veil of ignorance back and the frontier of discovery forward, and getting to do that is the professional honor of a lifetime.”
Was Altman fired because he was pushing AI boundaries in the hope of sparking AGI? Let us know what you think.