Sep 20, 2024
We want to share this great TVNewsCheck post by industry expert Jon Accarrino about OpenAI’s new breakthrough reasoning model, “Strawberry,” and its impact on media companies.
OpenAI announced o1, its latest AI model code-named “Strawberry.” Unlike previous models that focused on size scaling, o1 emphasizes “reasoning” through complex problems. This AI can “think” for itself rather than merely imitate human responses, marking a fundamental shift in AI capabilities.
For local media companies, o1’s potential impact on newsgathering, data analysis and investigative reporting is significant. However, it also brings new challenges, including ethical concerns and pricing roughly four times higher than GPT-4o’s. Understanding these implications is crucial as organizations consider adopting this powerful, yet complex, AI technology.
The Power Of Step-By-Step AI Reasoning
Unlike traditional large language models that generate answers in one step, o1 “reasons” through problems step-by-step, mimicking human thought processes.
Mira Murati, OpenAI’s chief technology officer, explains: “This is what we consider the new paradigm in these models. It is much better at tackling very complex reasoning tasks.” This approach allows o1 to solve problems that stump existing AI models, including OpenAI’s most powerful previous model, GPT-4o. According to OpenAI’s benchmarks, on the American Invitational Mathematics Examination (AIME), a test for advanced math students, GPT-4o solved only 12% of the problems on average, while o1, with its advanced reasoning capabilities, got an impressive 83% correct.
“OpenAI’s o1 isn’t just another incremental improvement; it’s a significant advancement with many benefits for journalists,” says Pete Pachal, founder of the AI training and consultancy company, The Media Copilot. “Investigative journalism in particular could make great use of o1’s ability to ‘think’ through complex topics, turning data-heavy stories into compelling narratives that communities can truly engage with and understand.”
OpenAI o1: Pros For Local Media Companies
The introduction of OpenAI’s o1 “Strawberry” model presents a range of opportunities for local media companies to enhance their workflows:
1. Enhanced Fact-Checking and Analysis: With its improved reasoning capabilities, o1 can be a valuable tool for verifying information, spotting inconsistencies in reports and analyzing complex datasets.
2. Advanced Problem-Solving: The model’s proficiency in tackling complex problems could aid in investigating intricate local issues, from city budget analyses to environmental impact studies.
3. Improved Coding Capabilities: o1 excels in coding tasks, reaching the 89th percentile in Codeforces competitions. This could be particularly useful for developers who want to create custom data analysis tools or interactive digital content.
4. Reduced Hallucinations: OpenAI claims that o1 is significantly less likely to hallucinate compared to previous models, potentially leading to more reliable AI-assisted content creation and research.
5. Enhanced Safety Features: The model is reportedly much harder to “jailbreak,” providing an additional layer of security when using AI in sensitive news operations. OpenAI states, “On one of our hardest jailbreaking tests, GPT-4o scored 22 (on a scale of 0-100) while our o1-preview model scored 84.”
6. Specialized Versions: OpenAI has introduced o1-mini, a faster and cheaper version that is particularly effective at coding. This could be beneficial for developers at local media companies looking to implement o1’s capabilities on a budget.
OpenAI o1: Cons And Considerations
While the potential benefits of o1 for local media are significant, there are several important challenges and limitations:
1. Cost Implications: OpenAI’s pricing structure for o1 is considerably higher than for previous models. For o1-preview, OpenAI is charging $15 per 1 million input tokens and $60 per 1 million output tokens, making o1 roughly four times more expensive to use than GPT-4o.
“We are getting to the point where these models are getting specialized and the costs of the models reflect that,” says Michael Newman, director of transformation at Graham Media Group. “Media companies are going to need to be savvy in finding the right model at the least cost to serve the goals of the AI tool they are building.”
2. Slower Response Times: The enhanced reasoning capabilities come at the cost of speed. The step-by-step reasoning process can take significantly longer than traditional AI responses, which could impact real-time operations.
3. Limited Integration and Capabilities: Currently, OpenAI o1 “Strawberry” lacks many features that make ChatGPT useful, such as web browsing and file uploading capabilities. It also doesn’t have the multimodal capabilities (like image processing) that made GPT-4o so impressive. This may limit its immediate applicability in some organizations.
4. Struggles with Simple Tasks: Surprisingly, OpenAI admits that o1 can struggle with simpler tasks. It also has a tendency to overwhelm users with lengthy responses. So, for many day-to-day operations, GPT-4o might still be the better option.
5. Safety Concerns With Its Persuasion Capabilities: According to OpenAI’s system card for o1, its new models have been rated with a “medium risk” for chemical, biological, radiological and nuclear (CBRN) weapons as well as for persuasion, the highest risk level the company has ever attributed to its AI technology. (Note to reader: Raise your eyebrows now). CBRN is obviously alarming, but equally concerning is o1’s powerful persuasive capability. The model’s advanced reasoning could potentially craft highly convincing arguments or narratives that subtly shape public opinion, even unintentionally. This persuasive power, combined with o1’s “black box” nature, makes it challenging to detect and mitigate potential biases or manipulations in its output.
For journalists and editors, this underscores the need for heightened vigilance, robust fact-checking processes and transparent disclosure of AI use in content creation. As OpenAI continues to collaborate with government bodies and enhance safety measures, newsrooms considering o1 adoption must develop strict ethical guidelines to ensure its use aligns with journalistic integrity and public trust.
Implications For Newsrooms
For local media companies, o1’s reasoning capabilities present both exciting opportunities and significant challenges. The model’s ability to “walk backwards from big ideas” could be particularly valuable for breaking down complex stories or investigations. For instance, it could help journalists map out the intricate connections in a corruption scandal or unravel the long-term implications of a new local policy.
However, harnessing the full potential of such AI models requires more than just the technology itself. Dan Goikhman, CEO of Dappier.com, emphasizes this point: “For reasoning AI agents like o1 to become truly useful in journalism, they need ubiquitous access to trusted sources of data in real-time. This is where the industry needs to evolve.”
Instead of giving o1 access to web browsing, perhaps providing access to approved data sources would be a safer way to deploy AI models with reasoning capabilities. Giving the AI access to approved RAG data sources could help a reasoning AI “think” smarter.
It’s also crucial to understand that o1 is best suited for complex, big-picture questions rather than simple queries. Using it for basic tasks would be both inefficient and costly. As newsrooms grapple with integrating advanced AI like o1 into their workflows, they’ll need to balance the potential benefits with these practical considerations. The key will be finding ways to leverage o1’s capabilities while maintaining efficiency, ensuring access to trusted data sources and managing costs.
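To make the cost question concrete, here is a rough sketch in Python of how a newsroom might estimate spend at the o1-preview rates quoted earlier ($15 per 1 million input tokens, $60 per 1 million output tokens). The function name, the token counts and the 200-requests-per-month workload are hypothetical, chosen only for illustration. Actual bills will depend on OpenAI’s current pricing and on how many tokens the model generates while reasoning.

```python
# Rates quoted in this article for o1-preview, in USD per 1 million tokens.
INPUT_RATE_PER_MILLION = 15.00
OUTPUT_RATE_PER_MILLION = 60.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_RATE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload (an assumption, not a figure from the article):
# a 20,000-token budget document in, a 5,000-token analysis out,
# run 200 times a month.
per_request = estimate_cost(input_tokens=20_000, output_tokens=5_000)
monthly = per_request * 200

print(f"Per request: ${per_request:.2f}")
print(f"Per month (200 requests): ${monthly:.2f}")
```

Because output tokens cost four times as much as input tokens at these rates, long responses drive the bill far more than long prompts do, which is one more reason to reserve o1 for questions that genuinely need it.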
What’s Next For AI Models With Reasoning Capabilities?
While o1 represents a significant advancement, it’s important to remember that OpenAI doesn’t consider o1 to be a final product yet. As OpenAI CEO Sam Altman tweeted (or is it now X’ed?), “o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”
OpenAI has stated: “We expect regular updates and improvements” to the o1 series. Moreover, the company is already working on GPT-5, which is likely to incorporate this new reasoning technology alongside further scaling improvements.
“Think of o1 as an ‘ultra prompter’. Behind the scenes, o1 is simply automating a complex hierarchy of prompting steps and self-checks, which yields these higher level reasoning capabilities,” says Robert Caulk, founder of the AI development company, Emergent Methods. “This level of self-reflection and prompt automation is especially helpful when it comes to the reconciliation of important facts in news research platforms like AskNews.app.”
OpenAI’s o1 offers exciting possibilities for enhancing local media operations, from improved fact-checking to more sophisticated data analysis and investigative research. And there are a lot of advantages to knowing how and how not to use this next generation of AI models. However, its implementation should be approached very thoughtfully, considering both the potential benefits and the associated costs and risks. For managers, it’s a lot to think about.
As media professionals across our industry grapple with how, and if, to deploy these powerful new AI tools, just remember this: OpenAI’s developers are smart enough to create an AI that can “think” and reason through complex problems, but they still don’t know how to properly name a product. Are “GPT-4o” and “o1” the best that they can come up with? Maybe it’s time they let their own AI take a crack at the naming process. It couldn’t possibly do worse than “o1,” could it?