Don’t miss the leaders from OpenAI, Chevron, Nvidia, Kaiser Permanente, and Capital One at VentureBeat Transform 2024. Gain essential insights about GenAI and grow your network during this exclusive three-day event. Learn more
A new large language model (LLM) has apparently taken the performance crown from OpenAI’s GPT-4o about a month after its release: rival AI company Anthropic’s new Claude 3.5 Sonnet chatbot and LLM, released today, outpaces all others in the world in the field of According to the company, there are important third-party benchmark tests. And while it is faster and cheaper than previous Claude 3 models.
But it’s one thing to drop a new model and claim dominance, and another when users can actually experience and leverage the performance improvements (Google Gemini family – I’m looking at you: Supposedly better than OpenAI’s previous flagship GPT -4 on some stats, but who really uses you?).
Anthropic’s latest release of Claude 3.5 Sonnet does not appear to have this problem. Many AI influencers and power users have taken to the internet in the few hours since its release to share their largely positive impressions of Anthropic’s new model, and to showcase what the new, “most intelligent” LLM in the world can achieve.
Improving coding skills and product creation
As AI influencer and expert Allie K. Miller wrote on X, Claude 3.5 Sonnet was able to create a fully playable game for her in less than half a minute from just a screenshot:
Countdown to VB Transform 2024
Join business leaders in San Francisco from July 9 to 11 for our flagship AI event. Connect with colleagues, explore the opportunities and challenges of generative AI, and learn how to integrate AI applications into your industry. register now
Likewise, the informative and timely real, working web form that Claude 3.5 Sonnet built.
It was even able to recreate footage from the groundbreaking 1995 film Hackers:
Pietro Schirano, founder of AI image generation startup EverArt, wrote on X that the combination of Claude 3.5 Sonnet with another tool, Maestro, “showed sparks of AGI?”
Anthropic staff members start working on Claude 3.5 Sonnet
Although clearly biased, Anthropic Developer Relations team lead Alex Albert posted a thread on : “It’s becoming clear that in a year’s time a large percentage of the code will be written by LLMs.”
Similarly, Anthropic technical staffer Maggie Vo posted on X that Claude 3.5 Sonnet can now “do half my job… and I couldn’t be happier.”
Putting pressure on OpenAI
Others noted that now that Claude 3.5 Sonnet has eclipsed OpenAI’s GPT-4o and is available at similar prices, the latter company is under renewed pressure to continue advocating its models as the right choice.
Wharton School of Business professor and Pennsylvania University AI booster Ethan Mollick compared the Artefacts feature to a “simpler version of Code Interpreter” from OpenAI’s GPT-4.
X user @kimmonismus went even further, saying that OpenAI will “sleep through AGI,” or artificial general intelligence, the company’s goal: an AI model that outperforms humans in the most economically valuable work. They condemned the company for announcing additional features with GPT-4o that have yet to be released, including new voice modalities.
Still not human level
Despite the lofty praise surrounding
Similarly, technology journalist Timothy B. Lee, known for his handle @binarybits on what is worth more: 100 cents or three quarters? to which it replied Three quartersinitial.
Still, even with these so-far minor issues, Claude 3.5 Sonnet seems like a huge leap forward for Anthropic and LLMs in general, and shows that the performance gains of individual AI modelers are certainly not slowing down with the current levels of available computing resources (i.e. GPUs).