Anthropic’s Claude 3.5 Sonnet impresses AI power users: ‘this is wild’

Don’t miss the leaders from OpenAI, Chevron, Nvidia, Kaiser Permanente, and Capital One at VentureBeat Transform 2024. Gain essential insights about GenAI and grow your network during this exclusive three-day event. Learn more

A new large language model (LLM) has apparently taken the performance crown from OpenAI’s GPT-4o about a month after its release: rival AI company Anthropic’s new Claude 3.5 Sonnet chatbot and LLM, released today, outpaces all others in the world in the field of According to the company, there are important third-party benchmark tests. And while it is faster and cheaper than previous Claude 3 models.

But it’s one thing to drop a new model and claim dominance, and another when users can actually experience and leverage the performance improvements (Google Gemini family – I’m looking at you: Supposedly better than OpenAI’s previous flagship GPT -4 on some stats, but who really uses you?).

Anthropic’s latest release of Claude 3.5 Sonnet does not appear to have this problem. Many AI influencers and power users have taken to the internet in the few hours since its release to share their largely positive impressions of Anthropic’s new model, and to showcase what the new, “most intelligent” LLM in the world can achieve.

Improving coding skills and product creation

As AI influencer and expert Allie K. Miller wrote on X, Claude 3.5 Sonnet was able to create a fully playable game for her in less than half a minute from just a screenshot:

Countdown to VB Transform 2024

Join business leaders in San Francisco from July 9 to 11 for our flagship AI event. Connect with colleagues, explore the opportunities and challenges of generative AI, and learn how to integrate AI applications into your industry. register now

This is wild.
In just 25 seconds Claude 3.5 Sonnet coded a fully functional Mancala web app for me ?️
I have only provided ONE screenshot of the game instructions.
It did the rest:
– Coded the entire game
– I previewed it so I could test it
– Rules provided pic.twitter.com/WLweZUGt5C
— Allie K. Miller (@alliekmiller) June 20, 2024

Likewise, the informative and timely real, working web form that Claude 3.5 Sonnet built.

Claude 3.5 just generated React jsx code with a simple contact form and managed to run it in the Artefacts playground? pic.twitter.com/KREZaArObw
— TestingCatalogue news? (@testcatalog) June 20, 2024

It was even able to recreate footage from the groundbreaking 1995 film Hackers:

Pietro Schirano, founder of AI image generation startup EverArt, wrote on X that the combination of Claude 3.5 Sonnet with another tool, Maestro, “showed sparks of AGI?”

Claude 3.5 Sonnet + Maestro = Sparks of AGI?
I asked to make a Mario clone with only geometric shapes, and the wildest thing is that the characters were also animated, and the shapes seemed like new concepts.
It took 3 minutes. Watch the game! pic.twitter.com/YVQYp7m5Ed
— Pietro Schirano (@skirano) June 20, 2024

Anthropic staff members start working on Claude 3.5 Sonnet

Although clearly biased, Anthropic Developer Relations team lead Alex Albert posted a thread on : “It’s becoming clear that in a year’s time a large percentage of the code will be written by LLMs.”

Claude is getting very good at coding and autonomously fixing pull requests. It’s becoming clear that in a year’s time, a large percentage of code will be written by LLMs.
I’ll show you what I mean:
— Alex Albert (@alexalbert__) June 20, 2024

Similarly, Anthropic technical staffer Maggie Vo posted on X that Claude 3.5 Sonnet can now “do half my job… and I couldn’t be happier.”

Putting pressure on OpenAI

Others noted that now that Claude 3.5 Sonnet has eclipsed OpenAI’s GPT-4o and is available at similar prices, the latter company is under renewed pressure to continue advocating its models as the right choice.

Wharton School of Business professor and Pennsylvania University AI booster Ethan Mollick compared the Artefacts feature to a “simpler version of Code Interpreter” from OpenAI’s GPT-4.

I’ve been using the new Claude 3.5 model as a tester and now that it’s out I can say that it’s very, very impressive, and the “artifacts” it generates are like a simpler version of Code Interpreter
This is a real-time video of me creating a playable game and editing it together with Claude pic.twitter.com/bWqw8F8CdH
— Ethan Mollick (@emollick) June 20, 2024

X user @kimmonismus went even further, saying that OpenAI will “sleep through AGI,” or artificial general intelligence, the company’s goal: an AI model that outperforms humans in the most economically valuable work. They condemned the company for announcing additional features with GPT-4o that have yet to be released, including new voice modalities.

Hi, @OpenAI. You sleep through AGI. While you make promises all the time (“Patience Jimmy, it’s worth the wait”) and announce without delivering (“GPT-4o-Voice within a few weeks”), the competition manages to deliver without making any major announcements in advance. doing! Take a leaf out of… https://t.co/o6ROsZwDRG
— Chubby♨️ (@kimmonismus) June 20, 2024

Still not human level

Despite the lofty praise surrounding

Groundbreaking models like GPT-4o (and now Claude 3.5 Sonnet) may be at the level of a “Smart High Schooler” in some respects, but they still struggle with basic tasks like tic-tac-toe. There was hope that native multimodal training would help, but that wasn’t the case. pic.twitter.com/1iDq0DCL4Q
— Noam Brown (@polynoamial) June 20, 2024

Similarly, technology journalist Timothy B. Lee, known for his handle @binarybits on what is worth more: 100 cents or three quarters? to which it replied Three quartersinitial.

Still, even with these so-far minor issues, Claude 3.5 Sonnet seems like a huge leap forward for Anthropic and LLMs in general, and shows that the performance gains of individual AI modelers are certainly not slowing down with the current levels of available computing resources (i.e. GPUs).

VB daily

Stay informed! Receive the latest news in your inbox every day

By subscribing you agree to VentureBeat’s Terms of Service.

Thanks for subscribing. View more VB newsletters here.

An error has occurred.

Improving coding skills and product creation

Anthropic staff members start working on Claude 3.5 Sonnet

Putting pressure on OpenAI

Still not human level

Leave a Comment Cancel reply