14 Charts That Tell the Story of AI Right Now
Data on Elo rankings, memory tokens, GitHub repos, domain registrations & more
So much is happening in artificial intelligence right now.
There’s the race between OpenAI and Anthropic.
Open source projects are hard at work to show they’re still relevant.
A gusher of money is flowing into the space.
And we’re already seeing projects ascend into hyped-up glory, only to plummet back to Earth a month later.
We wanted to put some data behind everything that’s going on to make sense of what’s happening. We looked to answer questions like: How are models performing in independent tests? Which projects are accumulating the most stars on GitHub? And what types of AI startups are raising the most funding?
With the help of Wenqi Shao — a data scientist who has worked at Facebook, Uber, and Flexport and is now the lead data scientist at Webflow — we compiled 14 charts that offer thought-provoking snapshots of what’s really going on with artificial intelligence.
The following funding-related charts were built on NFX’s Generative Tech Open Source Market Map and supplemented with additional data from Base 10’s Generative AI dashboard, company websites, and PitchBook. Together, this dataset covers over 440 funded generative AI startups that have raised over $26 billion. Other charts were sourced from LMSYS Org, data.ai, GitHub, AngelList, Domain Name Stat, and individual AI company websites.
OpenAI, Anthropic & open source project Vicuna lead large language model rankings
The Chatbot Arena is a benchmark platform for large language models (LLMs) that relies on the crowd to run anonymous, randomized language model battles. After over 27,000 matchups, OpenAI’s GPT-4 took first place.
Meanwhile, Anthropic — a startup founded by OpenAI defectors — took second and third place with versions of its LLM, Claude.
But the most surprising ranking came from Vicuna-13B at fifth. Vicuna is an open source language model, launched in March by a cross-university collaboration, that builds on top of LLaMA. The performance of Vicuna and other open source models gives hope that private companies might not run away with this technological arms race.
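Rankings built from head-to-head battles like these are typically expressed as Elo ratings. The following is a minimal sketch of how such a rating could be computed from crowd votes, not the Chatbot Arena's actual code. The K-factor of 32, the starting rating of 1,000, and the model names and battle results are all illustrative assumptions.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def compute_elo(battles, k=32.0, initial=1000.0):
    """Replay battles in order and return a dict of model -> rating.

    Each battle is (model_a, model_b, winner), where winner is
    "a", "b", or "tie". k and initial are assumed values.
    """
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, winner in battles:
        rating_a, rating_b = ratings[model_a], ratings[model_b]
        expected_a = expected_score(rating_a, rating_b)
        actual_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
        # Elo is zero-sum: whatever A gains, B loses.
        delta = k * (actual_a - expected_a)
        ratings[model_a] = rating_a + delta
        ratings[model_b] = rating_b - delta
    return dict(ratings)


# Hypothetical votes, made up for illustration only.
battles = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "a"),
    ("model-x", "model-z", "a"),
    ("model-x", "model-y", "tie"),
]
ratings = compute_elo(battles)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Because each update depends on the gap between the two ratings, beating a higher-rated model moves a rating more than beating a lower-rated one. That is how a newcomer can climb a leaderboard quickly.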
Anthropic & Mosaic provide more than double OpenAI’s memory
If you’ve tried to sustain a running conversation with ChatGPT, then you know how sad it is when it starts to forget earlier parts of your conversations. That’s because OpenAI is offering paid consumer GPT-4 users a measly 8,000 tokens of memory.
There was a moment when that was enticing, but rival Anthropic is offering 100,000 tokens, approximately the equivalent of 75,000 words. (For reference, Mark Twain’s The Adventures of Tom Sawyer has 69,000 words.)
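To put those numbers side by side, here is a rough back-of-the-envelope sketch. It assumes the same ratio implied above (100,000 tokens to about 75,000 words, or 0.75 words per token) applies to both models. Real ratios vary with the tokenizer and the text.

```python
# Ratio implied by the figures above: 100,000 tokens ~ 75,000 words.
WORDS_PER_TOKEN = 75_000 / 100_000  # 0.75, an approximation

TOM_SAWYER_WORDS = 69_000  # word count cited above


def tokens_to_words(tokens):
    """Approximate how many English words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)


gpt4_words = tokens_to_words(8_000)      # about 6,000 words
claude_words = tokens_to_words(100_000)  # about 75,000 words

print(f"8,000 tokens   ~ {gpt4_words:,} words")
print(f"100,000 tokens ~ {claude_words:,} words")
print("Whole novel fits in 100,000 tokens:", claude_words >= TOM_SAWYER_WORDS)
print("Whole novel fits in 8,000 tokens:", gpt4_words >= TOM_SAWYER_WORDS)
```

By this estimate, 8,000 tokens covers roughly 6,000 words, or less than a tenth of the novel.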
AI GitHub star leaderboard puts Auto-GPT on top
Activity on GitHub repositories offers a window into which AI developer projects are gaining traction. The chart below shows some of the most starred “AI” repositories on the platform.
The most popular repository by far is Auto-GPT, which is an open-source attempt to make GPT-4 fully autonomous.
Some of the other highly active repositories include collaborative web UIs for popular projects like Stable Diffusion, methods to run Meta’s open source project LLaMA on a local machine, and desktop apps for GPT-4/GPT-3.5-powered chat.
Also on the list are tools that aim to make AI development easier, such as LangChain (which enables building applications with LLMs through composability) and ColossalAI (which focuses on distributed techniques to make running large AI models cheaper and faster), along with Open-Assistant (a chat-based assistant) and generative audio models like Bark and SoftVC VITS.
AI app Lensa crashes, while a basket of chatbots peaked in April
Mass market consumer interest in generative AI first took off with photo AI apps like Lensa in November 2022.
While we saw a quick flurry of activity and the emergence of many similar photo avatar apps in December, consumer interest quickly cratered and later moved into chatbot-style LLM apps.
Based on app analytics download data from Data.ai (formerly App Annie), we observed peak weekly downloads of Lensa reaching 2 million in the United States.
However, more recently, Lensa has fallen to around 20,000 weekly downloads.
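For a sense of scale, the drop from peak to recent weekly downloads can be worked out directly. This short sketch uses only the two figures quoted above. The helper name percent_change is our own.

```python
def percent_change(old, new):
    """Percentage change from old to new (negative means a decline)."""
    return (new - old) * 100 / old


peak_weekly = 2_000_000   # Lensa's peak weekly US downloads
recent_weekly = 20_000    # Lensa's more recent weekly US downloads

change = percent_change(peak_weekly, recent_weekly)
print(f"Change from peak: {change:.0f}%")
```

In other words, weekly downloads fell to about one-hundredth of their peak.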
Starting in March, consumer interest began to shift toward chatbots, which gained momentum after ChatGPT APIs became available.
While our basket of chatbots (Nova AI, Genie AI, Ask AI, and AI Chat) peaked in April, chatbots are proving more resilient. The group has collectively earned more than 1 million downloads in all but one week since the category exploded in late March.
The top four chatbots have collectively amassed 10 million cumulative downloads since March 2023, which is more than double the download count of Lensa.
Also of note in the consumer AI app world: OpenAI recently launched its official iOS mobile app, which immediately rose to the top of the App Store upon release.
Demand for “.ai” domain names was so December
If there’s a chart in this collection that serves as a word of warning about the AI hype, it’s this one: Companies gobbled up “.ai” domains in December and then really chilled out on the domain land grab.
Is the new company AI hype cycle beginning or ending? The domain registrations reached a peak in December 2022, but have stayed at an elevated rate this year, averaging more than 1,500 per week.
AI startups chase opportunities in marketing & sales, audio, and customer support
With all the hype in artificial intelligence, venture capitalists risk funding a lot of competitive startups. We looked at the most competitive sectors within generative AI.
The most popular space by number of startups is text applications targeting marketing and sales. Audio apps came in second, customer support and chatbot apps third, and image apps fourth.
Investors pour money into machine learning operations
Of the $15 billion in total AI funding (excluding OpenAI), the most funded category is machine learning operations/platform tools.
Sorting the categories by the median amount raised reveals that, after ML Ops/Platform, the most well-funded startups are those in the semantic and vector search category. They are followed by startups in the legal, biology, customer support and chatbots, and synthetic data generation categories.
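Ranking by median rather than by total can reorder the list, because one very large round inflates a category's total without moving its median much. The sketch below shows the mechanics. The category names and dollar amounts are made-up placeholders, not figures from our dataset.

```python
from collections import defaultdict
from statistics import median

# Placeholder data: (category, amount raised in $ millions). Not real figures.
rounds = [
    ("category-a", 5), ("category-a", 50), ("category-a", 20),
    ("category-b", 10), ("category-b", 12),
    ("category-c", 100), ("category-c", 1), ("category-c", 2), ("category-c", 3),
]


def group_amounts(rounds):
    """Collect the amounts raised under each category."""
    grouped = defaultdict(list)
    for category, amount in rounds:
        grouped[category].append(amount)
    return grouped


def rank_by(rounds, stat):
    """Rank categories, highest first, by a statistic such as sum or median."""
    grouped = group_amounts(rounds)
    scores = {category: stat(amounts) for category, amounts in grouped.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


by_total = rank_by(rounds, sum)
by_median = rank_by(rounds, median)
print("By total: ", by_total)
print("By median:", by_median)
```

In this toy data, category-c leads by total thanks to a single $100 million round, but it ranks last by median.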
At least 36 generative AI companies have raised more than $100 million
With investors scrambling to find ways into AI rounds, there aren’t too many generative AI-based companies left without significant fundraising.