One of the most awe-inspiring stories of our day is how breakthroughs in Artificial Intelligence — especially generative AI applications based on large language models — have recently achieved almost-incomprehensibly magical results in art, coding, healthcare, music, science, and more.
While most people can marvel at the enormous opportunities ahead, not everyone can yet wield the power of these tools. The ability to build AI applications is still limited to a relatively small number of machine learning engineers whose work is inhibited by a myriad of constraints, including the inability to find and use the vast majority of an organization’s data to inform and train their models. For generative AI to reach its potential, more people and more organizations, not just the largest tech companies in the world, must be capable of building their own AI applications. And for that to happen, more of the world’s data must be expressed in the language of AI: vector embeddings.
That is why today ICONIQ Growth is proud to announce our partnership with Pinecone, the vector database company providing long-term memory for AI. ICONIQ Growth is investing in Pinecone’s $100 million Series B alongside Andreessen Horowitz, previous investors Menlo Ventures and Wing Venture Capital, and experienced operators we have been fortunate to partner with in the past, including Bret Taylor (former President & Co-CEO, Salesforce), Bob Muglia (former CEO, Snowflake), and Olivier Pomel (Co-founder & CEO, Datadog).
Pinecone is a core component of the infrastructure stack underpinning AI-powered applications that search and rank results based on similarity. With Pinecone, engineers and data scientists can build vector-based AI applications that are accurate, fast, and scalable, and get them into production quickly.
Meeting the growing demand for AI
According to McKinsey, as of 2022, 50% of surveyed organizations reported having adopted AI in at least one business unit or function. International Data Corporation says global spending on AI, including software, hardware, and services for AI-centric systems, is expected to double in three years to $300 billion in 2026. The newer, fast-growing slice of AI, generative AI, is expected to grow 35.6% per year and reach $109 billion by 2030, according to Grand View Research.
The unbridled growth of AI and the need for vector-based databases attracted our attention, but we became more enthusiastic about investing in Pinecone after meeting its founder and CEO, Edo Liberty, as well as Bob Wiederhold, President and COO, and the entire Pinecone team. They bring deep domain expertise from years building similar technologies at Amazon, Couchbase, Databricks, Splunk, and Yahoo.
The early Pinecone team founded the company with the goal of giving every company access to the same sophisticated AI tools and infrastructure as tech behemoths. “The biggest and most successful AI and machine learning-driven products in the world (search at Google, product recommendations at Amazon, feed-ranking at Facebook and TikTok) are powered by vector search,” Edo says. “But most companies can’t afford to build those solutions. Pinecone makes accurate, reliable, and scalable generative AI accessible to organizations of all sizes.”
Edo intimately understands the industry’s most advanced AI and machine learning systems because he designed, built, and led some of them. Prior to starting Pinecone, he was Director of Research at Amazon Web Services and Head of Amazon AI Labs; previously he was Senior Research Director at Yahoo, where he focused on scalable machine learning and data mining for critical applications. He earned a PhD in Applied Mathematics from Yale and has published more than 70 academic papers on algorithms and machine learning systems.
Helping data speak the language of AI: Vector embeddings
AI understands the world as arrays of numbers, known as vectors. These vectors are the input and output of machine learning models. That’s why the first step in almost any machine learning project is to “embed” or convert data objects such as images, text, music, videos, etc. into numbers.
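To make the idea of “embedding” concrete, here is a minimal sketch in Python. It is not how a real embedding model works: production embeddings come from trained neural networks that capture meaning, whereas this hypothetical `toy_embed` function only hashes words into a short, fixed-length array of numbers. The function name and the choice of 8 dimensions are assumptions made purely for illustration.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Map text to a fixed-length unit vector using the hashing trick.

    Hypothetical stand-in for a learned embedding model: it reflects
    word overlap only, not meaning.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        # A stable hash decides which coordinate this word contributes to.
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    # Scale to unit length so texts of different sizes are comparable.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


embedding = toy_embed("vector databases store embeddings")
print(len(embedding))  # the vector always has `dim` entries
```

Whatever the input text, the output is an array of the same length, which is what lets downstream systems compare very different objects numerically.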
Translating data objects into the language of AI
What is a vector database?
Machine learning is able to deliver such impressive results because the system can consume more types and quantities of data from multiple sources at once and process them across thousands of dimensions. Traditional databases—where data is more rigid and stored in siloed records, files and tables—were simply not made for multi-dimensional computing. But there is hope: advancements in large language models allow you to convert your data into embeddings that can be stored and made available in vector databases — a complex task Pinecone streamlines and delivers for its customers.
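As a rough illustration of what a vector database does at its core, the sketch below stores vectors under IDs and returns the stored items most similar to a query vector. It is a hypothetical in-memory toy, not Pinecone’s API or implementation: the class name `TinyVectorIndex`, its `upsert` and `query` methods, and the three-dimensional example vectors are all assumptions. It also compares the query against every stored vector, which real systems avoid at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical in-memory index for illustration only."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, item_id, vector):
        # Insert a new vector or overwrite the one stored under item_id.
        self._vectors[item_id] = list(vector)

    def query(self, vector, top_k=3):
        # Brute force: score every stored vector, then keep the best top_k.
        scored = [
            (item_id, cosine_similarity(vector, stored))
            for item_id, stored in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


index = TinyVectorIndex()
index.upsert("cat", [0.9, 0.1, 0.0])
index.upsert("kitten", [0.8, 0.2, 0.1])
index.upsert("truck", [0.0, 0.1, 0.9])

results = index.query([1.0, 0.0, 0.0], top_k=2)
print([item_id for item_id, _ in results])  # ['cat', 'kitten']
```

The query never matches on keywords or exact values. It ranks items by how close their vectors point to the query vector, which is the behavior that similarity-based search and ranking rely on.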
Turning “dark” data into valuable insights
The potential impact of transforming all data into vectors stored in vector databases is staggering. How much smarter and more successful would you be if you could instantly access every single thing you’ve ever learned or experienced precisely at the moment when you needed to make a decision or offer a thoughtful response? Now expand that idea to massive scale: Imagine indexing the entire internet to have the answers to all of your most abstract questions at your fingertips.
However, today, the vast majority of data in the world is inaccessible, or “dark,” because it is stored in an unstructured way. As much as 80% of the world’s data is unstructured, according to IDC. Large language models transform dark data into vectors. Pinecone’s vector databases make it possible for the data to be stored, seen and understood so that it can be used for semantic search, chatbots, ranking and recommendation engines, anomaly detection, and more.
Expanding access to AI creation
The elegance of Pinecone lies in its simplicity. Pinecone handles the difficult parts of machine learning (vector math, the job of building and testing similarity search algorithms, etc.) so you can focus on developing applications that serve your needs. “One of the things we are driven by is how simple can we make this?” Edo says. “Can we make this completely plug-and-play? Can we make it so you can just drop in your data and run queries and not worry about anything?”
Making AI more accessible is especially important given the shortage of machine learning specialists. By some reports, there are approximately 30 million software engineers in the world but only about 500,000 ML specialists. Pinecone’s team has made it their mission to expand access to AI creation and help all developers discover, explore, and build with vector databases.
Building vector databases is hard; performance + reliability is harder
While its ease of use helps developers get started quickly, Pinecone’s long-term value lies in its performance. Customers include some of the world’s most innovative companies, such as Gong, HubSpot, Shopify, and Zapier. Pinecone’s most ardent customers rave about its simplicity, reliability, and measurable ROI. This kind of work (high-dimensional geometry, rounding and clustering data in different ways) is so ground-breaking that many of these techniques are still in research. Doing all of this in real time, at scale, means marrying algorithms and math with very high-performance code.
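One way to read “rounding” here is as quantization: storing each vector coordinate at lower precision to save memory, at the cost of a small error in similarity scores. That reading is an assumption, and the sketch below is a generic illustration rather than a description of Pinecone’s internals. The helper names `quantize` and `dequantize` and the 127-level scale are hypothetical.

```python
def quantize(vector, levels=127):
    """Round each coordinate (clamped to [-1, 1]) to an integer code."""
    return [round(max(-1.0, min(1.0, x)) * levels) for x in vector]


def dequantize(codes, levels=127):
    """Map integer codes back to approximate floating-point coordinates."""
    return [c / levels for c in codes]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


a = [0.12, -0.53, 0.77, 0.05]
b = [0.10, -0.48, 0.80, -0.02]

exact = dot(a, b)
approx = dot(dequantize(quantize(a)), dequantize(quantize(b)))

# The two scores should be close but generally not identical.
print(f"exact={exact:.4f} approx={approx:.4f}")
```

Each integer code fits in a single byte instead of the four or eight bytes of a float, so the trade is a slightly noisier score for a much smaller index.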
“We are constantly grappling with the edges of physics and computer science and math and how computers fundamentally work to make the impossible possible,” Edo says. “As an engineer myself, if something always does what I want in a predictable way and I don’t have to worry about it, I sleep better at night.”
Through our investments in data and developer infrastructure tools like Alteryx, Collibra, Datadog, Dataiku, dbt Labs, Fivetran, GitLab, HashiCorp, Hightouch, Monte Carlo, and Snowflake, the ICONIQ Growth team has developed an appreciation for the major tectonic shift underway as the world reorients its strategic priorities toward new AI-powered capabilities that leverage large estates of data.
Pinecone’s innovations enable people at the forefront of AI applications to design and run ambitious, creative programs at scale. Having talked extensively to customers and industry experts, we are convinced that vector databases will become the cornerstone of every AI-powered application. We are excited to support Pinecone and partner with Edo, Bob and the entire team as Pinecone becomes a foundational element of many of the world’s most magical AI experiences to come.
Published: April 27, 2023