First-of-its-kind Pinecone Knowledge Platform to Power Best-in-class Retrieval for Customers

Press Releases

Dec 03, 2024

Industry-leading vector database capabilities combined with proprietary AI models to help developers build up to 48% more accurate AI applications, faster and more easily

Pinecone recognized as AWS GenAI Innovator Partner of the Year

LAS VEGAS, Dec. 3, 2024 /PRNewswire/ — With its vector database at the core, Pinecone, the leading knowledge platform for building accurate, secure, and scalable artificial intelligence (AI) applications, has announced industry-first integrated inference capabilities. These include fully-managed embedding and reranking models, along with a novel approach to sparse embedding retrieval. By combining these innovations with Pinecone’s proven dense retrieval capabilities, the platform delivers an approach to cascading retrieval that defines a new standard for AI-powered solutions.

New proprietary reranking and embedding models, together with third-party models such as Cohere’s Rerank 3.5, give customers quick, easy access to high-quality retrieval and significantly streamline the development of grounded AI applications.

“Our goal at Pinecone has always been to make it as easy as possible for developers to build production-ready knowledgeable AI applications quickly and at scale,” said Edo Liberty, founder and CEO of Pinecone. “By adding built-in and fully-managed inference capabilities directly into our vector database, as well as new retrieval functionality, we’re not only simplifying the development process but also dramatically improving the performance and accuracy of AI-powered solutions.”

Pinecone’s composable platform now includes the following updates:

  • pinecone-rerank-v0 proprietary reranking model
  • pinecone-sparse-english-v0 proprietary sparse embedding model
  • New sparse vector index type
  • Integration of Cohere’s Rerank 3.5 model
  • New security features, including role-based access controls (RBAC), audit logs, customer-managed encryption keys (CMEK), and the general availability (GA) of Private Endpoints for AWS PrivateLink

Advancing the state of the art for retrieval

High-quality retrieval is key to delivering the best user experience in AI search and retrieval-augmented generation (RAG) applications. Pinecone’s research shows that state-of-the-art performance requires combining three key components:

  • Dense vector retrieval to capture deep semantic similarities
  • Fast and precise sparse retrieval for keyword and entity search using a proprietary sparse indexing algorithm
  • Best-in-class reranking models to combine dense and sparse results and maximize relevance

By combining sparse retrieval, dense retrieval, and reranking within Pinecone, developers can create end-to-end retrieval systems that deliver up to 48% better performance, and 24% better on average, than dense or sparse retrieval alone.
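As a rough local illustration of the cascade described above, the sketch below fuses dense and sparse candidate lists and then reranks the merged pool. Reciprocal rank fusion (a standard technique, not necessarily Pinecone’s internal method) stands in for the fusion step, and a toy scoring function stands in for a reranking model such as pinecone-rerank-v0.

```python
# Illustrative sketch of cascading retrieval: fuse dense and sparse
# candidate lists, then rerank the merged pool. RRF and the toy scorer
# are stand-ins, not Pinecone's actual implementation.

def rrf_fuse(dense_ids, sparse_ids, k=60):
    """Merge two ranked lists of document ids with reciprocal rank fusion."""
    scores = {}
    for ranking in (dense_ids, sparse_ids):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def rerank(query, candidates, score_fn):
    """Order candidates by a relevance score (a model call in practice)."""
    return sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)

# Toy example: the two retrievers disagree; fusion surfaces documents
# found by both retrievers ahead of documents found by only one.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_c", "doc_d", "doc_a"]
fused = rrf_fuse(dense, sparse)
```

In a production cascade, the fused candidate pool would then be passed to a dedicated reranking model rather than a heuristic scorer.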

“With the advent of GenAI, we knew we could challenge the status quo in talent acquisition by building an experience focused on the job seeker rather than the hiring company,” said Alex Bowcut, CTO of Hyperleap. “With Pinecone, we’ve seen 40% better click-through rates for the job matches we deliver with search results using their semantic retrieval as opposed to traditional full-text search. Now, with the addition of sparse vector retrieval to Pinecone’s proven natural language search capabilities, we’re excited to explore how we can bring deeper personalization to people looking for work.”

Pinecone proprietary models

With the introduction of its first proprietary models, Pinecone is making it easier for developers to build knowledgeable AI.

Natively integrated into Pinecone’s platform, these models simplify the development of production-ready AI applications.

AI search simplified with integrated inference

With the release of Pinecone’s integrated inference capability, engineers can now develop state-of-the-art applications without the burden of managing model hosting, integration, or infrastructure. Because these capabilities sit behind a single API, developers can seamlessly access top embedding and reranking models hosted on Pinecone’s infrastructure, without vectors or data being routed through multiple providers. This consolidation not only simplifies development but also enhances security and efficiency.
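A hypothetical end-to-end flow behind that single API might look like the following, based on the Pinecone Python SDK’s inference methods. The embedding model name, index layout, and metadata field are illustrative assumptions; consult the SDK documentation for exact signatures.

```python
# Hypothetical sketch: embed, retrieve, and rerank through one API,
# using the Pinecone Python SDK. Model names, the index layout, and the
# "text" metadata field are assumptions for illustration.
try:
    from pinecone import Pinecone
except ImportError:  # SDK not installed; sketch only
    Pinecone = None

def search(api_key, index_name, query, top_k=20, top_n=5):
    pc = Pinecone(api_key=api_key)

    # 1. Embed the query with a Pinecone-hosted model (no separate service).
    embedding = pc.inference.embed(
        model="multilingual-e5-large",      # assumed hosted embedding model
        inputs=[query],
        parameters={"input_type": "query"},
    )

    # 2. Retrieve candidate documents from the vector index.
    index = pc.Index(index_name)
    matches = index.query(
        vector=embedding[0].values, top_k=top_k, include_metadata=True
    )["matches"]
    docs = [m["metadata"]["text"] for m in matches]  # assumed "text" field

    # 3. Rerank candidates with Pinecone's hosted reranking model.
    return pc.inference.rerank(
        model="pinecone-rerank-v0",
        query=query,
        documents=docs,
        top_n=top_n,
    )
```

Because the embedding, query, and rerank calls all go through one client, no intermediate vectors leave Pinecone’s infrastructure.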

“Pinecone’s new integrated inference capabilities are a game-changer for us,” said Isaac Pohl-Zaretsky, CTO & Co-Founder at Pocus. “The ability to have embedding, reranking, and retrieval all within the same environment not only streamlines our workflows but also powers our AI solutions with minimal latency, less technical debt, and improved performance. Pinecone was already helping us deliver tremendous value with precise signals to power our customers’ go-to-market efforts, and now with their unique platform we’re thrilled to be able to deliver even more.”

Greater choice with Cohere Rerank

As part of Pinecone’s expanding inference capabilities, we’ve collaborated with Cohere to host cohere-rerank-v3.5 natively within the Pinecone platform. This allows customers to easily select and use cohere-rerank-v3.5 directly from the Pinecone API to enhance the relevance of their search results. Rerank 3.5 excels at understanding complex business information across languages, making it optimal for global organizations in sectors like finance, healthcare, the public sector, and more. By incorporating Cohere’s latest industry-leading reranking model, developers can further refine search outputs, ensuring more accurate and contextually relevant responses for their applications.
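Selecting the Cohere model could be as simple as swapping the model identifier in the same rerank call. The exact identifier used below is an assumption; check Pinecone’s model catalog for the published name.

```python
# Hypothetical: choosing Cohere's hosted reranker through the same
# Pinecone inference API. The model id "cohere-rerank-3.5" is an
# assumption for illustration.
try:
    from pinecone import Pinecone
except ImportError:  # SDK not installed; sketch only
    Pinecone = None

def rerank_with_cohere(api_key, query, documents, top_n=3):
    pc = Pinecone(api_key=api_key)
    return pc.inference.rerank(
        model="cohere-rerank-3.5",   # Cohere Rerank 3.5, hosted by Pinecone
        query=query,
        documents=documents,
        top_n=top_n,
    )
```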

Enhanced security for mission-critical workloads

Pinecone’s database is built for production, which means the security of customer workloads is paramount. The following advancements further strengthen Pinecone’s commitment to enterprise-grade security and compliance:

  • More granular role-based access controls (RBAC) let users set API key roles for control and data plane operations
  • Customer-managed encryption keys (CMEK) enable users to control their own data encryption and enhance tenant isolation
  • Audit logs for control plane activities (e.g. index creation or deletion) via Amazon Simple Storage Service (Amazon S3) endpoints
  • Support for AWS PrivateLink is now generally available (GA) for serverless indexes

Unlocking more with AWS

Pinecone is the recipient of the 2024 AWS GenAI Innovator Partner of the Year award, which recognizes Pinecone’s leadership in advancing the services, tools, and infrastructure pivotal to implementing generative AI technologies.

Pinecone’s AWS Generative AI Competency acknowledges the company as an expert generative AI solution provider that creates value and drives business growth for customers. Customers can leverage Amazon Bedrock Knowledge Bases with Pinecone to build more effectively with AI and reduce operational complexity and costs. Specifically, Knowledge Bases for Amazon Bedrock provides “one click” integration with Pinecone, fully automating the ingestion, embedding, and querying of customer data as part of the LLM generation process. This seamless flow provides a scalable foundation for AI innovation, enabling faster time-to-value and more grounded, production-grade AI applications. Furthermore, customers using Amazon Bedrock Knowledge Bases with Pinecone can now run RAG evaluations natively in Amazon Bedrock instead of having to connect third-party tools.

Creating new possibilities with knowledgeable AI

As the first AI infrastructure company to provide a single platform for inference, retrieval, and knowledge base management, Pinecone is setting a new standard in the industry. This integrated approach is expected to lead to significant performance improvements and open up new possibilities for AI application development.

Customers can access Pinecone through the AWS Marketplace to fast-track procurement, accelerate deployment, and optimize costs to quickly and easily drive better outcomes with knowledgeable AI. Developers can also get started for free on the Pinecone console.

About Pinecone

Pinecone’s mission is to make AI knowledgeable. With its vector database at the core, Pinecone is the leading knowledge platform for building accurate, secure, and scalable AI applications. More than 5000 customers across various industries have shipped AI applications faster and more confidently with Pinecone’s developer-friendly technology. Pinecone has raised $138M in funding from leading investors Andreessen Horowitz, ICONIQ Growth, Menlo Ventures, and Wing Venture Capital, and operates in New York, San Francisco, and Tel Aviv.

Media Contact

Mike Sefanov
mike.s@pinecone.io
Director, Communications

Photo – https://mma.prnewswire.com/media/2572068/Pinecone_integrated_inference.jpg
Logo – https://mma.prnewswire.com/media/2418074/Pinecone_Systems_Inc_Logo.jpg

View original content to download multimedia: https://www.prnewswire.com/news-releases/first-of-its-kind-pinecone-knowledge-platform-to-power-best-in-class-retrieval-for-customers-302320811.html

SOURCE Pinecone Systems Inc
