What Is a Vector Database?

A vector database is designed to store, manage and index massive quantities of high-dimensional vector data efficiently.

Interest in vector databases is growing rapidly as organizations look to create additional value for generative artificial intelligence (AI) use cases and applications. According to Gartner, by 2026, more than 30 percent of enterprises will have adopted vector databases to ground their foundation models with relevant business data.¹

Unlike traditional relational databases with rows and columns, a vector database represents data points as vectors with a fixed number of dimensions, clustered based on similarity. This design enables low-latency queries, making vector databases ideal for AI-driven applications.

Vector databases versus traditional databases

The nature of data has undergone a profound transformation. It's no longer confined to structured information easily stored in traditional databases. Unstructured data, comprising social media posts, images, videos, audio clips and more, is growing 30 to 60 percent year over year.² Loading unstructured data sources into a traditional relational database to store, manage and prepare them for AI is typically labor-intensive and far from efficient, especially for new generative use cases such as similarity search. Relational databases are great for managing structured and semi-structured datasets in specific formats, while vector databases are best suited for unstructured datasets, which they represent through high-dimensional vector embeddings.

What are vectors?

Enter vectors. Vectors are arrays of numbers that can represent complex objects like words, images, videos and audio, generated by a machine learning (ML) model. High-dimensional vector data is essential to machine learning, natural language processing (NLP) and other AI tasks. Some examples of vector data include:

Text: Think about the last time you interacted with a chatbot. How does it understand natural language? It relies on vectors, produced by machine learning algorithms, that can represent words, paragraphs and entire documents.
Images: Image pixels can be described by numerical data and combined to make up a high-dimensional vector for that image.
Speech/Audio: Like images, sound waves can also be broken down into numerical data and represented as vectors, enabling AI applications such as voice recognition.
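
To make the image example above concrete, here is a minimal Python sketch that turns a tiny grayscale image into a vector. The pixel values are made up for illustration, and real systems usually pass images through an ML model rather than using raw pixels directly.

```python
# A 2x3 grayscale "image" with pixel intensities from 0 to 255 (made-up values).
image = [
    [0, 128, 255],
    [64, 32, 16],
]

# Flatten the image row by row and scale each pixel to the range [0, 1].
image_vector = [pixel / 255 for row in image for pixel in row]

print(len(image_vector))  # one dimension per pixel
```

Even this 2x3 image yields a six-dimensional vector, which hints at why vectors for realistic images are high-dimensional.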

What are vector embeddings?

The volume of unstructured datasets your organization needs for AI will only continue to grow, so how do you handle millions of vectors? This is where vector embeddings and vector databases come into play. Vector embeddings are numerical representations of data that group items based on semantic meaning or similar features across virtually any data type. They are generated by embedding models, which are specialized to convert your data into points in a continuous, multi-dimensional space known as an embedding space. Vector databases serve to store and index the output of an embedding model.

For example, take the words “car” and “vehicle.” They both have similar meanings even though they are spelled differently. For an AI application to enable effective semantic search, vector representations of “car” and “vehicle” must capture their semantic similarity. When it comes to machine learning, embeddings represent high-dimensional vectors that encode this semantic information. These vector embeddings are the backbone of recommendations, chatbots and generative apps like ChatGPT.  
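
As a rough illustration of the "car" and "vehicle" example, the following Python sketch compares three hand-made vectors using cosine similarity. The vector values are invented for this example and are not the output of any real embedding model, which would typically produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings (assumed values, not produced by a real model).
embeddings = {
    "car": [0.90, 0.80, 0.10],
    "vehicle": [0.85, 0.75, 0.20],
    "banana": [0.10, 0.20, 0.95],
}

car_vs_vehicle = cosine_similarity(embeddings["car"], embeddings["vehicle"])
car_vs_banana = cosine_similarity(embeddings["car"], embeddings["banana"])

print(f"car vs vehicle: {car_vs_vehicle:.3f}")
print(f"car vs banana:  {car_vs_banana:.3f}")
```

With these assumed values, "car" scores much closer to "vehicle" than to "banana", which is the property a semantic search application depends on.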

Vector database versus graph database

Knowledge graphs represent a network of entities, such as objects or events, and depict the relationships between them. A graph database is a fit-for-purpose database for storing knowledge graph information and visualizing it as a graph structure. Graph databases are built on nodes and edges that represent the known entities and the complex relationships between them, while vector databases are built on high-dimensional vectors. As a result, graph databases are preferred for processing complex relationships between data points, while vector databases are better for handling different forms of data such as images or videos.

How vector embeddings and vector databases work

Enterprise data can be fed into an embedding model, such as IBM’s watsonx.ai models or models from Hugging Face, which is specialized to convert your data into embeddings by transforming complex, high-dimensional data into numerical forms that computers can understand. These embeddings represent the attributes of your data used in AI tasks such as classification and anomaly detection.

Vector storage

Vector databases store the output of an embedding model algorithm, the vector embeddings. They also store each vector’s metadata, which can be queried using metadata filters. By ingesting and storing these embeddings, the database can then facilitate fast retrieval of a similarity search, matching the user’s prompt with a similar vector embedding.
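
The storage model described above can be sketched in a few lines of Python. The `InMemoryVectorStore` class, its method names and the sample records are all hypothetical, chosen only to show embeddings stored next to metadata that can be filtered. A real vector database would add persistence, indexing and a query language.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """One stored item: an ID, its embedding and free-form metadata."""
    record_id: str
    embedding: list
    metadata: dict = field(default_factory=dict)

class InMemoryVectorStore:
    """Hypothetical toy store; real vector databases persist and index data."""

    def __init__(self):
        self._records = {}

    def upsert(self, record_id, embedding, metadata=None):
        """Insert a new record or overwrite the one with the same ID."""
        self._records[record_id] = Record(
            record_id, list(embedding), dict(metadata or {})
        )

    def filter(self, **conditions):
        """Return records whose metadata matches every key=value condition."""
        return [
            record for record in self._records.values()
            if all(record.metadata.get(k) == v for k, v in conditions.items())
        ]

store = InMemoryVectorStore()
store.upsert("doc-1", [0.1, 0.9], {"source": "handbook", "year": 2023})
store.upsert("doc-2", [0.8, 0.2], {"source": "faq", "year": 2023})
store.upsert("doc-3", [0.7, 0.3], {"source": "faq", "year": 2022})

# Metadata filter: only FAQ entries from 2023.
faq_2023 = store.filter(source="faq", year=2023)
print([record.record_id for record in faq_2023])
```

In practice, a metadata filter like this is often combined with a similarity search so that only matching records are ranked.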

Vector indexing

Storing data as embeddings isn't enough; the vectors also need to be indexed to accelerate the search process. A vector database builds an index over its vector embeddings, typically using a machine learning algorithm. Indexing maps the vectors to new data structures that enable faster similarity or distance searches, such as nearest neighbor search between vectors.
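
One way to picture such a data structure is a bucket-based index, where each vector is filed under its nearest centroid and a query scans only one bucket. The Python sketch below is a simplified illustration of that idea, not the indexing method of any particular product. The `BucketIndex` class is hypothetical, and the centroids are supplied by hand, whereas a real system would usually learn them from the data.

```python
import math

def euclidean(a, b):
    """Return the Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class BucketIndex:
    """Toy inverted-file style index: each vector is filed under its nearest centroid."""

    def __init__(self, centroids):
        # Centroids are assumed to be given; real systems typically learn them.
        self.centroids = centroids
        self.buckets = {i: [] for i in range(len(centroids))}

    def _nearest_centroid(self, vector):
        return min(
            range(len(self.centroids)),
            key=lambda i: euclidean(vector, self.centroids[i]),
        )

    def add(self, item_id, vector):
        self.buckets[self._nearest_centroid(vector)].append((item_id, vector))

    def search(self, query, k=1):
        """Approximate search: scan only the bucket closest to the query."""
        candidates = self.buckets[self._nearest_centroid(query)]
        ranked = sorted(candidates, key=lambda item: euclidean(query, item[1]))
        return [item_id for item_id, _ in ranked[:k]]

index = BucketIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [0.5, 0.2])
index.add("b", [1.0, 1.5])
index.add("c", [9.0, 9.5])
index.add("d", [10.5, 11.0])

nearest = index.search([9.2, 9.4], k=2)
print(nearest)
```

The trade-off is that scanning a single bucket is faster than scanning everything but can miss a true neighbor that sits just across a bucket boundary, which is why this style of search is called approximate.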

Similarity search based on querying or prompting

Vectors are queried by measuring the distance between them, using algorithms such as nearest neighbor search. The measurement is based on a similarity metric, such as cosine similarity, that the index uses to determine how close or distant the vectors are. When a user queries or prompts an AI model, an embedding of the query is computed using the same embedding model that produced the stored vectors. The database then calculates distances and performs similarity calculations between the query vector and the vectors stored in the index, and returns the most similar vectors, or nearest neighbors, according to the similarity ranking. These calculations support various machine learning tasks such as recommendation systems, semantic search, image recognition and natural language processing.
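
The query flow above can be sketched as an exact (brute-force) nearest neighbor search in Python. The stored vectors and the query vector are made-up values standing in for real embeddings, and `top_k` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Return the k (id, score) pairs with the highest cosine similarity to the query."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Stored embeddings (assumed values for illustration).
stored = {
    "sports-article": [0.9, 0.1, 0.0],
    "cooking-recipe": [0.1, 0.9, 0.1],
    "match-report": [0.8, 0.2, 0.1],
}

query_vector = [1.0, 0.0, 0.0]  # stands in for the embedded user query
results = top_k(query_vector, stored, k=2)

for vec_id, score in results:
    print(f"{vec_id}: {score:.3f}")
```

A brute-force scan like this compares the query with every stored vector, so it is exact but slow at scale. An index trades some accuracy for speed by comparing against only a subset.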

Vector databases and retrieval augmented generation (RAG)

Enterprises are increasingly favoring the retrieval augmented generation (RAG) approach in generative AI workflows for its faster time-to-market, efficient inference and reliable output, particularly in key use cases such as customer care and HR/Talent. RAG ensures that the model is linked to the most current, reliable facts and that users have access to the model’s sources, so that its claims can be checked for accuracy. RAG is core to our ability to anchor large language models in trusted data to reduce model hallucinations. This approach relies on high-dimensional vector data to enrich prompts with semantically relevant information for in-context learning by foundation models. It requires effective storage and retrieval during the inference stage, which handles the highest volume of data. Vector databases excel at efficiently indexing, storing and retrieving these high-dimensional vectors, providing the speed, precision and scale needed for applications like recommendation engines and chatbots.
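
The retrieval step of RAG can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `embed` is a hypothetical stand-in that counts words instead of calling a real embedding model, the documents are invented, and the final call to a language model is left out.

```python
import math
from collections import Counter

def embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

documents = [
    "employees accrue 20 vacation days per year",
    "the office is closed on public holidays",
    "expense reports are due by the fifth of each month",
]

# Ingestion: embed every document once and keep the vector alongside the text.
doc_index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    """Return the k documents most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(
        doc_index,
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, k=1):
    """Enrich the prompt with retrieved context before it is sent to a model."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("how many vacation days do employees get per year")
print(prompt)
```

Because the retrieved text travels with the prompt, the source of an answer can be shown to the user, which is what allows claims to be checked.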

Advantages of vector databases

While it’s clear that vector database functionality is rapidly growing in interest and adoption to enhance enterprise AI-based applications, the following benefits have also demonstrated business value for adopters:

Speed and performance: Vector databases use various indexing techniques to enable faster searching. Vector indexing, along with distance-calculating algorithms such as nearest neighbor search, is particularly helpful for finding relevant results across millions if not billions of data points, with optimized performance.

Scalability: Vector databases can store and manage massive amounts of unstructured data by scaling horizontally, maintaining performance as query demands and data volumes increase.

Cost of ownership: Vector databases are a valuable alternative to training foundation models from scratch or fine-tuning them. This lowers the cost and improves the speed of inferencing with foundation models.

Flexibility: Whether you have images, videos or other multi-dimensional data, vector databases are built to handle complexity. Given the multiple use cases ranging from semantic search to conversational AI applications, the use of vector databases can be customized to meet your business and AI requirements.

Long-term memory for LLMs: Organizations can start with general-purpose models like IBM watsonx.ai’s Granite series models, Meta's Llama 2 or Google's Flan models, and then provide their own data in a vector database to enhance the output of those models and AI applications, which is critical to retrieval augmented generation.

Data management components: Vector databases also typically provide built-in features to easily update and insert new unstructured data.

Considerations for vector databases and your data strategy

There is a breadth of options when it comes to choosing a vector database capability to meet your organization’s data and AI needs.

Types of vector databases

There are a few alternatives to choose from.

Standalone, proprietary vector databases such as Pinecone
Open-source solutions such as Weaviate or Milvus, which provide built-in RESTful APIs and support for the Python and Java programming languages
Platforms with vector database capabilities integrated, coming soon to IBM watsonx.data
Vector database/search extensions such as PostgreSQL’s open source pgvector extension, providing vector similarity search capabilities

Integration with your data ecosystem

Vector databases should not be considered standalone capabilities, but rather part of your broader data and AI ecosystem. Many offer APIs or native extensions, or can be integrated with your existing databases. Since they are built to leverage your own enterprise data to enhance your models, you must also have proper data governance and security in place to ensure that the data with which you are grounding these LLMs can be trusted.

This is where a trusted data foundation plays an important role in AI, and that starts with your data and how it’s stored, managed and governed before being used for AI. Central to this is a data lakehouse, one that is open, hybrid and governed, such as IBM watsonx.data, part of the watsonx AI data platform that fits seamlessly into a data fabric architecture. For example, IBM watsonx.data is built to access, catalog, govern and transform all of your structured, semi-structured and unstructured data and metadata. You can then leverage this governed data and watsonx.data’s integrated vector database capabilities (tech preview Q4, 2023) for machine learning and generative AI use cases.

When vector indexing is not optimal

Using a vector store and index is well suited for applications that are based on facts or fact-based querying, for example, asking about a company’s legal terms last year or extracting specific information from complex documents. The retrieval context you get back consists of the pieces of data that are most semantically similar to your query, as measured by embedding distance. However, getting a summary of topics doesn’t lend itself well to a vector index, because a vector index fetches only the most relevant data, whereas a summary requires the LLM to go through all of the different possible contexts on that topic within your data. In this case, you may use a different kind of index, such as a list index.
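
The difference between the two retrieval styles can be sketched in Python. The function names, the sample text and the two-dimensional vectors are hypothetical, and the dot product is used as the similarity score purely to keep the example short.

```python
def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

# Each entry pairs a piece of text with an assumed, hand-made embedding.
chunks = [
    ("Q1 revenue grew 5 percent.", [0.9, 0.1]),
    ("Q2 revenue grew 7 percent.", [0.8, 0.2]),
    ("The company opened a new office.", [0.1, 0.9]),
]

def vector_index_retrieve(query_vec, k=1):
    """Return only the k pieces of text closest to the query embedding."""
    ranked = sorted(chunks, key=lambda chunk: dot(query_vec, chunk[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def list_index_retrieve():
    """Return every piece of text in stored order, for example as input to a summary."""
    return [text for text, _ in chunks]

fact_context = vector_index_retrieve([1.0, 0.0], k=1)
summary_context = list_index_retrieve()

print(fact_context)
print(len(summary_context))
```

The vector-style lookup hands the model one highly relevant passage, while the list-style lookup hands it everything, which costs more to process but leaves nothing out of a summary.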

Use cases of vector databases

The applications of vector databases are vast and growing. Some key use cases include:

Semantic search: Perform searches based on the meaning or context of a query, enabling more precise and relevant results. Because phrases as well as individual words can be represented as vectors, semantic vector search captures user intent better than keyword matching alone.

Similarity search and applications: Find similar images, text, audio or video data with ease, for content retrieval including advanced image and speech recognition, natural language processing and more.

Recommendation engines: E-commerce sites, for instance, can use vector databases and vectors to represent customer preferences and product attributes. This enables them to suggest items similar to past purchases based on vector similarity, enhancing user experience and increasing retention.

Conversational AI: Improve virtual agent interactions by enhancing the ability to parse relevant knowledge bases efficiently and accurately, providing real-time contextual answers to user queries along with the source documents and page numbers for reference.

Vector database capabilities

watsonx.ai

A next generation enterprise studio for AI builders to build, train, validate, tune and deploy both traditional machine learning and new generative AI capabilities powered by foundation models. Build a Q&A resource from a broad internal or external knowledge base with the help of AI tasks in watsonx.ai, such as retrieval augmented generation.

watsonx.data

A fit-for-purpose data store built on an open data lakehouse architecture to scale AI workloads, for all your data, anywhere. Store, query and search vector embeddings in watsonx.data with integrated vector capabilities (planned tech preview Q4 2023).

IBM Cloud® Databases for PostgreSQL

Our PostgreSQL database-as-a-service offering lets teams spend more time building, with high availability, backup orchestration, point-in-time recovery (PITR) and read replicas handled with ease. PostgreSQL offers pgvector, an open-source vector extension that will be configurable with IBM Cloud PostgreSQL extensions (coming soon), providing vector similarity search capabilities.

IBM Cloud Databases for Elasticsearch

Our Elasticsearch database-as-a-service comes with a full-text search engine, which makes it the perfect home for your unstructured text data. Elasticsearch also supports various forms of semantic similarity search. It supports dense vectors for exact nearest neighbor search, and it also provides built-in AI models to compute sparse vectors and conduct advanced similarity search.

Footnotes

¹ Gartner Innovation Insight: Vector Databases (requires Gartner account), Gartner

² Gartner 2022 Strategic Roadmap for Storage (requires Gartner account), Gartner