Chapter 5 Quiz: Embeddings and Vector Databases
Test your understanding of embeddings and vector databases covered in this chapter.
Question 1
What is a word embedding?
- A. A compressed file containing text
- B. A dense vector representation of a word that captures its semantic meaning
- C. A dictionary definition of a word
- D. A list of synonyms for a word
Show Answer
The correct answer is B.
A word embedding is a dense vector representation of a word that captures its semantic meaning in a continuous vector space. Words with similar meanings have similar vector representations. Option A describes file compression, option C describes a definition, and option D describes a thesaurus.
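To make this concrete, here is a minimal sketch that encodes a few words into dense vectors and compares their distances. It assumes the sentence-transformers package is installed and uses the "all-MiniLM-L6-v2" model as one commonly available choice; any embedding model would illustrate the same point.

```python
# A minimal sketch, assuming the sentence-transformers package and the
# "all-MiniLM-L6-v2" model (one commonly available choice) are installed.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["king", "queen", "banana"])

print(vectors.shape)  # (3, 384): each word becomes a 384-dimensional dense vector
print(np.linalg.norm(vectors[0] - vectors[1]))  # "king" vs "queen": smaller distance
print(np.linalg.norm(vectors[0] - vectors[2]))  # "king" vs "banana": larger distance
```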
Question 2
What is an embedding vector?
- A. A one-dimensional list of words
- B. A multi-dimensional numerical representation that captures semantic relationships
- C. A binary encoding of text
- D. A hash code for fast lookup
Show Answer
The correct answer is B.
An embedding vector is a multi-dimensional numerical representation (typically hundreds of dimensions) that captures semantic relationships. Points that are close together in this vector space represent similar concepts. Option A describes a flat word list with no semantic structure, option C doesn't capture semantics, and option D describes hashing rather than embeddings.
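The toy example below uses made-up 4-dimensional vectors purely for illustration (real embeddings typically have hundreds of dimensions) to show how closeness in the vector space corresponds to similarity of meaning.

```python
import numpy as np

# Toy, made-up 4-dimensional "embeddings" purely for illustration;
# real embedding vectors typically have hundreds of dimensions.
cat = np.array([0.9, 0.8, 0.1, 0.0])
dog = np.array([0.85, 0.75, 0.15, 0.05])
car = np.array([0.1, 0.0, 0.9, 0.8])

# Points that are close together in the vector space represent similar concepts.
print(np.linalg.norm(cat - dog))  # small distance: related concepts
print(np.linalg.norm(cat - car))  # large distance: unrelated concept
```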
Question 3
What is the primary purpose of a vector database?
- A. To store relational data with SQL
- B. To efficiently store and retrieve high-dimensional vectors with similarity search
- C. To compress text files
- D. To execute JavaScript code
Show Answer
The correct answer is B.
A vector database is specifically designed to efficiently store and retrieve high-dimensional vectors and to perform similarity searches. Unlike traditional databases, vector databases are optimized for finding the nearest neighbors to a query vector. Option A describes relational databases, option C describes compression utilities, and option D describes JavaScript engines.
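To make "finding the nearest neighbors to a query vector" concrete, here is a brute-force sketch with NumPy using random placeholder vectors. Production vector databases avoid this linear scan by building approximate nearest-neighbor indexes (such as HNSW or IVF) so the search scales to millions of vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
stored = rng.random((10_000, 384))   # placeholder "document" embeddings
query = rng.random(384)              # placeholder query embedding

# Brute-force nearest-neighbor search: compute the distance to every stored
# vector, then take the k smallest. Vector databases index the vectors so
# they can answer this without scanning everything.
distances = np.linalg.norm(stored - query, axis=1)
k = 5
nearest_ids = np.argsort(distances)[:k]
print(nearest_ids, distances[nearest_ids])
```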
Question 4
Which of the following is an open-source library for similarity search developed by Facebook AI?
- A. Pinecone
- B. Weaviate
- C. FAISS
- D. MongoDB
Show Answer
The correct answer is C.
FAISS (Facebook AI Similarity Search) is an open-source library developed by Meta/Facebook for efficient similarity search and clustering of dense vectors. Pinecone (option A) and Weaviate (option B) are vector databases but not specifically the Facebook library. MongoDB (option D) is a traditional document database.
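A small FAISS sketch is shown below. It assumes the faiss-cpu package is installed and uses random vectors as stand-ins for real embeddings; the exact flat index gives a simple starting point before moving to FAISS's approximate indexes.

```python
import faiss
import numpy as np

d = 384                                              # embedding dimensionality
xb = np.random.random((1000, d)).astype("float32")   # placeholder "document" vectors
xq = np.random.random((3, d)).astype("float32")      # placeholder query vectors

index = faiss.IndexFlatL2(d)     # exact (brute-force) L2 index
index.add(xb)                    # store the vectors
D, I = index.search(xq, 4)       # distances and ids of the 4 nearest neighbors
print(I)
```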
Question 5
What is a vector store?
- A. A retail shop that sells vectors
- B. A system for storing and managing embedding vectors
- C. A type of CPU cache
- D. A cloud storage service for files
Show Answer
The correct answer is B.
A vector store is a system for storing and managing embedding vectors, often used interchangeably with vector database. It provides the infrastructure for storing vectors and performing similarity searches. Option A is a play on words, option C describes hardware, and option D describes general file storage.
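The sketch below is a deliberately minimal in-memory vector store built on NumPy only. It is meant to show the core add/search interface a vector store provides, not to replace a real vector database.

```python
import numpy as np

class InMemoryVectorStore:
    """A toy vector store: keeps vectors in memory and does brute-force search."""

    def __init__(self):
        self.ids, self.vectors = [], []

    def add(self, doc_id, vector):
        # Store the id alongside its embedding vector.
        self.ids.append(doc_id)
        self.vectors.append(np.asarray(vector, dtype=float))

    def search(self, query, k=3):
        # Rank stored vectors by cosine similarity to the query.
        matrix = np.vstack(self.vectors)
        query = np.asarray(query, dtype=float)
        sims = matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))
        top = np.argsort(-sims)[:k]
        return [(self.ids[i], float(sims[i])) for i in top]

store = InMemoryVectorStore()
store.add("doc-1", [0.9, 0.1, 0.0])
store.add("doc-2", [0.1, 0.9, 0.2])
print(store.search([0.8, 0.2, 0.1], k=1))  # [('doc-1', ...)]
```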
Question 6
Which of the following is a cloud-based vector database service?
- A. MySQL
- B. PostgreSQL
- C. Pinecone
- D. SQLite
Show Answer
The correct answer is C.
Pinecone is a cloud-based vector database service designed specifically for storing and searching embeddings at scale. MySQL (option A), PostgreSQL (option B), and SQLite (option D) are traditional relational databases, though PostgreSQL can support vector similarity search through extensions such as pgvector.
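For flavor, a hedged sketch of calling Pinecone from Python follows. It assumes a v3-style pinecone client, a hypothetical pre-created index named "example-index", placeholder credentials, and placeholder vectors; method names and signatures can differ between client versions.

```python
# A sketch only: assumes the "pinecone" Python client (v3-style API), an
# existing index named "example-index" (hypothetical), and 384-dimensional
# placeholder vectors. Signatures may differ across client versions.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")   # hypothetical credentials
index = pc.Index("example-index")       # hypothetical, pre-created index

index.upsert(vectors=[
    {"id": "doc-1", "values": [0.1] * 384, "metadata": {"text": "hello"}},
])

results = index.query(vector=[0.1] * 384, top_k=3, include_metadata=True)
print(results)
```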
Question 7
What is Weaviate?
- A. A text editor
- B. An open-source vector database with GraphQL and RESTful APIs
- C. A data compression algorithm
- D. A programming language
Show Answer
The correct answer is B.
Weaviate is an open-source vector database that provides GraphQL and RESTful APIs for storing and searching vectors. It supports various AI models and can be self-hosted or used as a cloud service. Option A describes software like VSCode, option C describes algorithms like gzip, and option D describes languages like Python.
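Because Weaviate exposes a GraphQL endpoint, a similarity query can be posted to it over plain HTTP, as sketched below. The local URL, the "Document" class, and the tiny 3-dimensional vector are all assumptions made only for illustration; a real query vector would match the dimensionality configured for the class.

```python
# A sketch only: assumes a Weaviate instance at localhost:8080 with a
# hypothetical "Document" class. The 3-dimensional vector is just to keep
# the example short.
import requests

graphql_query = """
{
  Get {
    Document(nearVector: {vector: [0.1, 0.2, 0.3]}, limit: 3) {
      text
    }
  }
}
"""

response = requests.post(
    "http://localhost:8080/v1/graphql",
    json={"query": graphql_query},
)
print(response.json())
```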
Question 8
Why are embeddings important for semantic search?
- A. They make text shorter
- B. They represent meaning in a way that allows mathematical comparison of similarity
- C. They encrypt sensitive information
- D. They reduce storage costs
Show Answer
The correct answer is B.
Embeddings represent semantic meaning in a mathematical form (vectors) that allows us to compute similarity between texts using operations like cosine similarity. This enables semantic search to find conceptually similar content, not just keyword matches. Option A confuses embeddings with compression or summarization, option C relates to encryption, and option D is not their primary purpose.
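Concretely, cosine similarity is the dot product of two vectors divided by the product of their lengths. The toy vectors below are made up for illustration, to show that a query and a document can score as similar even when they share no keywords.

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); ranges from -1 to 1 for real vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up embeddings for illustration only.
query_vec = np.array([0.8, 0.1, 0.6])    # e.g. "How do I reset my password?"
doc_a_vec = np.array([0.7, 0.2, 0.65])   # e.g. "Steps to recover account access"
doc_b_vec = np.array([0.1, 0.9, 0.05])   # e.g. "Quarterly sales report"

print(cosine_similarity(query_vec, doc_a_vec))  # high: conceptually similar
print(cosine_similarity(query_vec, doc_b_vec))  # low: unrelated topic
```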
Question 9
What property of word embeddings allows us to perform analogies like "king - man + woman = queen"?
- A. Random distribution of vectors
- B. Semantic relationships encoded in vector arithmetic
- C. Alphabetical ordering
- D. Character length encoding
Show Answer
The correct answer is B.
Word embeddings encode semantic relationships in vector space such that meaningful vector arithmetic is possible. The difference between "king" and "man" captures the concept of royalty, which, when added to "woman", points to a vector close to "queen". This demonstrates that semantic relationships are encoded geometrically. Options A, C, and D don't explain this property.
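The analogy can be reproduced with pretrained word vectors. The sketch assumes the gensim package and network access to download one of its bundled GloVe models; the model name shown is just one available option.

```python
# A sketch only: assumes gensim is installed and the pretrained
# "glove-wiki-gigaword-50" vectors can be downloaded via gensim's downloader.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")

# most_similar computes: vector("king") - vector("man") + vector("woman")
# and returns the words nearest to the result.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# "queen" typically appears at or near the top of the results.
```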
Question 10
In a chatbot system using RAG, what role does a vector database play?
- A. It stores user passwords
- B. It stores document embeddings and retrieves relevant context based on query similarity
- C. It generates responses directly
- D. It handles user authentication
Show Answer
The correct answer is B.
In RAG (Retrieval-Augmented Generation) systems, a vector database stores embeddings of documents or knowledge chunks. When a user asks a question, the system converts the query to an embedding, searches the vector database for similar embeddings, and retrieves the relevant context to augment the LLM's response. Options A and D relate to security, and option C is incorrect as the LLM generates responses, not the database.
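Putting the pieces together, the sketch below wires a similarity-search step into a prompt. Here embed() is a hypothetical stand-in for any embedding model, and the document chunks and vectors are placeholders; in a real system the chunks would live in a vector database and the final prompt would be sent to an LLM.

```python
import numpy as np

# Hypothetical pre-computed document chunks and their embeddings; in a real
# system these would come from an embedding model and live in a vector database.
chunks = ["Refunds are issued within 14 days.", "Shipping takes 3-5 business days."]
chunk_vecs = np.array([[0.9, 0.1, 0.2], [0.1, 0.8, 0.3]])

def embed(text):
    # Hypothetical stand-in for a real embedding model call.
    return np.array([0.85, 0.15, 0.25]) if "refund" in text.lower() else np.array([0.2, 0.7, 0.4])

def retrieve(query, k=1):
    # Similarity search: the job the vector database performs in a RAG pipeline.
    q = embed(query)
    sims = chunk_vecs @ q / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(-sims)[:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this augmented prompt is what gets sent to the LLM
```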