Abstract
This thesis explores Vector Database Management Systems (VDBMS) as a solution for handling unstructured data by representing it as vector embeddings. We examine how embeddings are created, compare sparse and dense vector representations, and analyze state-of-the-art indexing techniques such as HNSW and IVF. We also discuss Approximate Nearest Neighbor (ANN) search and its role in scalable querying, along with filtering methods that enhance retrieval accuracy.
Beyond querying, we investigate multi-modal search, which integrates multiple data types (text, images, and audio) within a unified vector space. We also explore Retrieval-Augmented Generation (RAG), focusing on optimizing retrieval strategies to improve Large Language Model (LLM) responses. Our final experiment compares different RAG techniques, including knowledge graph integration, to refine the retrieved context and make responses more reliable.
By evaluating various VDBMS solutions and retrieval methodologies, this thesis aims to provide insights into the evolving landscape of vector-based data management and its applications in modern AI-driven systems.