Note: This entry is intended to help librarians and other information professionals learn about AI. It is not, in itself, meant as a promotion of AI.
Introduction
Vector-based searching and embedding models refer to techniques for retrieving documents, articles, web pages, or other textual content based on their similarity to a query.
Vector-based searching enables users to find relevant information even when the exact words or terms used within a given query are not present in the retrieved documents.
The first step in vector searching is translating text into vectors (strings of numbers) by processing it through an embedding model, a type of AI system related to large language models (LLMs). Such models are trained on vast amounts of text (for example, research papers, books, and web content).
During training, the model learns how words appear together and in what contexts. Over time, it builds a kind of map of language, where meanings cluster naturally. In medicine, words that often appear in similar contexts, such as doctor and physician, end up close together in this semantic map. Words that rarely co-occur or belong to very different contexts, like insulin and wheelchair, are far apart.
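The idea of a semantic map can be made concrete with a small sketch. The vectors below are invented for illustration (a real embedding model would learn hundreds of dimensions from a corpus), but they show how cosine similarity scores words that share contexts higher than words that do not:

```python
import math

# Toy 3-dimensional word vectors, hand-made for illustration only.
# A trained embedding model would produce these automatically.
vectors = {
    "doctor":     [0.90, 0.80, 0.10],
    "physician":  [0.85, 0.75, 0.15],
    "insulin":    [0.20, 0.90, 0.70],
    "wheelchair": [0.10, 0.20, 0.90],
}

def cosine(a, b):
    """Cosine similarity: close to 1.0 for vectors pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Words used in similar contexts sit close together in the map...
print(cosine(vectors["doctor"], vectors["physician"]))
# ...while words from different contexts sit further apart.
print(cosine(vectors["insulin"], vectors["wheelchair"]))
```

With these toy values, "doctor" and "physician" score noticeably higher than "insulin" and "wheelchair", mirroring the medical example in the text.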
As Tay (2025) puts it, "What we typically call 'semantic search' aims for retrieval based on 'meaning' rather than just matching keywords. This is achieved these days using embedding models."
What are embeddings, and how do they relate to AI searching?
Vector-based searching uses embeddings, which are numerical representations of data (text, images, audio, etc.), to find items based on their underlying meaning or context rather than exact keyword matches.
Embeddings are "high-dimensional numerical vectors" capturing semantic characteristics and relationships of data within a given corpus.
Machine learning models generate these vectors, placing semantically similar items close together in a multi-dimensional space (vector space). Embeddings for "cat," "dog," and "lion" would be closer to each other than to the embedding for "car," since all three are animals.
Support Vector Machines (SVMs) are a supervised learning method used in classification and suitable for some regression tasks.
In its basic form, the SVM algorithm is a binary linear classifier: it learns from labelled "training" points and draws a boundary that divides unseen data points into one of two categories.
SVMs helped to bridge traditional keyword searching and modern AI searching by learning how to classify and rank documents using mathematically rigorous boundaries—before “semantic searching” started its rise.
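The classification idea can be illustrated with a bare-bones linear SVM trained by sub-gradient descent on the hinge loss. The data points and labels below are invented for the sketch (think of +1 as "relevant" and -1 as "not relevant"); a real system would use a proper solver such as the one in scikit-learn:

```python
# Tiny linearly separable training set: 2-D feature vectors with
# labels +1 ("relevant") and -1 ("not relevant"). Invented data.
train = [
    ([2.0, 2.5], 1), ([2.5, 2.0], 1), ([3.0, 3.0], 1),
    ([0.5, 0.5], -1), ([1.0, 0.2], -1), ([0.2, 1.0], -1),
]

def train_linear_svm(data, epochs=300, lr=0.03, lam=0.01):
    """Fit weights w and bias b by sub-gradient descent on the hinge
    loss -- a simplified stand-in for a real SVM solver."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:   # point inside the margin: move the boundary
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:            # point well classified: only regularize w
                w = [wi * (1 - lr * lam) for wi in w]
    return w, b

def classify(w, b, x):
    """Assign an unseen point to one of the two categories."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

w, b = train_linear_svm(train)
print(classify(w, b, [2.8, 2.2]))  # falls on the "relevant" side
print(classify(w, b, [0.3, 0.4]))  # falls on the other side
```

The learned boundary is exactly the "mathematically rigorous division" the paragraph mentions: once trained, the classifier assigns any unseen point to one side or the other.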
Vectorization
Embedding creation: Data in a dataset (e.g., documents, images, or user queries) are converted into numerical vector embeddings using a machine-learning model trained to capture semantic meaning.
Indexing: vectors are stored in a specialized retrieval system—often a vector database—which uses indexing algorithms such as Approximate Nearest Neighbor (ANN) to organize vectors so semantically similar items are located near one another for retrieval.
Query representation: a query is transformed into a vector embedding to ensure comparability within the vector space.
Similarity search: The system searches the vector index to identify data vectors closest to the query vector, using similarity or distance metrics such as cosine similarity or Euclidean distance.
Ranking and retrieval: Vectors with the smallest distance (or highest similarity) to the query are assumed to represent the most semantically relevant results and are returned to the user, often combined with additional ranking or filtering steps.
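The five steps above can be sketched end to end. This is a toy sketch under loud simplifications: a term-frequency vector stands in for a learned embedding model (step 1), a plain dictionary stands in for an ANN-indexed vector database (step 2), and a brute-force scan stands in for an ANN search (step 4):

```python
import math
from collections import Counter

# A miniature corpus (invented for illustration).
documents = {
    "d1": "the doctor examined the patient",
    "d2": "the physician prescribed insulin to the patient",
    "d3": "the car needs new wheels",
}

VOCAB = sorted({w for text in documents.values() for w in text.split()})

def embed(text):
    """Steps 1 and 3: turn text into a vector. A term-frequency vector
    is a crude stand-in for a learned embedding model."""
    counts = Counter(text.split())
    return [counts[w] for w in VOCAB]

# Step 2: "index" the document vectors. Real systems use ANN structures
# (e.g. graph- or tree-based indexes) instead of a flat dict.
index = {doc_id: embed(text) for doc_id, text in documents.items()}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, k=2):
    """Steps 3-5: embed the query, scan the index, rank by similarity."""
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)
    return ranked[:k]

print(search("patient doctor"))
```

Note one limitation of the stand-in: term-frequency vectors only reward literal word overlap, whereas a learned embedding would also rank "physician" highly for the query "doctor." The pipeline shape, however, is the same.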
Boolean vs. vectorization
Boolean searching uses explicit, human-defined logic to include or exclude documents, maximizing transparency and recall.
Support Vector Machines learn relevance from examples, ranking documents probabilistically.
Boolean search is auditable and reproducible; SVMs prioritize efficiency and precision but rely on training data and are less interpretable.
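The auditability of Boolean searching is easy to demonstrate: the matching rule is explicit, and the result is a deterministic set rather than a learned ranking. A minimal sketch, using an invented three-document corpus and a simple AND operator:

```python
# Miniature corpus (invented for illustration).
documents = {
    "d1": "machine learning for health informatics",
    "d2": "deep learning in radiology",
    "d3": "hospital library management",
}

def boolean_and(*terms):
    """Explicit, human-auditable logic: a document matches only if
    every term literally appears. The output is a set of matches,
    not a probabilistic ranking."""
    return sorted(d for d, text in documents.items()
                  if all(t in text.split() for t in terms))

print(boolean_and("learning"))
print(boolean_and("learning", "health"))
```

Running the same query always returns the same documents, and anyone can verify why each one matched, which is precisely the reproducibility that learned rankers trade away.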
Tay argues that embedding-based semantic search (vector search) is less objectionable than generative LLMs for several reasons. It uses encoder models that are significantly smaller and less computationally costly than full decoder LLMs, and therefore has a lower environmental impact. Because it does not generate text, it is less likely to reproduce copyrighted content. And since it serves mainly to rank documents by semantic similarity rather than to generate novel outputs, it reduces the risks of hallucination and cognitive offloading. Tay notes that embedding search still poses challenges, such as interpretability, potential bias, and reproducibility issues, but these are generally narrower than those associated with generative AI.