Information retrieval

Information retrieval (IR) is the activity of obtaining information system resources that are relevant to an information need from a collection of those resources. (Source: Wikipedia)

Information retrieval applications

Web search engines are the most visible IR applications. Depending on the application, the data objects may be, for example, text documents, images, audio, mind maps or videos.
Areas where information retrieval techniques are employed include (the entries are in alphabetical order within each category):

Full-text search

Searches can be based on full-text or other content-based indexing.
When dealing with a small number of documents, it is possible for the full-text-search engine to directly scan the contents of the documents with each query, a strategy called "serial scanning".
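The serial scanning strategy described above can be illustrated with a minimal sketch. The function name `serial_scan`, the whitespace tokenisation, and the "every term must appear" matching rule are assumptions made for this example, not part of any particular engine.

```python
def serial_scan(documents, query):
    """Return the ids of documents that contain every query term.

    No index is built: each call reads the full text of every document,
    which is only practical for a small collection.
    """
    terms = query.lower().split()
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())  # naive whitespace tokenisation (assumption)
        if all(term in words for term in terms):
            hits.append(doc_id)
    return hits


# Hypothetical three-document collection.
docs = {
    "d1": "Information retrieval finds relevant documents",
    "d2": "Full text search scans document contents",
    "d3": "Serial scanning reads every document for each query",
}
print(serial_scan(docs, "document query"))
```

The cost of each query grows with the total size of the collection, which is why this approach is limited to small document sets.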

Relevance (information retrieval)

Several objects may match the query, perhaps with different degrees of relevancy.
In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user.

Text Retrieval Conference

In 1992, the US Department of Defense, along with the National Institute of Standards and Technology (NIST), cosponsored the Text Retrieval Conference (TREC) as part of the TIPSTER Text program.
The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program.

Boolean model of information retrieval

Traditional evaluation metrics, designed for Boolean retrieval or top-k retrieval, include precision and recall.
The (standard) Boolean model of information retrieval (BIR) is a classical information retrieval (IR) model and, at the same time, the first and most-adopted one.
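As a rough illustration of Boolean retrieval, the sketch below evaluates AND, OR and NOT as set operations over an inverted index. The helper names `build_index` and `postings` and the sample documents are hypothetical; this is one common way to realise the model, not a reference implementation.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def postings(index, term):
    """Return the posting set for a term (empty if the term is unknown)."""
    return index.get(term, set())


# Hypothetical collection.
docs = {
    "d1": "boolean retrieval model",
    "d2": "vector space retrieval model",
    "d3": "probabilistic ranking function",
}
index = build_index(docs)
all_ids = set(docs)

# retrieval AND model
and_result = postings(index, "retrieval") & postings(index, "model")
# boolean OR probabilistic
or_result = postings(index, "boolean") | postings(index, "probabilistic")
# retrieval AND NOT boolean
not_result = postings(index, "retrieval") & (all_ids - postings(index, "boolean"))

print(sorted(and_result), sorted(or_result), sorted(not_result))
```

A document either satisfies the Boolean expression or it does not, so the result is an unranked set.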

Vector space model

The vector space model is used in information filtering, information retrieval, indexing and relevancy rankings.
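A minimal sketch of relevancy ranking in a vector space follows. It assumes raw term counts as vector weights and cosine similarity as the comparison measure; practical systems commonly use tf-idf weights instead, and the function name `cosine_similarity` is chosen only for this example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two term-count vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # A Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical query and documents.
query = "vector space model"
docs = {
    "d1": "the vector space model represents documents as vectors",
    "d2": "boolean retrieval uses set operations",
}
ranking = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranking)
```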

Gerard Salton

In the 1960s, the first large information retrieval research group was formed by Gerard Salton at Cornell.
Salton was perhaps the leading computer scientist working in the field of information retrieval during his time, and has been called "the father of Information Retrieval".

Extended Boolean model

The goal of the Extended Boolean model is to overcome the drawbacks of the Boolean model that has been used in information retrieval.

Binary Independence Model

The Binary Independence Model (BIM) is a probabilistic information retrieval technique that makes some simple assumptions to make the estimation of document/query similarity probability feasible.

Okapi BM25

In information retrieval, Okapi BM25 (BM is an abbreviation of best matching) is a ranking function used by search engines to estimate the relevance of documents to a given search query.
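The sketch below shows one common formulation of the BM25 ranking function. The parameter defaults `k1=1.5` and `b=0.75` are typical values assumed here, the IDF uses the non-negative variant `ln((N - n + 0.5) / (n + 0.5) + 1)`, and the function name `bm25_scores` and whitespace tokenisation are illustrative choices.

```python
import math
from collections import Counter


def bm25_scores(documents, query, k1=1.5, b=0.75):
    """Score every document against the query; assumes a non-empty collection."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalisation: longer-than-average documents are penalised.
            denom = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores[doc_id] = score
    return scores


# Hypothetical collection.
docs = {
    "d1": "okapi bm25 ranking function",
    "d2": "search engines rank documents",
    "d3": "bm25 bm25 ranking",
}
scores = bm25_scores(docs, "bm25")
print(sorted(scores, key=scores.get, reverse=True))
```

Term frequency saturates as it grows (controlled by `k1`), while `b` controls how strongly document length is normalised.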

Latent semantic analysis

In the context of its application to information retrieval, latent semantic analysis is sometimes called latent semantic indexing (LSI).

Uncertain inference

Uncertain inference was first described by C. J. van Rijsbergen as a way to formally define a query and document relationship in Information retrieval.

Learning to rank

Learning to rank or machine-learned ranking (MLR) is the application of machine learning, typically supervised, semi-supervised or reinforcement learning, in the construction of ranking models for information retrieval systems.

Generalized vector space model

The Generalized vector space model is a generalization of the vector space model used in information retrieval.

Precision and recall

In pattern recognition, information retrieval and classification (machine learning), precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of the total amount of relevant instances that were actually retrieved.
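The two definitions above translate directly into set arithmetic. In this minimal sketch, the function name `precision_recall` and the convention of returning 0.0 when a denominator is empty are assumptions made for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant instances that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here two of the four retrieved documents are relevant (precision 2/4), and two of the three relevant documents were retrieved (recall 2/3).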


Database

An object is an entity that is represented by information in a content collection or database.


Metadata

Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

Probabilistic relevance model

The probabilistic relevance model is a formalism of information retrieval used to derive the ranking functions with which search engines and web search engines rank matching documents according to their relevance to a given search query.

Divergence-from-randomness model

In the field of information retrieval, divergence from randomness, one of the very first models, is one type of probabilistic model.

Language model

Language modeling is used in speech recognition, machine translation, part-of-speech tagging, parsing, optical character recognition, handwriting recognition, information retrieval and other applications.

Calvin Mooers

Calvin Northrup Mooers (October 24, 1919 – December 1, 1994) was an American computer scientist known for his work in information retrieval and for the programming language TRAC.

Mind map

Mind maps can be analysed with classic methods of information retrieval to classify a mind map's author or the documents that are linked from within the mind map.

Topic-based vector space model

The Topic-based Vector Space Model (TVSM) extends the vector space model of information retrieval by removing the constraint that the term-vectors be orthogonal.

Cyril Cleverdon

Cyril Cleverdon (9 September 1914 – 4 December 1997) was a British librarian and computer scientist who is best known for his work on the evaluation of information retrieval systems.

SMART Information Retrieval System

The SMART (System for the Mechanical Analysis and Retrieval of Text) Information Retrieval System is an information retrieval system developed at Cornell University in the 1960s.