Named-entity recognition

named entity recognition, entity extraction, named entities, Named Entity Extraction, entities, entity, entity detection, entity recognition, named entities recognition, Named Entity Classification
Named-entity recognition (NER), also known as entity identification, entity chunking and entity extraction, is a subtask of information extraction that seeks to locate named entity mentions in unstructured text and classify them into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values and percentages.
106 Related Articles

Information extraction

extraction, extraction of information, extract information

General Architecture for Text Engineering

GATE
GATE includes an information extraction system called ANNIE (A Nearly-New Information Extraction System) which is a set of modules comprising a tokenizer, a gazetteer, a sentence splitter, a part of speech tagger, a named entities transducer and a coreference tagger.
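As a rough illustration of how two of those module types can fit together, the sketch below chains a simple tokenizer with a gazetteer lookup. It is not ANNIE's code and does not use GATE's API: the function names, the tiny gazetteer and the longest-match strategy are all assumptions made for the example.

```python
import re

# Hypothetical gazetteer: lowercase token sequences mapped to an entity type.
GAZETTEER = {
    ("ford", "motor", "company"): "ORGANIZATION",
    ("ford",): "ORGANIZATION",
    ("henry", "ford"): "PERSON",
    ("detroit",): "LOCATION",
}
MAX_LEN = max(len(key) for key in GAZETTEER)


def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def gazetteer_lookup(tokens):
    """Return (start, end, type) spans, preferring the longest match at each position."""
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate first so "Ford Motor Company" beats "Ford".
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in GAZETTEER:
                match = (i, i + n, GAZETTEER[key])
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


tokens = tokenize("Henry Ford founded Ford Motor Company in Detroit.")
entities = [(" ".join(tokens[s:e]), label) for s, e, label in gazetteer_lookup(tokens)]
print(entities)
```

A pure lookup like this cannot tell which sense of an ambiguous name is meant, which is one reason a real pipeline adds further modules after the gazetteer.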

SpaCy

The library is published under the MIT license and currently offers statistical neural network models for English, German, Spanish, Portuguese, French, Italian, Dutch and multi-language NER, as well as tokenization for various other languages.

Named entity

named entities
There is also a general agreement in the Named Entity Recognition community to consider as named entities temporal and numerical expressions such as amounts of money and other types of units, which may violate the rigid designator perspective.

Apache OpenNLP

OpenNLP
It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing and coreference resolution.

F1 score

F-Measure, F-score, F1
For example, the best system entering MUC-7 achieved an F-measure of 93.39%, while human annotators scored 97.60% and 96.95%.
The F-score has been widely used in the natural language processing literature, such as the evaluation of named entity recognition and word segmentation.
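To make the metric concrete, here is a minimal sketch of entity-level precision, recall and F1. It assumes exact-match scoring, where a predicted entity counts only if both its span and its type agree with the gold annotation; evaluation campaigns may score differently, and the function name and sample data are invented for the example.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores; an entity counts only if span and type both match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Entities as (start_token, end_token, type); the data is made up for the example.
gold = [(0, 2, "PERSON"), (3, 6, "ORGANIZATION"), (7, 8, "LOCATION")]
predicted = [(0, 2, "PERSON"), (3, 5, "ORGANIZATION"), (7, 8, "LOCATION"), (9, 10, "DATE")]

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Here two of the four predictions are correct and two of the three gold entities are found, so precision is 1/2, recall is 2/3 and F1 is 4/7. The organization with the wrong end boundary earns no partial credit under this strict scheme.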

Message Understanding Conference

MUC, MUC-6 evaluation campaign, MUC-7
At the sixth conference (MUC-6), the tasks of named-entity recognition and coreference were added.

Conditional random field

conditional random fields, CRF
Many different classifier types have been used to perform machine-learned NER, with conditional random fields being a typical choice.
Specifically, CRFs find applications in POS tagging, shallow parsing, named entity recognition, gene finding and peptide critical functional region finding, among other tasks, being an alternative to the related hidden Markov models (HMMs).
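At prediction time, a linear-chain CRF picks the label sequence with the highest total score, which is usually computed with the Viterbi algorithm. The sketch below shows only that decoding step, with no training. The emission and transition scores are set by hand purely for illustration, and the function and variable names are assumptions.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: list (one entry per token) of dicts mapping label -> score
    transitions: dict mapping (previous_label, label) -> score; missing pairs score 0
    """
    # best[i][label] = (score of the best path ending in label at token i, previous label)
    best = [{label: (emissions[0][label], None) for label in labels}]
    for i in range(1, len(emissions)):
        column = {}
        for label in labels:
            score, prev = max(
                (best[i - 1][p][0] + transitions.get((p, label), 0.0), p)
                for p in labels
            )
            column[label] = (score + emissions[i][label], prev)
        best.append(column)

    # Follow the back-pointers from the best final label.
    label = max(labels, key=lambda l: best[-1][l][0])
    path = [label]
    for i in range(len(emissions) - 1, 0, -1):
        label = best[i][label][1]
        path.append(label)
    return path[::-1]


labels = ["O", "B-ORG", "I-ORG"]
tokens = ["Ford", "Motor", "Company", "builds", "cars"]
# Hand-set scores (hypothetical): taken alone, "Motor" slightly prefers "O".
emissions = [
    {"O": 0.1, "B-ORG": 2.0, "I-ORG": 0.5},
    {"O": 1.0, "B-ORG": 0.2, "I-ORG": 0.9},
    {"O": 0.5, "B-ORG": 0.3, "I-ORG": 1.5},
    {"O": 2.0, "B-ORG": 0.0, "I-ORG": 0.0},
    {"O": 2.0, "B-ORG": 0.1, "I-ORG": 0.0},
]
# Reward continuing an entity; heavily penalise I-ORG directly after O.
transitions = {("B-ORG", "I-ORG"): 1.0, ("I-ORG", "I-ORG"): 1.0, ("O", "I-ORG"): -10.0}

path = viterbi(emissions, transitions, labels)
print(list(zip(tokens, path)))
```

With these made-up scores, the transition terms pull "Motor" into the organization span even though its own emission score favours "O". Scoring the whole sequence jointly, rather than token by token, is the behaviour that makes sequence models attractive for NER.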

Question answering

answer engine, question answering system, question-answering
The BBN categories, proposed in 2002, are used for question answering and consist of 29 types and 64 subtypes.
For questions such as "Who" or "Where", a named-entity recogniser is used to find relevant "Person" and "Location" names from the retrieved documents.
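A minimal sketch of that filtering step follows. It assumes the recogniser has already produced (text, type) pairs for a retrieved passage; the mapping from question word to entity type, the function name and the sample entities are all hypothetical.

```python
# Hypothetical mapping from question word to the entity type an answer should have.
EXPECTED_TYPE = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}


def candidate_answers(question, recognised_entities):
    """Keep entities whose type matches the question word, in document order."""
    words = question.strip().split()
    question_word = words[0].lower() if words else ""
    wanted = EXPECTED_TYPE.get(question_word)
    if wanted is None:
        return []  # no type expectation known for this kind of question
    return [text for text, label in recognised_entities if label == wanted]


# Output a named-entity recogniser might produce for a retrieved passage (made up here).
entities = [
    ("Henry Ford", "PERSON"),
    ("Ford Motor Company", "ORGANIZATION"),
    ("1903", "DATE"),
    ("Detroit", "LOCATION"),
]

who = candidate_answers("Who founded the Ford Motor Company?", entities)
where = candidate_answers("Where was the company founded?", entities)
print(who, where)
```

The type filter only narrows the candidates; a full system would still need to rank them against the question.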

Entity linking

Named entity disambiguation, cross-linking them to Wikipedia, Differences from other techniques
A recently emerging task of identifying "important expressions" in text and cross-linking them to Wikipedia can be seen as an instance of extremely fine-grained named-entity recognition, where the types are the actual Wikipedia pages describing the (potentially ambiguous) concepts.
Entity linking is different from named-entity recognition (NER) in that NER identifies the occurrence of a named entity in text but it does not identify which specific entity it is (see Differences from other techniques).

Shallow parsing

chunking, Chunking (computational linguistics), chunker
This segmentation problem is formally similar to chunking.
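One common way to treat both problems as segmentation is to tag every token as B (beginning of a span), I (inside a span) or O (outside any span) and then read the spans off the tag sequence. The sketch below decodes such tags. The BIO scheme, the label names and the function name are assumptions made for illustration and are not taken from the source.

```python
def bio_to_spans(tokens, tags):
    """Convert BIO tags into (text, type) spans."""
    spans = []
    start, label = None, None  # start index and type of the span being built
    # A sentinel "O" at the end closes any span still open on the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag.startswith("I-") and tag[2:] == label:
            continue  # the current span simply grows by one token
        if label is not None:
            spans.append((" ".join(tokens[start:i]), label))
            start, label = None, None
        if tag != "O":
            # A B- tag, or an I- tag that continues nothing, opens a new span.
            start, label = i, tag[2:]
    return spans


tokens = ["Henry", "Ford", "founded", "Ford", "Motor", "Company", "in", "Detroit", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "I-ORG", "O", "B-LOC", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

The same decoder works unchanged for chunking if the labels are phrase types such as NP or VP instead of entity types.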

Knowledge extraction

knowledge discovery, derivation of knowledge, discovery
DBpedia Spotlight, OpenCalais, Dandelion dataTXT, the Zemanta API, Extractiv and PoolParty Extractor analyze free text via named-entity recognition, then disambiguate candidates via name resolution and link the found entities to the DBpedia knowledge repository (Dandelion dataTXT demo, DBpedia Spotlight web demo or PoolParty Extractor demo).

Natural language processing

NLP, natural language, natural-language processing
Since about 1998, there has been a great deal of interest in entity identification in the molecular biology, bioinformatics, and medical natural language processing communities.

Onomastics

onomastic, onomastician, onomatology
Onomastics can be helpful in data mining, with applications such as named-entity recognition, or recognition of the origin of names.

Crowdsourcing

crowdsourced, crowd-sourced, crowdsource
In recent years, many projects have turned to crowdsourcing, which is a promising solution to obtain high-quality aggregate human judgments for supervised and semi-supervised machine learning approaches to NER.
Crowdsourcing has been used extensively to collect high-quality gold-standard data for building automatic systems in natural language processing (e.g. named entity recognition, entity linking).

Record linkage

Identity resolution, entity resolution, object identification/entity resolution/record linkage

Unstructured data

unstructured, unstructured text, (unstructured)

Medical classification

statistical classification, classification, WHO Family of International Classifications

Rigid designator

rigid designators, rigid designation, rigidity
This is closely related to rigid designators, as defined by Kripke, although in practice NER deals with many names and referents that are not philosophically "rigid".

Saul Kripke

Kripke; Kripke, Saul; Saul Aaron Kripke

Ford (disambiguation)

Ford
For instance, the automotive company created by Henry Ford in 1903 can be referred to as Ford or Ford Motor Company, although "Ford" can refer to many other entities as well (see Ford).