# Sequence alignment, Optimal matching, Dynamic time warping and Levenshtein distance

The method was tailored to social sciences from a technique originally introduced to study molecular biology (protein or genetic) sequences (see sequence alignment).

- Optimal matching

It is closely related to pairwise string alignments.

- Levenshtein distance

The optimal match is the one that satisfies all the restrictions and rules and has the minimal cost, where the cost is the sum, over all matched pairs of indices, of the absolute difference between the two matched values.

- Dynamic time warping
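
The cost definition above can be sketched as a small dynamic program. This is a minimal illustration, not a reference implementation: the function name `dtw_cost` is hypothetical, the inputs are assumed to be plain numeric sequences, and no window constraint or path recovery is included.

```python
def dtw_cost(a, b):
    """Minimal DTW cost between two numeric sequences.

    Assumption: the local cost of matching a[i] with b[j] is |a[i] - b[j]|,
    and every index of each sequence must be matched at least once.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of matching a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor matches.
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] matched again
                                 cost[i][j - 1],      # b[j-1] matched again
                                 cost[i - 1][j - 1])  # both indices advance
    return cost[n][m]
```

With this sketch, two series that differ only by a repeated value, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have cost zero, since the repeated value can be matched to the same index twice.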

This sequence alignment method is often used in time series classification.

- Dynamic time warping

The methods used for biological sequence alignment have also found applications in other fields, most notably in natural language processing and in social sciences, where the Needleman–Wunsch algorithm is usually referred to as optimal matching.

- Sequence alignment

## Dynamic programming

Both a mathematical optimization method and a computer programming method.

Dynamic programming is widely used in bioinformatics for tasks such as sequence alignment, protein folding, RNA structure prediction and protein-DNA binding.

Many string algorithms, including longest common subsequence, longest increasing subsequence, longest common substring, and Levenshtein distance (edit distance), are based on dynamic programming.

The dynamic time warping algorithm for computing the global distance between two time series is also based on dynamic programming.

## Needleman–Wunsch algorithm

The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences.

It is also sometimes referred to as the optimal matching algorithm and the global alignment technique.

("time warping"), and by Robert A. Wagner and Michael J. Fischer in 1974 for string matching.

Another possibility is to minimize the edit distance between sequences, introduced by Vladimir Levenshtein.
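
As a rough illustration of global alignment scoring, the sketch below fills the Needleman–Wunsch table and returns only the optimal score. The function name `nw_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example; real applications typically use substitution matrices and also trace back the alignment itself.

```python
def nw_score(s, t, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of s and t (score only, no traceback).

    The scoring parameters are illustrative defaults, not fixed by the algorithm.
    """
    n, m = len(s), len(t)
    # score[i][j] = best score for aligning s[:i] with t[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # s[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # t[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align s[i-1] with t[j-1]
                              score[i - 1][j] + gap,       # gap in t
                              score[i][j - 1] + gap)       # gap in s
    return score[n][m]
```

Setting `match=0`, `mismatch=-1`, and `gap=-1` makes the negated score equal to the Levenshtein distance, which is one way to see the connection between maximizing alignment score and minimizing edit distance.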

## Hirschberg's algorithm

In computer science, Hirschberg's algorithm, named after its inventor, Dan Hirschberg, is a dynamic programming algorithm that finds the optimal sequence alignment between two strings.

Optimality is measured with the Levenshtein distance, defined to be the sum of the costs of insertions, replacements, deletions, and null actions needed to change one string into the other.

## Edit distance

Way of quantifying how dissimilar two strings are to one another by counting the minimum number of operations required to transform one string into the other.

Levenshtein distance operations are the removal, insertion, or substitution of a character in the string.
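
A minimal sketch of how the three operations lead to a distance computation, assuming unit cost for each removal, insertion, and substitution. The function name `levenshtein` is chosen for illustration; the sketch keeps only two rows of the table and does not recover the edit script.

```python
def levenshtein(s, t):
    """Levenshtein distance between strings s and t, with unit-cost operations."""
    # prev[j] = distance between the current prefix of s and t[:j]
    prev = list(range(len(t) + 1))  # empty prefix of s: j insertions
    for i, cs in enumerate(s, start=1):
        curr = [i]  # s[:i] to the empty string: i removals
        for j, ct in enumerate(t, start=1):
            substitution = prev[j - 1] + (0 if cs == ct else 1)
            removal = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, removal, insertion))
        prev = curr
    return prev[-1]
```

For example, turning "kitten" into "sitting" takes two substitutions and one insertion, so `levenshtein("kitten", "sitting")` returns 3.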

Hirschberg's algorithm computes the optimal alignment of two strings, where optimality is defined as minimizing edit distance.

## Speech recognition

Interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the main benefit of searchability.

Around this time Soviet researchers invented the dynamic time warping (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary.

The loss function is usually the Levenshtein distance, though it can be different distances for specific tasks; the set of possible transcriptions is, of course, pruned to maintain tractability.

## Spell checker

Software feature that checks for misspellings in a text.

Spell checkers can use approximate string matching algorithms such as Levenshtein distance to find correct spellings of misspelled words.

## Hamming distance

Number of positions at which the corresponding symbols are different.

However, for comparing strings of different lengths, or strings where not just substitutions but also insertions or deletions have to be expected, a more sophisticated metric like the Levenshtein distance is more appropriate.
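
For contrast, a minimal sketch of the Hamming distance, which is defined only for sequences of equal length. The function name `hamming` is illustrative, and raising `ValueError` on a length mismatch is a design choice of this sketch.

```python
def hamming(s, t):
    """Number of positions at which the corresponding symbols of s and t differ."""
    if len(s) != len(t):
        # Hamming distance is undefined for unequal lengths; an edit distance
        # such as Levenshtein is the usual choice in that case.
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(s, t) if a != b)
```

For example, `hamming("karolin", "kathrin")` returns 3, because the strings differ at three positions.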

## Approximate string matching

Technique of finding strings that match a pattern approximately (rather than exactly).

In fact, we can use the Levenshtein distance computing algorithm for E(m, j), the only difference being that we must initialize the first row with zeros, and save the path of computation, that is, whether we used E(i − 1,j), E(i,j − 1) or E(i − 1,j − 1) in computing E(i, j).
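
The modification described above can be sketched as follows, assuming unit edit costs. The function name `best_approx_match` is hypothetical. To stay short, the sketch returns only the smallest distance and the end position of one best match in the text. It does not save the path of computation, so the start position is not recovered.

```python
def best_approx_match(pattern, text):
    """Return (distance, end) for a substring of text closest to pattern.

    E(i, j) is the minimal edit distance between pattern[:i] and any
    substring of text ending at position j. The first row is all zeros
    because a match may start anywhere in the text at no cost.
    """
    m, n = len(pattern), len(text)
    prev = [0] * (n + 1)  # row E(0, j): initialized with zeros
    for i in range(1, m + 1):
        curr = [i] + [0] * n  # E(i, 0) = i: pattern[:i] against empty text
        for j in range(1, n + 1):
            sub = 0 if pattern[i - 1] == text[j - 1] else 1
            curr[j] = min(prev[j] + 1,        # from E(i - 1, j)
                          curr[j - 1] + 1,    # from E(i, j - 1)
                          prev[j - 1] + sub)  # from E(i - 1, j - 1)
        prev = curr
    # The best match ends where the last row E(m, j) is smallest.
    best_end = min(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end
```

Recording which of the three predecessors produced each `curr[j]` would allow the matching substring to be reconstructed by walking back from `best_end`, at the cost of keeping the full table.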

## Optical character recognition

Electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast).

The Levenshtein distance algorithm has also been used in OCR post-processing to further optimize results from an OCR API.