Hash table

hash tableshashhasheshashinghashtablehash mapload factorLoad factor (computer science)rehashCollision resolution
In computing, a hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values.wikipedia
331 Related Articles

Associative array

mapdictionariesdictionary
In computing, a hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
The two major solutions to the dictionary problem are a hash table or a search tree.

Data structure

data structuresstructurestructures
In computing, a hash table (hash map) is a data structure that implements an associative array abstract data type, a structure that can map keys to values.
For example, relational databases commonly use B-tree indexes for data retrieval, while compiler implementations usually use hash tables to look up identifiers.

Open addressing

For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots.
Open addressing, or closed hashing, is a method of collision resolution in hash tables.

Cryptographic hash function

cryptographic hashhashhashing
Cryptographic hash functions are believed to provide good hash functions for any table size, either by modulo reduction or by bit masking.
They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption.

Perfect hash function

Perfect hashingperfect hashperfect
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions.
A perfect hash function has many of the same applications as other hash functions, but with the advantage that no collision resolution has to be implemented.

Linear probing

hashing with linear probing
Another alternative open-addressing solution is hopscotch hashing, which combines the approaches of cuckoo hashing and linear probing, yet seems in general to avoid their limitations.
Linear probing is a scheme in computer programming for resolving collisions in hash tables, data structures for maintaining a collection of key–value pairs and looking up the value associated with a given key.

SUHA (computer science)

sufficiently uniform
If the distribution of keys is sufficiently uniform, the average cost of a lookup depends only on the average number of keys per bucket—that is, it is roughly proportional to the load factor.
In computer science, SUHA (Simple Uniform Hashing Assumption) is a basic assumption that facilitates the mathematical analysis of hash tables.

Concurrent hash table

The algorithm is well suited for implementing a resizable concurrent hash table.
A concurrent hash table (concurrent hash map) is an implementation of hash tables allowing concurrent access by multiple threads using a hash function.

Dynamic perfect hashing

Dynamic perfect hash table
An elaboration on this approach is the so-called dynamic perfect hashing, where a bucket that contains k entries is organized as a perfect hash table with k 2 slots.
In computer science, dynamic perfect hashing is a programming technique for resolving collisions in a hash table data structure.

Cuckoo hashing

Another alternative open-addressing solution is hopscotch hashing, which combines the approaches of cuckoo hashing and linear probing, yet seems in general to avoid their limitations. Another alternative open-addressing solution is cuckoo hashing, which ensures constant lookup and deletion time in the worst case, and constant amortized time for insertions (with low probability that the worst-case will be encountered).
Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table, with worst-case constant lookup time.

Linear search

sequential searchlinear scanSequential
However, this introduces extra complexity into the implementation, and may cause even worse performance for smaller hash tables, where the time spent inserting into and balancing the tree is greater than the time needed to perform a linear search on all of the elements of a list.
Linear search is rarely practical because other search algorithms and schemes, such as the binary search algorithm and hash tables, allow significantly faster searching for all but short lists.

Distributed hash table

DHTDistributed Hash Tablesdistributed
Such hash functions are prevalent in disk-based and distributed hash tables, where rehashing is prohibitively costly.
A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table : (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key.

2-choice hashing

2-choice hashing employs two different hash functions, h 1 (x) and h 2 (x), for the hash table.
2-choice hashing, also known as 2-choice chaining, is "a variant of a hash table in which keys are added by hashing with two hash functions. The key is put in the array position with the fewer (colliding) keys. Some collision resolution scheme is needed, unless keys are kept in buckets. The average-case cost of a successful search is O(2 + (m-1)/n), where m is the number of keys and n is the size of the array. The most collisions is with high probability."

Coalesced hashing

A hybrid of chaining and open addressing, coalesced hashing links together chains of nodes within the table itself.
Coalesced hashing, also called coalesced chaining, is a strategy of collision resolution in a hash table that forms a hybrid of separate chaining and open addressing.

Double hashing

rehashing
Double hashing is a computer programming technique used in conjunction with open-addressing in hash tables to resolve hash collisions, by using a secondary hash of the key as an offset when a collision occurs.

Content addressable network

CANcontent-addressable networks
The four most popular approaches are rendezvous hashing, consistent hashing, the content addressable network algorithm, and Kademlia distance.
The Content Addressable Network (CAN) is a distributed, decentralized P2P infrastructure that provides hash table functionality on an Internet-like scale.

Linear hashing

linear hash
Linear hashing is a hash table algorithm that permits incremental hash table expansion.
Linear hashing (LH) is a dynamic data structure which implements a hash table and grows or shrinks one bucket at a time.

Hopscotch hashing

Another alternative open-addressing solution is hopscotch hashing, which combines the approaches of cuckoo hashing and linear probing, yet seems in general to avoid their limitations.
Hopscotch hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table using open addressing.

Software

Computer softwareSoftware & Programmingsoftware technology
For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
Data structures such as hash tables, arrays, and binary trees, and algorithms such as quicksort, can be useful for creating software.

Collision (computer science)

hash collisioncollisioncollisions
Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key.

Prime number

primeprime factorprime numbers
On the other hand, some hashing algorithms prefer to have the size be a prime number.
Prime numbers are also used in computing for checksums, hash tables, and pseudorandom number generators.

Birthday problem

birthday paradoxbirthday problem of 2 typesBirthday theorem
For example, if 2,450 keys are hashed into a million buckets, even with a perfectly uniform random distribution, according to the birthday problem there is approximately a 95% chance of at least two of the keys being hashed to the same slot.
. This is exploited by birthday attacks on cryptographic hash functions and is the reason why a small number of collisions in a hash table are, for all practical purposes, inevitable.

Universal hashing

universal hash functionuniversal hashuniversal
However, the risk of sabotage can also be avoided by cheaper methods (such as applying a secret salt to the data, or using a universal hash function).
Universal hashing has numerous uses in computer science, for example in implementations of hash tables, randomized algorithms, and cryptography.

Quadratic probing

Quadratic probing is an open addressing scheme in computer programming for resolving hash collisions in hash tables.

Mask (computing)

maskmaskingbit mask
Cryptographic hash functions are believed to provide good hash functions for any table size, either by modulo reduction or by bit masking. In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed, but can increase problems with a poor hash function.
To create a hashing function for a hash table, often a function is used that has a large domain.