Linear hashing (LH) is a dynamic data structure which implements a hash table and grows or shrinks one bucket at a time. It was invented by Witold Litwin in 1980.[1][2] It has been analyzed by Baeza-Yates and Soza-Pollman.[3] It is the first in a number of schemes known as dynamic hashing[3][4] such as Larson's Linear Hashing with Partial Extensions, [5] Linear Hashing with Priority Splitting,[6] Linear Hashing with Partial Expansions and Priority Splitting,[7] or Recursive Linear Hashing.[8]
The file structure of a dynamic hashing data structure adapts itself to changes in the size of the file, so expensive periodic file reorganization is avoided.[4] A Linear Hashing file expands by splitting a predetermined bucket into two and shrinks by merging two predetermined buckets into one. The trigger for a reconstruction depends on the flavor of the scheme; it could be an overflow at a bucket or load factor (i.e., the number of records divided by the number of buckets) moving outside of a predetermined range.[1] In Linear Hashing there are two types of buckets, those that are to be split and those already split. While extendible hashing splits only overflowing buckets, spiral hashing (a.k.a. spiral storage) distributes records unevenly over the buckets such that buckets with high costs of insertion, deletion, or retrieval are earliest in line for a split.[5]
Linear Hashing has also been made into a scalable distributed data structure, LH*. In LH*, each bucket resides at a different server.[9] LH* itself has been expanded to provide data availability in the presence of failed buckets.[10] Key based operations (inserts, deletes, updates, reads) in LH and LH* take maximum constant time independent of the number of buckets and hence of records.[1][10]
^ abcLitwin, Witold (1980), "Linear hashing: A new tool for file and table addressing" (PDF), Proc. 6th Conference on Very Large Databases: 212–223
^
Ellis, Carla Schlatter (June 1987), "Concurrency in Linear Hashing", ACM Transactions on Database Systems, 12 (2): 195–217, doi:10.1145/22952.22954, S2CID 14260177
^ abBaeza-Yates, Ricardo; Soza-Pollman, Hector (1998), "Analysis of Linear Hashing Revised" (PDF), Nordic Journal of Computing: 70–85, S2CID 7497598, archived from the original (PDF) on 2019-03-07
^ abEnbody, Richard; Du, HC (June 1988), "Dynamic hashing schemes", ACM Computing Surveys, 20 (2): 85–113, doi:10.1145/46157.330532, S2CID 1437123
^ abLarson, Per-Åke (April 1988), "Dynamic Hash Tables", Communications of the ACM, 31 (4): 446–457, doi:10.1145/42404.42410, S2CID 207548097
^
Ruchte, Willard; Tharp, Alan (Feb 1987), "Linear hashing with Priority Splitting: A method for improving the retrieval performance of linear hashing", IEEE Third International Conference on Data Engineering: 2–9
^
Manolopoulos, Yannis; Lorentzos, N. (1994), "Performance of linear hashing schemes for primary key retrieval", Information Systems, 19 (5): 433–446, doi:10.1016/0306-4379(94)90005-1
^
Ramamohanarao, K.; Sacks-Davis, R. (Sep 1984), "Recursive linear hashing", ACM Transactions on Databases, 9 (3): 369–391, doi:10.1145/1270.1285, S2CID 18577730
^
Litwin, Witold; Neimat, Marie-Anne; Schneider, Donavan A. (1993), "LH: Linear Hashing for distributed files", ACM SIGMOD Record, 22 (2): 327–336, doi:10.1145/170036.170084, S2CID 259938726
^ ab
Litwin, Witold; Moussa, Rim; Schwarz, Thomas (Sep 2005), "LH*RS - a highly-available scalable distributed data structure", ACM Transactions on Database Systems, 30 (3): 769–811, doi:10.1145/1093382.1093386, S2CID 1802386
Linearhashing (LH) is a dynamic data structure which implements a hash table and grows or shrinks one bucket at a time. It was invented by Witold Litwin...
perfect hash function can be created if all the keys are known ahead of time. The schemes of hashing used in integer universe assumption include hashing by...
H(z,n) with probability close to n/(n + 1). Linearhashing and spiral hashing are examples of dynamic hash functions that execute in constant time but...
double hashing, linear probing is a form of open addressing. In these schemes, each cell of a hash table stores a single key–value pair. When the hash function...
systems is linearhashing. Indices can be implemented using a variety of data structures. Popular indices include balanced trees, B+ trees and hashes. In Microsoft...
Double hashing is a computer programming technique used in conjunction with open addressing in hash tables to resolve hash collisions, by using a secondary...
In computer science, consistent hashing is a special kind of hashing technique such that when a hash table is resized, only n / m {\displaystyle n/m} keys...
perfect hashing in Java MPHSharp: perfect hashing methods in C# BBHash: minimal perfect hash function in header-only C++ Perfect::Hash, perfect hash generator...
password hashing is performed; original passwords cannot be recalculated from the stored hash value. However, use of standard cryptographic hash functions...
hopscotch hashing, as in cuckoo hashing, and unlike in linear probing, a given item will always be inserted-into and found-in the neighborhood of its hashed bucket...
families are known (for hashing integers, vectors, strings), and their evaluation is often very efficient. Universal hashing has numerous uses in computer...
Double hashing can also require more computation than other forms of probing. Some open addressing methods, such as Hopscotch hashing, Robin Hood hashing,...
inserting a new key into a cuckoo hashing table may push an older key to a different location in the table. Cuckoo hashing was first described by Rasmus Pagh...
materials. Zobrist hashing is the first known instance of the generally useful underlying technique called tabulation hashing. Zobrist hashing starts by randomly...
methods that require a high-quality hash function, including hopscotch hashing, cuckoo hashing, and the MinHash technique for estimating the size of...
Spiral hashing, also known as Spiral Storage is an extensible hashing algorithm. As in all hashing schemes, spiral hashing stores records in a varying...
Larson) is a computer scientist. He is most famous for inventing the linearhashing algorithm with Witold Litwin. Paul Larson is currently a senior researcher...
In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing...
computer science and data mining, MinHash (or the min-wise independent permutations locality sensitive hashing scheme) is a technique for quickly estimating...
A rolling hash (also known as recursive hashing or rolling checksum) is a hash function where the input is hashed in a window that moves through the input...
take place when a hash collision happens and this method is implemented. Some types of probing are linear probing, double hashing, and quadratic probing...
content management systems were reported to still use MD5 for password hashing. In 1996, a flaw was found in the design of MD5. While it was not deemed...
probability. Unfortunately, we do not have linear perfect hashing, so we have to use an almost linearhash function, i.e. a function h such that: h (...
element vary. Linear search is rarely practical because other search algorithms and schemes, such as the binary search algorithm and hash tables, allow...
alternative to linear probing because it incurs less clustering. Quadratic probing exhibits better locality of reference than many other hash table such as...
algorithms: linear, binary, and hashing. Linear search algorithms check every record for the one associated with a target key in a linear fashion. Binary, or half-interval...