Linear hashing

Linear hashing (LH) is a dynamic data structure that implements a hash table and grows or shrinks one bucket at a time. It was invented by Witold Litwin in 1980.^[1]^[2] It has been analyzed by Baeza-Yates and Soza-Pollman.^[3] It is the first in a number of schemes known as dynamic hashing,^[3]^[4] such as Larson's Linear Hashing with Partial Expansions,^[5] Linear Hashing with Priority Splitting,^[6] Linear Hashing with Partial Expansions and Priority Splitting,^[7] and Recursive Linear Hashing.^[8]

The file structure of a dynamic hashing data structure adapts itself to changes in the size of the file, so expensive periodic file reorganization is avoided.^[4] A Linear Hashing file expands by splitting a predetermined bucket into two and shrinks by merging two predetermined buckets into one. The trigger for a split or merge depends on the flavor of the scheme; it could be an overflow at a bucket, or the load factor (i.e., the number of records divided by the number of buckets) moving outside a predetermined range.^[1] In Linear Hashing there are two types of buckets: those that are still to be split and those that have already been split. By contrast, extendible hashing splits only overflowing buckets, and spiral hashing (a.k.a. spiral storage) distributes records unevenly over the buckets so that the buckets with the highest costs of insertion, deletion, or retrieval are earliest in line for a split.^[5]
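The split mechanism described above can be illustrated with a minimal in-memory sketch. It is not taken from the cited papers: the class name `LinearHashTable`, the fields `level` and `split`, the use of Python's built-in `hash`, and the load-factor threshold of 0.75 are all assumptions chosen for illustration. Only growth is shown; merging (shrinking) is omitted for brevity.

```python
class LinearHashTable:
    """Toy linear hash table: grows by exactly one bucket per split.

    Illustrative sketch only; names and the 0.75 threshold are assumptions.
    """

    def __init__(self, initial_buckets=2, max_load=0.75):
        self.n0 = initial_buckets   # bucket count before any split
        self.level = 0              # number of completed doubling rounds
        self.split = 0              # index of the next bucket to be split
        self.max_load = max_load    # split when records/buckets exceeds this
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0              # number of stored records

    def _address(self, key):
        h = hash(key)
        size = self.n0 << self.level    # buckets at the start of this round
        i = h % size
        if i < self.split:              # bucket was already split this round,
            i = h % (size * 2)          # so use the finer hash function
        return i

    def get(self, key, default=None):
        for k, v in self.buckets[self._address(key)]:
            if k == key:
                return v
        return default

    def put(self, key, value):
        bucket = self.buckets[self._address(key)]
        for idx, (k, _) in enumerate(bucket):
            if k == key:                # overwrite an existing key
                bucket[idx] = (key, value)
                return
        bucket.append((key, value))
        self.count += 1
        # Trigger: load factor moved above the allowed range.
        if self.count / len(self.buckets) > self.max_load:
            self._split_one()

    def _split_one(self):
        size = self.n0 << self.level
        old = self.buckets[self.split]  # the predetermined bucket, not
        self.buckets[self.split] = []   # necessarily the one that is full
        self.buckets.append([])         # its partner, at index split + size
        self.split += 1
        if self.split == size:          # every bucket of this round is split
            self.level += 1
            self.split = 0
        for k, v in old:                # redistribute between the two buckets
            self.buckets[self._address(k)].append((k, v))
```

In this sketch the bucket chosen for splitting is determined by the `split` pointer alone, so an overfull bucket may have to wait its turn; the two kinds of buckets mentioned above correspond to indices below `split` (already split) and indices at or above it (still to be split).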

Linear Hashing has also been made into a scalable distributed data structure, LH*. In LH*, each bucket resides at a different server.^[9] LH* itself has been extended to provide data availability in the presence of failed buckets.^[10] Key-based operations (inserts, deletes, updates, reads) in LH and LH* take at most a constant amount of time, independent of the number of buckets and hence of the number of records.^[1]^[10]

[1] Litwin, Witold (1980), "Linear hashing: A new tool for file and table addressing" (PDF), Proc. 6th Conference on Very Large Databases: 212–223
[2] Ellis, Carla Schlatter (June 1987), "Concurrency in Linear Hashing", ACM Transactions on Database Systems, 12 (2): 195–217, doi:10.1145/22952.22954, S2CID 14260177
[3] Baeza-Yates, Ricardo; Soza-Pollman, Hector (1998), "Analysis of Linear Hashing Revised" (PDF), Nordic Journal of Computing: 70–85, S2CID 7497598, archived from the original (PDF) on 2019-03-07
[4] Enbody, Richard; Du, HC (June 1988), "Dynamic hashing schemes", ACM Computing Surveys, 20 (2): 85–113, doi:10.1145/46157.330532, S2CID 1437123
[5] Larson, Per-Åke (April 1988), "Dynamic Hash Tables", Communications of the ACM, 31 (4): 446–457, doi:10.1145/42404.42410, S2CID 207548097
[6] Ruchte, Willard; Tharp, Alan (Feb 1987), "Linear hashing with Priority Splitting: A method for improving the retrieval performance of linear hashing", IEEE Third International Conference on Data Engineering: 2–9
[7] Manolopoulos, Yannis; Lorentzos, N. (1994), "Performance of linear hashing schemes for primary key retrieval", Information Systems, 19 (5): 433–446, doi:10.1016/0306-4379(94)90005-1
[8] Ramamohanarao, K.; Sacks-Davis, R. (Sep 1984), "Recursive linear hashing", ACM Transactions on Database Systems, 9 (3): 369–391, doi:10.1145/1270.1285, S2CID 18577730
[9] Litwin, Witold; Neimat, Marie-Anne; Schneider, Donavan A. (1993), "LH*: Linear Hashing for distributed files", ACM SIGMOD Record, 22 (2): 327–336, doi:10.1145/170036.170084, S2CID 259938726
[10] Litwin, Witold; Moussa, Rim; Schwarz, Thomas (Sep 2005), "LH*RS - a highly-available scalable distributed data structure", ACM Transactions on Database Systems, 30 (3): 769–811, doi:10.1145/1093382.1093386, S2CID 1802386

