Hashing ADT
The hashing ADT stores
records in a table.
const int MAX_SIZE_HASH_TABLE;
struct Info { ... };
struct Record
{
int key;
Info info;
};
Record hashTable[MAX_SIZE_HASH_TABLE];
The implementation supplies
a function ( called the hashing function ) that
maps keys to hash table indexes.
If the key is an int, then a good choice for the hashing function is:
hash = key %
MAX_SIZE_HASH_TABLE;
where MAX_SIZE_HASH_TABLE is a prime
number.
|
E.g. If MAX_SIZE_HASH_TABLE
= 5, a
record with key = 1 will hash to index
1 ( since 1 % 5 = 1 ) a
record with key = 7 will hash to index
2 ( since 7 % 5 = 2 ) a
record with key = 19 will hash to index 4 (since 19 % 5 = 4 )
|
Many
records can hash to the same index. We
refer to this situation as a collision.
The colliding records are called synonyms.
|
E.g. If MAX_SIZE_HASH_TABLE
= 5, a record with key = 1 will hash to index 1 ( since 1 % 5 = 1 ) a record with key = 6 will also hash to index 1 ( since 6 % 5 = 1
) |
Linear
Probing
Linear
Probing is one technique for dealing with collisions.
To
store a record, search forward from the hash index for the first free slot in
the hash table. Store the record in the
free slot. If you reach the end of the
table, continue scanning from the top of the table.
To
retrieve a record, search forward from the hash index for the record
table. If you reach the end of the
table, continue searching from the top of the table. If you reach a free slot,
then the record isn't in the table.
|
E.g. If MAX_SIZE_HASH_TABLE
= 7, a
record with key = 8 will hash to index
1 a
record with key = 13 will hash to index 6 a record with key
= 1 will hash to index 1; the record
is stored in the next free slot, namely at index 2 a record with key
= 6 will hash to index 6; the record is stored in the next free slot, namely
at index 0 ( slot 0 follows slot 6 ) a record with key
= 0 will hash to index 0; the record is stored in the next free slot, namely
at index 3
To search for the
record with key 1, search forward from index 1. The record is not found at index 1. The record is found at index 2. To search for the
record with key 15, search forward from index 1. ( 15 % 7 = 1 ) The record is not found at index 1. The record is not found at index 2. The record is not found at index 3. There is no record stored at index 4, therefore there is no stored record with key 15. |
Linear
probing, stores synonyms next to each other.
As the hash table becomes full, these groupings of synonyms coalesce
together and the time to insert or search a record becomes extremely long.

One should avoid using hash
tables that are near full.
Chaining
Another technique for
dealing with collisions is to store the synonyms in linear linked lists.
The hash table becomes a
table of pointers to the linked lists.

Hash Table File Structure
If the hash table's records
are stored on a hard disk, then we can reduce the number of disk accesses by
storing synonyms in a contiguous region of the hard disk. We call such regions buckets.
The bucket size will be
between 1 sector (the smallest portion of a hard disk that can be transfered to/from RAM) and 1 cluster (the smallest
contiguous portion of a hard disk that can be allocated).
Hash Table ADT
Specification
|
Prototype |
HashTable( /* in */ int maxSizeTable ); |
|
|
Purpose |
Create an empty hash table. |
|
|
Preconditions |
|
|
|
Postconditions |
An empty hash table has been created. |
|
|
Prototype |
~HashTable(); |
|
|
Purpose |
Destroy the hash table. |
|
|
Preconditions |
|
|
|
Postconditions |
The hash table has been destroyed. |
|
|
Prototype |
bool
Search( /* in */ int
key, /* out */ Info& info ); |
|
|
Purpose |
Search for an item. |
|
|
Preconditions |
|
|
|
Postconditions |
If a record is found whose stored key == key |
|
|
Prototype |
bool
Delete( /* in */ int key ); |
|
|
Purpose |
Delete a record. |
|
|
Preconditions |
|
|
|
Postconditions |
Function value = true, if there was a record with the
specified key, false otherwise. If the
function value = true, then the record whose stored key == key has been
deleted. If the function value =
false, the hash table is unchanged. |
|
|
Prototype |
bool
Insert( /* in */ int key, /* in
*/ Info info ); |
|
|
Purpose |
Insert a new record at its correct position. |
|
|
Preconditions |
The hash table is not full. |
|
|
Postconditions |
Function value = true if the record has been inserted. Function value = false, if there was already a record with
the specified key. If the function
value = false, the hash table is unchanged. |
|
|
Prototype |
bool
IsEmpty() const; |
|
|
Purpose |
Determine if the hash table is empty. |
|
|
Preconditions |
|
|
|
Postconditions |
Function value = true, if the hash table is empty, false
otherwise. |
|
|
Prototype |
bool
IsFull() const; |
|
|
Purpose |
Determine if the hash table is full. |
|
|
Preconditions |
|
|
|
Postconditions |
Function value = true, if the hash table is full, false
otherwise. |
|