Hashing ADT

 

The hashing ADT stores records in a table.

 

     const int MAX_SIZE_HASH_TABLE;

 

struct Info { ... };

 

struct Record

{

    int  key;

    Info info;

};

 

     Record hashTable[MAX_SIZE_HASH_TABLE];

 

The implementation supplies a function ( called the hashing function ) that maps keys to hash table indexes.

 

If the key is an int, then a good choice for the hashing function is:

 

    hash = key % MAX_SIZE_HASH_TABLE;

 

where  MAX_SIZE_HASH_TABLE  is a prime number.

 

E.g.

 

If MAX_SIZE_HASH_TABLE = 5,

 

a record with key = 1  will hash to index 1 ( since 1 % 5 = 1 )

a record with key = 7  will hash to index 2 ( since 7 % 5 = 2 )

a record with key = 19 will hash to index 4 (since 19 % 5 = 4 )

 

 

 

Many records can hash to the same index.  We refer to this situation as a collision.  The colliding records are called synonyms.

 

 

E.g.

 

If MAX_SIZE_HASH_TABLE = 5,

 

a record with key = 1  will hash to index 1 ( since 1 % 5 = 1 )

a record with key = 6  will also hash to index 1 ( since 6 % 5 = 1 )

 

 

 

Linear Probing

 

Linear Probing is one technique for dealing with collisions.

 

To store a record, search forward from the hash index for the first free slot in the hash table.  Store the record in the free slot.  If you reach the end of the table, continue scanning from the top of the table.

 

To retrieve a record, search forward from the hash index for the record table.  If you reach the end of the table, continue searching from the top of the table. If you reach a free slot, then the record isn't in the table.


 

E.g.

 

If MAX_SIZE_HASH_TABLE = 7,

 

a record with key = 8  will hash to index 1

a record with key = 13 will hash to index 6

 

a record with key = 1  will hash to index 1; the record is stored in the next free slot, namely at index 2

 

a record with key = 6 will hash to index 6; the record is stored in the next free slot, namely at index 0  ( slot 0 follows slot 6 )

 

a record with key = 0 will hash to index 0; the record is stored in the next free slot, namely at index 3

 

To search for the record with key 1, search forward from index 1.  The record is not found at index 1.

  The record is found at index 2.

 

To search for the record with key 15, search forward from index 1. ( 15 % 7 = 1 )

The record is not found at index 1.

The record is not found at index 2.

The record is not found at index 3. 

There is no record stored at index 4, therefore there is no stored record with key 15.

 

 

 

Linear probing, stores synonyms next to each other.  As the hash table becomes full, these groupings of synonyms coalesce together and the time to insert or search a record becomes extremely long.

 

 

One should avoid using hash tables that are near full.

 

 

Chaining

 

Another technique for dealing with collisions is to store the synonyms in linear linked lists.

 

The hash table becomes a table of pointers to the linked lists.

 

 


Hash Table File Structure

 

If the hash table's records are stored on a hard disk, then we can reduce the number of disk accesses by storing synonyms in a contiguous region of the hard disk.  We call such regions buckets. 

 

The bucket size will be between 1 sector (the smallest portion of a hard disk that can be transfered to/from RAM) and 1 cluster (the smallest contiguous portion of a hard disk that can be allocated).

 

 

 


Hash Table ADT Specification

 

 

Prototype

HashTable( /* in */ int maxSizeTable );

 

Purpose

Create an empty hash table.

 

 

Preconditions

 

 

Postconditions

An empty hash table has been created.

 

 

Prototype

~HashTable();

 

Purpose

Destroy the hash table.

 

 

Preconditions

 

 

Postconditions

The hash table has been destroyed.

 

 

Prototype

bool Search( /* in  */ int key, /* out */ Info& info );

 

Purpose

Search for an item.

 

Preconditions

 

Postconditions

If a record is found whose stored key == key
then function value = true and info = record's info
else function value = false. 

 

Prototype

bool Delete( /* in  */ int   key );

 

Purpose

Delete a record.

 

Preconditions

 

Postconditions

Function value = true, if there was a record with the specified key, false otherwise.  If the function value = true, then the record whose stored key == key has been deleted.  If the function value = false, the hash table is unchanged.

 

Prototype

bool Insert( /* in  */ int  key, /* in  */ Info info );

 

Purpose

Insert a new record at its correct position.

 

Preconditions

The hash table is not full.

 

Postconditions

Function value = true if the record has been inserted.

Function value = false, if there was already a record with the specified key.  If the function value = false, the hash table is unchanged.

 

Prototype

bool IsEmpty() const;

 

Purpose

Determine if the hash table is empty.

 

Preconditions

 

Postconditions

Function value = true, if the hash table is empty, false otherwise.

 

Prototype

bool IsFull() const;

 

Purpose

Determine if the hash table is full.

 

Preconditions

 

Postconditions

Function value = true, if the hash table is full, false otherwise.