© www.cellspark.com
HashStore is a disk data management
system written in Java. It offers:
- Source available under the General Public License for free use or under a commercial license
for commercial products.
- Same method calls as java.util.Hashtable
- The option to add duplicate keys to the table
- Theoretical capacity of one billion entries per table
- No disk space needs to be preallocated and no capacity needs to be specified
- Never a need to rehash or do any other maintenance
- Total JAR file size for all classes less than 8K
- Small and fixed memory requirements
- Deletions result in disk space immediately being given back to the file system
- Disk space usage increases linearly with the number of entries
- Synchronized and non-synchronized options
HashStore is designed to provide a disk-based version of Hashtable, thereby providing
long-term persistence for Java objects. Objects stored must implement the
HashStore consists of 4 classes:
DiskHashtable - The base non-synchronized disk hashtable
SyncDiskHashtable - A synchronized version of DiskHashtable
DuplicateDiskHashtable - A disk hashtable that allows duplicate keys
SyncDuplicateHashtable - A synchronized version of DuplicateDiskHashtable
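
Since the classes are said to offer the same method calls as java.util.Hashtable, code written against Hashtable should carry over with little change. The sketch below is only an illustration: it uses java.util.Hashtable itself as a stand-in, because the constructor arguments of DiskHashtable are not described here. The class name HashtableCallsSketch and the method demo are hypothetical.

```java
import java.util.Hashtable;

class HashtableCallsSketch {
    // Exercises a few of the Hashtable calls that HashStore is said to mirror.
    static String demo(Hashtable<String, Integer> table) {
        table.put("apples", Integer.valueOf(3));
        table.put("pears", Integer.valueOf(5));
        Integer apples = table.get("apples");          // look up by key
        boolean hasPears = table.containsKey("pears"); // membership test
        table.remove("pears");                         // delete one entry
        return apples + " " + hasPears + " " + table.size();
    }

    public static void main(String[] args) {
        // Stand-in only: a real program would construct one of the
        // HashStore classes here (constructor not shown in this document).
        System.out.println(demo(new Hashtable<String, Integer>()));
    }
}
```

With a disk-backed table the same calls would presumably read and write files rather than memory, so each call is likely to cost far more than its in-memory equivalent.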
How does it work?
The idea behind HashStore is to let the underlying file system do the work of organising storage space, which
is why disk usage expands and contracts according to requirements. It is also the
reason why there is no need to preallocate space or worry about capacity, and how the classes can be so small.
Disk space usage
Disk space usage will increase linearly with the number of entries in a table.
As objects are added, bear in mind that even a simple object such as a single
integer still occupies one file on disk, and that file probably won't take up less than 1K.
On the other hand, as entries are deleted, the space is immediately given back to the
file system for reuse.
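
The gap between the size of an object and the space its file occupies can be estimated with a small sketch. It assumes that objects are written with standard Java serialization and that the file system allocates space in 1K units; neither is confirmed here, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class EntryFootprintSketch {
    // Size in bytes of an object written with standard Java serialization
    // (assumed storage format, not confirmed for HashStore).
    static int serializedSize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            return bytes.size();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // Space a file of the given length occupies when the file system
    // hands out whole allocation units of blockSize bytes.
    static long allocated(long length, long blockSize) {
        long blocks = (length + blockSize - 1) / blockSize; // round up
        return blocks * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 1024; // assumed allocation unit
        int payload = serializedSize(Integer.valueOf(42));
        long onDisk = allocated(payload, blockSize);
        System.out.println("serialized bytes: " + payload);
        System.out.println("bytes on disk:    " + onDisk);
        System.out.println("1000 such files:  " + 1000 * onDisk);
    }
}
```

The total grows in step with the number of files, which matches the linear growth described above; the per-file overhead depends on the allocation unit of the file system in use.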
How are the object files stored?
Clearly all files can't be stored in a single directory. HashStore uses an
algorithm to spread files across a directory structure that is within the
capabilities of the file system. The top level consists of two directories, one for the keys and
one for the values. Under that there are three more levels of directories until we get to the
The object's hash value translates directly into a filename where the object is stored. Where there
is a collision, the filename has a number appended to it which shows its position in the collision chain.
With a billion entries distributed evenly, there should be no more than
a thousand data files in any bottom-level directory.
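
One layout consistent with these figures is three directory levels of 100 directories each: that gives 100 x 100 x 100 = 1,000,000 bottom-level directories, and a billion entries spread evenly over them comes to 1,000 files per directory. The sketch below illustrates that arithmetic and a possible hash-to-path mapping. The fan-out of 100, the "." separator before the collision number, and all names are assumptions made for illustration, not the actual HashStore algorithm.

```java
class HashPathSketch {
    static final int FAN_OUT = 100; // assumed directories per level

    // Maps a hash value to a path three directory levels below root.
    // chainPosition 0 means no collision; higher values are appended
    // to the filename to mark the position in the collision chain.
    static String pathFor(String root, int hash, int chainPosition) {
        long h = hash & 0xFFFFFFFFL; // treat the hash as unsigned
        long level1 = h % FAN_OUT;
        long level2 = (h / FAN_OUT) % FAN_OUT;
        long level3 = (h / (FAN_OUT * FAN_OUT)) % FAN_OUT;
        String name = Long.toString(h);
        if (chainPosition > 0) {
            name = name + "." + chainPosition; // hypothetical suffix format
        }
        return root + "/" + level1 + "/" + level2 + "/" + level3 + "/" + name;
    }

    // Files per bottom-level directory if entries are spread evenly.
    static long filesPerBottomDirectory(long entries) {
        long bottomDirs = (long) FAN_OUT * FAN_OUT * FAN_OUT;
        return (entries + bottomDirs - 1) / bottomDirs; // round up
    }

    public static void main(String[] args) {
        System.out.println(pathFor("keys", 1234567, 0));
        System.out.println(pathFor("keys", 1234567, 2));
        System.out.println(filesPerBottomDirectory(1000000000L));
    }
}
```

Deriving every directory name from the hash alone means a lookup can go straight to the right directory without consulting a separate index, which fits the idea of leaving the organisation of storage to the file system.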
HashStore is distributed under the standard GPL. If it is to be used for
a commercial product then a one-time (royalty-free) commercial license should be
What about Collections?
HashStore is deliberately Java 1.1 compliant so that it can
run in Internet Explorer. At this stage the Collections interface is not implemented but may be in a future