Tutorial - Intro to Hash Tables

intro to hash tables

hash tables can seem daunting at first to the new scripter on the scene (and have seemingly managed to confuse some of the old scripters as well). as it turns out, hash tables (at least mirc's hash tables) are really easy to use and understand.

first things first: how best to think about a hash table. mirc's interface for hash tables is a lot like dealing with associative arrays in some languages and mappings in others. the easiest way to think of them, in my opinion, is as a 2 column table. the first column is your item name and the second column is the data that goes with the item.

hash table creation

lets make a hash table, since you have to make it before you can use it.

in the mirc.hlp file we see the syntax for making a hash table is: /hmake -s <name> <S>

the -s tells mirc you want to have it tell you it made the hash table by printing a message in the status window. this switch is completely optional.

<name> as it implies corresponds to the name you give your hash table

<S> is the size of the hash table. this is the part that gets people confused. it in no way reflects how many items you can actually put in the hash table; it only helps determine how quickly mirc can deal with your hash table. if you expect to put 1000 items in your table, then a hash table size of 100 is large enough to keep decent performance.

lets go ahead and make our hash table
/hmake MyHashTable 50
what that just did is tell mirc to make a new hash table with the name "MyHashTable" of size 50 (again this isnt the limit of the number of items we can put in the hash table, this number only deals with the speed of working with the hash table)

lets look at an illustration of MyHashTable as it stands right now:
items      |       data
-----------|-------------
           |
           |
           |
           |
           |
congratulations, you just made your first blank hash table ;}

*********** UPDATE: 9.26.2000 ****************

first of all i miscopied out of the mirc.hlp file for the syntax of /hmake. (i was copying it out of my memory and didnt check it against the real help file)

the proper syntax is: /hmake -s <name> <N> (i used an <S> instead of a <N>)

also it has been brought to my attention that the <N> parameter used in /hmake is still confusing some people (understandably so if you dont know what a real hash table is).

basically the <N> is something used internally by mIRC when it allocates your hash table. it has absolutely nothing to do with a limit on the number of items you can put into your hash table. really the only thing it affects is how much memory is used initially by your hash table, how fast mirc can add information to your hash table, and how fast mirc can get information out from your hash table.

when dealing with a hash table, as provided by mirc, there are 2 values that fall under the heading of "size". the first is the <N> value you specified when you used /hmake to create your table. the second is the number of items that are actually in your table (which can be found with $hget(name,0).item). we will refer to this second value as K. the mirc.hlp file says "if you expect that you'll be storing 1000 items in the table, a table of N set to 100 is quite sufficient."

in other words if you already know how many items (K as we said before) you are going to put in your hash table, then when you are making your hash table you should use about (K / 10) for N. this rule of thumb still provides pretty good hash table performance while keeping the initial memory requirement for the hash table relatively low.

ive been asked why not just use the value K for N. well you can if you like, but i really dont expect you to gain a great deal more performance out of the table than by just using (K / 10). using a value of N that is much larger than K would also be pointless, as it just wastes space in the initial table and wont make your table run any faster than if you just used K for N.

if you didnt follow that, then here is the simple way of doing it. ask yourself "how many items am i going to put in this table?" take that number and divide by 10 (or so, it really doesnt matter all that much). you are left with the N value to use when you /hmake your hash table.

example: im going to store script settings in a hash table to make accessing them easier. i have about 200 different settings that i want in that table. 200 / 10 = 20. when i make the table i use /hmake settings 20.
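
as a rough sketch of that rule of thumb, here is a small alias that does the division for you. the alias name maketable and its two parameters are made up for this example, they are not part of mirc.

```mirc
; hypothetical helper: /maketable <name> <expected number of items>
alias maketable {
  ; N is about a tenth of the expected item count, rounded down
  var %n = $int($calc($2 / 10))
  ; never ask for a table smaller than 1
  if (%n < 1) { var %n = 1 }
  hmake $1 %n
}
```

with this loaded, /maketable settings 200 ends up running /hmake settings 20.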

hope this clears up any confusion

adding data to our table

lets add some data to this table. from mirc.hlp we find the syntax to be: /hadd -s <name> <item> <data>

the -s switch is the exact same as for /hmake

<name> as it implies is the name of our hash table

<item> is the entry in the first column of our table

<data> is the corresponding entry in the second column

lets add some data to this hash table:
/hadd MyHashTable autojoin1 #mirc
/hadd MyHashTable autojoin2 #blazingsaddles
and again we look at what our table currently looks like:
items      |      data
-------------------------------
autojoin1  |  #mirc
autojoin2  |  #blazingsaddles
           |
one important thing to remember with /hadd is that you HAVE to specify the data to go along with the item. specifying the item alone isnt enough, and mirc will come back with an error similar to "* /hadd: insufficient parameters".

another important thing to remember is that there can be only one item of a given name in the same table. adding an item that already exists replaces its data. if i take the table we are working on now and /hadd MyHashTable autojoin1 #chatzone then the resulting table will be:
items      |      data
-------------------------------
autojoin1  |  #chatzone
autojoin2  |  #blazingsaddles
           |
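since /hadd replaces the data of an existing item without asking, a script that should not overwrite anything can look the item up first. this is only a sketch: the alias name addonce is made up for the example, and it relies on $hget(name,item), which is covered further down.

```mirc
; hypothetical helper: /addonce <name> <item> <data>
alias addonce {
  ; $hget(name,item) returns $null when the item is not in the table
  if ($hget($1,$2) == $null) hadd $1 $2 $3-
  else echo -s $2 is already set to $hget($1,$2)
}
```

with the table above, /addonce MyHashTable autojoin1 #mirc would leave #chatzone in place and just report it.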

deleting entries from our table

mirc.hlp lists /hdel -sw <name> <item> as the deletion command.

the -s switch is again the same as the previous commands

<name> is again the name of the hash table we want to delete entries from

<item> is the name of the item that we want deleted if the -w switch wasnt used. if the -w switch was used then <item> is a wild card match of all the items we want to delete from the table.

so lets say we dont need the autojoin2 information anymore. we run /hdel MyHashTable autojoin2 and the resulting table looks like:
items      |      data
-------------------------------
autojoin1  |  #chatzone
           |
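the -w switch is handy when several items share a prefix. assuming the table still held both autojoin1 and autojoin2, a single command along these lines should remove them both, since each name matches the wildcard autojoin*:

```mirc
; delete every item whose name matches autojoin*
/hdel -w MyHashTable autojoin*
```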

getting information from your hash table

mirc.hlp lists three different versions of the $hget identifier. we will tackle them one at a time.

first off is: $hget(name/N) or $hget(name/N).size

name is the name of your table.

N is a way of looking through your hash tables numerically (but is really beyond the scope of this tutorial)

used with just the name parameter, $hget returns the table name you specified if that table exists (that is, if you made it with /hmake); otherwise it returns $null. if you add the .size property then the size of the table (the size you used when you ran /hmake) is returned. the .size value returned for a table that does not exist seems to always be 0. you are probably better off using $hget(name) first to see if the table exists and only then checking its .size, as i do not know whether this behavior will change in future mirc versions.
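
here is one way that "check first, then ask for the size" advice might look in a script. the alias name tableinfo is made up for this example.

```mirc
; hypothetical helper: /tableinfo <name>
alias tableinfo {
  ; $hget(name) is $null when the table was never made (or was freed)
  if ($hget($1) == $null) echo -s table $1 does not exist
  else echo -s table $1 was made with size $hget($1).size
}
```

for the table made earlier, /tableinfo MyHashTable should report a size of 50.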

second is: $hget(name/N,item)

again name is just the name of the table you want to look at

item is the name of the item you want to see the data for
//echo -s $hget(MyHashTable,autojoin1) would echo #chatzone to the status window

third version is: $hget(name/N,N).item

this is not the preferred way of looking at the data in hash tables and really is beyond the scope of this tutorial. basically, this version gives the scripter the ability to see every entry in a hash table without knowing the item names. the one exception is $hget(name,0).item, which returns the total number of items stored in your hash table.
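
for the curious, a minimal sketch of walking a whole table this way is below. the alias name listtable is made up, and the .data property (the data of the Nth entry) is assumed to be available in your mirc version, so check /help $hget if it does not work. do not count on the entries coming out in the order you added them.

```mirc
; hypothetical helper: /listtable <name>
alias listtable {
  var %i = 1
  ; $hget(name,0).item is the number of items in the table
  while (%i <= $hget($1,0).item) {
    echo -s $hget($1,%i).item = $hget($1,%i).data
    inc %i
  }
}
```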

saving your hash table

mirc's hash tables are superfast and are great for storing all kinds of information, both temporary info and script settings. the drawback is that you lose them when you exit mirc, much as everything stored in RAM is lost when you turn your computer off (which is why you have to boot up when you turn it back on). luckily Khaled added support for saving our hash tables to disk. enter the /hsave command.

mirc.hlp lists the /hsave command as: /hsave -sbnoa <name> <filename>

the -s switch is exactly the same as for the previous commands

the -b switch is to tell mirc to do a binary hash save, which basically amounts to keeping the $cr's and $lf's intact if you have any in the data part of your table. if you didnt put any in the data column of your table then you probably have no reason to use the -b switch.

the -o switch tells mirc that if the file specified with <filename> exists, that mirc is to overwrite that old file with the new one you are trying to save

the -a switch tells mirc that if the file specified with <filename> exists, that mirc is to append the new file at the end of the old file (essentially combine the files into 1 larger file)

the -n switch tells mirc to save the data part of the table only, none of the items. this is not a recommended way of saving your hash table as usually the item information is just as important as the data information.

<name> is again the name of the hash table

<filename> is the name of the file you want mirc to store the hash table information to
/hsave -o MyHashTable myhashtable.tmp
what this does is first check to see if the file myhashtable.tmp exists and if it does then delete it. next it stores the content of MyHashTable to the file myhashtable.tmp

deleting your hash table

mirc.hlp lists the hash table deletion command as: /hfree -sw <name>

the -s is the same as for all the other hash table commands

<name> is the name of the hash table you want deleted. if the -w switch is used then <name> is the wildcard match for all of the hash tables you want deleted.
/hfree MyHashTable
to check to see if it is indeed deleted:
//echo -s $hget(MyHashTable)
since MyHashTable no longer exists, the $hget identifier returns $null. that leaves /echo with nothing to print, so mirc should give you an error about "* /echo: insufficient parameters".

loading a hash table from a file

in order to load a hash table from a file you first have to make sure it has a table it can go into

we can do this by /hmake MyNewHashTable 10

now we have to load the file into the hash table

mirc.hlp lists: /hload -sbn <name> <filename>

the -s is the same as before

the -n is used to load files saved with the -n switch.

the -b is used to load files saved with the -b switch
/hload MyNewHashTable myhashtable.tmp
now to see if the data got loaded
//echo -s $hget(MyNewHashTable,autojoin1)
that should echo #chatzone to the status window if the table was saved and loaded properly.
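
putting /hsave and /hload together, a script can keep a table alive between sessions by loading it when mirc starts and saving it when mirc exits. the following is only a sketch of that idea for a remote script file, reusing the table and file names from above; the size of 50 is just the value used earlier.

```mirc
on *:START:{
  ; make the table only if it is not there already
  if ($hget(MyHashTable) == $null) hmake MyHashTable 50
  ; load the saved copy if one exists
  if ($isfile(myhashtable.tmp)) hload MyHashTable myhashtable.tmp
}

on *:EXIT:{
  ; write the table back to disk, replacing the old file
  if ($hget(MyHashTable) != $null) hsave -o MyHashTable myhashtable.tmp
}
```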

conclusion

i do hope that this information was useful to those of you struggling to understand what hash tables are. i hope this has taught you that you really dont have to understand them in order to be able to make good use of them ;}

some possible uses for hash tables, to speed a lot of things up:

clone scanning
script settings
!seen data

anything not covered in this tutorial is left as an exercise for the reader. i dont want to take all the fun away from discovering new ways of doing things on your own, now do i?