Hash table separate chaining source code

11 minutes
Share the link to this page
Copied
  Completed
You need to have access to the item to view this lesson.
This is a free item
$0.00
د.إ0.00
Kz0.00
ARS$0.00
A$0.00
৳0.00
Лв0.00
Bs0.00
B$0.00
P0.00
CA$0.00
CHF 0.00
CLP$0.00
CN¥0.00
COP$0.00
₡0.00
Kč0.00
DKK kr0.00
RD$0.00
DA0.00
E£0.00
ብር0.00
€0.00
FJ$0.00
£0.00
Q0.00
GY$0.00
HK$0.00
L0.00
Ft0.00
₪0.00
₹0.00
ISK kr0.00
¥0.00
KSh0.00
₩0.00
DH0.00
L0.00
ден0.00
MOP$0.00
MX$0.00
RM0.00
N$0.00
₦0.00
C$0.00
NOK kr0.00
रु0.00
NZ$0.00
S/0.00
K0.00
₱0.00
₨0.00
zł0.00
₲0.00
L0.00
QR0.00
SAR0.00
SEK kr0.00
S$0.00
฿0.00
₺0.00
$U0.00
R0.00
ZK0.00
Already have an account? Log In

Transcript

All right, it's time to have a look at some separate chaining source code for the hash table. So here I am on my GitHub repository will MCs slash data structures. And you can find the source code here under the hash table folder. I have multiple implementations of the hash table. Today we're going to be looking at the hash table with separate chaining and in later videos, probably one of these three, I haven't figured out which one yet. So let's dive into the code.

I have it here on my computer. So let's get going. All right. So first things first, I have two classes, one of them called entry the other one just separate chaining hash table. So let's have a look at the entry class first, and these entries represent individual items are key value pairs, you would want to be inserting into the hash table. So in Java, we have generics, so a generic key, which has to be hashable, and some value.

So when I create an entry, I give it the key value pairs. And I also compute the hash code. So there's a built in method in Java to compute the hash code for a particular object. And you can override it to specify the hash code for your particular object, which is really convenient. So compute the hash code and cache it, you absolutely want to cache it. So you don't have to recompute this thing multiple times.

It should be cheap to compute. But for something like a string of hash code can take linear time to compute, which is not good. So here I have an equals method, which doesn't override the object equals method because I don't want to have to do any casting. First it checks if the hashes are not equal if the hash is not equal. We know from the first video Do that they're absolutely not equal, so you can return false. Otherwise, I compare the keys to each other.

And that's about it for the entry class. Very simple, very basic. Now let's get into the meat of thing. So the hash table itself. Okay, so my default hash table capacities and v3, so it holds three, three items, and load factor by default if the user doesn't specify one, zero and B 0.75. So that's the maximum capacity I'm willing to tolerate.

Then I have a few important instance variables when you go versal, the maximum load factor. So once the load factor goes above this value, then I resize the table. So there's a capacity so the actual maximum number of items that could or how big the table sizes rather. So the threshold so this is computed to be the capacity times the max load factor. So tells us, hey, you're above the threshold resize time to resize size, how many items are currently in the table. And this is the table itself.

So it's an array of linked lists, which themselves have entries. Pretty simple. So there's a few constructors. So you can construct a hash table just using the default settings with initial capacity, or the capacity and a max load factor. So this is definitely constructor. Let's have a look.

So just say for the max load factor is compute default capacity. And make sure you take the max of say default capacity capacity just so that I know you don't want the capacity be too low. There may be weird things happen. If each pass is set to one or something like that, then calculate threshold and then finally initialize the table. Really simple. Let's have a look at all these methods right here.

Society returns number of elements side hash table empty is the hash table empty. Okay, so this is the first really interesting method. So normalize index, and it's used when we want to convert a hash value into an index. And it says in the comments here, essentially this strips the negative sign and place the hash value in a domain zero capacity. Because hash values are integers, they can be anywhere in the domain of an integer which is about minus 32. To the 31, sorry, to positive to the 31. around that.

So what this does is the mask is stripped The negative sign from the hash value and then modded by the capacity. So bring it into this domain, so we can actually use it as a lookup index. Next up is clear. So we just weren't clear on the table. That's straightforward. Contains key in Haskell do the same thing.

So we're going to do is compute given a key. So we want to find out this key exists within hash table. Right? So we're going to do is compute the keys, hash, normalize it. And then now use the bucket index. So which bucket should this key appear?

Should it be in a hash table? I'm just going to seek to see if I'm going to seek to find the entry. If the entry is not equal to no exists. Physical to know it doesn't exist. Simple right? For add an insert are all common names for just putting something into the hash table or updating a value inside the hash table to.

So we don't want to allow no keys, that's something we absolutely don't want. So just throw an exception there. Otherwise, we're going to create a new entry, find the bucket it belongs to, and then insert it using this method we'll get to. Okay, get. So given a key, I want to find the value associated with that key. Again, don't allow no keys.

And this will be pretty routine all the time. Always want to find which bucket this particular key belongs to. So get the bucket index and find the entry. assuming it's not no then this is the entry you want and return its value. If it is no Well, the key doesn't exist. So return out.

Okay, suppose we want to remove a key now from the hash table. So he is not no find the bucket index. And we're going to call this private remove entry method, which is just down here. So given the bucket index, so which, which bucket does this keep one two, we're going to do is we're going to seek for the entry inside the linked list structure. And if the entry is not now then we're going to extract the actual link list and remove it. So this is a built in a datatype in Java, so it was removed from that link this data structure decrement the size and return the answer value, that's all we have to do.

Otherwise, it's not there. So return now. So insert, bucket, insert, and So given a particular bucket, we want to place this entry inside of it. Okay. So first, since we have the bucket index, we can go in our table and automatically get the linkless structure, check out bucket. So bucket is no Well, we have to create a new linked list.

So we're essentially lazily allocating these linked list data structures, which is good, because we don't want to use more memory than we need to. So next up, I find the entry that already exists. So this is in case we want to do an update, for instance. So if the existence entry is no, this means that we need to add a new entry to the end of the blank list. So add it to the end of the link lists, increment the size, and check if we're above the threshold, and if so, Resize the table. And yeah, use Delta indicate that there was no previous entry otherwise need to update that entry.

So then just update the value in the existing entry and return. All right. So seek entry. This is a method we've been using pretty much everywhere, it finds the returns particular entry at a given bucket index, if it exists, otherwise, just return now. They probably know what's going on by now. So extract the length list at the particular bucket index.

Otherwise, return now it doesn't exist yet. Otherwise, seek through each of the entries in the link list and compare the keys to the key we're after. And if there's a match, return that entry otherwise return null. Okay, here's the last really complicated method called risk. Size table. So this resize the initial table doubling the capacity of the table.

First we double the capacity. We recompute the new threshold because now we have a higher threshold because you've increased the capacity. Now create a new table with this new capacity. So this new table is bigger than the current table. Then go through the current table, look for linkless the structures which are not know, if you find one, we'll loop through all this entries, calculate the bucket index, find the bucket and essentially insert it into this new table. And after that, completely remove the old data that was in the old table.

Now the end, set the table to be equal to the new table. Okay. And then these last two methods are just return out the keys and the values within the hash table. fairly simple. And these last two methods are iterators and to string which we don't need to go over. So that's essentially separate chaining and nutshell, fairly simple to implement with the link lists much more difficult to implement with, say, a balanced binary tree or something like that.

So guys, thanks for watching. I hope you learned something. If you did, hit the subscribe button, hit the like button and leave me a comment. Awesome. See you later.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.