Hashes are the Swiss Army knife of Ruby data structures. You will see Hashes absolutely everywhere, even standing in for missing language features. It could even be said that you will not see a non-trivial Ruby program that does not use Hashes. But what are Hashes? How do you use them?

What is a Hash?

A Hash is a data structure that matches keys with values. Whereas an Array is indexed with integers (the data is in order, and each element is indexed by its position in that order), a Hash is indexed by arbitrary keys. To access a value from a Hash, you must index the Hash using another object that acts as the key. This object is typically a Symbol or String, or more rarely an Integer, but it can be anything that uniquely identifies the value being accessed in that Hash. Since Ruby 1.9, a Hash does remember the order in which its keys were inserted and iterates in that order, but values are still looked up by key, never by position.

You can think of Arrays like a stack of papers. Each paper has a number on its top right telling the Array where it goes in the stack. To access a particular paper, you must know the number, and Ruby can search down through the stack of papers until it finds that paper. Hashes, on the other hand, are not looked up by position. All of the papers are stored in different drawers all over the office. They might not even be in the same building. Each paper has a key at its top right, and when you ask Ruby for the paper with this key, it will know which file cabinet it is in, and which folder, and will retrieve the paper for you.
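The difference between the two kinds of lookup can be sketched in a few lines. The variable names and values here are made up purely for illustration.

```ruby
# An Array is indexed by integer position.
papers = ['invoice', 'memo', 'report']
second_paper = papers[1]          # "memo"

# A Hash is indexed by a key object (here, Symbols).
drawer = { :invoice => 'Invoice #1001', :memo => 'Lunch moved to noon' }
memo = drawer[:memo]              # "Lunch moved to noon"

# Keys do not have to be Symbols; Strings and Integers work too.
mixed = { 'name' => 'Alice', 42 => 'the answer' }
answer = mixed[42]                # "the answer"
```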

This all comes down to how hashes (also called "associative arrays") work. A Hash is a hash table: a collection of small "hash buckets." Ruby computes a number, the hash code, from the key (remember, the key can be any object, but is typically a String or Symbol) and uses it to pick a bucket, which holds a few objects paired with their keys. Hash codes are not guaranteed to be unique, so two different keys can land in the same bucket. Ruby then only has to look through this small bucket, instead of looking through a long list of objects, to find the object you want.
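To make the bucket idea concrete, here is a toy hash table written in plain Ruby. It is only a sketch of the general technique: the class name ToyHash and the fixed bucket count are assumptions made for this example, and Ruby's real Hash is implemented in C and is considerably more sophisticated.

```ruby
# A toy hash table, for illustration only (not how Ruby's Hash is written).
class ToyHash
  def initialize(bucket_count = 8)
    # Each bucket is a small array of [key, value] pairs.
    @buckets = Array.new(bucket_count) { [] }
  end

  def []=(key, value)
    bucket = bucket_for(key)
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value            # Key already present: overwrite the value
    else
      bucket << [key, value]     # New key: append to this bucket
    end
  end

  def [](key)
    pair = bucket_for(key).find { |k, _| k.eql?(key) }
    pair && pair[1]              # nil when the key is absent
  end

  private

  # Object#hash returns an Integer; the remainder picks a bucket.
  def bucket_for(key)
    @buckets[key.hash % @buckets.size]
  end
end

toy = ToyHash.new
toy[:name] = 'Alice'
toy['city'] = 'Paris'
toy[:name] = 'Bob'                # Overwrites the earlier value
```

Only the one bucket chosen by the key's hash code is ever searched, which is why a lookup stays fast even as the table grows (a real implementation also adds more buckets as it fills up).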

Why Hashes, Not Arrays or Structs?

If you're coming from C, C++ or a number of other more static languages, you may be used to making structs, classes, what have you with data members. These are your bread and butter in these languages. A connection handle has a socket, the address it's connected to, what time it connected, the state, etc. Each of these members goes into a pre-defined data structure, and they're accessed very quickly using memory offsets. But things are a bit different in Ruby.

Ruby's focus is on dynamic code. A static data structure is not dynamic; it's the exact opposite. Ruby regulates itself more by convention, protocol and Test-Driven Development. While Ruby does have Structs and OpenStructs, the Hash is often the preferred way to group related data. And, of course, Arrays were used for this in more primitive languages. I can recall using DATA statements in BASIC all the time, remembering offsets and such, but in a modern programming language Arrays and ordered collections are rarely used for anything but actual ordered data.
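As a rough comparison, here is the connection handle from above written once as a Struct and once as a Hash. The member names, the address and the retries key are all hypothetical, chosen only for this sketch.

```ruby
# A Struct fixes its members up front, much like a C struct.
connection_class = Struct.new(:address, :connected_at, :state)
conn_struct = connection_class.new('192.0.2.10', Time.now, :open)
struct_state = conn_struct.state                        # :open

# The Struct has no room for a member it was not defined with.
struct_has_retries = conn_struct.respond_to?(:retries)  # false

# A Hash holds the same data, and new keys can be added at any time.
conn_hash = { :address => '192.0.2.10', :connected_at => Time.now, :state => :open }
conn_hash[:retries] = 3
hash_state = conn_hash[:state]                          # :open
```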

Using Hashes

Hashes are front and center in Ruby. Hashes are created using the Hash class or the Hash literal syntax (which just creates Hash objects in a prettier, more meaningful way). In their most basic and common use, you create a Hash object and then use the index and index assignment operators to store and retrieve values by key.

The following shows a few ways to create hashes.

hash1 = Hash.new  # Create a new hash object
hash2 = {}  # An empty hash literal, does the same as above
hash3 = { :key => 'value' }  # This hash has a key value pair already in it
hash4 = { key: 'value' }  # The same as above, using the shorter syntax introduced in Ruby 1.9
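If you want to convince yourself that the two literal syntaxes really do build the same Hash, a quick check along these lines should do it (the variable names are just for this example):

```ruby
old_style = { :key => 'value' }          # "Hash rocket" syntax
new_style = { key: 'value' }             # Symbol-key shorthand from Ruby 1.9

same = (old_style == new_style)          # true: both hold the Symbol key :key
key_class = new_style.keys.first.class   # Symbol
```

Note that the shorthand always produces Symbol keys. For String or other keys, the => form is still needed.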

Then, once you have a hash object created, there's really only one thing left to do: store and retrieve values using keys.

hash = {}  # Start with an empty hash
hash[:key] = 'value'  # Typical assignment
puts hash[:key]  # Typical retrieval
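It is also worth knowing what happens when a key is not there. The following sketch uses a made-up key, :no_such_key, to show the behavior of the index operator alongside the standard Hash#key? and Hash#fetch methods.

```ruby
hash = { :key => 'value' }

missing = hash[:no_such_key]                      # nil: the index operator does not raise
present = hash.key?(:key)                         # true
fallback = hash.fetch(:no_such_key, 'default')    # "default"

# fetch without a fallback raises KeyError for a missing key.
raised = false
begin
  hash.fetch(:no_such_key)
rescue KeyError
  raised = true
end

hash[:key] = 'new value'                          # Assigning to an existing key overwrites it
```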

And remember that keys can be any object, but are typically Symbols. Why Symbols instead of Strings? Many languages use Strings for associative array keys, but Symbols are more efficient in Ruby. If the Symbol :key is used 20 times throughout a program, only one :key object exists and every use refers to it. However, if the String literal "key" is used 20 times throughout the program, Ruby will by default make 20 (or more) String objects, and the hashing function (which converts the object into a hash bucket index) is more expensive for a String than for a Symbol. So you can use Strings as keys; it's just not recommended when a Symbol will do.
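The one-object-versus-many difference can be observed directly. This sketch uses String.new so that each String is guaranteed to be a separate object regardless of how string literals are configured; the variable names are only for illustration.

```ruby
# Every occurrence of :key refers to the same Symbol object.
same_symbol = :key.equal?(:key)            # true

# String.new builds a fresh String each time, so the two objects are
# distinct even though their contents are equal.
first = String.new('key')
second = String.new('key')
same_string_object = first.equal?(second)  # false
equal_contents = (first == second)         # true

# String keys still work for lookup, because a Hash compares keys by
# hash code and eql?, not by object identity.
by_string = { 'key' => 'value' }
found = by_string[second]                  # "value"
```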

©2014 About.com. All rights reserved.