Hash Tips and Tricks

There's more you can do with Hashes than simply storing and retrieving key/value pairs. Below are a number of tricks. Some may not actually be useful, and some may be considered obscene perversions, but all of them highlight the flexibility of the Hash class.

Indexed Anything

When you create a new Hash object, you can give it a block, which is stored as a proc and called whenever the index operator is given a key the hash doesn't have. You can use this to implement mathematical functions that automatically cache their results, so repeated calls with the same parameter are much faster.

In this example, we'll implement the Fibonacci sequence. It's a simple sequence: it starts with 0 and 1, and each following number is the sum of the previous two. So the next is 1 (0 + 1 = 1), then 2 (1 + 1 = 2), and so on. The way it's implemented, the hash automatically caches the result for every index you ask for.

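#!/usr/bin/env ruby

fib = Hash.new do |hash, key|
	curr, succ = 0, 1
	# Step forward key times; curr ends up as the key-th Fibonacci number
	key.times { curr, succ = succ, curr + succ }
	hash[key] = curr  # Store the result so this key never reaches the block again
end

puts fib[0]
puts fib[1]
puts fib[10]
puts fib[5]
puts fib[10]  # Won't have to compute fib[10] again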

With just a few lines, this hash became an automatically caching method of sorts. But what's the real advantage of doing it this way? The cache is sparse. It holds only the indexes that were actually requested, not the entire sequence up to the largest one.
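To see the sparseness for yourself, here is a small sketch using the same block as above. The only change is that the hash is stored in a constant, FIB, which is a naming choice made for this illustration.

```ruby
# Same caching block as above, stored in a constant (an illustrative choice).
FIB = Hash.new do |hash, key|
  curr, succ = 0, 1
  key.times { curr, succ = succ, curr + succ }
  hash[key] = curr
end

puts FIB[10]           # 55
puts FIB[5]            # 5
puts FIB[10]           # 55, read straight from the cache this time

# Only the indexes that were requested are stored.
puts FIB.keys.inspect  # [10, 5]

# key? looks at the stored keys only, so it never triggers the block.
puts FIB.key?(7)       # false
```

Note that key? is a handy way to peek at the cache, since indexing with [] would compute and store the value.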

OpenStruct is a Hash With method_missing

One cup of Hash, a sprinkling of method_missing and voilà, you have an OpenStruct. A Struct in Ruby is supposed to be a faster, more restricted data structure. It's much like a Hash, except the keys are fixed. Or, well, it's closer to a C or C++ structure, and very un-Ruby. OpenStruct, on the other hand, keeps the struct way of accessing members using the standard dot syntax, but the keys are not fixed. It's also no faster than a Hash, since it's just another layer on top of a Hash.

If you take a look at the source code for OpenStruct, it has a hash instance variable and all the heavy lifting is done with method_missing. To make things a bit faster, method_missing even goes so far as to define new methods for each of the hash keys.
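The idea can be sketched in a few lines. This is not the real OpenStruct source, just a minimal illustration of the hash plus method_missing recipe; the class name TinyStruct and the @table variable are made up for this example.

```ruby
# A toy stand-in for OpenStruct (hypothetical class, not the library code).
class TinyStruct
  def initialize(attrs = {})
    @table = {}
    attrs.each { |key, value| @table[key.to_sym] = value }
  end

  # Called by Ruby whenever a method isn't defined on the object.
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?('=')
      # person.age = 22 arrives here as :age= with 22 as the argument
      @table[key.chomp('=').to_sym] = args.first
    elsif @table.key?(name)
      @table[name]
    else
      super  # Unknown member: fall back to the usual NoMethodError
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?('=') || @table.key?(name) || super
  end
end

person = TinyStruct.new(name: 'Alice')
person.age = 22
puts person.name  # Alice
puts person.age   # 22
```

One design difference worth noting: this sketch raises NoMethodError for a member that was never set, while the real OpenStruct returns nil.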

A Hash of Hashes

Back in the Text Adventure Tutorial, we represented the world as a hash of hashes. This is not technically a tip, trick or hack, but it does demonstrate that you can represent more complex data structures using hashes. Now, the computer scientist in you is probably shuddering right now. Representing a binary tree using a hash table is remarkably inefficient. The whole point of a binary tree is to be blindingly fast, no? Yes, but sometimes you just need to organize the data, and performance is not a concern. That's especially true in Interactive Fiction, since 99.999% of the time is spent waiting for the user to figure out where to go next (probably while they're drawing notes and maps on graph paper, or, if they're a cheater, using Google).
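As a rough sketch of what such a structure can look like, here is a tiny world stored as a hash of hashes. The room names, the :exits and :description keys and the move method are all invented for this illustration and are not taken from the tutorial.

```ruby
# Each room is a hash; its :exits entry is another hash mapping
# a direction to the name of the neighboring room. (Hypothetical data.)
WORLD = {
  cellar:  { description: 'A damp cellar.',
             exits: { up: :kitchen } },
  kitchen: { description: 'A tidy kitchen.',
             exits: { down: :cellar, east: :garden } },
  garden:  { description: 'An overgrown garden.',
             exits: { west: :kitchen } }
}

# Returns the room reached by going in direction, or the same room
# if there is no exit that way. Hash#dig walks the nested hashes
# and returns nil as soon as any level is missing.
def move(world, room, direction)
  world.dig(room, :exits, direction) || room
end

room = :cellar
room = move(WORLD, room, :up)
puts WORLD[room][:description]  # A tidy kitchen.
room = move(WORLD, room, :north)
puts room                       # kitchen (no exit that way)
```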

Strict Hashes

Similar to the "Indexed Anything" tip, this uses the default_proc attribute. However, instead of generating a result, the opposite behavior is expected. A strict hash must have the key available: any attempt to access a key that doesn't exist will raise an exception.

#!/usr/bin/env ruby

hash = Hash.new do |h, k|
	raise IndexError  # Reached only when the key is missing
end

hash[:test] = "something"
puts hash[:test]
puts hash[:error]  # Will crash program

As you can see, this is pretty simple. Any missing keys fall through to the block passed to the Hash constructor (which is the same thing as assigning a proc to default_proc), and this block only does one thing: it raises an IndexError exception (one of Ruby's built-in exception classes). Since Hashes are so ad hoc and are often used in place of structs from lower level languages, you may want the security of knowing that a hash always returns a known value instead of a default value. Others will argue that this is irrelevant, and that this is what TDD is for.
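In a real program you would probably rescue the exception rather than let it crash. The sketch below adds an error message to the strict hash (the message text and the constant name STRICT are choices made for this example) and compares it with Hash#fetch, which gives strict behavior on an ordinary hash for a single lookup.

```ruby
# A strict hash whose exception names the offending key.
STRICT = Hash.new do |_hash, key|
  raise IndexError, "no such key: #{key.inspect}"
end
STRICT[:test] = 'something'

begin
  STRICT[:error]
rescue IndexError => e
  puts e.message  # no such key: :error
end

# key? checks for a key without triggering the block.
puts STRICT.key?(:error)  # false

# On a plain hash, fetch raises KeyError for a missing key.
# KeyError is a subclass of IndexError, so the same rescue catches both.
plain = { test: 'something' }
begin
  plain.fetch(:error)
rescue IndexError => e
  puts e.class  # KeyError
end
```

The difference is where the strictness lives: with the block, every [] lookup on that hash is strict, while fetch makes the caller opt in each time.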

Hashes and Format Strings

Format strings allow you to easily insert data into a string without resorting to string interpolation. Hashes can be "crammed" into format strings very easily, and the syntax is much cleaner than interpolation. It's best to just see how this works.

#!/usr/bin/env ruby

person = { name:'Alice', age:22 }

# The "bad" way
puts "Hello #{person[:name]}, you are #{person[:age]} years old"

# The "better" way
puts "Hello %{name}, you are %{age} years old" % person

Neither way is really "better". Interpolation is the traditional way to do things, while format strings are not often used in Ruby and may be unfamiliar to many programmers. However, as you can see, the format string is more succinct: the word "person" is not repeated and, even in this very small example, you can already see things cleaning up. In a larger example, you can significantly reduce the size of the string as well as increase readability. There is one downside though: the % operator only takes a single hash. If you need values from two hashes, you have to merge them first or fall back on interpolation.
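Here is a short sketch of a few more things a hash-driven format string can do. The JOB hash and the introduction method are invented for this illustration; format is Kernel#format, which does the same job as the % operator.

```ruby
PERSON = { name: 'Alice', age: 22 }
JOB = { title: 'engineer' }  # Hypothetical second hash

def introduction(person, job)
  # merge returns a new hash, so neither input is modified.
  # %{name} inserts the value as is; %<age>d formats it as an integer.
  format('%{name} is %<age>d and works as an %{title}', person.merge(job))
end

puts introduction(PERSON, JOB)
# Alice is 22 and works as an engineer

# A key that the format string names but the hash lacks raises KeyError.
empty = {}
begin
  puts 'Hello %{name}' % empty
rescue KeyError
  puts 'The hash has no :name key'
end
```

The KeyError is worth knowing about: unlike person[:name] inside an interpolated string, which quietly yields nil for a missing key, the format string fails loudly.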

©2014 About.com. All rights reserved.