1. Technology
Send to a Friend via Email
You can opt-out at any time. Please refer to our privacy policy for contact information.

How to Perform List Operations in Ruby

By

How to Perform List Operations in Ruby

List operations are at the heart of many programming languages, especially Lisp-based or inspired languages. If you first learned to program on such a language, you see all problems as a list, and solving list problems without list operations is going to be difficult. So if operations like car and cdr are your forte, Ruby has you covered.

Lists

A list is an abstract term. At its most abstract, a list is an ordered collection of elements. This says nothing about the implementation of the list, only that the list is an ordered collection of elements. This is referred to as an Array in Ruby, and this has some minor implications over a generic list type.

In Ruby, when you talk about lists you're talking about Arrays. An Array in Ruby is a proper array, if you look at the implementation of the Array class (or, in this case, the C struct that holds the actual objects), you can see that it is indeed an array. While many languages choose to implement the list type as a linked list, Ruby uses an Array. This means operations like push, which adds an element to the end of a list, can not only result in an allocation of a larger block of data, but a potentially expensive memcpy operation. And any operation that operates on the beginning or middle of, or combines two lists, such as push or [1,2,3] + [4,5,6] will have similar negative performance implications. So while Ruby does implement a list API it does not implement them as a linked list, so intensive list operations can perform very poorly. However, it may not be as bad as you think since the Array only holds references to the objects, each reference being the size of a pointer.

But don't let that make you think that Ruby arrays can only hold a single type. The Array is implemented in C, and the C code doesn't know the type of the values it's referring to. All it sees is a C array of pointers. Ruby arrays can hold any type, and any mixture of types, that it chooses to.

Creating Lists

There are a few ways to create Arrays in Ruby. The first and most common is the Array literal. An Array literal takes a sequence of objects and creates an Array object. This is similar to the cons keyword in Lisp, it constructs a list from the arguments given. The following example shows this.


irb(main):001:0> [1,2,3,4,5]
[1, 2, 3, 4, 5]
irb(main):002:0> ['a', 'b', 'c']
["a", "b", "c"]
irb(main):003:0> [1,2,3,'a','b','c']
[1, 2, 3, "a", "b", "c"]

You can also create Array objects using the Array.new constructor. Without arguments, this will create an empty array.


irb(main):004:0> Array.new
[]

It can also take a size argument, and will create an array of that size filled with nil.


irb(main):005:0> Array.new(5)
[nil, nil, nil, nil, nil]

Or if you don't want nil, you can pass another argument to tell it what value you do want.


irb(main):006:0> Array.new(5, 'test')
["test", "test", "test", "test", "test"]

And if you want to define each element of the list using a block, you can do this.


irb(main):007:0> Array.new(3){ sleep 1; Time.now }
[2013-02-04 18:13:14 -0500, 2013-02-04 18:13:15 -0500, 2013-02-04 18:13:16 -0500]

This is equivalent to doing Array.new(3).map{ sleep 1; Time.now }, but more about map below.

The Beginning of the List (CAR)

The Lisp operation car (etymology of the term 'car') evaluates to the first element of the list. This is also referred to as the head of the list, or the list's first element. This can be done a few ways in Ruby.

Note that car operations always return a single element, not a list.

The first is simply to address the first element of the list. This is done with the standard index operator with an index of 0.


irb(main):008:0> [1,2,3][0]
1

And more semantically as some_array.first.


irb(main):009:0> [1,2,3].first
1

Similarly, the last element can be retrieved using the index -1 or the last method. As lists are typically singly linked lists on other languages, this isn't in the typical set of list operations, but as Ruby implements Arrays as an Array, it has O(1) access to the end of the array.


irb(main):010:0> [1,2,3][-1]
3
irb(main):011:0> [1,2,3].last    
3

The Rest of the List (CDR)

The cdr operation in Lisp returns a list of all elements in the list minus the first element. So a cdr operation on [1,2,3] would yield [2,3]. As Ruby's Arrays are not implemented as singly linked lists, there is no semantic way of doing this. You can, however, pass a range operator to the index method and do the same thing. The range of 1..-1 is used, starting at 1 (skipping the first element) and ending at -1 (the last element).


irb(main):013:0> [1,2,3][1..-1]
[2, 3]

Alternatively, you can use the stack operations to do something similar. The push and pop stack operations work on the end of the list, but their companion operations unshift and shift operate on the front of the list. You can use shift as a kind of cdr, however it doesn't evaluate to the rest of the list, it evaluates to the old head of the list, the value that was popped off the front of the list.


irb(main):014:0> some_array = [1,2,3]
[1, 2, 3]
irb(main):015:0> some_array.shift  
1
irb(main):016:0> some_array
[2, 3]

Is the List Empty?

Another common task is to see if a list is empty. This is an easy one for Ruby, use the empty? method.


irb(main):017:0> [1,2,3].empty?
false
irb(main):018:0> [].empty?
true

CAR and CDR Using Splat and Soak

Another way to do this is using the splat and soak operators. Say you're processing a list, and want to get the head of the list and the rest of the list in two variables. You can use the comma assignment idiot with soak on the second operator, and splat on the array on the right side. This is essentially performing the same thing as shift (see below) though, and may be less readable. Especially since the splat and soak operators are not everyday commonplace Ruby, not everyone uses it on a regular basis. And again, this is making a whole new array and copying all the elements (minus one) into the soaked parameter. Memory copying is slow, it should be avoided where possible.


irb(main):062:0> some_array = [1,2,3]
[1, 2, 3]
irb(main):063:0> head,*rest = *some_array
[1, 2, 3]
irb(main):064:0> head
1
irb(main):065:0> rest
[2, 3]

Adding to a List

While a list may be constructed with an Array literal, that's a bit too rigid for most uses. Most lists are formed by parsing a file, calculating some values or otherwise programmatically. The value are not known when the program is written and cannot be in the literal. Array building can be done a few ways in Ruby.

First are push and pop and their cousins unshift and shift. The push and pop methods are the classic stack operations. They respectively add an element to the end of a list and remove the last element from the list. The push method yields the array itself (with its new last element), while the pop element yields the value popped from the end of the list.


irb(main):021:0> some_array = [1,2,3]
[1, 2, 3]
irb(main):022:0> some_array.push(4)
[1, 2, 3, 4]
irb(main):023:0> some_array
[1, 2, 3, 4]
irb(main):024:0> some_array.pop
4
irb(main):025:0> some_array
[1, 2, 3]

Similarly, the unshift and shift methods operate on the front of the array. Though these aren't used as often, as they both need to move large chunks of memory every time (rather than only with the worst case scenario with push if it needs to allocate a larger block of memory). This can be very slow on large lists. The names always confused me as well, I always think of shift as shifting an element onto the list, but it's the other way around. The shift method is pop from the front. But this obviously isn't an issue for everyone.


irb(main):026:0> some_array = [1,2,3]
[1, 2, 3]
irb(main):027:0> some_array.unshift(0)
[0, 1, 2, 3]
irb(main):029:0> some_array
[0, 1, 2, 3]
irb(main):030:0> some_array.shift
0
irb(main):031:0> some_array
[1, 2, 3]

Ruby also has another way to build arrays, the array combination and append operations. To combine two arrays, you can use the + operator. This operator, when applied to an Array, expects an Array on the right-hand side and will evaluate to the two arrays joined into one. Note that it does not modify either array, it returns a whole new array. So, again, it involves memory copying and shouldn't be used unless that's what you really want. The concatenation methods below may be better suited.


irb(main):035:0> a = [1,2]
[1, 2]
irb(main):036:0> b = [3,4]
[3, 4]
irb(main):037:0> c = a + b
[1, 2, 3, 4]
irb(main):039:0> a
[1, 2]
irb(main):040:0> b
[3, 4]
irb(main):041:0> c
[1, 2, 3, 4]

Concatenation is different, it will add to the end of an Array without generating a new array. So while the + operator requires both arrays be copied into a new array, the concatenation operations only normally need to copy the elements onto the end of the array, that's much less copying. Though it's not "free" If the new array doesn't have enough room to accommodate the new array elements, it must re-allocate and copy the entire array into a new block of memory, which is just as much copying as with the + operator. So its worst case is the same as the + operator, but its best case is far better. If possible, prefer concatenation to +.

There are two operations that perform concatenation to an Array in Ruby. The first is the << operator. This adds a single element to the end of an Array. But concatenating two arrays with this operator will not work.


irb(main):042:0> some_array = [1,2,3]
[1, 2, 3]
irb(main):043:0> some_array << 4 
[1, 2, 3, 4]
irb(main):044:0> some_array
[1, 2, 3, 4]
irb(main):045:0> some_array << [5,6]
[1, 2, 3, 4, [5, 6]]
irb(main):046:0> # That's not what we wanted...

To concatenate one array onto the end of another, use the concat method. This resizes the array to accommodate the new data and copies just that data over. This is (usually) more efficient than using the + operator.


irb(main):066:0> a = [1,2]
[1, 2]
irb(main):067:0> b = [3,4]
[3, 4]
irb(main):068:0> a.concat(b)
[1, 2, 3, 4]
irb(main):069:0> a
[1, 2, 3, 4]

Transforming a List

Finally, and perhaps the most important or powerful of the list operations is the transform all of the elements of a list into something else. Say you've gathered the names of all of the US Senators in a list and you want to turn this list into a list of how they all voted on a certain bill, you could use the map method. The map method takes a block and yields each element of the array to that block, and collects every object the block evaluates to and puts them in a new array. A simple example would be to add 1 to every number in a list.


irb(main):080:0> a = [1,2,3]
[1, 2, 3]
irb(main):081:0> b = a.map{|i| i+1 }
[2, 3, 4]
irb(main):082:0> b
[2, 3, 4]

While this, at first, doesn't seem very useful, you'll find yourself using it quite often. To cross-reference two lists, to construct lists, to filter or fix items in a list, etc. It doesn't look like you'll use it often, but you will.

  1. About.com
  2. Technology
  3. Ruby
  4. Beginning Ruby
  5. Collections: Arrays, Hashes and Enumerable
  6. How to Perform List Operations in Ruby

©2014 About.com. All rights reserved.