
Indexing Strings

When strings are used for output, they're often viewed as single immutable blocks of text. Once you form the string using string literals and interpolation, it is often simply output or stored in a variable to be output later. However, when strings are used as input, they're not usually taken at face value. Some type of parsing, indexing, sanitization, etc. is performed. Among the most basic of these tasks are indexing characters in a string and iterating over a string or its sub-strings.

However, before reading this article, you'll need to know your Ruby version. The way string indexing is handled changed in Ruby 1.9.x, so if you're using Ruby 1.8.x, you'll need to do a bit more work. To see what version of Ruby you're running, run the following command.


$ ruby --version
ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-darwin11.0.0]

The following example program demonstrates most uses of the String index operator String#[]. This method takes a variety of parameters, and the output changes drastically depending on the type of parameter. Also note that the first form differs depending on your version of Ruby. On Ruby 1.8.x it will return the ASCII value of the indexed character, and on 1.9.x it will return a single-character string containing the indexed character.


#!/usr/bin/env ruby
puts "Running on #{RUBY_VERSION}"

s = "My hovercraft is full of eels"

puts s[0]
puts s[0,2]
puts s[3..12]
puts s[/\w+/]
puts s[/(hover\w+)/, 1]
puts s["eels"]

And the output of this program running on 1.8.7 and 1.9.2.


$ rvm 1.8.7,1.9.2 index.rb
Running on 1.8.7
77
My
hovercraft
My
hovercraft
eels
Running on 1.9.2
M
My
hovercraft
My
hovercraft
eels

  • If the only argument is a Fixnum, the indexed character is returned. If you're running 1.8.x, the character will be returned as a Fixnum representing the ASCII value of the indexed character. To turn this back into a single-character string, use the Fixnum#chr method, like str[0].chr. In Ruby 1.9.x, this is unnecessary.
  • If the arguments are two Fixnum objects (call them a and b), then the sub-string starting at index a with length b is returned. This can also be used to get around the incompatibility above: instead of saying str[0], you can simply say str[0,1] and the behavior will be the same across both versions. Also, don't confuse this with the Range version below, where the second number is an ending index and not a length.
  • If the only argument is a Range object, the sub-string running from the range's beginning index to its ending index will be returned.
  • If the only argument is a Regexp object, the first matching text will be returned.
  • If the first argument is a Regexp object with a capture group and the second argument is a Fixnum, then the text captured by that numbered group in the regular expression's MatchData is returned.
  • Finally, if the first argument is a String, that string will be returned if it occurs in the indexed string, otherwise nil will be returned.
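
The forms above can be sketched in a version-independent way. This is only an illustration, not part of the original example program: the helper name char_at is hypothetical, and the word "parrot" is just a stand-in for text that does not occur in the string.

```ruby
# Hypothetical helper (not a built-in): returns the character at index i as a
# one-character string on both Ruby 1.8.x and 1.9.x by using the
# start/length form instead of the single-index form.
def char_at(str, i)
  str[i, 1]
end

s = "My hovercraft is full of eels"

puts char_at(s, 0)                 # start/length form: one character at index 0
puts s[3..12]                      # inclusive range: indices 3 through 12
puts s[3...13]                     # exclusive range covering the same characters
puts s[/(\w+) of (\w+)/, 2]        # text captured by the second group
puts s["eels"].inspect             # the argument itself, because it occurs in s
puts s["parrot"].inspect           # nil, because "parrot" does not occur in s
```

Checking for nil is a simple way to test whether a sub-string or pattern was found before using the result.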

Negative Indexing

Positive indices count from the beginning of the string, but negative indices count from the end of the string. For example, to get the last character in a string, you could use the method call str[-1,1].

Negative indices work with three of the indexing modes: a single Fixnum, a pair of Fixnums (start and length), and a Range. The following example demonstrates the usage.


#!/usr/bin/env ruby
puts "Running on #{RUBY_VERSION}"

s = "My hovercraft is full of eels"

puts s[-1]
puts s[-4,4]
puts s[-4..-1]

And the example's output on 1.8.x and 1.9.x.


$ rvm 1.8.7,1.9.2 reverse_index.rb 
Running on 1.8.7
115
eels
eels
Running on 1.9.2
s
eels
eels
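
One way to think about a negative index is as an offset from the string's length. The following is a small sketch under that reading; the variable names are only illustrative. It uses the two-argument form throughout so that it behaves the same on 1.8.x and 1.9.x.

```ruby
s = "My hovercraft is full of eels"

# A negative start of -4 selects the same position as length - 4.
last_four        = s[-4, 4]
same_by_position = s[s.length - 4, 4]

# The range form reaches the same characters: -4 up to and including -1.
same_by_range = s[-4..-1]

# Version-independent way to get the last character as a string.
last_char = s[-1, 1]

puts last_four
puts same_by_position
puts same_by_range
puts last_char
```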

Index Assignment

The index assignment operator String#[]= replaces part of the string in place; it modifies the existing string rather than creating a new one. The part of the string to replace is selected the same way as the indexed portion explained above. The replacement need not be the same length either, as the string grows or shrinks as needed. The following example demonstrates this usage by replacing the word "eels" (and, in the last case, "hovercraft") with various things.


#!/usr/bin/env ruby
puts "Running on #{RUBY_VERSION}"

orig_s = "My hovercraft is full of eels"

s = orig_s.dup
s[-4] = "pe"
puts s

s = orig_s.dup
s[-4, 4] = "orangutans"
puts s

s = orig_s.dup
s[-4..-1] = "angry dwarfs"
puts s

s = orig_s.dup
s[/\w+$/] = "high fructose corn syrup"
puts s

s = orig_s.dup
s[/(hover\w+)/, 1] = "banana-shaped house"
puts s

And its output.


$ ruby index_assign.rb 
Running on 1.8.7
My hovercraft is full of peels
My hovercraft is full of orangutans
My hovercraft is full of angry dwarfs
My hovercraft is full of high fructose corn syrup
My banana-shaped house is full of eels

Note that since I wanted to start from the original string each time, the string had to be duplicated each time. Why? The String#[]= method alters the string object itself and doesn't return a new string each time it's called, so simply using the assignment operator, as in s = orig_s, won't help. Assignment only makes a new reference to the same object, and the String#[]= method changes that object. Duplicating the string with dup each time solves this problem.
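
The difference between plain assignment and dup can be seen in a short sketch. The variable names alias_s and copy_s and the replacement words are only illustrative.

```ruby
orig_s = "My hovercraft is full of eels"

# Plain assignment: alias_s refers to the same String object as orig_s.
alias_s = orig_s
alias_s[-4, 4] = "bees"
puts orig_s                      # the original has changed too
puts alias_s.equal?(orig_s)      # true: both names refer to one object

# dup: copy_s is a separate String object with the same contents.
copy_s = orig_s.dup
copy_s[-4, 4] = "wasps"
puts orig_s                      # unchanged by the edit to copy_s
puts copy_s
```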


©2014 About.com. All rights reserved.