Base64 in Ruby

Long ago, there was a problem. Computer users wanted to send binary files to other users, but the only ways to reach them (other than sending tapes through the mail) were email and systems like Usenet. Sending text files over these systems was fine, as the text (such as the source code for a program) was indistinguishable from a normal email message. Binary files couldn't be sent this way, because these systems were built to carry text. Binary files have to be encoded as text, in a format such as Base64, before being embedded in email messages.

Base64 is an encoding that provides "ASCII armor" for any binary file. Encode a binary file in Base64 and you'll get something that looks like this (these are the first three lines of ruby.png, the Ruby logo, encoded in Base64):


iVBORw0KGgoAAAANSUhEUgAAA+MAAAPkCAYAAADCge9oAAAACXBIWXMAAMBWAADAVgGB4Q5XAAAg
AElEQVR4nOy9CbglSVkmHHm2u1Xdqup9b3oBWhoaGlFkEWhQQHZpFmFkRIcZ/xmfceGRURTHnnEf
ZtR55tfxdxwVEXREQARchqVFkYZGFmnZ2oZuoLu6ura71L33rJn5xxcZkScyMiIyIjPPdu/31hOV

Long story short, you can use Base64 to store binary data where text is expected. You'll see it in XML files, databases, email messages, URLs and a variety of other places. It works by encoding every 3 bytes of the binary data as 4 printable characters, each of which carries 6 bits. Since 3 bytes become 4 bytes in the output, Base64 encoded files are about 33% larger than their source files (slightly more once padding and line breaks are counted).
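To make the grouping concrete, here is a minimal sketch that encodes a single three-byte group by hand and compares it with the library's result. The names BASE64_ALPHABET and encode_triplet are made up for this illustration and are not part of Ruby's Base64 library. The sketch also ignores padding, so it only handles input of exactly three bytes.

```ruby
require 'base64'

# The standard Base64 alphabet: the values 0..63 each map to one character.
BASE64_ALPHABET = (('A'..'Z').to_a + ('a'..'z').to_a + ('0'..'9').to_a + ['+', '/']).freeze

# Hypothetical helper for illustration: encodes exactly three bytes
# as four Base64 characters. Padding is not handled.
def encode_triplet(three_bytes)
  b0, b1, b2 = three_bytes.bytes
  bits = (b0 << 16) | (b1 << 8) | b2   # join the 3 bytes into one 24-bit number
  # Slice the 24 bits into four 6-bit values and look each one up
  [18, 12, 6, 0].map { |shift| BASE64_ALPHABET[(bits >> shift) & 0x3F] }.join
end

puts encode_triplet('Man')            # → TWFu
puts Base64.strict_encode64('Man')    # → TWFu
```

Three input bytes go in and four characters come out, which is where the roughly 33% size increase comes from.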

Encoding Base64 in Ruby

Ruby ships with a Base64 encoding library, so for most installations no extra gems are needed. (As of Ruby 3.4 it is a bundled gem rather than a default gem, so projects that use Bundler need to list base64 in their Gemfile.) To start using it, require 'base64'. Next, use the Base64.encode64 method to encode any string to Base64. It takes a single argument, a string of data to be encoded, and returns a new string with the encoded data. The argument can be a string loaded from a file or a string holding binary data. Remember that though it's called a String, a Ruby string can hold any sequence of bytes.


#!/usr/bin/env ruby
require 'base64'

# Open in binary mode ('rb') so the bytes are read exactly as stored
File.open('ruby.png', 'rb') do |image_file|
  puts Base64.encode64(image_file.read)
end

Running this (after replacing 'ruby.png' with a file that you actually have) should give output similar to the example above, only much longer: hundreds of lines for large files. This text can safely be stored in databases, sent in emails, etc.

There is a variant of the encode64 method called Base64.strict_encode64 that complies with RFC 4648 (the RFC that formally defines Base16, Base32 and Base64), which stipulates that no newlines may appear in the output. Where encode64 inserts a newline after every 60 characters of output, strict_encode64 produces the same Base64 data as one long line.
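The difference between the two methods is easy to check for yourself. The following is a small sketch using arbitrary sample data (100 copies of the letter x, chosen only because it is long enough to force a line break).

```ruby
require 'base64'

data = 'x' * 100   # arbitrary sample data, long enough to wrap

wrapped = Base64.encode64(data)
strict  = Base64.strict_encode64(data)

puts wrapped.include?("\n")   # encode64 breaks its output into lines
puts strict.include?("\n")    # strict_encode64 never does

# Apart from the line breaks, the two outputs are the same
puts wrapped.delete("\n") == strict
```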

Another variant is Base64.urlsafe_encode64. Base64's character set is mostly letters and numbers, but it includes the symbols + and /, which have special meanings in URLs. This variant uses - and _ in their place. But beware: many Base64 decoders don't know about this substitution. You must either swap the characters back yourself before feeding the data to a standard Base64 decoder server-side, or use the Base64.urlsafe_decode64 method.
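Here is a short sketch of the URL-safe alphabet in action. The three sample bytes are an assumption, picked because their standard encoding happens to consist entirely of + and / characters.

```ruby
require 'base64'

# Three bytes chosen so the standard encoding uses only '+' and '/'
raw = "\xFB\xEF\xFF".b

puts Base64.strict_encode64(raw)    # → ++//
puts Base64.urlsafe_encode64(raw)   # → --__

# Two ways to get the original bytes back from the URL-safe form
url_safe = Base64.urlsafe_encode64(raw)
puts Base64.urlsafe_decode64(url_safe) == raw
puts Base64.strict_decode64(url_safe.tr('-_', '+/')) == raw
```

The tr call in the last line is the manual replacement described above, for cases where only a standard decoder is available.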

Decoding Base64 in Ruby

Decoding Base64 is just as easy in Ruby. Use the methods Base64.decode64, Base64.strict_decode64 and Base64.urlsafe_decode64, each of which reverses the encoding method of the matching name. These methods take a single argument, a string holding the encoded data, and return a string holding the decoded data.
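A quick round trip shows the decoders at work. This is a sketch with made-up sample data. It also shows one behavior worth knowing about: strict_decode64 raises an ArgumentError when its input is not valid Base64.

```ruby
require 'base64'

# Sample data containing bytes that are not printable text
original = "binary \x00\x01\xFF data".b

encoded = Base64.encode64(original)
decoded = Base64.decode64(encoded)
puts decoded == original   # → true

# strict_decode64 refuses input that is not well-formed Base64
begin
  Base64.strict_decode64('not valid base64!')
rescue ArgumentError => e
  puts "strict_decode64 rejected the input: #{e.message}"
end
```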

A Caveat

When encoding and decoding Base64, it's preferable to collect your entire input stream first. If you try to decode only a portion of a Base64 encoded string, there can be unintended results. A single character in a Base64 encoded string represents only 6 bits of the original data, and characters are only meaningful in groups of four (which decode to three bytes). If you take a random sample of a Base64 encoded file and decode it, you may start decoding in the middle of a byte, or leave part of a byte off at the end. For this reason, only encode and decode complete streams.
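The following sketch illustrates the problem with an arbitrary sample string. A slice that starts on a four-character boundary decodes to a clean fragment of the original, while a slice that starts one character later decodes to unrelated bytes.

```ruby
require 'base64'

text    = 'Hello, Base64 world!'
encoded = Base64.strict_encode64(text)

# Starts at character 4, a multiple of 4, so it lines up with byte 3 of the text
aligned = Base64.decode64(encoded[4, 8])
puts aligned.inspect       # → "lo, Ba"

# Starts one character later, so every decoded byte straddles two original bytes
misaligned = Base64.decode64(encoded[5, 8])
puts misaligned.inspect    # bytes that appear nowhere in the original text
```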

Another downside to using this library is that it only works on strings. Behind the scenes it relies on pack and unpack, which operate on whole strings in memory. This means you won't want to encode very large files with it: both the entire input and the entire output have to be held in memory at once, and the output string is about 33% larger than the input. So to encode a 1 gigabyte file, you'll need roughly 2.33 gigabytes of memory. If you need to encode large files, most systems (such as Linux or OS X) come with a base64 command-line utility that will be much more efficient than loading all that data as strings.
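If you'd rather stay in Ruby, one possible workaround is to encode the input a chunk at a time. The sketch below is not part of the Base64 library: encode_stream and CHUNK_BYTES are hypothetical names, and the chunk size is an arbitrary choice. The one firm requirement is that the chunk size be a multiple of 3, so that only the final chunk can need = padding.

```ruby
require 'base64'
require 'stringio'

# Must be a multiple of 3; the exact value is an arbitrary choice.
CHUNK_BYTES = 3 * 16_384

# Hypothetical helper: reads from any IO-like object and writes
# unwrapped Base64 text to another, one chunk at a time.
def encode_stream(input, output, chunk_bytes = CHUNK_BYTES)
  # read(n) returns nil at end of input, which ends the loop
  while (chunk = input.read(chunk_bytes))
    output.write(Base64.strict_encode64(chunk))
  end
  output
end

# Demonstration with in-memory IO objects. File objects opened with
# 'rb' (input) and 'w' (output) can be passed in the same way.
source = StringIO.new('some binary data ' * 10_000)
target = StringIO.new
encode_stream(source, target)
puts target.string == Base64.strict_encode64(source.string)   # → true
```

With real files, only one chunk and its encoded form are in memory at any moment, regardless of the file size.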

Encoding Without the Base64 Library

As mentioned above, the Base64 library implements the actual encoding and decoding with the Array#pack and String#unpack methods. In fact, the methods in this library are thin wrappers that are only a line or so long. Here is the source of the Base64.encode64 method.


  def encode64(bin)
    [bin].pack("m")
  end

If you need to encode or decode Base64 data and cannot or do not want to load the Base64 library, you can call pack and unpack directly: [data].pack("m") encodes a string, and encoded.unpack("m").first decodes one.
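Put together, a library-free round trip might look like the following sketch (the sample data is made up). It also uses the "m0" directive, which produces output with no line breaks, in the manner of strict_encode64.

```ruby
# No require needed: pack and unpack are built into Array and String
data = "any \x00 bytes".b

encoded = [data].pack('m')             # line-wrapped, like Base64.encode64
decoded = encoded.unpack('m').first    # unpack returns an array, hence .first
puts decoded == data                   # → true

strict = [data].pack('m0')             # a count of 0 means no line breaks
puts strict.include?("\n")             # → false
puts strict.unpack('m0').first == data # → true
```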

©2014 About.com. All rights reserved.