Getting Your First Page

Nearly every use of Mechanize starts with a single GET request, made from the Mechanize object itself. The Mechanize#get method returns a Page object, which is where most of the action happens. In its simplest form, Mechanize#get takes a single argument: the URL of the page you want to get.


page = agent.get('http://www.google.com/')

The second parameter to Mechanize#get is a hash of parameters. This hash is turned into the CGI (query string) parameters appended to the URL in the format ?a=b&c=d. If you don't intend to pass any CGI parameters but must pass something for the third or fourth parameter to Mechanize#get, use the empty array literal [].


# Will make request to
# http://example.com/?key1=value1&key2=value2
agent.get('http://example.com', {:key1 => 'value1', :key2 => 'value2'})
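
To see what that hash turns into, here is a minimal sketch that builds the same kind of URL using only Ruby's standard library. The url_with_params helper is a hypothetical name of my own, not part of Mechanize, and this is not necessarily how Mechanize builds the URL internally. It only illustrates the ?a=b&c=d format described above.

```ruby
require 'uri'

# Hypothetical helper (not part of Mechanize): builds the URL that a GET
# request with the given parameters would ask for.
def url_with_params(base, params)
  uri = URI.parse(base)
  # A bare host such as http://example.com has an empty path; use "/".
  uri.path = '/' if uri.path.empty?
  # An empty hash or the [] placeholder adds no query string at all.
  uri.query = URI.encode_www_form(params) unless params.empty?
  uri.to_s
end

puts url_with_params('http://example.com', { :key1 => 'value1', :key2 => 'value2' })
puts url_with_params('http://example.com', [])
```

Passing [] leaves the URL without a query string, which is why it works as a placeholder when you only care about the later parameters.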

The third parameter is the referrer. Some web sites will serve you differently depending on which page or site referred you to the page you are requesting. For instance, many image sharing sites will not serve you an image unless the referrer is that site itself, to prevent bandwidth leeching. However, the referrer is just an HTTP request header, and can easily be set to anything by the requester.

A referrer comes in the form of a URL. So, if I were to click a link on http://www.reddit.com/ that leads to http://ruby.about.com/, the request would include the following header.


Referer: http://www.reddit.com/

Before you ask, yes, that is misspelled on purpose. It was spelled incorrectly in the original RFC, and it's been that way ever since. But at any rate, to pass a referrer, pass the URL as a string. If you need to pass something for the fourth parameter to Mechanize#get but don't want to pass a referrer, use the nil value.


agent.get('http://ruby.about.com/',
          [],
          'http://www.reddit.com/')
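
Since the referrer is nothing more than a request header, you can see the effect without Mechanize at all. The following is a rough sketch using Ruby's standard Net::HTTP library. The request_with_referer helper is a hypothetical name for illustration, and the request is only built, never sent. It is not a description of what Mechanize does internally.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Mechanize): builds a GET request and
# sets the Referer header only when a referrer is given.
def request_with_referer(url, referer = nil)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['Referer'] = referer if referer
  request
end

request = request_with_referer('http://ruby.about.com/', 'http://www.reddit.com/')
puts "Referer: #{request['Referer']}"
```

Leaving the referrer out (or passing nil) produces a request with no Referer header at all.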

The final parameter to Mechanize#get is a set of custom headers. Custom request headers are rarely used with Mechanize, as their primary uses are referrer spoofing and cookies, both of which Mechanize handles in a more elegant manner. But at any rate, if you need to pass any custom request headers, they can be passed as a hash. Note that Mechanize will not pass on any unknown headers, and header keys must be strings and capitalized correctly.


agent.get('http://ruby.about.com/',
          [],
          'http://www.reddit.com/',
          {'Referer' => 'http://www.reddit.com/'})
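
Because all four arguments are positional, it is easy to lose track of which placeholder goes where. Here is one possible sketch of a small wrapper that fills in the [] and nil placeholders described above and checks that header keys are strings. The get_args helper and its keyword names are my own invention, not part of Mechanize, and the Accept-Language header is only an example value.

```ruby
# Hypothetical helper (not part of Mechanize): returns the four positional
# arguments for Mechanize#get, using [] and nil as placeholders.
def get_args(url, params: [], referer: nil, headers: {})
  headers.each_key do |key|
    # Header keys must be strings, e.g. 'Accept-Language', not :accept_language.
    raise ArgumentError, "header key must be a String: #{key.inspect}" unless key.is_a?(String)
  end
  [url, params, referer, headers]
end

args = get_args('http://ruby.about.com/', headers: { 'Accept-Language' => 'en-US' })
p args

# With a Mechanize agent (assumed to be created elsewhere), the array
# could then be splatted into the call: agent.get(*args)
```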

©2014 About.com. All rights reserved.