The Mechanize 2.0 Handbook

Mechanize is a Ruby library for automated interaction with web sites. It acts like a web browser with no user interface: it downloads web pages, follows links, fills out and submits forms, and stores cookies. Mechanize is useful for automated crawling, testing and scraping of web sites.

The Mechanize Agent
When you use a web browser, your browser is said to be the "user agent." In other words, it is a program that acts on your behalf. The Mechanize object is your Ruby program's "agent," and it's the first class you'll use when starting any Mechanize program.

The Page Class
The Page class is the class you'll be using most often. After any type of request is made, such as a GET or POST request from the Mechanize agent or "clicking" on a link, Mechanize will return a Page object.

The Link Class
The Link class is what you use to navigate from one page to another. The Link class itself is quite small because there just isn't much to a link. It doesn't have many methods, and the click method is the most used by far.

Forms in Mechanize
Manipulating and submitting forms is about as easy as you could imagine in Mechanize. As expected, the form objects (such as check boxes, buttons and text fields) have methods like "click" that mirror real-browser input.

Mechanize Error: "Unknown cookie jar file format"
The error "Unknown cookie jar file format" can be raised in one of two places: Mechanize::CookieJar#save_as and Mechanize::CookieJar#load.

Error: "Mechanize::RobotsDisallowedError: Robots access is disallowed for URL"
The "Mechanize::RobotsDisallowedError: Robots access is disallowed for URL" error is raised by Mechanize when robots.txt is honored and you have tried to fetch a page forbidden by robots.txt.

Mechanize Error: Mechanize::UnsupportedSchemeError
The error Mechanize::UnsupportedSchemeError is raised when Mechanize tries to fetch a page using an unsupported protocol scheme, such as ftp or gopher. Mechanize supports only the file, http and https schemes.

