Update Twitter From an RSS Feed

A Practical Ruby Script

This Ruby script for Twitter is what's often referred to as a "mashup." Mashups connect one protocol or website to another. This one takes information from an RSS feed and posts it as updates to Twitter. If you have a website or blog, you can run this script with your feed's URL as its argument every time you make a new post, and it'll automatically tweet about it.

The script makes use of two handy gems. The first is SimpleRSS, a dead-simple way to parse RSS feeds and grab info from them. You could use REXML, Hpricot or Nokogiri instead, but SimpleRSS was designed for this specific task. The second is ShortURL, an interface to URL shortening services like TinyURL and RubyURL.

Section 1 is the standard Twitter opening. Create a Twitter::Base object with your username and password.

Section 2 uses something called marshaling to load an object saved to the hard drive. Marshaling saves the entire state of an object into a string, which you can then write to the hard drive or store elsewhere. Later, you can unmarshal the string and the object will be made whole again. Think of it like cryogenic freezing for your Ruby objects.
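To make the freezing metaphor concrete, here is a minimal sketch of a round trip through Marshal that never touches the disk. The feed URL is a made-up placeholder, not one from the script.

```ruby
# Freeze a hash into a string, then thaw it back out.
# The feed URL below is a hypothetical placeholder.
feed_times = { 'http://example.com/feed.rss' => Time.at(0) }

frozen = Marshal.dump(feed_times)  # a binary String holding the object's state
thawed = Marshal.load(frozen)      # a new Hash rebuilt from that String

puts frozen.class                  # String
puts thawed == feed_times          # true: same contents
puts thawed.equal?(feed_times)     # false: a copy, not the same object
```

Because the frozen form is binary data, it is safest to read and write it in binary mode when it goes to a file.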

If the rss_times file exists in the current directory, the script unmarshals the object from it. Simply pass the contents of the file to Marshal.load to un-freeze the object. If the file doesn't exist, the script creates a new object.

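The object in question here is a hash. The keys are the URLs of the RSS feeds we're watching. The values are the publication times of the last article we posted from each feed. This lets the script remember where it left off, so it can pick up again the next time it runs. A feed the script hasn't seen before defaults to the start of 1970, so every item in it counts as new.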
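The load-or-create step and the hash's default value can be tried in isolation. The sketch below is one way to do it, assuming a throwaway temporary directory; the load_times helper and the example.com URLs are hypothetical names used only for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: load the saved hash, or start a fresh one whose
# default value is the Unix epoch so unseen feeds look brand new.
def load_times(path)
  if File.exist?(path)
    Marshal.load(File.binread(path))
  else
    Hash.new(Time.at(0))
  end
end

dir  = Dir.mktmpdir                      # scratch directory for the demo
path = File.join(dir, 'rss_times')
feed = 'http://example.com/feed.rss'     # placeholder feed URL

first_run  = load_times(path)            # no file yet, so a fresh hash
first_seen = first_run[feed]             # falls back to the epoch default
first_run[feed] = Time.at(1_000_000)     # pretend we posted an article
File.binwrite(path, Marshal.dump(first_run))

second_run = load_times(path)            # now comes from the file
puts second_run[feed] == Time.at(1_000_000)
puts second_run['http://example.com/other.rss'] == Time.at(0)

FileUtils.remove_entry(dir)              # clean up the scratch directory
```

The default value travels with the hash when it is marshaled, which is why the second lookup still falls back to 1970 after the reload.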

Section 3 uses SimpleRSS to parse an RSS stream from either a URL or a file, whichever you pass as the first command-line argument. Because the script requires the open-uri library, it can open an HTTP URL as easily as a local file. This is very handy for uses like this.

Section 4 iterates over all of the items found in the RSS feed in reverse, so the oldest entries come first. If an item's publication date (its pubDate) is later than the date stored in our hash for this feed, the script shortens the item's URL and prints the title and short URL to the screen.
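The filtering logic is easy to try without a real feed. This is only a sketch: Item is a hypothetical Struct standing in for the objects SimpleRSS returns, and the list is assumed to arrive newest first, which is what the reversal implies.

```ruby
# Item is a hypothetical stand-in for a parsed feed entry.
Item = Struct.new(:title, :link, :pubDate)

# Assumed feed order: newest entry first.
items = [
  Item.new('Third post',  'http://example.com/3', Time.utc(2009, 3, 3)),
  Item.new('Second post', 'http://example.com/2', Time.utc(2009, 3, 2)),
  Item.new('First post',  'http://example.com/1', Time.utc(2009, 3, 1))
]

# Pretend the first post was already tweeted on an earlier run.
last_posted = Time.utc(2009, 3, 1)

# Oldest first, keeping only items newer than the last one posted.
fresh = items.reverse.select { |item| item.pubDate > last_posted }

fresh.each { |item| puts "#{item.title} #{item.link}" }
```

Only the second and third posts are printed, in that order, because the comparison is strictly greater than the stored time.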

Section 5 posts the item to Twitter and updates the hash with the item's publication time. It then marshals the hash back to the hard drive by opening rss_times for writing and using the Marshal.dump method to freeze the hash. Saving after every post means an interrupted run won't repeat tweets. Finally, to be nice to Twitter and your followers, the script sleeps for 5 minutes.
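The effect of saving after every post can be seen by running the loop twice against the same saved state. In this rough sketch an array stands in for Twitter, the sleep is left out, and Entry, run_once and the example.com URL are hypothetical names, not part of the script.

```ruby
require 'tmpdir'
require 'fileutils'

# Entry is a hypothetical stand-in for a parsed feed item.
Entry = Struct.new(:title, :pubDate)

# One pass over the feed. "Posting" just appends to the posted array,
# which stands in for the call to twitter.update.
def run_once(items, feed, path, posted)
  times = File.exist?(path) ? Marshal.load(File.binread(path)) : Hash.new(Time.at(0))
  items.reverse.each do |item|
    next unless item.pubDate > times[feed]
    posted << item.title
    times[feed] = item.pubDate
    File.binwrite(path, Marshal.dump(times))  # save after every post
  end
end

dir   = Dir.mktmpdir
path  = File.join(dir, 'rss_times')
feed  = 'http://example.com/feed.rss'         # placeholder feed URL
items = [
  Entry.new('Newer', Time.utc(2009, 3, 2)),
  Entry.new('Older', Time.utc(2009, 3, 1))
]

first  = []
second = []
run_once(items, feed, path, first)
run_once(items, feed, path, second)

p first    # ["Older", "Newer"]
p second   # []

FileUtils.remove_entry(dir)                   # clean up the scratch directory
```

The second pass posts nothing because the saved time already matches the newest item.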

 #!/usr/bin/env ruby
 # Monitor an RSS feed, and post its
 # articles to twitter.
 require 'rubygems'
 require 'twitter'
 require 'simple-rss'
 require 'shorturl'
 require 'open-uri'
 
 ### Section 1
 username = 'your_username'
 password = 'your_password'
 twitter = Twitter::Base.new( username, password )
 
 ### Section 2
 # If the rss_times file exists, unmarshal
 # the rss_times hash, else create a new
 # one.
 rss_times = if File.exist?('rss_times')
   Marshal.load( File.binread('rss_times') )
 else
   Hash.new( Time.mktime('1970') )
 end
 
 ### Section 3
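 # Fetch and parse the RSS feed named on
 # the command line. URI.open comes from
 # open-uri and accepts URLs or file paths;
 # plain open stopped handling URLs in Ruby 3.0.
 rss = SimpleRSS.parse( URI.open(ARGV[0]) )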
 
 ### Section 4
 # Walk the items oldest first
 rss.items.reverse.each do |i|
   if i.pubDate > rss_times[ARGV[0]]
     link = ShortURL.shorten(i.link)
     text = "#{i.title} #{link}"
 
     puts text
     puts "=" * 50
 
     ### Section 5
     # Post the item, then save its time
     # and marshal the hash back to storage
 
     twitter.update text
 
     rss_times[ARGV[0]] = i.pubDate
     File.open( 'rss_times', 'wb' ) do|f|
       f.write Marshal.dump(rss_times)
     end
 
     sleep 300
   end
 end
 

©2014 About.com. All rights reserved.