Archive for June, 2008

Keep your site as fresh as your code

Friday, June 27th, 2008

I am actively working on PoolParty, but the website doesn’t necessarily show that the project is in active development.

Luckily, it is hosted at github and they have an API to access their changesets. Now, on PoolParty’s website, I am surfacing the latest changeset’s time, message and author as well as a link to the changeset.

How?

Code (shortened for brevity purposes):

$:.unshift(File.dirname(__FILE__))
require "rubygems"
require "yaml"
require "open-uri"
 
class Grab
  @url = "http://github.com/api/v1/yaml/auser/pool-party/commits/master"
  @refresh_after = 20
 
  def self.latest_commit_object
    o = yaml["commits"][0]
    {
      :message => o["message"].gsub(/\n/, "<br />"),
      :date => Time.parse(o["committed_date"]).timeago,
      :by => o["committer"]["name"],
      :url => o["url"]
    }
  end
end

The github api is slick. As you can see from above, the format for the api url looks like

http://github.com/api/version/format/username/repository/type/object

Where the variables there correspond to:

  • Current version: v1
  • Acceptable formats: json, xml, yaml
  • Acceptable types: commits, commit

Grabbing the latest changesets from the url as YAML format, we can quickly and easily parse the message into a useful hash.

I personally like the “# hours ago” syntax that 2.0 sites are popping up with everywhere, so I just added a timeago method that is very similar to the one included with rails. The source is attached.

Since I don’t want poolpartyrb.com’s load time to be dependent upon how fast it can hit github, I cache the variable so every 20 views it refreshes.

Use in Rails

Save both the files (attached below) to your lib directory.

Add this to your environment.rb

require 'grab'

Then add this helper to your application_helper.rb

  def show_latest_commit
    @latest_commit = Grab.latest_commit_object
 
    <<-EOS
      <div class='note'><strong>Latest commit</strong><br />
      <a href="#{@latest_commit[:url]}">#{@latest_commit[:message]}</a>
      <div class="highlight">by #{@latest_commit[:by]}</div>
      #{@latest_commit[:date]}
    EOS
  end

How to use in sinatra

I use this on poolpartyrb.com, which is a sinatra site. I do basically the same as above, but generally render the code in haml instead

  $:.unshift(File.dirname(__FILE__))
  require "sinatra"
  require 'grab'
 
  get '/' do
    @latest_commit = Grab.latest_commit_object
    haml :home, :layout => :layout
  end

Then in my home.haml

  .note
    %strong
      Latest commit
    %br
    ==<a href="#{@latest_commit[:url]}">
    = @latest_commit[:message]
    </a>
    .highlight
      == by #{@latest_commit[:by]}
    = @latest_commit[:date]

Get the code here

edit: updated location of code

Announcing… PoolParty! An Open Source tool for managing EC2 clusters

Thursday, June 5th, 2008

Is your EC2 server pool starting to look like this?

PoolParty can help you turn it into this:

Why would I need PoolParty?

Amazon’s Elastic Compute Cloud has made it cheap and straightforward to run your own server pool, but there are various issues with volatility, lack of load balancing between instances, no persistent instance storage (currently in private beta). Also there is no mechanism for scaling your instances dynamically (ie, adding or removing instances as your load changes).

That is all taken care of by PoolParty, a recently open sourced Ruby gem and the brainchild of fellow CitrusByte developer Ari Lerner. PoolParty takes care of the following:

  • Automatic scaling based on demand and load
  • Starting/stopping instances
  • Self-healing
  • Provisioning and bootstrapping initial software
  • Setting up S3Fuse
  • Load balancing with HAProxy
  • Built-in monitoring
  • Plug-in architecture for extendability / customization

How?

PoolParty is based on established open source technologies that have been widely used in production. It takes your EC2 server pool, and invites S3Fuse, monit, and members of the High Availability Linux Project . Now you don’t just have a pool. You have a rocking pool party. All you have to do is sign those invites! (In other words: just provide a config file and you’re done).

More Information

You can check it out at the official PoolParty website. Tutorials, walk-throughs and articles are coming soon.

If you are interested in active development, check out the source at github. Active discussion is on the PoolParty Google Group

Here are the slides from the talk Ari gave at RailsConf sessions (also available here):

Read this doc on Scribd: pool party presentation

Ruby TMTOWTDI, Episode 1

Monday, June 2nd, 2008

This is the inaugural post in a series that we’ll be calling Ruby TMTOWTDI (There’s more than one way to do it). Usually the TMTOWTDI acronym is used as a disparaging term towards flexible languages, which offer a myriad ways of solving most given problems. This series, however, aims to use this property of Ruby as an educational tool.

The format is as follows: we’ll present a problem, then everyone will try and write the most idiomatic, concise code that solves the problem in a readable fashion. This is not meant to be a golfing competition, nor an optimization one. However, we will benchmark the solutions for a rough idea on how different styles affect performance. For every installment we’ll have a performance winner for fastest time and a ’style’ winner for the most elegant code.

This gives us at CitrusByte a better understanding of each other’s coding styles, and stimulates discussions about little tricks we may not be aware of. Best of all, it’s fun as heck to be a part of.

You can get involved too! Leave a comment with your solution if you think you can best the ones listed here. We also encourage you to propose a simple problem (less than 20 lines of code to solve) for the next instalment in the series.

Since this is the first time we’re doing it, we’ve picked a very simple problem to solve as a quick warm-up.

Problem: given a string ’str’ of the format

"a10 b20 c25 d40"

convert it to a hash

{'a' => 10, 'b' => 20, 'c' => 25, 'd' => 40 }

Note that the values of this hash are integers, not strings. If the string is

"a10 a20"

the output should be

{'a' => 20}

.

Leaving some spoiler space in case you want to try it on your own. Below are selected solutions from our team.


Spoiler space


Solutions

Michel

Hash[*str.scan(/(.)(\d{2})/).map { |k, v| [k, v.to_i] }.flatten]

Time taken: 9.99s

Tim

Hash[*(str.scan(/(\w)(\d+)/).map {|k, v| [k, v.to_i] }.flatten)]

Time taken: 9.40s

Tim’s comments: the splat operator has low precedence so the outermost set of
parantheses are redundant, but I felt it’s better to have them there to make it
very obvious what’s going on.

Nicolas (winner: speed award)

str.scan(/\s*(\D+)(\d+)/).inject({}) {|h,(k,v)| h[k] = v.to_i; h }

Time taken: 7.19s

Nicolas’s comments: I like inject/reduce/fold, whatever you call it. It’s
concise and usually a perfect fit for constructing enumerable data structures.

Martin (winner: style award)

str.split.inject({}) {|h,i| h[i[/^\D+/]] = i[/\d+$/].to_i; h }

Time taken: 8.75s

Ari

hash = str.split.inject({}) do |set, h|
  set.merge( {h[0].chr => h[1..-1].to_i} )
end

Time taken: 23.02s

Ari’s comments: I knew everyone else would be strutting their one-liners, so I went for as much clarity as possible. I feel this code not only solves the problem, but also makes it very apparent what the desired solution is.

Benchmark code

require 'benchmark'
 
ITERATION = 400
 
def measure(&amp;block)
  ITERATION.times { block.call }
end
 
str = ''
5000.times { str &lt;&lt; ('a'..'z').to_a[rand(26)] + (rand(90)+10).to_s + ' ' }
br = Benchmark.bmbm do |b|
  b.report("Running #{ITERATION} times on string of length #{str.length}") do
    measure {
      #### Code to measure goes here
    }
  end
end

Conclusion

Lots of interesting styles here. We have two ways of creating hashes: with the Hash[ [key =>|, value]* method and with inject().

We’ve also seen various ways of breaking up a string: with scan(), with split(), and with regex arguments to a string’s [] method. While we cannot come to definite conclusions about performance (since the code that is being compared doesn’t differ just in one area), we can see a couple things going on:

  • running Hash[] on an array as opposed to creating it directly via inject hurt Michel and Tim’s performances compared to Martin/Nicolas
  • merge() seems to be significantly more costly than straight addition/replacement of hash keys. If you’re ever injecting into a hash it may be wiser (if your problem allows) to follow the { hash[key] = value; hash } format instead of merging.

We’ve picked Martin as the style winner for the elegance of his code which fit of the string parsing inside the inject block as opposed to the scan-based solutions which did some of it outside in the scan. The h[i[/^\D+/]] = i[/\d+$/].to_i portion was the clearest signal of intent for the contents of the hash.

Meanwhile, Nicolas is the performance winner by a mile, coming in more than 20% faster than the nearest competitor.

Congratulations to Martin (style) and Nicolas (speed) for being the winners of the inaugural CitrusByte Ruby TMTOWTDI. Stay tuned for the next episode!

EDIT

Additional comments by Nicolas:

Anyway, after thinking about it, even if I like inject a lot, there’s one big gain for the Hash[*...] as Michel posted. It
transmits intent better, in a way.

You look at that and the entire code sequence is a Hash construction, plain and simple, even if you ignore the subexpression in the middle, you know the outcome. And if you look one character into the brackets, then you know it’s constructing a hash from some transformation of an existing enumerable. Bah, just something that responds to #to_a.

Instead, using inject, the code flows (at least in my head) nicer as you don’t have to parse subexpressions, but it’s not immediately obvious what’s it about.

Hm, actually, depends on what intent is better to show. I mean, the objective is something like “to transform a string in a certain format into a more traversable structure” or “to create a hash out of a string in a certain format?”

That (not so) subtle difference might make me decide about which way is better, but then the description of the problem is not descriptive enough. Good, that means that the problem lies in the exercise’s designer. I can sleep well :P

(Probably there’s a third approach that to a degree involves a recursive lambda and someone flying over a desktop yelling “stand back! I know hash construction”, but my ninjitsu is not good enough for that yet)