Archive for the ‘Ruby’ Category

For heaven’s sake, please just install hoe!

Thursday, February 21st, 2008

This is why people get annoyed about the silly gem install dependency mess (esp WRT hoe):

$ gem install heckle
Need to update 31 gems from   # fair enough, you're allowed
Install required dependency ruby2ruby? [Yn]  Y    # Yeah, I know you need these
Install required dependency ParseTree? [Yn]  Y     # oh, and this one too
Select which gem to install for your platform (i686-linux)
 1. ParseTree 2.1.1 (ruby)                                   # yeah, just a vanilla ruby here
 2. ParseTree 2.1.1 (i386-mswin32)
 3. ParseTree 2.1.0 (ruby)
 4. ParseTree 2.0.2 (ruby)
 5. Skip this gem
 6. Cancel installation
> 1
Install required dependency RubyInline? [Yn]  Y   # This is also needed, I gather 
Install required dependency hoe? [Yn]  Y             # Ha, hoe again, OK
Install required dependency rubyforge? [Yn] Y     # Don't care, don't understand why both
Install required dependency rake? [Yn]  Y            # There's no rake on this box, really?
Install required dependency hoe? [Yn]  Y             # WTF, yeah, I just said that
Install required dependency hoe? [Yn]  Y             # .. now you're just fucking with me
Install required dependency ZenTest? [Yn] Y        # Huh? I like ZenTest, but there's no reason...
Install required dependency hoe? [Yn]  Y             # **** YOU, hoe!
Successfully installed heckle-1.4.1
Successfully installed ruby2ruby-1.1.8

The fourth time it asked me, it decided to trust me and actually started installing the gems….

ISBN10 to ISBN13

Thursday, January 24th, 2008

As of the beginning of 2007, ISBN10 is dead. Now we’re in a world that allows “979” prefixes, though the following code doesn’t expect them yet…

Here’s some stuff to turn your 10-digit ISBNs into 13-digit ISBNs, naively assuming “978”, following the API post doing the same from LibraryThing. There’s another one for humans.

Code from O’Reilly’s internal stuff:

module Isbn
  def self.dash_isbn(isbn)
    raise ArgumentError.new("ISBN argument must be string") unless isbn.is_a?(String)
    if isbn.length == 10
      return isbn[/^./] + "-" + isbn[1..3] + "-" + isbn[4..8] + "-" + isbn[/.$/]
    elsif isbn.length == 13
      return isbn[0..2] + "-" + isbn[3].chr + "-" + isbn[4..6] + "-" + isbn[7..11] + "-" + isbn[/.$/]
    else
      raise ArgumentError.new("ISBN must be 10 or 13 characters")
    end
  end

  def self.isbn10toisbn13(isbn)
    raise ArgumentError.new("ISBN argument must be string") unless isbn.is_a?(String)
    raise ArgumentError.new("ISBN must be of length 10") unless isbn.length == 10
    prefix = "978"
    isbn12 = prefix + isbn[0...-1]   # drop the old ISBN-10 check digit
    return isbn12 + check_digit_13(isbn12).to_s
  end

  def self.check_digit_13(isbn_12)
    # need to subtract remainder from 10
    # and do exemption for zero LMS 08.29.2006
    raise ArgumentError.new("ISBN must be of length 12") unless isbn_12.length == 12
    sum = 0
    odds = 0
    evens = 0
    isbn_12.scan(/\d/).each_with_index {|d, i|
      if (i % 2) == 0
        evens = evens + (d.to_i * 1)
      else
        odds = odds + (d.to_i * 3)
      end
    }
    sum = evens + odds
    digit = sum % 10
    if digit == 0
      return 0
    else
      return 10 - digit
    end
  end
end # of module Isbn
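As a worked check of the arithmetic: the ISBN-13 check digit weights the twelve digits alternately by 1 and 3, sums them, and subtracts the last digit of that sum from 10 (a remainder of zero stays zero). Here’s a compact sketch of the same calculation. `isbn13_check_digit` is a made-up standalone helper for illustration, not part of the module above.

```ruby
# Standalone sketch of the ISBN-13 check digit (hypothetical helper name).
def isbn13_check_digit(isbn12)
  raise ArgumentError, "need exactly 12 digits" unless isbn12 =~ /\A\d{12}\z/
  sum = 0
  isbn12.each_char.with_index do |char, i|
    weight = i.even? ? 1 : 3   # positions alternate 1, 3, 1, 3, ...
    sum += char.to_i * weight
  end
  (10 - sum % 10) % 10         # a remainder of 0 gives a check digit of 0
end

# "059610123X" becomes "978" + "059610123" + the new check digit
puts "978059610123" + isbn13_check_digit("978059610123").to_s
```

For 978059610123 the weighted sum is 30 + 63 = 93, so the check digit is 10 - 3 = 7, which matches the 9780596101237 expected in the tests.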

The last test method relies on my database, which you won’t have. Replace it with something else you trust or drop it.

#!/usr/bin/env ruby

require 'test/unit'
require 'pdb'
require 'isbn'

class IsbnTest < Test::Unit::TestCase
  def setup
    @isbn10 = "059610123X"
    @isbn13 = "978059610123X"
  end

  def test_dash_isbn
    # must be a String
    assert_raise ArgumentError do Isbn.dash_isbn(123) end
    # must be 10 or 13 characters
    assert_raise ArgumentError do Isbn.dash_isbn("123") end
    assert_equal("0-596-10123-X", Isbn.dash_isbn(@isbn10))
    assert_equal("978-0-596-10123-X", Isbn.dash_isbn(@isbn13))
  end

  def test_isbn10toisbn13
    # must be a String
    assert_raise ArgumentError do Isbn.isbn10toisbn13(123) end
    # must be 10 characters
    assert_raise ArgumentError do Isbn.isbn10toisbn13(@isbn13) end
    assert_equal("9780596101237", Isbn.isbn10toisbn13(@isbn10))
  end

  def test_check_digit_13
    # must be 12 characters
    assert_raise ArgumentError do Isbn.check_digit_13(@isbn13) end
    assert_equal(7, Isbn.check_digit_13(@isbn13[0..-2]))
    assert_equal(0, Isbn.check_digit_13("123456789018"))
  end

  def test_isbn13
    ["0596008627", "1565929470", "0596002734", "0596004001", "0596527357",
     "0596101635", "0596005059", "0596527063", "1565926374", "0596526946",
     "1565926420", "0596101805", "059652742X", "156592455X", "0596006446",
     "0596008473", "0596009607", "0596100582", "0596100493", "0596004427",
     "1565925890", "1565927141", "059651610X"].each {|isbn|
      puts "Testing #{isbn}"
      # The expected value came from my database (pdb) and is not shown here.
      # assert_equal(expected_isbn13, Isbn.isbn10toisbn13(isbn), "Bad checksum!")
    }
  end
end

RubyConf 2007 Second Day Afternoon

Saturday, November 3rd, 2007

Ed Borasky: Profiling and Tuning Ruby 1.8

Slides are available here. Cougar is the name of the project that this is coming from?

Is Ruby 1.8 Slow?

To benchmark: Collect a set of benchmark times, then normalize them, then compute the geometric mean of the ratios.

Alioth is a popular (if controversial) set of benchmarks. Using gcc as the standard, Java is about 3 times slower, Python is about 10 times slower, and Ruby is 19 times slower (slower than Python, PHP, and Perl).

“Hemibel thinking”: a hemibel is half of an “order of magnitude” [factor of 10]. Look for hemibel improvements or differences. Any greater accuracy doesn’t really help. Hemibel ratios: log10(a/b) * 2. Now we can rethink Java as ~1, Perl ~2, Ruby ~2.6 (in terms of hemibels). Now we can say “Ruby is sort of slow.”
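To make the arithmetic concrete, here’s a small sketch of both calculations from the talk: the geometric mean of normalized ratios and the hemibel conversion. The helper names are my own, and the slowdown factors are just the rough ones quoted above.

```ruby
# Hemibels: 2 * log10(ratio), so one hemibel is a factor of about 3.16.
def hemibels(ratio)
  2 * Math.log10(ratio)
end

# Geometric mean of a list of normalized benchmark ratios.
def geometric_mean(ratios)
  log_sum = ratios.inject(0.0) { |sum, r| sum + Math.log(r) }
  Math.exp(log_sum / ratios.size)
end

# Slowdown factors relative to gcc, as quoted in the talk.
{ "Java" => 3, "Python" => 10, "Ruby" => 19 }.each do |language, slowdown|
  puts "#{language}: #{hemibels(slowdown).round(1)} hemibels"
end
```

The geometric mean is the right average here because the inputs are ratios: a benchmark that is 2x slower and one that is 8x slower average out to 4x, not 5x.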

But, of course, reducing the speed of a language to a single number is not really that interesting. Instead, we need to find the variation of the benchmarks. Cue Box and Whisker Plots (they show a lot of data in a small amount of space). Now yarv looks really nice, with a really limited variation over the different benchmarks (python and ruby do well, perl and php do poorly). Each language has big (upper) outliers. Ruby’s worst outlier is “spectralnorm 500” (also bad for yarv).

OK, it’s slow, now what?

What to do: Throw hardware? Wait for 1.9/JRuby/IronRuby/Rubinius? Tune for 1.8?

Let’s try tuning 1.8 against the 1.9 slightly-modified benchmark suite. First, check how much you can get out of GCC optimizations. There’s something to be gained there, but it’s a pretty small difference (1/2 a hemibel). Next, check out gprof. gprof shows the top three most expensive methods were rb_eval, rb_call0, and rb_call (for slightly more than 50% of the time). Another GNU tool is gcov (for coverage and profiling), and it can provide some more help.

Phil Hagelberg: Tightening the Feedback Loop

How can you become a more effective programmer? First, read The Pragmatic Programmer. Next, test more (duh) and use Test::Unit or RSpec (don’t waste your time doing stuff by hand that the computer could automate). autotest takes away a bit of the discipline of good testing (it tells “you what you need to know when you need to know it whether or not you know you need it or not”). But, autotest still requires a lot of context switching between terminals. Consider a solution like Flymake for emacs (which does in-editor syntax checking).

Testing habits help show the general feedback principles nicely. Many other parts of software development could benefit from the same introspection, measuring, and automation. What else should we measure?

  • Accuracy (syntax highlighting and automatic indentation): modern editors
  • Meta-Accuracy (especially when starting out, are your accuracy checks actually accurate?): heckle, rcov
  • Maintainability (hard to measure, but you can measure complexity): flog
  • Performance: write your own automated test for what you care about

Now that you’re watching various things, track them over time so that you can chart personal/project progress.
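A minimal sketch of what “write your own automated test” for performance might look like, using Ruby’s standard Benchmark library. The helper name, the budget, and the in-memory history are all assumptions for illustration; a real setup would persist the numbers somewhere so they can be charted.

```ruby
require 'benchmark'

# Hypothetical helper: time a block and fail loudly if it blows its budget.
def assert_within_budget(label, budget_seconds)
  elapsed = Benchmark.realtime { yield }
  if elapsed > budget_seconds
    raise "#{label} took #{elapsed}s (budget #{budget_seconds}s)"
  end
  elapsed
end

# Keep every measurement so that it can be charted later.
history = []
elapsed = assert_within_budget("sort 10k integers", 1.0) do
  (1..10_000).to_a.reverse.sort
end
history << [Time.now, elapsed]
puts "recorded #{history.size} measurement(s)"
```

Timing assertions are noisy, so the budget should be generous; the trend in the recorded history is usually more useful than any single pass or fail.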

Eric Hodel: Maximizing Productivity

(How to find time to contribute to more projects)

Fun is the key to high productivity. It’s easier, of course, to have more fun on stuff you want to work on rather than something that someone else wants you to work on. YAGNI & KISS help break projects into tiny, implementable pieces. Doing a lot of tests up front will help other people write patches for you and will help you not make mistakes when adding new features. [More pimping for heckle and autotest.]

Document late in the game to capture the latest changes. README should provide a quick-start, links to the rest of the documentation, a bug tracker link, and a synopsis and feature overview.

Finally, before releasing anything have a partner review your code.

RubyConf 2007 Second Day Morning

Saturday, November 3rd, 2007

John Lam: State of IronRuby

John Lam

Photo by dwortlehock

Who works on IronRuby? The core is: John Lam, Haibo Luo, Tomas Matousek, & John Messerly.

Why did John move from Toronto to Seattle to start working at Microsoft on IronRuby? He was in love with RubyCLR and couldn’t turn down the opportunity to work on a “real” implementation.


John wants IronRuby to be a Ruby implementation, but he’s also interested in changing Microsoft’s approach to opensource (“change or die”), especially the “either or approach” to thinking about opensource. However, it’s really hard to change a company that’s doing pretty well. Unfortunately, after making some announcements about their intentions, the blogosphere decided that IronRuby was going to be the work of the devil. Today, the project is hosted on Rubyforge, has open SVN access, and has already received code contributions from the outside community.

Support Rails: it’s “the testbed” (despite speculation about MS not wanting to threaten ASP.NET).

Run everywhere .NET runs (Mac, Linux, Windows). [John is doing the presentation on a Mac and does an irb demo (compiled on Windows) with CLR types (with nice integration between the CLR type’s methods and pure Ruby methods) on Mono.]

Under the covers

[John runs the Rubinius spec suite in IronRuby in Windows on his Mac. A lot of things fail (373 failures of 1030 examples). He thanks the Rubinius team for all of their work.]

[John goes on to show a bunch of quick IronRuby demos, including some in-browser demos using IronRuby in Silverlight.] There’ll be more examples up later on his blog.

If you’re interested in making a language on .NET, please come to the Lang.NET Symposium.

Q & A

Release schedule? Unlike “real projects”, IronRuby has a “conference-driven schedule”. They want to do a push to have Rails working in IronRuby by RailsConf 2008.

Can the IronRuby developers look at other implementations’ source code? They have to be “extremely clean” on the DLR, but “IronRuby does not have the same restrictions” because it doesn’t ship with the OS. Explicitly, they can look at tests and test frameworks.

Charlie Nutter & Tom Enebo: JRuby: Ruby for the JVM

JRuby logo

Everyone in the audience knows already what JRuby is, so they won’t cover that. It’s a Java implementation of the Ruby 1.8 language. It’s available under a few opensource licenses, blah blah blah. They released 1.0.2 a couple of days ago (and 1.1b1 last night during Matz’ questions).

[Tim Bray fairly incomprehensibly announces that Sun has done a deal with University of Tokyo to work with ? on ? and give them “a bunch of money”. Another implementation?]


Installation: 1. unpack binary, 2. set PATH. Dependencies: Java 5+. jruby.jar contains the full runtime.

The lexer is a hand-written lexer ported from MRI. The parser is another port of the MRI YACC/Bison parser using Jay (Bison for Java). This parser is the key for the recent boom in IDE support for Ruby (like NetBeans). [Charlie does a little NetBeans demo showing auto-complete and variable renaming/highlighting, unused variable detection.]

The Core classes are all written in Java, and nearly all have a 1:1 correspondence (String is RubyString, Array is RubyArray, etc).

The interpreter is a “simple switch-based AST walker” that recurses for nesting. Code starts out interpreted but command-line scripts compile immediately. “eval’d code always interpreted (for now).”
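To picture what a switch-based AST walker that recurses for nesting means, here’s a toy version in Ruby. The node types and the array representation are invented for this sketch; they have nothing to do with JRuby’s actual node classes.

```ruby
# Toy AST walker: nodes are arrays such as [:add, left, right].
# The node types here are made up for illustration; they are not JRuby's.
def walk(node, env = {})
  case node.first
  when :lit then node[1]
  when :var then env.fetch(node[1])
  when :add then walk(node[1], env) + walk(node[2], env)
  when :mul then walk(node[1], env) * walk(node[2], env)
  when :if  then walk(node[1], env) ? walk(node[2], env) : walk(node[3], env)
  else raise ArgumentError, "unknown node type: #{node.first.inspect}"
  end
end

# (x + 2) * 10
ast = [:mul, [:add, [:var, :x], [:lit, 2]], [:lit, 10]]
puts walk(ast, { :x => 5 })
```

Every nested expression is one more recursive call through the `case`, which is roughly the overhead that compiling to bytecode is meant to remove.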

JRuby 1.1 brings full bytecode compilation. “The compiler is basically done. This is the only complete 1.8 compiler in existence.” [Charlie shows a fib benchmark. Compiling makes it quite a bit bigger in filesize (oh well). There’s a _huge_ improvement when using java -server.] Typically, you want to use AOT mode, which avoids “JIT warmup time”. It may also use less memory in the future and perhaps start faster?

Charlie and Tom

Photo by dwortlehock


Performance optimizations are progressing in a few directions. The first is the obvious one of compilation. For real numbers, see this blog post. Compilation caches literals, uses Java local variables when possible, and uses “monomorphic inline method cache.” ObjectSpace is a more controversial optimization. ObjectSpace allows you to iterate all Objects in a system. This isn’t painful in MRI, but is not easy in the fully-concurrent JRuby (“it sucks 2-5 times as much”). Charlie asserts that it is almost never used (except for Test::Unit). It’s turned off by default but can be turned on with a flag. [Some in the audience believe that the reverse should be used.] Custom implementations of Array, String, and Hash (among others) have also boosted performance.
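For anyone who hasn’t used it, this is the kind of thing ObjectSpace makes possible. It’s a small sketch run on MRI; `Widget` is a throwaway class for the example.

```ruby
# Count live instances of a class by walking every object in the process.
class Widget; end

widgets = Array.new(3) { Widget.new }   # hold references so they are not collected

count = ObjectSpace.each_object(Widget).count
puts "#{count} Widget instances alive"
```

Supporting this means the runtime has to be able to find every live object on demand, which is why it is expensive for an implementation sitting on top of a concurrent JVM.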

Threading, Extensions, & Integration

JRuby supports only native OS threads but they’re parallel. They also emulate unsafe green thread operations (Thread#kill, Thread#raise, etc).

Ruby native (C) extensions are not supported. Some libraries may be accessible with JNA. If you’re looking for a good binding, consider a Java equivalent to the C. Most of the extensions that Rails uses have already been ported to Java/JRuby.

You can call Ruby from Java, you can call Java from Ruby. A popular integration is building GUIs with Swing. Ruby makes “Swing development fun” and is much less verbose than the Java version. Swing integration also brings a cross-platform GUI solution to Ruby (essentially for the first time). A direct approach:

$ jirb
>> include Java
=> Object
>> import javax.swing.JFrame
>> frame = JFrame.new("Hello")
=> nil
# there is a frame on the screen
>> frame.set_size(500, 500)
=> nil 

Profligacy adds a new layout language to simplify GUI creation. MonkeyBars takes a GUI editor approach.


Rails works with JRuby. It uses a JDBC connector to ActiveRecord. It can generate .war files (via Warble). ActiveHibernate is coming soon, and provides a different persistence API. Ruvlets brings Servlets to JRuby.


You can use Test::Unit or RSpec for Java code. “It’s so much less code” than writing tests for Java directly.


  • JRuby is ready
  • JRuby is more than just an implementation
  • JRuby needs your help

Evan Phoenix: Rubinius 1.0

Rubinius logo

Rubinius, like IronRuby and JRuby, is aiming to bring “total world domination …. for Ruby!” It’s a Smalltalk-inspired VM:

class Rubinius < Smalltalk
  # form
  include Ruby::Syntax
  # function
  include Ruby::Behavior
  include "crazy cs papers"
end

Rubinius debuted last year at RubyConf 2006. There’s been “enormous progress” over the course of the year.

A comparison of implementation languages (slightly misleading):

84,516 lines of C
     0 lines of Ruby

128,786 lines of C
     50 lines of Ruby

 48,282 lines of C#
      0 lines of Ruby

113,508 lines of Java
  1,000 lines of Ruby

 25,398 lines of C
 13,946 lines of Ruby 

Evan Phoenix

Photo by dwortlehock

Some “Junior High Analysis”:

  • 1.8 & 1.9 is Ruby (core) for C programmers
  • JRuby is Ruby for Java programmers
  • IronRuby is Ruby for C# programmers
  • Rubinius is Ruby for Ruby programmers

Rubinius finally allows Ruby programmers to “eat yummy dogfood.” It’s targeted at fellow Ruby programmers, not folks on the street. It also provides a much tighter feedback loop for language improvements & clarification.

Following on that, they have 57 committers, with 17 of those with 20 or more commits (36 with 100+ lines changed). Commit bits have been “free flowing” and easy to get (if you submit something as a patch, you get commit rights).

Props to EngineYard for funding Evan’s work on Rubinius.

Evan is shooting to have 1.0 out by ? (was: RubyConf 2007). Things change. Mistakes were made. They’re evolving.

Q & A

Given what Ola Bini and Avi Bryant have said, should Rubinius replace MRI eventually? The chicken didn’t decide to replace the dinosaur [laughs], the environment did. The ultimate goal (from the start) wasn’t “replace MRI”, it was “total world domination”.

Are you more interested in performance than conformance? Rubinius has some compiler flags to turn certain optimizations on and off.

How small do you think you can get the C? They’ll follow Squeak in that they’ll write a C generator. The final goal is hand-maintained lines of C==0.

Are you worried about incompatibilities? We’re all worried about that. We wouldn’t check in something that intentionally broke compatibility.

What are these specs you’re writing? Do you pass them? They’re as implementation-agnostic as possible and written for MiniRSpec (intended to be syntax-compatible with RSpec). Compatibility is all centered on emulating MRI. We used to fail ~1100 specs a few weeks ago, today we’re closer to ~500. The improvement is all due to community involvement.

Have you made any enhancements? Yeah, we did some prettier backtraces and capturing exceptions from C extension segfaults, for example.

Is Rubinius challenged by ObjectSpace, Continuations, & Selector Namespaces? They have a problem with ObjectSpace like JRuby, continuations are “our shiny awesomeness” (thanks to Smalltalk’s spaghetti stack) [shows continuation demo], and (no comment on SN).

RubyConf 2007 First Day Afternoon

Friday, November 2nd, 2007

Nathaniel Talbott: Why Camping Matters

“I don’t know about you, but I am totally psyched about this conference!” Nathaniel has spoken at every RubyConf.

 bacon, egg & cheese biscuit, west egg cafe

Photo by

Every talk needs a metaphor, and this talk’s will be the bacon, egg, and cheese biscuit.


The bacon is the connection to the creator, and to chunky bacon. It’s a 4k micro framework created by _why.

A whole Camping app goes in a single file and is a typical MVC.

Camping.goes :Blog

module Blog::Controllers
  class Index < R '/' # route to /
    def get
      "Hello Rubyconf!"
    end
  end
end

$ camping blog.rb
# blah blah, up on localhost:3301

Hooray, Hello World! Let’s start doing more MVC:

Camping.goes :Blog

module Blog::Controllers
  class Index < R '/' # route to /
    def get
      render :index
    end
  end

  class Add < R '/add'
    def get
      render :add
    end
  end
end

module Blog::Views
  def index
    # this is markaby
    a "add post", :href => R(Add)
  end

  def add
    form :method => :post do
      fieldset do
        label "Title: "
        input :name => :title; br
        label "Body"
        textarea :name => :body; br
      end
    end
  end
end

Camping is nice when making a sketch of the app, and eliminates all the unnecessary crap. To that extent, it’s wonderful for rapid prototyping. We’ve already got most of a Blog skeleton. Now, time for M of MVC.

Camping.goes :Blog

module Blog::Controllers
  class Index < R '/' # route to /
    def get
      render :index
    end
  end

  class Add < R '/add'
    def get
      render :add
    end

    def post
      redirect Index
    end
  end
end

module Blog::Views
  def index
    # this is markaby
    a "add post", :href => R(Add)
  end

  def add
    form :method => :post do
      fieldset do
        label "Title: "
        input :name => :title; br
        label "Body"
        textarea :name => :body; br
      end
    end
  end
end

# This is ActiveRecord under the covers, that's _not_ part of the 4k
module Blog::Models
  class Post < Base; end

  # Migrations go inline
  class CreatePost < V 1
    def self.up
      create_table :blog_posts do |t|
        t.column :title, :string
        t.column :body, :text
      end
    end
  end
end

def Blog.create
  Blog::Models.create_schema   # runs the inline migrations (the usual Camping idiom)
end

Cool, we’ve got something working. It’d be nice if we knew what was there…

module Blog::Views
  def index
    @posts.each do |e|
      p e.body
    end

    # this is markaby
    a "add post", :href => R(Add)
  end
end

# in Blog::Controllers
class View < R '/view/(\d+)'
  def get(id)
    @post = Post.find(id)
    render :view
  end
end

[As you can see, this all happened very fast, as we’re only 17 minutes in so far.] Nathaniel adds comments in the next 4 minutes.


So, that’s really most of Camping. To keep the metaphor going, we need to break some eggs (over Rails). Some differences between Rails & Camping:

  • Convention over configuration vs. minimalism (nothing to configure)
  • Opinionated vs. wackiness (“Camping isn’t so much opinionated as it is… strange.”)
  • Boring vs. different
  • Just like always, but better vs. “More flexible than a wet noodle”
  • Lots and lots of helpers vs. diy
  • Rails vs. Ruby
  • Daunting to hack on vs. hackable (if odd)
  • Encourages conformance vs. encourages experimentation


Actually hacking on stuff is fun. That’s why Camping & any other cool, out-there framework or language is there: “we need to feed our inner hacker!” Keys to feeding your inner hacker:

  • Needs to be regular (daily would really be great)
  • Eat a variety of foods (different is good: Rails Monday, Camping Tuesday)
  • You have to feed on things that you’re passionate about


The biscuit is the Community. Why do people who have been to both RailsConf and RubyConf prefer RubyConf? [A show of hands confirmed this.] RubyConf is still small, and it’s the hobbyist conference. Both of those are not true of RailsConf. We, the Ruby community, need both conferences, with RailsConf bringing the momentum and RubyConf bringing the vitality.


Testing: There’s a testing framework for Camping called Mosquito.

Production: Nathaniel does use it in production, but not for clients. Intranet apps would be great.

Nathan Sobo: Treetop: Bringing the Elegance of Ruby to Syntactic

Earlier in the day, a significant but small number of folks raised their hands when asked if they had ever written a parser. However, many of us have written ad-hoc parsers on other stuff (in regexes, usually). Why don’t regular programmers use the same tools as language designers for creating parsers? Because the tools for making parsers have a really high barrier to entry, even though regexes and loops make for brittle software. Hopefully, some new tools will lower this barrier to entry. Treetop is an attempt at this.

What is a Context Free (Generative) Grammar? A grammar is a program, a program that generates every possible string in a language. However, there’s a problem with this model of grammars, as sometimes there’s ambiguity (think of if/else with ambiguous nesting). Instead, do Parsing Expression Grammars [PEG], which Treetop uses, and work on recognizing language rather than generating it.

PEGs are just a generalization of ur-regexes, but are more powerful because they can do recursion. Here’s something for (((a))):

# Treetop, not Ruby
grammar ParenLanguage
  rule nested_parens
    '(' nested_parens ')' / [a-z]
  end
end

# use in Ruby like so
load_grammar 'paren_language'
parser = ParenLanguageParser.new
tree = parser.parse("(((a)))")

The tree above is an OO view of the parse.
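To see why the recursion matters, here’s a hand-rolled recognizer for the same rule in plain Ruby. It’s a sketch of what a PEG rule boils down to, not what Treetop actually generates, and the method names are my own.

```ruby
# Hand-written recognizer for:  nested_parens <- '(' nested_parens ')' / [a-z]
# Returns the index just past the match, or nil if the rule fails at pos.
def nested_parens(input, pos = 0)
  # First alternative: '(' nested_parens ')'
  if input[pos, 1] == "("
    inner = nested_parens(input, pos + 1)
    return inner + 1 if inner && input[inner, 1] == ")"
  end
  # Second alternative, tried only if the first one failed: [a-z]
  input[pos, 1] =~ /\A[a-z]\z/ ? pos + 1 : nil
end

def paren_language?(input)
  nested_parens(input) == input.length
end

puts paren_language?("(((a)))")   # balanced
puts paren_language?("((a)")      # missing a closing paren
```

The ordered choice (try the first alternative, fall back to the second) is what keeps a PEG unambiguous, and the method calling itself is the piece a classic regex can’t express.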

Some livecoding

Let’s parse the language of arithmetic.

(5 + 2) * (10 - 5)

First, draw a tree over the thing you want to parse.

Here’s what I captured from what he livecoded:

dir = File.dirname(__FILE__)
require "#{dir}/test_header"

load_grammar "#{dir}/aritmetic"

class ArithmeticGrammarTest < Test::Unit::TestCase
  include GrammarTestHelper

  def setup
    @parser = ArithmeticParser.new
  end

  def test_numbers_simple
    assert @parser.parse('0').success?
    assert_equal 0, @parser.parse('0').eval
    assert @parser.parse('123').success?
  end

  def test_numbers_use_helpers
    assert_equal 0, parse('0').eval
  end

  def test_variables
    assert_equal 2, ... something ..

    assert_equal 20,  parse('x * 10').eval({'x' => 2})
    assert_equal 4 * 3 * 2, parse('4 * 3 * 2 * 1').eval...
  end

  def test_additive
    assert_equal 5 + 2 * 10 - 5, parse('5 + x * 10 - y').eval({'x' => 2, 'y' => 5})
  end

  def test_parentheses
    assert_equal((5 + 2) * (10 - 5), parse('(5 + x) * (10 - y)').eval({'x' => 2, 'y' => 5}))
  end
end

# different file
grammar Arithmetic
  rule primary
    '(' space additive space ')' {
      # ...
    }
  end

  rule space
    ' '*
  end

  rule additive
    operand_1:multiplicative space additive_op space operand_2:additive {
      def eval(env)
        additive_op.apply(operand_1.eval(env), operand_2.eval(env))
      end
    }
  end

  rule additive_op
    # ...
  end

  rule multiplicative
    operand_1:primary space '*' space operand_2:multiplicative {
      def eval(env)
        operand_1.eval(env) * operand_2.eval(env)
      end
    }
  end

  rule variable
    [a-z]+ {
      def eval(env)
        # ...
      end
      def name
        # ...
      end
    }
  end

  rule number
    ([1-9] [0-9]* / '0') {
      def eval(env)
        # ...
      end
    }
  end
end

Note that we didn’t lex anywhere above, and that the stuff above is composable. Grammars can be opened up and have other Grammars included (include Arithmetic, then override some part of it!).

[He just showed a Turing-complete Lambda Calculus language parser in 132 lines.]

Imagine (using each other’s PEGs):

grammar RubyWithSQLStrings
  include Ruby
  include SQL

  rule expression
    # ...
  end

  rule ruby_string
    quote sql_expression quote / super
  end
end

We didn’t cover lookahead, but there’s both negative and positive. Here’s negative (a quote, a bunch of not quotes, followed by quote):

'"' (!'"' .)* '"'
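If PEG syntax is unfamiliar, the same idea can be written as a plain Ruby regex, where `(?!")` is the negative lookahead. This is only an analogy to help read the rule; it isn’t Treetop code, and the constant name is mine.

```ruby
# Ruby regex analogue of the PEG above: (?!") is a negative lookahead,
# so the group matches "any character, as long as it is not a quote".
QUOTED = /"(?:(?!").)*"/

puts 'say "hello" and "bye"'[QUOTED]
```

The lookahead consumes nothing: it only checks that the next character isn’t a quote before the `.` is allowed to match it.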

Memoization makes all of this stuff work, although it wasn’t an option in the past.
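Here’s a rough sketch of what memoization buys a PEG parser (this style is usually called packrat parsing). Each rule is asked at most once per input position, and a failed first alternative doesn’t force re-parsing for the second. The class and its tiny arithmetic grammar are invented for this example and are not how Treetop is implemented internally.

```ruby
# Recognizer for a tiny arithmetic PEG with per-(rule, position) memoization.
# Each rule method returns the index just past its match, or nil on failure.
class PackratSketch
  attr_reader :work   # number of rule bodies actually executed

  def initialize(input)
    @input = input
    @memo  = {}       # [rule, position] => end position or nil
    @work  = 0
  end

  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)   # asked before: reuse the answer
    @work += 1
    @memo[key] = send(rule, pos)
  end

  # additive <- multiplicative '+' additive / multiplicative
  def additive(pos)
    mid = apply(:multiplicative, pos)
    if mid && @input[mid] == "+"
      rest = apply(:additive, mid + 1)
      return rest if rest
    end
    apply(:multiplicative, pos)   # second alternative asks the same question again
  end

  # multiplicative <- number '*' multiplicative / number
  def multiplicative(pos)
    mid = apply(:number, pos)
    if mid && @input[mid] == "*"
      rest = apply(:multiplicative, mid + 1)
      return rest if rest
    end
    apply(:number, pos)
  end

  # number <- [0-9]+
  def number(pos)
    match = /\G[0-9]+/.match(@input, pos)
    match && match.end(0)
  end

  def recognizes?
    apply(:additive, 0) == @input.length
  end
end

parser = PackratSketch.new("1+2*3")
puts parser.recognizes?
puts "rule bodies run: #{parser.work}"
```

The cost is memory: the memo table can hold an entry for every rule at every position, which is presumably why this wasn’t practical on older machines.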

Ryan Davis: Hurting Code for Fun and Profit

On Ruby Sadism, Asceticism, & Introspection.

Start with a story: Once upon a time, a developer went to a New Place. The New Place had legacy code (any code that you didn’t write yourself). Every piece of legacy code references 5 other files and everything is a rat’s nest. The developer is mad. He does what is “right” and kills all of the people responsible.

OR: Developer finds the dependencies, the rat’s nest, and gets angry. But, this time he pulls out tools and instead of maiming the people, he hurts the code. He shows the code who is boss.

 People will press charges if you hurt them.

Photo by candescence

People will press charges when you hurt them, code won’t.

Why Hurt Code

Hurting code is fun, and may make your code cleaner, more readable, and easier to test. If you make fixing code fun, you’ll do it much more often. An obvious example of sadism is killing a bug by writing a new test.

For some reason, people love complexity. Asceticism is characterized by strict self-discipline. Test-first is an example of asceticism. YAGNI is an example of abstention. Resist indulgence! (needless complexity, overly-clever code, code that you don’t need right now, “technical debt”)

“A developer’s obligation is to make sure that the code as written makes the clearest possible statement as to how the solution was understood at the time of writing.” –Ward Cunningham

Introspection-oriented development

How to do it?

  • Ask yourself constantly: How do I do better? How did I overlook that bug? Am I wrong?
  • Improve yourself: read 1 nerd book per month (which is 12x industry average)
  • & other wikis with smart people
  • Get rid of high-flow mailing lists, meaningless blogs in feedreader, bad sites
  • Grow: Learn a language a year, learn your tools much better, examine your habits, study something weird
  • Push yourself: Write lots; throw away; write more (they weren’t kidding when they said “practice makes perfect”)
  • Push yourself more: Be competitive, challenge the status quo
  • Feel: Have an opinion, have passion (zentest, flog & heckle all came from love) (image_science came from hate)
  • Feedback: Figure out how to get better


Flog can help find code that will be hard to test and understand.

Coverage tools are good at finding gaping holes, but they don’t tell you anything about quality.

Heckle (“the most sadistic tool I’ve written”) mutates your implementation to make sure that your tests are good.
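A toy illustration of the idea behind mutation testing: break the implementation on purpose and see whether the tests notice. This is not Heckle’s API. Everything below is hand-written for the sketch, with lambdas standing in for both the code and the test suites.

```ruby
# Toy illustration of mutation testing. This is not Heckle's API; the
# names below are invented for the sketch.
original = lambda { |a, b| a > b ? a : b }   # max of two numbers
broken   = lambda { |a, b| a < b ? a : b }   # mutant: comparison flipped

# A "test suite" here is a lambda that returns true when every check passes.
weak_suite   = lambda { |max| max.call(2, 2) == 2 }
strong_suite = lambda { |max| max.call(1, 2) == 2 && max.call(3, 2) == 3 }

puts "weak suite kills the mutant:   #{!weak_suite.call(broken)}"
puts "strong suite kills the mutant: #{!strong_suite.call(broken)}"
```

Both suites pass against the original, so both would look fine to a coverage tool. Only the mutant shows that the weak one isn’t really checking anything.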

RubyConf 2007 First Day Morning

Friday, November 2nd, 2007

I’m at RubyConf 2007 for the next few days. Here’s a stream-of-consciousness blog of the first morning’s talks. Apparently there will eventually be video of the talks online.

RubyConf shirt

Photo by jremsikjr

David Black kicks it off

This year is bigger than ever, with attendance 15 times greater than the first one in 2001. New tracks have been added & the format has been changed with plenary sessions for the mornings and 3 tracks in the afternoon.

Marcel Molina: What Makes Code Beautiful?

Historical definitions of beauty

Beautiful things according to the audience:

  • My Wife
  • “His Wife”
  • Kids
  • Flowers
  • Expressiveness
  • Simplicity

Marcel Molina

Photo by dwortlehock

“Unlike most of the room, I wasn’t doing awesome BASIC hacks when I was 5 [years old]. I was reading books.” Marcel was interested in language and semantics, especially the differences between very similar sentences & constructions. There are good ways of constructing sentences and bad ways, especially for software.

If you make a really long sentence, with lots of relatively, if interesting, long clauses that don’t really do anything but keep the audience from knowing the important bits (because of huge delay and “suspension”), you suck.

Ruby appealed to Marcel almost immediately on some root level, although he wasn’t entirely aware of why. This is half of why “My Wife” is beautiful but you can’t explain why (“I just feel it” versus “her jawline is the golden ratio”). This makes people’s assertions that Ruby is elegant interesting to try to quantify.

Beauty for Software

Three parts of beauty (from Aquinas):

  • Proportion (you could make your hand 10 times bigger and it’d still be ok, but not if you didn’t preserve the ratio of sizes)
  • Integrity (a crystal hammer might be beautiful, but isn’t much of a hammer)
  • Clarity (“complicated in the pejorative sense”)

Now, a case study in code. The background is a web service that reads in a huge chunk of XML and builds Ruby objects as strings, but it’d be really nice to coerce those strings into appropriate objects:

'true'                     => true
'false'                    => false
'42'                       => 42
'2007-08-01T23:55:35.000Z' => Wed Aug 01 23:55:35 UTC 2007

The basic attempt is just to try a bunch of different coercions, one after the other, in a special try {} block.
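I didn’t capture the original code, but the “try one coercion after another” shape is roughly the following. This is my own sketch of the approach, not Marcel’s CoercibleString: the list of lambdas, the rescue-and-continue loop, and all of the names are assumptions.

```ruby
require 'time'

# Sketch of "try each coercion in turn"; the names here are hypothetical.
COERCIONS = [
  lambda { |s| { 'true' => true, 'false' => false }.fetch(s) },  # KeyError unless boolean
  lambda { |s| Integer(s, 10) },                                 # ArgumentError unless integer
  lambda { |s| Time.iso8601(s) }                                 # ArgumentError unless ISO 8601
]

def coerce(string)
  COERCIONS.each do |attempt|
    begin
      return attempt.call(string)
    rescue KeyError, ArgumentError
      next   # this coercion did not apply; try the next one
    end
  end
  string     # nothing applied, so leave it as a String
end

p coerce('true')
p coerce('42')
p coerce('2007-08-01T23:55:35.000Z')
p coerce('hello')
```

Using exceptions for control flow like this works, but every miss pays for a raise and a rescue, which is part of why a plain `case` on the string ends up both shorter and clearer.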

So, how beautiful is this CoercibleString? It’s fairly proportionate, but that doesn’t really matter, we’re more interested in the appropriate size measure of proportionality. In this case, it’s not the appropriate size, because he later refactored it to half the number of lines (from ~20 to 10). Does it have integrity, in that it is well suited to what it does? He uses the Generator library, which uses continuations in 1.8 (but threads in 1.9), and it was crazy slow and had a memory leak, so it doesn’t have integrity. Clarity? “Uh… yeah.” It had to be explained to everyone (unlike the refactored version). Anyway, it failed on all three.

Short Code: WTF

Photo by jnunemaker

Remember, all three parts of our definition are necessary (and none can be excluded). You lose clarity if you go too far on shortness:

expand(join("", (map { /\s+\w/ ? ( $_ ....

Does quality relate to beauty

Although many engineers don’t always seem to care about beauty (and “feelings”), great software and beauty go hand in hand. For example, Kent Beck’s Smalltalk: Best Practice Patterns is just an exploration of the best ways to design and write software. He might not use the word “beauty” or think about it in that way, but his ideas on rules for writing great software are based on the same principles of beauty outlined above.

Is any of this useful?

This refactored coerce method may not be the most beautiful thing on its own, but compared to assembler, it’s stunning.

require 'time'

class String
  def self.coerce(string)
    case string
    when 'true'          then true
    when 'false'         then false
    when /^[1-9]+\d*$/   then Integer(string)
    when DATETIME_FORMAT then Time.parse(string)
    end
  end
end

Ruby may not be the most beautiful thing in 20 years, but it certainly is today. If you’re not pleased with the beauty of the case statement above, consider the time before if was implemented in programming languages, then consider how beautiful if was when it was first added.

“Luckily for us, Ruby is optimized for beauty.” “When you’re working on software, try to imagine better modes of expression.” After giving it a try, make sure it doesn’t violate any of the three rules of beauty. Iterate until it passes all three and you’ll hopefully end up with something beautiful.

Hats off to Matz and the Ruby core team for making such a beautiful language.

Jim Weirich: Advanced Ruby Class Design

Jim’s history in programming and OO meant that while he’d used dynamic languages, his sense of OO was all from a strict paradigm. Coming from Java and C++ will give you a lot of good concepts, but there are parts of Ruby that are inconceivable in Java.

Jim Weirich

Photo by dwortlehock

Master of Disguise

This is an example from Rake (Rake::FileList).

RUBY_FILES = FileList['lib/**/*.rb']

FileList is like an Array, except that it initializes with a GLOB from the filesystem, has a specialized to_s method, uses lazy evaluation (woot), and has some extra methods (ext (for file extension manipulation), pathmap).

The first pass at this took the similarity to Array and started with that explicitly:

class FileList < Array

Java would suggest that you never inherit from concrete classes, which also is a good rule for Ruby but for totally different reasons. More on that later.

The lazy bits made direct Array access problematic (like index), so each Array-accessing method had to call the resolve method to unlazyify the FileList. This made some operations not work.

Instead of inheriting from Array, you should use to_ary, so that Ruby helps you when messing with other Arrays out in the world. FileList now is just a regular class not inheriting from anything special and Ruby will ask a FileList if it can behave as an Array (using to_ary).

As for all of the methods needing to call resolve, you can DRY this with a list of relevant methods and a class_eval.

Takeaway: Consider to_ary/to_str when you want to mimic a base class, rather than using inheritance.
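Here is a small sketch of that takeaway. It is not Rake's actual FileList: the LazyList name, the loader block, and the RESOLVING_METHODS list are assumptions invented for illustration. It shows a plain class that defers its work, answers to_ary so that real Arrays cooperate with it, and uses class_eval to generate the methods that must resolve first.

```ruby
# A plain class (no "< Array") that behaves like an Array on demand.
class LazyList
  # Hypothetical list of Array methods we want to forward after resolving.
  RESOLVING_METHODS = [:size, :first, :last, :index, :include?]

  def initialize(&loader)
    @loader = loader # the expensive work, e.g. a filesystem glob
    @items = nil
  end

  def resolved?
    !@items.nil?
  end

  # Run the loader once and cache the result.
  def resolve
    @items ||= @loader.call.sort
    self
  end

  # Ruby calls to_ary when it needs a real Array (e.g. Array#+).
  def to_ary
    resolve
    @items
  end

  # DRY: generate one resolving wrapper per forwarded method.
  RESOLVING_METHODS.each do |name|
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def #{name}(*args, &block)
        resolve
        @items.#{name}(*args, &block)
      end
    RUBY
  end
end

list = LazyList.new { ["lib/b.rb", "lib/a.rb"] }
puts list.resolved?            # nothing has been loaded yet
p(["Rakefile"] + list)         # Array#+ asks for to_ary, which resolves
puts list.size
```

Because the class only promises "I can become an Array", it controls exactly when the lazy work happens, which is the part that went wrong when inheriting from Array directly.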

Doing nothing

Jim built Builder for the pure fun of it. (Thanks, Jim, it’s a pretty nice library!) It uses block structure and method_missing to make writing XML much easier. However, because XML element names may conflict with builtin methods (class is a good example), we have to make sure xml.class("Intro to Ruby") doesn’t blow up.

Wouldn’t it be nice to inherit from Object without inheriting all the stuff from Object? Introducing BlankSlate, which is really easy to write:

class BlankSlate
  instance_methods.each do |name|
    undef_method name
  end
end

…but that’s a little too overzealous, because it removes /^__/ methods (__id__ is used by a lot of internal stuff, for example). That can be easily fixed with an unless.
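A minimal sketch of the version with the unless guard, combined with a toy method_missing so the effect is visible. The TagRecorder class and its output format are assumptions for this example, not Builder's real code. The sketch also keeps object_id, since current Rubies warn when it is undefined; and note that Ruby 1.9 and later ship BasicObject, which covers much of the same need.

```ruby
class TagRecorder
  # Remove everything inherited from Object except the double-underscore
  # methods (__id__, __send__) and object_id.
  instance_methods.each do |name|
    undef_method(name) unless name.to_s =~ /\A(__|object_id\z)/
  end

  # Every remaining call turns into a (very naive) XML element.
  def method_missing(tag, *args)
    "<#{tag}>#{args.first}</#{tag}>"
  end
end

xml = TagRecorder.new
puts xml.class("Intro to Ruby")  # "class" is no longer Object#class
puts xml.title("Blank slates")
```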

Unfortunately, you’ve still got problems with global methods defined later (in Kernel, say). You can fix this by adding to the method_added hook in Kernel and Object:

alias_method :original_method_added, :method_added
def method_added(name)
  result = original_method_added(name)
  BlankSlate.hide(name) if self == Kernel
  result
end

All set? Not quite, we’ve still got a similar bug that bypasses method_added:

module Name
  def name
    "My Name"
  end
end

class Object
  include Name
end


This can be fixed with the append_features hook (look at BlankSlate in Builder).
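To see why include slips past method_added, here is a small self-contained sketch. It hooks an ordinary class named Target rather than Kernel or Object, and the EVENTS log is an assumption for the example; Builder's real BlankSlate hooks things more globally. A def fires method_added, an include does not, but the module's append_features hook does see the include.

```ruby
EVENTS = []

class Target
  # Fires for every instance method defined directly on Target.
  def self.method_added(name)
    EVENTS << [:method_added, name]
    super
  end
end

module Name
  # Fires when some class or module includes Name.
  def self.append_features(base)
    EVENTS << [:append_features, base, instance_methods(false)]
    super
  end

  def name
    "My Name"
  end
end

class Target
  def greet
    "hi"
  end

  include Name # adds #name without triggering method_added
end

EVENTS.each { |event| p event }
puts Target.new.name
```

The log contains one method_added entry for greet and one append_features entry listing name, which is the information a BlankSlate-style class needs in order to hide the newly mixed-in methods.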

Parsing without Parsing


User.find(:all, :conditions => ["name = ?", "jim"])

…which looks a lot like SQL code. Why can’t I just call User.select {|user| user.name == "jim"}?

“Wouldn’t it be nice if there was a way we could use select on ActiveRecord models?” Let’s write it (naive first attempt):

class User

This, of course, is not efficient at all, and large tables will kill you. Databases really do have a purpose, of course, and we should be using their design. Here’s a magical method:

  cond = translate_block_to_sql(&block)
  find(:all, :conditions => cond)

…however, not many people have written parsers for Ruby, which is an “interesting language to parse”. You could use ParseTree, which uses Ruby to parse Ruby and then eviscerates the result. Could we just execute the code? (huh?)

Here’s some curious code:

$ irb -rnode1
>> user = TableNode.new("users")
>> result = user.name
>> puts result.to_s
>> result2 = user.age
>> puts result2.to_s

This allows for references to tables that helps build SQL code. Here’s the background (similar for MethodNode):

class TableNode < Node
  def initialize(table_name)
    @table_name = table_name
  def method_missing(sym, *args, &block)
    MethodNode.new(self, sym)
  def to_s

OK, we’ve got field references down, but how do we do stuff like ==?

class Node
  def ==(other)
    BinaryOpNode.new("=", self, other)

…well we just capture the interesting method in the Node class then translate the method to a SQL fragment (in BinaryOpNode)

$ irb -rnode1
>> user = TableNode.new("users")
>> res1 = (user.age == 50)
>> puts res1.to_s
(users.age = 50)
>> res2 = (user.name == "jim")
>> puts res2.to_s
(users.name = jim) # oops, no quotes

To quote strings, we need to differentiate between LiteralNodes and StringNodes; a StringNode just wraps its value in quotes (and probably does escaping). Getting the right kind of Node could depend on a case statement, but that’s not very OO. Every object should really know how to convert itself…

class Object
  # be careful opening core classes, which is why we have a unique name
  def as_a_sql_node
    LiteralNode.new(self)

class String
  def as_a_sql_node
    StringNode.new(self)

Now we just need to call it:

  def ==(other)
    BinaryOpNode.new("=", self, other.as_a_sql_node)


…and, as we’d hoped:

>> res2 = (user.name == "jim")
>> puts res2.to_s
(users.name = 'jim')


We haven’t handled commutativity [hey, Marcel had this problem too!]. "jim" == user.name will not work, although + gets help from coerce for mathematical operators. A killer problem is that &&/|| aren’t methods (by necessity, because they short-circuit). In Ruby 1.8, !/!= also have predefined semantics and aren’t overridable. So, this technique really wouldn’t work for SQL. It’s a “solution looking for a problem.”
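Pulling the fragments from this section together, here is one way the whole thing might look as a single runnable file. The class names come from the talk, but every method body is my reconstruction rather than Jim’s code. In particular the to_s implementations and the quote escaping in StringNode are assumptions.

```ruby
class Node
  # Comparing a node builds SQL instead of returning true/false.
  def ==(other)
    BinaryOpNode.new("=", self, other.as_a_sql_node)
  end

  # Nodes are already nodes.
  def as_a_sql_node
    self
  end
end

class TableNode < Node
  def initialize(table_name)
    @table_name = table_name
  end

  # Any unknown method is treated as a column reference.
  def method_missing(sym, *args, &block)
    MethodNode.new(self, sym)
  end

  def to_s
    @table_name.to_s
  end
end

class MethodNode < Node
  def initialize(receiver, name)
    @receiver = receiver
    @name = name
  end

  def to_s
    "#{@receiver}.#{@name}"
  end
end

class BinaryOpNode < Node
  def initialize(op, left, right)
    @op, @left, @right = op, left, right
  end

  def to_s
    "(#{@left} #{@op} #{@right})"
  end
end

class LiteralNode < Node
  def initialize(value)
    @value = value
  end

  def to_s
    @value.to_s
  end
end

class StringNode < LiteralNode
  # Assumed escaping: double any embedded single quotes.
  def to_s
    "'" + @value.gsub("'", "''") + "'"
  end
end

class Object
  # unique name, to stay out of the way of other code
  def as_a_sql_node
    LiteralNode.new(self)
  end
end

class String
  def as_a_sql_node
    StringNode.new(self)
  end
end

user = TableNode.new("users")
puts((user.age == 50).to_s)
puts((user.name == "jim").to_s)
p("jim" == user.name) # String#== runs instead, so no SQL node comes back
```

The last line shows the commutativity problem: with the string on the left, Ruby dispatches to String#== and the node never gets a say.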

What did we learn?

Programming languages shape the way you think, so make sure you’re thinking about problems in a Ruby-ish way. Sometimes, the corners of a language will hold the keys to good, idiomatic design. Don’t be afraid of unusual solutions (some of the time).

More Clever GMail Ads for Programmers

Tuesday, July 10th, 2007

Just like the folks from Jane Street Capital, Swivel really knows how to write good, eye-catching ad copy (for some crazy subset of the population):

Clever Swivel Rails Ad

JRuby + Jetty

Wednesday, June 6th, 2007

I finally figured out how to get JRuby to serve a Jetty servlet today (thanks to Charles). The key was flipping what I’d been trying to do for a while (getting Jetty to run JRuby). Here’s code that subclasses AbstractHandler pretty trivially:

$ cat jetty_example.jrb 
require 'java'
include_class 'javax.servlet.ServletException'
include_class 'javax.servlet.http.HttpServlet'
include_class 'javax.servlet.http.HttpServletRequest'
include_class 'javax.servlet.http.HttpServletResponse'
include_class 'org.mortbay.jetty.Server'
include_class 'org.mortbay.jetty.servlet.Context'
include_class 'org.mortbay.jetty.servlet.ServletHolder'
include_class 'org.mortbay.jetty.handler.AbstractHandler'
class SimpleHandler < AbstractHandler
  def handle(target, request, response, dispatch)
    response.getWriter().println("<h1>Goodbye, cruel monoglot world!</h1>")
  end
end
handler = SimpleHandler.new
server =

To run, add Jetty to your classpath:

$ export CLASSPATH="/path/to/jetty-6.1.3.jar:/.../jetty-util-6.1.3.jar:/.../servlet-api-2.5-6.1.3.jar"

Then it’s just a normal JRuby invocation:

$ jruby jetty_example.jrb

It’s trivial code at this point (and doesn’t handle concurrent requests, maxing out at 6.47r/s across my network), but at least it’s got me started.

[UPDATE: I can get the non-concurrent request handling way down with just a few simple tweaks (mainly running JRuby in SERVER mode) and running ab locally ;-)]

The Code Behind DocBook Elements in the Wild

Tuesday, May 1st, 2007

[UPDATE: Added a link to the categorized CSV file below]

Here’s some of the nitty-gritty behind DocBook Elements in the Wild. We’re trying to get a count of all of the element names in a set of 49 DocBook 4.4 <book>s.

First, go ask the O’Reilly product database for all the books that were sent to the printer in 2006. Because I’m better at XML than Unix text tools, ask for mysql -X. Now we’ve got something like:

<resultset statement="select...">
  <row>
        <field name="isbn13">9780596101619</field>
        <field name="title">Google Maps Hacks</field>
        <field name="edition">1</field>
        <field name="book_vendor_date">2006-01-05</field>
  </row>
  <row>
        <field name="isbn13">9780596008796</field>
        <field name="title">Excel Scientific and Engineering Cookbook</field>
        <field name="edition">1</field>
        <field name="book_vendor_date">2006-01-06</field>
  </row>
  <row>
        <field name="isbn13">9780596101732</field>
        <field name="title">Active Directory</field>
        <field name="edition">3</field>
        <field name="book_vendor_date">2006-01-06</field>
  </row>
</resultset>

Next, fun with XMLStarlet:

$ xml sel -t -m "//field[@name='isbn13']" -v '.' -n books_in_2006.xml                         

Now, pull the content down from our Atom Publishing Protocol repository and make a big document with XIncludes:

#!/usr/bin/env ruby
require 'kurt'
require 'rexml/document'
OUTFILE = "aggregate.xml"
files_downloaded = []
ARGV.each {|atom_id|
  entry = Atom::Entry.get_entry("#{Kurt::PROD_RESOURCES}/#{CGI.escape(atom_id)}")
  filename = atom_id.gsub(/\W/, '') + ".xml"
  File.open(filename, "w") {|f|
    f.print entry.content
  }
  files_downloaded << filename
}

agg =
agg.root.add_namespace("xi", "")
files_downloaded.each {|file|
  xi = agg.root.add_element("xi:include")
  xi.add_attribute("href", file)
}
File.open(OUTFILE, "w") {|f|
  agg.write(f, 2)
}

Resolve all of the XIncludes into one big file:

$ xmllint --xinclude -o aggregate.xml aggregate.xml 

It’s now pretty huge (well, huge in my world):

$ du -h aggregate.xml
102M    aggregate.xml

At this point, we’re ready to do the real counting of the elements (slow REXML solution commented out in favor of a libxml-based solution):

#!/usr/bin/env ruby
require 'rexml/parsers/pullparser'
require 'rubygems'
require 'xml/libxml'
start = Time.now
ARGV.each {|filename|
  counts = Hash.new
#  parser =
#  while parser.has_next?
#    el = parser.pull
#    if el.start_element? 
#      element_name = el[0]
#      if counts[element_name]
#        counts[element_name] += 1
#      else  
#        counts[element_name] = 1
#      end  
#    end  
#  end
  parser = XML::SaxParser.new
  parser.filename = filename
  parser.on_start_element {|element_name, _|
    if counts[element_name]
      counts[element_name] += 1
    else
      counts[element_name] = 1
    end
  }
  parser.parse
  File.open(filename + ".count.csv", "w") {|f|
    counts.each {|element_name, count|
      f.puts "\"#{element_name}\",#{count}"
    }
  }
}

(Hooray for stream parsing, as this 100MB file was cranked through in 27 seconds on a 700MHz box!)
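For anyone without libxml handy, the same counting idea can be sketched with REXML’s stream listener. This is an illustration only, and it will be much slower than the libxml version on a 100MB file. The ElementCounter class, the count_elements and to_csv helpers, and the tiny sample document are all made up for this example. Using Hash.new(0) also removes the need for the if/else around the counter.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Receives parser events; we only care about start tags.
class ElementCounter
  include REXML::StreamListener
  attr_reader :counts

  def initialize
    @counts = Hash.new(0) # missing keys start at zero
  end

  def tag_start(name, _attrs)
    @counts[name] += 1
  end
end

# Stream through the XML (a String or an IO) and return name => count.
def count_elements(xml)
  counter = ElementCounter.new
  REXML::Document.parse_stream(xml, counter)
  counter.counts
end

# One quoted element name and its count per line, sorted by name.
def to_csv(counts)
  counts.sort.map { |name, count| "\"#{name}\",#{count}" }.join("\n")
end

sample = "<book><chapter><para/><para/></chapter><chapter><para/></chapter></book>"
puts to_csv(count_elements(sample))
```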

Finally, we’ve got CSV and we can do some graphing. Here’s the full CSV and the categorized CSV. Rather than working on a code-based graphing solution, I just messed with Excel. The result:

DocBook Elements from 49 Books

Here’s my favorite, a drill-down based on a categorization I just made up (click through for the drill-down):

DocBook Elements from 49 Books, Categorized

Books used:

JRuby + JFreeChart = Sparklines

Friday, April 13th, 2007

Inspired by how easy it was to get JFreeChart working and some code from former colleague Andrew Bruno, I thought it’d be nice to write some JRuby to generate Edward Tufte’s Sparklines.

Here’s some simple example code on a semi-random dataset:

# Mostly inspired by
# have JFreeChart in your classpath, obviously, as well as jcommon.jar
require 'java'

module Graph
  class Sparkline
    include_class ''
    include_class 'org.jfree.chart.ChartUtilities'
    include_class 'org.jfree.chart.JFreeChart'
    include_class 'org.jfree.chart.axis.NumberAxis'
    include_class 'org.jfree.chart.plot.XYPlot'
    include_class 'org.jfree.chart.renderer.xy.StandardXYItemRenderer'
    include_class ''
    include_class ''
    include_class 'org.jfree.chart.plot.PlotOrientation'

    def initialize(width=200, height=80, data=[])
      @width = width
      @height = height
      dataset = create_sample_data() if data.empty?
      @chart = create_chart(dataset)

    def render_to_file(filename, format="png")
      javafile =
      ChartUtilities.saveChartAsPNG(javafile, @chart, @width, @height)

    def create_sample_data
       series ="Sparkline")
      data = [20]
      (1..99).each {|x|
        y = (data.last + (rand(x) + 1)) / 2
        data << y
        series.add(x, y)

      dataset =
      return dataset

    def create_chart(dataset)
      x =

      y =

      plot =

      chart =, JFreeChart::DEFAULT_TITLE_FONT, plot, false)
      return chart

  end # class Sparkline  
end # module Graph

sp =
puts "Rendering sparkline"

And the resulting sparkline chart:
An Example Sparkline Chart


UPDATE: Removed some of the useless sample generation code