O’Reilly Release ePubs

July 15th, 2008

As of today, 30 O’Reilly titles are available as Ebook bundles and many will be in the Kindle Store later today:

As promised last month, O’Reilly has released 30 titles as DRM-free downloadable ebook bundles. The bundles include three ebook formats (EPUB, PDF, and Kindle-compatible Mobipocket) for a single price — at or below the book’s cover price.

I’ve spent a reasonable chunk of my year helping make this happen, both on the O’Reilly side and by adding .epub support to the DocBook-XSL stylesheets with Paul Norton of Adobe. Hopefully, our customers will be happy with the new formats.

Never Have I Felt More Famous

July 14th, 2008

Ah, the day my tent showed up in TechCrunch:

My Tent on Tech Crunch

Talk: AtomPub Makes You Cool at Code4Lib2008

February 26th, 2008

I just gave a lightning talk at the fabulous code4lib conference. I talked about why the Atom Publishing Protocol was cool and how it could teach people about RESTful Web Services and HTTP (again). Here are the slides: http://kfahlgren.com/talks/code4lib2008/atompub_teaches_rest_http.pdf.

For heaven’s sake, please just install hoe!

February 21st, 2008

This is why people get annoyed about the silly gem install dependency mess (esp WRT hoe):

$ gem install heckle
Need to update 31 gems from http://gems.rubyforge.org   # fair enough, you're allowed
Install required dependency ruby2ruby? [Yn]  Y    # Yeah, I know you need these
Install required dependency ParseTree? [Yn]  Y     # oh, and this one too
Select which gem to install for your platform (i686-linux)
 1. ParseTree 2.1.1 (ruby)                                   # yeah, just a vanilla ruby here
 2. ParseTree 2.1.1 (i386-mswin32)
 3. ParseTree 2.1.0 (ruby)
 4. ParseTree 2.0.2 (ruby)
 5. Skip this gem
 6. Cancel installation
> 1
Install required dependency RubyInline? [Yn]  Y   # This is also needed, I gather 
Install required dependency hoe? [Yn]  Y             # Ha, hoe again, OK
Install required dependency rubyforge? [Yn] Y     # Don't care, don't understand why both
Install required dependency rake? [Yn]  Y            # There's no rake on this box, really?
Install required dependency hoe? [Yn]  Y             # WTF, yeah, I just said that
Install required dependency hoe? [Yn]  Y             # .. now you're just fucking with me
Install required dependency ZenTest? [Yn] Y        # Huh? I like ZenTest, but there's no reason...
Install required dependency hoe? [Yn]  Y             # **** YOU, hoe!
Successfully installed heckle-1.4.1
Successfully installed ruby2ruby-1.1.8

The fourth time it asked me, it decided to trust me and actually started installing the gems….

RescueTime Is Da Bomb!

February 14th, 2008

I’ve been using RescueTime since the fall, after hearing about it from some YCombinator-related person. It’s an absolutely spectacular application, and has really changed the way I understand my work and computer use. They also just released a cool widget:

EDIT: I can’t get WordPress to not screw up the widget markup. THEN: Raw-HTML to the rescue!

XML for Publishers at TOC

February 14th, 2008

Keith Fahlgren at his TOC Tutorial, XML for Publishers

Originally uploaded by duncandavidson

I just got back from New York and the second annual O’Reilly Tools of Change for Publishing (TOC) Conference. It’s become a very impressive conference in just two years and had impressive attendance and speakers this year. There’s good blog coverage from George Walkley and pointers to more from the new TOC blog.

I had the honor of doing a tutorial on the last day and had a great time talking with and teaching an energized, question-happy audience about XML in the publishing industry. If you weren’t able to make it to TOC this year, you can pre-order the DVDs of four of the eight tutorials, including mine, and get 30% off with discount code TOCD3. Here’s the link: XML for Publishers.

Hott! Work-Historical Shirt

February 12th, 2008

Production University Tools Seminar

Originally uploaded by Norm Walsh

Loved this shirt. Blogged longer about it at XML.com History Repeats: Teaching Publishers Markup.

ISBN10 to ISBN13

January 24th, 2008

As of the beginning of 2007, ISBN10 is dead. Now we’re in a world that allows “979” prefixes, though the following code doesn’t expect them yet…

Here’s some stuff to turn your 10-digit ISBNs into 13-digit ISBNs, naively assuming “978”, following the API post doing the same from LibraryThing. There’s another one at isbn.org for humans.

Code from O’Reilly’s internal stuff:

module Isbn
  # Insert dashes into a 10- or 13-character ISBN string.
  def self.dash_isbn(isbn)
    raise ArgumentError.new("ISBN argument must be string") unless isbn.is_a?(String)
    if isbn.length == 10
      return isbn[/^./] + "-" + isbn[1..3] + "-" + isbn[4..8] + "-" + isbn[/.$/]
    elsif isbn.length == 13
      return isbn[0..2] + "-" + isbn[3].chr + "-" + isbn[4..6] + "-" + isbn[7..11] + "-" + isbn[/.$/]
    else
      raise ArgumentError.new("ISBN must be 10 or 13 characters")
    end
  end

  # Drop the old check digit, prepend "978", and append a fresh check digit.
  def self.isbn10toisbn13(isbn)
    raise ArgumentError.new("ISBN argument must be string") unless isbn.is_a?(String)
    raise ArgumentError.new("ISBN must be of length 10") unless isbn.length == 10
    prefix = "978"
    isbn12 = prefix + isbn[0...-1]
    return isbn12 + check_digit_13(isbn12).to_s
  end

  def self.check_digit_13(isbn_12)
    # http://www.barcodeisland.com/ean13.phtml
    # need to subtract remainder from 10
    # and do exemption for zero LMS 08.29.2006
    raise ArgumentError.new("ISBN must be of length 12") unless isbn_12.length == 12
    sum = 0
    odds = 0
    evens = 0
    # Digits at even (zero-based) positions weigh 1, the others weigh 3.
    isbn_12.scan(/\d/).each_with_index {|d, i|
      if (i % 2) == 0
        evens = evens + (d.to_i * 1)
      else
        odds = odds + (d.to_i * 3)
      end
    }
    sum = evens + odds
    digit = sum % 10
    if digit.zero?
      return 0
    else
      return 10 - digit
    end
  end
end # of module Isbn

The last test method relies on my database, which you won’t have. Replace it with something else you trust or drop it.

#!/usr/bin/env ruby

require 'test/unit'
require 'pdb'   # my product database library; only test_isbn13 needs it
require 'isbn'

class IsbnTest < Test::Unit::TestCase
  def setup
    @isbn10 = "059610123X"
    @isbn13 = "978059610123X"
  end

  def test_dash_isbn
    # must be a String
    assert_raise ArgumentError do Isbn.dash_isbn(123) end
    # must be 10 or 13 characters
    assert_raise ArgumentError do Isbn.dash_isbn("123") end
    assert_equal("0-596-10123-X", Isbn.dash_isbn(@isbn10))
    assert_equal("978-0-596-10123-X", Isbn.dash_isbn(@isbn13))
  end

  def test_isbn10toisbn13
    # must be a String
    assert_raise ArgumentError do Isbn.isbn10toisbn13(123) end
    # must be 10 characters
    assert_raise ArgumentError do Isbn.isbn10toisbn13(@isbn13) end
    assert_equal("9780596101237", Isbn.isbn10toisbn13(@isbn10))
  end

  def test_check_digit_13
    # must be 12 characters
    assert_raise ArgumentError do Isbn.check_digit_13(@isbn13) end
    assert_equal(7, Isbn.check_digit_13(@isbn13[0..-2]))
    assert_equal(0, Isbn.check_digit_13("123456789018"))
  end

  def test_isbn13
    ["0596008627", "1565929470", "0596002734", "0596004001", "0596527357",
     "0596101635", "0596005059", "0596527063", "1565926374", "0596526946",
     "1565926420", "0596101805", "059652742X", "156592455X", "0596006446",
     "0596008473", "0596009607", "0596100582", "0596100493", "0596004427",
     "1565925890", "1565927141", "059651610X"].each {|isbn|
      puts "Testing #{isbn}"
      assert_equal(PDB::ProdDB.new(isbn).isbn13, Isbn.isbn10toisbn13(isbn), "Bad checksum!")
    }
  end
end
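
The conversion code trusts whatever 10-character string it is handed. Here is a minimal sketch of checking an ISBN-10's own check digit first, assuming the usual weighted mod-11 rule (weights 10 down to 1, with "X" standing for 10 in the last position). The method name valid_isbn10? is made up for this example and is not part of the O’Reilly module:

```ruby
# Hypothetical helper, not part of the Isbn module.
# True when the string looks like an ISBN-10 and its weighted sum
# (weights 10 down to 1, "X" counts as 10) is divisible by 11.
def valid_isbn10?(isbn)
  return false unless isbn.is_a?(String) && isbn =~ /\A\d{9}[\dX]\z/
  sum = 0
  isbn.chars.each_with_index do |char, i|
    value = (char == "X") ? 10 : char.to_i
    sum += value * (10 - i)
  end
  (sum % 11).zero?
end
```

Something like `Isbn.isbn10toisbn13(isbn) if valid_isbn10?(isbn)` would then skip obviously mistyped input.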

RubyConf 2007 Second Day Afternoon

November 3rd, 2007

Ed Borasky: Profiling and Tuning Ruby 1.8

Slides are available here. Cougar is the name of the project that this is coming from?

Is Ruby 1.8 Slow?

To benchmark: Collect a set of benchmark times, then normalize them, then compute the geometric mean of the ratios.
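
A rough sketch of that recipe, with invented timings (nothing here comes from Ed’s slides, and the method name is my own):

```ruby
# Normalize each benchmark time against a baseline, then take the
# geometric mean of the ratios. Names and numbers are illustrative only.
def geometric_mean_of_ratios(baseline_times, candidate_times)
  unless baseline_times.length == candidate_times.length && !baseline_times.empty?
    raise ArgumentError, "need two non-empty lists of equal length"
  end
  ratios = baseline_times.zip(candidate_times).map do |base, candidate|
    candidate.to_f / base
  end
  # Average the logs and exponentiate, rather than multiplying everything.
  log_sum = ratios.inject(0.0) { |acc, ratio| acc + Math.log(ratio) }
  Math.exp(log_sum / ratios.length)
end

baseline  = [1.0, 2.0, 4.0]   # made-up times for the reference language
candidate = [2.0, 8.0, 32.0]  # made-up times for the language under test
puts geometric_mean_of_ratios(baseline, candidate)  # about 4 (ratios are 2, 4, 8)
```

The geometric mean is used instead of the plain average so that one wildly slow benchmark doesn’t swamp the rest.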

Alioth is a popular (if controversial) set of benchmarks. Using gcc as the standard, java is about 3 times slower, python is about 10, and ruby is 19 times slower (slower than Python, PHP, and Perl).

“Hemibel thinking”: (half of an “order of magnitude” [factor of 10]). Look for hemibel improvements or differences. Any greater accuracy doesn’t really help. Hemibel ratios: log10(a/b) * 2. Now we can rethink java as ~1, perl ~2, ruby ~2.6 (in terms of hemibels). Now we can say “Ruby is sort of slow.”
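
The hemibel formula is small enough to sketch directly. The slowdown factors below are the ones quoted from the Alioth numbers; the method name is my own:

```ruby
# Hemibels between two measurements: twice the base-10 log of their ratio.
def hemibels(a, b)
  2 * Math.log10(a.to_f / b)
end

# Slowdown factors relative to gcc, as quoted in the notes.
{ "java" => 3, "python" => 10, "ruby" => 19 }.each do |language, factor|
  puts "#{language}: #{hemibels(factor, 1).round(1)} hemibels"
end
```

Rounded to one decimal place, a factor of 3 comes out at 1.0 hemibels, a factor of 10 at 2.0, and a factor of 19 at 2.6.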

But, of course, reducing the speed of a language to a single number is not really that interesting. Instead, we need to find the variation of the benchmarks. Cue Box and Whisker Plots (they show a lot of data in a small amount of space). Now yarv looks really nice, with a really limited variation over the different benchmarks (python and ruby do well, perl and php do poorly). Each language has big (upper) outliers. Ruby’s worst outlier is “spectralnorm 500” (also bad for yarv).
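
For anyone who hasn’t met them, a box and whisker plot is drawn from a five-number summary of the data. A small sketch of computing one, assuming quartiles are taken as the medians of the lower and upper halves (other conventions exist, and I don’t know which one the slides used):

```ruby
# Median of an already-sorted list.
def median_of_sorted(sorted)
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid].to_f : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Five-number summary: min, lower quartile, median, upper quartile, max.
# The middle value is left out of both halves when the count is odd.
def five_number_summary(values)
  raise ArgumentError, "need at least two values" if values.length < 2
  sorted = values.sort
  half = sorted.length / 2
  lower = sorted[0, half]
  upper = sorted[sorted.length - half, half]
  [sorted.first.to_f, median_of_sorted(lower), median_of_sorted(sorted),
   median_of_sorted(upper), sorted.last.to_f]
end
```

Feed it the normalized ratios for one language and the spread between the quartiles gives the height of that language’s box.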

OK, it’s slow, now what?

What to do: Throw hardware? Wait for 1.9/JRuby/IronRuby/Rubinius? Tune for 1.8?

Let’s try tuning 1.8 against the slightly modified 1.9 benchmark suite. First, check how much you can get out of GCC optimizations. There’s something to be gained there, but it’s a pretty small difference (1/2 a hemibel). Next, check out gprof. gprof shows the top three most expensive methods were rb_eval, rb_call0, and rb_call (for slightly more than 50% of the time). Another GNU tool is gcov (for coverage and profiling), and it can provide some more help.

Phil Hagelberg: Tightening the Feedback Loop

How can you become a more effective programmer? First, read The Pragmatic Programmer. Next, test more (duh) and use Test::Unit or RSpec (don’t waste your time doing stuff by hand that the computer could automate). autotest takes away a bit of the discipline of good testing (it tells “you what you need to know when you need to know it whether or not you know you need it or not”). But, autotest still requires a lot of context switching between terminals. Consider a solution like Flymake for emacs (which does in-editor syntax checking).

Testing habits help show the general feedback principles nicely. Many other parts of software development could benefit from the same introspection, measuring, and automation. What else should we measure?

  • Accuracy (syntax highlighting and automatic indentation): modern editors
  • Meta-Accuracy (especially when starting out, are your accuracy checks actually accurate?): heckle, rcov
  • Maintainability (hard to measure, but you can measure complexity): flog
  • Performance: write your own automated test for what you care about

Now that you’re watching various things, track it over time so that you can chart personal/project progress.
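
As one way to act on the performance item above, here is a sketch of a tiny timing check built on Ruby’s standard Benchmark module. The helper name and the budget are invented; Phil didn’t show this code:

```ruby
require 'benchmark'

# Hypothetical helper: runs the block once and reports whether it
# finished within the given number of seconds.
def within_budget?(max_seconds)
  elapsed = Benchmark.realtime { yield }
  elapsed <= max_seconds
end

# Example: summing a range should take well under a second.
puts within_budget?(1.0) { (1..10_000).inject(0) { |sum, n| sum + n } }
```

Inside a Test::Unit test this could be wrapped in an assert, and logging the elapsed time on each run would give you the numbers to chart over time.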

Eric Hodel: Maximizing Productivity

(How to find time to contribute to more projects)

Fun is the key to high productivity. It’s easier, of course, to have more fun on stuff you want to work on rather than something that someone else wants you to work on. YAGNI & KISS help break projects into tiny, implementable pieces. Doing a lot of tests up front will help other people write patches for you and will help you not make mistakes when adding new features. [More pimping for heckle and autotest.]

Document late in the game to capture the latest changes. README should provide a quick-start, links to the rest of the documentation, a bug tracker link, and a synopsis and feature overview.

Finally, before releasing anything have a partner review your code.

RubyConf 2007 Second Day Morning

November 3rd, 2007

John Lam: State of IronRuby

John Lam

Photo by dwortlehock

Who works on IronRuby? The core is: John Lam, Haibo Luo, Tomas Matousek, & John Messerly.

Why did John move from Toronto to Seattle to start working at Microsoft on IronRuby? He was in love with RubyCLR and couldn’t turn down the opportunity to work on a “real” implementation.


John wants IronRuby to be a Ruby implementation, but he is also interested in changing Microsoft’s approach to opensource (“change or die”), especially the “either or approach” to thinking about opensource. However, it’s really hard to change a company that’s doing pretty well. Unfortunately, after some announcements about their intentions, the blogosphere decided that IronRuby was going to be the work of the devil. Today, the project is hosted on Rubyforge, has open SVN access, and has already received code contributions from the outside community.

Support Rails: it’s “the testbed” (despite speculation about MS not wanting to threaten ASP.NET).

Run everywhere .NET runs (Mac, Linux, Windows). [John is doing the presentation on a Mac and does an irb demo (compiled on Windows) with CLR types (with nice integration between the CLR type’s methods and pure Ruby methods) on Mono.]

Under the covers

[John runs the Rubinius spec suite in IronRuby in Windows on his Mac. A lot of things fail (373 failures of 1030 examples). He thanks the Rubinius team for all of their work.]

[John goes on to show a bunch of quick IronRuby demos, including some in-browser demos using IronRuby in Silverlight.] There’ll be more examples up later on his blog.

If you’re interested in making a language on .NET, please come to the Lang.NET Symposium.

Q & A

Release schedule? Unlike “real projects”, IronRuby has a “conference-driven schedule”. They want to do a push to have Rails working in IronRuby by RailsConf 2008.

Can the IronRuby developers look at other implementations’ source code? They have to be “extremely clean” on the DLR, but “IronRuby does not have the same restrictions” because it doesn’t ship with the OS. Explicitly, they can look at tests and test frameworks.

Charlie Nutter & Tom Enebo: JRuby: Ruby for the JVM

JRuby logo

Everyone in the audience knows already what JRuby is, so they won’t cover that. It’s a Java implementation of the Ruby 1.8 language. It’s available under a few opensource licenses, blah blah blah. They released 1.0.2 a couple of days ago (and 1.1b1 last night during Matz’ questions).

[Tim Bray fairly incomprehensibly announces that Sun has done a deal with University of Tokyo to work with ? on ? and give them “a bunch of money”. Another implementation?]


Installation: 1. unpack binary, 2. set PATH. Dependencies: Java 5+. jruby.jar contains the full runtime.

The lexer is a hand-written lexer ported from MRI. The parser is another port of the MRI YACC/Bison parser using Jay (Bison for Java). This parser is the key for the recent boom in IDE support for Ruby (like NetBeans). [Charlie does a little NetBeans demo showing auto-complete and variable renaming/highlighting, unused variable detection.]

The Core classes are all written in Java, and nearly all have a 1:1 correspondence (String is RubyString, Array is RubyArray, etc).

The interpreter is a “simple switch-based AST walker” that recurses for nesting. Code starts out interpreted but command-line scripts compile immediately. “eval’d code always interpreted (for now).”
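
To make “switch-based AST walker” a bit more concrete, here is a toy version in Ruby. It is only an illustration of the idea: JRuby’s real interpreter is written in Java and walks Ruby’s actual AST, and the node shapes below are invented:

```ruby
# Toy tree-walking evaluator. Each node is an array whose first element
# names the node type; nested nodes are handled by recursing.
def evaluate(node, env = {})
  type, *args = node
  case type
  when :lit then args[0]
  when :var then env.fetch(args[0])
  when :add then evaluate(args[0], env) + evaluate(args[1], env)
  when :mul then evaluate(args[0], env) * evaluate(args[1], env)
  when :if
    evaluate(args[0], env) ? evaluate(args[1], env) : evaluate(args[2], env)
  else
    raise ArgumentError, "unknown node type: #{type.inspect}"
  end
end

tree = [:add, [:lit, 1], [:mul, [:var, :x], [:lit, 3]]]
puts evaluate(tree, { :x => 4 })  # 1 + (4 * 3), prints 13
```

The case statement plays the role of the switch, and every nested expression costs another method call, which is part of why compiling is attractive.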

JRuby 1.1 brings full bytecode compilation. “The compiler is basically done. This is the only complete 1.8 compiler in existence.” [Charlie shows a fib benchmark. Compiling makes it quite a bit bigger in filesize (oh well). There’s a _huge_ improvement when using java -server.] Typically, you want to use AOT mode, which avoids “JIT warmup time”. It may also use less memory in the future and perhaps start faster?

Charlie and Tom

Photo by dwortlehock


Performance optimizations are progressing in a few directions. The first is the obvious one of compilation. For real numbers, see this blog post. Compilation caches literals, uses Java local variables when possible, and uses “monomorphic inline method cache.” ObjectSpace is a more controversial optimization. ObjectSpace allows you to iterate all Objects in a system. This isn’t painful in MRI, but is not easy in the fully-concurrent JRuby (“it sucks 2-5 times as much”). Charlie asserts that it is almost never used (except for Test::Unit). It’s turned off by default but can be turned on with a flag. [Some in the audience believe that the reverse should be used.] Custom implementations of Array, String, and Hash (among others) have also boosted performance.
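
For anyone who hasn’t touched ObjectSpace, this is the sort of thing it lets you do: walk every live Class and pick out the subclasses of some base class, which is roughly how a test runner might discover test cases. The class names are invented, and this is a sketch of the idea rather than Test::Unit’s actual code:

```ruby
# Invented example classes, stand-ins for something like test cases.
class BaseCase; end
class FirstCase < BaseCase; end
class SecondCase < BaseCase; end

# Walk every live Class object and keep the proper subclasses of base.
def subclasses_of(base)
  found = []
  ObjectSpace.each_object(Class) do |klass|
    found << klass if klass < base
  end
  found.sort_by { |klass| klass.name.to_s }
end

puts subclasses_of(BaseCase).inspect
```

Under MRI this should list FirstCase and SecondCase. The catch is that the runtime has to be able to enumerate every object, which is exactly the bookkeeping JRuby would rather not pay for.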

Threading, Extensions, & Integration

JRuby supports only native OS threads but they’re parallel. They also emulate unsafe green thread operations (Thread#kill, Thread#raise, etc).

Ruby native (C) extensions are not supported. Some libraries may be accessible with JNA. If you’re looking for a good binding, consider a Java equivalent to the C. Most of the extensions that Rails uses have already been ported to Java/JRuby.

You can call Ruby from Java, you can call Java from Ruby. A popular integration is building GUIs with Swing. Ruby makes “Swing development fun” and is much less verbose than the Java version. Swing integration also brings a cross-platform GUI solution to Ruby (essentially for the first time). A direct approach:

$ jirb
>> include Java
=> Object
>> import javax.swing.JFrame
>> frame = JFrame.new("Hello")
>> frame.show
=> nil
# there is a frame on the screen
>> frame.set_size(500, 500)
=> nil 

Profligacy adds a new layout language to simplify GUI creation. MonkeyBars takes a GUI editor approach.


Rails works with JRuby. It uses a JDBC connector to ActiveRecord. It can generate .war files (via Warble). ActiveHibernate is coming soon, and provides a different persistence API. Ruvlets brings Servlets to JRuby.


You can use Test::Unit or RSpec for Java code. “It’s so much less code” than writing tests for Java directly.


  • JRuby is ready
  • JRuby is more than just an implementation
  • JRuby needs your help

Evan Phoenix: Rubinius 1.0

Rubinius logo

Rubinius, like IronRuby and JRuby, is aiming to bring “total world domination …. for Ruby!” It’s a Smalltalk-inspired VM:

class Rubinius < Smalltalk
  # form
  include Ruby::Syntax
  # function
  include Ruby::Behavior
  include Google.search("crazy cs papers")
end

Rubinius debuted last year at RubyConf 2006. There’s been “enormous progress” over the course of the year.

A comparison of implementation languages (slightly misleading):

84,516 lines of C
     0 lines of Ruby

128,786 lines of C
     50 lines of Ruby

 48,282 lines of C#
      0 lines of Ruby

113,508 lines of Java
  1,000 lines of Ruby

 25,398 lines of C
 13,946 lines of Ruby 

Evan Phoenix

Photo by dwortlehock

Some “Junior High Analysis”:

  • 1.8 & 1.9 is Ruby (core) for C programmers
  • JRuby is Ruby for Java programmers
  • IronRuby is Ruby for C# programmers
  • Rubinius is Ruby for Ruby programmers

Rubinius finally allows Ruby programmers to “eat yummy dogfood.” It’s targeted at fellow Ruby programmers, not folks on the street. It also provides a much tighter feedback loop for language improvements & clarification.

Following on that, they have 57 committers, with 17 of those with 20 or more commits (36 with 100+ lines changed). Commit bits have been “free flowing” and easy to get (if you submit something as a patch, you get commit rights).

Props to EngineYard for funding Evan’s work on Rubinius.

Evan is shooting to have 1.0 out by ? (was: RubyConf 2007). Things change. Mistakes were made. They’re evolving.

Q & A

Given that Ola Bini and Avi Bryant have suggested it, should Rubinius replace MRI eventually? The chicken didn’t decide to replace the dinosaur [laughs], the environment did. The ultimate goal (from the start) wasn’t “replace MRI”, it was “total world domination”.

Are you more interested in performance than conformance? Rubinius has some compiler flags to turn certain optimizations on and off.

How small do you think you can get the C? They’ll follow Squeak in that they’ll write a C generator. The final goal is hand-maintained lines of C==0.

Are you worried about incompatibilities? We’re all worried about that. We wouldn’t check in something that intentionally broke compatibility.

What are these specs you’re writing? Do you pass them? They’re as implementation-agnostic as possible and written for MiniRSpec (intended to be syntax-compatible with RSpec). Compatibility is all centered on emulating MRI. We used to fail ~1100 specs a few weeks ago, today we’re closer to ~500. The improvement is all due to community involvement.

Have you made any enhancements? Yeah, we did some prettier backtraces and capturing exceptions from C extension segfaults, for example.

Is Rubinius challenged by ObjectSpace, Continuations, & Selector Namespaces? They have a problem with ObjectSpace like JRuby, continuations are “our shiny awesomeness” (thanks to Smalltalk’s spaghetti stack) [shows continuation demo], and (no comment on SN).