Rubyssimo: April 2007

Sunday, April 29, 2007

Insano-Pattern: Tuple Madness

Here at this blog I hope to be documenting all kinds of coding tricks sure to keep you indispensable to your employers. This one's for the dynamic languages only: tuple madness.

The pattern is this:

Never return anything from a method call that is not wrapped in an array.

Do:

Prefer returning many objects at once to creating a struct-like container for them.
Return objects of extremely heterogenous types, sharing no ancestor class whatsoever.
Use well-known mathematical sequences as type-indicating indices to your array, placing -- for example -- Orders on all Fibonacci numbers and Users on all perfect squares, with perfect squares taking precedence. Point out that this allows rapid < O(n) array traversal.
Once, just once, return [[[]]] and parse it out as the true set-theoretic definition of the number 2.

Don't:

Document your return types in comments.
Return types in a consistent order. Use the handy-dandy shuffle function.
Return a consistent number of objects. Different code paths should sometimes leave four objects, sometimes five.

If someone criticizes your code, tell them that:

Your coding is LISPish, and that there is just nothing as beneficial to developer productivity as LISP and its powerfully productive list abstractions.
You are taking advantage of duck typing and that you work twice as quickly as you did in the days of static programming.
You are practicing defensive programming and making sure that no dreaded null pointer exception will ever be thrown again. You are doing this by enforcing scrupulously consistent contracts on each method's return type, and you really deserve a raise for all the debugging costs you are saving the company. (Note that this should not prevent you from returning nil wrapped in an array.)
Typing is just not Agile®. Finishing user story cards is all that counts.
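In case anyone is tempted to take the advice above at face value, here is a minimal sketch of the contrast the pattern is poking fun at. The method names and the OrderSummary struct are made up for illustration; the only point is that the struct names its fields, while the tuple makes every caller memorize positions.

```ruby
# Tuple madness: the caller has to remember what each position means.
def find_order_tuple
  ["Ada", 42, nil]
end

# The boring alternative: a struct-like container with named fields.
# OrderSummary is a hypothetical name used only for this example.
OrderSummary = Struct.new(:customer, :order_id, :coupon)

def find_order_summary
  OrderSummary.new("Ada", 42, nil)
end

tuple   = find_order_tuple
summary = find_order_summary

tuple[1]          # which field was index 1 again?
summary.order_id  # self-describing
```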

Friday, April 27, 2007

Coghead Revisited -- A New Kind of OS?

Teaching the Whole Office to Program

Since my elementary school days of turtle programming and BASIC, there's been talk about the importance of teaching more people -- kids even -- how to program, making programming into a basic skill, so that anyone could use it to improve her productivity anytime she wanted to without hiring an IT consultant. And the focus -- in this great quest -- has often been on making programming languages a lot easier to understand, easier to read. And, as for all Big Ideas, there's even a sizable naysayer- and backlash- community.

Coghead is building tools to do this kind of thing for Web applications -- to liberate the world from programmers (I, as a programmer, am chomping at the bit to automate myself into irrelevance: I mean that completely sincerely) or to make us all into programmers, depending on your point of view. I saw a firm at Web 2.0 last week (apologies, their name escapes me) that was promoting English as a scripting language.

To be honest, I'm really not sure programming languages can get much simpler. I am very, very happy with the Ruby syntax. I think Ruby -- as a syntax -- is a thing of succinct, eloquent genius. So I don't think the next big gain in either developer productivity or the democratization of programming comes from improving on that syntax. I think it's gone about as far as it can go. The returns are diminishing.

So what if we changed tack? What if we tried something other than simplifying programming languages? What if we took Coghead's approach toward all of our programming? I know what you're thinking to yourself: groan, not the dumb RAD (Rapid Applications Development) tools that were so big in the 80s and early 90s, those clunky sets of wizards that never amounted to much actual productivity-gain. No, RAD wasn't so rad, because it was simply another interface to the familiar, forever-inaccessible machine code. What I'm wondering about rather is this: What if an entire OS were built in this Coghead-style from the ground up?

What if I could create my own custom view in Outlook? In Photoshop? What if I didn't need to go spelunking in a 6,000 page Developer's Guide? What if I didn't need to pay a year's worth of college tuition for an SDK to do that?

That is, what if applications no longer had despotic control over their own frames? What if I could create a view in any application on my desktop? The OS allowed it, made provisions for it, so no application could fail to support it. And what if I could do this visually? And what if I could tell the OS that what I wanted was a list, and then I could tell the OS what I wanted it to be a list of? What if the entire OS took the creation of this kind of abstraction as one of its primary responsibilities?

So why hasn't this happened already?

The Problem: Declarative Programming and the Backwardness of Software Development

Try teaching someone to program for the very first time. What's the first conceptual stumbling block? What I've found is that it takes people a while to come to terms with the idea that they can't just tell the computer what to do, what they want, what their intention is, they actually have to tell the computer how to do it.

I firmly believe that a smart application could talk a person through about 80% of what programmers commonly do, since so much of what programmers do day-to-day is repetitive and near-automatable with a piece of software pitched at the right level of abstraction. Think about it this way: most all of us still program procedurally. But what we are doing is in essence something declarative. We tell our computers how to do things step-by-step, not what we want done.

After all, the history of programming is the history of increasing abstraction. Standard libraries increase abstraction. Third-party libraries increase abstraction. Good application design involves creating your own reusable abstractions. And these abstractions are declarative in nature: we tell our processor that we want it to multiply two floats. We do not tell it how to encode the floats. We simply need to rise to the level of declarative abstraction that non-practitioners are comfortable with.

But most programmers still, for the most part, program at the wrong level of generalization. Creating new algorithms is a very small subset of the day-to-day tasks of most of us. You find yourself most often using other people's algorithms, applying design patterns, and following common business processes -- procedurally doing what ought to be done declaratively.
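To make the procedural-versus-declarative distinction concrete, here is a small sketch in plain Ruby. The Order struct and both method names are invented for the example. The first method spells out how to walk the list; the second only says what is wanted and leaves the traversal to the library.

```ruby
# Hypothetical data for illustration only.
Order = Struct.new(:customer, :total, :paid)

SAMPLE_ORDERS = [
  Order.new("Ada",   10.0, true),
  Order.new("Bob",   99.0, false),
  Order.new("Carol",  7.5, true)
]

# Procedural: we tell the computer how to do it, step by step.
def paid_totals_procedural(orders)
  totals = []
  index = 0
  while index < orders.length
    order = orders[index]
    totals << order.total if order.paid
    index += 1
  end
  totals
end

# Declarative in spirit: we state which orders we want and what to
# take from each, without managing indices or accumulators ourselves.
def paid_totals_declarative(orders)
  orders.select { |order| order.paid }.map { |order| order.total }
end
```

Both methods return the same list; the second is simply closer to how a non-programmer would phrase the request.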

Rails -- considered as a DSL for web applications -- actually goes a long way towards making Web application programming declarative. Even Java's newfangled annotations are moving in that direction. Those old RAD tools back in the day were basically just visual interfaces to declarative programming done cheaply by code generation.

There are of course the renowned plug-in architectures of Firefox and Eclipse. But those bad boys require some serious code-fu, and while they might liberate their respective platforms, they don't do anything to lower the bar on what it takes to be able to create your own software. If anything, they raise that bar. There's a long long road between Java 101 and OSGi development.

What has limited the power of these kinds of tools in the past is this: it's just been too hard to get all your applications to play ball together. There's just not much out there that I could tie my little visually declared list to. Web 2.0 is making interoperability compulsory, but only an operating system could make it a true precondition of software development. Lots of OSes have component models, but applications tend to support components as an afterthought. What if they simply had to develop for them or else they would have no persistent storage?

Application Rights versus User Rights

So I think there are other kinds of orthogonal concerns whose development would really make a proposal like this fly. Consider, for example, the broad issue of application rights versus user rights. Traditionally, applications have had rights to their territorial integrity (their frame). They have rights to their source code, if they want it (pretty hard to disassemble most stuff). And they have rights to keep your data however they want. What if we revoked those rights? What if we revoked the right to the user's data? That is, what if an OS were designed to prevent applications from serializing data unless it were serialized to an open specification (a specification it would have to provide to the OS)? What if we revoked the right to the source code? What if a popular OS took hold in which everything were scripted and nothing could be compiled down? Sure, companies might shy away from it, but if the platform were popular enough, they would simply have no choice but to develop for it. There's much less to fear from openness if everyone must be open, because if you're looking for intellectual property infringement, no one can hide infringement from you.

I'm not saying that all these ideas are feasible, but I hope they are worth thinking about. It's hard to imagine a speedy OS that didn't allow programmers to compile things down to machine code for some optimizations. On the other hand, as processing power continues to grow, we might see these kinds of optimizations less and less often over the next couple of years.

Wednesday, April 25, 2007

HTML 5 and Drag and Drop

Here's a good overview of the newly unveiled HTML 5 specification.

And I think they're missing something crucial.

Ray Ozzie has done some great work making what amounts to a clipboard hack for the Web. He is quite right: the Web needs a clipboard, particularly as the Web becomes an application platform. Ozzie's work is slick, elegant, and very useful.

But it's also too darned complicated, and that is no knock on Ray Ozzie. It took a lot of ingenuity to get to this point with the limited resources of HTML 4.

Really, drag and drop should be a first-class citizen of HTML. Why not? Forms and buttons are, and dragging and dropping is as important a GUI concept as forms and buttons. Support baked straight into HTML would make it simple on developers: your browser implements it once, and the rest of us down the Web food-chain never have to worry about it any more than we worry about our <b> tags.

How simple can it be? Very simple, I think. All we really need is a <drag> tag and a <drop> tag. Each <drag> tag's attributes can specify which MIME formats it will export to (e.g. application/pdf, text/plain) and a callback function for fetching the data in each format. A <drop> tag's event handler can choose an available format when a drop is made and update its appearance.

And the Web will have, not only a clipboard, but also a desktop.

Friday, April 20, 2007

In Ruby, Not All Objects Are Created Equally

Try this:

irb(main):001:0> class << :a
irb(main):002:1> def foo; "bar"; end
irb(main):003:1> end
TypeError: no virtual class for Symbol
      from (irb):1

But.. but.. but.. I thought adding instance-specific methods was a hallmark of Ruby metaprogramming?

I ran into this problem while trying to create "smart" symbols, symbols that I could hang methods off of to, say, change their textual representation when displayed to the user.

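What other objects lack virtual classes? Well, Fixnum for one. (Strings, despite having a literal syntax, do accept instance-specific methods, because each string literal evaluates to a fresh object.) I'm guessing that this is true of any objects whose literals always denote the same shared object, as Symbols and Fixnums do.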
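One way to probe this yourself is to ask an object for its singleton class and see whether Ruby refuses. This is only a sketch: it assumes a Ruby recent enough to have Object#singleton_class (1.9.2 or later), and the helper name is my own invention. On the Rubies I'm aware of, symbols and small integers refuse, while a freshly built string accepts.

```ruby
# Hypothetical helper: true if Ruby will hand out a singleton
# ("virtual") class for the object, false if it raises TypeError.
def accepts_singleton_methods?(object)
  object.singleton_class
  true
rescue TypeError
  false
end

accepts_singleton_methods?(:a)               # symbols refuse
accepts_singleton_methods?(7)                # so do small integers
accepts_singleton_methods?(String.new("a"))  # a fresh string accepts
```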

At first, you might think this is disappointing, an imperfection in the otherwise glittering consistency of Ruby. But consider the alternative.

For one, Ruby would have to keep an object in memory representing the number 7 -- a virtual class just for every token of 7. It would have to create such an object when you tried to access its virtual class. If you had def voyages_of_sinbad; 4 + 3; end, that would have to return not just the result of computation, but then look up the possible virtual class. Whether or not a virtual class for 7 had been created, it would have to at least check. So this look up would be a requirement of any numerical computation.

For another, what distinguishes literals is that they can be accessed directly without clients being passed a reference. In essence, literals are the objects we all share. Literals, as true first-class objects, pulverize encapsulation! (And I ask you to imagine for the moment an insano-pattern of using Fixnums like 7.secret_info to pass messages between objects!)

So be thankful that not all objects are created equally in the Ruby world.

It would be an interesting thought experiment to try to imagine a sensible language without literals.

Sunday, April 15, 2007

Reporting on Rails

Another (far more famous) Ara has written a fantastic, much-needed plug-in for Rails called MOle. MOle lets you gather reporting information on your Rails application in real-time. Reporting is really a cross-cutting concern, common to many applications. It's great to see tools becoming available for Rails in this direction. Coincidentally, we've been rolling our own framework at Postful for doing this sort of thing. What did we call it? Snitch. I'm not kidding. And Snitch is the name of the console application for MOle. We were planning to release our Snitch as a plug-in, but now seeing the fantastic work done on MOle, it makes more sense for us to contribute rather than compete.

What are the advantages of MOle over externalizing your reporting with something like, say, Google Analytics? This is only going to be a fragment of the true list of advantages, since I've just begun dipping into the code, but at first glance what is obvious is this:

Gauging application performance by wrapping controllers

Capturing application-specific business data

Monitoring this data in real-time, rather than the typical reporting lag of external analytics tools

Monitoring particular code paths, like exceptions thrown, rather than just raw request headers

By themselves, these things make MOle a necessary complement to externalized analytics packages. I'm excited to see where development is going to take this project.

Saturday, April 14, 2007

How to Make Your ActionController Go Up In A Bang

Go on, try it. I dare you. I double-dare you:

class UserController < ActionController::Base  
  def send
  end

  def request
  end

  def render
  end
end

These are all natural enough names for controller methods, but your controller will mysteriously stop working, failing in strange, strange ways, if you use any of them.

Why? Because you are overriding methods crucial to the internals of the controller: in the first case, Ruby's Object#send; in the next two, ActionController methods.

How to save yourself time: I really hate silent failure or mysterious failure. But you can make the silent failures clear and noisy (oxymoronic, I know!). Try something like this:

module ActionController
  class Base
    # Names that controllers must not redefine as actions.
    RESERVED_ACTIONS = [:send, :request, :response, :render] # ...

    # Ruby calls this hook whenever an instance method is defined on
    # Base or, through inheritance, on any of its subclasses.
    def self.method_added(name)
      return if equal?(Base) # let the framework define its own methods
      raise "Cannot override action '#{name}'" if RESERVED_ACTIONS.include?(name)
    end
  end
end
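If you want to see the method_added hook fire without loading Rails, here is a self-contained sketch. GuardedBase, its RESERVED list, and define_guarded_action are all hypothetical names standing in for ActionController::Base and a controller subclass; only Module#method_added itself is real Ruby.

```ruby
# Stand-in for a framework base class (not part of Rails).
class GuardedBase
  RESERVED = [:send, :request, :render].freeze

  # Called by Ruby each time an instance method is defined on this
  # class or on a subclass, since subclasses inherit the hook.
  def self.method_added(name)
    return if equal?(GuardedBase)
    raise ArgumentError, "Cannot override reserved method '#{name}'" if RESERVED.include?(name)
  end
end

# Defines an empty method called `name` on a throwaway subclass and
# reports :ok, or the error message if the guard objected.
def define_guarded_action(name)
  subclass = Class.new(GuardedBase)
  subclass.class_eval { define_method(name) { } }
  :ok
rescue ArgumentError => error
  error.message
end

define_guarded_action(:index)   # harmless name, goes through
define_guarded_action(:render)  # reserved name, fails loudly
```

Note that the hook runs after the method has already been added, so this turns a silent failure into a noisy one at class-definition time rather than preventing the definition outright.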

Thursday, April 12, 2007

Ruby Scoping Shocker

Ruby head-scratching magic:

irb(main):001:0> if false
irb(main):002:1> x = true
irb(main):003:1> end
=> nil
irb(main):004:0> x
=> nil

But now:

irb(main):005:0> y
NameError: undefined local variable or method `y' for main:Object
        from (irb):5
irb(main):006:0>

This here is very unusual, and I'm not sure if it is part of the Ruby specification or just the implementation. From what I gather, the declaration of 'x' is a side effect of the parsing of the conditional block. What other side effects of unexecuted code are there that we should know about?

The oddity does not carry over to the right side of assignment:

irb(main):001:0> if false
irb(main):002:1> x = y
irb(main):003:1> end
=> nil
irb(main):004:0> x
=> nil
irb(main):005:0> y
NameError: undefined local variable or method `y' for main:Object
        from (irb):5
irb(main):006:0>

Why is the specification/implementation question important?

Because if we could rely on this behavior we could write more compact code, writing fewer variable declarations. For example:

if some_condition
  x = true
end
#..
do_something if x

rather than:

x = nil
if some_condition
  x = true
end
#..
do_something if x

if we can count on this in future versions and in all implementations of Ruby.
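For what it's worth, the behavior can be checked outside irb with defined?. As far as I can tell from the Ruby syntax documentation, a local variable comes into existence when the parser encounters the assignment, not when the assignment runs, which matches the transcripts above. The method names and the variable name unassigned_name below are made up for the sketch.

```ruby
# The assignment is never executed, but the parser has seen it,
# so x exists as a local variable whose value is nil.
def parsed_but_never_assigned
  if false
    x = true
  end
  [defined?(x), x]
end

# No assignment appears anywhere in this method, so the name is
# neither a local variable nor a method, and defined? returns nil.
def never_mentioned
  defined?(unassigned_name)
end

parsed_but_never_assigned  # => ["local-variable", nil]
never_mentioned            # => nil
```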

Wednesday, April 4, 2007

Why I Hate Test Fixtures (And What I Am Prepared To Do About It)

I have a few complaints against YAML test fixtures:

They break. Changing your database schema will often leave existing test fixtures invalid. I am extremely lazy, and it becomes a maintenance burden.
I hate switching between files in my editor. Already, I have to switch between my application code and my test code. Switching to a test fixture for something as small as a sample object gives me three files I have to switch between.
The test fixtures centralize concerns from all different test suites. For example, I have several users in my test fixture who are used I-don't-even-know-where among my test classes. It seems bizarre to think that while my tests should remain independent and modular, my test data should be entirely jumbled together.
I can't remember what is what within my test fixtures. I had users named Bill, Peggy, Joe, Quentin. Who are these people? I switched to the slightly more manageable valid_user, invalid_user, valid_user_sending_message_to_invalid_user, invalid_user_receiving_message_to_valid_user. Even if I try to reuse the same fixture objects as often as possible, bloat eventually happens.
If objects have dependencies, they get tedious. Let us say your objects have dependencies on other objects. Your order requires a user to be valid. Your user requires an account. If I now have to make dummy fixtures for all these, I've got to switch between five or six files now.

In fairness, I have these positive things to say about externalized test fixtures:

Sometimes they really need to be externalized, as in when your test fixtures are entire documents, like PDFs or spreadsheets.
The YAML fixtures are a vast, vast improvement over using property files, XML, or writing a lot of object instantiation code within your test itself (all of which I used to do back in Java).

So what do I want instead? I want to create my fixtures within my tests in no more than a line or two with valid defaults:

def test_signup
  user =  User.sample
  post :signup, { :user => user.attributes}
  assert_equal 1, User.count
end

I want to create required dependencies automatically:

def test_signup_creates_account   
  user =  User.sample
  post :signup, { :user => user.attributes}
  assert_equal 1, Account.count
end

I want attributes to be overridable:

def test_signup_requires_email
  user =  User.sample(:email => nil)
  post :signup, { :user => user.attributes}
  assert_equal 0, User.count
end

I want even nested attributes to be overridable:

def test_signup_sets_account_balance
  user =  User.sample(:account => { :balance => 5.0 })
  post :signup, { :user => user.attributes}
  assert_equal 5.0,  Account.find(1).balance
end

And I want to make minimal changes (none if possible) to my model classes to support this stuff.

Please take note that this kind of testing design pattern has cropped up both in integration testing and in RSpec.

ActiveRecord classes can introspect their associations and, with a plug-in, their validations. For 90% of cases, this should be all we need to create the object graph.
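To give a feel for the wish list above, here is a rough sketch of a sample method in plain Ruby, with no ActiveRecord involved. Everything in it is hypothetical: the Sampleable module, the sample_defaults declaration, and the toy User and Account classes are stand-ins, and a real version would presumably derive the defaults by introspecting associations and validations rather than having them declared by hand.

```ruby
# Hypothetical mixin; none of these names come from Rails.
module Sampleable
  # Declare (or read back) the valid default attributes for samples.
  def sample_defaults(defaults = nil)
    @sample_defaults = defaults if defaults
    @sample_defaults || {}
  end

  # Build an instance from the defaults with overrides applied on top.
  # When a default is itself a Sampleable class, the override hash is
  # passed down so nested attributes can be overridden too.
  def sample(overrides = {})
    attributes = {}
    sample_defaults.each do |name, default|
      attributes[name] =
        if default.is_a?(Sampleable)
          default.sample(overrides[name] || {})
        elsif overrides.key?(name)
          overrides[name]
        else
          default
        end
    end
    new(attributes)
  end
end

# Toy models standing in for ActiveRecord classes.
class Account
  extend Sampleable
  attr_reader :balance
  sample_defaults :balance => 0.0

  def initialize(attributes)
    @balance = attributes[:balance]
  end
end

class User
  extend Sampleable
  attr_reader :email, :account
  sample_defaults :email => "user@example.com", :account => Account

  def initialize(attributes)
    @email   = attributes[:email]
    @account = attributes[:account]
  end
end

User.sample                                   # valid defaults, account included
User.sample(:email => nil)                    # one attribute overridden
User.sample(:account => { :balance => 5.0 })  # nested attribute overridden
```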
