Getting Pedantic About Ruby Semantics

I just can’t resist the tempation to be an asshole language lawyer.

Yehuda thinks that it’s important to keep a semantic model of Ruby behavior in mind, and not get hung up on implementation details. He’s right. While it’s important to understand the implementation, with a half a dozen or more different implementations of Ruby out there, it’s more and more important that we understand the semantics of Ruby as a language separately from Matz’ Ruby Interpreter (MRI).

The best resource we have for determining the semantics of Ruby-as-a-language is the draft ISO standard. The draft standard is not without it’s problems, not the least of which is that wider Ruby community hasn’t exactly been encouraged to participate in the standards process. But it’s the closest thing we have to a definition of Ruby’s semantics divorced from any specific implementation. I looked up the parts of the standard which pertain to blocks, in order to see if Yehuda’s mental model of Ruby agrees with that of the creator of Ruby.

Here’s the part defining blocks:

11.2.2 Blocks

[...]

Semantics
A block is a sequence of statements or expressions passed to a method invocation. A block can be called either by a yield-expression (see §11.2.4) or by invoking the method call on an instance of the class Proc which is created by an invocation of the method Proc.new to which the block is passed (see §15.2.17.3.3).

And here’s the part which specifies the semantics of &block arguments:

A block parameter: This parameter is represented by block-parameter-name. The parameter is bound to the block passed to the method invocation.

[...]

13.3.3 Method invocation

The way in which a list of arguments is created are described in §11.2.

Given the receiver R, the method name M, and the list of arguments A, take the following steps:

a) If the method is invoked with a block, let B be the block. Otherwise, let B be block-not-given.

[...]

Push B onto [[block]]. [This is a notational convention indicating an attribute of an execution context -Ed].

[...]
iv) If the block-parameter of L occurs, let D be the top of [[block]] .

I) If D is block-not-given, let V be nil.

II) Otherwise, invoke the method new on the class Proc with an empty list of arguments and D as the block. Let V be the resulting value of the method invocation.

III) Let n be the block-parameter-name of block-parameter.

IV) Create a variable binding with name n and value V in Sb .

As I read the standard, Yehuda is incorrect in his interpretation of Ruby block semantics. Blocks are non-object entities which form one of the attributes of an execution context. If the method provides a name for the block in the form of an &block parameter, then and only then is an object created, by explicitly calling Proc.new.

Does it matter that this model differs from Yehuda’s? In at least one respect it does. Yehuda points out that in MRI, the object_id of a given block is always the same even. By my read of the standard, this is an optimization, and not something to rely on as a part of Ruby’s semantics.

But that’s not the end of the story. The stated intent of the Ruby ISO standard is to formally describe the the behavior of MRI 1.8.7. So does it, in this case? Let’s see:

class < < Proc
  alias old_new new
  def new(&block)
    puts "in Proc.new"
    old_new(&block)
  end
end

def foo(&block)
  puts "In foo"
  bar(&block)
end

def bar(&block)
  puts "In bar"
  block.call
end

foo do
  puts "In block"
end

# >> In foo
# >> In bar
# >> In block

It certainly doesn’t look like Proc.new is being called here. Now, whether this is because MRI is not flexible enough to enable an effective redefinition of Proc.new, or whether Proc.new is never called, is unclear without the assistance of a C-level debugger. If the latter, then Yehuda’s mental model is closer to the real behavior of MRI than the draft standard.

And when standards no more reflect reality than any given hacker’s imagination, the question becomes: what semantics do you think the language should have, moving forward? As a statement of what he thinks the semantics ought be, I personally think Yehuda’s mental model is pretty reasonable, and consistent with Ruby’s “everything is an object” philosophy. But stating that it is the canonical semantics of the language upon which Ruby hackers should base their assumptions – that seems a bit presumptuous to me.