Mother: Design Decisions

Syntax

Syntactically, Mother’s goals are:

1. Conciseness: Allow code to be expressed with as little extra verbiage as possible. This should greatly improve programmer productivity, in both writing code and reading it. Code is read much more often than it is written, so Eclipse’s auto-completion and automatic code generation don’t really solve the problem. The goal here is readability rather than the smallest possible character count. The ideal is “executable pseudocode” rather than the cryptic tokens of Perl, APL or Ursala.

2. Java Then Groovy Compatibility: The syntax should be as close to Java’s as possible, to make it easy for people moving from Java who don’t have any experience with dynamic languages. After that, it should be as similar to Groovy’s as possible, which in turn is very similar to Ruby’s.

3. Flexibility Then Performance: The simplest syntax should provide the most flexibility, even if it’s at the cost of performance, since most code isn’t performance critical, and you should write the code and profile it before deciding what to optimize. When we need a special syntax for a faster way of expressing something, it gets second choice after the flexible version. A great example is method dispatch. The “.” operator is simple and familiar from other languages, so it’s used for flexible method dispatch, which looks up argument types at runtime. For the fast, inflexible and somewhat error-prone Java dispatch, we use a different operator.

4. Easy To Parse: We intentionally have a simple parser, to make it easy for people to extend the syntax. In practice, this only constrains us a little.

These goals are pretty similar to Groovy’s, so in practice we end up looking a lot like Groovy syntactically.

Syntactic Ambiguity: Declarations vs. Setters

What does x = 23 mean if we haven’t seen x yet? It could declare x to be a local variable, or it could be a setter being called on this, i.e. equivalent to this.setX(23). Python always requires an explicit self (Python’s name for this) for getting or setting of any instance variable, but that’s not concise and not Java. Ruby requires you to write self.x = 23 to invoke the setter. The Flanagan and Matz book The Ruby Programming Language says “This is a not-uncommon mistake for novices who are just learning about setter methods and assignment in Ruby.”

Plus, there’s the age-old problem that if you mistype a variable name when assigning to it, you create a new variable, which can cause mysterious problems that are hard to track down. There’s also the problem that, if you want block-scoping of local variables, it’s not clear whether x = 23 creates a new x that shadows one in an outer block, or assigns to the one in the outer block. Ruby and Python sidestep that by making all local variables method-scoped.

We’re in the same boat as Groovy here, so we do the same thing: introducing a local variable requires a declaration, with either a type name or def.
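To make the ambiguity concrete, here’s a small sketch in Python, which resolves it the way described above: a bare assignment always creates a local, and the setter only runs through an explicit self. The Point class and its setter_calls counter are made up for illustration; they’re not part of Mother.

```python
class Point:
    """Hypothetical class with a property setter, used only for illustration."""

    def __init__(self):
        self._x = 0
        self.setter_calls = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self.setter_calls += 1
        self._x = value

    def assign_bare(self):
        x = 23  # binds a new local named x; the setter is not called
        return x

    def assign_via_self(self):
        self.x = 23  # explicit receiver, so the property setter runs


p = Point()
p.assign_bare()
print(p.x, p.setter_calls)  # 0 0
p.assign_via_self()
print(p.x, p.setter_calls)  # 23 1
```

Requiring def or a type name for declarations is what lets a bare x = 23 in Mother mean the setter instead.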

Syntactic Ambiguity: Adjacency

In Java, having two expressions next to each other, without an operator in between, can only occur in two places:

Method/Field/Variable declaration: Modifiers* TypeName Identifier
Type cast: (TypeName) Expression

In Java there’s no ambiguity, since TypeName can’t have parentheses, and they’re required for the cast. But in Mother, optional parens mean a method name can be followed by the expression for its first argument, without an operator in between. In particular, we want to support both of these:

String x
println x

This is the same problem the Groovy designers faced, and we use the same solution: if it starts with a capital letter, it’s a type name and therefore a type declaration. It’s also a type name if it starts with a modifier, or one of the primitive types, or an annotation. Ruby’s classes and modules must start with a capital letter. Python doesn’t have declarations.
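The rule is simple enough to sketch in a few lines of Python. This is only my illustration of the rule as stated, not the actual Mother parser: the function name, the token sets and the decision to look only at the first token are all assumptions, and def is included because the previous section says it can introduce a declaration.

```python
# Illustrative, incomplete token sets (assumptions, not Mother's real tables).
MODIFIERS = {"public", "private", "protected", "static", "final", "abstract"}
PRIMITIVES = {"boolean", "byte", "char", "short", "int", "long", "float", "double"}


def classify_adjacent(first_token: str) -> str:
    """Classify `first_token something` as a declaration or a method call."""
    if first_token == "def" or first_token.startswith("@"):
        return "declaration"  # def keyword or an annotation
    if first_token in MODIFIERS or first_token in PRIMITIVES:
        return "declaration"
    if first_token[:1].isupper():
        return "declaration"  # capitalized, so treated as a type name
    return "method call"      # e.g. println x


print(classify_adjacent("String"))   # declaration
print(classify_adjacent("println"))  # method call
```

The cost of the rule is that a lowercase class name, legal in Java, would be read as a method call.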

We also want to support these:

(Shape) x
(isyummy ? mymethod : yourmethod) x

Allowing Statements Inside Expressions & Scoping Of Such Statements

Groovy doesn’t allow them. In fact, it kind of melts down: a = (def foo = 23; foo * foo) sets a to an array of two curried closures.

Python doesn’t allow them, and neither does JavaScript. The only languages I know that do are Ruby, Lisp and Haskell. According to some random people on the Ruby lang IRC channel, as well as a quick reading of active_record/base.rb, Ruby “best practice” is to avoid using statements inside expressions.

It turns out, XQuery allows them, and so does SQL, but I don’t know any examples of large programs written in XQuery or SQL. [Can XQuery use new lines as statement terminators?]

And at ITA I remember it making things harder to read in some ways: if the start of some function body is (+ …, then I know that it’s adding some number of things, but I don’t know what or why. + is so general that it doesn’t really help me understand what the function does. I know I’m combining pieces, but not what the pieces are, so I’m in the same situation as the subjects of that psychology experiment where people were given a set of instructions to memorize: because they didn’t know what the instructions referred to, they did a bad job of memorizing them, but once the title “How to do Laundry” was added it all made sense and they did much better. Also, I don’t think people have a large mental stack, so remembering “we’re preparing to call X by preparing to call Y by preparing to call …” gets confusing fast.

OTOH, I think there are times where it can help readability, if only a little. In Common Lisp, macros often expand to something like that, and if you have a short computation that’s not worth turning into a macro, you could put it inline. For example, if you want to transform the result of some function, and the function value would have to appear multiple times, so you want to stash it in a variable: foo(def x = yum(); (x - x_avg)*(x - x_avg) ). Many people would say you could put that computation on the line before, but I think that’s a style choice.

As another example, my “dp” macro at ITA evaluates its arg, prints it then returns it. It could be handy to do something like that by hand inside an expression, if only for debugging. I’m sure there are non-debugging use cases too, although I can’t think of any right now. Maybe reading through On Lisp would help.

And I wonder if conventions wouldn’t change. Common Ruby practice avoids it, but I’ll bet few Ruby programmers come from a Scheme/Lisp background; most come from other languages that don’t allow it, so they’re not used to it. And conventions can change over the course of decades: look at RAII in C++, or fluent interfaces in Java.

Another problem is that it introduces a small incompatibility with Groovy. In Groovy, if you have an expression with a closer & you haven’t encountered the closer yet, then newlines don’t end any statement. That’s true within parens, and also:

 a = (b
      + c)

 a = [1,2,3,
      4,5,6]

 println (x,
          y,
          z
  )

We might also want to support lists with an optional , after the last entry, so we don’t want to use the , as the signal that the statement continues.
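Python happens to follow the same open-bracket rule, and it also allows the optional trailing comma, so it makes a handy runnable illustration of how the two fit together. This is Python’s behavior, shown as an analogy, not a claim about how Groovy or Mother implement it.

```python
b, c = 1, 2

# Inside an open paren the newline does not end the statement,
# so the operator can sit at the start of the second line.
a = (b
     + c)

# The trailing comma is optional; the closing bracket, not the comma,
# is what tells the parser the list is finished.
values = [1, 2, 3,
          4, 5, 6,
          ]

print(a, len(values))  # 3 6
```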

There’s an argument that we shouldn’t accept the first one above, since it boils down to “sometimes you can put the operator on the next line, and sometimes you can’t.” If the user always has to put it on the end of the first line, that’s a simpler rule and they’ll always get an error (or at least strange behavior) when violating it.

An argument against is that you get the worst possible response — wrong behavior with no error message — when you accidentally leave off the operator in:

 (b op
 c)

It parses it as two statements, drops the result of “b” and just returns “c”. Perhaps the worst example of it is when the “op” is a comma for method arguments or list literals. But that’s a problem with Groovy/Ruby code outside of parens, and it doesn’t seem to cause a lot of problems. Plus, I don’t think it’s a common enough problem to drive a feature of a language.

People are surprised when you put multiple statements inside an expression, and it can lead to one of those WTF?? style puzzles, of “Can you believe it parses it that way??? That’s crazy!!”

You know what? I don’t think many Ruby people are confused by that, or even aware of it. GinA (Groovy in Action) and the Groovy wiki page on optional semicolons don’t mention the first example above, so it’s probably safe to say that most programmers aren’t aware of that subtlety. The Ruby book says “You can safely insert a newline without fear of prematurely terminating your statement after an operator or after a period or comma in a method invocation, array literal, or hash literal.” Perhaps that’s the convention we should follow.

The cost to implement them is small: just a few lines of code in the way () is handled. I don’t think they overlap with any existing language feature. So if we put them in and no one uses them, there’s no big deal. And it gives us the option to use them later if we want. So, let’s go with them.

Should parentheses create a new scope? That seems a little counterintuitive, but so does having a “temporary” variable still be in scope after the subexpression. In Python, Ruby and JavaScript, all variables have function scope, including those declared in subexpressions. But Groovy doesn’t work that way: it follows Java in having block-level variable scopes, and so does Mother.
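For what the function-scoped behavior looks like in practice, here’s a small Python sketch. Python still has no statements inside expressions, but since 3.8 its assignment expression (:=) is a close analogue, and it shows the “temporary” outliving the subexpression. The foo and compute functions are placeholders I made up.

```python
def foo():
    """Placeholder for some function whose result is needed twice."""
    return 7


def compute():
    c = (x := foo()) * x  # x is introduced inside the subexpression
    return c, x           # and is still visible afterwards: function scope


print(compute())  # (49, 7)
```

With () acting as a scope, the equivalent Mother code would not leave x visible after the closing paren.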

The only other examples of languages that allow variable definition inside expressions are functional languages like Lisp and Haskell. And the scope of the definition is the expression it’s in. So I think the scope of a definition should be the statement block it’s a part of.

So it makes parentheses like the curly braces in Java. In Java I can write:

 a = 23;
 b = 55;
 {
   int x = foo();
   c = x*x;
 }

So in Mother I could do:

 a = 23
 b = 55
 (
    def x = foo()
    c = x*x
 )

I could also do:

 c = ( def x = foo(); x*x )

So I’ve come around to statements inside expressions, and using () for scoping.

A sub-issue: when we encounter a newline, and the expression up to this point is complete, but the next non-whitespace token has a MOS but no SOS, should we continue parsing? Ruby 1.9 does this in the special case where the operator is “.”, see “The Ruby Programming Language 1st Edition”, section 2.1.6.1, p. 33. Maybe we should extend it to all operators? And get rid of the unary + while we’re at it? I think it’s always a no-op, so it’s only useful as a cue to the reader, e.g. if you have a list of number literals where many are negatives, and you want to put a + in front of the others to emphasize they’re not negative. And that would handle ++ and -- properly in all these cases:

 x ++  // ++ is checked for a MOS, not a SOS, so interpreted as postfix
 y

 x     // Because ++ has a SOS, we treat the newline as an implicit ;
 ++ y

 x     // This should probably be an error, but I think it's interpreted as x; ++y .
 ++
 y

Method calls might be interpreted as casts though:

 foo
 ( bar )

You might get some error that stuff after the ) is missing in a type cast. But those are the only tokens having both a MOS and a SOS, at least in Java.
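Here’s a rough Python sketch of that newline rule, mostly to check that it’s as simple as it sounds. Big caveats: I’m reading MOS as “can appear mid-statement” and SOS as “can start a statement” for the purposes of the sketch, the token table is illustrative and incomplete, and the code assumes every line ends on a complete expression, so it doesn’t cover the dangling ++ case above. None of this is the real Mother parser.

```python
# token -> (can appear mid-statement, can start a statement).
# Assumed reading of MOS/SOS; the table is an illustrative subset.
TOKEN_ROLES = {
    ".":  (True, False),
    "*":  (True, False),
    "+":  (True, False),  # assumes unary + has been dropped, as proposed
    "++": (True, True),
    "--": (True, True),
    "(":  (True, True),   # call parens vs. cast or grouping
}


def newline_ends_statement(next_token: str) -> bool:
    """Decide whether a newline before next_token acts as an implicit ';'."""
    # Anything not in the table (identifiers, literals) can only start a statement.
    mid, start = TOKEN_ROLES.get(next_token, (False, True))
    return not (mid and not start)


def split_statements(lines):
    """Group lines of tokens into statements, assuming each line is complete."""
    statements = []
    for tokens in lines:
        if not tokens:
            continue  # skip blank lines
        if statements and not newline_ends_statement(tokens[0]):
            statements[-1].extend(tokens)  # continuation of previous statement
        else:
            statements.append(list(tokens))
    return statements


print(split_statements([["foo"], [".", "bar"], ["+", "1"]]))
print(split_statements([["x"], ["++", "y"]]))
```

The first call yields a single statement and the second yields two, matching the x / ++ y case above.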

Random Notes

– In order to have cleaner code than this:

 int d = (dBin -J dScale) *J maxd /J dScale +J (maxd /J dScale /J 2);
 line.c = d -J line.a *J (width /J 2) -J line.b *J (height /J 2);

we should have some sort of “embedded Java” syntax, like this:

 [JAVA
   int d = (dBin - dScale) * maxd / dScale + (maxd / dScale / 2);
   line.c = d - line.a * (width / 2) - line.b * (height / 2);
 JAVA]
