- But I also knew, and forgot, Hoare’s dictum that premature optimization is the root of all evil in programming.
- — Donald Knuth
Something bizarre happened on the Groovy-dev mailing list the other day. Alex Tkachman made what I thought was a simple suggestion: since some rarely used features of the language make it slow, we should have a keyword or annotation which says to the compiler, “Hey there, compiler, I’m not using those features in this code, so please give me the faster code.” I had expected the others to debate the pros and cons of the idea, but instead, they couldn’t quite understand it.
They kept saying you’d need to check that the features weren’t actually used at runtime. I suggested the checks could work like assertions: have them on while testing, but off when performance mattered. But that didn’t seem to sink in. At one point, they even said that turning off these features would “betray” the core values of the language.
For most proposals they ask for or provide use cases and debate the pros and cons. But not this time. They seemed to think that any concession to performance is an ugly hack, and that a sufficiently smart compiler or runtime system could make the code just as fast, even though it wouldn’t have nearly enough information. They even said that providing a way to turn off unused features would reduce their incentive to make Groovy faster. Much the way driving a car means kids these days never learn how to ride a horse. Or something.
These are very smart, very reasonable people. They’re very good at finding the pros and cons of any proposal, and have wisely rejected some knucklehead suggestions of mine. And it’s worked spectacularly well: Groovy’s syntax and semantics are very close to my ideal language. So why don’t they want to make even the smallest, simplest concessions to performance?
Bjarne Stroustrup, inventor of C++, recently talked about OOP (Object Oriented Programming) before C++:
My firm impression (at the time) was that all sensible people “knew” that OOP didn’t work in the real world: It was too slow (by more than an order of magnitude), far too difficult for programmers to use, didn’t apply to real-world problems, and couldn’t interact with all the rest of the code needed in a system.
Twenty-five years later, I think we’re in the same situation with dynamic languages. At the moment, languages are divided into two very different camps. In one corner, there are those that emphasize performance. They’re statically typed. They’re compiled, and you have to call the compiler explicitly. They try to make the efficient thing easy and the inefficient thing hard. And the language designers think very hard about the performance implications of every feature they add. C/C++, Java, C#, and OCaml are all in this group.
Then there are the dynamic languages, like Perl, Python, Groovy and Ruby. They try to provide the maximum flexibility, because hey, they’re only for something quick and dirty, or something that will spawn other processes, right? Ok, ok, you might want to do some real work with them, but the libraries that do serious computation are all written in efficient languages, so you’ll spend all your time in compiled code, right?
Well, it doesn’t always work like that. Matlab started as a wrapper for LINPACK, but now compiles to byte code on a virtual machine. And if you’ve ever tried to take a bunch of XML and turn it into SQL, you know Python can be slow, even with lxml to do the parsing. Each tag is a small amount of text, so the inherent amount of work per tag is very small. If you handle each tag explicitly, the Python overhead will dominate, as the sketch below illustrates.
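To make that concrete, here’s a minimal sketch of the pattern (the file, tag, and column names are made up). lxml does the parsing in C, but the little bit of work you do per tag runs as interpreted Python, and with millions of tiny tags that’s what dominates:

```
# Sketch of the XML-to-SQL pattern; "feed.xml", "item", "id" and "name"
# are hypothetical. lxml parses in C, but the per-tag work below is
# interpreted Python, repeated once per tag.
from lxml import etree

rows = []
# iterparse streams the document instead of building the whole tree.
for _event, elem in etree.iterparse("feed.xml", events=("end",), tag="item"):
    # Each access is a tiny amount of work, repeated per tag -- exactly
    # where interpreter overhead adds up.
    rows.append((elem.get("id"), elem.findtext("name")))
    elem.clear()  # free memory for already-processed elements

sql = "INSERT INTO items (id, name) VALUES (?, ?)"
# rows can now be fed to executemany() on any DB-API connection.
```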
Which is a shame, because with a little thought, you could get all the functionality you want and only pay the runtime cost when you’re actually using the feature. Why can’t I use the slow features outside my inner loop, but turn them off inside it for a speedup? In other words, why can’t I take my dynamically typed program, profile it to find the bottlenecks, then sprinkle around some type declarations to get a speedup? Why can’t those passages be as fast as statically typed languages?
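The profiling half of that workflow already works today, using nothing but Python’s standard library. A minimal sketch (it assumes your program defines a main() entry point):

```
# Profile first, then optimize only the hot spots.
import cProfile
import pstats

cProfile.run("main()", "profile.out")   # assumes main() is defined here
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)  # ten most expensive calls
```

It’s the second half, sprinkling in declarations where the profile says they matter, that most dynamic languages won’t let you do.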
There are a couple of languages that do this. Common Lisp is one, and the only one I know of that approaches the speed of C. The company I work for uses it for a search engine for air fares, and it’s mighty fast, while also being a lot more productive than, say, C++. ActionScript, the language of Adobe Flash, does it, as do Visual Basic, VBA, and VB.NET. (Thanks to Alex Staubo and Steve Cooper for pointing those out.)
Failing that, how about a fast dynamic language? Features are often developed in slow languages first, then moved to fast ones, as the C++ example shows. And there are three projects I’ve found that could be the fast successors to today’s slow languages:
- Ng (source repository) is a dynamic language for the Java Virtual Machine by John Wilson. As he has said:
I’m implementing this thing back to front. Firstly I’m building a high performance runtime system for a fully dynamic language on the JVM. I’m then building an AST implementation which can be directly executed (like JRuby) or serialised to XML or to bytecodes. Finally I intend to design the language and compile it to my AST.
The goal is to generally be no more than half the speed of Java, but with full dynamic behaviour. Ng has some impressive benchmarks, but it’s in its very early stages. John Wilson helped create Groovy, so he has a track record of his ideas coming to fruition, but that’s no guarantee.
- Cython source files have Python syntax and semantics, but compile to C/C++ rather than Python byte code. Python can call them and vice versa, so you can use all your favourite Python libraries and other Python code. They’ve also extended Python syntax with static typing, and with the ability to create pure C/C++ functions (which can’t be called from Python byte code). It’s designed to more or less reach the efficiency of C when things are typed, and it seems to be getting a lot of interest. (A sketch of the typed style follows this list.)
- PyPy plans to eventually incorporate a Just-In-Time compiler for Python, the way Psyco did. But the project is very ambitious and moving very slowly. We’ll see what comes of it, and when.
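To give a flavour of the Cython item above, here’s a toy sketch of a hot loop in a .pyx file (my own example, not from their docs). The body is plain Python; the two cdef declarations are the optional static typing that lets it compile down to a raw C loop:

```
def sum_of_squares(n):
    cdef int i             # typed loop counter: becomes a plain C int
    cdef double acc = 0.0  # typed accumulator: no Python float objects
    for i in range(n):
        acc += i * i       # compiles to a C loop, no interpreter dispatch
    return acc             # converted back to a Python object on return
```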
Premature optimization is the root of all evil. Why am I forced to optimize when I’m choosing my language, before writing a single line of code? I wonder who’s going to be the next Bjarne Stroustrup, to become famous for inventing the first widely used fast dynamic language?
In the meantime, it’s a shame Groovy won’t let programmers specify which parts of their program aren’t using advanced features. Groovy is so close…
UPDATE: Mike Pall, the developer of LuaJIT (among other things), had a great comment on the reddit page.
Have a look at Dylan (http://www.opendylan.org/).
In the medium-term future, when PyPy comes out, we will hopefully get a much faster Python.
There are a lot of false assumptions here, like that the people implementing statically-typed languages are doing it for the performance (Haskell can be very, very fast, but that’s not why it’s statically typed); or that people implementing dynamic languages are doing it for the sake of quick-and-dirty solutions (Smalltalk, anyone?). For that matter, it’s not even true that you *have* to be more static in order to win performance gains; there is convincing evidence that a sufficiently clever JIT can make dynamic code as fast or faster than any compiled static language because unlike the compiler, it knows *exactly* which paths need optimization the most.
Practically speaking, though, take a look at Scala. Or Strongtalk (http://www.strongtalk.org/).
But as I understand it, Alex’s suggestion doesn’t turn anything off – it adds a feature, namely optional static typing. Whether that’s a good thing is a separate discussion (Lispers think so), but it’s certainly not a simplification…
Also, you’ve left Lua off your list (in particular LuaJIT) – any particular reason for that?
gwenhwyfaer, you’re right: it’s adding a feature. More or less, I’m saying that adding optional static typing, when the motivation/use case is efficiency, is the same as turning off dynamic typing.
Think of it this way. In Python, when you call foo.bar(), it does a hash lookup of the string “bar” in foo’s hash table, doesn’t find it, then does a hash lookup in foo’s class’s hash table, and finds it there. But there’s a common case where foo’s (and foo’s class’s) entries for functions aren’t changed after they’re created. So why not tell the compiler that you don’t intend to change those, so it can just do some pointer indirection instead of a hash table lookup?
That’s what I meant by “turning off dynamic typing.”
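In today’s plain Python, the closest you can get is to hoist that lookup by hand: bind the method once, outside the loop, so the hash lookups happen once instead of per call. A toy illustration (the class is made up), and a manual version of what the annotation would let the compiler do automatically:

```
class Foo:
    def bar(self, x):
        return x * 2

foo = Foo()
bar = foo.bar  # one lookup: instance dict miss, then class dict hit
results = [bar(i) for i in range(1000)]  # no per-call attribute lookup
```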
“Premature optimization is the root of all evil. Why am I forced to optimize when I’m choosing my language, before writing a single line of code?” Beautifully put! I think this post is excellent and hope that it is widely read.
By the way, have you heard of Shed Skin? It is an experimental Python-to-C++ compiler that converts pure, but implicitly statically typed, Python programs into optimized C++ code. In other words, you can write modules that use Python in a not-so-dynamic fashion, and Shed Skin will compile them to C++ code. Some great speed-ups result, and (unlike Cython) the syntax is still pure Python. It’s still very experimental, but promising:
http://shed-skin.blogspot.com/
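To give an idea, here’s a toy sketch (my own, not from their docs) of the implicitly statically typed style it wants: plain Python, but every variable keeps a single, inferable type, so the compiler can map it straight onto concrete C++ types:

```
def word_lengths(words):
    # 'words' is always a list of str and 'lengths' always a list of int,
    # so a type inferencer needs no annotations to pick C++ containers.
    lengths = []
    for w in words:
        lengths.append(len(w))
    return lengths

print(word_lengths(["shed", "skin"]))
```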
> So why not tell the compiler that you don’t intend to change those, and it can just do some pointer indirection instead of a hash table lookup?
Monomorphic and polymorphic inline caches are nothing new. It’s just that most popular dynamically-typed languages have primitive VM implementations.
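For illustration, here’s a rough sketch of a monomorphic inline cache, written in Python only for clarity (real VMs do this in the interpreter loop or in generated machine code): the call site remembers the last receiver class and the method it resolved to, and falls back to the full lookup only when the class changes:

```
class InlineCache:
    """One cache per call site, for a method of a given name."""
    def __init__(self, name):
        self.name = name
        self.cached_class = None
        self.cached_method = None

    def call(self, receiver, *args):
        cls = type(receiver)
        if cls is not self.cached_class:   # miss: do the full lookup
            self.cached_method = getattr(cls, self.name)
            self.cached_class = cls
        return self.cached_method(receiver, *args)  # hit: direct call

bar_site = InlineCache("bar")  # one cache object per call site
```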
You should have a look at Parrot – http://www.parrotblog.org/ (a VM made explicitly for dynamic languages) and Perl 6 (http://dev.perl.org/perl6/). Yes, it’s been a long time coming, but recent progress is very encouraging. Perl has always been one of the faster dynamic languages, and Perl 6 will have the optional static typing that you said you’re looking for.
It’s still a couple of years off, but I’m pretty optimistic about the performance gains dynamic languages will see once there’s a real VM with a built-in JIT, compiled bytecode, etc., built specifically for dynamic languages.
Take a look at Rich Hickey’s language on the JVM, Clojure.
It’s a Lisp-1, fully dynamic, compiling to JVM bytecode using ASM, with a REPL, and using reflection internally to interop smoothly with JVM libraries.
However, through type hinting, the reflection can essentially be turned off, yielding code that should be relatively close to Java execution performance.
The Clojure website is at clojure.com
But there are better methods for “flattening” method calls when the same method ends up being called every time – methods which outperform anything relying on indirect function calls. (Consider the SOAR project, for example.)
So what he suggested, and what you are supporting, adds complexity – to both the compiler and the language – in order to arrive at a suboptimal solution, when a better dynamic solution to that particular problem could be added at no structural cost.
(Moreover, LuaJIT’s author reports that he’s got the cost of a hash lookup down to the order of a few cycles – less than the cost of a mispredicted branch, which is most indirect branches on modern CPUs. So even the assumption that such lookups represent an unacceptable burden is not necessarily a sound one.)
Why not just use Common Lisp, like your company does?
Where “sufficiently smart compiler” is understood to mean “Our Fairy Godmother will sprinkle Magic Optimized Pixie Dust on our generated code”.
An example from one of those hated static languages, C.
If you pass pointers to variables to a subroutine, you have no guarantee that they don’t point to the same memory locations, especially for arrays. This inhibits some optimizations, since if you change *pointer_one, you have no guarantee you aren’t changing *pointer_two as well. Most compilers have a switch to say that the compiler may assume there are no aliasing problems like this, permitting faster code.
It sounds like a good idea to me, similar to the Common Lisp ability to specify the type of variables going into an array. After all, that takes away part of the Lisp character, but it allows much faster code.
Am I making sense? I hope so. Anyway, this is a rather disappointing posting, in the sense that the search for “purity” leads down some blind alleys.
LAPACK typo.
I have a question regarding enabling static typing in Python, specifically the foo.bar() example. What you are proposing is some kind of annotation in the code which tells the VM not to look in the object’s methods and to go straight to the class’s methods (or vice versa), thereby saving one hash lookup. Is that right?
If so, how difficult do you think it is to implement a (maybe non-community-approved) extension which does this? I am not completely aware of Python’s internals, hence the question. Thank you.
gangadhar, I haven’t looked at the Python VM and runtime at all, but it might not be too hard. What might help is that you can always ignore it if you want to. For example, you could do it only for old-style classes at first. You should ask on the python-dev mailing list; there may be people who are quite interested in this.
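One knob that already exists, by the way (not the VM extension you’re describing, but in the same spirit of declaring your intent): __slots__ tells Python an instance has a fixed set of attributes, so the values live at fixed offsets and the per-instance hash table disappears. A minimal sketch:

```
class Point:
    __slots__ = ("x", "y")  # no per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(p.x)   # value stored at a fixed offset; no instance dict lookup
# p.z = 3    # would raise AttributeError: no dynamic attributes allowed
```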
How about a fast dynamic language? Try Lua. It’s used by many game companies for complex 3D engines. Why? Because it’s damn fast.
Hello Martin, very interesting post! Added to my aggregator 🙂
I’d say that instead of two camps of languages, there are two camps of language users (or designers), those who are interested in performance and those who are not.
I think you should include Io and PHP in your list of “dynamic” languages: PHP because, despite being an ugly language, it managed to have huge success (I don’t use it anymore, but I acknowledge it was a success), and Io because, although it is rather young, it is the real upcomer that may one day challenge Ruby or Python. (Meanwhile, I hope people tell them not to make Ruby’s mistake of falling behind on documentation. It’s annoying to scrape together all the info from other people’s code and from websites, collecting it gradually over time.)
Nico, you’d be wrong. There might be some truth in a suggestion that some programmers were willing to make greater sacrifices of convenience for the sake of performance than others, and some prefer to wring performance out of their favoured platform than to select their platform on the grounds of performance – but in general there is so much prejudice and magical thinking, and so little data, about what actually constitutes “performance” anyway that anyone claiming to have The Answer is merely exposing themselves as a fool.
Kawa Scheme is great. It’s a Scheme implemented in the Java Virtual Machine, so you can call any Java code you want, including, of course, all the Java libraries. And yes, it’s fast – basically as fast as Java in my experience.
Hi, thanks for sharing these experiences with us.
I think the desire the author suggests is a very real one for people using dynamic languages for real problems. It tends to offend language designers, and I can relate, because it can be interpreted as criticism of the language design or feature set. It is somewhat of a paradox, however, in that in well-designed languages with carefully considered feature-sets (like Python), it becomes hard to remove a feature and not break existing code. Most language designers aren’t willing to consider changes that can’t be applied to that language’s standard library in a semantically-neutral fashion.
So, a general solution becomes almost impossible to achieve, but specialized solutions (like Pyrex, Psyco, or Shedskin) are indeed possible and can be helpful.
At a previous job, I wrote a Python-to-C++ translator that did what I called “idiomatic translation” of modules that implemented a very static subset of Python, and used Boost and C++ TR1 features like lambdas, regexps, tuples, and hash_maps to implement their Python counterparts. It was a fun academic exercise, but with the hefty prerequisites (Boost) and limited applicability/usability (no clear semantics, inscrutable error reports, etc.), I never considered releasing it for more general use.
“I wonder who’s going to be the next Bjarne Stroustrup, to become famous for inventing the first widely used fast dynamic language?”
John McCarthy. And a long list of people who developed progressively better compilers for the various Lisps.
Objective-C? It’s very dynamic, but compiled and quite fast.
Try ColdFusion. If it is able to host MySpace, it has the speed and scalability for anything. If you haven’t used ColdFusion in 8-10 years, you will be surprised at the speed.
Does it need to be open source? Well, CF competitor New Atlanta will be open-sourcing their J2EE implementation in June.
Hints.
Programmers should be able to provide hints to the runtime system. Hints would be information that could somehow help increase runtime efficiency.
C has a “register” keyword, whereby one could tell the compiler “please treat this variable well, because speed depends on it.” That’s a hint. Nowadays modern compilers do a better job of allocating registers on their own, and the “register” hint is not useful anymore.
Until compilers/interpreters become intelligent enough (up to inferring the programmer’s intent), I think a pragmatic solution would be to help them by providing hints.
Unless you prefer to provide some Clues?
Why not just use a fast, very dynamic, statically typed language and programming system like Active Oberon?
gwenhwyfaer, it seems that I touched a sensitive spot 🙂
Maybe while they/you/we’re at it, dynamic language implementors could add a keyword that allows the programmer to indicate the common scenario of having a variable which never changes from its initial value.
Such a declaration could be called a “Constant”, or even “const” (to save typing).
And since this is the internet… Yes, I was being sarcastic.
I can add Stalin to the list (a fast Scheme implementation).
I like the way Boo goes: it’s static but looks dynamic. When all variables are of type “duck”, it becomes dynamic.
We’re not overly worried. As long as there are people like yourself who put getting something fast over getting something right, natural selection will slowly win out.
– Paddy.
The Java implementation of Ruby, JRuby, turns off Ruby’s ObjectSpace feature by default, simply because having it on makes for dead-slow performance. If you need it, you can turn it on. And that is cool!
That’s why you can use Free Pascal:
http://www.freepascal.org/
http://www.lazarus.freepascal.org/
Write once, compile everywhere.
Speed + flexibility + human syntax…
😛
Java has come a long way in terms of speed. Each new iteration (1.1, 1.2, 1.3) brought increases in speed. 1.5 is actually almost as fast as C++, according to one whitepaper I read.
http://www.codesplunk.com/?s=c
Martin, in many ways it really sounds like you’re describing Dylan (see my URL). It was explicitly designed to bridge the gap between dynamic and static that you talk about. Sadly, I can count the number of people actively working on it on one hand.
Hey, it’s the interweb so who cares if I’m commenting almost two years late.
-Carl
10 years later, I stumble upon this post (I was googling for the fast(est) dynamically typed language, which you shouldn’t do, btw, in case anyone wonders). Couldn’t help but notice a small typo: Perl is capitalized when referring to the language and lowercase (perl) when referring to the actual interpreter program (due to case-sensitive Unix file systems). I know, potayto potahto, tomayto tomahto. Just an fyi 👍