Sunday, June 1, 2008
Open compilers and open tools
By Sébastien Pierre, Sunday, June 1, 2008 at 16:39 :: Languages
The wide adoption of the object-oriented paradigm led to great side effects for software: many programs became more accessible, more flexible and more reusable. Why? Because OOP encourages the use of abstraction to "untangle" code and provide more opportunities for changing or extracting parts of the program.
Most OO-designed programs tend to be easy to tear apart, change or update, mainly because the OO paradigm encourages this "component-based" approach to software architecture. However, when you take a closer look at the tools used to write these programs, namely compilers (or interpreters) and parsers, you'll realize that they are often more like monolithic black boxes than a flexible set of loosely coupled components.
There are a few examples of add-ons or experimental languages that try to expose the inner workings of their toolchain so that developers can write their own extensions (Meta-Lua or OMeta would be good examples), but the idea definitely hasn't reached the mainstream. Looking back at the C preprocessor or the Lisp macro system, we can see that this need for flexibility existed long ago, and that the solution chosen was to change the outside (the source fed to the tool) rather than the inside (the tool itself).
Maybe the growth of the open-source movement shifted our expectations, but I personally don't appreciate "black box software" too much: I almost expect to have ways to put my hands in the "guts" of the system.
With the growing interest in DSLs (Domain-Specific Languages), and the culture of flexibility that dynamic languages have encouraged, we can better understand the practical benefits of being able to adapt the toolchain that we use for creating software.
In the domain of programming languages, many debates focus on syntax, mainly because for most languages the syntax is cast in stone, forming a monolithic block with the operational semantics and the core library. In essence, though, syntax, semantics and core library are three different things that could be separated.
We have a good example of this with JavaScript: it implements semantics that are very close to those of many dynamic (and non-dynamic) languages, and its traits (mostly prototype-based inheritance and closures/anonymous functions) make it so versatile that it's quite easy to mimic more complex semantics (class-based inheritance, multiple dispatch, run-time type checking, etc.).
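To make the "mimic class-based inheritance" claim a bit more concrete, here is a minimal sketch. It is written in Python rather than JavaScript, purely as a neutral stand-in: plain dictionaries play the role of prototype-linked objects, and every helper name (make_object, lookup, send, define_class) is hypothetical, invented for this illustration.

```python
def make_object(proto=None, **slots):
    """Create a prototype-style object: a dict of slots plus a link to its prototype."""
    obj = dict(slots)
    obj["__proto__"] = proto  # assumption: we store the link under this key
    return obj


def lookup(obj, name):
    """Walk the prototype chain until a slot called `name` is found."""
    while obj is not None:
        if name in obj:
            return obj[name]
        obj = obj["__proto__"]
    raise AttributeError(name)


def send(obj, name, *args):
    """Call a method found through the chain, passing the receiver explicitly."""
    return lookup(obj, name)(obj, *args)


def define_class(name, parent=None, **methods):
    """Mimic a class: a prototype holding the methods, plus a closure used as constructor."""
    parent_proto = parent["prototype"] if parent else None
    prototype = make_object(parent_proto, class_name=name, **methods)

    def new(*args):
        # The closure remembers `prototype`; each instance only delegates to it.
        instance = make_object(prototype)
        send(instance, "init", *args)
        return instance

    return {"prototype": prototype, "new": new}


Animal = define_class(
    "Animal",
    init=lambda self, name: self.update(name=name),
    describe=lambda self: f"{lookup(self, 'name')} is a {lookup(self, 'class_name')}",
)
Dog = define_class("Dog", parent=Animal, speak=lambda self: "Woof")

rex = Dog["new"]("Rex")
description = send(rex, "describe")  # inherited from Animal through the chain
sound = send(rex, "speak")           # defined on Dog only
print(description, "/", sound)
```

Nothing here is a class in the host language's sense: "inheritance" is just delegation along the prototype chain, and the "constructor" is just a closure, which is roughly the trick the JavaScript libraries of the time relied on.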
However, JavaScript syntax relates more to Frankenstein than to Pascal, if I may say. To circumvent this, people have written Ruby-to-JS or Python-to-JS translators, which allow you to use JavaScript libraries with the syntax and semantics of Python or Ruby.
Still, the method has not changed much: we're merely wrapping stuff instead of replacing or upgrading parts of a system. The task may be easier now, with the plethora of parsing libraries and the cool string and list manipulation primitives offered by most dynamic languages, but this still keeps these projects in the domain of "cool hacks".
The OMeta paper shows us that it is possible to open up a parser so that users and developers can easily add rules and modify an existing grammar dynamically, meaning that you can re-program parsers written using OMeta very easily.
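As a rough illustration of what an "open" parser could look like, here is a small sketch in Python. It is not OMeta and does not reproduce its API: the Grammar class, regex_rule and sum_rule are hypothetical names made up for this example. The only idea borrowed is that rules live in a table that can be changed while the program runs, with alternatives tried in order.

```python
import re


class Grammar:
    """A tiny rule table: each rule name maps to an ordered list of alternatives."""

    def __init__(self):
        self.rules = {}

    def add(self, name, alternative, first=False):
        """Register an alternative for `name`, optionally with top priority."""
        alternatives = self.rules.setdefault(name, [])
        if first:
            alternatives.insert(0, alternative)
        else:
            alternatives.append(alternative)

    def apply(self, name, text, pos=0):
        """Ordered choice: return (value, new_pos) from the first alternative that matches."""
        for alternative in self.rules.get(name, []):
            result = alternative(self, text, pos)
            if result is not None:
                return result
        return None


def regex_rule(pattern, convert):
    """Build a terminal rule from a regular expression and a conversion function."""
    compiled = re.compile(pattern)

    def rule(grammar, text, pos):
        match = compiled.match(text, pos)
        if match is None:
            return None
        return convert(match.group(0)), match.end()

    return rule


def sum_rule(grammar, text, pos):
    """sum := number ('+' number)*, looking up `number` in the table at parse time."""
    first = grammar.apply("number", text, pos)
    if first is None:
        return None
    total, pos = first
    while pos < len(text) and text[pos] == "+":
        following = grammar.apply("number", text, pos + 1)
        if following is None:
            return None
        value, pos = following
        total += value
    return total, pos


grammar = Grammar()
grammar.add("number", regex_rule(r"[0-9]+", int))
grammar.add("sum", sum_rule)

# The stock grammar only knows decimal numbers, so it stops in front of the "x".
before = grammar.apply("sum", "1+0x10")

# A user extends the grammar at run time, without touching sum_rule or Grammar.
grammar.add("number", regex_rule(r"0x[0-9a-fA-F]+", lambda s: int(s, 16)), first=True)
after = grammar.apply("sum", "1+0x10")

print(before, after)  # (1, 3) (17, 6)
```

The point is that sum_rule never had to be edited: because it resolves "number" through the table when it runs, a new kind of literal becomes available everywhere the rule is used.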
So what would it take to open our compilers and software-engineering tools? Looking at projects such as LLVM or the DLR (.NET's Dynamic Language Runtime), you can already see that some things are opening up: we're starting to get access to (object-oriented) representations of programs, not limited to syntactic elements like ASTs, but extending to more in-depth representations. Even better, with the DLR or with LLVM you can write your program "programmatically" by assembling objects that represent (or maybe just 'are') your program.
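To give a feel for assembling a program out of objects, here is a minimal sketch that uses Python's standard ast module as a stand-in. This is an assumption made for convenience: ast is a syntax-level tree, so it is shallower than what LLVM or the DLR expose, but the workflow (build objects, hand them to the compiler, run the result) is the same idea.

```python
import ast

# Build the expression (x + 3) * 7 directly as objects: no source text is involved.
tree = ast.Expression(
    body=ast.BinOp(
        left=ast.BinOp(
            left=ast.Name(id="x", ctx=ast.Load()),
            op=ast.Add(),
            right=ast.Constant(value=3),
        ),
        op=ast.Mult(),
        right=ast.Constant(value=7),
    )
)

# The compiler expects line/column information on every node; fill in defaults.
ast.fix_missing_locations(tree)

# Hand the object model to the regular compiler and run the resulting code object.
code = compile(tree, "<built-from-objects>", "eval")
result = eval(code, {"x": 2})
print(result)  # 35
```

The string "(x + 3) * 7" never appears as code anywhere in this snippet: the tree of objects is the program.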
This takes syntax out of the equation, allowing you to think about plugging in any syntax you want, provided you implement a parser that can create this object model. This simple realisation that syntax, semantics and library can be separated is, in itself, a paradigm shift that people will experience as more tools offer them this ability.
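Here is a small sketch of plugging a different syntax onto an existing object model. It uses Python's ast module as the shared model (an assumption for the sake of a runnable example, not something the DLR or LLVM provide), and a hypothetical Lisp-like prefix notation whose tokenize and parse helpers are invented here. Error handling is deliberately minimal.

```python
import ast

# Operators of the hypothetical prefix syntax, mapped onto the shared object model.
OPERATORS = {"+": ast.Add, "-": ast.Sub, "*": ast.Mult, "/": ast.Div}


def tokenize(text):
    """Split '(* (+ 2 3) 7)' into ['(', '*', '(', '+', '2', '3', ')', '7', ')']."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume the tokens of one expression and return the matching ast node."""
    token = tokens.pop(0)
    if token == "(":
        operator = OPERATORS[tokens.pop(0)]()
        left = parse(tokens)
        right = parse(tokens)
        if tokens.pop(0) != ")":
            raise SyntaxError("expected ')' after two operands")
        return ast.BinOp(left=left, op=operator, right=right)
    return ast.Constant(value=int(token))


def evaluate_prefix(text):
    """Front end for the prefix syntax; everything after parsing is the stock toolchain."""
    tree = ast.Expression(body=parse(tokenize(text)))
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<prefix>", "eval"))


def evaluate_infix(text):
    """Front end for the usual syntax, using the parser that ships with the language."""
    tree = ast.parse(text, mode="eval")
    return eval(compile(tree, "<infix>", "eval"))


prefix_result = evaluate_prefix("(* (+ 2 3) 7)")
infix_result = evaluate_infix("(2 + 3) * 7")
print(prefix_result, infix_result)  # 35 35
```

Both front ends produce the same kind of objects, so compilation and evaluation are shared; only the dozen lines of parsing differ between the two syntaxes.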
We've seen many new programming languages emerge in the last 10 years, often with only superficial differences. Open compilers and open development tools will very likely encourage more consolidation and cooperation between language designers. With now three major VMs capable of running a wide range of languages (JVM, .NET, Tamarin) and a growing understanding of the domain of programming languages, we can definitely expect a lot of exciting projects to come!