Implementing a lexer and a parser is not, by itself, enough to build a programming language.
The lexer is a tool that receives text and outputs a list of tokens.
The parser is a tool that receives a list of tokens and outputs a syntax tree.
What we need now is a tool that receives a tree and either runs it or exports it in some format that can be run.
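To make the "receives a tree and runs it" option concrete, here is a minimal sketch of a tree-walking evaluator. The node types (Node, Number, Binary) and the TreeWalkDemo class are hypothetical and only handle arithmetic; they are not PineDL's actual tree.

```csharp
using System;

// Hypothetical node types for a tiny arithmetic language (not PineDL's real tree).
abstract class Node
{
    public abstract double Evaluate();
}

sealed class Number : Node
{
    private readonly double value;

    public Number(double value) { this.value = value; }

    public override double Evaluate() { return value; }
}

sealed class Binary : Node
{
    private readonly char op;
    private readonly Node left;
    private readonly Node right;

    public Binary(char op, Node left, Node right)
    {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    // "Running" the tree means evaluating the children first, then combining them.
    public override double Evaluate()
    {
        double l = left.Evaluate();
        double r = right.Evaluate();
        switch (op)
        {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new InvalidOperationException("Unknown operator: " + op);
        }
    }
}

static class TreeWalkDemo
{
    // The tree a parser might produce for "1 + 2 * 3".
    public static Node Sample()
    {
        return new Binary('+',
            new Number(1),
            new Binary('*', new Number(2), new Number(3)));
    }
}
```

Calling TreeWalkDemo.Sample().Evaluate() walks the tree and returns 7. The alternative discussed in the rest of this post replaces Evaluate with a step that emits code for each node instead of computing a value directly.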
There are many ways to achieve this. One possible way is to use LLVM. LLVM is a C++ library (with C bindings) that supports JIT and AOT compilation, as well as executing the result.
LLVM is a relatively low-level API. Although it allows the programmer to abstract away from concepts such as executable file formats, it is basically a portable assembler.
LLVM does not come with garbage collection, object-oriented programming support or exceptions. Instead, LLVM provides the tools to create and integrate these features (all of which are entirely optional).
Contrast this with virtual machines like the JVM and .NET.
Both Java and .NET use an intermediate format which comes with the features mentioned above built in. In fact, not only are they built in, they are mandatory.
These are high-level virtual machines.
By "high-level", I do not mean "better" or "worse". Just different.
The two approaches are also not entirely incompatible. For instance, Mono has an LLVM backend for .NET JIT compilation.
While LLVM grants much programming freedom, it comes at a cost: additional development complexity. This may or may not be acceptable.
Among the many libraries .NET provides is System.Reflection.Emit. Like LLVM, this namespace provides support for creating, executing and exporting code (in this case, .NET assemblies) at runtime.
For .NET language development, it is ideal.
System.Reflection.Emit integrates extremely well with the rest of the .NET stack. To build a standard library for a language, one just writes that assembly in an existing .NET language (like C#), and then uses the normal Reflection APIs combined with Emit to generate calls into it.
I will not cover System.Reflection.Emit usage here (at least not now), but I will discuss how it impacts language development.
First, as with LLVM, there is very little API difference between executing and exporting a dynamically generated assembly.
Second, MSIL is more programmer-friendly than LLVM's intermediate representation in many ways. In particular, object-oriented programming support is built in, so a virtual method call takes only a few instructions:
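il.Emit(OpCodes.Ldloc, var1);   // push the receiver (local var1)
il.Emit(OpCodes.Ldstr, "xyz");  // push the string argument
il.Emit(OpCodes.Callvirt, typeof(Foo).GetMethod("bar"));  // virtual call to Foo.bar
il.Emit(OpCodes.Stloc, var2);   // store the return value in local var2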
This is roughly equivalent to:
var2 = var1.bar("xyz");
Here, the type of var1 extends or implements Foo, and "bar" is a virtual method.
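For readers who want to try this, the following is a minimal sketch that wraps the same four instructions in a DynamicMethod so they can actually be executed. The classes Foo, DerivedFoo and CallvirtDemo are hypothetical stand-ins made up for this illustration; only the System.Reflection.Emit calls are real API.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical stand-ins for the Foo type and its virtual "bar" method.
public class Foo
{
    public virtual string bar(string s) { return "Foo:" + s; }
}

public class DerivedFoo : Foo
{
    public override string bar(string s) { return "Derived:" + s; }
}

public static class CallvirtDemo
{
    public static Func<Foo, string> Build()
    {
        // Dynamic method with the signature: string CallBar(Foo).
        // Associating it with this module (skipVisibility = true) avoids
        // visibility checks on the demo types.
        var method = new DynamicMethod(
            "CallBar", typeof(string), new[] { typeof(Foo) },
            typeof(CallvirtDemo).Module, true);
        ILGenerator il = method.GetILGenerator();

        LocalBuilder var1 = il.DeclareLocal(typeof(Foo));
        LocalBuilder var2 = il.DeclareLocal(typeof(string));

        // var1 = <first argument>
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Stloc, var1);

        // var2 = var1.bar("xyz")
        il.Emit(OpCodes.Ldloc, var1);
        il.Emit(OpCodes.Ldstr, "xyz");
        il.Emit(OpCodes.Callvirt, typeof(Foo).GetMethod("bar"));
        il.Emit(OpCodes.Stloc, var2);

        // return var2
        il.Emit(OpCodes.Ldloc, var2);
        il.Emit(OpCodes.Ret);

        return (Func<Foo, string>)method.CreateDelegate(typeof(Func<Foo, string>));
    }
}
```

Calling CallvirtDemo.Build()(new DerivedFoo()) returns "Derived:xyz": the Callvirt instruction dispatches on the runtime type of var1, even though the emitted code only names Foo.bar.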
Note that Foo could be a core API type. System.Reflection.Emit appears to be a first-class .NET citizen. There is also almost no difference between using some API method and using a dynamically generated method, as MethodBuilder extends MethodInfo and can be used in the Emit APIs in any place where MethodInfo could be used.
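As a rough illustration of that last point, the sketch below defines two methods in a dynamic assembly; the second calls the first through its MethodBuilder and then calls Math.Abs through an ordinary MethodInfo, using the same Emit overload for both. The names DemoAssembly, Generated, Twice, Compute and MethodBuilderDemo are made up for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class MethodBuilderDemo
{
    public static int Run(int input)
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("DemoAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("DemoModule");
        TypeBuilder type = module.DefineType("Generated", TypeAttributes.Public);

        // Equivalent to: public static int Twice(int x) { return x + x; }
        MethodBuilder twice = type.DefineMethod(
            "Twice", MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new[] { typeof(int) });
        ILGenerator il = twice.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        // Equivalent to: public static int Compute(int x) { return Math.Abs(Twice(x)); }
        MethodBuilder compute = type.DefineMethod(
            "Compute", MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new[] { typeof(int) });
        il = compute.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        // A MethodBuilder (generated method) used where a MethodInfo is expected...
        il.Emit(OpCodes.Call, twice);
        // ...and a regular MethodInfo for an existing API method, same overload.
        il.Emit(OpCodes.Call, typeof(Math).GetMethod("Abs", new[] { typeof(int) }));
        il.Emit(OpCodes.Ret);

        // Bake the type, then invoke the generated method via normal Reflection.
        Type created = type.CreateType();
        return (int)created.GetMethod("Compute").Invoke(null, new object[] { input });
    }
}
```

MethodBuilderDemo.Run(-21) returns 42: the generated Twice yields -42 and Math.Abs turns it into 42.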
For PineDL, this is very practical. I just create the core language classes in C#, and then seamlessly integrate them into the PineDL application. All this without having to concern myself with details like garbage collection and the internal representation of virtual methods and interfaces. It just works, which is a big plus.
PineDL in particular is a hard language to "quickly" design a garbage collector for, since there is a greater-than-usual risk of cyclic references, making options such as simple reference counters and Boehm's conservative GC poor choices.
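To show why cycles are a problem for simple reference counters, here is a small simulation. The CountedObject and RefCountDemo types are hypothetical and model the counting by hand; this is not how PineDL or .NET manages memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical object with a manually maintained reference count.
sealed class CountedObject
{
    public int RefCount;
    public bool Freed;
    public readonly List<CountedObject> Fields = new List<CountedObject>();
}

static class RefCountDemo
{
    static void Retain(CountedObject o) { o.RefCount++; }

    // When the count reaches zero, the object is freed and releases its fields.
    static void Release(CountedObject o)
    {
        o.RefCount--;
        if (o.RefCount == 0)
        {
            o.Freed = true;
            foreach (CountedObject field in o.Fields) Release(field);
            o.Fields.Clear();
        }
    }

    // Stores 'target' in a field of 'owner', counting the new reference.
    static void Link(CountedObject owner, CountedObject target)
    {
        owner.Fields.Add(target);
        Retain(target);
    }

    // Returns true if both objects are freed once the "local variables" go away.
    public static bool BothFreed(bool makeCycle)
    {
        var a = new CountedObject();
        var b = new CountedObject();
        Retain(a); // reference held by a local variable
        Retain(b); // reference held by a local variable

        Link(a, b);                 // a -> b
        if (makeCycle) Link(b, a);  // b -> a closes the cycle

        Release(a); // the locals go out of scope
        Release(b);
        return a.Freed && b.Freed;
    }
}
```

RefCountDemo.BothFreed(false) returns true, while RefCountDemo.BothFreed(true) returns false: in the cyclic case each object still holds one reference to the other, so neither count ever reaches zero and both objects leak.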
Right now, I have a very simple working PineDL prototype, written entirely in C#, which I have not yet committed to the PineDL repositories.
It is still incomplete, so I will not be releasing binaries or source code at this point. I want to at least have variable declarations, loops and the most useful operations working before releasing something. In other words, I want to have something to show before showing it.
The code is currently divided into a few projects: Pine.Lexer, Pine.Parser, Pine.Core, Pine.CodeGen and Pine.IO. There is also a small testing project.
Pine.IO is part of the PineDL standard API, although it has very few functions right now.
I'm hoping to have something to show before the end of the year.