As you might already know I'm working on my own programming language for a while now. I'm still on early stages and working on choosing the right platform for Serene and trying to spend time on doing enough research and make decision based on facts, scientific papers and collected data from experiments rather than rushing into things and end up with a mess. I believe those languages that take their time and move slowly but with great research, plan and design are more successful in the long term (Thanks to Pouya for pointing it out). Take Clojure as an example. They are taking their time, experimenting and validating their hypothesis. As a result, Clojure is a well designed, stable and highly backward compatible language with amazing and productive pace. However, some other languages like Python are extremely popular and consequently has more contributors. Dealing with all those contributors caused Python to move faster than it should and they ended up with some bad choices and horrible designs that fixing them requires an humongous effort. Gradually, it becomes harder and harder to fix those and move away from them. GIL is a good example, instead of fixing the issue and removing the GIL, they are introducing (at the of writing this article they added some basic support to latest python release but far from what they want) something else to fix the original problem but it might become a pain point itself. In order to avoid these kind of problem as much as possible I'm trying to take my time and do as many as experiments as I need.
As I mentioned earlier I think GraalVM and Truffle is the right answer for Serene. But to verify my initial Idea I decided to run an experiment. The experiment is all about implementing a Lisp in two environments. A pure Java implementation vs a Truffle implementation.
I spent several days and implementing the pure java version. The repository of the simple version is available in the repo This is a dummy version, but good enough lisp that I didn't paid too much attention to the details and just created a very simple lisp with the following specification.
Note: In this post whereever I use the name Serene for the implementation, I'm referring to the simple version.
Data structures
Since I tried to avoid unnecessary work, I didn't do much and implemented
just one collection type which is the most important and essential data
structure in any Lisp, the mighty List. While my final goal is to have functional
data structures, this List is not a functional one and is a simple linked
list. You can find the implementation under serene.simple.ListNode
.
For the number types I just added support for Long
and Double
via `serene.simple.SNumber` class which acts as a dispatcher between two inner
classes.
For Strings, boolean types and nil
, I just used the equivalent Java data
structures directly.
Reader/Parser
Instead of using a parser generator or a sophisticated parser, I just created
a simple read ahead of position based parser that reads two chars and calls the
appropriate method to create the corresponding Node
. the serene.simple.Node
is an abstract class that has just one important method, eval
. The whole
purpose of the reader is to parse the code and create an AST like data structure
which each node extends the Node
class (I should've create the interface for
it but too lazy to change it now). The eval
method of `ListNode` is a bit
special. It calls the eval
method on all the elements on the list
and then calls the first element as a function and pass the rest of the elements
as the arguments to that function. First rule of lisp :))
The eval
method of ListNode
contains lots more details regarding to java
interop as well which I leave it out of this blog post.
Scope
Scopes are simply a mapping between symbol names and values. Serene consists of two
different scopes, both implemented in serene.simple.IScope
and extend
serene.simple.AScope
abstract class that contains the logic for symbol
lookup and insertion. These two classes are serene.simple.Scope
which
is the general scope and it has a parent/child type of relationship with
other instances of the same class or serene.simple.RootScope
that is
the top level scope. Beside that, RootScope
is pre-populated with all
the built-in functions and types.
Special forms
Serene's special forms are pretty limited. All of them all classes which extend
serene.simple.SpecialForm
abstract class and inherit from Node
indirectly. The
difference between special form evaluation and function evaluation is that in case
of special forms, Serene does not evaluate the arguments and leaves the evaluation
to the special form itself. Here is the list of Serene's special forms:
def
: Creates a binding between a symbol name and the given value:
(def name "serene")
fn
: Creates an anonymous function:
(def inc (fn (x) (+ 1 x)))
quote
: Prevents the evaluation of the given argument and return it as it is:
(quote (1 2 3 4)) ;; => (1 2 3 4)
if
: Evaluates the body based on the return value of the given predicate.
(if (= x 1) (...) ;; if x is 1 (...)) ;; if x is not 1
let
: Sets a local scope and runs its body using that scope.
(let ((x 1) (y 2)) (println x y))
do
: Simply groups several expressions together.
(do (println ...) (if ....))
cond
: Gets several predicates and only evaluates the body corresponding
to the first truthy predicate.
(cond ((= x 1) (body1...) ((= x 2) (body2...)) (true (else...))))
Builtin Function
All the build in function are created by extending the serene.simple.builtin.AFn
abstract class and follow the same Node
convention. Here is a list of the most
important built in functions:
(println ....)
: Prints all the arguments on the stdout.
(quit)
: Quits the program.
(conj coll x...)
: Returns a new list by adding the given arguments.
(count coll)
: Returns the number of elements in the given COLL.
(reverse coll)
: Returns a new list which is the reverse of COLL.
(list 1 2 3..)
: Creates a list from the given arguments.
(first coll)
: Returns the first element of the given COLL.
(rest coll)
: Returns all the elements beside the first element of the given COLL.
(doc fn)
: Returns the documentation for the given symbol if any.
(reduce f coll initial)
: Reduces the COLL by applying F to its elements with the
INITIAL as the default value. F takes two arguments 1) the accumulation 2) the element.
(new Class arg1 arg2...)
: Create a new instance of the given CLASS by passing the given
arguments to its constructor.
Example program
Here is an example program in Serene simple version (benchmarks/fib.srns
in the repo):
;; We have a reduce function but just in case... (def reduce1 (fn (f xs initial-value) (cond ((first xs) (reduce f (rest xs) (f initial-value (first xs)))) (true initial-value)))) ;; A simple map function implementation in serene (def map (fn (f xs) (reduce (fn (acc x) (cons acc (f x))) xs (list)))) (def range-list (fn (x y init) (if (< y x) (do (conj (range-list x (+ y 1) init) y)) init))) (def range (fn (x) (range-list x 0 (list)))) (def fib (fn (n) (def fib-iter (fn (x y z) (if (= x 0) z (fib-iter (- x 1) z (+ y z))))) (fib-iter n 0 1))) (def benchmark-fn (fn (x) (let ((start (now))) (println (fib x)) (- (now) start)))) (def run-benchmark (fn (times) (map (fn (x) (println "Benchmark: " x) (println "Took: " (benchmark-fn 500))) (range times)))) (run-benchmark 20)
What is missing ?
Since Serene (simple) is an experimental language and I'll abandon it eventually. I didn't want to fall into the rabbit hole and just tried to get to the point as soon as possible. So I sacrificed lots of details. Here is a list of the most important missing details:
- A namespace functionality. Because creating and compiling dynamic classes is a massive task and needs tons of work which doesn't make sense for a toy project.
- Unified function interface.
- Requiring different namespaces.
- A sophisticated parser. My Reader implementation is really cheap that suits a toy project. It might worth investigating on different solutions including using a parser generator or ahead of time read implementation.
- Primitive functions in Serene. I used lots of primitive functions from java rather than implementing them inSerene itself, mostly because of two reasons. Lack of macros and namespaces.
- Decent [functional] data structures. The only data structure I implemented is list.
- Quality code. The general quality of this implementation is not great, I sacrificed quality for time.
Conclusion
I'm not going to improve the simple version anymore at this stage. I'm going to run some benchmarks and measure different aspects of the current implementation and then I'll move to the Truffle version and continue experimenting.
Please let me know if you have any comments or questions on this topic. As always I'm available throw social media and email.