Thursday, December 13, 2012

Taming the beast that is GADT

For the past few weeks I have been trying to write a little embedded DSL. The goal is to be able to easily generate bindings between OCaml and C, and maybe Objective-C, too.

So far, I have written a simplified representation of C [1] using normal algebraic datatypes. This will allow me to have a loosely-typed intermediate representation, which I can then pretty print. The pretty printer I have implemented using the Format module from stdlib. It already supports types, expressions and statements, but no global definitions and declarations, yet.
To print OCaml code, I will use Format again for a first draft, and switch to CamlP4, should the need arise. I briefly considered using CamlP4 quotations for the whole project, but that would necessitate quotations and parsers for each language to bind. Unfortunately, my parser-fu is not up-to scratch, and the sparse documentation for CamlP4 is not helping either. [2] Maybe it is time to dust of the dragon books?

The EDSL that I am currently working on uses a feature just recently introduced with OCaml 4.0: generalized algebraic datatypes, or GADT for short. This will allow me to use OCaml's type-inference and -checking for the FFI description language. So basically, I'm writing a strongly-typed code generator in OCaml. When run, this generator will in turn create the necessary OCaml and C code, constituting a particular OCaml foreign function interface.

One thing I've come to notice when working with GADTs is that it helps to have as much type annotations as possible in the beginning. Only later on, when everything is working, can some annotations be removed. Otherwise, there will be a lot of strange type errors. Which for a mere software engineer like me are sometimes hard to understand. With type annotations, the error messages are usually much clearer.
I've found that recursive functions help a lot, as they are probably easiest to annotate and allow me to structure the code a little better. A concrete example is the following bind function:

let rec bind : type s r. (r,s) fn -> s -> r = fun fs f -> 
  match fs with
  | FVoid _ -> f
  | FLambda (t,n,fs') -> let x = XId (t,n) in bind fs' (f x)

With mutually recursive functions and additional type informations, the above example would look like this:

let rec bind :
  type s r. (r,s) fn -> s -> r =
      fun fs f -> 
        match fs with
        | FVoid _ -> bind0 f
        | FLambda (t,n,fs') -> bind1 t n fs' f
and bind0 :
  type a. a x -> a x =
      fun f -> f
and bind1 :
  type a r s. a t -> string -> (r x, s) fn -> (a x -> s) -> r x = 
      fun t n fs f ->
        let x = XId (t,n) in bind fs (f x)

With the following types, interested readers should be able to toy around with the above functions in OCaml (Version 4.0 is required):

(* Type of a value *)
type _ t =
  | TVoid       : unit t
  | TInt        : int t
  | TFun        : ('r,'a) fn                    -> ('r,'a) fn t 
(* Values / expressions *)
and _ x =
  | XInt        : int                           -> int x
  | XId         : 'a t * id                     -> 'a x
  | XAdd        : int x * int x                 -> int x
  | XApp0       : ('r x,'r x) fn x              -> 'r x 
  | XApp1       : ('r,'a x -> 'b) fn x * 'a x   -> ('r,'b) fn x
(* Function signature *)
and (_,_) fn =
  | FVoid       : 'r t                          -> ('r x,'r x) fn
  | FLambda     : 'a t * id * ('r x,'b) fn      -> ('r x,'a x -> 'b) fn
(* Identifier *)
and id = string


Footnotes & References:

  1. I am aware of CIL, but it seems to heavyweight for my purposes, and does not support Objective-C. And I view this whole exercise as a learning experience.
  2. There is now at least a wiki for CamlP4. And Andrei Formiga has a nice collection of links on the subject.

Wednesday, April 21, 2010

Caml-inspect: Version 0.2

I've pushed a new version of the Caml-library to github, and I must say, I'm really starting to like this whole open-source thing. It's quite exciting. Amongst the changes are an updated README (usage instructions), documentation using ocamldoc (make htdoc), and packaging using the -pack option of the OCaml compiler. And easier installation with make using the new install/uninstall targets.

The next step will be to host the project on http://forge.ocamlcore.org/, so it has more exposure to the intended target audience. I've already registered the project as caml-inspect. I am currently awaiting approval from the site admins.

Friday, April 16, 2010

Bugfix for S-expression dumping


I've pushed a new version of the code. The code to dump s-expressions did not handle references correctly, but it should now.

Maybe next time, I should prove the correctness using Coq. But then I would not be writing new code, would I?

Ah, what a dilemma between theoretical exactness and practical application. I wonder if Haskell will bridge that gap elegantly. We never really had the time to get aquainted.

Thursday, April 15, 2010

I'm on Github now


It is done. I now have an account on Github and my first open source repository has just been pushed. Feel free to take it for a spin. And don't forget to tell me what you think of it.

Anyway, after downloading and unpacking the library, just follow the installation instructions in the README file. If you have findlib installed, using the library is as simple as typing

> #use "topfind";;
> #require "inspect";;

into your OCaml prompt. I suggest you open the Inspect module as well.

> open Inspect;;

For starters, both the Dot and the Sexpr library provide a test_data function to generate some interesting data to dump.

> Sexpr.dump (Sexpr.test_data ());;
> Dot.dump (Dot.test_data ());;

It is naturally also possible to let the dump functions inspect themselves:

> Sexpr.dump Sexpr.dump;;
> Dot.dump Dot.dump;;

If you are on a Mac, the Inspect.Dot.dump_osx function should be of interest. It writes the DOT output to a temporary file, uses Graphviz to generate the graph, and displays the results using the open command.

> Dot.dump_osx Dot.dump_osx;;

It goes without saying that you should have Graphviz installed for this last part to work.

Have fun!

Wednesday, April 14, 2010

Object-dump of a BSP-tree



The main motive behind the Inspect library was to be able to see the actual datastructure of a previous project of mine.

Seems like some things could be optimized...

Tuesday, April 13, 2010

A little something...


Here is a little preview of what I am working on at the moment. It is a little library for the beautiful (1) OCaml programming language that allows one to inspect any OCaml value in the REPL by dumping pretty printed S-expressions or producing output for Graphviz.

Something along the lines of

val dump : ?context:dump_context -> 'a -> unit
val dot : ?context:dot_context -> 'a -> unit

There will be more.

Oh, and I am trying out the OCamlMakefile as a substitute for OMake, and I must say I really like it so far. Although OMake has served me well in the past.

-

1: Too be fair, she has some warts, especially with regard to her looks (the quirky syntax). Plus, it can be hard to understand the things she throws at you when you do something wrong (the error messages).
After spending some quality time with OCaml, however, I never ever want to go back to the beast that is C++. That was such an abusive relationship. C on the other hand...

Alright, here goes nothing.

Yeah, well...
The time has come for me to expose myself and my interests to the whole world.
Just sitting in the proverbial basement and hacking the nights away, absorbing knowledge (well, mostly just surfing the internet ;-) and never sharing any output with the world does not cut it for me anymore, I guess.

I will still enjoy spending the occasional sleepless night slouched in front of my warmly glowing monitor, with the cursor eagerly blinking while Emacs patiently waits for my input. However, as of this day, I will also try to share some of my ideas and thoughts with the rest of the world (or at least the miniscule part of it that might be interested in what I have to say).

That's all for now. More to follow.