Project scope and outline plan

Posted on October 16, 2017

Goals

GHOUL (Grand Higher-Order Untyped Language) is a simple functional programming language designed for an exercise in language implementation for CS316 by Bob Atkey and Conor McBride. As the name suggests, this language does not have a type system. The aims of my project are:

  • to learn about the various kinds of type systems and their theoretical underpinnings along with the practical consequences of using particular sorts of type systems;

  • to design a type system for GHOUL or a similar simple programming language in order to produce “GHOTL”;

  • to implement the type system and produce a working type-checking interpreter for GHOTL, potentially based on my GHOUL implementation from last year;

  • to improve on the programmer experience provided by GHOUL, in particular by providing better diagnostics if any stage of program validation (parsing, scope checking, type checking) fails.

In particular, I hope to learn about the clearly very diverse ways in which type systems can make programs easier to reason about and provide greater assurance of correctness, while still offering an application-oriented developer experience.

Blog

Available for viewing at https://lugn.sphalerite.org/ghotl.

Sources will be at https://gitlab.cis.strath.ac.uk/ylb14182/ghotl-blog.

Preliminary achievable objectives

  • Increased knowledge on type systems, documented in the report;

  • Complete design of GHOTL, based on decisions with the goal of an application-oriented type system in mind;

  • Working implementation of GHOTL.

Preliminary survey of related work

Theory

Based largely on skimming of Types and Programming Languages by Benjamin C. Pierce.

  • The simply typed lambda calculus, as described by Alonzo Church, allows checking consistency of function applications but is not expressive enough to allow expressing recursion in itself; it should however serve as a decent introduction to how type systems work, and be a good system to implement initially. Introducing additional language primitives allows making the simply typed lambda calculus more powerful, but does not really extend the type system, so it is not really interesting from a type system perspective;

  • Recursive types would be useful to allow user-defined recursive data structures within the type system rather than only language primitives;

  • Dependent types could allow much more extensive checking of correctness for programs;

  • Type inference of some description would be good to have. A C++11/C#-style system, which simply allows inferring the type of a name binding by determining the type of the expression assigned to (and fails if it cannot be disambiguated) it should not be complex to implement. Haskell-like (Hindley-Milner-based) inference which is informed by the use of the name in addition to its initialisation is more complex and apparently imposes some additional constraints on the type system, and could be a more interesting extension.

Practice

  • C is an imperative systems programming language with a fairly simple type system that does not provide much type safety, allowing many programs with undefined behaviour.

  • C++ extends C with a much more expressive type system aimed as far as I can tell at supporting multiple programming paradigms, particularly object-oriented styles, along with the use of template metaprogramming. Many safety features are implemented on top of rather than within its type system, using RAII to allocate resources when an object is created and free them when it goes out of scope. It is complex and still allows writing code which exposes a number of C’s weaknesses, and as such may be an example to avoid;

  • Compromises on type safety would be necessary in order to allow foreign function calls for an application-oriented language, and clean abstractions that make this unsafety easier to manage will be one thing to investigate;

  • Rust is oriented at systems programming, aiming for the same general domain as C but with a much more powerful type system. It ensures memory safety by tracking lifetimes within its type system and supports unsafe operations such as foreign function calls through the use of a dedicated unsafe keyword. This is exemplary of its general focus on defaulting to safe options, along with defaulting to immutable values;

  • An Agda-style dependent type system may be unsuitable as proofs tend to depend heavily on the definitions of their subjects, leading to tight coupling between them. This may not lend itself well to building interoperating components. I’m doubtful about this though, since the proofs simply provide properties of the functions one may then use to prove further properties. This may in fact lead to much less problematic refactoring and application architecture.

Some of these assumptions may prove to be wrong during the course of the project, and are only guesses based on what I already know.

Methodology

Literature review

Read about various type systems. Types and Programming Languages appears to be a good source for references.

Design and implementation

Informed by literature review; design and implementation will not be distinct phases, in order to allow a feedback loop where issues discovered during implementation or testing can inform the further development of the design.

Evaluation

My intention is not to develop a language which can functionally replace any existing popular language, but an understanding of how various design and implementation choices for a type system affect the use of the resulting language. As such, I do not intend to develop a fully-fledged application, but would like to develop some small programs to test its features. If I pursue the foreign-function call feature, for instance, I may look at implementing a few of the standard Unix utilities to test interoperability with the standard C library.

Initial plan

October – December: Review literature, centered around Types and Programming Languages and any other recommendations from Bob. Write a blog post summarising findings.

December – 12th January: Prepare project poster.

January – February: Design and implementation, with continuous testing to ease the workload during the feature phase. Document progress on blog to provide transparency and material for the final project report.

1st March: Target date to decide on final feature set.

March: Continue testing/fixing bugs, focus less on developing new features. Put blog material together for report and fill any gaps.

26th March: Report submission.

I intend to use a Kanban board to track my progress in a granular fashion, and will include the state of the board in my blog posts. I will not be following any Agile methodology very closely, as they are not very well suited to the individual nature of this project.

Marking scheme

This is an experimentation-based project with a significant software development aspect.

Supervisor comments

This project consists of a large amount of exploration and background reading before implementation, and the proposed plan reflects this. Linus has already made a start in mapping out the space of possibilities, both from a theoretical and practical perspective, and acknowledges that his plans will likely have to be revised as he discovers more. An emphasis on making GHOUL a more practical language for application development (rather than just as a vehicle for type system experimentation) will help guide the project — though it might be an idea to have a specific class of programs that ought to be easy to write in mind. The timetable looks reasonable, with a large amount of space for writing up. I would expect the timetable to become more detailed in mid January as more is known about what is planned for implementation.