Skip to main content

Rapid Syntax Prototyping Tool

··

I want to develop alternative syntax for Lisp. My idea is to write several possible syntaxes and see if I like them or not.

Step 1 #

I imagine it like this:

  • write some BNF-like syntax description
  • generate a parser that will return S-expressions
  • apply transformations similar to nanopass to get correct Lisp S-expressions
    • For example, transform (a + b) to (+ a b)

Parser + transformations give me a “reader”, which I can put in my implementation of Lisp and try to run some examples.

To make it a faster process I want to:

  • reuse “transformers” - which should be trivial, because they are pure functions
  • reuse patterns in BNF similar to predefined patterns in Rosie

Step 2 #

I wonder if it is possible to generate pretty printer based on grammar and transformations. If it would be possible I can get a set of Lisp programs and generate programs in new syntax to see if it looks good.

Can we inverse transformers? Related: Bidirectionalization Transformation Based onAutomatic Derivation of View Complement Functions, Combining Syntactic and Semantic Bidirectionalization, relational programming.

Step 3 #

Given grammar and some additional annotations, it should be trivial to generate syntax highligter (inspired by Rosie).

Implementation #

I need to chose parser generator. PEG parser seems like the best candidate for my purpose (Rosie uses it).

Potential issues with PEG:

Other ideas:

Tutorials for PEG parsers #

Grammars corpus #

Another approach to finding “ideal” grammar is to take a look at existing grammars and see what good and bad parts they have - to learn from their experience:

Zipf’s law, in probability, assertion that the frequencies f of certain events are inversely proportional to their rank r. The law was originally proposed by American linguist George Kingsley Zipf (1902–50) for the frequency of usage of different words in the English language; this frequency is given approximately by f(r) ≅ 0.1/r. Thus, the most common word (rank 1) in English, which is the, occurs about one-tenth of the time in a typical text; the next most common word (rank 2), which is of, occurs about one-twentieth of the time; and so forth. Another way of looking at this is that a rank r word occurs 1/r times as often as the most frequent word, so the rank 2 word occurs half as often as the rank 1 word, the rank 3 word one-third as often, the rank 4 word one-fourth as often, and so forth. Beyond about rank 1,000, the law completely breaks down.

britannica

Is it possible to construct some common corpus of syntactic constructs (assume they follow Zipf’s law) and validate new syntax against those cases?

Other “parsers” for inspiration #

Racket #

Language Workbench #

Read more: instaparsejs, Parsing with derivatives