The Chirp Intermediate Language and Code Generation

Li Xu
Department of Computer Science
University of Massachusetts Lowell
Lowell, MA 01854

October 2008

1  Introduction

Chirp is a simple C-like imperative language targeting embedded microcontrollers used in personal robot devices. The language itself is described in its language specification [1]. This document describes the Chirp Intermediate Language (ChirpIL) and compiler code generation from ChirpIL to target robot hardware.

The design of the Chirp compiler follows the typical compiler structure: the front-end lexer, parser and semantics checker will parse the input program, create an abstract syntax tree (AST) tree representation, and check semantics of program function and statements for type and semantics errors; the middle-end will convert program AST tree into the low-level ChirpIL which is close to the command set of the robot controller, the middle-end will also consist of multiple optimizing passes over ChirpIL to improve code quality and reduce memory size; finally, the back-end translates ChirpIL into instructions in microcontroller command set and generate the executable file to run on the robot device. The document will describe ChirpIL: the intermediate instructions, the program control flow graph formed from basic blocks of ChirpIL instructions, and the code generation process to generate target microcontroller code from ChirpIL. Although ChirpIL enables many opportunities for both code and data optimization, however, its optimization is not in the scope of this document and will be addressed separately in other Chirp compiler design documents.

2  Chirp Intermediate Language

  The Chirp compiler will convert the AST from the front-end to the 3-address style intermediate representation specified by the following ChirpIL grammar.


il_prog( il_func )*
il_func( il_block )*
il_block(label : )? ( il_inst )*
labelSTRING
il_instaload | astore | binop | branch | call | jump | mov | retval | return
aloadrand = aload rand rand
astoreastore rand rand rand
binoprand = rand op rand
opadd | sub | mul | div | mod
branch(bz | bnz ) rand label |
 (beq | bne | blt | ble | bgt | BGE ) rand rand label
call( rand = )? call FUNC_NAME ( rand )*
jumpjmp label
movrand = rand
retvalretval rand
returnreturn
randINT | STRING | VAR_NAME
Figure 1: ChirpIL grammar

2.1  ChirpIL Instructions

As listed in Figure 1, program in ChirpIL form consists of sequence of basic blocks. Block with labels are target of ChirpIL branch instructions. Basic blocks are sequence of ChirpIL instructions with control flow entering through the first instruction and exiting at the last instruction, there is no control transfer into and out of the basic blocks in the middle. The basic block can include the following ChirpIL instructions:

The ChirpIL instructions operate on three kinds of operands:

2.2  Control Flow Graph

Chirp functions are represented as control flow graph (CFG) in ChirpIL form. The CFG for each function consists of basic blocks and each block has CFG edges indicating successor blocks. Furthermore, each function CFG has a unique entry block node and exit block node to represent function entry and return points. The exit block contains a single return instruction to return control flow. The retval instruction should be followed by a jmp instruction to the exit block.

2.3  IRGen: Translation from AST to ChirpIL

The AST representation created by parser is converted into ChirpIL CFG by the IRGen intermediate representation generator. IRGen will translate Chirp statement and expression subtrees with sequence of ChirpIL instructions and generate basic blocks and CFG representation.

To translate expressions, IRGen will create temporary variables to store intermediate results. Arithmetic operations can be translated directly into ChirpIL binary instructions, while logic and comparison expressions are translated through equivalent blocks and branches. Chirp assignments are translated into move and array load and store instructions; control flow statements are translated into basic blocks, branches and jumps.

The following example shows Chirp source program and its ChirpIL IR.


        int a=1001;
        int b;
        int c;
        int i;
        int A[10];
        
        int main()
        {
          a = 3;
          c = 35;
          b = 5+3*a-b/2%100;
        
          for i (0:9) {
             A[i] = i+1;
          }
        
          return A[0];
        }
Figure 2: Example Chirp program


    #Function: main
    main:   a = 3
            c = 35
            t0 = 3 mul a
            t1 = 5 add t0
            t2 = b div 2
            t3 = t2 mod 100
            t4 = t1 sub t3
            b = t4
            i = 0
    BB0:    bgt i 9 BB2
    BB1:    t5 = i add 1
            t6 = i
            t6 = t6 mul 2
            t6 = A add t6
            astore A (t6) t5
            i = i add 1
            jmp BB0
    BB2:    t7 = 0
            t7 = t7 mul 2
            t7 = A add t7
            t8 = aload A (t7)
            retval t8
            jmp BB3
    BB3:    return
Figure 3: ChirpIL IR for the example

3  Code Generation

The back-end code generator will translate ChirpIL to the target microcontroller’s command set. In general, the ChirpIL instructions are close to instructions available on the target devices, as most provide similar instructions on arithmetic and control flow operations. The code generator should manage the following:

We use Parallax Scribbler robot as concrete target. The code generator in general needs to do the following: 1) map variable names to STAMP controller RAM variables, and arrays as ROM variables; 2) as there is no stack on STAMP microcontroller, the code generator should do name mangling for formal and local variables and allocate in the global storage; 3) translate ChirpIL to PBASIC; 4) implement System.Scribbler functions as PBASIC I/O commands (see [3] for details); 5) generate start-up code to initialize variables and jump to the main function entry block.

The generated PBASIC code for the example in Section 2 is shown in Figure 4.


        'global
        a VAR word
        b VAR word
        c VAR word
        i VAR word
        A DATA word 0(10)
        main_ret VAR word
        'func main:local
        main_l_t0 VAR word
        main_l_t1 VAR word
        main_l_t2 VAR word
        main_l_t3 VAR word
        main_l_t4 VAR word
        main_l_t5 VAR word
        main_l_t6 VAR word
        main_l_t7 VAR word
        main_l_t8 VAR word
        
        start_main:
              a = 1001
              GOSUB main
        end
        
        'func: main
        main:
              a = 3
              c = 35
              main_l_t0 = 3 * a
              main_l_t1 = 5 + main_l_t0
              main_l_t2 = b / 2
              main_l_t3 = main_l_t2 // 100
              main_l_t4 = main_l_t1 - main_l_t3
              b = main_l_t4
              i = 0
        BB0:
              if (i > 9) then goto BB2
        BB1:
              main_l_t5 = i + 1
              main_l_t6 = i
              main_l_t6 = main_l_t6 * 2
              main_l_t6 = A + main_l_t6
              WRITE main_l_t6, word main_l_t5
              i = i + 1
              goto BB0
        BB2:
              main_l_t7 = 0
              main_l_t7 = main_l_t7 * 2
              main_l_t7 = A + main_l_t7
              READ main_l_t7, word main_l_t8
              main_ret = main_l_t8
              goto BB3
        BB3:
              RETURN
Figure 4: PBASIC code generated from ChirpIL IR

4  Acknowledgments

This work has been supported by NSF grant DUE-0737054.

References

[1]
The Chirp Language Specification, Version 2. http://www.cs.uml.edu/∼xu/cs406/chirp_spec_v2.html.
[2]
Parallax, Inc. BASIC Stamp Reference Manual, version 2.1.
[3]
Parallax, Inc. Hack’s Hints for the Scribbler Robot.

This document was translated from LATEX by HEVEA.