Final Project (Due: Tuesday, February 4, 2014, at 12:00pm)

Table of Contents

1 Overview

This is the final project in the compiler construction course. Your tasks for the project consists of doing the following:

  • Writing a code generator to a CISC-like assembly language.
  • Integrating the various states in the compiler pipeline, into a complete compiler. As part of this, you will define a procedure compile_scheme_file(source, target), that takes two strings, the name of a Scheme source file, and the name of a target assembly file, read each expression in the Scheme source file, tag-parse it, perform the semantic analysis over it, generate assembly code for it, and write that code into the target assembly file.
  • Writing the run-time support: library of built-in Scheme procedures that will be available with your compiler.

2 Writing the code generator

Define the instance method code_gen(self) in AbstractSchemeExpr, so that when invoked from a Scheme expression e, returns a string of assembly language instructions in the CISC-like architecture arch posted to the website.

The object-oriented way to implement code_gen(self) would be to add an implementation in most of the classes that inherit from AbstractSchemeExpr. If you use the visitor pattern then you may want to implement a CodeGenVisitor class that implements the code generation functionality per class.

You are free to generate any valid arch code, but you are advised to follow the outline of the code generator presented in class.

2.1 The target language

The target language, i.e., the language that your code generator will output, is supposed to model a generic CISC-like assembly language for a general register architecture. What this means is that you have a relatively large number of general purpose registers that can all be used for arithmetic, logical and memory operations. The instruction set provides you with operations for computing sums, differences, products, quotients, remainders, bitwise Boolean operations, simple conditionals, branches and subroutines:

  • Registers. The cisc.h file, which defines our microarchitectural, specifies 16 general-purpose registers. Sixteen registers is a reasonable number; We have certainly used less than a half of these, and you may add as many more as you like. , We also have the stack pointer SP, the frame pointer FP. You may define the stack size to be whatever you like, but Mega(64) seems to be a reasonable starting point.
  • Arithmetic Operations.

You may carry out arithmetic operations between registers and registers or between registers and constants. No nesting of operations is permitted, so, for example, you can have MOV(R0, R3) or ADD(R4, R3), but you may not have something like MOV(R0, ADD(R3, R5)), etc.

  • Branches, Subroutines, Labels.

You can define any number of labels, and branch to any label. A label in C is defined as a name followed by a colon (e.g., L34:). As a rule of thumb, any name that would qualify as a variable or procedure name could also be used as a label. The commands JUMP, JUMP_condition, CALL, CALLA, RETURN, as well as other commands documented in the manual are available for jumping to labels, calling & returning from subroutines. Consult the documentation & code examples for more details.

As mentioned in class, unlike the situation in assembly language, labels in C are symbolic entities and have no addresses. The gcc compiler introduces a non-standard extension, in which the address of a label can be found via &&LabelName, and is a datum of type void *. Correspondingly, jumping to an address, as opposed to a label, is accomplished by means of another non-standard extension: goto *addr.

Our microarchitecture makes extensive use of these extensions to the C standard. This means that this part of the project cannot be done in a compiler other than gcc. This compiler is available on the departmental Unix and Linux machines, comes as the default C compiler on Linux & OSX, and can be downloaded and used under Windows. For information on how to get gcc for Windows, please consult the course syllabus.

  • Conditionals.

Consult the documentation of the cisc microarchitecture, as well as the sample library, for information on all the conditions that can be tested, and how to use them.

  • Stack Operations.

Consult the documentation of the cisc microarchitecture, as well as the sample library, for information on the structure of the stack, the frame, the FPARG and STARG macros, and how to use them.

3 Integration

In this part of the project, you will implement the global procedure compile_scheme_file(source, target), that takes two strings, the name of a Scheme source file, and the name of a target assembly file, read each expression in the Scheme source file, tag-parse it, perform the semantic analysis over it, generate assembly code for it, and write that code into the target assembly file. After evaluation, the value should be printed to stdout unless it is #<void>, in which case, do not print it. If the value of an expression contains #<void>, then and only then is the void object printed within the larger resulting value.

You should be able to invoke it from the Python prompt as follows:

>>> compile_scheme_file('test1.scm', 'test1.asm') 
>>> compile_scheme_file('test2.scm', 'test2.asm') 
>>> compile_scheme_file('test3.scm', 'test3.asm')

You should provide a makefile that can be used to compile the target assembly language file to a standalone executable. You should do this by writing a makefile rule that will describe how to convert files with the .asm suffix to files with no suffix. Please check the online documentation of the make utility for an explanation as to how to do this. Basically, you will need to teach make to recognize the new suffix:

.SUFFIXES: .asm

Then you will need to define the rule:

%:      %.asm
	...

Where ... should be replaced by appropriate invocations of gcc.

4 Run-time support

Certain elementary procedures need to be available for the users of your compiler. These "built-in" procedures need to be hand-written, in assembly language. These include:

apply, < (variadic), = (variadic), > (variadic), + (variadic), / (variadic), * (variadic), - (variadic), boolean?, car, cdr, char->integer, char?, cons, eq?, integer?, integer->char, list, make-string, make-vector, map, null?, number?, pair?, procedure?, remainder, string-length, string-ref, string->symbol, string?, symbol?, symbol->string, vector, vector-length, vector-ref, vector?, zero?.

You will also need to make the procedure Yag available to your users, so that the expansion of letrec works correctly. You may either re-implement it in assembly, or simply catenate the following source code with (and before!) any user code:

(define Yag
  (lambda fs
    (let ((ms (map
		(lambda (fi)
		  (lambda ms
		    (apply fi (map (lambda (mi)
				     (lambda args
				       (apply (apply mi ms) args))) ms))))
		fs)))
      (apply (car ms) ms))))

5 How to submit

  • You should submit a zip file project.zip that creates a directory project containing:
  • scheme-compiler
    • pc.py
    • sexprs.py
    • reader.py
    • tag_parser.py
    • compiler.py
    • arch
    • makefile
    • Any additional files needed to compile and run Scheme code
  • A file readme.txt that contains the names, IDs, and email addresses of all group members (you may either work alone, or with a single partner), as well as the following statement:

I(We), <fill in your name(s)>, declare the following to be true:

  • We worked on this assignment alone. We did not consult with others about this assignment, other than the teaching staff for this course.
  • We did not copy code either from other students or from the internet.
  • We did not make available this code to other students taking this course.
  • We are aware of the University, faculty (as specified in the ‏שנתון‎), departmental, and course policies (as specified in the syllabus) on academic dishonesty: We realize that if we are found to have committed academic dishonesty that we shall be sent before the disciplinary committee (‏ועדת משמעת‎), fail the course, and possibly incur other penalties as decided by the disciplinary committee.

We will not grade any work submitted that does not include the above statement. Please do not forget to include it, with your names spelled out. If you do not do this, for any reason, then at the very least, you will incur the 5% per diem late penalty (as specified in the course syllabus).

5.1 Last minute instructions

  • This is it, folks: This is your last programming assignment in the course. Please make sure everything necessary to compile and run Scheme code is included. Please be very pedantic about it: Create a new directory, unpack your zip file there, and make sure you can run everything from that directory. You must be very sure that everything works.
  • PLEASE do make sure that your work runs on the departmental Linux machines. It will save you much grief later on!
  • PLEASE do go over the submission instructions carefully, and make sure you did not leave out anything.

Author: Mayer Goldberg

Created: 2014-01-23 Thu 11:47

Validate XHTML 1.0