| field | value | date |
|---|---|---|
| author | Nathan Ringo <nathan@remexre.com> | 2023-10-05 02:27:31 -0500 |
| committer | Nathan Ringo <nathan@remexre.com> | 2023-10-05 02:27:31 -0500 |
| commit | c33dc81d26eb5b77faf0108502ec7588a13ebad6 | |
| tree | 097f63838ed76849fdae525018f554fe4777a4b5 /Compiler in a Day/00-overview.page | |
| parent | e49b67af7125cc7c5a5ec6f321057bee1036ba05 | |
Adds Compiler in a Day series. This is a direct import and will need cleanup.
Diffstat (limited to 'Compiler in a Day/00-overview.page')
-rw-r--r-- | Compiler in a Day/00-overview.page | 40 |
1 files changed, 40 insertions, 0 deletions
diff --git a/Compiler in a Day/00-overview.page b/Compiler in a Day/00-overview.page
new file mode 100644
index 0000000..778c5c1
--- /dev/null
+++ b/Compiler in a Day/00-overview.page
@@ -0,0 +1,40 @@
++++
+title = "Overview — Compiler In A Day"
+next = "01-ciadlang.md"
++++
+
+# Compiler In A Day
+
+This is intended to be a walkthrough of a complete compiler for a simple language that can be read and understood in a single day.
+
+In order to achieve this, we're going to be cutting a lot of corners, mostly around code generation.
+The assembly we'll be producing will run correctly, but it will be very inefficient.
+
+Our compiler will accept a file written in our programming language and output x86_64 assembly, which can be assembled and linked by [GNU Binutils], intended to be run on Linux.
+
+It should also run on the [Windows Subsystem for Linux] or on FreeBSD with its [Linux ABI support].
+
+We'll also have a small runtime, written in C, and using [the Boehm-Demers-Weiser garbage collector].
+
+The source code we'll show for the compiler is in Ruby, but nothing Ruby-specific will be used.
+In fact, a previous version of this compiler was written in C11.
+
+Our compiler will have four parts.
+They are, in the order they get run:
+
+- [Lexing][Lexical analysis]: the process of breaking up the strings of source code into lexical units known as "tokens." This simplifies parsing.
+- [Parsing]: the process of building a tree representing the program from the tokens.
+- Frame layout: the process of assigning slots in each function's [stack frame] to its local variables.
+- Code generation: the process of generating actual assembly code from the program.
+
+TODO: pictures!
+
+Before we can start looking at these steps, however, we should look at the language we'll be compiling.
+
+[GNU Binutils]: https://www.gnu.org/software/binutils/
+[Lexical analysis]: https://en.wikipedia.org/wiki/Lexical_analysis
+[Linux ABI support]: https://man.freebsd.org/cgi/man.cgi?query=linux&sektion=4&format=html
+[Parsing]: https://en.wikipedia.org/wiki/Parsing
+[stack frame]: https://en.wikipedia.org/wiki/Call_stack#Structure
+[the Boehm-Demers-Weiser garbage collector]: https://en.wikipedia.org/wiki/Boehm_garbage_collector
+[Windows Subsystem for Linux]: https://learn.microsoft.com/en-us/windows/wsl/
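
Two of the phases listed in the overview, lexing and frame layout, are small enough to illustrate in Ruby. The following is only a rough sketch and is not taken from the series: the `Token` struct, the token kinds, the `tokenize` and `layout_frame` names, and the 8-byte slot size are all assumptions made for the example, and the real compiler may handle each of these differently.

```ruby
require "strscan"

# Hypothetical token type; the series' actual lexer may use other kinds.
Token = Struct.new(:kind, :text)

# Lexing: break a source string into tokens, skipping whitespace.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif (text = scanner.scan(/\d+/))
      tokens << Token.new(:int, text)
    elsif (text = scanner.scan(/[A-Za-z_]\w*/))
      tokens << Token.new(:ident, text)
    elsif (text = scanner.scan(/[-+*\/=(){},;]/))
      tokens << Token.new(:punct, text)
    else
      raise ArgumentError,
            "unexpected character #{scanner.peek(1).inspect} at offset #{scanner.pos}"
    end
  end
  tokens
end

# Frame layout: give each local variable its own stack slot, expressed as a
# negative byte offset from the frame base. The 8-byte slot size is an
# assumption that fits 64-bit values on x86_64.
def layout_frame(locals, slot_size: 8)
  locals.each_with_index.map { |name, index| [name, -slot_size * (index + 1)] }.to_h
end
```

For example, `tokenize("x = 42;").map(&:text)` yields the four token texts `x`, `=`, `42`, and `;`, and `layout_frame(%w[a b])` assigns `a` the offset -8 and `b` the offset -16. A code generator could then address each local relative to the frame base using those offsets.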