diff options
author | Nathan Ringo <nathan@remexre.com> | 2023-10-05 02:27:31 -0500 |
---|---|---|
committer | Nathan Ringo <nathan@remexre.com> | 2023-10-05 02:27:31 -0500 |
commit | c33dc81d26eb5b77faf0108502ec7588a13ebad6 (patch) | |
tree | 097f63838ed76849fdae525018f554fe4777a4b5 /Compiler in a Day/03-frame-layout.page | |
parent | e49b67af7125cc7c5a5ec6f321057bee1036ba05 (diff) |
Adds Compiler in a Day series. This is a direct import and will need cleanup.
Diffstat (limited to 'Compiler in a Day/03-frame-layout.page')
-rw-r--r-- | Compiler in a Day/03-frame-layout.page | 169 |
1 files changed, 169 insertions, 0 deletions
diff --git a/Compiler in a Day/03-frame-layout.page b/Compiler in a Day/03-frame-layout.page new file mode 100644 index 0000000..2f53308 --- /dev/null +++ b/Compiler in a Day/03-frame-layout.page @@ -0,0 +1,169 @@ ++++ +title = "Frame Layout — Compiler In A Day" +prev = "02-ast.md" +next = "04-abi.md" ++++ + +# Frame Layout + +When programs execute, their local variables' values are typically stored into a data structure called the *call stack*, or commonly just *the stack*. + +This is a stack data structure whose entries are called *stack frames*. +These contain all the information that a function might need to save while calling another one. + +For example, consider the following program. + +```ciad +fun f(x) { + var y; + var z; + + y = x + 3; + z = g(x - 1); + return y - z; +} + +fun g() { + if x == 0 { + return 42; + } else { + return f(x - 1); + } +} + +fun main(args) { + return f(2); +} +``` + +When `f` calls `g`, we need to store the value of `y` somewhere, so that it can be used again after `g` returns. +We also need to store what function `f` should return to once it finishes, and we store that on the stack as well. + +In general, most languages structure their stacks as something like: + +``` +╭───────────────────────────────╮ <-- "bottom of the stack" +│ address main should return to │ (highest address +├───────────────────────────────┤ on most current +│ saved frame pointer │ CPU architectures) +├───────────────────────────────┤ +│ │ +│ saved local variables of main │ +│ │ +┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥ +│ address f should return to │ +├───────────────────────────────┤ +│ saved frame pointer │ +├───────────────────────────────┤ +│ │ +│ saved local variables of f │ +│ │ +┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥ +│ address g should return to │ +├───────────────────────────────┤ +│ saved frame pointer │ frame pointer +├───────────────────────────────┤ <--------- when running g +│ │ +│ local variables of g │ stack pointer +│ │ when running g, aka +╰───────────────────────────────╯ <----- "top of the stack" +``` + +The stack then "grows down" as functions are called. +When a function returns, it removes its frame from the stack, shrinking it. + +The stack is kept track of with the *stack pointer*, which points to the top of the stack, i.e. the address at the _end_ of the current stack frame. + +In some programs, there's also a *frame pointer*. +This points to the address at the start of local variables of the current stack frame. + +With a smarter compiler than we're building in this series, the frame pointer isn't necessary. +However, it can still be useful for a debugger and some kinds of runtime introspection. + +In our compiler, we'll need to have a pass that computes how much storage the local variables of each function will take up in the stack frame. + +Once we know this, we can compute the address of any local variable as an offset from the frame pointer. + +Alright, let's get started with the code! + +The first thing we'll need to do is to extend `Arg` and `DeclVar` with the ability to store a slot index. +This the offset of the variable from the frame pointer. + +```ruby +class Arg + # ... + + attr_accessor :slot +end + +class DeclVar + # ... + + attr_accessor :slot +end +``` + +Then, we can add the method to perform the pass to the `Program` class. + +Computing frame layouts doesn't require any sort of information that's shared between functions, so we can just tell each function declaration to compute it independently. + +```ruby +class Program + # ... + + def compute_frame_layouts + @decls.each do |decl| + case decl + in DeclFun + decl.compute_frame_layout + in DeclVar + # Ignore it; variables don't have any kind of frame + # to be laid out. + end + end + end +end +``` + +Once we're processing the function, we start with storing a counter for the number of local variables we have in the declaration. + +We define a helper function (`assign_slot_to`) to be called on every `Arg` and `DeclVar` in the function as well. + +```ruby +class DeclFun + # ... + + def compute_frame_layout + @locals_count = 0 + def assign_slot_to(var) + var.slot = @locals_count + @locals_count += 1 + end + + @args.each { |arg| assign_slot_to arg } + + def traverse_stmt(stmt) + case stmt + in DeclVar + assign_slot_to stmt + + in StmtIf + stmt.then_stmts.each { |stmt| traverse_stmt stmt } + stmt.else_stmts.each { |stmt| traverse_stmt stmt } + in StmtWhile + stmt.body_stmts.each { |stmt| traverse_stmt stmt } + + else + # Do nothing; nothing else needs to get a slot nor + # contains anything that needs to get a slot. + end + end + + @body.each { |stmt| traverse_stmt stmt } + end +end +``` + +Once this has finished running, every `DeclStmt` will have an `@locals_count`, and every `Arg` and non-global `DeclVar` will have an `@slot`. + +Before we can start generating code, though, we should look at some details of exactly what code we'll be generating. |