The Compilation Pipeline
Source text is just a string. A compiler pipeline is a sequence of transformations that turns that string into something structured and meaningful. Each stage takes the output of the previous one, narrowing the representation from raw text to a typed result.
This model applies to every Alpaca program — not just calculators, but any language you define with the library.
The Four Stages
Most compilers share the same four-stage structure:
- Source text: the raw input string, e.g., "3 + 4 * 2"
- Lexical analysis: groups characters into tokens, here NUMBER(3.0), PLUS, NUMBER(4.0), TIMES, NUMBER(2.0)
- Syntactic analysis: arranges tokens into a parse tree (concrete syntax tree) that encodes grammatical structure
- Semantic analysis / evaluation: extracts meaning from the tree, producing a typed result (in a calculator: Double)
Some compilers add a fifth stage, code generation, that emits machine code or bytecode. Alpaca stops at stage 4: its pipeline produces a typed Scala value, not machine code. If you need code generation, you can implement it yourself on top of that value.
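To make the four stages concrete, here is a hand-written sketch of the calculator pipeline in plain Scala 3. It does not use Alpaca at all: Token, Expr, tokenize, parseExpr, evaluate, and runPipeline are hypothetical names chosen for illustration, the tokenizer assumes tokens are separated by whitespace, and the tree is a simplified one rather than a full concrete syntax tree.

```scala
// Hand-written illustration of the four stages; none of these names come from Alpaca.
enum Token:
  case Number(value: Double)
  case Plus, Times

enum Expr:
  case Num(value: Double)
  case Add(left: Expr, right: Expr)
  case Mul(left: Expr, right: Expr)

// Stage 2: lexical analysis. Assumes tokens are separated by whitespace.
def tokenize(source: String): List[Token] =
  source.split("\\s+").toList.filter(_.nonEmpty).map {
    case "+"  => Token.Plus
    case "*"  => Token.Times
    case text => Token.Number(text.toDouble)
  }

// Stage 3: syntactic analysis.
// Grammar: expr = term ('+' term)*, term = number ('*' number)*
def parseNumber(tokens: List[Token]): (Expr, List[Token]) = tokens match
  case Token.Number(v) :: rest => (Expr.Num(v), rest)
  case other                   => sys.error(s"expected a number, found $other")

def parseTerm(tokens: List[Token]): (Expr, List[Token]) =
  val (first, rest) = parseNumber(tokens)
  def loop(left: Expr, remaining: List[Token]): (Expr, List[Token]) = remaining match
    case Token.Times :: tail =>
      val (right, next) = parseNumber(tail)
      loop(Expr.Mul(left, right), next)
    case _ => (left, remaining)
  loop(first, rest)

def parseExpr(tokens: List[Token]): (Expr, List[Token]) =
  val (first, rest) = parseTerm(tokens)
  def loop(left: Expr, remaining: List[Token]): (Expr, List[Token]) = remaining match
    case Token.Plus :: tail =>
      val (right, next) = parseTerm(tail)
      loop(Expr.Add(left, right), next)
    case _ => (left, remaining)
  loop(first, rest)

// Stage 4: evaluation. Walks the tree and produces a typed result (Double).
def evaluate(expr: Expr): Double = expr match
  case Expr.Num(v)    => v
  case Expr.Add(l, r) => evaluate(l) + evaluate(r)
  case Expr.Mul(l, r) => evaluate(l) * evaluate(r)

// Stages 1 to 4 chained: source text in, Double out.
def runPipeline(source: String): Double =
  val tokens = tokenize(source)
  val (tree, leftover) = parseExpr(tokens)
  require(leftover.isEmpty, s"unexpected trailing tokens: $leftover")
  evaluate(tree)
```

Calling runPipeline("3 + 4 * 2") yields 11.0, because the grammar nests the multiplication below the addition. Alpaca generates the lexing and parsing stages for you, so this sketch is only meant to show what each stage hands to the next.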
Alpaca's Pipeline
With Alpaca, running the pipeline from source text to a parsed result takes two calls. The example here uses a Brainfuck lexer and parser:
// Full pipeline: source text → typed result
val (_, lexemes) = BrainLexer.tokenize("++[>+<-].")
// lexemes: List[Lexeme] — inc, inc, jumpForward, next, inc, prev, dec, jumpBack, print
val (_, ast) = BrainParser.parse(lexemes)
// ast: BrainAST | Null — the parsed abstract syntax tree
BrainLexer.tokenize handles stages 1–2: source string to List[Lexeme]. BrainParser.parse handles stage 3: lexemes to an AST. You then evaluate or interpret that AST as stage 4, producing the final result.
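Stage 4 for this example could look like the following tree-walking interpreter. This is a minimal sketch under stated assumptions: Op is a hypothetical AST invented for illustration (the real BrainAST may be shaped differently), and the tape size and 8-bit cell wrap-around are arbitrary choices rather than anything Alpaca prescribes.

```scala
// Hypothetical AST for illustration; the real BrainAST may be shaped differently.
enum Op:
  case Inc, Dec, Next, Prev, Print
  case Loop(body: List[Op])

// Stage 4: walk the tree, mutate a tape of cells, and collect printed characters.
// tapeSize and the 8-bit wrap-around are assumptions of this sketch.
def interpret(program: List[Op], tapeSize: Int = 256): String =
  val tape = Array.fill(tapeSize)(0)
  var pointer = 0
  val output = new StringBuilder

  def run(ops: List[Op]): Unit = ops.foreach {
    case Op.Inc        => tape(pointer) = (tape(pointer) + 1) & 0xff
    case Op.Dec        => tape(pointer) = (tape(pointer) - 1) & 0xff
    case Op.Next       => pointer += 1
    case Op.Prev       => pointer -= 1
    case Op.Print      => output.append(tape(pointer).toChar)
    case Op.Loop(body) => while tape(pointer) != 0 do run(body)
  }

  run(program)
  output.toString

// "++[>+<-]." written out by hand as the hypothetical AST.
val sampleProgram: List[Op] = List(
  Op.Inc, Op.Inc,
  Op.Loop(List(Op.Next, Op.Inc, Op.Prev, Op.Dec)),
  Op.Print
)
```

For sampleProgram, the loop moves the value 2 from the first cell into the second, so the final Print emits the now-zero first cell. Nesting Loop bodies inside the tree means the interpreter never has to search for matching brackets at run time, since the parser has already resolved that structure.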
Both BrainLexer and BrainParser are generated by Alpaca's macros at compile time.
What Comes Next
The rest of the Compiler Theory Tutorial builds on this mental model:
- Next: Tokens & Lexemes — what the lexer produces: token classes, token instances, and how they are represented in Alpaca
- The Lexer: Regex to Finite Automata — how regular expressions define token classes and how Alpaca compiles them
For the full API, see the reference pages:
