
Build Architecture

Compilation Flow

Most bundlers and compilers are similar in how they approach compiling an application. At a high level, it looks something like this:

            ┌────────┐                                
            ▼        │                                
        ╔═══════════════╗    ╔═══════════════╗        
        ║ │1│ Discover  ║───▶║ │2│ Transform ║──┐     
        ╚═══════════════╝    ╚═══════════════╝  │     
┌───────────────────────────────────────────────┘     
│  ╔═══════════════════════════╗   ╔═════════════════╗
└─▶║ │3│ Combine and Construct ║──▶║ │4│ Emit Output ║
   ╚═══════════════════════════╝   ╚═════════════════╝
  1. Starting from a provided entry point, the compiler traverses the dependency graph. Discovery is a recursive process, continuing until all dependencies and sub-dependencies are discovered and parsed. We'll call each node in the dependency graph a "module".

  2. Input modules are (optionally) transformed in some way.

  3. Input modules are combined in some way. We'll call each combination a "bundle".

  4. The bundles are emitted. As part of this process, another transformation may occur from an intermediate format to the destination format (e.g. byte-code to assembly code).

Roughly, the compiler finds some things, transforms them, combines them, and emits the combinations.
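
The four steps above can be sketched in a few lines of JavaScript. This is a toy illustration of the generic pattern, not Interlock's implementation: the in-memory `files` table and the functions `discover`, `transform`, `combine`, `emit`, and `compile` are all hypothetical names invented for this example.

```javascript
// Hypothetical in-memory "file system": path -> source text and dependencies.
const files = {
  "entry.js": { source: "entry code", deps: ["a.js", "b.js"] },
  "a.js": { source: "a code", deps: ["b.js"] },
  "b.js": { source: "b code", deps: [] }
};

// 1. Discover: walk the dependency graph recursively from the entry point.
function discover(entry, seen = new Map()) {
  if (seen.has(entry)) return seen; // already visited, so stop recursing
  const file = files[entry];
  seen.set(entry, { path: entry, source: file.source, deps: file.deps });
  file.deps.forEach(dep => discover(dep, seen));
  return seen;
}

// 2. Transform: (optionally) rewrite each module in some way.
function transform(module) {
  return { ...module, source: module.source.toUpperCase() };
}

// 3. Combine: group modules into bundles (here, one bundle for everything).
function combine(modules) {
  return [{ dest: "bundle.js", modules }];
}

// 4. Emit: convert each bundle into its destination format (here, a string).
function emit(bundle) {
  const raw = bundle.modules
    .map(m => `/* ${m.path} */ ${m.source}`)
    .join("\n");
  return { dest: bundle.dest, raw };
}

function compile(entry) {
  const modules = [...discover(entry).values()].map(transform);
  return combine(modules).map(emit);
}

const output = compile("entry.js");
console.log(output[0].raw);
```

Note that `discover` records a module before recursing into its dependencies, so shared or circular dependencies are visited only once.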

Interlock's Compilation

Interlock's compilation process follows the above pattern, with one caveat: module transformation and the recursive discovery process are intertwined. This allows for module-specific transformation to affect the dependencies that are discovered.

And so, a truer visualization of Interlock's compilation process might look like this:

                 ┌────────────────────┐               
                 ▼                    │               
        ╔═══════════════╗    ╔═══════════════╗        
        ║ │1│ Discover  ║───▶║ │2│ Transform ║──┐     
        ╚═══════════════╝    ╚═══════════════╝  │     
┌───────────────────────────────────────────────┘     
│  ╔═══════════════════════════╗   ╔═════════════════╗
└─▶║ │3│ Combine and Construct ║──▶║ │4│ Emit Output ║
   ╚═══════════════════════════╝   ╚═════════════════╝
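
To make the difference concrete, here is a minimal sketch of discovery and transformation being interleaved. It assumes a hypothetical transform that gives every `.jsx` module an extra dependency on a runtime file; the names `sources`, `transformModule`, and `discoverAndTransform` are made up for illustration and are not part of Interlock.

```javascript
// Hypothetical sources keyed by path; each lists the dependencies it names.
const sources = {
  "app.jsx": { deps: ["util.js"] },
  "util.js": { deps: [] },
  "jsx-runtime.js": { deps: [] }
};

// A module-specific transform: .jsx modules gain a dependency on a runtime.
function transformModule(module) {
  if (module.path.endsWith(".jsx")) {
    return { ...module, deps: [...module.deps, "jsx-runtime.js"] };
  }
  return module;
}

// Dependencies are read from the *transformed* module, so the transform
// influences which modules are discovered next.
function discoverAndTransform(path, compiled = new Map()) {
  if (compiled.has(path)) return compiled;
  const seed = { path, deps: sources[path].deps };
  const transformed = transformModule(seed);
  compiled.set(path, transformed);
  transformed.deps.forEach(dep => discoverAndTransform(dep, compiled));
  return compiled;
}

const compiledPaths = [...discoverAndTransform("app.jsx").keys()];
console.log(compiledPaths);
```

Here `jsx-runtime.js` is never named by any source file. It is only discovered because the transform ran before the dependencies were followed.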

Compilation Primitives

We've covered the basic steps of compilation, but we haven't mentioned the data that flows between these steps. Let's look at that next.

Foundational data-structures

Really, there are only two foundational data-structures that flow through the Interlock compilation process: modules and bundles. Below, we'll enumerate the properties of these data structures at various stages.

Module seeds

A module seed is the earliest version of a module, containing the minimum information necessary for the module to be read from disk and to proceed through compilation. It has the following properties:

Compiled modules

A module goes through several steps, each one adding one or more properties to the module object as it proceeds. A module will never be considered compiled until all of its dependencies are compiled. A compiled module will have the following properties (in order of completion):

Bundle seeds

Bundle seeds are the earliest version of a bundle, containing the minimum information necessary to begin the bundle compilation process. A bundle seed has the following properties:

Bundles

Bundles are a combination of one or more modules, and all meta-data needed to generate source. In addition to the bundle-seed properties defined above, bundles have the following properties:

Raw bundles

Raw bundles can be thought of as the "files" that are emitted. In addition to the bundle properties defined above, raw bundles have the following property:

In fact, only raw and dest are required to write bundles to disk, so you may see plugins generate a simple object of shape { raw, dest } late in the compilation process so that they're written to disk.
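
As a rough sketch of why `{ raw, dest }` is enough, the following writes a list of such objects to disk using Node's `fs` module. The `writeRawBundles` function is hypothetical, and treating `dest` as a path relative to an output directory is an assumption made for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical writer: assumes `dest` is a path relative to `destRoot`.
function writeRawBundles(rawBundles, destRoot) {
  return rawBundles.map(({ raw, dest }) => {
    const outputPath = path.join(destRoot, dest);
    fs.mkdirSync(path.dirname(outputPath), { recursive: true });
    fs.writeFileSync(outputPath, raw); // raw may be a string or a Buffer
    return outputPath;
  });
}

// Write two simple { raw, dest } objects into a temporary directory.
const destRoot = fs.mkdtempSync(path.join(os.tmpdir(), "raw-bundles-"));
const written = writeRawBundles(
  [
    { raw: "console.log('hello');", dest: "app.js" },
    { raw: Buffer.from("body { margin: 0; }"), dest: "styles/app.css" }
  ],
  destRoot
);
console.log(written);
```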

Variations

It should be noted that the above descriptions of the bundle and module objects disregard the changes that a plugin might introduce. There are no constraints on what can be attached to these objects as they flow through compilation, and some plugin authors may find it useful to attach metadata to the module for use in later steps.
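
For instance, one step might tag a module and a later step might read the tag back. The sketch below is purely illustrative: the `path` and `meta` properties and both functions are hypothetical, not properties or hooks that Interlock defines.

```javascript
// Early step: attach hypothetical metadata to the module object.
function tagModule(module) {
  const isVendor = module.path.startsWith("node_modules/");
  return { ...module, meta: { isVendor } };
}

// Later step: read that metadata to decide which bundle the module joins.
function chooseBundle(module) {
  return module.meta && module.meta.isVendor ? "vendor.js" : "app.js";
}

const modules = [
  { path: "src/index.js" },
  { path: "node_modules/lodash/index.js" }
].map(tagModule);

const assignments = modules.map(m => [m.path, chooseBundle(m)]);
console.log(assignments);
```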

Flow of Data

The following is a conceptual visualization of how these data-structures relate to each other and to the compilation flow:

            │                     
            │ │1│ Entry modules   
            ▼                     
      ╔══════════╗                
   ┌─▶║ Discover ║                
   │  ╚══════════╝                
   │        │                     
   │        │ │2│ Module seeds    
   │        ▼                     
   │  ╔═══════════╗               
   └──║ Transform ║               
      ╚═══════════╝               
            │                     
            │ │3│ Compiled modules
            ▼                     
╔═══════════════════════╗         
║ Combine and Construct ║         
╚═══════════════════════╝         
            │                     
            │ │4│ Bundles         
            ▼                     
     ╔═════════════╗              
     ║ Emit Output ║              
     ╚═════════════╝              
            │                     
            │ │5│ Raw bundles     
            ▼                     
  1. Entry modules are defined by the user in their compilation options. Before entering the Discover stage, they are transformed into entry module seeds.

  2. Module seeds contain the minimum information necessary for the source file to be loaded from disk, parsed, and transformed.

  3. During transformation, module seeds are incrementally fleshed-out until all module properties are populated. We'll call these "compiled modules".

  4. Compiled modules are then combined into one or more bundles. These bundles will correlate directly with the files that are emitted at the end of the compilation process.

  5. Bundles undergo a final, two-step transformation. Given the AST of the modules in each bundle, a Bundle AST is derived. From that, a bundle's code (and optionally a source-map) is generated.

Raw bundles can then be handed off to a function that writes them to disk, to an HTTP server that serves them to clients, or to anything else.
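
As one example of such a hand-off, the sketch below serves raw bundles from memory with Node's built-in `http` module. `createBundleHandler` is a hypothetical helper, and mapping each bundle's `dest` to a URL path is an assumption made for this example.

```javascript
const http = require("http");

// Hypothetical handler factory: serves each raw bundle at "/" + dest.
function createBundleHandler(rawBundles) {
  const byUrl = new Map(rawBundles.map(b => ["/" + b.dest, b.raw]));
  return (req, res) => {
    if (!byUrl.has(req.url)) {
      res.statusCode = 404;
      res.end("Not found");
      return;
    }
    res.statusCode = 200;
    res.end(byUrl.get(req.url));
  };
}

const handler = createBundleHandler([
  { raw: "console.log('hi');", dest: "app.js" }
]);

// The server is created but not started; call server.listen(port) to serve.
const server = http.createServer(handler);
```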

Pluggable Functions

So far, we've looked at compilation in the abstract. Fortunately, this abstract process corresponds pretty closely with the implementation.

The building blocks for this entire process are specialized pluggable functions.

Pluggable functions are like pure functions, taking inputs, delegating certain computations to other functions (often pluggables), and returning an output. However, there are a few fundamental differences.

In essence, compilation is composed of a hierarchy of function calls. These function calls behave as extension points, where their output and behavior can be overridden or transformed, either synchronously or asynchronously.

Additionally, each pluggable function (and each plugin point that overrides or transforms a pluggable function) has a special execution context (this) that provides access to the compilation options (this.opts).
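
The idea can be sketched as follows. This is not Interlock's actual API: `makePluggable`, the `overrides` and `transforms` registries on the context, and the `resolveDest` function are all hypothetical. Only the notion of a `this` context exposing `this.opts` comes from the description above.

```javascript
// Hypothetical helper, NOT Interlock's API: wraps a default implementation
// so that it can be overridden or have its output transformed.
function makePluggable(name, defaultFn) {
  return function pluggable(...args) {
    const context = this; // assumed shape: { opts, overrides, transforms }
    const override = (context.overrides || {})[name];
    const transforms = (context.transforms || {})[name] || [];
    const impl = override || defaultFn;
    // Wrapping in promises lets implementations and transforms be either
    // synchronous or asynchronous.
    return transforms.reduce(
      (result, transform) =>
        result.then(value => transform.call(context, value, args)),
      Promise.resolve().then(() => impl.apply(context, args))
    );
  };
}

// Default behavior reads the compilation options from this.opts.
const resolveDest = makePluggable("resolveDest", function (moduleName) {
  return `${this.opts.destRoot}/${moduleName}.js`;
});

// A "plugin" registers an asynchronous transform of the function's output.
const context = {
  opts: { destRoot: "dist" },
  overrides: {},
  transforms: {
    resolveDest: [
      async function (dest) {
        return dest.replace(/\.js$/, ".min.js");
      }
    ]
  }
};

const pending = resolveDest.call(context, "app");
pending.then(dest => console.log(dest));
```

Registering a function under `context.overrides.resolveDest` would replace the default implementation entirely, while transforms only post-process whatever value the implementation returns.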

To dig deeper:

Intermediate representation of code

During compilation, Interlock tries to pair a piece of code with its most useful abstract structure.

This eases the process of manipulating code and pulling information out of it during the build process. It also provides (soft) guarantees that the code that's emitted relates to the input source in the way that's expected.

In the case of JavaScript, this means that Babel AST is used to represent code. This is true from the point of parsing until bundle code and sourcemaps are generated. The CSS plugin treats CSS similarly, relying on PostCSS's AST format.

In general, strings and string concatenation should be avoided for module/bundle properties, both in plugin authoring and in Interlock itself, unless they are the obvious choice. For example, a bundle's code must ultimately be a string or a buffer in order to be written to a file. Because Babel's generator outputs a string, we transform a bundle's AST to a string just before the end of compilation.
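
A small sketch shows why a structured representation is safer than string matching. The AST below is written by hand to mirror Babel's node shapes (`CallExpression`, `Identifier`, `StringLiteral`); in practice a parser would produce it. The `collectRequires` walker is a hypothetical helper for this example, not something Interlock or Babel provides.

```javascript
// Source being analyzed. The second require(...) is inside a string literal,
// so it is not a real dependency.
const source = `require("./a"); log("require('./b')");`;

// Hand-written nodes following Babel's AST shapes for the source above.
const ast = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "require" },
        arguments: [{ type: "StringLiteral", value: "./a" }]
      }
    },
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "log" },
        arguments: [{ type: "StringLiteral", value: "require('./b')" }]
      }
    }
  ]
};

// Recursively collect the argument of every require("...") call expression.
function collectRequires(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach(child => collectRequires(child, found));
  } else if (node && typeof node === "object") {
    if (
      node.type === "CallExpression" &&
      node.callee.type === "Identifier" &&
      node.callee.name === "require" &&
      node.arguments.length === 1 &&
      node.arguments[0].type === "StringLiteral"
    ) {
      found.push(node.arguments[0].value);
    }
    Object.values(node).forEach(child => collectRequires(child, found));
  }
  return found;
}

// String matching cannot tell code apart from text that looks like code.
const viaRegex = [...source.matchAll(/require\(['"](.+?)['"]\)/g)].map(m => m[1]);
const viaAst = collectRequires(ast);

console.log(viaRegex, viaAst);
```

The regular expression reports both `./a` and `./b`, while the AST walk reports only `./a`, the one real dependency.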

Babel

Interlock relies on Babel for code parsing, transformation, and generation. That makes Babel Interlock's most significant dependency.

Babel itself is split into several packages, which are mostly maintained here. Each Babel package has its own README and test suite. If you need to understand more about how Interlock consumes Babel packages, the READMEs and tests are a good place to start. There are also several active channels in Babel's Slack community.