Proposal: Modify comptime semantics to be runtime-consistent (WIP)

Zig's comptime feature is a marvelously versatile tool with a strong, intuitive appeal to users for its simplicity and its utility for meta-programming, optimization, and zero-cost abstractions. Nonetheless, there are some serious corner cases of its behavior that cause confusion.

In order to bring some clarity to what the correct semantics of comptime should be in some of these cases, I propose a simple rule that I call the Run-time Equivalence Principle, which is intended to uniquely determine the correct behavior of comptime Zig, at least when it overlaps with run-time semantics:

Run-time Equivalence Principle (REP): Adding/deleting comptime from a program should not change its semantics (provided that it compiles)

This is clearly a very strong principle. In fact, it's so strong that you might worry about whether such a solution even exists. After all, maybe there's something inherent in comptime behavior that forces it to diverge from run-time behavior?

To resolve that, I'll try to demonstrate that this principle gives us a reasonable comptime semantics, and furthermore that it coincides with the notion of soundness when Zig's comptime is viewed as an abstract interpretation. In this sense, comptime can be viewed as a very specific kind of abstract interpretation of Zig programs, with automatic lowering of data into the program text when the outputs of this interpretation are "observed".

Of course, Zig's comptime also includes a variety of super-powers which have no run-time equivalents. The REP makes no comment on these, provided that their syntax prevents compilation when removing comptime.

As we'll see later, it's also worth considering a weakened version of the same principle:

Weak Run-time Equivalence Principle (Weak REP): Adding/deleting comptime from a program should not change its semantics (provided that it compiles and contains no undefined behavior)

The Basic Idea

There are two pieces to comptime var's mechanics, intuitively:

  1. Propagation: A comptime var's state may vary line-to-line within a program, just like a normal variable.
  2. Observation: We say that a comptime var is observed whenever it interacts (as a read-only object) with run-time code. This effectively "freezes" the current value of the variable, lowering it into program text.

Let's do a quick example, annotating as we go along. First, one where everything is fine:

comptime var x: usize = 0; // x: 0
x = 1;  // x: 1
x += 2; // x: 3
var y: usize = 3 * x;    // x is observed here, with a value of 3!
// var y: usize = 3 * 3; // <-- This is the equivalent code that x is 
                         //     lowered into via observation
x += 3; // x: 6
try stdout.print("y: {} x: {}\n", .{y, x}); // observation of x (= 6)
// try stdout.print("y: {} x: {}\n", .{y, 6}); // <-- equivalent code after observation

Notice that there's no ambiguity about how to interpret the comptime state of x in this example. Intuitively, it corresponds exactly to what you'd expect a normal run-time variable to do in its place. Its value is copied into the program text at each observation, but the end behavior of the program is the same as the version with no comptime at all.

However, we start to see divergent and unintuitive behavior when run-time ambiguities are introduced:

comptime var x: usize = 0; // x: 0
if (b) { // b is run-time-known
    x = 2; // x: 2
} else {
    x = 1; // x: 1
}
// x: 1 -> Not very logical, but this is what Zig does today
try stdout.print("x: {}\n", .{x});  // observation of x (= 1)
                                    // not run-time consistent, but no error :(

As annotated above, Zig compiles this example and outputs x: 1. If I change x to be a normal var though, it may actually output x: 2 at run-time. Or it may output x: 1! The problem is that we don't know.

I've changed my program behavior, just by adding comptime to a variable. The Runtime-Equivalence Principle (REP) says that this isn't supposed to happen!

Background: Abstract Interpretation

Abstract interpretation is a static analysis technique used to analyze a program off-line, without running it on "real" inputs. Instead, it involves propagating sets of possible values (called abstract domains) through a program, and conservatively approximating the sets of possible states of the program at each step.

In general, these sets can be pretty exotic. They might track null vs. non-null pointers, for example. Or, they could be upper and lower bounds on a varibale, or any number of other things.

Thankfully, the relevant mechanics for Zig's comptime are pretty easy to intuit. We'll be propagating discrete sets of values through a program, like so:

var x: usize = 0; // x ∈ {0}
if (b) { // b is an unknown runtime value
    x = 1;        // x ∈ {1}
}
// x ∈ {0, 1} -> no unique value at runtime!
try stdout.print("x: {}\n", .{x});

A sound abstract interpretation is one that doesn't lie about what set x may be in at run-time. That is, the reported set of possible values really do include all the possible run-time values of x, and it may also report others that are actually not possible to reach at run-time.

This example shows that a sound abstract interpretation of fixed inputs often yields a single fixed output at many points in the program, but there are points where ambiguity is introduced and forcing the interpretation to select a single value is unsound.

If we want comptime behavior to describe an equivalent run-time program correctly, the value of a comptime variable should become undefined, to preserve soundness. Here's a corrected version of our comptime example from the section above:

comptime var x: usize = 0; // x ∈ {0}
if (b) { // b is an unknown runtime value
    x = 2; // x ∈ {2}
} else {
    x = 1; // x ∈ {1}
}
// x ∈ {2, 1} -> no unique value, x becomes undefined
try stdout.print("x: {}\n", .{x});  // observation of undefined x
                                    // compile-time error!

x becoming undefined is a result of losing precision in the abstract analysis.

This gives some flavor for the problems related to propagation of comptime state today.

As we'll see, there are also challenges with the observation of data via comptime references, which may present even bigger challenges for compliance with the REP.

This is section is just a quick nod to abstract interpretation for the technically-oriented reader. Xavier Rival's text on Static Analysis is a good introductory book for a proper primer. Also, a more familiar term for many folks might be partial evaluation, which implies the same interpretation of Zig's comptime semantics, with essentially the same notion of correctness.

The Good: Where the Principle Already Applies

1. Pure Functions with value-type parameters

fn squareme(comptime x: u32) u32 {
    return x * x;
}

pub fn main() !void {
    comptime x : u8 = 2;
    println("The value is: ", .{f(x)});
}

As you can see, abstract interpretation agrees with comptime evaluation today that functions with known inputs have known outputs.

The Run-time Equivalence Principle is perfectly happy with us here.

The REP also tells us that function memoization must be unobservable in general. Zig currently enforces this by considering any closed over state (including local state) as a separate parameter when performing memoization.

2. Sequential statements

It's good to know that many compile-time function evaluations are consistent with our principle.

What about comptime var? For a simple linear program, the REP is satisfied and comptime var corresponds exactly to the uniquely determined variable state at that point in the program. Here's an example which is fine, for instance:

    comptime var x: u8 = 0;
    var i: usize = 2 * x;
    x = 1;
    try stdout.print("i: {} x: {}!\n", .{i, x});

This prints i: 0 x: 1!, exactly as if comptime were not there to begin with

3. inline Loops and Conditionals

When using inline, loops are effectively linearized into a sequence of statements, so based on the last example, we can expect that that inline loops also agree with our principle. Indeed, they do:

    comptime var x: u8 = 0;
    var i: usize = 0;
    inline while (x < 3) {
        x += 1;
        i += x;
        try stdout.print("({},{}) /", .{i, x});
    }

This outputs the expected (1,1) / (3,2) / (6,3) /, and removing inline and comptime does not change the output.

By the same reasoning, inline if also behaves correctly.

The Bad, but Easily-Fixed: Divergent comptime Behavior

Great, so the above examples showed cases where I can delete comptime and inline without changing the semantics of my program. These are super easy to reason about!

However, it can't all be that good, or I probably wouldn't have written any of this up. So without further delay, let's get into the hairy parts of the problem...

1. Non-inline Conditionals

What should this example print?

var b: bool = true;
comptime var x: u32 = undefined;
if (b) {
    x = 1;
} else {
    x = 2;
}
try stdout.print("x: {}", .{x});

Currently, Zig prints x: 2, but the Run-time Consistency Principle says that this is not a correct interpretation of this program.

Let's see why: If we erased comptime and treated x like a normal var, what would its value be? Well, it has no unique value for the call to print()! In fact, x has no uniquely-determined value anywhere after the conditional. Its possible states are {1,2}, not just {2} or {1}.

As we saw in the introduction, this example should be rejected by the compiler, since there is no correct way to lower x to a compile-time value at the point it is observed.

The correct restriction is:

Comptime state becomes undefined after a conditional if the comptime state in each branch does not match (and control flow didn't leave a branch via return or break)

2. Non-inline Loops (with variant comptime state)

    comptime var x: u8 = 1;
    var i: usize = 0;
    while (i < 3) {
        i += x; // `x` as a normal variable would have no unique value here!
        x = 10;
    }
    try stdout.print("({},{})\n", .{i, x});

The output is (3,10), but if we delete comptime, the output changes to (11,10)! Sure enough, this example violates the REP.

The approximate restrictions are as follows:

  • If the loop has a runtime-known condition, then any non-constant writes to x cause it to become undefined. Furthermore, if its value at the end of the loop does not match its value at the beginning, x becomes undefined everywhere after the loop and inside the loop up to the first write
  • If the loop has a comptime-known condition, comptime state can freely vary across the loop, taking the equivalent run-time value after the loop. Within the loop, comptime state is undefined wherever it was not constant during each iteration

The first of these is a restriction of current compiler behavior, the second is an optional relaxation. Since this example has a runtime-known condition and the comptime state varies across the loop, it should be rejected by the compiler.

3. Functions with local static variables

Zig allows us to have static locals by closing over a local struct definition:

fn foo(comptime unused: u8) usize { 
    const state = struct { 
        var x: usize = 0; 
    };
    state.x += 1; 
    return state.x;     
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    try stdout.print("x: {}, ", .{foo(1)});
    try stdout.print("x: {}, ", .{foo(1)});
    try stdout.print("x: {}\n", .{foo(2)});
}

This outputs x: 1, x: 2, x: 1, but if we delete the comptime it outputs x: 1, x: 2, x: 3.

The handling of the state update is actually correct here, and memoization does not get in our way. What seems to be wrong is that the functions should refer to the same underlying type instance, if we want to respect the REP.

Pulling the struct out of the function makes it clear that its definition doesn't actually depend on unused and gives us REP consistency:

const LocalStateStruct = struct { 
    var x: usize = 0; 
};
fn foo(comptime unused: u8) usize { 
    const state = LocalStateStruct;
    state.x += 1; 
    return state.x;     
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    try stdout.print("x: {}, ", .{foo(1)});
    try stdout.print("x: {}, ", .{foo(1)});
    try stdout.print("x: {}\n", .{foo(2)});
}

This suggests that the problem here is the implicit creation of a new type just by virtue of being inside a comptime-specialized function.

If the struct { ... } actually depended on unused, then this wouldn't be a REP violation because it unused would have no runtime equivalent (you wouldn't be allowed to delete comptime). Accordingly, a more accurate designation for this problematic case is "functions with a struct definition which is independent from a comptime parameter".

One solution that appears to resolve this comes from a separate proposal: (Minor) Proposal: Support inference of type constructor parameters

Other alternatives are:

  1. Reject this kind of function, and require a type used as local state to be declared at top-level, undesirably exposing the type/state to other functions
  2. Ignore the inconsistency, and just educate users if there are any surprises

The Ugly: comptime Reference Semantics

1. Functions with reference-type parameters

Now that we've gotten through the light stuff, let's get to the really hard part: references at comptime. In fact, in my limited experience, comptime reference semantics are one of the most confusing parts of the language.

To see some of what I mean, peek at this example:

fn set(comptime x: *u32, comptime val: u32) void {
    x.* = val;
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();

    comptime var x: u32 = 1;
    try stdout.print("x: {}\n", .{x});
    set(&x, 2);
    try stdout.print("x: {}\n", .{x});
}

This actually prints "x: 1" twice! This is a clear violation of the REP.

  1. Pointer comparison of comptime-observed regions.

Since comptime data structures are lowered to static data during observation, comptime emulation of mutable reference types is necessarily incomplete. In particular, it's not possible to simultaneously allow correct observation of the pointer itself, as well as its referent.

Here's an example:


comptime var x: *T = allocate(T);
x.field1 = 15;
var y1: *T = x;       // Observation by run-time code of pointer

x.field1 = 20;
var y2: *T = x;       // Observation by run-time code of pointer

// Using the run-time pointer's referent:
assert(y1.field1 == y2.field2);

// Comparing the pointers themselves:
assert(y1 == y2);

We can see here that if we want to support the REP, there's an incompatibility between supporting dereferencing and pointer comparison operations simultaneously.

In fact, it's reasonable to say that current Zig is REP-compliant for pointer comparisons, but not for de-referencing. Since it's relatively rare to encounter comptime-lowered pointers that are not eventually de-referenced, this says that Zig programs will commonly violate the REP.

To try to increase compliance, we could modify Zig to correctly lower for runtime-equivalence w.r.t. to dereference, which would support more programs. Nonetheless, there would clearly still be exceptions, which seem very important to control if we're going to claim compliance with the REP.

In order to enforce that invariant, we end up back in the Rust situation where we need yet another pointer type...

What's the solution? Generalized comptime-erased coeffects and capabilities!

This also lets us protect against the use of integer-encoded pointers across comptime and run-time boundaries. Such pointers can only be used either for pointer arithmetic or for de-reference.

Sketching a solution

The primary challenge of handling comptime references has to do with the observation of comptime data structures containing pointers. The REP says that all of the run-time observable parts of a comptime data structure must be frozen and copied exactly into the program text.

This process must be repeated every time the comptime data structure is observed. To save memory, we don't need to re-copy portions of the data structure that haven't changed, but the algorithmic details are complicated. A basic solution would require the following:

  1. Sound Reachability Analysis at comptime: Pointer invariants must be enforced for this. In particular, the bounds (and possibly layout) of memory behind a comptime pointer must be well-defined, so that run-time code does not access un-copied regions of comptime memory.
  2. Memory-Efficient Lowering: Every time an observation occurs, a GC-like algorithm determines reachable objects and freezes these into program code (being careful not to re-freeze already frozen objects)

// TODO: Insert notes to a gist here (and fill in GC algorithm)

In order to accommodate cases where correctly-specified memory bounds/type layout behind a pointer are not achievable, we can split our run-time pointer invariants into two classes:

  • tight pointers: *T is obviously sized, so out-of-bounds access is considered illegal by specification
  • loose pointers: (e.g. [*]T) is an unsized pointer, so the compiler must assume that it may refer to up to the full region it was borrowed from. It is legal to access this pointer out-of-bounds, up to the bounds of the pointer it was derived from

The general trade-off is that loose pointers copy larger comptime regions than their tight counterparts, so they can be more permissive about allowing out-of-bounds access at run-time.

Finally in keeping with the current comptime behavior, any pointer addresses are not guaranteed to be preserved by comptime observation, so e.g. manually saving pointer addresses in integers and then observing these for later de-reference will not work.

The GC algorithm would look roughly like this...

What if we don't want to solve this?

  1. If pointer invariants are the blocking issue, we can restrict which data structures may be observed: For instance, observation could be restricted to tight pointers
  2. We could ignore the inconsistency with the REP and provide detailed, clear documentation of comptime reference semantics to explain its differences vs. run-time code
Education

Regarding more complex pointer semantics that might need to be introduced: If we do end up needing "tight"/"loose" pointers, we should make sure these terms never make it into the documentation. Instead, the text would explain that "some pointer types are compatible with limited out-of-bound data access, including to comptime data." It would then elaborate on the specific bounds implied for each pointer type, and how these interact with allocations.

Practical Dangers: Code Bloat

Code bloat is referenced in several comptime-related Zig issues, and for good reason. This is usually related to the monomorphization of functions, which for our purposes is just another observation that happens when comptime and run-time data mix.

This proposal increases the danger of code bloat by enabling the implicit freezing of an entire graph of comptime objects into user code. In fact, this may happen arbitrarily many times if it is repeatedly modified and observed! Clearly, this is an important issue.

Rather than change the semantics of comptime to avoid these consequences indirectly, it's worth exploring if there's a better way to control the observation of large comptime entities (data structures and functions alike). Just to spit-ball: Perhaps there should be a specific built-in to flag that a large or repeated observation is intended? Maybe a conservative upper limit could be placed on the default number of comptime observations that a variable/function allows, or even the size of the code/data that it generates, with a built-in to alter/bypass this limit as needed.

Any solutions here would cross-apply to help with unexpected code bloat corner cases in Zig today, including for example, the "implicit" parameters used to prevent memoization of certain closed-over comptime state.

Conclusion

Advantages

The goal of this proposal is to make comptime more intuitive and consistent. Hopefully, that should make it easier to teach.

The Runtime-Equivalence Principle says that comptime is a modifier which:

  1. Automatically verifies that the comptime-tagged computation can be completely determined at compile-time
  2. Requests the compiler to compute it and lower any observed state/results into the program text along the way
  3. Gives you extra "super-powers", including creating a type, returning a reference to a stack-allocated variable, or using type-dynamic variables for your computations

The Strong version of the REP provides the additional guarantee that it is always safe to erase or add comptime/inline tags to your program.

These principles seem easier to reason about than treating a comptime var as a unique object with its own semantics separate from run-time var. In my opinion, the principle also captures an important essence of what it means for comptime Zig to be the "same language" as run-time Zig: their interaction should be sound with each of their semantics in isolation.

Whether or not the Runtime-Equivalence Principle is accepted, divergences from run-time behavior, like the examples studied above, can create important foot-guns and barriers to learning, so keeping the principle in mind may prove helpful.

Challenges

The biggest barriers to full conformance with the REP would appear to be unintended code bloat, maintaining pointer invariants for reachability analysis, and implementation complexity/performance in the compiler. These are definitely serious issues that need careful consideration by folks familiar with the compiler internals and detailed Zig specification (certainly, not me!)

Of course, this is all just a sketch of an idea so far, so I'm very eager to hear if anyone can spot any fatal flaws or important details

Aside: What does this have to say about existing principles of comptime operation?

In general, the REP only restricts comptime behavior when a runtime-equivalent behavior exists with the same syntax.

As a consequence, many existing restrictions to run-time or comptime behavior are compatible with (but not required) by the REP. This is also true in general for the comptime super-powers.

Thus, the REP helps to provide clear guarantees about one major portion of comptime behavior, while leaving open the discussions on appropriate super-powers and restrictions to comptime and run-time.

It also helps clarify the impact of those discussions:

  • If the Strong REP is satisfied, then the discussion of super-powers & restrictions primarily impacts the size of Zig's feature set and the compatibility of code between comptime and run-time
  • If only the Weak REP is satisfied, then safety is also impacted.

These discussions are important. Safety always is, of course, but also especially because Zig prides itself on being a quick-to-learn language with very few semantics.

Nonetheless, these questions are largely orthogonal to the Runtime-Equivalence Principle.

Assorted Examples

Here are a few snippets collected from some of the existing Issues:

1. Compatible, and Required: comptime Emulates the Target

For good reason, Andrew has made a strong point that side-effects from the host need to be kept out of comptime for coherent cross-compilation.

The Run-time Equivalence Principle requires precisely the same, and perhaps helps to explain further why this is so important. "Emulating the target platform" is required for abstract interpretation of the target executable.

2. Compatible, but not Required: No comptime var Globals + Inobservable execution order

The Run-time Equivalence Principle does not require these, but they are perfectly compatible with it, since they are restrictions of comptime powers.

At the same time, it's worth considering whether inobservable execution order is still worth holding onto, since the REP implies a weaker, but similar condition that the execution order may be observable only if it is uniquely determined by comptime-known information. In particular, it would be a permitted pattern to perform writes early in the program and then have these read later, where execution order becomes ill-determined. However, this can easily be accomplished

// TODO: Finish this one

Of course, if this does impact incremental compilation as mentioned in #7396, the team may still choose to keep this restriction for performance reasons, or even just to prevent "non-local" comptime effects which may be surprising to the user.

3. Compatible, but not Required: No Closure Over comptime var

The restrictions around comptime var closures are also compatible with the principle, since these are restrictions of comptime behavior. It does seem that the REP would resolve at least some of the nasty side effects, in this case.

4. Compatible, but not Required: fields of anytype may dynamically change their type over time

This is a comptime super-power with no run-time equivalent. As such, the REP passes. My personal opinion is that this appears more natural with the interpretation of the REP, to the extent that comptime is only static when it is observed.

Note: This is referenced in #5804

5. Compatible, but not Required: comptime fields should behave like comptime var (and vice-versa)

Technically, since comptime fields and comptime var agree for the powers they share, the REP does not require this. However, the spirit of the REP certainly suggests that this should be true. At run-time, variables and fields behave with essentially the same semantics. It seems intuitive to say that the same should hold for comptime members and fields.

i.e. comptime var should be allowed to be type-dynamic, just like comptime fields. Conversely, comptime fields should be allowed to vary over time, just like comptime var

6. Compatible (with Weak REP): Returning pointers to comptime variables

fn initArray() []u8 {
    var array: [10]u8 = undefined;
    for (array) |*elem, i| {
        elem.* = @intCast(u8, i * 10);
    }
    return &array;
}
var array = initArray();

As explained by SpexGuy #5718

Comptime variables are not stored on a stack, so returning a pointer to a local var at compile time is fine. This has been proposed in the past as a method to construct a comptime allocator.

This is a violation of the Strong Run-time Equivalence Principle. In particular, the behavior between comptime and run-time diverges in this case only for a program with Undefined Behavior. Since this is specified behavior for comptime and undefined behavior for run-time, this is allowed by the Weak Run-time Equivalence Principle.

In order to recover the Strong REP, we'd need a borrowing/ownership model that's able to statically reject this kind of program at run-time, or we'd have to ban this super-power (along with all implicit GC) from comptime.

Of course, having comptime functionality suddenly become unchecked Undefined Behavior when comptime is removed is certainly a foot-gun. This is what's given up when moving to the Weak version of the REP.