For all those who are interested, here's a quick summary of what I've been up to the last couple of months, with special attention to a couple of juicy theoretical and practical tidbits gleaned along the way.
What's the Direction?
Although my officially stated purpose to a lot of folks has been to get involved with scientific computing, I have been distracted by a deep theoretical "itch" in the PL design space.
As such, I've been taking some time to learn a little bit of a lot of things: A dash of category theory, static analysis, functional programming, effects/co-effects systems, etc. At times it is very exciting and really deep and interesting.
At other times, I feel like I should be slowing down and choosing "a hill to specialize on," but then the nagging feeling that there's a rock un-turned pulls me back to exploration again.
I think this process will be spinning down in the coming weeks and months, as my understanding of the field (and my personal options) consolidates. In any case, it's been great to explore the academic "theme park" and dig in enough of the innards to see how I might feel properly committing to it in grad school, for example.
How it started
Inspired by some of the innovation in the Julia community, I left Apple with a lot of curiosity about why/how it achieved its expressiveness and composability. Being an embedded engineer, I was also terribly curious to understand how a JIT seemed to be contributing to some of the most exciting areas of software development, including WebAssembly's Interface Types, Zig's comptime
, and Julia itself.
The key advantage has to do with two ideas which I expect will become recurring themes: specialization (to specific inputs, types, or usage patterns) and alternative interpretation (to imbue existing code with new ABIs, or new behavior).
How it's going
After a few months of agonizingly slow, but oh-so-exciting learning and exploration, I think some of the foray in the PL (programming language) design space may be paying off.
If you're curious in taking a look at the details, or seeing what goes into a technical proposal of this kind:
- (Minor) Proposal: Support runtime-erased
type
parameters - (Minor) Proposal:
parent
Pointer Attribute for safer polymorphism in Zig - (Major) Proposal: Run-time Equivalence for Zig's
comptime
Only one of these proposals was originally planned, to be honest. However, these are important groundwork before we can officially take a stab in the dark at The Big Problems
Getting to the Essence
I've been looking often to my Big Three programming language influences (Rust, Julia and Zig) to guide me to useful theory and idioms in the PL design space. To see how these pieces fit together, I have tried to boil down these languages to the simplest description of their fundamentals.
So far, I believe that I've identified:
- The essence of safety in Rust is compile-time memory management
- The essence of expressivity in Julia is dispatch which is open by default
- The essence of
comptime
in Zig is abstract interpretation
If you want more details, the third is one is covered in the Run-time Equivalence proposal for Zig above. The other two will be coming soon™.
The Science of "Alternative Interpretation"
The more traditional term for this is probably "Abstract Interpretation", but I prefer "alternative" because it makes it clear that these analyses don't have to be any more "abstract" than the underlying algorithm they are re-interpreting
Chris Rackauckus has a great blog post on this here
The concept of interpreting a program "alternatively" may seem very vague, but it has some very concrete applications:
- If you replace all
float
types in a program with(float, float)
and swap out the primitive math operations (+
,-
,*
,/
,sin
,cos
, etc.) for their primal + derivative versions, you just just (forward) auto-differentiated a program! - If you write an algorithm on the CPU but have an automatic way to translate it to the GPU, you just re-interpeted a program to specialize it for heterogeneous hardware!
- If you fix certain values at compile-time and partially evaluate that portion of the program, you've just created a "specializer" to compile down to an optimized version of your algorithm. This can be used, e.g., to make a really fast and concise FFT library
The big connection that made things "click" for me is that these interpretations are only alternative at all because we've implicitly fixed a set of "canonical" semantics at the bottom of our technological stack. Alternative interpretation is really about making it possible to swap out these semantics. That might mean:
- Translating the hardware primitives
- Targeting architectures with very different underlying semantics (e.g. CPU/GPU)
- Cross-compiling to different architectures or operating systems
- Automatically translating an algorithm to run in parallel or use SIMD
- Translating the software primitives
- Auto-differentation: a computation for
f(x)
is translated to one that computes(f(x), f'(x))
- Symbolic interpretation: a computation of
f(x)
is made to instead return a symbolic expression equivalent tof(x)
- Linear Algebra: By changing the ring (scalar operations) underlying a linear algebra library, we can use it do computation on graphs
async
: A computation is wrapped in a state machine so that it can execute independent and not block the thread- Lazy/Memoized: A function that computes a result directly can be translated to one that does no computation until it is used, or which automatically memoizes its results
- Auto-differentation: a computation for
- Changing the software modules
- I add a new module to my project and want it to work automatically with my custom data types. This requires re-interpreting that module
The unified view of program behavior asserts that the behavior of your CPU, your programming language, and the libraries built on top of it are fundamentally the same. Each layer designs a set of supported operations, defines their behavior, and exposes these to the layer above.
In theory, any of these can be swapped out or adapted to accomodate a desired change in the program semantics. In practice, software has spent a lot of time figuring out how to do this correctly in its own layers. As Moore's Law runs out, hardware is also necessarily diversifying its approach.
Of course, in order to be able to accomplish any of this, it's clear that our approach must be fundamentally compositional. In that vein, here's a few brilliant papers exploring alternative interpretation on a highly-compositional basis:
- Conal Elliot's The Simple Essence of Automatic Differentiation
- Melliès and Zeilberger's Functors are Type Refinement Systems
- Carette, Kiselyov, and Shan's Finally Tagless, Partially Evaluated
That's it for now!
Next time, I'll have a chance to dig into the The Big Problems. For now, the literature review and experimentation continues, focusing primarily on Dependent Types and Effect/Co-effect Systems.
Thanks to everyone for all of your support, ideas, and community!