Today marks the end of my first quarter on GitHub Sponsors. Thanks to everyone who made this possible by giving money to some rando on the internet.
Here is what I did so far (some already public, some sponsors-only):
- An opinionated map of incremental and streaming systems. Mapping out the landscape of streaming and incremental systems, trying to create common terminology to piece together work from different communities that mostly don't talk to each other.
- Internal consistency in streaming systems (code here). I want to take seriously this idea of turning the database inside out. So I'm figuring out what consistency guarantees are needed to build entire app backends inside a streaming system, testing existing systems and categorizing failure modes. This is where the bulk of my time went this quarter - each system I added took a lot of studying and coding and debugging, even though only 3 made it into the final article. I also learned a lot that will be useful for later projects.
- Thoughts on benchmarking (WIP). Starting to plan how to benchmark streaming systems properly. Mostly looking at existing benchmarking efforts (few, mostly limited and/or flawed) and trying to figure out good representative workloads.
- Why isn't differential dataflow more popular? I think the ideas in differential dataflow are really compelling, but they've mostly been ignored. So I asked people for feedback on what it's lacking and what needs to be added. I got ~20 emails and ~100 forum comments which are summarized at the end.
- Wrote 3/N chapters for a short book about focus, a text editor built from scratch using only sdl and freetype. The goal is to keep it simple enough that you can read the code and the accompanying book in a few days and understand the entire thing well enough to fork and alter it. Much work remaining, but I already used it for all my writing and coding this year.
- How Materialize and other databases optimize SQL subqueries. Explaining how other databases optimize subqueries, how the approach I built at Materialize fills in the gaps and what problems are still open.
- MMIO in zig. A short case study of a cute api for dealing with memory-mapped IO registers in zig.
- How safe is zig? Looking at which aspects of memory safety zig already handles and comparing those to root cause breakdowns from various large C++ projects. Not particularly conclusive, and any discussion about root causes was swept away by the language flamewars caused by unwisely putting an easily skimmed summary above the fold.
- Wrote 8 editions of my sponsors newsletter. Contains random thoughts, work in progress, short discussions of interesting papers/articles/projects, and other miscellanea that can't justify a whole blog post. I'm not sure how much people care about this, but I already did all the thinking and reading anyway so I might as well write it down.
Ideas for Q2:
- Imp. How rich can you make a language while still being able to compile it to a structured streaming/incremental system or to a database query planner? The answer so far is pretty rich! But I have two different versions of the compiler which each support different subsets of the language. And the accompanying posts and interactive examples need to be updated to all refer to the same version of the language. It would only take a month or two to get this to the point where people can try it out.
- More focus. Many more chapters to write. I have some plans for incremental syntax highlighting and also want to write up some benchmarks I did of existing highlighters. Also hundreds of smaller features in my todo list.
- 'Differential dataflow for mere mortals'. Write a version of differential dataflow that is optimized for readability instead of performance and flexibility. Make an interactive animation of the execution and use it to explain how the damn thing works.
- Staged query compilers in zig. I previously did this in julia and got pretty good results. The main obstacles were limits on what kind of data could be lifted to compile time and struggling to force stack allocation for working state. Zig solves both of these and also makes it easy to compile the result to wasm, which I foresee being useful in the future.
- Explaining some open problems I've run into, e.g. why query compilation is hard in a streaming setting and how to handle errors in streaming/incremental systems.
Right now I'm making ~1900 CAD/month, not yet minimum wage but halfway towards being able to comfortably pay all my bills. Here is where the money goes.
If you want to see more like this and you can afford to pay for it, here's the link: