Like, on a scale from c to rust?
For various common safety issues, we can look at protections that are present in software as it is typically shipped (ie excluding tools like AddressSanitizer that are not recommended for production use):
| issue | c | zig (release-safe) | rust (release) |
|---|---|---|---|
| out-of-bounds heap read/write | none | runtime | runtime |
| null pointer dereference | none | runtime⁰ | runtime⁰ |
| type confusion | none | runtime, partial¹ | runtime² |
| use after free | none | none⁴ | compile time |
| double free | none | none⁴ | compile time |
| invalid stack read/write | none | none | compile time |
| uninitialized memory | none | none | compile time |
| data race | none | none | compile time |
- ⁰ optional types
- ¹ tagged unions, doesn't protect against holding a pointer to a value while changing the tag
- ² tagged unions
- ³ not by default, but available via compiler setting or by linting against unchecked arithmetic
- ⁴ this is contentious, see counterarguments here and here
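To make the 'runtime' entries in the table a bit more concrete, here is a minimal sketch in rust of what those checks look like from the programmer's side. All of the names (`Value`, `as_int`, `as_text`, `checked_read`, `name_len`) are made up for illustration. The point is only that each mistake becomes a value or a panic the program can observe, rather than a silent read of the wrong memory.

```rust
// Hypothetical names throughout; this only illustrates the kinds of checks.

// Tagged union: the tag is stored alongside the payload and has to be
// inspected before the payload can be used.
enum Value {
    Int(i64),
    Text(String),
}

// Type confusion: asking for an Int when the value is Text yields None
// instead of reinterpreting the bytes.
fn as_int(value: &Value) -> Option<i64> {
    match value {
        Value::Int(n) => Some(*n),
        Value::Text(_) => None,
    }
}

fn as_text(value: &Value) -> Option<&str> {
    match value {
        Value::Text(s) => Some(s.as_str()),
        Value::Int(_) => None,
    }
}

// Out-of-bounds read: `get` checks the index at runtime. Plain indexing
// (`data[index]`) performs the same check but panics on failure.
fn checked_read(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

// Null dereference: an optional type has to be unwrapped before use.
fn name_len(name: Option<&str>) -> usize {
    match name {
        Some(s) => s.len(),
        None => 0,
    }
}

fn main() {
    let data = [10u8, 20, 30];
    assert_eq!(checked_read(&data, 1), Some(20));
    assert_eq!(checked_read(&data, 3), None);
    assert_eq!(name_len(Some("zig")), 3);
    assert_eq!(name_len(None), 0);
    assert_eq!(as_int(&Value::Int(7)), Some(7));
    assert_eq!(as_int(&Value::Text("7".to_string())), None);
    assert_eq!(as_text(&Value::Text("7".to_string())), Some("7"));
}
```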
There are two clear groups here:
- Spatial memory safety. Mostly runtime mitigations. Nearly identical in both zig and rust. These are easy to implement and probably sufficiently non-controversial that any new systems language will have similar features.
- Temporal memory safety and data race safety. Mostly compile time mitigations. Unique to rust. These are novel, non-trivial to implement and add a significant amount of complexity to the language.
So we can say that zig's spatial memory safety is roughly comparable to rust's, and its temporal memory safety and data race safety are roughly comparable to c's.
Zig does have some weak improvements over c with regard to temporal memory safety:
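For contrast, here is a rough sketch of the second group as it looks in rust. The function names (`consume`, `parallel_count`) are hypothetical. The use-after-free is left commented out because the compiler refuses to build it, and the shared counter only compiles because it is wrapped in types that are safe to share between threads.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Takes ownership of the buffer; the allocation is freed when `buf`
// goes out of scope at the end of this function.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
}

// Shared mutable state has to go through something like Arc<Mutex<_>>.
// Handing a plain `&mut usize` to several spawned threads is rejected
// at compile time.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    let buf = vec![1u8, 2, 3];
    let len = consume(buf);
    // Use after free, caught at compile time:
    // println!("{}", buf[0]); // error[E0382]: borrow of moved value: `buf`
    assert_eq!(len, 3);
    assert_eq!(parallel_count(4, 1000), 4000);
}
```

The cost is visible even in this tiny example: the counter needs two layers of wrapper types and an explicit clone per thread, which is the kind of friction the comparison is weighing.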
- The standard library includes a set of allocators which don't reuse allocations, preventing use-after-free, and which catch double-free. I'm not clear yet on how high the runtime and memory overhead are though, which will dictate when it is practical to use these in production.
- In c uninitialized variables are often used when they can't easily be initialized by a single expression. In zig it's possible to use a labeled block that returns the initial value, or to use an optional type and initialize it to null.
- The pervasive allocator api makes it easier to use arena allocation or garbage-collected pools which simplify lifetime management.
- errdefer simplifies resource cleanup inside complicated control flow, reducing the possibility of mistakes.
- Support for generics reduces the chances of casting mistakes.
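The idea behind allocators that never reuse allocations can be sketched with a toy pool. This is written in rust to keep the examples in one language, and it is not how zig's standard library allocators are implemented; `NoReusePool`, `Handle` and `FreeError` are names invented for this sketch. Because a freed slot is never handed out again, a stale handle can only ever observe 'freed', never someone else's data.

```rust
#[derive(Debug, PartialEq)]
enum FreeError {
    DoubleFree,
    UnknownHandle,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

// Toy pool: every allocation gets a fresh slot and slots are never reused.
struct NoReusePool<T> {
    slots: Vec<Option<T>>,
}

impl<T> NoReusePool<T> {
    fn new() -> Self {
        NoReusePool { slots: Vec::new() }
    }

    fn alloc(&mut self, value: T) -> Handle {
        self.slots.push(Some(value));
        Handle(self.slots.len() - 1)
    }

    // Use after free: a freed slot stays empty forever, so this is None.
    fn get(&self, handle: Handle) -> Option<&T> {
        self.slots.get(handle.0).and_then(|slot| slot.as_ref())
    }

    // Double free: the second call finds the slot already empty.
    fn free(&mut self, handle: Handle) -> Result<T, FreeError> {
        match self.slots.get_mut(handle.0) {
            None => Err(FreeError::UnknownHandle),
            Some(slot) => slot.take().ok_or(FreeError::DoubleFree),
        }
    }
}

fn main() {
    let mut pool = NoReusePool::new();
    let a = pool.alloc("first".to_string());
    assert_eq!(pool.free(a), Ok("first".to_string()));

    // A later allocation does not land in the freed slot.
    let b = pool.alloc("second".to_string());
    assert_ne!(a, b);

    assert_eq!(pool.get(a), None);
    assert_eq!(pool.free(a), Err(FreeError::DoubleFree));
    assert_eq!(pool.get(b).map(|s| s.as_str()), Some("second"));
}
```

The toy also hints at the overhead question: the slot table only ever grows, so some price has to be paid in memory or bookkeeping for never reusing an allocation.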
Zig also has a number of tools to help detect violations of temporal memory safety during testing. These are useful, but experience with c indicates that they won't be sufficient to eliminate vulnerabilities.
I tried looking at some public breakdowns of security issues from various projects written in c and c++ (mostly sourced from Alex Gaynor's handy summary) to get a sense of the relative frequencies:
- Android: ~75% spatial vs ~15% temporal (just eyeballing the pie-chart)
- Windows: Some of the categories don't map neatly to spatial vs temporal. If we assume that 'stack corruption' is always temporal but 'heap corruption' could go either way then we have 23-36% spatial vs 28-41% temporal for 2018. If we narrow down to exploited issues then it's 0% spatial vs 75% temporal.
- Curl: 45% spatial vs 7% temporal (the pie-chart breakdown is only for the 52% of security issues related to memory safety)
- 0day in the wild: Insufficient detail on most, but if we look just at those explicitly marked as 'use after free' then we have 5/25 in 2020, 5/21 in 2019, 6/13 in 2018 etc so possibly >50% temporal?
This isn't a very clear picture. The percentages vary wildly between projects. The categories are sufficiently vague that I could be classifying them all wrong. Looking only at fixed issues tells us nothing about how easy they are to exploit, but looking at existing exploits limits us to a very small dataset.
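For what it's worth, the arithmetic behind the 0day counts can be spelled out as below. The `percent` helper is just for this sketch, and the inputs are only the three years quoted above. Since these count only the issues explicitly marked 'use after free', each share is a lower bound on the temporal fraction for that year.

```rust
// Share of issues explicitly marked 'use after free', as a percentage.
fn percent(marked: u32, total: u32) -> f64 {
    100.0 * f64::from(marked) / f64::from(total)
}

fn main() {
    // (year, explicitly marked use-after-free, total issues listed)
    let by_year = [(2020u32, 5u32, 25u32), (2019, 5, 21), (2018, 6, 13)];
    for &(year, marked, total) in by_year.iter() {
        println!(
            "{}: {}/{} = {:.0}% explicitly use-after-free",
            year,
            marked,
            total,
            percent(marked, total)
        );
    }
}
```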
It certainly seems like just fixing spatial memory safety (going from c to zig) is a non-trivial improvement. But I'd like to better understand why actual exploits appear here to rely more often on violating temporal memory safety.
Rust bears additional complexity and friction to buy temporal memory safety and data race safety. But sometimes we might be able to buy those more cheaply eg:
- Systems that can approach temporal memory safety by:
- Systems that can approach data race safety by:
Sometimes we might also just choose to bear the cost. For systems with low risk profiles (eg internal software that is never exposed to hostile input) we might decide that debugging the occasional use-after-free is preferable to adding development friction.
There are certainly systems though where none of the above are options. For example, the web spec pretty much mandates that browsers must have complicated ownership models, use pervasive sharing between threads and be constantly exposed to hostile inputs. In such cases it's hard to make an argument for zig.