loader
Generating audio...

arxiv

Paper 2503.03698

AEGIS: Towards Formalized and Practical Memory-Safe Execution of C programs via MSWASM

Authors: Shahram Esmaeilsabzali, Arayi Khalatyan, Zhijun Mo, Sruthi Venkatanarayanan, Shengjie Xu

Published: 2025-03-05

Abstract:

Programs written in unsafe languages such as C are prone to memory safety errors, which can lead to program compromises and serious real-world security consequences. Recently, Memory-Safe WebAssembly (MSWASM) is introduced as a general-purpose intermediate bytecode with built-in memory safety semantics. Programs written in C can be compiled into MSWASM to get complete memory safety protection. In this paper, we present our extensions on MSWASM, which improve its semantics and practicality. First, we formalize MSWASM semantics in Coq/Iris, extending it with inter-module interaction, showing that MSWASM provides fine-grained isolation guarantees analogous to WASM's coarse-grained isolation via linear memory. Second, we present Aegis, a system to adopt the memory safety of MSWASM for C programs in an interoperable way. Aegis pipeline generates Checked C source code from MSWASM modules to enforce spatial memory safety. Checked C is a recent binary-compatible extension of C which can provide guaranteed spatial safety. Our design allows Aegis to protect C programs that depend on legacy C libraries with no extra dependency and with low overhead. Aegis pipeline incurs 67% runtime overhead and near-zero memory overhead on PolyBenchC programs compared to native.

Paper Content:
Page 1: AEGIS: Towards Formalized and Practical Memory-Safe Execution of C programs via MSWASM Shahram Esmaeilsabzali1, Arayi Khalatyan2⋆, Zhijun Mo1, Sruthi Venkatanarayanan1, and Shengjie Xu1 1Huawei Canada, Waterloo Research Centre shahram.esmaeilsabzali@huawei.com zhijun.mo@huawei.com sruthi.venkatanarayanan@huawei.com shengjie.xu@huawei.com 2University of Waterloo akhalaty@uwaterloo.ca Abstract. Programs written in unsafe languages such as C are prone to memory safety errors, which can lead to program compromises and seri- ousreal-worldsecurityconsequences.Recently,Memory-SafeWebAssem- bly (MSWASM) is introduced as a general-purpose intermediate byte- code with built-in memory safety semantics. Programs written in C can be compiled into MSWASM to get complete memory safety protection. In this paper, we present our extensions on MSWASM, which improve its semantics and practicality. First, we formalize MSWASM seman- tics in Coq/Iris, extending it with inter-module interaction, showing that MSWASM provides fine-grained isolation guarantees analogous to WASM’s coarse-grained isolation via linear memory. Second, we present Aegis, a system to adopt the memory safety of MSWASM for C programs in an interoperable way. Aegis pipeline generates Checked C source code from MSWASM modules to enforce spatial memory safety. Checked C is a recent binary-compatible extension of C which can provide guar- anteed spatial safety. Our design allows Aegis to protect C programs that depend on legacy C libraries with no extra dependency and with low overhead. Aegis pipeline incurs 67% runtime overhead and near-zero memory overhead on PolyBenchC programs compared to native. 1 Introduction Memory safety, which ensures that a pointer is valid and its dereference accesses a valid memory location, is of paramount importance in systems security. There is a huge amount of C code in modern software infrastructures. However, C semantics does not protect these codes from memory safety violations like buffer ⋆This author was a co-op at the Waterloo Research Centre at the time this research was carried out.arXiv:2503.03698v1 [cs.PL] 5 Mar 2025 Page 2: 2 Authors Suppressed Due to Excessive Length overflow or use-after-free. Therefore, a single out-of-bound pointer dereference can be, and has been, the root cause of critical security vulnerabilities. For example, the Heartbleed vulnerability [6] in 2014 would allow a remote attacker to dump server process memory stealthily, which can leak private keys and user data. There is a plethora of approaches to this problem [28], including compiler instrumentation [27,23,20], fuzzing [16,35], or rewriting C code into a memory safe languages, such as Rust [7,34]. However, there is no silver bullet to the problem yet because there is a trade-off between the performance, practicality, and security guarantees of different approaches. Recently, Memory-Safe WebAssembly(MSWASM) has introduced a new di- rection in memory safety enforcement for C programs [17]. MSWASM is based on WASM, an intermediate, portable bytecode format that C/C++ programs can be compiled into. WASM programs can run efficiently in secure sandboxes (e.g., browsers). However, WASM does not prevent memory safety violations within the program [17]. By extending raw pointers into MSWASM handles, which are unforgeable, integrity-protected fat pointers, MSWASM could provide complete memory safety for C. In addition, as an intermediate bytecode format, MSWASM allows the execution engine or source-to-source translator that im- plements it to decide the exact data layout of handles and the security policy to enforce for the same bytecode. This feature offers users the freedom to choose appropriate security-performance tradeoffs depending on a deployment scenario. While the original MSWASM paper established the foundation for mem- ory safety enforcement [17], there are still missing pieces for practical adoption. First, there is no mechanical formalization of the semantics of MSWASM and mechanized proofs of its memory safety properties, both of which are neces- sary for ensuring practical safety and security. The memory safety guarantees of MSWASM, especially spatial safety , which ensures lack of buffer overflow and underflow, has been proven informally, in the context of single-module semantics only.Inparticular,itdoesnotdealwithconsequencesofimportingandexporting segments of memory via handles belonging to different MSWASM modules. As a result, it is not clear how to formulate the isolation guarantees for real-world MSWASM modules with external dependencies. Second, the MSWASM proto- type cannot generate code that depends on legacy C libraries, because compiling a C program into MSWASM requires alldependent code to be compiled with the same toolchain; it does not allow for partial compilation of a C library or file. This significantly limits the usefulness of MSWASM for existing systems. In this paper, we present Aegis, which extends MSWASM both formally and practically in the following two ways: Formalism. We present a full formalization of MSWASM semantics in Co- q/Iris [10], and prove local handle isolation for MSWASM, a fine-grained notion of isolation comparable to the coarse-grained WASM isolation [25], proven re- cently.WeextendtherecentWASMsemanticformalizationandtheorems[25,31] toincludeallfeaturesofMSWASM,specifically,theinter-moduleinteraction.We also provide a proof of our compilation from MSWASM to Checked C, thereby guaranteeing the spatial safety of our generated Checked C code. Page 3: Title Suppressed Due to Excessive Length 3 Aegis Pipeline. We present our toolchain for translating an MSWASM module to efficient, spatially-safe Checked C code. Checked C [26] is a binary-compatible extension of C that introduces a minimal type system to annotate pointers with bound information (referred to as checkedpointers). The Checked C com- piler either reasons that a checked pointer dereference is safe or adds dynamic bound checking, which can potentially be removed by compiler optimizations. The compilation pipeline involves (1) compiling the original C programs into MSWASM, (2) running a “MSWASM to Checked C” (M2C) code generator to emit Checked C sources, and (3) compiling and linking the generated Checked C sources with legacy C code to produce the executables. Incorporating MSWASM as the intermediate bytecode format simplifies formalization of spatial memory safety over the entire compilation process, especially when the final executable contains code from multiple parties. The M2C code generator performs type- preserving compilation [18], where well-typed MSWASM code is translated into well-typed Checked C code. To bridge the difference in pointer representation between MSWASM-protected code and legacy C code when they call functions in each other, the code generator emits wrapper functions that convert argu- ments and return values for function calls. In addition, we implemented several features missing from MSWASM prototype in Aegis, including support for global variables and sliceinstructions for spatial safety on stack. Lastly, we prove our translation algorithm produces spatially-safe Checked C code. We have benchmarked Aegis pipeline against PolyBenchC and compared the overhead against Rust backend of MSWASM. Our runtime overhead against native C code is 67% on average, with memory overhead being negligible. The Rust implementation of MSWASM has runtime overhead of 96% with a massive memory usage overhead compared to Aegis. Our contribution in this paper is twofold: – Semantic formalization of MSWASM in Coq/Iris: We mechanize MSWASM semantics in Coq/Iris. We state a theorem that characterizes the isolation guarantees and pointer integrity of MSWASM. We also prove our translation from MSWASM to Checked C produces spatially-safe code. – Spatial memory safety for C programs with MSWASM: We present a toolchain to protect C programs via MSWASM and Checked C. We make our tools, source code, and formalizations all available.3 The remainder of the paper is organized as follows. Section 2 provides back- ground on MSWASM and Checked C. Section 3 presents our formalization of MSWASM. Section 4 discusses the Aegis pipeline, including the overall design and the implementation based on the LLVM and WABT fork for MSWASM. Section 5 contains the evaluation for our implementation. Section 6 discusses related work in memory safety and relevant formal proofs. Finally, Section 7 concludes the paper. 3https://doi.org/10.5281/zenodo.14289850 Page 4: 4 Authors Suppressed Due to Excessive Length 2 Background 2.1 WASM and MSWASM WebAssembly(WASM) [9] is a binary format for a stack-based virtual machine with load/store architecture. It is designed to be compact, portable, secure, and efficient for web applications and beyond. It provides a coarse-grained memory isolation model where all application data are allocated in a contiguous linear memory and all memory accesses are checked against the bounds of this linear memory, hence ensuring memory isolation. In addition, it utilizes type checking to ensure safety of the WASM stack. It also provides lexical structure at the assembly level to prevent control-flow related attacks. While WASM prevents a buggy application from tampering memory of other applications using the linear memory sandbox, it does not stop one data struc- ture to overflow another in the same insecure application. Recently, memory safeWASM (MSWASM) was introduced, which is an extension of WASM that provides this missing fine-grained memory safety protection [17]. Fig.1: MSWASM Syntax and Semantic of WASM with MSWASM Extension highlighted [17]. We add h.null, shown in pink, to the formal syntax to facilitate our proofs. Types τ:=i32| i64 | f32 | f64 | handle Value v:=...|⟨nbase, noffset, nbound, bvalid, nid⟩ Instr. i:=const v| ... |segload τ|segstore τ| slice | segalloc | h.add | segfree | h.null Byte b∈ {0..28−1} Lin. Mem H:=b∗SegID nid:=i32 Store Σ:=(H,T,A) Segments T:=i32,→(b, t)Tags t:=H|D Allocator A:=nid,→(i32, i32) Syntax and Semantic. MSWASM extends the WASM syntax with six opera- tions and a new type handle, as shown in Figure 1. A handle is a new type that is essentially a fat pointer. A handle consists of four integers (a lower bound, an upper bound, a current offset, and an allocation ID), as well as, one validity byte, which denotes whether the handle is valid and active or not. Besides the normal pointer operations (e.g., pointer arithmetic) over handle, MSWASM also defines a sliceoperation to shrink the bounds of a handle. The handle structure together with its operations ensure complete spatial (lack of overflow, underflow, etc.) and temporal (lack of use after free, etc.) memory safety. When a viola- tion happens, MSWASM semantics captures it and terminates. For ease of our formalization, we add “h.null” syntax, to represent a null handle. MSWASM extends the WASM semantic with the notion of segment Tand segment allocator A. A segment is a piece of memory disjoint from the linear memory in WASM. Handles can only point to segments, and can also be serial- ized/deserialized in the segments. Semantically, every byte of every segment has amemory tag that determines if it is data or handle, disallowing forging a new handle from data bytes. Page 5: Title Suppressed Due to Excessive Length 5 2.2 Checked C Checked C [26,29] is a dialect of C that adds spatial memory safety protection to C. A compiled Checked C program is binary-compatible with other binaries on the same platform. Checked C enhances C with a type system to declare a bound for a pointer, hence creating a checkedpointer. All checked pointers will be bound-checked for spatial safety at dereference time. For each bound check, the compiler will first try to prove that the check never fails and can be removed in binary; if not, it adds runtime checks that could be hoisted or optimized away by compiler optimizations. Therefore, we choose Checked C as the backend of Aegis for its C compatibility and low-overhead spatial safety enforcement. Pointer T* pcan be defined as a checked pointer pointing to an array as: 1array_ptr<T> p: count(n) Where pis a checked pointer of type Twhose lower bound is pand whose upper bound is p + sizeof(T) * n .ncan be variable, structmembers, or an expression. Besides count,byte_count syntax can be used to specify the bounds size in bytes. Besides pointers to arrays, Checked C uses ptr<T>for pointers to a single object (i.e., address-of type pointers) that cannot have arithmetic, as well as, nt_array_ptr<T> for pointers to null-terminated arrays. Checked C allows to gradually port C code to spatially-safe, typed Checked C code. A piece of code can be marked as “checked” or “unchecked” at different levels of granularity. Checked C’s Blame Theorem guarantees that if a piece of code is checked, then that piece of code cannot be “blamed” for any spatial safety violation [26]. When there is a mix of checked and unchecked code, it is not possible, in general, to guarantee spatial safety. Our tool, by virtue of MSWASM’s semantic being spatially-safe, automatically generates checked code from the original C code, except for variadic functions that are currently not supported in Checked C and outgoing calls to those external C functions that wecannotanalyzeorinstrument.Thus,ourtoolgeneratesspatially-safeChecked C code from MSWASM code, subject to its limitations. 3 Formal Semantics of MSWASM in Coq Semantics of MSWASM is proven memory safe [17], but not mechanically using a formal tool. We have developed a mechanical formal semantics of MSWASM together with formal proofs about its isolation guarantees. Our formalization is based on WasmCert-Coq [31] and Iris-Wasm [25]. In the rest of this section, we briefly explain our results, focusing on module systems of MSWASM and how its semantics correspond to our main result on MSWASM isolation guarantee. We decorate Coq formalized definitions, theorems, or examples with . We provide our full formalization. Page 6: 6 Authors Suppressed Due to Excessive Length 3.1 Extension for MSWASM on Module Interaction We first briefly recall the WASM module design in Figure 2. A module has four components: functions f∗, linear memory mem∗, (function) tables tab∗, and global variables glob∗. A WASM module will be instantiated to an instance as its runtime representation with these four components. The global variables are statically typed, like WASM stacks. Each component can be imported, ex- ported, or initialized during (MS)WASM module instantiation . Once the module instantiates, the starter function will be automatically invoked if there is one. Fig.2: (MS)WASM Module Syntax, adapted from [25] (glob. ty.) tg :=mut τ|immut τ(import type) imptdes :=func in | ... | globitg (import) impt :=import imptdes (export type) exptdes :=func en | ... | globen (export) expt :=export exptdes (globals) glob :=global tg i (modules) m:=module f∗glob∗tab∗mem∗impt∗expt∗ The MSWASM original paper [17] considers the scenario where only one module is active at a time, and thus the imports and exports are excluded from their formalization and implementation. We present our formalism that includes multiple modules. Compared to WASM, MSWASM allows a module to import and export a handle, but it is important to notice that a handle cannot be stored in the linear memory, and thus the only way to import and export a handle is to make it a global variable. Our isolation proof for MSWASM is defined incrementally over the structure of MSWASM syntax, using the method of logical relation [5]. 3.2 Type Safety in MSWASM This is the first step in our incremental proof. We extend the Iris-Wasm’s unary logical relation with MSWASM features and derive a strengthened type safety compared to [17,31]: arbitrary syntactically well-typed partialprograms can ex- ecute successfully after proper linking. Then we can derive the type safety for complete programs. Theorem 1(Type Safety for Closed Expression) A syntactically well- typed closed expression can never reach a stuck state during its execution, where stuckness represents any error including spatial safety. Thanks to the design of WASM, this type safety already covers the safety of the MSWASM stack. Please note that trapis a well-defined behavior in WASM, that is not considered “stuck”. Next, we lift type safety to the module level, based on the module level formalism in Iris-Wasm. Page 7: Title Suppressed Due to Excessive Length 7 Theorem 2(Type Safety for Closed Module) A syntactically well-typed closed module (with a starter function) can never reach a stuck state during its execution. 3.3 Memory Safety in MSWASM [17] provides a memory safety proof based on a monitor machine , covering the spatial and temporal safety of the MSWASM. However, this safety property can- not capture the isolation andhandle integrity properties relevant for MSWASM: it is possible for a spatial and temporal safe language to allow forging handles. We illustrate how our mechanization can help fill this gap and strengthen the formal guarantee of MSWASM. First, we need some terminology. Definition 1(Reachable , and Maximal Reachable ) We say a segment Sis reachable from an active handle h, denoted as a relation reachable (h,S) when –we can access Sfrom handle h –ifhandle hcanaccessanotherhandle h′andreachable (h′,S′),then reachable (h, S′) –reachable (h,S) and reachable (h,S′) implies reachable (h,S∪S′) Wedefinethemaximalreachable Reachable (h,S)asthelargest Ss.t.reachable (h, S) holds. Theorem 3(Local Handle Isolation) –Given arbitrary syntactically well typed function f, –given a set of well-typed arguments vsforf, –if we can separate the whole memory M=L˙∪Einto two disjoint parts, where we call Llocal memory (ornon-exported memory ) and Eexported memory, –ifReachable (vs,E) then during and after the function call f(vs), the local memory Lis unchanged. Inabovetheorem, exported memory isdefinedasthe(maximum)reachablemem- ory from the function inputs vs; and local memory is just the rest of the memory. Our theorem can explicitly show the invariance of the local memory during the execution of f. In other words, this property means “ the execution of fwill never write the non-exported memory ”. Another way to look at Theorem 3 is to consider it as a safety property [12], because its violation can be detected by monitoring modifications in local mem- ory in a finite trace. More importantly, this safety is robustagainst arbitrary (even malicious) linking. We demonstrate this using a concrete example (the theorem holds for all cases). Example 1 Given an arbitrary syntactically well-typed function ⊢g: []→ [](an arbitrary function representing an “attacker”), and a parameter n, we have the following code: Page 8: 8 Authors Suppressed Due to Excessive Length 1(const 4); (segalloc); (local.tee 0); ;;Allocation;; 2(const n); (segstore i32); ;;Store n;; 3(call g); ;;Hand over to attacker ;; 4(local.get 0); (segload i32); ;;Retrieve the stored value;; 5(const n); (sub i32); (eqz i32); ;;Check Modification;; 6(if [const 1] [const 0]) ;;Return 0 if modified, otherwise 1;; Basically, we allocate local memory to store n, and we check if this local memory has been modified after the execution of the attacker, g. Using our mechanical formalization and proof, we can formally show that this piece of code will either terminate with 1 or trap (during the execution of g). That means an arbitrary attacker, g, cannot modify the local memory of the caller. This mechanized example further illustrates the robustness and isolation against arbitrary (well-typed) attackers. A corollary of Theorem 3 is handle in- tegrity: it is not possible for an attacker gto forge the local handle in the caller, or else it could have consequently modified the caller’s memory. We also accom- pany our mechanization with several other examples, including those involving modules, to illustrate the usage of our logical relation model. Finally, we can lift Theorem 3 to the module level, by considering the imports and exports as inputs and outputs. In this theorem, we will ignore the import of the table and linear memory, and focus on the impact of the segments. Thus, we need to extend the definition of reachable for function closures. The details can be found in the mechanized proof. Theorem 4(Local Handle Isolation for Module) –Given an arbitrary syntactically well typed module Frequiring no imports on linear memory and tables, –given a set of well-typed arguments vsas imported global variables –and a set of closures csas imported functions, –if we can separate the whole memory M=L˙∪Einto two disjoint parts, where we call Llocal memory (ornon-exported memory ) and Eexported memory, –ifReachable (vs,E1) and Reachable (cs,E2) and E=E1∪E2 then during and after the module instantiation F(including the execution of its starter function) using imported vsandcs, the local memory Lis unchanged. 4 Aegis Pipeline Section 3 highlights the protection guarantees of MSWASM. In this section, we present the design and implementation of our Aegis pipeline that preserves the formalism’s protection guarantees when it comes to spatial safety. Aegis pipeline implements the spatial memory protection for the input program by Page 9: Title Suppressed Due to Excessive Length 9 generating code in Checked C and resuming the build process using the modified Checked C compiler, which can compile both C and Checked C code. Compared to other memory safety schemes using runtime checking, Aegis has a special focus on preserving memory safety formal guarantees. Compared to existing C to Checked C solutions that convert raw pointers to checked pointers in Checked C, Aegis pipeline converts them into fat pointers instead, losing the source code compatibility and efficiency in exchange for complete automation and coverage. We make our entire implementation available. Source codeMSWASM binary CheckedC sourceExecutable binaryC2M MSWASM binaryM2CCheckedC compiler MSWASM binary… … Extra source files and binaries Fig.3: Overview of AEGIS Pipeline Figure 3 gives an overview of the pipeline. First, the user would use a C-to- MSWASM (C2M) compiler to produce the MSWASM binary. The original C2M compiler [17] does not emit correct code for C programs using global variables and other unsupported features; they should be compiled with our modified C2M. The output MSWASM binaries are the counterparts of object files, and if there are multiple source files, all MSWASM binaries are linked together4. Next, the user runs our MSWASM-to-Checked-C code generator (M2C) that emits Checked C sources and headers from MSWASM files. After that, the user can use the Checked C compiler to compile them as regular (Checked) C programs and resume normal build procedure. Each of these tools can be used separately; for example, M2C can be applied independently to a valid MSWASM module. If there are additional source code and binaries, the user can add them when invoking the Checked C compiler. This pipeline allows C programs to benefit from the MSWASM as a se- cure bytecode format without losing C interoperability. The user can guarantee that all code in the input MSWASM files are spatially protected, even if some MSWASM files are produced by third party vendors. This property is especially useful when there are multiple code owners/vendors. Assume Alice develops one component of a system and delivers it to Bob, who in turn collects binaries from multiple parties, integrates them together, then ships the executable to customers. Previously, Alice ships native binaries to Bob, and Bob cannot ver- ify the security of the binary unless there is some non-trivial binary analysis in 4Currently, M2C requires the input MSWASM files to remain relocatable (i.e., pre- serving the “linking” section) if they depend on symbols from native code (e.g., libc). Page 10: 10 Authors Suppressed Due to Excessive Length place. With the Aegis pipeline, Alice can ship MSWASM binaries to Bob, and Bob can compile this binary into native code with confidence that the delivered code has spatial memory safety protection. Besides converting the entire program, Aegis also supports protecting part of the program (e.g., a vulnerable library) by selectively compiling the selected files through Aegis pipeline and writing wrappers for functions defined in them. Writing wrappers that export Aegis-protected functions to native C code still requires some programmer effort at the time of writing. In summary, our implementation includes: –Modifying C2M to support various missing features, including C global variables and fine-grained spatial safety for stack via “slice” instruction in MSWASM; and to fix bugs; –Creating M2C code generator, which is based on the wasm2ctool in the WABT from MSWASM fork but with the code generation section rewritten; and lastly –Modifying Checked C compiler, mainly, to support customizable error han- dling upon spatial safety violation. Limitations Currently, function pointers in native C and MSWASM has different representations and thus cannot be used across the domains. Aegis does not support programs passing pointers/handles between native C and MSWASM through shared memory in the general case, and requires programmer effort to write wrappers. However, Aegis automatically converts pointers passed through function call/returns in best efforts. 4.1 Aegis Pipeline In Action Next, we use a descriptive example of how Aegis works. We have also developed a pen-and-paper formalization for proving the correctness of our M2C; a technical report for this proof is included in our provided artifacts. Figure 4 shows a simple example of how Aegis pipeline converts C code to MSWASM and then to Checked C. The green, blue, and red boxes demon- strate how different parts of original C code are transformed into MSWASM, and Checked C. As a stack machine, MSWASM instructions consume values on the stack and push results back to the stack. We annotate the sub-expression the code is evaluating for the blue box ( &array[index] ) and code after it (stor- ing value 1 to the address), for readability. MSWASM maps function arguments to local variables, and local.get 0 /1in MSWASM pushes destptr and size in C, correspondingly. In the green box, MSWASM converts the malloc() to new_segment that pops a size and pushes the resulting handle into the stack. The three instructions before new_segment is computing sizeleft shifted by 2 (which issizeof(int)*size ). The handle_store after them writes the arraypointer to destptr. The code inside blue boxes computes the address of array[index] . local.get 3 pushes arrayto the stack. Instructions from local.get 2 toi32.shl compute index*sizeof(int) as the offset for the pointer arithmetic. handle.add Page 11: Title Suppressed Due to Excessive Length 11 00008d func[0] <foo>: local[0] type=handle local.get 0 local.get 1 i32.const 2 i32.shl new_segment local.tee 3 handle.store 5 0 local.get 3 local.get 2 i32.const 2 i32.shl handle.add i32.const 1 i32.store 2 0 endint* tmp = &array[index ] *tmp = 1void w2c_foo(Handle w2c_p0, u32 w2c_p1, u32 w2c_p2) { … FUNC_PROLOGUE; u32 w2c_i1, w2c_i2; Handle w2c_i0 = (Handle){…}, w2c_i1_0 = (Handle){…}; w2c_i0 = w2c_p0; w2c_i1 = w2c_p1; w2c_i2 = 2u; w2c_i1 <<= (w2c_i2 & 31); w2c_i1_0 = (Handle){.data = calloc <uint8_t>(w2c_i1, sizeof (uint8_t)), . data_size = w2c_i1, .offset= 0}; w2c_l3 = w2c_i1_0; handle_store (w2c_i0, w2c_i1_0, 0); w2c_i0 = w2c_l3; w2c_i1 = w2c_p2; w2c_i2 = 2u; w2c_i1 <<= (w2c_i2 & 31); w2c_i0.offset += (int32_t)(w2c_i1); w2c_i1 = 1u; i32_store(w2c_i0, w2c_i1, 0); FUNC_EPILOGUE;}voidfoo(int** destptr, intsize, intindex) { int* array = ( int*)malloc( sizeof(int)*size); array[index] = 1; *destptr = array; } Fig.4: Code example (C to MSWASM to Checked C) adds them together. Finally, the code after the blue boxes dereference the re- sulting pointer and stores a constant value 1to it. M2C translates MSWASM functions similar to macro expansion: each in- struction maps to a Checked C code sequence implementing its semantics, and M2C uses a big switch statement inside a loop that pattern-matches instructions to convert each function. We also rewrote code sequences to replace the use of WASM linear memory with MSWASM’s segmented memory, as linear memory is represented as segment itself in MSWASM semantics. 1typedef struct Handle{ // defined in runtime library 2 long data_size; // n_bound, max size 3 long offset; // n_offset, current offset from base address 4 _Array_ptr<uint8_t> data: count(data_size); // n_base, base address 5} Handle; // T* ptr, *ptr --> Handle h, *((ptr<T>)(h.data + h.offset) 6 7static inline void handle_store(Handle h, Handle value, int offset) { 8 dynamic_check(sizeof(Handle) <= h.data_size); 9 dynamic_check((unsigned long)(h.offset + offset) <= 10 (unsigned long)(h.data_size - sizeof(Handle))); 11 *((ptr<Handle>)(h.data + h.offset + offset)) = value; 12} Listing 1: Handle definition and handle_store() Implementation Page 12: 12 Authors Suppressed Due to Excessive Length Listing 1 shows how M2C implements MSWASM handle and memory ac- cesses. We implement Handleas a struct with a checked pointer and bounds information. We implement all memory access related instructions using handles, for example, dereferencing the handle hto type Tbecomes *((ptr<T>)(h.data + h.offset)) . There are various load and store operations in MSWASM, which areimplementedsimilarly.Forexample, handle_store(h, v, offset) inListing1 stores the value of handle vinto where handle hpoints. Because (MS)WASM memory access instructions take an additional offsetvalue, the code adds this offsetas well when dereferencing a handle. Because all generated Checked C code is in the checked scope, the Checked C compiler will either try to prove that the access is statically safe, or it will insert runtime bounds checking to guaran- tee spatial safety. The Checked C dynamic_check() calls express necessary and sufficient conditions for spatial safety. If Checked C compiler can prove that a dynamic_check() condition always holds, then it is removed in runtime, otherwise the compiler will insert runtime checks to enforce it. When the MSWASM program contains calls to functions defined outside, M2C assumes the callees are native C functions and generates wrapper func- tions to handle calling convention differences between MSWASM and native C. The wrapper functions convert C pointers and MSWASM handles for pointer arguments and return values, and recovers struct’s passed by value in generated Checked C source. Because C pointers do not contain bounds information, the current implementation creates handles with infinite bounds, bypassing bounds checks on them. Our implementation allows users to manually prepare the wrap- pers so that tighter bounds can be specified using human knowledge. The wrap- per functions are defined with weak linkage so that if another MSWASM module has a strong definition, it takes priority during linking. 4.2 Additional Features Besides the M2C code generator, we modified C2M and Checked C compiler to support several language features and use cases while observing the MSWASM semantics in the pipeline. Global variable handling While (MS)WASM has “globals”, they cannot be used for C global variables because they are addressed by indices and it is not possible to take their address. Therefore, in the (MS)WASM implementation, global variables and constant data (e.g., constant string literals) are compiled to ordinary data symbols in a single data segment reserved for them. However, their initializers are emitted as WASM data segments, which, in the MSWASM formalism, should be mapped to WASM linear memory. In addition, accesses to global variables are not spatial-safety protected because they do not use handles. Inourimplementation,wechangedthecodegenerationforglobalvariablesin theC2Mcompilersothat:(1)abackend-specificIRpasscreatesanewMSWASM global handle for each global variable declaration and emits initialization func- tions to initialize the handle; and (2) the Instruction selection (ISel) lowers all Page 13: Title Suppressed Due to Excessive Length 13 global variable references as loading and dereferencing the global handle. The initialization functions will be executed once during the module initialization according to WASM semantics. In addition, if M2C finds such a global handle declaration in the input MSWASM binary, it will emit C global variable decla- rations or definitions in the output Checked C code, and initializes the handles with the address of the global variables. This ensures that the C source code in Aegis pipeline can reference global variables defined outside and vice versa. Bounds narrowing MSWASM includes the slice instruction, with the usual semantics of returning a sub-handle from a given handle. However, the instruc- tion is not implemented in the original MSWASM toolchain. In addition, uti- lizing the slice instruction for spatial protection requires computing the change to lower and upper bounds, which requires a sequence of instructions to im- plement. This would result in code bloat and it is inefficient in practice. In- stead, we adopt CHERI’s CSetBounds instruction semantics and implemented handle.setbounds instruction that takes (1) the handle for bounds narrowing, and (2) the new length that the handle is allowed to access. This instruction can be shown to be equivalent to the semantics of slice in MSWASM. This in- struction allows us to protect access to stack objects: when the address of an stack-allocated object is taken, we emit a handle.setbounds instruction so that the result pointer carries bounds to cover only the referenced object instead of the entire stack. This new feature provides spatial safety for stack. Integer-to-pointer cast C programs written for embedded systems sometimes need to read or write at specific hard-coded addresses for I/O or other low-level system operations. These accesses becomes integer-to-pointer cast followed by pointer de-references in the LLVM IR. To support these programs while main- taining the formalism of MSWASM, the MSWASM module imports a special handle _physical_memory5that points to a segment representing the underly- ing address space, and integer-to-pointer casts are lowered as pointer arithmetic on this handle. We modified M2C and Checked C compiler as well to support recovery of such operations on access to _physical_memory . 4.3 Case Studies Wehaveappliedourtoolchaintosecurereal-lifevulnerabilities.Wehaveusedour tool in both: (i) “whole-program” mode, where all source codes are transformed into Checked C code; and (ii) “partial-program” mode where only a selected part of a system (e.g., a file) is transformed. We have also created a cross-compilation toolchain that enabled us to port our toolchain to an embedded domain. We now briefly discuss how we used Aegis to treat two well-known critical spatial safety vulnerabilities: (i) Heartbleed vulnerability in OpenSSL (CVE- 5The mechanism is not limited to bare-metal programs; it can be used in virtual address space Page 14: 14 Authors Suppressed Due to Excessive Length 2014-0160); and (ii) git log vulnerability in Git (CVE-2022-41903). Both vul- nerabilities happen on the heap, due to missing bound checking in memcpycalls. The former is a buffer overflow vulnerability. The latter is a buffer underflow vulnerability, itself the result of integer overflow.6In both cases, we use Aegis selectively over the vulnerable parts of the code. The rest of the code commu- nicates with the transformed code via wrappers that the tool provides which we customize. The main effort went into parameters of the vulnerable function that included pointers to structs that needs to be serialized/deserialized to the corresponding handle generated in Aegis Checked C. While Aegis automatically generates wrappers for interfacing, it is necessary to manually specify bounds for raw C pointers which are converted to handles in Aegis code. We also make minor modifications to include Aegis transformed code in the build systems of OpenSSL and Git. In both cases, our fix effectively secured the code against the known vulnerabilities, with little effort. The mechanism by which Aegis secures the faulty memcpycalls is straight- forward: A pointer in the original C code is eventually transformed into an MSWASM handle, and subsequently a Checked C pointer, which guarantees to capture any access beyond its static low and high bounds. In Aegis, we have two options for implementation of secure memcpy: –The libc implementation of memcpycan be compiled via C2M and M2C to produce an equivalent version of memcpythat accepts handles: memcpyprototype via Handles 1 Handle w2c_memcpy(Handle w2c_p0, Handle w2c_p1, u32 w2c_p2) –The call to memcpyis considered an external call to a libc function by Aegis, allowing us to provide a Checked C interface of memcpythat accepts checked pointers instead of raw pointers: memcpyprototype via Checked C pointers 1itype_for_any(T) void *memcpy( 2 void * restrict dest : itype(restrict array_ptr<T>) byte_count(n), 3 const void * restrict src : itype(restrict array_ptr<const T>) byte_count(n),size_t n) : itype(array_ptr<T>) byte_count(n); where the above prototype in Checked C asserts that the source and des- tination buffers can be checked pointer whose sizes are bound by the last parameter of function, n. The call goes through WASI interface.7 Both approaches work in Aegis, with the second option being faster but would require the M2C compiler to provide the proper checked pointers for Checked C version of memcpy. 6https://www.x41-dsec.de/static/reports/X41-OSTIF-Gitlab-Git-Security-Audit- 20230117-public.pdf 7https://github.com/WebAssembly/WASI Page 15: Title Suppressed Due to Excessive Length 15 5 Evaluation We evaluate our Aegis pipeline to measure its impact on the performance and binary size of the protected programs. We use PolyBenchC8for evaluation, which is commonly used for evaluating WASM compilers. This benchmark is also used by the original MSWASM paper when evaluating their Rust backend for MSWASM [17], which itself is based on rWasm [4]. We run all programs on a Ubuntu 22.04.2 system with an Intel i9-9900K CPU. We compare the protected versions of programs generated by the following three tools against native: 1. Aegis pipeline: We use C2M (Clang version 11.0.0 with -O3), M2C and Checked C (Clang version 12.0.0 with -O3). 2.rWasm bb: We use the Baggy Bounds [1] implementation of rWasm, which converts MSWASM to Rust [17]. This technique stores memory as a single vector of bytes, and an allocator allocates segments which allow for faster bound checking. Handles are 64-bit values storing an offset in memory and thelogofthesegmentsize(roundeduptonearestpoweroftwoatallocation). 3.rWasm seg: We use the segments implementation of rWasm [17]. This stores segment memory as a vector of segments, where each segment is a vector of bytes, thus automatically enforcing spatial safety, thanks to built-in spatial safety in safe Rust. However, this implementation also provides temporal safety - we manually turn off this feature to accurately measure overhead for only spatial safety. 5.1 Performance Impact We compare the runtime and memory usage of PolyBenchC programs protected by the three tools above. To evaluate the runtime increase, we use PolyBenchC’s builtin timer, which outputs execution time using gettimeofday function. There- fore, the runtime numbers exclude startup and shutdown phase of the programs. This decision may make the reported runtime overhead smaller than the end- to-end measurement, but it facilitates more fair comparison among the tools we evaluate. To measure the memory usage, we retrieve the peak virtual ( VmPeak) and physical ( VmHWM) memory usage from Linux procfile system. Figure 5 shows the runtime and memory usage of each versions. Figure 5a shows the normalized runtime usage. The line for 100% repre- sents the native execution, and bars above 100% represent the overhead portion of the results. When compared to native, Aegis has 67% geomean overhead, rWasm bbhas 97% overhead, and rWasm seghas 117% overhead. Aegis has the least overhead, thanks to Checked C compiler optimizations and our efficient implementation of MSWASM runtime. rWasm bbhas better performance com- pared to rwasm segsince rWasm bbuses the Baggy Bounds technique which has a reduced cost for bounds checking at the cost of higher memory consumption. Figure 5b shows the virtual and physical memory usages. Aegis incurs near- zero memory overhead in all programs because it requires no changes on object 8https://github.com/MatthiasJReisinger/PolyBenchC-4.2.1 Page 16: 16 Authors Suppressed Due to Excessive Length 0%100%200%300%400%500%600%700%800%900%1000%Aegis rWasm_bb rWasm_seg (a) Normalized runtime 5.27E+05 050000100000150000200000250000300000V_Native V_Aegis V_rWasm_bb V_rWasm_seg P_Native P_Aegis P_rWasm_bb P_rWasm_seg (b) Peak memory usage (KB), stacking Virtual (dark) with Physical (light) Fig.5: Runtime and memory usage. allocations and no extra dependency. rWasm segincurs 5% geomean overhead comparedtonativeexecutionbecauseoftheRustimplementation. rWasm bbhas the highest memory overhead, ranging from 87% to 272% compared to native for virtual memory usage and 109% to 6430% for physical memory. This is because it allocates objects in power-of-two sized bins. 5.2 Executable Size Because PolyBenchC programs have similar code sizes and all native executa- bles are between 17KB and 21KB, we discuss average executable sizes only. Ex- ecutables produced by the Aegis pipeline is 29KB on average, 70% larger than the native versions. In contrast, the average executable sizes for rWasm bband rWasm segare 35,375KB and 35,650KB respectively. This massive code bloat is from the libraries Rust statically linked into the executables. 6 Related Work 6.1 Pointer-based Spatial Safety Protection Fat pointer schemes, such as Aegis, store a pointer and its bounds next to each other to achieve locality, and thus efficiency. CHERI [32] architecture introduces Page 17: Title Suppressed Due to Excessive Length 17 rWasmbbrWasmseg Aegis100101 Fig.6: Normalized slowdown against native (log scale); cf. Figure 13 in [17]. dedicated capability registers and instructions to accelerate fat pointer oper- ations. MSWASM is designed with CHERI backend in mind. Besides storing bounds inline with pointers for efficiency, schemes can also store them out-of- band for better compatibility with uninstrumented binaries [23,20]. Fat pointer schemes can enforce fine-grained spatial safety at the cost of compatibility. To overcome the incompatibility of memory layout by fat pointers, early works store the pointer metadata in disjoint memory locations (i.e., shadow memory) [21,19], incurring high performance overhead. More recent work on tagged pointers squeeze metadata into the unused bits of pointers. This enables embedding metadata without changing pointer size or paying pointer metadata lookup cost. FRAMER [22] and In-Fat Pointer [33] use the pointer tags to locate in-memoryobjectboundsthatarelargerinsize.Taggedpointerschemesachieves better compatibility at the cost of weaker security guarantees. Further work related to memory safety using dynamic checking can be found in [27,30]. 6.2 Transforming C code to Safer Languages Withtheadventofmemory-safeprogramminglanguages,researchersalsostarted exploring source-to-source transforms that port unsafe C programs to safer lan- guages. However, because of the semantic gap between C and safe languages, it is challenging for these analyses to be sound and complete on potentially unsafe operations. Therefore, these transforms are not fully automatic or they don’t have full coverage yet. 3C [15] introduces progressive protection to C code by in- ferring Checked C type annotations that place safety checks. CheckedCBox [13] applies program partitioning to Checked C programs to protect checked code from unchecked code. There are also works exploring converting C to Rust pro- grams using static analysis [7,34]. Page 18: 18 Authors Suppressed Due to Excessive Length 6.3 Formal Model on Heap Safety 1-Safety on heap. SoftBounds [21] and Checked C [14] provide extensions on the formal semantic of a fragment of C in Coq and prove their (syntactic) type safety. SoftBounds focuses on the absence of over-bound, while [14] aims at The Blame Theorem to characterize Core Checked C as a gradual type system. RichWasm [24] extends single-module WASM semantic with a substructural capability-based type system and proves its (syntactic) type safety. With the advanced type system design, spatial and temporal safety can be eliminated in RichWasm at compile time, at the cost of expressing potentially unsafe code. Non-interference on heap. The closest work is [2], which argues the defi- nition of memory safety should include aspects of privacy and isolation, and consequently a form of non-interference. Another stream of works on non-interference is based on Information Flow Type System [8,3,11].Theseworksaimatamoregeneralsettingofnon-interference— not restricted to heap—but require modifications to syntax and typing rules to additionally annotate security levels. Compared to non-interference, our notion of robust1-safety is weaker, as our final theorem can only claim no modification (no-write) happens in the designated area, but we cannot prove the absence of information leakage (no- read). However, our advantage is the simplicity of the proof structure, as it is a relatively lightweight extension to the existent unary logical relation in [25] and thus leaves most existent proof unchanged. 7 Conclusion We present the first mechanized formalization of MSWASM and extend it to supportsecureinter-moduleinteraction.Basedonourformalism,wealsopresent Aegis, which provides a practical solution to apply MSWASM on C programs, by translating MSWASM into spatially-safe Checked C. By virtue of generating codeinadialectofC,Aegisallowsforaprogrammatic,binary-compatiblelevelof interoperability between memory-safe Checked C and C code. Our formalization and implementation provide the foundation for adopting MSWASM in realistic settings where memory safety, interoperability, and performance are paramount. References 1. Akritidis, P., Costa, M., Castro, M., Hand, S.: Baggy bounds checking: An effi- cient and backwards-compatible defense against out-of-bounds errors. In: USENIX Security Symposium. vol. 10, p. 96 (2009) 2. de Amorim, A.A., Hriţcu, C., Pierce, B.C.: The meaning of memory safety. In: 7th International Conference on Principles of Security and Trust (POST). pp. 79–105 (2018) Page 19: Title Suppressed Due to Excessive Length 19 3. Barthe, G., Pichardie, D., Rezk, T.: A certified lightweight non-interference java bytecode verifier. In: Programming Languages and Systems: 16th European Sym- posium on Programming, ESOP 2007, Held as Part of the Joint European Confer- ences on Theory and Practics of Software, ETAPS 2007, Braga, Portugal, March 24-April 1, 2007. Proceedings 16. pp. 125–140. Springer (2007) 4. Bosamiya, J., Lim, W.S., Parno, B.: {Provably-Safe }multilingual software sand- boxing using {WebAssembly }. In: 31st USENIX Security Symposium (USENIX Security 22). pp. 1975–1992 (2022) 5. Dreyer, D.: The type soundness theorem that you really want to prove (and now you can). Milner Award Lecture, POPL (2018) 6. Durumeric, Z., Li, F., Kasten, J., Amann, J., Beekman, J., Payer, M., Weaver, N., Adrian, D., Paxson, V., Bailey, M., Halderman, J.A.: The matter of heart- bleed. In: Proceedings of the 2014 Conference on Internet Measurement Confer- ence. p. 475–488. IMC ’14, Association for Computing Machinery, New York, NY, USA (2014). https://doi.org/10.1145/2663716.2663755, https://doi.org/10.1145/ 2663716.2663755 7. Emre, M., Boyland, P., Parekh, A., Schroeder, R., Dewey, K., Hardekopf, B.: Alias- ing limits on translating c to safe rust. Proc. ACM Program. Lang. 7(OOPSLA1) (apr 2023). https://doi.org/10.1145/3586046, https://doi.org/10.1145/3586046 8. Frumin, D., Krebbers, R., Birkedal, L.: Compositional non-interference for fine- grained concurrent programs. In: 2021 IEEE Symposium on Security and Privacy (SP). pp. 1416–1433. IEEE (2021) 9. Haas, A., Rossberg, A., Schuff, D.L., Titzer, B.L., Holman, M., Gohman, D., Wag- ner, L., Zakai, A., Bastien, J.: Bringing the web up to speed with webassembly. In: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 185–200 (2017) 10. Jung, R., Krebbers, R., Birkedal, L., Dreyer, D.: Higher-order ghost state. In: Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming. pp. 256–269 (2016) 11. Kammüller, F.: Formalizing non-interference for a simple bytecode language in coq. Formal Aspects of Computing 20, 259–275 (2008) 12. Lamport, L.: Specifying systems: the tla+ language and tools for hardware and software engineers (2002) 13. Li, L., Bhattar, A., Chang, L., Zhu, M., Machiry, A.: Checkedcbox: Type directed program partitioning with checked c for incremental spatial memory safety (2023) 14. Li, L., Liu, Y., Postol, D., Lampropoulos, L., Van Horn, D., Hicks, M.: A formal model of checked c. In: 2022 IEEE 35th Computer Security Foundations Sympo- sium (CSF). pp. 49–63. IEEE (2022) 15. Machiry, A., Kastner, J., McCutchen, M., Eline, A., Headley, K., Hicks, M.: C to checked c by 3c. Proceedings of the ACM on Programming Languages 6(OOPSLA1), 1–29 (2022) 16. Mallissery, S., Wu, Y.S.: Demystify the fuzzing methods: A comprehensive survey. ACM Comput. Surv. 56(3) (oct 2023). https://doi.org/10.1145/3623375, https: //doi.org/10.1145/3623375 17. Michael,A.E.,Gollamudi,A.,Bosamiya,J.,Johnson,E.,Denlinger,A.,Disselkoen, C., Watt, C., Parno, B., Patrignani, M., Vassena, M., et al.: Mswasm: Soundly enforcing memory-safe execution of unsafe code. Proceedings of the ACM on Pro- gramming Languages 7(POPL), 425–454 (2023) 18. Morrisett, G., Walker, D., Crary, K., Glew, N.: From system f to typed assembly language. ACM Transactions on Programming Languages and Systems (TOPLAS) 21(3), 527–568 (1999) Page 20: 20 Authors Suppressed Due to Excessive Length 19. Nagarakatte, S., Martin, M.M.K., Zdancewic, S.: Watchdoglite: Hardware- accelerated compiler-based pointer checking. In: Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. p. 175–184. CGO ’14, Association for Computing Machinery, New York, NY, USA (2014). https://doi.org/10.1145/2581122.2544147, https://doi.org/10.1145/ 2581122.2544147 20. Nagarakatte, S., Zhao, J., Martin, M.M., Zdancewic, S.: Softbound: highly compatible and complete spatial memory safety for c. SIGPLAN Not. 44(6), 245–258 (jun 2009). https://doi.org/10.1145/1543135.1542504, https://doi.org/10. 1145/1543135.1542504 21. Nagarakatte, S., Zhao, J., Martin, M.M., Zdancewic, S.: Softbound: highly com- patible and complete spatial memory safety for c. In: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementa- tion. p. 245–258. PLDI ’09, Association for Computing Machinery, New York, NY, USA (2009). https://doi.org/10.1145/1542476.1542504, https://doi.org/10.1145/ 1542476.1542504 22. Nam, M.J.,Akritidis, P., Greaves, D.J.:Framer: atagged-pointer capabilitysystem with memory safety applications. In: Proceedings of the 35th Annual Computer Security Applications Conference. p. 612–626. ACSAC ’19, Association for Com- puting Machinery, New York, NY, USA (2019). https://doi.org/10.1145/3359789. 3359799, https://doi.org/10.1145/3359789.3359799 23. Oleksenko, O., Kuvaiskii, D., Bhatotia, P., Felber, P., Fetzer, C.: Intel mpx ex- plained: A cross-layer analysis of the intel mpx system stack. Proc. ACM Meas. Anal. Comput. Syst. 2(2) (jun 2018). https://doi.org/10.1145/3224423, https: //doi.org/10.1145/3224423 24. Paraskevopoulou, Z., Fitzgibbons, M., Thalakottur, M., Mushtak, N., Mazur, J.S., Ahmed, A.: Richwasm: Bringing safe, fine-grained, shared-memory interoperability down to webassembly. arXiv preprint arXiv:2401.08287 (2024) 25. Rao, X., Georges, A.L., Legoupil, M., Watt, C., Pichon-Pharabod, J., Gardner, P., Birkedal,L.:Iris-wasm:Robustandmodularverificationofwebassemblyprograms. Proceedings of the ACM on Programming Languages 7(PLDI), 1096–1120 (2023) 26. Ruef, A., Elliott, A.S., Sweet, I., Hicks, M., Tarditi, D.: Checked c for safety, gradually. Draft. Nov 17(2017) 27. Song, D., Lettner, J., Rajasekaran, P., Na, Y., Volckaert, S., Larsen, P., Franz, M.: Sok: Sanitizing for security. In: 2019 IEEE Symposium on Security and Privacy (SP). pp. 1275–1295 (2019). https://doi.org/10.1109/SP.2019.00010 28. Szekeres, L., Payer, M., Wei, T., Song, D.: Sok: Eternal war in memory. In: 2013 IEEE Symposium on Security and Privacy. pp. 48–62 (2013). https://doi.org/10. 1109/SP.2013.13 29. Tarditi, D.: Checked c specification. Checked C (2023), https://github.com/ checkedc/checkedc/releases 30. Vintila, E., Zieris, P., Horsch, J.: Evaluating the Effectiveness of Memory Safety Sanitizers . In: 2025 IEEE Symposium on Security and Privacy (SP). pp. 88–88. IEEE Computer Society, Los Alamitos, CA, USA (May 2025). https://doi.org/10. 1109/SP61157.2025.00088, https://doi.ieeecomputersociety.org/10.1109/SP61157. 2025.00088 31. Watt, C., Rao, X., Pichon-Pharabod, J., Bodin, M., Gardner, P.: Two mechanisa- tions of webassembly 1.0. In: Formal Methods: 24th International Symposium, FM 2021, Virtual Event, November 20–26, 2021, Proceedings 24. pp. 61–79. Springer (2021) Page 21: Title Suppressed Due to Excessive Length 21 32. Woodruff, J., Joannou, A., Xia, H., Fox, A., Norton, R.M., Chisnall, D., Davis, B., Gudka, K., Filardo, N.W., Markettos, A.T., Roe, M., Neumann, P.G., Watson, R.N.M., Moore, S.W.: Cheri concentrate: Practical compressed capabilities. IEEE Transactions on Computers 68(10), 1455–1469 (2019). https://doi.org/10.1109/ TC.2019.2914037 33. Xu, S., Huang, W., Lie, D.: In-fat pointer: hardware-assisted tagged-pointer spatial memory safety defense with subobject granularity protection. In: Proceedings of the 26th ACM International Conference on Architectural Support for Program- ming Languages and Operating Systems. p. 224–240. ASPLOS ’21, Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/ 3445814.3446761, https://doi.org/10.1145/3445814.3446761 34. Zhang, H., David, C., Yu, Y., Wang, M.: Ownership guided c to rust translation. In: Enea, C., Lal, A. (eds.) Computer Aided Verification. pp. 459–482. Springer Nature Switzerland, Cham (2023) 35. Zhu, X., Wen, S., Camtepe, S., Xiang, Y.: Fuzzing: A survey for roadmap. ACM Comput. Surv. 54(11s) (sep 2022). https://doi.org/10.1145/3512345, https://doi. org/10.1145/3512345

---