Open AGI Codes | Your Codes Reflect!

Updates

Generating audio...

arxiv

Paper 2503.04512

Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs

Authors: Kwing Hei Li, Alejandro Aguirre, Simon Oddershede Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, Lars Birkedal

Published: 2025-03-06

Abstract:

We present Coneris, the first higher-order concurrent separation logic for reasoning about error probability bounds of higher-order concurrent probabilistic programs with higher-order state. To support modular reasoning about concurrent (non-probabilistic) program modules, state-of-the-art program logics internalize the classic notion of linearizability within the logic through the concept of logical atomicity. Coneris extends this idea to probabilistic concurrent program modules. Thus Coneris supports modular reasoning about probabilistic concurrent modules by capturing a novel notion of randomized logical atomicity within the logic. To do so, Coneris utilizes presampling tapes and a novel probabilistic update modality to describe how state is changed probabilistically at linearization points. We demonstrate this approach by means of smaller synthetic examples and larger case studies. All of the presented results, including the meta-theory, have been mechanized in the Rocq proof assistant and the Iris separation logic framework

Paper Content:

Page 1: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs KWING HEI LI, Aarhus University, Denmark ALEJANDRO AGUIRRE, Aarhus University, Denmark SIMON ODDERSHEDE GREGERSEN, New York University, USA PHILIPP G. HASELWARTER, Aarhus University, Denmark JOSEPH TASSAROTTI, New York University, USA LARS BIRKEDAL, Aarhus University, Denmark We present Coneris, the first higher-order concurrent separation logic for reasoning about error probability bounds of higher-order concurrent probabilistic programs with higher-order state. To support modular reasoning about concurrent (non-probabilistic) program modules, state-of-the-art program logics internalize the classic notion of linearizability within the logic through the concept of logical atomicity . Coneris extends this idea to probabilistic concurrent program modules. Thus Coneris supports modular reasoning about probabilistic concurrent modules by capturing a novel notion of randomized logical atomicity within the logic. To do so, Coneris utilizes presampling tapes and a novel probabilistic update modality to describe how state is changed probabilistically at linearization points. We demonstrate this approach by means of smaller synthetic examples and larger case studies. All of the presented results, including the meta-theory, have been mechanized in the Rocq proof assistant and the Iris separation logic framework. 1 Introduction Probabilistic data structures, such as approximate counters, skip lists, or Bloom filters are widely used in concurrent programming. These data structures can improve time and space efficiency compared to their deterministic counterparts. However, some probabilistic data structures may return wrong results with a small probability. Analyzing and ensuring this probability of error is sufficiently small is essential for using these data structures. But this analysis is challenging because probabilistic programs often have unintuitive behaviors, which are only made more complicated when probabilistic behaviors are combined with concurrency. Because randomness and concurrency both introduce non-determinism, any analysis must take into account the large range of possible outcomes that can arise from this non-determinism. For just concurrency alone without randomness, a number of verification techniques have been developed that abstract away from concurrent non-determinism. For example, Concurrent Separation Logic (CSL) [O’Hearn 2007] allows for threads in a concurrent program to be verified in alocal way, without having to consider the effects of interference from other threads at every step of execution. A key feature of modern concurrent separation logics is support for proving that data structures are logically atomic [da Rocha Pinto et al .2014; Jacobs and Piessens 2011; Jung et al .2015; Svendsen et al .2013]. Logical atomicity allows clients to reason as if a concurrent data structure had a logical state that is updated atomically at a single point in time during each operation. This internalizes the idea of linearizability , a standard notion of correctness for concurrent data structures, and enables modular and compositional proofs. It is natural to consider whether similar techniques can be applied to reason about randomized concurrent data structures. Prior work has explored extending concurrent separation logic to Authors’ Contact Information: Kwing Hei Li, Aarhus University, Denmark, hei.li@cs.au.dk; Alejandro Aguirre, Aarhus University, Denmark, alejandro@cs.au.dk; Simon Oddershede Gregersen, New York University, USA, s.gregersen@nyu.edu; Philipp G. Haselwarter, Aarhus University, Denmark, pgh@cs.au.dk; Joseph Tassarotti, New York University, USA, jt4767@ cs.nyu.edu; Lars Birkedal, Aarhus University, Denmark, birkedal@cs.au.dk.arXiv:2503.04512v1 [cs.LO] 6 Mar 2025 Page 2: 2 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal the randomized setting [Fesefeldt et al .2022; Tassarotti and Harper 2019; Zilberstein et al .2024]. However, none of these prior logics support compositional reasoning about data structures because they lack support for reasoning about logical atomicity. Indeed, even developing a suitable notion of what logical atomicity would look like in the presence of randomization is challenging. Whereas prior efforts developing non-randomized logical atomicitiy could draw inspiration and intuition from the notion of linearizability, there is no widely-accepted analogue of linearizability that is suitable for randomized data structures. Indeed, prior work has shown ways in which standard notions of linearizability do not suffice when clients of a data structure can make randomized choices [Golab et al. 2011]. In this paper, we develop an appropriate notion of randomized logical atomicity in the context of Coneris, a new concurrent separation logic. Coneris is a probabilistic higher-order concurrent separation logic for reasoning about error bounds of ConcRandML programs, a ML-like language with support for discrete random sampling, unstructured concurrency, higher-order functions, and higher-order dynamically-allocated local state. On the surface level, Coneris retains all the standard rules of higher-order separation logic for concurrent programs, along with modern features like expressive impredicative invariants and custom ghost resources. From the probabilistic side, Coneris inherits two kinds of separation logic resources from previous logics for reasoning about probabilities, specifically presampling tapes first introduced in Clutch [Gregersen et al .2024], and error credits first introduced in Eris [Aguirre et al . 2024]. The former are used to reason about random choices that a program will take in the future, while the latter are used to track an upper bound on the probability of some error occurring during execution. Using its higher-order features, Coneris encodes randomized logical atomicity by adapting the earlier HOCAP [Svendsen et al .2013] approach to logical atomicity. To do so, we introduce a novel probabilistic update modality , written as|⇝⇝𝑃. Intuitively, the modality is used to describe a probabilistic logical update to a piece of ghost state of the program. In particular, we can use it to reason as if the randomness that the program uses is atomically drawn at one single point in time, which is crucial in writing and proving modular specifications. We demonstrate the flexibility of our approach by verifying a selection of data structures. For example, we verify the correctness of a concurrent hash function. The hash module is subsequently used in an efficient concurrent implementation of a Bloom filter, and we derive strict error bound for a client program that uses the Bloom filter. These examples utilize rich language features such as higher-order functions and local state, and display non-trivial interactions between concurrency and probability. As far as we are aware, even verifying the simpler examples are out-of-scope for previous techniques, let alone reasoning about them modularly . Contributions. In summary, we make the following contributions: •The first concurrent and probabilistic higher-order separation logic for reasoning about error bounds of programs written in ConcRandML , a probabilistic concurrent higher-order programming language with higher-order references. •A novel probabilistic update modality that is used to capture a probabilistic notion of logical atomicity. •An extension to the probabilistic setting of the HOCAP approach, that allows us to write and prove modular specifications of randomized concurrent data structures. •A selection of case studies showcasing our approach to modular verification of higher-order concurrent probabilistic data structures. Page 3: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 3 •Full mechanization of all results in the Rocq proof assistant [Bertot and Castéran 2013], using the Iris separation logic framework [Jung et al .2018] and the Coquelicot real analysis library [Boldo et al. 2013]. Outline. In §2, we demonstrate how Coneris is used to reason about programs that feature both concurrency and probability, and highlight some of the key challenges that arise from this combination. We then present the syntax and semantics of our language ConcRandML in §3. In §4, we present a collection of program logic rules and we show how to apply them to reason about a concurrent randomized counter module in §5. Subsequently in §6, we showcase Coneris on a range of concurrent probabilistic data structure examples. Finally, we discuss related work and conclude with ideas for future work in §8 and §9, respectively. 2 Motivation and Technical Challenges We first recall the features of the Eris logic [Aguirre et al .2024] for reasoning about error bounds of sequential programs, then we discuss how Coneris extends these ideas to the concurrent setting, illustrating some of the challenges that arise when reasoning about error bounds in the presence of concurrency. Sequential Error Reasoning with Error Credits. Eris is a separation logic that introduces a new assertion called error credits written E(𝜀), where 0≤𝜀≤1is a real number. This assertion represents a budget upper bounding the probability that a specification can fail to hold. In particular, an Eris Hoare triple of the form {𝑃∗E(𝜀)}𝑒{𝑥.𝑄}implies that if we execute 𝑒in a state satisfying 𝑃, then the probability that 𝑒crashes or returns a value 𝑥that violates 𝑄is at most𝜀. To illustrate how the E(𝜀)assertion is used, consider the following program as a simple example: twoAdd≜let𝑙=ref0in𝑙←(!𝑙+rand 3);𝑙←(!𝑙+rand 3); !𝑙 Here rand𝑁is a probabilistic construct that samples a value uniformly from {0,...,𝑁}. In particular, rand 3returns a random number from the set {0,..., 3}with probability 1/4each. The twoAdd program first allocates a reference 𝑙initialized to 0, then adds the result of a probabilistic sampling rand 3to the value in 𝑙twice, and concludes by reading the number in 𝑙. Suppose we want to prove an upper bound on the probability that the program returns 0. This happens only if both calls to rand 3return 0, which occurs with probability at most 1/4·1/4=1/16. We can capture this as a specification in Eris by proving {E(1/16)}twoAdd{𝑥.𝑥>0}. Eris provides 3 key rules for working with error credits:1 err-split E(𝜀1+𝜀2) E(𝜀1)∗E(𝜀2)= = = = = = = = = = = = = = = = = = = = =ht-rand-exp E𝔘𝑁[F]≤𝜀 {E(𝜀)}rand𝑁{𝑛.E(F(𝑛))∗𝑛∈{0..𝑁}}err-1 E(1) False The first rule says that error credits can be split and joined together. The second rule says that when the program makes a randomized choice, we can re-distribute error credits along different branches of the randomized outcome, so long as the expected value oraverage amount of error credit does not increase. Finally, the third rule says that an error credit of 1implies False ,i.e., we can deduce anything, which intuitively follows from the idea that an upper bound of probability 1 is trivial. Figure 1 shows a proof outline using these rules to derive the Hoare triple stated above fortwoAdd . The key steps in this proof are to apply the ht-rand-exp rule twice to reason about the two rand 3statements in the proof. The first time the rule is applied, we start with E(1/16), and instantiateFin the rule to be the function F(𝑛)≜1 4·[𝑛=0]where[𝑃]evaluates to 1if𝑃(𝑎) is true and to 0otherwise. That is, we end up with E(1/4)in the case where rand 3returned 0, and 1We write inference rules with a double horizontal line to mean the rule can be applied in either direction. Page 4: 4 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal {E1 16} let𝑙=ref0in {𝑙↦→0∗E1 16} 𝑙←(!𝑙+rand 3); (apply ht-rand-exp usingF(𝑥)≜1 4·[𝑥=0]) {∃𝑛.𝑙↦→𝑛∗(𝑛=0∧E1 4)∨𝑛≠0} 𝑙←(!𝑙+rand 3); (apply ht-rand-exp usingF′(𝑥)≜[𝑛=0∧𝑥=0]) {∃𝑚.𝑙↦→𝑚∗((𝑚=0∧E(1))∨𝑚≠0)} {∃𝑚.𝑙↦→𝑚∗𝑚≠0} (discharge case 𝑚=0using err-1 from E(1)) !𝑙 {𝑥.𝑥>0} Fig. 1. Proof outline for the Hoare triple {E(1/16)}twoAdd{𝑥.𝑥>0}. E(0)otherwise. For the latter cases, the remainder of the proof is trivial, since the value in 𝑙is already greater than 0. For the former case, on the next call to rand 3, we again apply ht-rand-exp , settingF′(𝑚)≜[𝑛=0∧𝑚=0]. Thus, when rand 3again returns 0, we end up with E(1). In this case,𝑙still contains 0, but we can use err-1 to finish the proof. For the other cases, 𝑙will be greater than 0, so the postcondition follows directly. Concurrent Error Bounds in Coneris. Coneris generalizes Eris’s error credit reasoning to the concurrent setting. To illustrate some of the key ideas and challenges in doing so, consider the following concurrent variation of the twoAdd example: conTwoAdd≜let𝑙=ref0in(faa𝑙(rand 3)|||faa𝑙(rand 3)); !𝑙 Here|||represents parallel composition of two threads, and faa𝑙𝑛is a fetch-and-add command that atomically adds 𝑛to the value stored in 𝑙and returns the value prior to the addition. Intuitively, no matter which order the faacommands execute in the two threads, the probability that the final !𝑙at the end of the program returns 0is again bounded above by 1/16. Coneris allows us to show this by proving a Hoare triple of a similar form as the one we saw fortwoAdd . More precisely in, Coneris, the intuitive meaning of a Hoare triple {𝑃∗E(𝜀)}𝑒{𝑄}is that, “ for all possible schedulings of thread steps , if𝑃holds, then the probability that 𝑒reaches an error state or returns a value that violates 𝑄is at most𝜀”. All of Eris’s proof rules for error credits also hold in Coneris. To reason about parallel execution, Coneris additionally has the familiar parallel composition rule from concurrent separation logic: {𝑃1}𝑒1{𝑣1.𝑄1𝑣1} {𝑃2}𝑒2{𝑣2.𝑄2𝑣2} {𝑃1∗𝑃2}𝑒1|||𝑒2{(𝑣1,𝑣2).𝑄1𝑣1∗𝑄2𝑣2}ht-par-comp To apply this rule we have to divide up the precondition into two separate parts, 𝑃1and𝑃2, and show that they suffice as preconditions for the two threads 𝑒1and𝑒2, respectively. We already saw with err-split that we can split error credits, so for the conTwoAdd example we might try to split the initial error budget of E(1/16)into two, giving each thread E(1/32). However, we would soon run into two issues: (1)As in the sequential case, we want to apply ht-rand-exp to try to distribute all of the credits to the cases where the rand 3commands return 0. However, if each thread has E(1/32)and applies ht-rand-exp to reason about the rand 3it executes, then after applying the rule, it Page 5: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 5 can have at most E(1/8)for the case where the rand 3returns 0. Combining two E(1/8)in the post-condition of the parallel composition rule, we would end up with a total of E(2/8) for the case where both threads add 0to the counter. But this is not enough to apply err-1 , and so we would not be able to prove the intended postcondition. (2)Both threads need to modify the shared location 𝑙, so they both need to have “ownership” of the points-to assertion for 𝑙. However,𝑙↦→0⊬𝑙↦→0∗𝑙↦→0, so we cannot pass ownership of this assertion in the precondition of both threads when applying ht-par-comp . The second problem has a well-known solution in CSL, namely invariant assertions . Fortunately, as we will see, it turns out that invariants also provide a solution to the first problem. Coneris provides Iris-style invariant assertions of the form 𝐼, which states that an assertion 𝐼is an invariant of program execution. These assertions support the following rules: ht-inv-alloc 𝐼∗𝑃 𝑒{𝑄} {𝐼∗𝑃}𝑒{𝑄}inv-dup 𝐼 𝐼∗𝐼ht-inv-open 𝑒atomic{𝐼∗𝑃}𝑒{𝐼∗𝑄} 𝐼∗𝑃 𝑒 𝐼∗𝑄 The first rule ht-inv-alloc says that we can allocate an invariant assertion 𝐼if we know that 𝐼 holds in the precondition. Invariant assertions are duplicable , meaning that we can produce multiple copies using inv-dup , so that when applying ht-par-comp , each thread can have a copy of 𝐼in its precondition. Finally, threads can access the assertion 𝐼inside of the invariant using ht-inv-open , which requires that the invariant is restablished after the atomic expression 𝑒finishes executing. An expression 𝑒isatomic if it steps to a value in a single execution step. By putting the 𝑙↦→− assertion inside of an invariant 𝐼, we can thus allow both threads to access 𝑙by using ht-inv-open during the FAA step. A standard technique in the CSL literature is to use ghost state to encode a kind of protocol in the invariant assertion that tracks how 𝑙can evolve through the shared access by the two threads [Jung et al .2015]. We omit the exact details of how this ghost state encoding works, but at a high level, this invariant assertion would have a format like: 𝐼≜(𝑙↦→0∗...)| {z } no thread has added∨(∃𝑣.𝑙↦→𝑣∗...)| {z } one thread has added∨(∃𝑣.𝑙↦→𝑣∗...)| {z } both threads have added where threads use ghost state to track which case of this disjunction they are in. Our key observation is that if we now also include error credits in the invariant, then we can additionally resolve the first issue mentioned above related to not having a sufficient number of error credits. For example, by setting the invariant to a form like 𝐼≜(𝑙↦→0∗...)| {z } no thread has added∨(∃𝑣.𝑙↦→𝑣∗...)| {z } one thread has added∨∃𝑣.𝑙↦→𝑣∗𝑣>0∗... | {z } both threads have added ∗ E(1/16)∗... | {z } no thread has sampled∨E(1/4)∗... | {z } one thread has sampled 0∨E(1)∗... | {z } both threads sampled 0 We initially have that the invariant owns the whole error credit E(1/16). The first thread to sample will use this E(1/16)with ht-rand-exp , ending up with E(1/4)in the case it samples 0, which it uses to reestablish the invariant. The second thread to sample can then use this E(1/4)with ht-rand-exp to get E(1)in the case that it also samples 0, at which point we can use err-1 to exclude this case, just as we did in the original sequential example. Of course, additional ghost state is needed to track this more complex protocol, but modern CSLs like Iris already provide sophisticated tools for encoding these kinds of protocols. Page 6: 6 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal createCntr≜𝜆_.ref0 readCntr≜𝜆𝑙.!𝑙 incrCntr≜𝜆𝑙.faa𝑙(rand 3)conTwoAdd≜let𝑙=createCntr()in (incrCntr𝑙|||incrCntr𝑙); readCntr𝑙 Fig. 2. Refactored conTwoAdd code. Although this example seems simple, to the best of our knowledge, Coneris is the first unary program logic for randomized concurrent programs that can prove this bound of 1/16forconTwoAdd . Prior concurrent separation logics, even those that are specific to first-order languages [Fesefeldt et al.2022; Zilberstein et al .2024] lack logical facilities necessary for expressing this non-trivial protocol on the shared state. Modularity and Randomized Logical Atomicity. While placing error credits in a shared invariant solved the problems discussed in the previous part, the solution we have shown so far is not modular. To see this issue, imagine that we refactored the code of conTwoAdd as in Figure 2. Here we introduce intermediate functions that encapsulate the operations performed on the location 𝑙. Ideally, we ought to be able to derive separate specifications for these operations, and then use only their specifications to prove the same Hoare triple we had before for conTwoAdd . For example, we might introduce a predicate of the form counter𝑙𝑛≜𝑙↦→𝑛capturing the value of the counter. Then, a specification for incrCntr might look like E𝔘3[F]≤𝜀 {counter𝑙𝑥∗E(𝜀)}incrCntr𝑙{∃𝑛.counter𝑙(𝑥+𝑛)∗E(F(𝑛))∗𝑛∈{0..3}} which expresses the effects of adding values to the counter and also allows for averaging error credits across the different outcomes, similarly to ht-rand-exp . However, this specification for incrCntr is not sufficient for verifying conTwoAdd . As we saw previously for that example, we must put the points-to assertion for 𝑙(corresponding to counter ) and the E(1/16)in an invariant. But we cannot use ht-inv-open to open this invariant when using the above specification for incrCntr , because incrCntr𝑙is not an atomic expression. Fornon-probabilistic concurrent programs, a standard solution to this problem is to derive a specification that captures that incrCntr behaves as if it were atomic when incrementing the value in the counter. Although several different techniques have been proposed for encoding what it means for a function to be logically atomic, in Coneris we adapt the HOCAP [Svendsen et al .2013] approach. At its core, the idea behind HOCAP is to observe that what makes a physically atomic expression special, in terms of the rules of the logic, is that we can open an invariant around a physically atomic step. Thus, to capture that an operation behaves as if it is atomic, we need a specification style that allows for opening an invariant at the logical point where an operation takes effect. In Iris, this is captured through the update modality , written|⇛𝑄which holds when 𝑄 can be derived by opening invariants. By deriving a specification for incrCntr in which the counter assertion occurs under such an update modality in the pre-condition, we can enable incrCntr to open invariants to get this counter assertion at the moment when it performs the faa. To extend this idea to the probabilistic setting, Coneris introduces a new modality, called the probabilistic update modality |⇝⇝, which additionally allows for error credits to be updated in an expectation preserving way, in the style of ht-rand-exp . Intuitively,|⇝⇝𝑃holds if we can make an instantaneous probabilistic update of our resources such that the outcome satisfies 𝑃. Compared to the standard update modality |⇛, the probabilistic update modality can additionally redistribute errors credits through a logical operation called tape presampling that allows clients to reason Page 7: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 7 about future probabilistic choices. By using this new modality in HOCAP-style specifications, we are able to capture randomized logical atomicity, enabling modular reasoning about concurrent probabilistic libraries. Because this technique requires more advanced rules of Coneris, we postpone demonstrating it to §5.1, and first present more of the formal details of Coneris and the semantics of ConcRandML, the concurrent programming language used to express our examples. 3 Preliminaries In §3.1, we first recall various definitions from probability theory. We then present the syntax of ConcRandML in §3.2 and its operational semantics in §3.3. 3.1 Probability Theory To account for possibly non-terminating behavior of programs, we define our operational semantics using probability sub-distributions. A discrete subdistribution (henceforth simply distribution ) on a countable set 𝐴is a function 𝜇:𝐴→[0,1]such thatÍ 𝑎∈𝐴𝜇(𝑎)≤1. The collection of distributions on𝐴is denoted byD(𝐴). The null distribution 0:D(𝐴)is the constant function 𝜆𝑥.0. We let {𝑁..𝑀}denote the set{N∈N|𝑁≤𝑛≤𝑀}and for𝑁≥0we let 𝔘𝑁:D(N)denote the (uniform) distribution that returns 1/(𝑁+1)for every𝑛∈{0..𝑁}and0otherwise. The expected value of𝑋:𝐴→[0,1]with respect to 𝜇is defined as E𝜇[𝑋]≜Í 𝑎∈𝐴𝜇(𝑎)·𝑋(𝑎). The mass of𝜇is E𝜇[𝜆𝑥.1]. Given a predicate 𝑃on𝐴, the Iverson bracket[𝑃]evaluates to 1if𝑃(𝑎)is true and to 0 otherwise, and the probability of 𝑃w.r.t.𝜇isPr𝜇[𝑃]≜E𝜇[[𝑃]]. Distributions form a monad; we write𝜇≫=𝑓forbind(𝑓,𝜇), which is defined as follows. ret:𝐴→D(𝐴) bind:(𝐴→D(𝐵))×D(𝐴)→D(𝐵) ret(𝑎)(𝑎′)≜[𝑎=𝑎′] bind(𝑓,𝜇)(𝑏)≜∑︁ 𝑎∈𝐴𝜇(𝑎)·𝑓(𝑎)(𝑏) 3.2 The ConcRandML Language Our examples are written in the ConcRandML language, which is an ML-style programming language extended with probabilistic sampling and fork-based concurrency. The syntax of the language is defined by the grammar below: 𝑣,𝑤∈Val::=𝑧∈Z|𝑏∈B|()|ℓ∈Loc|rec f x =𝑒|(𝑣,𝑤)|inl𝑣|inr𝑣| 𝑒∈Expr ::=𝑣|x|rec f x =𝑒|𝑒1𝑒2|𝑒1+𝑒2|𝑒1−𝑒2|...|if𝑒then𝑒1else𝑒2|(𝑒1,𝑒2)|fst𝑒|... ref𝑒1|!𝑒|𝑒1←𝑒2|𝑒1[𝑒2]|rand𝑒|fork𝑒|faa𝑒1𝑒2|rand𝑒1 𝜎∈State ::=Locfin−⇀Val 𝜌∈Cfg::=List(Expr)×State The syntax is mostly standard, for example, the expressions ref𝑒,!𝑒, and𝑒1←𝑒2allocate, load from, and store into a reference, respectively. An array 𝑒1can be accessed at offset 𝑒2(for load or store) via𝑒1[𝑒2]andrand𝑁samples from the uniform distribution on {0,...,𝑁}. Concurrency is supported via fork𝑒, which executes 𝑒in a new thread, and atomic operations which provide synchronization between threads. For example, the atomic fetch-and-add faa𝑒1𝑒2 instruction adds the integer 𝑒2to the value 𝑣stored at location 𝑒1and returns 𝑣.2 A program configuration 𝜌∈List(Expr)×State is given by a pair containing the list of currently executing threads and the heap modeled as a finite map from locations to values. A configuration 𝜌 isfinal if the first expression in the thread list is a value. 2ConcRandML also supports other atomic instructions such as atomic exchange and compare-and-swap, which we omit here for brevity. Page 8: 8 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal 3.3 Operational Semantics The operational semantics of ConcRandML programs is given in stages: expressions take a single execution step, which gets lifted to thread pools by schedulers, and finally these steps are chained together to obtain full program execution. Expressions. Thestep function takes an expression (representing the currently active thread) and the current state and produces a distribution over the new expression, new state, and a (possibly empty) list of newly spawned threads. ConcRandML has a standard call-by-value semantics where steps can occur under evaluation contexts. Deterministic language constructs like if-then-else (1) orfork𝑒(2)step deterministically by using the return of the distribution monad. The rand𝑁 instruction (3) uniformly associates probability 1/(𝑁+1)to any integer 𝑛between 0 and 𝑁. step :(Expr,State)→D( Expr,State,List(Expr)) step(if true then𝑒1else𝑒2,𝜎)=ret(𝑒1,𝜎,[]) (1) step(fork𝑒,𝜎)=ret((),𝜎,[𝑒]) (2) step(rand𝑁,𝜎)=𝜆(𝑛,𝜎,[]).1 𝑁+1if𝑛∈{0,...,𝑁}and 0otherwise (3) Thread Pools and Schedulers. The operational semantics of a configuration 𝜌=(®𝑒,𝜎)is then given simply by indicating which thread amongst ®𝑒should step, i.e., by specifying an index 𝑖∈[0,|®𝑒|−1]and applying the step function to (𝑒𝑖,𝜎): tpStep(®𝑒,𝜎)(𝑖)≜  0 if(®𝑒,𝜎)is final, ret(®𝑒,𝜎) if𝑒𝑖is a value, step(𝑒𝑖,𝜎)≫=𝜆(𝑒′ 𝑖,𝜎′,®𝑒′).ret(®𝑒[𝑖↦→𝑒′ 𝑖]++®𝑒′,𝜎′)otherwise. If𝜌is final, it does not step. If 𝑒𝑖is a value, we take a stutter step. Otherwise, we update the 𝑖-th thread with the stepped expression 𝑒′ 𝑖and append the newly spawned threads ®𝑒′to the thread pool. Ascheduler decides which thread in a configuration to step next. Formally, a (probabilistic, stateful) scheduler is given by a transition function 𝜁:(SchedSt×Cfg)→D( SchedSt×N), which takes in an internal state Ξ∈SchedSt and a configuration 𝜌, and returns a distribution on its new internal state and the index of the thread in 𝜌to step next. Given a configuration 𝜌, a scheduler 𝜁, and a scheduler state Ξ, we can now define the single scheduler-step reduction function schStep𝜁(Ξ,𝜌)∈D( SchedSt×Cfg)as follows: schStep𝜁(Ξ,𝜌)≜𝜁(Ξ,𝜌)≫=𝜆(Ξ′,𝑖).tpStep(𝜌,𝑖)≫=𝜆𝜌′.ret(Ξ′,𝜌′) Intuitively, schStep𝜁(Ξ,𝜌)first evaluates 𝜁(Ξ,𝜌)to get a new state Ξ′and index𝑖, steps the𝑖-th thread to obtain the configuration 𝜌′, and returns the new scheduler state and configuration. The notion of scheduler we consider is quite strong. Firstly, the scheduler is probabilistic and can update its internal state and choose the next thread to step probabilistically instead of deterministi- cally. Secondly, the update decision of a scheduler can depend not only on its internal state, but also on the entire view of the thread pool and the memory state. These two design choices provide more power to the scheduler and enable us to reason about the error bounds of algorithms under a larger and richer class of schedulers than, say, deterministic schedulers. Page 9: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 9 Program Execution. We next define 𝑛-step program execution with respect to a scheduler 𝜁as the following recursive function exec𝜁,𝑛:(SchedSt×Cfg)→D( Val). exec𝜁,𝑛(Ξ,𝜌)≜  ret𝑣 if𝜌is final and𝜌=(𝑣·®𝑒,𝜎)for some𝑣∈Val, 0 if𝑛=0and𝜌is not final, schStep𝜁(Ξ,𝜌)≫=exec𝜁,𝑛−1otherwise. One can read exec𝜁,𝑛(Ξ,𝜌)(𝑣)as the probability of returning 𝑣in the first thread after at most 𝑛steps of𝜌under the scheduler 𝜁initialized with the scheduler state Ξ. Finally, full program execution is defined as the limit of exec𝜁,𝑛, which exists by monotonicity and continuity: exec𝜁(Ξ,𝜌)≜lim𝑛→∞exec𝜁,𝑛(Ξ,𝜌) We simply write exec𝜁𝑒if the result is the same for all initial program and scheduler states. Traditionally, a program 𝜌issafeif it never gets stuck during execution, i.e., any partial program execution starting from 𝜌is either a value or it can make progress. To define the appropriate notion of safety for the probabilistic setting of ConcRandML (see Theorem 4.2), we need the following auxiliary definition of partial program execution pexec𝜁,𝑛:(SchedSt×Cfg)→D( SchedSt×Cfg). pexec𝜁,𝑛(Ξ,𝜌)=( ret(Ξ,𝜌) if𝜌is final or𝑛=0, schStep𝜁(Ξ,𝜌)≫=pexec𝜁,𝑛−1otherwise. We can view pexec as a relaxation of exec which keeps probability mass on configurations that are not final, whereas the latter only considers final configurations. 4 Logic In this section, we dive into the rules of Coneris. We start with a glance of the syntax of the logic and its adequacy theorem. Then, we explore the general program logic rules before discussing presampling tapes and the probabilistic update modality. 4.1 Introduction to Coneris The Coneris logic is built on top of the Iris base logic [Jung et al .2018] and inherits all of the basic propositions and their associated proof rules. This includes the later modality ⊲, the persistence modality and the points-to connective ℓ↦→𝑣that asserts exclusive ownership of the location ℓ storing value 𝑣. A selection of Coneris propositions are shown below. 𝑃,𝑄∈iProp ::=True|False|𝑃∧𝑄|𝑃∨𝑄|𝑃⇒𝑄|∀𝑥.𝑃|∃𝑥.𝑃|𝑃∗𝑄|𝑃∗𝑄|⊲𝑃|𝑃| 𝑃𝜄|𝑔𝛾| |⇛E1E2𝑃|ℓ↦→𝑣|{𝑃}𝑒{𝑄}E|E(𝜀)|𝜅↩→(𝑁,®𝑛)| |⇝⇝E1E2𝑃|... Coneris is a separation logic and propositions denote sets of resources. 𝑃∗𝑄holds for resources that can be decomposed into two disjoint pieces satisfying 𝑃and𝑄. The separating implication 𝑃∗𝑄is the right adjoint of ∗, in the sense that 𝑃∗(𝑃∗𝑄)⊢𝑄. While omitted in §2, note that invariant assertions 𝑃𝜄are annotated with an identifying name 𝜄which is used to prevent the prover from opening the same invariant twice (which is unsound). For bookkeeping purposes, Hoare triples are annotated with the set of invariant names that the specification relies on; we omit thismask annotation when considering the set of all invariant names ⊤. As mentioned in §2, we internalize error bounds using the error credit assertion E(𝜀). The presampling tape assertion𝜅↩→(𝑁,®𝑛)is a probabilistic connective that we adapt from Clutch [Gregersen et al .2024] which plays a key role in deriving certain modular specifications. The probabilistic update modality |⇝⇝E1E2𝑃is a novelty of the Coneris logic. We further discuss these three connectives and their role in the following section. Page 10: 10 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal The meaning of the Coneris Hoare triple is captured by the adequacy theorem shown below. Theorem 4.1 (Adeqacy). If{E(𝜀)}𝑒{𝜙}, then for all schedulers 𝜁,Prexec 𝜁𝑒[¬𝜙]≤𝜀. The theorem says that by proving a Hoare triple for the expression 𝑒, assuming initial ownership ofE(𝜀)error credits, then for all schedulers 𝜁, the probability of the program 𝑒returning a value notsatisfying the proposition 𝜙is smaller than or equal to 𝜀. In addition, we have another safety theorem that provides an upper bound on the probability of the expression getting stuck. Theorem 4.2 (Safety). If{E(𝜀)}𝑒{True}, then for all schedulers 𝜁with mass 1and integers 𝑛, the mass of pexec𝜁,𝑛(Ξ,([𝑒],𝜎))is greater or equal to 1−𝜀. Intuitively, this theorem states that proving {E(𝜀)}𝑒{True}in Coneris implies that the probability of𝑒getting stuck is at most 𝜀for all schedulers3. 4.2 Rules of Coneris Program-Logic Rules. Coneris satisfies all the usual structural and computational rules present in Iris-based separation logics. For example, Coneris satisfies the bind rule ( ht-bind ), the frame rule ( ht-frame ), and the usual computational rules of for interacting with the heap ( e.g.,ht-load ). ht-bind {𝑃}𝑒{𝑣.𝑄}∀𝑣.{𝑄}𝐾[𝑣]{𝑅} {𝑃}𝐾[𝑒]{𝑅}ht-frame {𝑃}𝑒{𝑄} {𝑃∗𝑅}𝑒{𝑄∗𝑅}ht-load {𝑙↦→𝑣}!𝑙{𝑤.𝑤=𝑣∗𝑙↦→𝑣} Invariants can be allocated by giving up ownership of the corresponding resources ( ht-inv- alloc ). If we own an invariant, we can temporarily, for one atomic step, get access to its contents (ht-inv-open ). The later modality ⊲is important for soundness but can otherwise be ignored [Jung et al. 2018]. ht-inv-alloc 𝑃𝜄∗𝑄 𝑒{𝑅}E {𝑃∗𝑄}𝑒{𝑅}ht-inv-open 𝑒atomic{⊲𝐼∗𝑃}𝑒{⊲𝐼∗𝑄}E 𝐼𝜄∗𝑃 𝑒{𝑄}E⊎{𝜄} The Update Modality. The update modality |⇛E1E2is the primary primitive for manipulating ghost resources and interacting with invariants in the Iris base logic. As we alluded to earlier in §2, a key idea behind the HOCAP approach to modular specification is to use this modality as a way to assert that a proposition could be proven by opening invariants. The update modality |⇛E1E2is annotated with two sets of invariants. We write |⇛Ewhen E1=E2=Eand|⇛whenE=⊤, the set of all names. Intuitively, the assertion |⇛E1E2𝑃denotes a resource that, together with the resources from the invariants in E1, can be updated and split into two disjoint pieces: one satisfying 𝑃and one satisfying the invariants in E2. That is, we can use the update modality to specify resource updates and invariant access ( inv-open ) as an assertion in the logic rather than just as a primitive rule of the program logic. The update modality can be eliminated ( ht-fupd-elim ) at any suitable time during program verification. inv-open 𝑃𝜄⊲𝑃∗|⇛E1E2(⊲𝑃∗𝑄) |⇛E1⊎{𝜄} E 2⊎{𝜄}𝑃ht-fupd-elim 𝑒atomic{𝑃∗𝑄}𝑒 |⇛E2E1𝑅 E2|⇛E1E2𝑃∗𝑄 𝑒{𝑅}E1 A key idea behind the approach we apply in §5 is to parameterize program specifications by a proposition of the shape 𝑃∗|⇛E1E2𝑄, a so-called view shift , that is eliminated at the linearization 3The condition that 𝜁must have mass 1 (for all scheduler states) rules out the pathological situation where a configuration is “stuck” because 𝜁has probability less than 1 to pick anythread to step next. The assumption is only used in Theorem 4.2. Page 11: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 11 point of the module operation. By providing a view shift as an argument, the client can specify how they wish for their logical state (their “view”) to evolve when the operation physically takes place. Presampling Tapes. Reminiscent of how prophecy variables [Abadi and Lamport 1988, 1991; Jung et al .2019] allow us to talk about the future, presampling tapes give us the means to talk about the outcome of sampling statement in the future. Presampling tapes were introduced in Clutch [Gregersen et al .2024] to address an alignment issue in refinement proofs, but as we later see in §5 they also play a crucial role in modularizing (unary) proofs about concurrent probabilistic programs through probabilistic view shifts. Intuitively, presampling tapes allow us in the logic to presample the outcome of future sampling statements. Formally, they appear both operationally and in the logic. In the programming language, presampling tapes appear as two new ghost code constructs, tape𝑒andrand𝑒1𝑒2, that are used to allocate a new presampling tape and sample from a tape, respectively. 𝑣∈Val::=...|𝜅∈Label 𝑒∈Expr ::=...|tape𝑒|rand𝑒1𝑒2𝜎∈State≜(Locfin−⇀Val)×( Labelfin−⇀Tape) 𝑡∈Tape≜{(𝑁,®𝑛)|𝑁∈N∧®𝑛∈N∗ ≤𝑁} In the operational semantics, allocation of a fresh presampling tape (4)viatape𝑁determin- istically associates a fresh label 𝜅to the empty tape 𝜖. A labelled rand𝜅 𝑁 with an empty tape samples uniformly (5),i.e., it behaves like an unlabelled rand𝑁. If, on the other hand, the tape 𝜅is non-empty, rand𝑁 𝜅deterministically pops the first value 𝑛from the tape (6). Note that no step in the operational semantics writes contents to a presampling tape. In fact, tapes and label annotations do not in any way alter the behavior of the program and can be entirely erased [Gregersen et al . 2024]. However, as we later see, the probabilistic update modality allows us to reason as if a presampling step could asynchronously pre-populate a tape with a random sample at any point in time. step(tape𝑁,𝜎)=ret(𝜅,𝜎[𝜅:=(𝑁,𝜖)],[]) (where𝜅is fresh w.r.t. 𝜎) (4) step(rand𝜅 𝑁,𝜎)=𝜆(𝑛,𝜎,[]).1 𝑁+1if𝜎[𝜅]=(𝑁,𝜖)∧𝑛∈{0,...,𝑁}and 0otherwise (5) step(rand𝜅 𝑁,𝜎[𝜅:=𝑛·®𝑛])=ret(𝑛,𝜎[𝜅:=®𝑛],[]) (6) Now, the logical assertion 𝜅↩→(𝑁,®𝑛)denotes ownership of the presampling tape 𝜅with bound 𝑁and contents®𝑛, analogously to how the points-to connective for the heap denotes ownership of a location and its contents. The two rules ht-alloc-tape andht-rand-tape reflects the operational behavior of equations (4) and (6) in the logic. ht-alloc-tape {True}tape𝑁{𝜅.𝜅↩→(𝑁,𝜖)}ht-rand-tape {𝜅↩→(𝑁,𝑛·®𝑛)}rand𝜅 𝑁{𝑥.𝑥=𝑛∗𝜅↩→(𝑁,®𝑛)} Probabilistic Update Modality. Previous work [Aguirre et al .2024; Gregersen et al .2024; Haselwarter et al .2025] introduce presampling tapes for different purposes but, common for all instantiations, presampling is only supported as a rule in the program logic. Similar to how ht- inv-open only allows us to reason about invariants using the program logic, presampling is only supported as a primitive program-logic rule. This is notsufficient for the modular specifications we set out to prove. Intuitively, we need a way to specify updates to presampling tapes as an assertion , just like the update modality enables us to specify invariant access and resource updates as an assertion. To this end, we introduce the probabilistic update modality |⇝⇝E1E2𝑃. This modality satisfies all the same rules as the update modality; for example, it can be used to open invariants (hence the Page 12: 12 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal invariant masksE1andE2) and update resources, and it is monadic ( pupd-ret andpupd-bind ) and can be derived from the update modality ( pupd-fupd ). pupd-ret 𝑃 |⇝⇝E𝑃pupd-bind |⇝⇝E1E2𝑃 𝑃∗|⇝⇝E2E3𝑄 |⇝⇝E1E3𝑄pupd-fupd |⇛E1E2𝑃 |⇝⇝E1E2𝑃 pupd-presample-exp E𝔘𝑁[F]≤𝜀 E(𝜀)𝜅↩→(𝑁,®𝑛) |⇝⇝E(∃𝑛.𝜅↩→(𝑁,®𝑛·𝑛)∗E(F(𝑛)))pupd-err |⇝⇝E(∃𝜀.0<𝜀∗E(𝜀)) The key novelty of the probabilistic modality is its ability to populate presampling tapes as shown in pupd-presample-exp . The rule says that if we own a presample tape we can populate the tape with a freshly sampled value 𝑛. Similar to ht-rand-exp , it also allows re-distributing error credits along different branches of the randomized outcome, so long as the expected value of the error credit does not increase. We showcase the probabilistic update modality on an example in §5.4. As a somewhat orthogonal property, the probabilistic update modality also internalizes the notion of continuity of probabilities within the logic. Specifically, it permits synthesizing some arbitrarily small error credit out of thin air as seen in pupd-err . This principle enables induction by error amplification [Aguirre et al. 2024] as we showcase in §5.4. 5 Modular Specifications of Concurrent Randomized Modules In this section, we first provide an overview of how HOCAP-style specifications capture logically atomicity of concurrent data structures. Next, we explain how we extend the approach to capture randomized logical atomicity and present a modular specification for the randomized counter module. We also describe how the specification is strong enough to verify clients that use the randomized counter module concurrently. Subsequently, we present three different implementations of the concurrent randomized counter module and discuss how to show that they satisfy the specification. Later in §6 we discuss how we verify a series of larger case studies. 5.1 Modular Specifications of Concurrent Randomized Modules: Overview Before considering the randomized setting, we showcase our specification style on a non-randomized example: a concurrent counter module with functions for creating a counter, (deterministically) incrementing by one, and reading. As alluded to in §4.2, the high level idea is to parameterize specifications by a view shift that captures how the logical state of the counter evolves at the linearization point. Our specification of the non-randomized counter module is shown in Figure 3. {True}createCntr(){𝑐.∃𝜄.counter𝜄𝑐∗cfrag 1 0} ∀E,𝜄,𝑐,𝑄. counter𝜄𝑐∗(∀𝑧.cauth𝑧∗|⇛Ecauth(𝑧+1)∗𝑄𝑧) incrCntr𝑐{𝑧.𝑄𝑧}E⊎{𝜄} ∀E,𝜄,𝑐,𝑄. counter𝜄𝑐∗(∀𝑧.cauth𝑧∗|⇛Ecauth𝑧∗𝑄𝑧) readCntr𝑐{𝑧.𝑄𝑧}E⊎{𝜄} Fig. 3. Specification for a (non-randomized) concurrent counter module. When creating a counter, one obtains ownership of two resources: counter𝜄𝑐andcfrag 1 0. The counter𝜄𝑐resource captures that 𝑐is a counter with an associated invariant name 𝜄. Intuitively, Page 13: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 13 this invariant contains the internal state of the counter but the details are unknown to clients. The predicate is persistent, i.e.,counter𝜄𝑐⊣⊢counter𝜄𝑐∗counter𝜄𝑐and can hence be freely shared. The predicates cauth andcfrag provide authoritative andfragmental views of the counter. In- tuitively, cauth provides the counter module’s view of the counter and cfrag denotes the client’s view. A fragmental view cfrag𝑞 𝑛denotes a𝑞-fractional view that the counter is at least the value𝑛. The cfrag𝑞 𝑛resource can be split and combined, i.e.,cfrag(𝑞1+𝑞2)(𝑛1+𝑛2) ⊣⊢ cfrag(𝑞1,𝑛1)∗cfrag(𝑞2,𝑛2)and thus shared. The fragmental view is guaranteed to be consis- tent with the authoritative view, i.e.,cauth𝑛∗cfrag𝑞𝑚⊢𝑚≤𝑛andcauth𝑛∗cfrag 1𝑚⊢𝑚=𝑛, and updated accordingly, i.e.,cauth𝑛∗cfrag𝑞𝑚⊢|⇛Ecauth(𝑛+𝑝)∗cfrag𝑞(𝑚+𝑝). The specification for the increment and read functions are parameterized by a view shift that gives (temporary) access to the module’s view. This is one of the key ideas of HOCAP-style specifications. From the client’s perspective, the view shift is a proof obligation. For the increment function, proving this view shift requires having ownership of a fragmental view (to update the resources), but the fragmental view can be provided by opening an invariant using the update modality. The client-chosen predicate 𝑄lets the client derive information as part of the view shift. For example, they can pick 𝑄𝑧≜cfrag𝑞(𝑛+1)∧𝑧=(𝑛+1). Probabilistic Concurrent modules with Error Redistribution. Now, consider the random- ized concurrent counter module from §2 where the increment function increments the counter by a value chosen uniformly at random from 0 to 3. For the client to be able to redistribute error credits as part of the random sampling, we parameterize the specification of the increment function by another view shift as shown in Figure 4. ∀E,𝜄,𝑐,𝑄. counter𝜄𝑐∗ |⇛E ∅∃𝜀,F.E(𝜀)∗(E𝔘3[F]≤𝜀)∗∀𝑥∈{0..3}.E(F(𝑥))∗|⇛∅ E(∀𝑧.cauth𝑧∗|⇛Ecauth(𝑧+𝑥)∗𝑄𝜀F𝑥𝑧) incrCntr𝑐 {𝑧.∃𝜀,F,𝑥.𝑄𝜀F𝑥𝑧}E⊎{𝜄} Fig. 4. Specification of incrCntr for a concurrent randomized counter module Notice that in the precondition the client now has to prove a view shift which is split into two parts. We begin by looking at the second part (the line at the bottom). This is analogous to the deterministic case, except that now the abstract state cauth𝑧gets incremented by some uniformly sampled𝑥∈{0..3}. This operation is randomized, so we also let the client update their error credits along this distribution, which is the first part of the view shift. After opening all invariants in E, the client chooses some 𝜀and an error distribution function F, gives up E(𝜀), gets back E(F(𝑥)), re-establishes all invariants in E, and goes on to prove the second part. Notice that the specification allows the client to retrieve error credits from an invariant. Intuitively, these two parts of the view shift capture two separately logically atomic actions of the increment operation. The first being the random operation where we re-distribute errors, and the second being the actual increment, where we increase the counter by the sampled value. If all these preconditions are satisfied, then at the end of incrCntr , we return some value 𝑧which satisfies 𝑄𝜀F𝑥𝑧for some𝜀,F, and𝑥. The specification for creating and reading the counter are unchanged as no randomization is involved. Probabilistic Concurrent Modules with Error Redistribution and Presampling. One limitation of the previous specification is that the sampling operation is fixed to take place within the function call incrCntr . As a result, the only point at which randomness can be generated for the Page 14: 14 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal module, and errors can be distributed, is at the invocation of the increment operation. However, it is sometimes useful to reason about the probabilistic part of the operation asynchronously. In Clutch [Gregersen et al .2024] presampling tapes are used to generate randomness asyn- chronously and facilitate refinement proofs. In a concurrent setting, there is also an asynchronous component arising from the order in which randomized operations are physically resolved, and we propose the use of presampling and tapes to resolve them in advance and independently from this order. In the previous specification, the view shift consisted of a probabilistic part ( i.e., spending and distribution of error credits) and a deterministic part (updating the abstract state). Presampling allows us to decouple these two parts and reason about them separately, resulting in a more expressive HOCAP-style specification. We demonstrate that by exposing tapes and presampling operations in the module specifications, clients can perform presampling for an abstract randomized operation. This is an indispensable technique for verifying certain concurrent modules, and we show an example in §5.2. The new and final specification of the probabilistic counter module, which includes not only error redistribution, but also a (ghost) method for creating an abstract presampling tape and a (ghost) operation for sampling on a tape, see Figure 5. This specification has a new predicate ctape ∀E,𝜀,F,®𝑛,𝜅.(E(𝜀)∗(E𝔘3[F]≤𝜀)∗ctape𝜅®𝑛∗|⇝⇝E∃𝑛∈{0..3}.E(F(𝑛))∗ctape𝜅(®𝑛·[𝑛])) ∀𝜄,𝑐.{counter𝜄𝑐}createCtape(){𝜅.ctape𝜅𝜖} ∀E,𝜄,𝑐,𝑛,®𝑛,𝑄.counter𝜄𝑐∗ctape𝜅(𝑛·®𝑛)∗ (∀𝑧.cauth𝑧∗|⇛Ecauth(𝑧+𝑛)∗𝑄𝑧) incrCntr𝑐𝜅 𝑧.ctape𝜅®𝑛∗𝑄𝑧 E⊎{𝜄} Fig. 5. Specification for a randomized counter module with presampling tapes. that stores the presampled randomness for the random counter. Note that ctape is an abstract predicate which might be realized in multiple ways besides using primitive tape predicates, which allows us to hide the details of how different implementations of the counter module physically generate randomness. By exposing the abstract presampling tape explicitly, we aim to capture more of the proof principles for a concrete randomized operation. (In §5.2, we demonstrate that this specification which exposes abstract tapes is in fact more general than the previous one) Compared to the previous randomized specification, reasoning about randomness of the incre- ment operation is now extracted into a separate condition that utilizes the probabilistic update modality (|⇝⇝𝑃), which says that we can presample onto the ctape and distribute errors in a expectation-preserving manner. With this change, clients can allocate their own local tapes via createCtape and reason about randomness locally. The incrCntr function takes a non-empty ctape predicate as argument, and acts in a (logically) deterministic manner, by reading and consuming the first element 𝑛of the tape, and incrementing the abstract state of the counter by 𝑛. Now that we have shown an expressive specification (Figure 5), in the following sections, we show how this specification suffices to verify clients. We also show that three different implementations of the probabilistic random counter module all meet this specification. These three implementations exhibit different numbers of sampling operations, but yet they all meet the same abstract module specification. In other words, the randomization of the increment operation acts “logically atomic” as expressed by a single probabilistic update, even if in reality, it is not. From the perspective of a client, random sampling within the increment operation appears to behave as if it is simply a single rand 3. We refer to this as randomized logical atomicity . Page 15: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 15 5.2 Verifying Clients of Randomized Counter Module We now describe how the specification with error re-distribution and presampling tapes shown in Figure 5 can be used to verify concurrent clients. We also show how the HOCAP-style specification that exposes abstract tapes is more general than the one that does not. Revisiting conTwoAdd .We begin with the conTwoAdd client example introduced in § 2. Since the new specification of the randomized concurrent counter utilizes tapes, we annotate the conTwoAdd client to use the abstract tapes exposed in the specification: conTwoAdd≜let𝑐=createCntr()in let𝜅=createCtape()in incrCntr𝑐𝜅|||let𝜅=createCtape()in incrCntr𝑐𝜅 ; readCntr𝑐 Recall that we expect the return value to be 0with a probability 1/16. We state this through the following Coneris Hoare triple: {E(1/16)}conTwoAdd{𝑣.𝑣>0}. We present here a high level of the proof, and defer the details to Appendix A. Most of this proof is similar to the one sketched in §2 where we allocate an invariant that encodes a protocol that tracks both the available amount of error credits and the ghost state of both threads and describes how they can evolve. In the case where both threads sampled 0, we are able to obtain E(1)from the invariant at the end and derive a contradiction with err-1 . The difference between this proof and that from §2 are twofold. Firstly, the randomness is generated asynchronously using the presampling rule and the abstract tapes. The probabilistic update modality allows us to open the invariant, obtain error credits from it, presample onto our abstract tapes, redistribute the error credits, and close the invariant again, all in an atomic manner. Secondly, to apply incrCntr andreadCntr , we need to prove the view shifts in the precondition of their corresponding Hoare triple specifications. An important detail of this proof is that we do not need to place any ctape predicate within the invariant. This is because each thread uses a separate, local tape (which means that tapes need not be shared in an invariant). This kind of “local tape” principle lets each thread “own” its own randomness and this helps simplify the proof since we need not worry how the state of ctape is changed by other external concurrent threads. Advantage of Exposing Abstract Tapes. Recall that in §5.1, we presented a simpler speci- fication of the randomized counter module in Figure 4 that does not expose presampling tapes as abstract predicates. To see why that specification is not as general as that in Figure 5 and that it is useful to expose the presampling tapes in the specification, consider the following twoIncr program and its specification in Figure 6. The sequential program twoIncr first creates a new ran- domized counter and allocates a tape for the counter. It then performs two incrCntr andreadCntr pair operations successively, to read the exact values 𝑣1and𝑣2incremented into the counter. At the end it returns 4∗𝑣1+𝑣2. Because both 𝑣1and𝑣2are sampled uniformly from {0,..., 3}, the return value is uniformly distributed between {0,..., 15}. This is captured by the Hoare triple in Figure 6, where we are able to split some amount of error credits across the 16possibilities in an expectation-preserving way. Note that the view shift of the Hoare triple captures the fact that this re-distribution of error credits happens in a logically-atomic manner. Proving the specification of twoIncr with the more general specification (Figure 5) is relatively straightforward. After applying the specification for creating the counter and the tape, we perform two consecutive presamples onto ctape with the presampling specification of the counter module. These two presampling operations are combined into one atomic operation with the pupd-bind Page 16: 16 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal twoIncr _≜let𝑐=createCntr()in let𝜅=createCtape()in incrCntr𝑐𝜅; let𝑣1=readCntr𝑐in incrCntr𝑐𝜅; let𝑣2=readCntr𝑐−𝑣1in 4∗𝑣1+𝑣2∀𝜄𝑄E.  ( |⇛E ∅∃𝜀F. E(𝜀)∗(E𝔘15[F]≤𝜀)∗ (∀𝑥.E(F(𝑥))∗|⇛∅ E𝑄𝜀F𝑥))   twoIncr(){𝑧.∃𝜀F.𝑄𝜀F𝑧}E⊎{𝜄} Fig. 6. Code and Specification of twoIncr rule, allowing us to use the view shift provided in the precondition to split the error credits for the 16possibilities. The rest of the proof follows directly by applying the specification for incrementing the counter with the tape and reading from it twice. However, the specification without the presampling tapes exposed (Figure 4) is not strong enough to prove this Hoare triple. This is because this specification restricts the error redistribution to only occur within the incrCntr call, and we are unable to combine the two separate error redistribution operations in each incrCntr call into one atomic action. On the other hand, the more general specification allows us to “pull” the randomized operation out of the incrCntr call, and lets us perform the randomized operation in advance via the presampling operation of the abstract tapes. 5.3 Three Implementations of the Randomized Counter Module Recall from §5.1 that the randomized counter module of the specification from Figure 5 must provide four methods: createCntr for creating the counter, createCtape for creating a tape, incrCntr for incrementing the counter with a random value chosen uniformly from the set {0,..., 3}(sampled from the tape), and readCntr for reading the value of the counter. To emphasize the point about modularity, we consider three implementations that we show meet the same specification, which we refer to as 𝐼1,𝐼2, and𝐼3, respectively. They only differ in the way they implement the createCtape andincrCntr method – the implementations of createCntr andreadCntr are the same in the three implementations and look as follows: createCntr≜𝜆_.ref0 readCntr≜𝜆𝑙.!𝑙 Internally, the counter is represented by a pointer to a number and the read method simply dereferences the pointer. The three implementations of the create tape and increment methods are shown in Figure 7. In 𝐼1, the increment method simply increments the counter value stored at createCtape1≜𝜆().tape 3 createCtape2≜𝜆().tape 1 createCtape3≜𝜆().tape 4incrCntr 1≜𝜆𝑙𝜅. faa𝑙(rand𝜅3) incrCntr 2≜𝜆𝑙𝜅. let𝜅=tape 1in faa𝑙(rand𝜅1∗2+rand𝜅1) incrCntr 3≜rec𝑓 𝑙𝜅=let𝑥=rand𝜅4in if𝑥<4then faa𝑙𝑥else𝑓 𝑙𝜅 Fig. 7. Implementation of the counter module. the location by a rand 3chosen value between 0and3using a fetch-and-add atomic instruction Page 17: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 17 (faa). The function hence creates a tape with bound 3. In𝐼2, the increment method is implemented using two coin-flips (i.e., calls to rand 1), and in𝐼3, we use a recursive rejection sampler that, in order to simulate rand 3, repeatedly samples from rand 4until it gets a value within {0,...3}. The createCtape function for both implementations creates a tape with bound 1and4, respectively. For𝐼2and𝐼3in particular, it is interesting that even though the implementations do not sample randomness atomically (e.g., 𝐼3can possibly execute any number of rand 4operations), they still meet the specification where the presampling of a single value onto the abstract tape is described by a single probabilistic update modality (we show this in the next subsection). In other words, we capture randomized logically atomicity of the module, in the sense that externally, there appears to be a single randomized transition within the incrCntr function. 5.4 Proving that 𝐼1,𝐼2, and𝐼3satisfy the HOCAP-style Specification with Error Redistribution and Presampling Tapes We now show how the three randomized counter implementations meet the specification with error redistribution and presampling tapes. We start by giving concrete definitions for the three abstract predicates. For the three implementations, it turns out that the counter predicate counter , and the cauth andcfrag predicates are defined identically; the persistent counter predicate counter𝜄𝑐is defined as∃𝑙,𝑛.𝑐=𝑙∗𝑙↦→𝑛∗cauth𝑛𝜄, and the cauth andcfrag predicates are defined with a standard authoritative-fractional resource algebra. We show the exact definition of ctape for each of the three implementations below. ctape1𝜅®𝑛≜𝜅↩→(3,®𝑛) ctape2𝜅®𝑛≜𝜅↩→(1,expand®𝑛)∗Forall(𝜆𝑥.𝑥 <4)®𝑛 ctape3𝜅®𝑛≜∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=®𝑛∗𝜅↩→(4,®𝑚) Forctape1, since the first implementation uses a rand 3to sample from 0to3directly, we define ctape1with the presampling tape 𝜅↩→(3,®𝑛). For ctape2, since we are sampling from 0to3via tworand 1s, the predicate is defined by expanding the tape elements into its binary representation. The function expand takes in a list of numbers and rewrites them into binary representation while keeping the list “flattened”. For example expand([2; 3; 1; 0])returns[1; 0; 1; 1; 0; 1; 0; 0]. Finally, for the third rejection sampler-like implementation, if we are logically storing ®𝑛with our ctape predicate, the concrete tape stores some list ®𝑚such that®𝑛is equal to®𝑚with all 4s removed from it. It suffices to show that the functions createCntr ,createCtape ,incrCntr , and readCntr satisfy the specification and that ctape satisfies its presampling probabilistic update specification, i.e., we can logically append a new element into the ctape while redistributing errors. The specification of the functions are not too complicated. As an example, consider the incrCntr specification for 𝐼3. counter𝜄𝑐∗ctape𝜅(𝑛·®𝑛)∗ (∀𝑧.cauth𝑧∗|⇛Ecauth(𝑧+𝑛)∗𝑄𝑧) (rec𝑓 𝑙𝜅=let𝑥=rand𝜅4in if𝑥<4then faa𝑙𝑥else𝑓 𝑙𝜅)𝑐𝜅 {𝑧.ctape𝜅®𝑛∗𝑄𝑧}E⊎{𝜄} After unfolding the definition of the abstract predicates for 𝐼3, we repeatedly loop through the recursive function until we reach a value 𝑛in the tape that is smaller than 4by structural induction on the tape or Löb induction. During the atomic faaoperation, we open the invariant with ht-inv- open and eliminate the view shift in the precondition. The specification of the other functions can be proven similarly. Page 18: 18 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal We now focus on showing that for each of the ctape definitions, they satisfy the presampling specification. For ctape1, we see after unfolding its definition, the statement of the presampling specification is the same as that of pupd-presample-exp and hence holds directly. For ctape2, it suffices to prove the following probabilistic update: ∀®𝑛.(E𝔘3[F]≤𝜀)∗𝜅↩→(1,®𝑛)∗E(𝜀)∗ |⇝⇝E∃𝑣1𝑣2.E(F(𝑣1∗2+𝑣2))∗𝜅↩→(1,®𝑛·[𝑣1,𝑣2])(7) This probabilistic update is valid because we can do two presamples consecutively via pupd-bind . In detail, we first apply pupd-presample-exp to presample the first bit, choosing the first error splitting functionF𝑎≜𝜆𝑏.if𝑏=1thenF(2)+F( 3)elseF(0)+F( 1). We then do a case split to check which bit we really presampled. If it is a 0, we apply pupd-presample-exp again, choosing the error splitting function to be F𝑏≜𝜆𝑏.if𝑏=1thenF(1)elseF(0). Otherwise, we instantiate F𝑏≜𝜆𝑏.if𝑏=1thenF(3)elseF(2). Forctape3we want to show that we can repeatedly presample enough values onto the tape such that the last element is smaller than 4and all values beforehand are 4, while distributing the error credit according to the final value, which can be shown by the following lemma: ∀®𝑛.(E𝔘3[F]≤𝜀)∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=®𝑛∗𝜅↩→(4,®𝑚))∗E(𝜀)∗ |⇝⇝E∃𝑛.0≤𝑛<4∗E(F(𝑛))∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=(®𝑛·[𝑛])∗𝜅↩→(4,®𝑚))(8) We prove this probabilistic update through induction by error amplification . We first apply pupd-bind andpupd-err to obtain some positive error credit E(𝜀′)to get the following: (E𝔘3[F]≤𝜀)∗𝜀′>0∗E(𝜀′)∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=®𝑛∗𝜅↩→(4,®𝑚))∗E(𝜀)∗ |⇝⇝E∃𝑛.0≤𝑛<4∗E(F(𝑛))∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=(®𝑛·[𝑛])∗𝜅↩→(4,®𝑚)) Now we apply the induction by error amplification rule below (see Eris [Aguirre et al .2024] for more details): ind-err-amp 𝜀1>0𝑘>1 E(𝜀1)∀𝜀2.(E(𝑘·𝜀2)∗𝑃)∗E(𝜀2)∗𝑃 𝑃 Morally this states that in order to prove 𝑃we can assume it holds guarded by an amount of credits amplified by a factor 𝑘strictly greater than 1. We choose the amplification factor 𝑘≜5. It suffices to show (with the induction hypothesis highlighted): (E𝔘3[F]≤𝜀)∗𝜀′>0∗E(5·𝜀′)∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=®𝑛∗𝜅↩→(4,®𝑚))∗...∗ E(𝜀′)∗(∃®𝑚.filter(𝜆𝑥.𝑥 <4)®𝑚=®𝑛∗𝜅↩→(4,®𝑚))∗E(𝜀)∗... We can now combine E(𝜀)∗E(𝜀′)with err-split and apply pupd-presample-exp with E(𝜀+𝜀′)as the initial error budget. We choose the distribution function to be 𝜆𝑥.if𝑥<4thenF(𝑥)else𝜀+5·𝜀′. After a single presampling step, we do a case split on whether the presampled value is 4or not. If it is, then we establish the conclusion with the induction hypothesis since we successfully amplified the error credit E(𝜀′)by a factor of 5. Otherwise, we presampled an “accepted” value, and we can directly establish the goal via pupd-ret . 6 Case Studies In this section, we present several other case studies that we have verified using Coneris. Page 19: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 19 6.1 Thread-Safe Hash Functions Hash functions are often assumed to behave uniformly [Bellare and Rogaway 1993]. That is, a hash function ℎfrom a set of keys 𝐾to a set of values 𝑉behaves as if, for each key 𝑘, the hash ℎ(𝑘)is randomly sampled from a uniform distribution over 𝑉independently of all other keys. This assumption can be modeled using an idealized hash function that uses a mutable map, which serves as a cache of hashes computed so far [Mittelbach and Fischlin 2021]. If the key has already been hashed, we return the value stored in the map, otherwise we sample a fresh value uniformly, store it in the cache, and return it. In the concurrent setting, however, this does not suffice: if two threads concurrently attempt to hash the same key 𝑘, they may end up with different hash values. If one thread gets preempted by the scheduler right after sampling, a second thread could overtake and sample a different value before the first thread stores its value to the cache. We implement a thread-safe idealized hash function using a lock. To hash a value, one first acquires the lock, then samples the key and stores it to the cache, before releasing the lock again. While the implementation is uninteresting, its specification is not. In particular, we give a specifica- tion that offers exclusive ownership of each key 𝑘and the ability to presample the hash ℎ(𝑘). As we later see in §6.2, this ability can greatly simplify the probabilistic analysis of concurrent data structures that use hashing. ThehashInit function initializes a new hash function and satisfies the specification below. {True}hashInit()n ℎ.∃𝛾.hashFun𝛾ℎ∗∗𝑘∈𝐾hashKey𝛾𝑘−o Here,𝛾is an abstract ghost name identifier used for tracking ghost state. The abstract predicate hashFun𝛾ℎis duplicable, i.e.,hashFun𝛾ℎ⊣⊢hashFun𝛾ℎ∗hashFun𝛾ℎ, while hashKey𝛾𝑘− represents that key 𝑘has not yet been hashed, and is exclusive, i.e.,hashKey𝛾𝑘−∗hashKey𝛾𝑘−⊢ False . When invoking a hash function on a key with an undecided value, a fresh value 𝑣∈𝑉is sampled and hashKey𝛾𝑘𝑣is returned. The predicate hashKey𝛾𝑘𝑣is duplicable and each subsequent invocation is guaranteed to return 𝑣. n hashFun𝛾ℎ∗hashKey𝛾𝑘−o ℎ𝑘n 𝑣.∃𝑣∈𝑉.hashKey𝛾𝑘𝑣o n hashFun𝛾ℎ∗hashKey𝛾𝑘𝑣o ℎ𝑘{𝑤.𝑤=𝑣} However, hash values can also be presampled and error credits redistributed across the possible outcomes of the presampling using the probabilistic update below. hashFun𝛾𝑓∗hashKey𝛾𝑘−∗ E(𝜀)∗ |⇝⇝⊤∃𝑣∈𝑉.hashKey𝛾𝑘𝑣∗(𝑣∈𝑋∗E(𝜀1))∨(𝑣∉𝑋∗E(𝜀0)) Here𝑋⊆𝑉is some set of hash values and 𝜀1,𝜀0∈[0,1]such that𝜀1·|𝑋|+𝜀0·(|𝑉|−|𝑋|)≤𝜀·|𝑉|. For example, by picking 𝜀1≜1,𝜀0≜0, and𝜀≜|𝑋|/|𝑉|one can spend 𝜀error credits to avoid the outcomes in 𝑋when determining the hash ℎ(𝑘). We show the specification by allocating a fresh presampling tape for each key in K. A similar idea is used in previous work [Gregersen et al .2024] to show refinement of lazy and eager hash functions. In our specification, intuitively, hashKey𝛾𝑘−denotes exclusive ownership of 𝑘’s presampling tape which is transferred to an invariant after presampling. This invariant captures that, for all keys 𝑘, either𝑛has been presampled onto 𝑘’s tape or𝑛has been stored at entry 𝑘in the hash function’s cache. Page 20: 20 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal bfInit()≜ let hfs =List.init𝑘(λ_.hash _init())in let arr =Array.init𝑆false in (hfs,arr)bfAdd bfl𝑥≜ let(hfs,arr)=bfl in List.Iter(λℎ.let𝑖=ℎ𝑥in arr[𝑖]← true)hfs bfLookup bfl 𝑦≜ let(hfs,arr)=bfl in let res =ref true in List.Iter(λℎ.let𝑖=ℎ𝑦in res←!res&&arr[𝑖])hfs; !resbfMain xs y≜ let bfl=bfInit()in (rec𝑓zs= match zs with |[]⇒() |z::zs′⇒(bfAdd bfl z)|||( f zs′) end)ks; bfLookup bfl k Fig. 8. Implementation of Concurrent Bloom Filter 6.2 Bloom Filter Bloom filters are approximate data structures to represent sets, with operations for inserting elements and querying for membership. In their most basic, sequential presentation, a Bloom filter consists of an array of bits of a fixed size 𝑆, initially set to 0, and a list of hash functions (ℎ1,...,ℎ𝑘) of some fixed length 𝑘. When inserting an element 𝑥, we compute(ℎ1(𝑥)mod𝑆,...,ℎ𝑘(𝑥)mod𝑆) and set those indices to 1. When checking if an element 𝑦is in the set, we also compute (ℎ1(𝑦) mod𝑆,...,ℎ𝑘(𝑦)mod𝑆), and look up those indices in the array. If they are all set to 1, we answer positively, otherwise we answer negatively. Thus, when checking the membership of an element that is not in the set there exists a small probability of observing a false positive if there are hash collisions with previously inserted elements. Computing this probability is challenging and requires involved combinatorial reasoning, in fact Bloom’s original analysis [Bloom 1970] gave the wrong bound. An efficient concurrent implementation of a Bloom filter allows parallel insertions, since con- current writes to the same entry in the array would both set the entry to 1. In this case study, we implement a concurrent Bloom filter and prove a bound on the probability of observing a false positive result on a membership query. We use the concurrent hash module presented in §6.1 to implement this concurrent Bloom filter example (see Figure 8). First consider 𝑁sequential insertions 𝑥1,...,𝑥𝑁followed by checking membership for some 𝑦∉{𝑥1,...,𝑥𝑁}. From a mathematical perspective, the probability of false positive corresponds to the following experiment: first sample a batch of 𝑘·𝑁integers uniformly at random in {0,...,𝑆−1}. Now sample a second batch of 𝑁integers in the same manner. What is the probability that they areallin the first batch? The exact bound was first calculated by Bose et al .[2008], and in later work, Gopinathan and Sergey [2020] mechanized the proof. Now suppose that the insertions 𝑥1,...,𝑥𝑁happen in𝑁parallel threads. Intuitively, concurrent implementations of Bloom filters should have the same probability of false positive, since parallel queries to hash functions are independent. Using our logic, we can make this intuition concrete, and prove that the bound in the concurrent setting indeed corresponds to the sequential one. Page 21: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 21 Our modular approach allows us to simplify the mathematical reasoning within the proof of the specification and defer all complex combinatiorial reasoning to the meta-level. The proof crucially relies on both the stateful representation of error probabilities ( i.e., error credits) as well as the notion of randomized logical atomicity, which allows us to presample all randomness in advance. The key observation is that the probability of false positive follows a simple recurrence. Let Efp(𝑙,𝑏)be the probability of observing a false positive for a single membership query after setting 𝑙uniformly selected indices to 1in an array that already contains 𝑏bits set to 1. Efp(0,𝑏)≜(𝑏 𝑆)𝑘 Efp(𝑙+1,𝑏)≜𝑏 𝑆·Efp(𝑙,𝑏)+𝑆−𝑏 𝑆·Efp(𝑙,𝑏+1) Our analysis can therefore assume that every time we hash, we start with EEfp(𝑙+1,𝑏)for some 𝑙, where𝑏is the number of distinct hash outputs that have been observed so far, and then obtain either EEfp(𝑙,𝑏)orEEfp(𝑙,𝑏+1)depending on whether or not the output of the hash is a new one or not. This means that the decision on how to distribute credits can be done locally everytime we hash a new element. With this in mind, we can prove the following spec: NoDup(xs)∗y∉xs∗EEfp(k·|xs|,0) bfMain𝑥𝑠𝑦{𝑣.𝑣=false} (9) i.e., the probability of false positive is at most Efp(𝑘·|𝑥𝑠|,0), which corresponds to the theoretical bound given by Bose et al .[2008] for the sequential setting4. In order to simplify reasoning about concurrent hashing, we presample the hash outcomes for every key in 𝑥𝑠in advance, using the hash specification in §6.1. It is at this point that most reasoning about probabilities takes place, and that we do the distribution of error credits. After this phase, we have EEfp(0,𝐵)for some𝐵, as well as predicates of the form hashKey𝛾𝑘𝑖𝑣𝑖for every key and every hash, and we know that the set of presampled hash outcomes has cardinality 𝐵. Then we execute all insertions, with an invariant that ensures that the array never has more than 𝐵elements set to 1. Finally, we can do a lookup, and use our error credits EEfp(0,𝐵)to ensure that at least one of the indices we look up is set to 0, which guarantees that the query returns false . To the best of our knowledge, we are the first to prove a tight bound on the probability of false positives for a concurrent Bloom filter (9). For more details on the analysis, we refer the reader to our Rocq development. 6.3 Other Case Studies Other case studies demonstrating the flexibility of our approach in verifying concurrent randomized data structures can be found in Appendix C. We first define a Rand module that captures the operation of sampling from a uniform distribution atomically and we provide three implementations that satisfy it (similar to the implementations in the randomized counter module from §5.3). We then use this Rand module to implement a lazy random sampler that features a specification that utilizes the probabilistic update modality in its view shift and we provide an example to show why this feature is important. We also implement a concurrent collision-free hash data structure and show that it meets an amortized error specification; the error required for each presampling operation is fixed as it is amortized across a fixed number of insertions. 4Note that their bound is given as a closed mathematical expression and we have not mechanized that it corresponds to our recursive definition. Page 22: 22 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal 7 Semantic Model and Soundness Coneris is implemented on top of the Iris [Jung et al .2018] base logic, which in isolation, is simply a higher-order separation logic not tied to any specific programming language. In this section, we define the semantic model of Coneris and explain how to prove the soundness of the program logic. 7.1 Model Weakest Precondition. The Coneris Hoare triple is defined in terms of a weakest precondition predicate as follows: {𝑃}𝑒{𝑄}≜(𝑃∗wp𝑒{𝑄}) Expressing Hoare triples in term of a weakest precondition is standard for defining program logics [Jung et al .2018], especially for other similar Iris non-probabilistic logics. The definition of the weakest precondition is however novel, which we detail below. Note that the weakest precondition is defined as a guarded fixed point: the recursive occurrences of the weakest precondition appear under the later modality ⊲on the last line. wp𝑒1{Φ}≜∀𝜎1,𝜀1.𝑆(𝜎1,𝜀1)∗|⇛⊤ ∅ sstep𝜎1𝜀1{𝜎2,𝜀2. 𝑒1∈Val∗ |⇛∅ ⊤𝑆(𝜎2,𝜀2)∗Φ(𝑒1)∨ 𝑒1∉Val∗pstep(𝑒1,𝜎2)𝜀2{𝑒2,𝜎3,𝑙,𝜀 3. ⊲sstep𝜎3𝜀3{𝜎4,𝜀4.|⇛∅ ⊤𝑆(𝜎4,𝜀4)∗wp𝑒2{Φ}∗∗𝑒′∈𝑙wp𝑒′{True}}}} One can intuitively understand wp𝑒{Φ}as a proposition that describes that 𝑒issafe, meaning it does not get stuck, and that for every possible return value 𝑣, the postcondition Φ(𝑣)holds. We now explain the definition of the weakest precondition in detail. At the beginning, we assume the ownership of a state interpretation 𝑆(𝜎1,𝜀1)for some state 𝜎1and error value 𝜀1. This state interpretation 𝑆:State→R≥0→iProp gives meaning to the ownership of references ℓ↦→𝑣, tapes 𝜅↩→(𝑁,®𝑛), and error credits E(𝜀). The resource algebra used to instantiate the state interpretation is standard, and we refer the readers to Aguirre et al. [2024] for more details. After that, we perform a view shift through a fancy update modality |⇛⊤ ∅ , which intuitively means we open all invariants temporarily and that we have access to the resources of all invari- ants we defined previously. Following the view shift, we need to prove a state step precondition sstep𝜎1𝜀1{...}. The exact definition of the state step precondition is explained later. For now, we can think of the state step precondition as the modality that allows instantaneous probability- preserving updates supported by the probabilistic update modality |⇝⇝𝑃. Given state 𝜎1and error budget𝜀1, we can perform any number of probability-preserving updates to step to resulting state 𝜎2and leftover error budget 𝜀2, which must satisfy the rest of the continuation. The next part of the weakest precondition depends on a case split on the expression 𝑒1. In the first case, where 𝑒1is a value, we do a view shift |⇛∅ ⊤ where we re-establish allinvariants, return the updated state interpretation and show that the return value 𝑒1satisfies the postcondition Φ. Otherwise, if 𝑒1is not a value, we have to prove a program step precondition pstep(𝑒1,𝜎2)𝜀2{...}. We later explain the specifics of this precondition modality, but for now, one can loosely understand the connective as somewhat similar to the state step precondition, where we take an actual step on the configuration (𝑒1,𝜎2), instead of performing a probabilistic update on 𝜎2. After the single step to resulting expression 𝑒2, state𝜎3, forking the list of expressions 𝑙with leftover error budget 𝜀3, we prove another state step precondition, which we can ignore here.5Finally, we re-establish all invariants after the view shift |⇛∅ ⊤ , return the state interpretation, show that wp𝑒2{Φ}holds, andwp𝑒′{True}holds for all 𝑒′in the forked list 𝑙. 5This extra state step precondition is only used to validate certain invariant opening properties not discussed in this paper. Page 23: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 23 state-step-err-1 1≤𝜀 sstep𝜎𝜀{Φ}state-step-ret Φ(𝜎,𝜀) sstep𝜎𝜀{Φ}state-step-continuous ∀𝜀′.𝜀<𝜀′∗sstep𝜎𝜀′{Φ} sstep𝜎𝜀{Φ} state-step-exp E𝜇[F]≤𝜀 schErasable(𝜇,𝜎 1) ∀𝜎2.0<𝜇(𝜎2)∗sstep𝜎2(F(𝜎2)){Φ} sstep𝜎1𝜀{Φ} Fig. 9. Inductive Definition of the State Step Precondition sstep 𝜎𝜀{Φ}. State and Program Step Preconditions. The state step precondition is defined inductively by four inference rules presented in Figure 9. Firstly, if the error budget 𝜀is larger or equal to 1, the precondition holds trivially as all sub-distributions have mass smaller or equal to 1(state-step- err-1 ). If the predicate Φholds for the current state and error budget, the precondition also holds (state-step-ret ). The third rule state-step-continuous states that the precondition holds if for error budget 𝜀′larger than the 𝜀(in particular, 𝜀′can be arbitrarily close to 𝜀), then the precondition does in fact hold for 𝜀as well. This is the main rule that allows us to create error credit from thin air (pupd-err ), letting us exploit the fact that the real numbers are complete at the level of Coneris. Lastly, state-step-exp is the main interesting rule, which relies on the following auxiliary definition. It relies on the following auxiliary definition. Definition 7.1. A distribution on states 𝜇is ascheduler erasable state update of𝜎∈State , written asschErasable(𝜇,𝜎), if for all schedulers 𝜁, scheduler states Ξ, thread pools®𝑒, and any number of execution steps 𝑛, we have (𝜇≫=(𝜆𝜎′.pexec𝜁,𝑛(Ξ,(®𝑒,𝜎′)))).tp=(pexec𝜁,𝑛(Ξ,(®𝑒,𝜎))).tp where we write−.tpfor the function that projects out the thread pool component from a distribution on configurations. A distribution 𝜇is thus a scheduler erasable state update of 𝜎if the probability of executing ®𝑒 from state𝜎to any particular list of threads is the same if we first update the state with respect to𝜇and then execute®𝑒. Recall that the operational semantics requires schedulers to be invariant under changes to presampling tapes; such changes thus constitute scheduler erasable state updates. The state-step-exp rule then states that if we can find a function F:State→[ 0,1]and a distribution 𝜇:D(State)such that (1) the expectation of Fwith respect to 𝜇is at most𝜀, (2)𝜇 is scheduler erasable with respect to 𝜎1, and (3) for all 𝜎2, the continuation sstep𝜎2(F(𝜎2)){Φ} holds, then sstep𝜎1𝜀{Φ}holds. This is the rule that allows us to do presampling on tapes, since the presampling action is a scheduler erasable operation. The program step precondition is defined by a single inference rule prog-step-exp . It is similar to that of state-step-exp , except that we take exactly one step of the configuration (𝑒1,𝜎1). In detail, pstep(𝑒1,𝜎1)𝜀{Φ}holds if the configuration (𝑒1,𝜎1)is reducible and there exists some functionF:Expr×State×List(Expr)→[ 0,1]whose expectation with respect to step(𝑒1,𝜎1)is smaller or equal to 𝜀, and for all(𝑒2,𝜎2,𝑙), the continuation Φ(𝑒2,𝜎2,𝑙,F(𝑒2,𝜎2,𝑙))holds. prog-step-exp red(𝑒1,𝜎1)Estep(𝑒1,𝜎1)[F]≤𝜀 ∀(𝑒2,𝜎2,𝑙).0<step(𝑒1,𝜎1)(𝑒2,𝜎2,𝑙)∗Φ(𝑒2,𝜎2,𝑙,F(𝑒2,𝜎2,𝑙)) pstep(𝑒1,𝜎1)𝜀{Φ} Page 24: 24 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal Probabilistic Update Modality. Recall that the probabilistic update modality |⇝⇝E1E2𝑃depends on two masksE1andE2. These extra mask parameters are used to track the opening of invariants and prevent us from opening the same invariant twice (which is unsound). One can understand |⇝⇝E1E2𝑃as that we can perform a probability-preserving update to get the resources 𝑃with the possibility of accessing resources of invariants in the mask E1and reestablishing resources of invariants in the mask E2in the end. |⇝⇝E1E2𝑃≜∀𝜎1,𝜀1.𝑆(𝜎1,𝜀1)∗|⇛E1∅sstep𝜎1𝜀1{𝜎2,𝜀2.|⇛∅ E 2𝑆(𝜎2,𝜀2)∗𝑃} The implementation of the probabilistic update modality somewhat resembles a simplified version of the weakest precondition, where we only perform a single state step. Specifically, |⇝⇝E1E2𝑃 holds if after assuming some state interpretation 𝑆(𝜎1,𝜀1), we can open all invariants in E1through the view shift|⇛E1∅, and prove a state step precondition with the input parameters 𝜎1and𝜀1. Given resulting state 𝜎2and error budget 𝜀2after the state step precondition, we re-establish all invariants in the mask E2with the view shift |⇛∅ E 2and give back the state interpretation and prove the resource 𝑃. 7.2 Soundness The soundness of Coneris comes in two flavours, the correctness adequacy theorem Theorem 4.1 and the safety theorem Theorem 4.2. We now briefly describe the overall structure proof of the correctness adequacy theorem; the proof of the safety theorem is similar and is omitted. We first prove an intermediate lemma: Lemma 7.2 (Adeqacy Simplified). IfE(𝜀)⊢wp𝑒{𝜙}∗∗𝑒′∈®𝑒′wp𝑒′{True}, then for all sched- ulers𝜁, states𝜎, and natural numbers 𝑛,Prexec 𝜁,𝑛(𝑒·®𝑒′,𝜎)[¬𝜙]≤𝜀. This lemma is proven by induction on 𝑛and structural induction on the state step precondition fixed point. For each step, we unfold the definition of exec to determine which thread the scheduler chooses to step next. We unfold the definition of the corresponding weakest precondition proposi- tion (the one that matches the thread chosen to step), and show that the sstep andpstep modalities satisfies monadic composition, allowing us to compose the errors. Finally by taking ®𝑒′to be the empty list of threads, and taking the limit of 𝑛, we are able to show Prexec 𝜁𝑒[¬𝜙]≤𝜀, which is the goal of the adequacy theorem. 8 Related Work Approximate Reasoning. There are various approaches for tracking error probabilities in probabilistic programs. The approximate Hoare logic [Barthe et al .2016b] uses a grading on Hoare triples to approximate error probabilities. Expectation-based logics such as that of Batz et al . [2019]; Morgan et al .[1996] are defined with a weakest-precondition-style quantitative predicate transformer that computes the expected value of a program’s postcondition, which can be used to derive approximate correctness bounds. Compared to our work, these logics are usually restricted to sequential, first-order imperative programs. Our method of using error credits to track error bounds is first used in Eris [Aguirre et al .2024] to prove error bounds of sequential higher-order probabilistic programs. Various other logics also considered reasoning about approximate correctness in the relational setting. The logic apRHL [Barthe et al .2016a, 2013] relates the probability distribution of two programs through approximate probabilistic couplings, which can then be used to prove differential privacy. Also inspired by Eris, error credits are used in the relational logic Approxis [Haselwarter et al. 2025] to prove approximate equivalences of higher-order programs. Page 25: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 25 Concurrent Probabilistic Program Logics. One of the first program logics developed for concurrent probabilistic programs is the probabilistic rely-guarantee calculus [McIver et al .2016] (that extends the rely-guarantee logic [Jones 1983]) that verifies the quantitative correctness of a probabilistic concurrent programs without local state. Later, Concurrent Quantitative Separation Logic [Fesefeldt et al .2022] extends Quantitative Separation Logic [Batz et al .2019] to reason about the lower bounds of probability to realize the postcondition of concurrent, heap-manipulating, randomized imperative programs. Compared to our work, it cannot establish strict error bounds that arise between the interleaving of threads (see the conTwoAdd example in §2) and cannot reason about programs in a (procedure-)modular way. The relational logic Polaris [Tassarotti and Harper 2019] establishes refinements between con- current probabilistic programs and a monadic representation via probabilistic couplings inspired by the pRHL logic [Barthe et al .2017a,b, 2012]. The simpler monadic model can then be studied to derive properties of the original programs, such as bounds on its expected value. However, Polaris does not demonstrate how to compose refinements of different data structures and did not develop an approach for reasoning about logical atomicity. Lohse and Garg [2024] developed ExpIris, a variant of Iris that supports establishing bounds on the expected cost of concurrent higher-order programs with mutable state. In ExpIris, an upper bound budget on the number of steps a program can take is written as an additional parameter of weakest preconditions, called a potential . On randomized steps, this potential can be updated in an expectation-preserving way, similar to ht-rand-exp . However, because potentials are a parameter of the weakest precondition, instead of a separation logic resource like error credits, there is no way to share them in an invariant, as we saw was necessary for obtaining tight analyzes in §2. Recently, Probabilistic Concurrent Outcome Logic [Zilberstein et al .2024] extends Demonic Outcome Logic [Zilberstein et al .2025] to reason about the distributions of outcomes from con- current probabilistic programs. Although this logic is able to prove other probabilistic properties beyond the scope of Coneris such as independence and conditioning, the programs considered are restricted to those without dynamically allocated state or higher-order functions, and the logic does not support defining ghost state. Internalization of Linearizability. There is a long line of research on internalizing lineariz- ability as a reasoning principle within concurrent program logic specifications. Jacobs and Piessens [2011] first extended the resource-invariants-based method from Owicki and Gries [Owicki and Gries 1976] allowing users to parameterize the specification of concurrent functions with ghost code. Later, Svendsen et al .[2013] further extended their idea and proposed a new style of speci- fication using higher-order concurrent abstract predicates (HOCAP) (which was built on top of CAP [Dinsdale-Young et al .2010]). At around the same time, da Rocha Pinto et al .[2014] introduced a different logic called TaDa, which proposed the use of atomic triples to capture logical atomicity of programs. There has also been much research in encoding logically atomic specifications within the Iris separation logic [Jung et al .2019, 2015]. In our work of Coneris, we take inspiration from these logics, especially HOCAP, to capture randomized logical atomicity within probabilistic concurrent programs. 9 Conclusion We presented Coneris, the first concurrent and probabilistic higher-order separation logic for error bound reasoning. Coneris captures randomized logical atomicity through the novel probabilistic update modality, enabling modular verification of concurrent programs that is out-of-scope for previous techniques. We demonstrated the flexibility of Coneris by verifying various examples Page 26: 26 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal modularly, most of which involve local state and intricate reasoning over randomness that arise from concurrency. There are various directions for extending Coneris. Firstly, we would like to extend Coneris to enable verifying strict error bounds of concurrent probabilistic programs under restricted schedulers, such as those that cannot view the configuration of the program. It is also interesting to explore whether ideas from Approxis [Haselwarter et al .2025] can be used to extend Coneris into the relational setting to establish approximate bounds between concurrent probabilistic programs. Lastly, we would like to consider integrating cost credits from Tachis [Haselwarter et al .2024] into Coneris to reason about both the expected work and span time costs of concurrent probabilistic programs. Acknowledgments The first author would like to thank Amin Timany for enlightening discussions regarding HOCAP- style specifications. This work was supported in part by the National Science Foundation, grant no. 2338317, the Carlsberg Foundation, grant no. CF23-0791, a Villum Investigator grant, no. 25804, Center for Basic Research in Program Verification (CPV), from the VILLUM Foundation, and the European Union (ERC, CHORDS, 101096090). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. References M. Abadi and L. Lamport. 1988. The existence of refinement mappings. In [1988] Proceedings. Third Annual Symposium on Logic in Computer Science . 165–175. https://doi.org/10.1109/LICS.1988.5115 Martín Abadi and Leslie Lamport. 1991. The existence of refinement mappings. Theoretical Computer Science 82, 2 (1991), 253–284. https://doi.org/10.1016/0304-3975(91)90224-P Alejandro Aguirre, Philipp G. Haselwarter, Markus de Medeiros, Kwing Hei Li, Simon Oddershede Gregersen, Joseph Tas- sarotti, and Lars Birkedal. 2024. Error Credits: Resourceful Reasoning about Error Bounds for Higher-Order Probabilistic Programs. Proc. ACM Program. Lang. 8, ICFP, Article 246 (Aug. 2024), 33 pages. https://doi.org/10.1145/3674635 Gilles Barthe, Thomas Espitau, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2017a. Proving uniformity and independence by self-composition and coupling. In LPAR-21. 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning (EPiC Series in Computing, Vol. 46) , Thomas Eiter and David Sands (Eds.). EasyChair, 385–403. https://doi.org/10.29007/vz48 Gilles Barthe, Noémie Fong, Marco Gaboardi, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2016a. Advanced Probabilistic Couplings for Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (Vienna, Austria) (CCS ’16) . Association for Computing Machinery, New York, NY, USA, 55–67. https://doi.org/10.1145/2976749.2978391 Gilles Barthe, Marco Gaboardi, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2016b. A Program Logic for Union Bounds. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 55) , Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi (Eds.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 107:1–107:15. https://doi.org/ 10.4230/LIPIcs.ICALP.2016.107 Gilles Barthe, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2017b. Coupling proofs are probabilistic product programs. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Paris, France) (POPL ’17) . Association for Computing Machinery, New York, NY, USA, 161–174. https://doi.org/10.1145/3009837.3009896 Gilles Barthe, Benjamin Grégoire, and Santiago Zanella Béguelin. 2012. Probabilistic Relational Hoare Logics for Computer- Aided Security Proofs. In Mathematics of Program Construction , Jeremy Gibbons and Pablo Nogueira (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–6. Gilles Barthe, Boris Köpf, Federico Olmedo, and Santiago Zanella-Béguelin. 2013. Probabilistic Relational Reasoning for Differential Privacy. ACM Trans. Program. Lang. Syst. 35, 3, Article 9 (Nov. 2013), 49 pages. https://doi.org/10.1145/2492061 Kevin Batz, Benjamin Lucien Kaminski, Joost-Pieter Katoen, Christoph Matheja, and Thomas Noll. 2019. Quantitative separation logic: a logic for reasoning about probabilistic pointer programs. Proc. ACM Program. Lang. 3, POPL, Article 34 (Jan. 2019), 29 pages. https://doi.org/10.1145/3290347 Page 27: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 27 Mihir Bellare and Phillip Rogaway. 1993. Random oracles are practical: a paradigm for designing efficient protocols. In Proceedings of the 1st ACM Conference on Computer and Communications Security (Fairfax, Virginia, USA) (CCS ’93) . Association for Computing Machinery, New York, NY, USA, 62–73. https://doi.org/10.1145/168588.168596 Yves Bertot and Pierre Castéran. 2013. Interactive theorem proving and program development: Coq’Art: the calculus of inductive constructions . Springer Science & Business Media. Burton H. Bloom. 1970. Space/Time Trade-offs in Hash Coding with Allowable Errors. Commun. ACM 13, 7 (1970), 422–426. https://doi.org/10.1145/362686.362692 Sylvie Boldo, Catherine Lelay, and Guillaume Melquiond. 2013. Coquelicot: A User-Friendly Library of Real Analysis for Coq. (Sept. 2013). https://inria.hal.science/hal-00860648 working paper or preprint. Prosenjit Bose, Hua Guo, Evangelos Kranakis, Anil Maheshwari, Pat Morin, Jason Morrison, Michiel H. M. Smid, and Yihui Tang. 2008. On the false-positive rate of Bloom filters. Inf. Process. Lett. 108, 4 (2008), 210–213. https://doi.org/10.1016/J. IPL.2008.05.018 Pedro da Rocha Pinto, Thomas Dinsdale-Young, and Philippa Gardner. 2014. TaDA: A Logic for Time and Data Abstraction. InECOOP 2014 – Object-Oriented Programming , Richard Jones (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 207–231. Thomas Dinsdale-Young, Mike Dodds, Philippa Gardner, Matthew J. Parkinson, and Viktor Vafeiadis. 2010. Concurrent Abstract Predicates. In ECOOP 2010 – Object-Oriented Programming , Theo D’Hondt (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 504–528. Ira Fesefeldt, Joost-Pieter Katoen, and Thomas Noll. 2022. Towards Concurrent Quantitative Separation Logic. In 33rd International Conference on Concurrency Theory (CONCUR 2022) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 243) , Bartek Klin, Sławomir Lasota, and Anca Muscholl (Eds.). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 25:1–25:24. https://doi.org/10.4230/LIPIcs.CONCUR.2022.25 Wojciech M. Golab, Lisa Higham, and Philipp Woelfel. 2011. Linearizable implementations do not suffice for randomized distributed computation. In Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, 6-8 June 2011 , Lance Fortnow and Salil P. Vadhan (Eds.). ACM, 373–382. https://doi.org/10.1145/1993636.1993687 Kiran Gopinathan and Ilya Sergey. 2020. Certifying Certainty and Uncertainty in Approximate Membership Query Structures. In Computer Aided Verification - 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21-24, 2020, Proceedings, Part II (Lecture Notes in Computer Science, Vol. 12225) , Shuvendu K. Lahiri and Chao Wang (Eds.). Springer, 279–303. https://doi.org/10.1007/978-3-030-53291-8_16 Simon Oddershede Gregersen, Alejandro Aguirre, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal. 2024. Asynchronous Probabilistic Couplings in Higher-Order Separation Logic. Proc. ACM Program. Lang. 8, POPL, Article 26 (Jan. 2024), 32 pages. https://doi.org/10.1145/3632868 Philipp G. Haselwarter, Kwing Hei Li, Alejandro Aguirre, Simon Oddershede Gregersen, Joseph Tassarotti, and Lars Birkedal. 2025. Approximate Relational Reasoning for Higher-Order Probabilistic Programs. Proc. ACM Program. Lang. 9, POPL, Article 41 (Jan. 2025), 31 pages. https://doi.org/10.1145/3704877 Philipp G. Haselwarter, Kwing Hei Li, Markus de Medeiros, Simon Oddershede Gregersen, Alejandro Aguirre, Joseph Tassarotti, and Lars Birkedal. 2024. Tachis: Higher-Order Separation Logic with Credits for Expected Costs. Proc. ACM Program. Lang. 8, OOPSLA2, Article 313 (Oct. 2024), 30 pages. https://doi.org/10.1145/3689753 Bart Jacobs and Frank Piessens. 2011. Expressive modular fine-grained concurrency specification. SIGPLAN Not. 46, 1 (Jan. 2011), 271–282. https://doi.org/10.1145/1925844.1926417 Cliff Jones. 1983. Specification and Design of (Parallel) Programs. Proceedings Of Ifip Congress ’ 83, 321–332. Ralf Jung, Robbert Krebbers, Jacques-Henri Jourdan, Ales Bizjak, Lars Birkedal, and Derek Dreyer. 2018. Iris from the ground up: A modular foundation for higher-order concurrent separation logic. J. Funct. Program. 28 (2018), e20. https://doi.org/10.1017/S0956796818000151 Ralf Jung, Rodolphe Lepigre, Gaurav Parthasarathy, Marianna Rapoport, Amin Timany, Derek Dreyer, and Bart Jacobs. 2019. The future is ours: prophecy variables in separation logic. Proc. ACM Program. Lang. 4, POPL, Article 45 (Dec. 2019), 32 pages. https://doi.org/10.1145/3371113 Ralf Jung, David Swasey, Filip Sieczkowski, Kasper Svendsen, Aaron Turon, Lars Birkedal, and Derek Dreyer. 2015. Iris: Monoids and Invariants as an Orthogonal Basis for Concurrent Reasoning. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015 . 637–650. https://doi.org/10.1145/2676726.2676980 Janine Lohse and Deepak Garg. 2024. An Iris for Expected Cost Analysis. arXiv:2406.00884 [cs.PL] Annabelle McIver, Tahiry Rabehaja, and Georg Struth. 2016. Probabilistic rely-guarantee calculus. Theoretical Computer Science 655 (2016), 120–134. https://doi.org/10.1016/j.tcs.2016.01.016 Quantitative Aspects of Programming Languages and Systems (2013-14). Arno Mittelbach and Marc Fischlin. 2021. The Theory of Hash Functions and Random Oracles - An Approach to Modern Cryptography . Springer. https://doi.org/10.1007/978-3-030-63287-8 Page 28: 28 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal Carroll Morgan, Annabelle McIver, and Karen Seidel. 1996. Probabilistic predicate transformers. ACM Trans. Program. Lang. Syst. 18, 3 (May 1996), 325–353. https://doi.org/10.1145/229542.229547 Susan Owicki and David Gries. 1976. Verifying properties of parallel programs: an axiomatic approach. Commun. ACM 19, 5 (May 1976), 279–285. https://doi.org/10.1145/360051.360224 Peter W. O’Hearn. 2007. Resources, concurrency, and local reasoning. Theoretical Computer Science 375, 1 (2007), 271–307. https://doi.org/10.1016/j.tcs.2006.12.035 Festschrift for John C. Reynolds’s 70th birthday. Kasper Svendsen, Lars Birkedal, and Matthew Parkinson. 2013. Modular Reasoning about Separation of Concurrent Data Structures. In Programming Languages and Systems , Matthias Felleisen and Philippa Gardner (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 169–188. Joseph Tassarotti and Robert Harper. 2019. A separation logic for concurrent randomized programs. Proc. ACM Program. Lang. 3, POPL, Article 64 (Jan. 2019), 30 pages. https://doi.org/10.1145/3290377 Noam Zilberstein, Dexter Kozen, Alexandra Silva, and Joseph Tassarotti. 2025. A Demonic Outcome Logic for Randomized Nondeterminism. Proc. ACM Program. Lang. 9, POPL, Article 19 (Jan. 2025), 30 pages. https://doi.org/10.1145/3704855 Noam Zilberstein, Alexandra Silva, and Joseph Tassarotti. 2024. Probabilistic Concurrent Reasoning in Outcome Logic: Independence, Conditioning, and Invariants. arXiv:2411.11662 [cs.LO] https://arxiv.org/abs/2411.11662 Page 29: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 29 A Modular Proof of conTwoAdd In this section, we show in more detail how to prove conTwoAdd with the HOCAP-style specifica- tions of the randomized counter module (see Figure 5). Before we proceed, we present a selection of side conditions of the abstract predicates in Figure 10 which we previously omitted in Figure 5. The first side condition expresses that the counter representation predicate is persistent, which means that it is duplicable so that clients can share it among several threads. We then have a series of side conditions regarding the cauth andcfrag abstract predicates, which are used to keep track of the abstract state of the counter. The first condition states that cfrag abstract predicates can be combined by adding their arguments together. The next condition states that if we hold both the cauth andcfrag resource and the fraction of the cfrag is exactly 1, the value from both predicates agree. The last side condition describes how we can update the abstract state of a counter: if we have a cauth and a cfrag predicate with the same ghost name, we can update the predicates by incrementing the values of both by a constant 𝑥. 𝐶𝜄𝛾𝑐∗𝐶𝜄𝛾𝑐 cfrag𝛾 𝑓 𝑧∗cfrag𝛾 𝑓′𝑧′∗cfrag𝛾(𝑓+𝑓′)(𝑧+𝑧′) cauth𝛾𝑧∗cfrag𝛾1𝑧′∗𝑧′=𝑧 cauth𝛾𝑧∗cfrag𝛾 𝑓 𝑧′∗|⇛cauth𝛾(𝑧+𝑥)∗cfrag𝛾 𝑓(𝑧′+𝑥) Fig. 10. Selection of Side Conditions on Abstract Predicates Recall that since the new specification of the randomized concurrent counter utilizes tapes, the conTwoAdd client is annotated to use the abstract tapes. conTwoAdd≜let𝑐=createCntr()in let𝜅=createCtape()in incrCntr𝑐𝜅|||let𝜅=createCtape()in incrCntr𝑐𝜅; readCntr𝑐 We now prove that the return value is 0with a probability of 1/16, with the Coneris Hoare triple: {E(1/16)}conTwoAdd{𝑣.𝑣>0}. We first consider the invariant used to track the change in shared state during the parallel composition. We use two states 𝑆0and𝑆1(𝑛)(of some inductive type 𝑇) to track the state of the threads, with 𝑆0representing the state where the thread has not sampled a value yet and 𝑆1(𝑛) representing it sampled 𝑛. Note that we do not need an additional state to track whether the sampled value has been added into the counter, because that can be tracked by the resource cfrag . We use the invariant 𝐼shown below to capture the shared state of the two threads. Notice that the invariant 𝐼makes use of the exclusive-authoritative ghost resource algebra, which consists of the authoritative part•𝑥and the fragment part◦𝑥. We omit the definition and properties of this resource and we Page 30: 30 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal refer readers to Jung et al. [2018] for more information. sampled𝑠≜match𝑠with𝑆0⇒None|𝑆1(𝑛)⇒ Some𝑛end onePositive 𝑠1𝑠2≜∃𝑛.𝑛>0∧(sampled𝑠1=Some𝑛∨sampled𝑠2=Some𝑛) 𝐼(𝛾1,𝛾2)≜∃(𝑠1𝑠2:𝑇).•𝑠1𝛾1∗•𝑠2𝛾2∗ if onePositive 𝑠1𝑠2then E(0) else E 4(bool_to_nat(sampled𝑠1=Some 0)+bool_to_nat(sampled𝑛2=Some 0)−2) We now show how to prove the specification of conTwoAdd using the invariant previously defined. After stepping through the code up until the parallel composition component, and allocating the necessary resources and invariant, we arrive at the following proof obligation: ( 𝐶𝜄𝛾𝑐∗cfrag𝛾1 0∗𝐼(𝛾1,𝛾2)𝜄′ ∗ ◦𝑆0𝛾1∗◦𝑆0𝛾2) let...|||let...; readCntr𝑐{𝑣.𝑣>0} We can apply the side condition of cfrag to split it between the two threads and apply the rule for parallel composition which leaves us with the following three obligations: n 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆0𝛾1o let... ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆1(𝑛)𝛾1 (10) n 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆0𝛾2o let... ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆2(𝑛)𝛾1 (11) ( 𝐶𝜄𝛾𝑐∗cfrag𝛾1(𝑛1+𝑛2)∗𝐼(𝛾1,𝛾2)𝜄′ ∗ ◦𝑆1(𝑛1)𝛾1∗◦𝑆1(𝑛2)𝛾2) readCntr𝑐{𝑣.0<𝑣} (12) Let us first focus on Equation (10). We first apply the specification for createCtape to create an empty tape resource and we arrive at the following obligation. ( 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗ ◦𝑆0𝛾1∗ctape𝜅𝜖) incrCntr𝑐𝜅 ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆1(𝑛)𝛾1 Now that we have a ctape predicate on our hands, we can presample a value onto it so that it can be used for the incrCntr method later. Specifically, we perform the following probabilistic update: 𝐼(𝛾1,𝛾2)𝜄′ ∗◦𝑆0𝛾1∗ctape𝜅𝜖∗|⇝⇝⊤∃𝑛.◦𝑆1(𝑛)𝛾1∗ctape𝜅[𝑛] This probabilistic update proposition is proven by first applying the probabilistic update modality version of inv-open where we access the resources within the invariant, and subsequently updating the authoritative resource pairs from the state 𝑆0to𝑆1(𝑛)to track the value 𝑛presampled onto the tape. After this probabilistic update, we are left with the following obligation: ( 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗ ◦𝑆1(𝑛)𝛾1∗ctape𝜅[𝑛]) incrCntr𝑐𝜅 ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆1(𝑛)𝛾1 The rest of the proof then follows almost directly by applying the new specification for incrCntr and choosing 𝑄𝑧≜cfrag𝛾0.5𝑛. The second obligation (Equation (11)), representing the behavior of the second thread, is proven in an almost identical fashion. Let us now focus on the last obligation (Equation (12)). To prove that the return value is positive, we apply the specification of readCntr , choosing𝑄𝑣≜𝑣>0, leaving us with the following view Page 31: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 31 shift obligation for the precondition: 𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾1(𝑛1+𝑛2)∗◦𝑆1(𝑛1)𝛾1∗◦𝑆1(𝑛2)𝛾2∗ (∀𝑧.cauth𝛾𝑧∗|⇛⊤\𝜄cauth𝛾𝑧∗𝑧>0) We do a case split on the values of 𝑛1and𝑛2. If they are both 0, we can open the invariant 𝐼(𝛾1,𝛾2) to access a E(1)error credit to derive a contradiction with err-1 . Otherwise, using the rules for cauth andcfrag , we can show that the values in the cauth andcfrag predicates coincide, i.e. they are both𝑛1+𝑛2and must be positive, which completes the proof. B HOCAP-style Specification with Error Redistribution In §5.1, we presented a HOCAP-style specification that does not expose presampling tapes as an abstract predicate (see Figure 4). Although in §5.2 we showed that it is less general than the specification with ctape shown in Figure 5, in this section, we briefly explain how to use the specification to prove clients of the module and how to show various implementations meet the specification. B.1 Implementation We first show three possible implementations of the module that mirror those shown in §5.3. For𝐼1, we do not need to allocate any tapes and we sample from rand 3directly. However, note incrCntr 1≜𝜆𝑙.faa𝑙(rand 3) incrCntr 2≜𝜆𝑙.let𝜅=tape 1in faa𝑙(rand𝜅1∗2+rand𝜅1) incrCntr 3≜𝜆𝑙.let𝜅=tape 4in (rec𝑓 𝜅= let𝑥=rand𝜅4in if𝑥<4then faa𝑙𝑥else𝑓 𝜅)𝜅 Fig. 11. Three Implementations of Increment that for𝐼2and𝐼3, we have to create a tape internally and sample from it. This is because the randomization within the two implementations occurs over various steps even if it acts “logically atomic”. By adding extra ghost code that utilizes tapes, we are able to reason about the randomness asynchronously, which we demonstrate in later subsections. B.2 Verification of Client of HOCAP-style Specification with Error Redistribution We now verify the conTwoAdd example in §5.2 with the specification from Figure 4. Since tapes are not exposed in this specification, conTwoAdd is written without explicit allocation of ctape . conTwoAdd≜let𝑐=createCntr()in (incrCntr𝑐|||incrCntr𝑐); readCntr𝑐 As before, we want to verify that the final read value is positive, with error probability 1/16, which we show with the following Coneris Hoare triple. {E(1/16)}conTwoAdd{𝑣.𝑣>0} Page 32: 32 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal In fact the proof works almost identically to that presented in §5.2. For example, the states and invariants used to track the shared state of the two parallel threads are identical to the ones used before. We begin by stepping through the code up until the parallel composition component, and after allocating the necessary resources and invariant, we arrive at the following proof obligation: n 𝐶𝜄𝛾𝑐∗cfrag𝛾1 0∗𝐼(𝛾1,𝛾2)𝜄′ ∗◦𝑆0𝛾1∗◦𝑆0𝛾2o incrCntr𝑐|||incrCntr𝑐; readCntr𝑐 {𝑣.𝑣>0} We can apply the side condition of cfrag to split it between the two threads and apply the rule for parallel composition, leaving us with the following three obligations: n 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆0𝛾1o incrCntr𝑐 ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆1(𝑛)𝛾1 (13) n 𝐶𝜄𝛾𝑐∗𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆0𝛾2o incrCntr𝑐 ∃𝑛.cfrag𝛾0.5𝑛∗◦𝑆2(𝑛)𝛾1 (14) n 𝐶𝜄𝛾𝑐∗cfrag𝛾1(𝑛1+𝑛2)∗𝐼(𝛾1,𝛾2)𝜄′ ∗◦𝑆1(𝑛1)𝛾1∗◦𝑆1(𝑛2)𝛾2o readCntr𝑐{𝑣.0<𝑣}(15) We focus only on the first obligation; the second obligation follows similarly and the third obligation is similar to the proof of Equation (12) in §5.2. From Equation (13), we apply the specification of incrCntr directly, choosing 𝑄 𝜀F𝑛𝑧≜cfrag𝛾0.5𝑛∗◦𝑆1(𝑛)𝛾1. It then suffices to prove the following view shift for the precondition of the specification: 𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆0𝛾1∗ |⇛E ∅∃𝜀F.E(𝜀)∗(E𝔘3[F]≤𝜀)∗ (∀𝑥.0≤𝑥<4∗E(F(𝑥))∗|⇛∅ E (∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑥)∗cfrag𝛾0.5𝑥∗◦𝑆1(𝑥)𝛾1)) We first open our invariant 𝐼(𝛾1,𝛾2)while stripping away the |⇛E ∅ mask, which allows us to access the error credit stored in the invariant. After choosing the right Fbased on a case analysis on the state of the right thread (which we omit for brevity), we update the authoritative resource pairs from•𝑆0𝛾1∗◦𝑆0𝛾1to•𝑆1(𝑥)𝛾1∗◦𝑆1(𝑥)𝛾1and re-establish the invariant 𝐼while removing the |⇛∅ E mask, leaving us with the following state: 𝐼(𝛾1,𝛾2)𝜄′ ∗cfrag𝛾0.5 0∗◦𝑆1(𝑥)𝛾1∗ cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑥)∗cfrag𝛾0.5𝑥∗◦𝑆1(𝑥)𝛾1 After incrementing both the cauth andcfrag components by exactly 𝑥through the|⇛E\𝜄mask (which is possible by the side conditions of cauth andcfrag ), we can then directly establish the final goal. B.3 Proving that 𝐼1,𝐼2, and𝐼3Satisfy the HOCAP-style Specification with Error Redistribution We now briefly describe how each of the three randomized counter implementations meets the specification with error redistribution in Figure 4. The concrete definitions for the abstract predicates are actually identical to those used in the proof of §5.2. For example, the counter predicate is still defined as ∃(𝑙:Loc)(𝑛:nat).𝑐=𝑙∗𝑙↦→𝑛∗cauth𝛾𝑛𝜄 Page 33: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 33 It then suffices to show that the functions createCntr,incrCntr , and readCntr satisfy the HOCAP- style specification. We focus on the incrCntr specification since it is the most complicated; the other two functions can be verified in a similar, if not easier, fashion. For𝐼1, after symbolically stepping through the program, we are left with the following obligation:   𝐶𝜄𝛾𝑐∗( |⇛E ∅∃𝜀F.E(𝜀)∗(E𝔘3[F]≤𝜀)∗ (∀𝑥.0≤𝑥<4∗E(F(𝑥))∗|⇛∅ E (∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑥)∗𝑄𝜀F𝑥𝑧)))   faa𝑐(rand 3){𝑧.∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E We first open the |⇛E ∅ mask around the atomic rand 3operation. After opening the first view shift, we are given some E(𝜀)and someFsuch that the expected sum of Fis smaller than 𝜀. We then apply ht-rand-exp to distribute the errors across the various results and close the |⇛∅ E mask, leaving us with the following obligation: 𝐶𝜄𝛾𝑐∗(∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑥)∗𝑄𝜀F𝑥𝑧) faa𝑐𝑥{𝑧.∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E Since the faaoperation is atomic, we can open the invariant 𝐶around the expression, resulting in this goal: 𝑙↦→𝑛∗cauth𝛾𝑛∗(∀𝑧. ...) faa𝑙𝑥{𝑧.∃(𝑛:nat).𝑙↦→𝑛∗cauth𝛾𝑛∗∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E\𝜄 The rest of the proof follows nicely from the proof rule for the faaoperation, completing the proof. For𝐼2, we similarly step through the program, where we additionally allocate a tape 𝜅, and thus we arrive at the following goal:   𝐶𝜄𝛾𝑐∗𝜅↩→(1,𝜖)∗( |⇛E ∅∃𝜀F. E(𝜀)∗(E𝔘3[F]≤𝜀)∗ (∀𝑥.0≤𝑥<4∗E(F(𝑥))∗|⇛∅ E (∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑥)∗𝑄𝜀F𝑥𝑧)))   faa𝑙(rand𝜅1∗2+rand𝜅1){𝑧.∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E From here, we directly open the |⇛E ∅ and access the E(𝜀)error credit. Then, unlike what we did for𝐼1, here we perform a probabilistic update where we presample two values 𝑣1,𝑣2onto the tape 𝜅, and we distribute E(𝑣1∗2+𝑣2)for each branch, i.e., we update the resources via the following lemma (in this instance, ®𝑛is instantiated to be the empty tape list 𝜖). This follows from Equation (7) which we proved previously. After closing the|⇛∅ E mask, we are left with the following obligation: 𝐶𝜄𝛾𝑐∗𝜅↩→(1,[𝑣1,𝑣2])∗ (∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑣1∗2+𝑣2)∗𝑄𝜀F(𝑣1∗2+𝑣2)𝑧) faa𝑙(rand𝜅1∗2+rand𝜅1){𝑧.∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E Page 34: 34 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal © «∀𝜀F𝜄E𝜅𝛾®𝑛. (E𝔘𝑁[F]≤𝜀)∗ 𝜄∈E∗ isRand𝜄𝛾∗ randTape𝜅®𝑛𝛾∗ E(𝜀)∗ |⇝⇝E∃𝑛.E(F(𝑛))∗randTape𝜅(®𝑛·[𝑛])𝛾ª®®®®®®®®® ¬ (a) Presampling specification∀𝜄𝛾𝜅𝑛®𝑛E. 𝜄∈E∗ isRand𝜄𝛾∗ctape𝜅(𝑛·®𝑛) randf𝜅{𝑧.𝑧=𝑛∗randTape𝜅®𝑛}E (b)randf specification Fig. 12. Selection of Specification of Rand Module We can then read the values of the tape directly for both samples: 𝐶𝜄𝛾𝑐∗𝜅↩→(1,𝜖)∗ (∀𝑧.cauth𝛾𝑧∗|⇛E\𝜄cauth𝛾(𝑧+𝑣1∗2+𝑣2)∗𝑄𝜀F(𝑣1∗2+𝑣2)𝑧) faa𝑙(𝑣1∗2+𝑣2){𝑧.∃𝜀F𝑥.𝑄𝜀F𝑥𝑧}E From here, the fetch-and-atomic-add step is similar to that for the 𝐼1implementation. The proof of 𝐼3is very similar to that of 𝐼2, except for the presampling step after allocating the tape resource. In particular we want to show that we can repeatedly presample enough values into the tape such that the last element is smaller than 4and all values beforehand are 4, while distributing the error credit according to the final value. Here we use Equation (8) proved previously to do so. After performing the probabilistic update on the tape (such that it contains an “accepted” value at the end), we can then step through the rest of the program, looping repeatedly until we reach the final “accepted” value and establish the postcondition. C Other Case Studies C.1 Rand Module For the random counter module introduced previously in §5.1, we identified three distinct imple- mentations (§5.3) that sample randomness from a uniform distribution for the incrCntr operation, e.g. we can directly call a single rand (𝐼1), chain various rand s together ( 𝐼2), or use a rejection sampler method where we repeatedly sample until we obtain a desirable value ( 𝐼3). We now define a general interface, which we refer to as the Rand module , that captures what it means to sample from a uniform distribution atomically, and show that several implementations satisfy it. In later case studies, we use this interface to verify larger programs, to highlight the usability of this module and to demonstrate modular reasoning. The Rand module is parameterized by a natural number 𝑁, which is the range of the uniform distribution (we are sampling uniformly from {0,...,𝑁}). The interface exposes two functions, randAllocate andrandf , which creates a tape and samples from it, respectively. It also describes various abstract predicates, their side conditions, and specifications of the functions, most notably the specification that allows clients to presample into the abstract tape randTape , and reading from it with randf , which we present in Figure 12. The first condition states that when given the isRand invariant, a randTape abstract predicate, and some error credits, we can append a value at the end of the tape and split the errors in an Page 35: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 35 expectation-preserving way, similar to the presampling specification presented in the random counter module. The second condition states that given the isRand invariant and a non-empty tape, we can run randf on the tape to deterministically pop the tape and return its first element. By choosing concrete definitions for the abstract predicates of this module, one can show that various implementations of random samplers satisfy this Rand module specification; e.g. we proved that a rejection sampler meets the specification of the Rand module (the proof is similar to showing that implementation 𝐼3of the randomized counter module meets its specification). C.2 Lazy Random Sampler In this subsection, we consider the implementation of a concurrent lazy one-shot random sampler that samples lazily. This sampler is lazy in the sense that we only perform the sampling the first time the thunk is invoked and we store the result in a reference that is read from whenever we call the thunk again. A selection of the code of the lazy random sampler is presented in Figure 13. The function lazyRandInit creates a tuple containing a lock and a reference that points to None . When we call lazyRandf with the tuple and a tape label, we acquire the lock and load the value of what the location is pointing to. If it is Some𝑥, we directly return 𝑥. Otherwise, we sample 𝑥from the tape with the function randf from the Rand module defined in Appendix C.1 and store it into the location. We additionally take in an extra argument 𝑡𝑖𝑑inlazyRandf and store it into the location together with the sampled value to represent the first thread id that succeeds in acquiring the lock and performing the actual randomized operation. We release the lock at the end right before we return from the function. lazyRandInit≜λ_.letℓ=ref None in let𝑙𝑜=newLock()in (𝑙𝑜,ℓ)lazyRandf≜λ(𝑙𝑜,ℓ)𝜅𝑡𝑖𝑑. acquire𝑙𝑜; let𝑣=match !ℓwith |Some𝑥⇒𝑥 |None⇒ let𝑥=(randf𝜅,𝑡𝑖𝑑)in ℓ←Some𝑥;𝑥 end in release𝑙𝑜;𝑣 Fig. 13. Code of the Lazy Random Sampler To motivate what the specification of this lazy random sampler module should look like, we first consider a client program lazyRace that uses this module. In this example, we set the parameter of the internal Rand module to be 1, sorandf samples uniformly between 0and1. (The function lazyAllocTape in this example creates a tape for this lazy random sampler, and we omit the code for brevity.) lazyRace≜let𝑟=lazyRandInit()in (lazyRandf𝑟(lazyAllocTape())0)|||( lazyRandf𝑟(lazyAllocTape())1) In the lazyRace program, we create a lazy random sampler and fork two threads. Each thread attempts to sample from it but they pass a different 𝑡𝑖𝑑as the parameter parameter. It should be the Page 36: 36 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal case that both threads return the same tuple value 𝑥=(𝑥1,𝑥2); intuitively, this is because regardless of how the threads are scheduled, the thread that is executed last must read the value stored by the the thread that is executed first. Consider the following specification of lazyRace where for both return values of the threads, we have 𝑥1=𝑥2with error probability 1/2. {E(1/2)}lazyRace{𝑣.∃𝑛.𝑣=((𝑛,𝑛),(𝑛,𝑛))} This is true morally because whichever threads gets scheduled first to perform the sampling, we can useE(1/2)to avoid sampling a value from randf that is different from the 𝑡𝑖𝑑passed, ensuring that the sampled value is identical to the 𝑡𝑖𝑑. However, there is some subtlety in the proof of this Hoare triple, in particular, we cannot perform any presampling in advance of the actual lazyRandf . If we directly attempt to presample a value to each tape on both threads (avoiding the corresponding 𝑡𝑖𝑑), we need to pay up to E(3/4)error credits because we are doing two presampling calls, where ideally, we should only need to do one. One might try to rewrite lazyRace such that both threads share the same tape, but this does not solve the presampling problem directly. In particular, before either threads call lazyRandf , we do not know what value to sample into the tape. Whatever value we presample onto the tape, the scheduler can deliberately choose to schedule the threads in a way such that the 𝑡𝑖𝑑of the winning thread does not match the value presampled. In other words, we want to delay the operation of presampling and perform it not before the lazyRandf call, but during it. Given this observation, the specification of the lazy random sampler module is written in a way that allows presampling to be performed within the lazyRandf call. Here we show the specification for presampling into the tapes of the lazy random sampler module and lazyRandf in Figure 14. The © «∀𝜀F𝑃𝜄E𝜅𝛾𝑙𝑟. (E𝔘𝑁[F]≤𝜀)∗ 𝜄∈E∗ isLazyRand 𝑙𝑟 𝑃𝜄𝛾∗ lazyTape𝜅None𝛾∗ E(𝜀)∗ |⇝⇝E∃𝑛.E(F(𝑛))∗ lazyTape𝜅(Some𝑛)𝛾ª®®®®®®®®®®® ¬ (a) Presampling Specification∀𝜄𝛾𝑙𝑟 𝑃𝜅𝑡𝑖𝑑𝑄 1𝑄2.   isLazyRand 𝑙𝑟 𝑃𝜄𝛾∗ (∀𝑛.𝑃𝑛∗lazyAuth𝑛𝛾∗ |⇝⇝⊤match𝑛with |Some(𝑥,𝑦)⇒𝑃𝑛∗lazyAuth𝑛𝛾∗𝑄1𝑥𝑦 |None⇒∃𝑛′.lazyTape𝜅(Some𝑛′)𝛾∗ (lazyTape𝜅None𝛾∗ |⇛⊤𝑃(𝑛′,𝑡𝑖𝑑)∗lazyAuth(𝑛′,𝑡𝑖𝑑)𝛾∗ 𝑄2𝑛′𝑡𝑖𝑑) end)   lazyRandf𝑙𝑟𝜅𝑡𝑖𝑑{(𝑥,𝑦).𝑄1𝑥𝑦∨𝑄2𝑥𝑦}E (b)lazyRandf Specification Fig. 14. Selection of Specification of the Lazy Random Sampler Module presampling specification for the lazy random sampler is not too different from the other previous examples; the main difference is that the abstract tapes for the module lazyTape stores an option type instead of a list. Since for each tape, only the first value could ever be relevant in that it is chosen to be the value stored in the reference, there is no reason to presample more than one value into a single tape. Now, let us focus on the more complicated specification for the lazyRandf function. Firstly, notice that the lazy random sampler predicate isLazyRand takes in an additional predicate 𝑃as an Page 37: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 37 argument. Intuitively, 𝑃is the invariant protected by the lock, and if the reference maps to the value𝑛, it is the case that 𝑃𝑛holds whenever we access the lock and release it. The precondition of thelazyRandf function requires two resources. The first being the abstract predicate isLazyRand and the second being a view shift. The view shift encodes how the state of the module changes throughout the call. The view shift starts by assuming that we have 𝑃𝑛andlazyAuth𝑛𝛾for some 𝑛, which represents the operation of acquiring the lock and gaining access to the authoritative state of the lazy random sampler. We then perform a case split on 𝑛. If it is Some(𝑥,𝑦), then this means that the lazy random sampler has already committed to a value, so we return directly by releasing the lock and establishing some postcondition 𝑄1𝑥𝑦. Otherwise, if it is None , we reach the branch where we have to do a randomized sampling. Here we are allowed to perform some probabilistic update operation to provide a non-empty lazyTape (since the view shift is implemented with a |⇝⇝⊤), and it suffices to prove that after reading that value 𝑛′in the tape, we establish the authoritative part of the lazy random sampler with the reference storing (𝑛′,𝑡𝑖𝑑)and some postcondition 𝑄2𝑛′𝑡𝑖𝑑. If all preconditions hold, then the return value of the function is some pair (𝑥,𝑦), where either 𝑄1𝑥𝑦 or𝑄2𝑥𝑦holds. The key ingenuity of the specification of lazyRandf is that the view shift is described by the |⇝⇝modality, allowing us to perform presampling on abstract tapes within the function call in addition to outside it. In particular, we can choose to perform a presampling action on a tape or not depending on whether the sampler is storing a None (it has not been invoked before) or not. This flexibility allows us to prove the specification of the lazyRace , by only performing a single presampling within the first invocation of the function call lazyRandf . C.3 Concurrent Amortized Collision-free Hash When proving the correctness of randomized data structures, it is useful to assume that a hash function is collision-free, in that different input keys for the function return different hash values. In reality, collisions might occur but with very low probabilities. Here, we first consider a concurrent collision-free hash, one that can be shared among many threads, and each thread pays error credits to avoid collisions for every presampling action. Because we want to be able to use the hash in a concurrent context, the specification of the hash is written in a HOCAP-style with presampling tapes exposed to allow modular reasoning. We implement a concurrent model of the idealized hash function under the uniform hash assumption [Bellare and Rogaway 1993]. The assumption states that the hash function hashf mapping sets of keys 𝐾 to hash values 𝑉is a random oracle, in that for each key 𝑘∈𝐾, the hash value ℎ(𝐾)is sampled uniformly from 𝑉independently of all other keys. We implement this model as a tuple containing a lock and a mutable map 𝑙𝑚, choosing𝐾and𝑉to be{0,...,𝑁}. The main hash function hashf is shown below. The lock is acquired and released around the body of the hash function to ensure that at most one thread is changing the state of the mutable map. In the critical section, if the key 𝑘 has been hashed before, we directly return 𝑙𝑚(𝑘). Otherwise, we sample a fresh value uniformly from𝑉with the randf function defined in Appendix C.1, read a value from the tape 𝜅, store it in Page 38: 38 Kwing Hei Li, Alejandro Aguirre, S. O. Gregersen, Philipp G. Haselwarter, Joseph Tassarotti, and Lars Birkedal 𝑙𝑚(𝑘), and return it at the end. hashf(𝑙𝑜,𝑙𝑚)𝑘𝜅≜acquire𝑙𝑜; let𝑣=match get𝑙𝑚𝑘 with |Some(𝑏)⇒𝑏 |None⇒let𝑏=randf𝜅in set𝑙𝑚𝑘𝑏 ; 𝑏 end in release𝑙𝑜;𝑣 We show the presampling specification and the specification for hashf in Figure 15. To achieve collision-freedom, we need to ensure that every value we presample to a tape generated by the hash is different from any value previously presampled to alltapes generated by the hash. To be precise, suppose we have presampled a total of 𝑠values on all tapes of the hash. If we want to presample a new value to a tape, we need to pay at least E𝑠 𝑁+1to sample a unique value different from all values presampled before. To keep track of all the values sampled before, the interface introduces ahashsize abstract predicate that stores the set of all values that has been presampled before. To presample onto a tape for the collision-free hash, we need to additionally pass in a hashsize predicate to determine the amount of error needed to avoid the previous presampled-values. The hashf specification is defined in almost the same way as the lazyRandf specification, the view shift in the precondition performs a case split to determine whether a key has been hashed before by looking into the mutable map. © «∀𝜀𝑂𝑃𝜄E𝜅𝛾®𝑛𝑠ℎ. ((𝑠+(𝑁+1−𝑠)𝜀𝑂)/(𝑁+1)≤𝜀)∗ 𝜄∈E∗ isHashℎ𝑃𝜄𝛾∗ hashTape𝜅®𝑛𝛾∗ hashsize𝑠𝛾∗ E(𝜀)∗ |⇝⇝E∃𝑛.E(𝜀𝑂)∗hashsize(𝑠+1)𝛾∗ hashTape𝜅(®𝑛·[𝑛])𝛾ª®®®®®®®®®®®®® ¬ (a) Presampling Specification∀𝜄𝛾ℎ𝑃𝜅𝑘𝑄 1𝑄2.   isHashℎ𝑃𝜄𝛾∗ (∀𝑚.𝑃𝑚∗hashAuth𝑚𝛾∗ |⇝⇝⊤match𝑚!!𝑘with |Some𝑣⇒𝑃𝑚∗hashAuth𝑛𝛾∗𝑄1𝑛 |None⇒∃𝑛®𝑛.hashTape𝜅(𝑛·®𝑛)𝛾∗ (hashTape𝜅®𝑛𝛾∗ |⇛⊤𝑃(𝑚[𝑘:=𝑛])∗ hashAuth(𝑚[𝑘:=𝑛])𝛾∗𝑄2𝑛®𝑛) end)   hashfℎ𝑘𝜅{𝑥.𝑄 1𝑥∨∃®𝑛.𝑄 2𝑥®𝑛}E (b)hashf Specification Fig. 15. Selection of Specification of the Collision-free Hash We also used this specification to derive an amortized version of the collision-free hash, which we show in Figure 16. This hash specification has two main advantages. Firstly, clients do not need to pass a hashsize predicate as a precondition for presampling into the tape. In addition, the error credit𝜀𝐴(𝑁,𝑀)to be paid is constant as it is amortized across a fixed number of insertions 𝑀that is decided in advance. To keep track of the maximum number of times the hash is used, clients need Page 39: Modular Reasoning about Error Bounds for Concurrent Probabilistic Programs 39 to give up a single hashToken predicate; exactly 𝑀number of these hashToken s are generated when the hash is initialized. The proof of this more complex specification is similar to that in Aguirre et al.[2024] which we omit here. We emphasize that this amortized specification can be derived from the non-amortized specification (Figure 15) without taking into account how the concurrent hash is implemented. © «∀𝜀𝑂𝑃𝜄E𝜅𝛾®𝑛ℎ. 𝜄∈E∗ isHashℎ𝑃𝜄𝛾∗ hashTape𝜅®𝑛𝛾∗ hashToken 1𝛾∗ E(𝜀𝐴(𝑁,𝑀))∗ |⇝⇝E∃𝑛.hashTape𝜅(®𝑛·[𝑛])𝛾ª®®®®®®®®® ¬ Fig. 16. Amortized Presampling Specification Just like the specification presented in Appendix C.2, the specification for both the non-amortized and amortized concurrent collision-free hashes uses the probabilistic update modality in the view shift in its hashf specification to allow presampling to occur within the hashf body. As an example, consider the following program (similar to lazyRace in Appendix C.2) and its specification. {E(𝜀𝐴(𝑁,𝑀))} letℎ=initHash()in (hashfℎ0(hashAllocTape()))|||( hashfℎ0(hashAllocTape())) {𝑣.∃𝑛.𝑣=(𝑛,𝑛)} Here we create an amortized hash and spawn two threads that each creates a tape and uses the tape to hash the value 0. Because both threads are hashing the same key 0, it should be the case that we only need to pay one constant 𝜀𝐴(𝑁,𝑀)for the first hash operation. However we do not know which thread is scheduled first in advance, so we cannot perform the presampling in advance before the hashf call. The probabilistic update modality allows us to perform the presampling within the hashf call in the case where the value of 𝑚!!0isNone , indicating that this thread has been scheduled first to do the randomized sampling.