loader
Generating audio...

arxiv

Paper 2503.03207

PolyVer: A Compositional Approach for Polyglot System Modeling and Verification

Authors: Pei-Wei Chen, Shaokai Lin, Adwait Godbole, Ramneet Singh, Elizabeth Polgreen, Edward A. Lee, Sanjit A. Seshia

Published: 2025-03-05

Abstract:

Several software systems are polyglot; that is, they comprise programs implemented in a combination of programming languages. Verifiers that directly run on mainstream programming languages are currently customized for single languages. Thus, to verify polyglot systems, one usually translates them into a common verification language or formalism on which the verifier runs. In this paper, we present an alternative approach, PolyVer, which employs abstraction, compositional reasoning, and synthesis to directly perform polyglot verification. PolyVer constructs a formal model of the original polyglot system as a transition system where the update functions associated with transitions are implemented in target languages such as C or Rust. To perform verification, PolyVer then connects a model checker for transition systems with language-specific verifiers (e.g., for C or Rust) using pre/post-condition contracts for the update functions. These contracts are automatically generated by synthesis oracles based on syntax-guided synthesis or large language models (LLMs), and checked by the language-specific verifiers. The contracts form abstractions of the update functions using which the model checker verifies the overall system-level property on the polyglot system model. PolyVer iterates between counterexample-guided abstraction-refinement (CEGAR) and counterexample-guided inductive synthesis (CEGIS) until the property is verified or a true system-level counterexample is found. We demonstrate the utility of PolyVer for verifying programs in the Lingua Franca polyglot language using the UCLID5 model checker connected with the CBMC and Kani verifiers for C and Rust respectively.

Paper Content:
Page 1: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification Pei-Wei Chen1, Shaokai Lin1, Adwait Godbole1, Ramneet Singh2, Elizabeth Polgreen3, Edward A. Lee1, and Sanjit A. Seshia1 1University of California, Berkeley, USA, {pwchen,shaokai,adwait,eal,sseshia}@berkeley.edu 2Indian Institute of Technology, Delhi, India, ramneet2001@gmail.com 3University of Edinburgh, UK, elizabeth.polgreen@ed.ac.uk Abstract. Several software systems are polyglot ; that is, they comprise programs implemented in a combination of programming languages. Verifiers that directly run on mainstream programming languages are currently customized for single languages. Thus, to verify polyglot systems, one usually translates them into a common verification language or formalism on which the verifier runs. In this pa- per, we present an alternative approach, P OLYVER, which employs abstraction, compositional reasoning, and synthesis to directly perform polyglot verification . POLYVERconstructs a formal model of the original polyglot system as a tran- sition system where the update functions associated with transitions are imple- mented in target languages such as C or Rust. To perform verification, P OLYVER then connects a model checker for transition systems with language-specific ver- ifiers (e.g. for C or Rust) using pre/post-condition contracts for the update func- tions. These contracts are automatically generated by synthesis oracles based on syntax-guided synthesis or large language models (LLMs), and checked by the language-specific verifiers. The contracts form abstractions of the update func- tions using which the model checker verifies the overall system-level property on the polyglot system model. P OLYVERiterates between counterexample-guided abstraction-refinement (CEGAR) and counterexample-guided inductive synthe- sis (CEGIS) until the property is verified or a true system-level counterexample is found. We demonstrate the utility of P OLYVERfor verifying programs in the Lingua Franca polyglot language using the UCLID5 model checker connected with the CBMC and Kani verifiers for C and Rust respectively. Keywords: Polyglot languages · verification · synthesis · contracts · large lan- guage models. 1 Introduction Most modern software systems comprise code written in a combination of program- ming languages [45]. For example, the Robot Operation System 2 (ROS2) [38,30] al- lows users to specify processes that communicate with each other, called nodes, where each node can be implemented in a different programming language. Other languages and frameworks provide similar cross-language functionality for various domains, like Lingua Franca [29] for real-time, distributed embedded systems, and Eclipse CyclonearXiv:2503.03207v2 [cs.PL] 12 Mar 2025 Page 2: 2 Chen et al. DDS [1] and gRPC [20] for distributed systems and networking. We refer to these multi-language software systems as polyglot systems. Polyglot systems allow developers to pick and choose the best programming lan- guage for specific tasks. Unfortunately, this flexibility comes at a cost, with multiple studies indicating that the increased complexity of polyglot systems can lead to an in- crease in bugs [45]. Recent work has addressed these issues through testing and other in- complete methods. For example, Yang et al. [46] develop a deep-learning-based method that helps localize bugs specifically due to the interaction between multiple languages in a single system. These tools and techniques can help developers build safer systems, but they cannot provide the guarantees needed in safety-critical settings and other vital settings in which polyglot systems appear. In this paper, we study the problem of formally verifying polyglot systems. Just like developing polyglot systems introduces new challenges in the development cycle, verifying polyglot systems introduces new challenges in the verification process. Over the past few decades, there has been substantial work on program verifiers that operate directly on individual programming languages. We refer to these tools as language- specific verifiers . For example, CBMC [24] and Frama-C [13] can be used to verify C code, Kani [41] and Verus [26] for Rust, ESC/Java [15], Java PathFinder [18], and JBMC for Java [11], etc. At the same time, few approaches have been proposed for multi-language systems, so polyglot verification challenges remain largely unexplored. The major existing approach to verify polyglot systems involves translating all code to one common language and applying existing reasoning engines to that common rep- resentation. However, this approach is challenging due to the complexity of handling diverse syntax and semantics across languages, and arguably, it loses the benefits of having highly optimized language-specific verifiers. For instance, prior work such as Lin et al. [28], only handles very restricted C code within the Lingua Franca language, highlighting the difficulties of such monolithic approaches. Encoding a multi-language system in a single verification model often results in scalability issues, as the generated verification problem can become prohibitively large. To our knowledge, no existing ap- proach enables automated verification of multi-language software systems by allowing language-specific verifiers to cooperate to solve a problem. We propose a compositional verification approach that leverages contracts, synthe- sis, and language-specific verifiers to address this gap. Contracts, which abstract in- dividual components, eliminate the need for full language translation and act as glue between components. In our approach, language-specific verifiers are used to validate contracts for components written in the corresponding language, while system-level verification is performed using these contracts within a formal modeling language. We employ a large language model (LLM)-based contract synthesizer to generate con- tracts by summarizing programs and reasoning across multiple languages. This enables scalable and automated verification of multi-language systems without requiring a full translation into a single verification framework. In summary, the contributions of this paper are as follows. 1.Polyglot verification problem. We formalize the polyglot verification problem, and propose P OLYVER, an approach that uses contracts to perform compositional verification by integrating multiple language-specific verifiers. Page 3: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 3 2.Automatic synthesis of contracts. We propose a method for automatically syn- thesizing and refining pre/post-condition contracts needed for the polyglot verifica- tion problem, combining counterexample-guided inductive synthesis (CEGIS) and counterexample-guided abstraction refinement (CEGAR). 3.Our tool. We implemented a tool, called P OLYVER, that builds upon UCLID5 [36] and supports C and Rust through CBMC [22] and Kani [41]. We developed an LLM-based synthesis oracle that outperforms pure symbolic approaches. The tool is modular in that synthesis and verification oracles can be easily added or replaced. 4.Case study and evaluation. We developed a verifier for the Lingua Franca coor- dination language [29] and show that our approach is able to verify multi-language systems with reasonable synthesis time. We also introduce a new set of polyglot benchmarks and realistic applications that are only solved by our approach. 2 Motivation and Background We start by presenting a motivating (running) example of a multi-language verification problem. Figure 1 shows a model of a train management system as an Extended Finite State Machine (EFSM) [2], i.e., a transition system where transitions have procedures attached to them. Our example has procedures in multiple languages, e.g., the Waited procedure is a Rust function while ProcessQuery is a C function. 𝑞!𝑞"t := 0;initialize;ProcessQuery;ProcessResponse;wait /t := t + 2 Waited; ProcessQuery; ProcessResponse;pass /t := t + 10;Passed;state vars: t, wait, pass, response, request, count, passedfnWaited(request: &mut Port,count: &mut u32) {*count= count.saturating_add(1);ctx.set(request, count);}void ProcessQuery(Port request, Port response) {if(request->value< 3) {lf_set(response, rand() % 2);} else{lf_set(response, true);}} Fig. 1: An Extended Finite State Machine (EFSM) describing a train control system. The procedures in blue andorange are Rust and C procedures respectively. Imple- mentations of two procedures are shown on the right. Encodings must match language semantics. Formal verification of such a system requires encoding the procedures and the transition logic in a formal modeling/ver- ification language (e.g., either directly as an SMT or an intermediate language such as Boogie). This encoding must respect the language semantics. For example, instead of wrapping around, the x.saturating_add(1) Rust function in Waited satu- rates addition overflows to the maximum integer value (for that type). Figure 2 shows the SMT encoding of this function. In general, a valid encoding must account for all language-specific constructs (e.g., types, operators) which can be very complex for modern languages such as Rust. Page 4: 4 Chen et al. ifx.type=="i32":returnCodeBlock(f"if(x + 1 < 2147483647bv32) then x + 1 else 2147483647bv32")elifx.type=="u32":returnCodeBlock(f"if(x + 1 <_u 4294967295bv32) then x + 1 else 4294967295bv32")elif...:... Fig. 2: UCLID5 translation of x.saturating_add(1) in Rust. Notice that the translation depends on the type of x. In UCLID5 and SMT-LIB, bitvectors do not have implicit type interpretation. Instead, they are only interpreted when specific operators are applied. For example, <(bvslt ) and <_u (bvult ) stands for the signed and un- signed less than comparators in UCLID (SMT-LIB), respectively. Language-specific verification tools such as CBMC [23] for C, Kani [42] for Rust, and Cameleer [35] for OCaml already perform the heavy lifting of faithfully captur- ing language semantics. However, these verification engines might use different proof techniques. For example, CBMC and Kani perform symbolic bounded model checking, while Cameleer uses deductive verification. Even if they perform a symbolic encod- ing to a common language, combining these encodings requires identifying symbol mappings between encodings. This can be challenging to reverse-engineer and imposes restrictions on the verifiers one can combine. Thus, solving the multi-language verifi- cation problem, even with language-specific verifiers, is challenging. Contract-based verification operates by: (a) identifying contracts , which are abstrac- tions of individual system components, (b) checking contracts independently for each component, and (c) composing contracts to check that they together imply the sys- tem level property. In P OLYVER, language-specific verifiers reason about individual procedures while contracts abstract procedures and thus act as glue specifications be- tween the verifiers. In this sense, verification in P OLYVERoperates much like assume- guarantee/requires-ensures reasoning [32,43]. However, manually identifying contracts is challenging. Not only must they be a valid abstraction of the procedure, but they should also be strong enough to imply the overall system property. Even the relatively simple ProcessQuery procedure in Figure 1 requires a non-trivial postcondition: ((request->value < 3) || (response->value == true)) && (request == \old(request)) To address this, POLYVERperforms automated contract synthesis by using refinement- based neural (LLM-based) search for a contract candidate in combination with symbolic verification of that candidate. For the example in Figure 1, P OLYVERconverges on the contract above. Using the synthesized contracts P OLYVERcan piggyback on existing single-language verifiers to solve the multi-language verification, all the time avoiding the challenges of encoding the entire system into a single language. 3 Problem Formulation Polyglot systems are compositions of components where each component is a proce- dure in some target language L(e.g., C, Rust, Java). Such a setup is widely seen in Page 5: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 5 distributed systems where nodes execute code in a specific language and communicate with other nodes using, for example, remote procedure calls (RPCs). In this section, we formalize such systems and define the verification problem. We view polyglot systems as a variant of Extended Finite State Machines (EFSM) [2]. Informally, EFSMs are suitable abstractions for polyglot systems because compo- nent coordination can be modeled as transitions between modes , representing polyglot system states, where the update function of a transition represents the actual coordina- tion done as a sequence of procedure calls. EFSMs operate on a set of typed variables V. We refer to the (typed) domain of variable v∈VasD(v). Further, by D(V), we denote the Cartesian product of the domains of all variables (and thus forms the set of assignments to variables in V). Unless otherwise noted, we use the same set of typed variables Vfor the rest of the paper. We now formally define EFSMs. Definition 1 (Extended Finite-State Machine). AnExtended Finite-State Machine (EFSM) is a tuple (Q, V, I 0, T)where 1.Qis a set of modes, 2.Vis a set of typed variables, 3.I0⊆Q×D(V)is a set of initial states, 4.T⊆(Q×Q)×(G×U)is a relation that maps a pair of modes to a transition characterized by a guard in G=D(V)→Band an update function in U= D(V)→D(V). The translation relation is characterized such that, if the guard condition gof an edge transition from qtoq′is satisfied, variables will be updated by a corresponding update function u. Formally, (q, q′, g, u)∈T∧g(d)∧d′=u(d) where dandd′are variable assignments in qandq′respectively. While the update functions ( u) that act in each mode of an EFSM are kept abstract, polyglot systems specialize them to be sequences of procedures. We first formally de- fine languages, which will then be used to define procedures. Definition 2 (Language). Alanguage Lis a set of expressions and predicates con- structed through syntactic rules over a set of variables V. The interpretation of an expression e∈ L is defined as JeKL:D(V)→D(V), and the interpretation of a predicate p∈ L is defined as JpKL:D(V)→B. We assume the same set of variables Vfor each language and that variables in each language have matching types, hence using the same D(V)for different languages. In cases where a type is ambiguous, we assume we have access to a type mapping that gives additional information about the variable. For example, UCLID5 does not have signed bitvectors but instead has signed bitvector operations (similar to SMT-LIB [6]), and so when translating a 32-bit bitvector xin the UCLID5 language to Rust, the signed/unsignedness of the bitvector must be determined by propagating the signs from the bitvector operators through to the operands. We assume this simple type-inference is available to us. Next, we define procedures. Page 6: 6 Chen et al. Definition 3 (Procedure). Aprocedure fis a tuple (V, C,L)where C, the procedure body, is an expression in Lover a set of variables V. We assume that procedure bodies ( C) are deterministic and free of side effects (e.g., changes to global variables). Note that nondeterministic procedures can be introduced by changing the interpretation of expressions from a function to a relation. We remark that only the expression Cis known to us, and obtaining the interpretation JCKLas an encoding is difficult because it requires encoding the semantics of the full language. We denote ⃝k j=1fj(·) =Ck◦···◦ C1(·)to represent the composition of a sequence of procedure bodies where each element fj= (Vj, Cj,Lj). We now formulate polyglot models which serve as our abstraction of polyglot systems. Essentially, a polyglot model is an EFSM where update functions ( u) are compositions of procedure invocations. Definition 4 (Polyglot Model). Apolyglot model Mis a tuple (Q, V, I 0, T, F )where (Q, V, I 0, T)form an EFSM and Fis a set of procedures. Update functions u(·) = ⃝jfj(·)are characterized as sequences of procedure invocations where fj∈F. An example of a polyglot model is shown in Figure 1, where Q={q0, q1},Vis the set of state variables, I0is the state after executing the procedures on the initial transition, Tis shown by the transitions, and Fis the set of procedures in colored text. A trace of a model Mis a sequence of assignments to variables in V, related by the transition relation, that can be obtained by executing M. A property ϕis a formula overVthat characterizes a set of traces. Let the set of all traces that can be produced by a model MbeT(M). Given a property ϕ, we write M |=ϕifT(M)⊆ϕ. Below we define the polyglot verification problem. Problem Statement 1 (Polyglot verification problem) Given a polyglot model M= (Q, V, I 0, T, F )and a system property ϕdetermine whether M |=ϕ. IfM ̸|=ϕ, provide a counterexample trace that violates ϕand is producible by procedures in F. Naively, to verify a polyglot model one has to translate procedures in the update functions to a common modeling language, e.g. SMT-LIB. However, this is a tedious and error-prone task for each language. To avoid the need for translation, we propose using contracts as abstractions of procedure behaviors instead to verify the model. 4 Polyglot Verification through Contracts In this section, we introduce the key ingredient in our approach to Problem 1: a contract- based decomposition. How contracts help. Contracts help in two ways. Firstly, contracts abstract (over- approximate) the internal complexity of the procedure, leading to smaller queries/more performant verification. However, more crucially, contracts avoid having to encode the internal procedure body in our verification query . Instead, we only need to be able to encode the contract grammar. We begin by formally defining contracts. Page 7: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 7 Definition 5 (Contract). Acontract of a procedure (V, C,L)is a tuple (P, Q)where PandQare precondition and postcondition predicates over Vin language L. The contract is valid if and only if the Hoare-triple {P}C{Q}holds, meaning ∀d, d′∈D(V).(JPKL(d)∧d′=JCKL(d)) =⇒ JQKL(d′). (1) Equation (1) can be encoded and checked by a language-specific verifier for lan- guageL, thereby removing the translation burden from users. Once we have obtained a valid contract for a procedure, we can safely replace the procedure without losing any possible behaviors. Given a polyglot model M, we define an abstract (EFSM) model M′that replaces each update in Mwith a valid contract. Formally, for each update u=⃝j∈1...kfj(·) we generate contracts (Pj, Qj)for each fjin language Ljsuch that, JQjKLj=⇒ JPj+1KLj+1. Then M′replaces uwith the contract-derived relation u′={(a, b)| JP1(a)KL1=⇒ JQk(b)KLk}. Since the replacement contract u′overapproximates u, we have the following theorem. Theorem 1. IfM′|=ϕ, thenM |=ϕ. Proof. SinceM′|=ϕandu⊆u′,T(M)⊆ T(M′)⊆ϕ. Thus M |=ϕ. Although checking the model M′holds on ϕimplies that the original model also holds, we note that checking the implication JQjKLj=⇒ JPj+1KLj+1is non-trivial. Since the implication contains predicate interpretations in two languages LjandLj+1, different from Equation (1) we are unable to use a language-specific verifier to check the implication. We address this problem by using an intermediate language to connect different languages, discussed in Section 5.3. Counterexample guided contract refinement. While a property holding in M′im- plies that it holds in M, the converse however does not hold. A counterexample trace found in M′may be a spurious counterexample that cannot be produced by M. In this case, the contracts need to be strengthened to exclude this counterexample. Coming up with valid contracts that are still strong enough to prove the system-level property is challenging, often involving iterative contract refinement. To address this, we introduce the valid contract synthesis problem, which aims to generate a contract that avoids a set of spurious traces provided as negative examples. Problem Statement 2 (Valid Contract Synthesis Problem) Given a procedure f= (V, C,L)and positive and negative example sets E={X,Y}iand¯E={¯X,¯Y}i, synthesize a valid contract for fthat includes the positive examples and excludes the negative examples. In the next section, we first develop an approach for Problem 2 and then use that to solve Problem 1 by iteratively synthesizing (and improving) contracts. 5 An Abstraction Synthesis Approach for Polyglot Model Verification We propose a contract-based approach to solve the polyglot verification problem. Con- tracts are synthesized using an intermediate language in a Counterexample Guided In- Page 8: 8 Chen et al. ductive Synthesis (CEGIS) loop. However, since contracts may include spurious be- haviors that violate the property, they are further refined in a Counterexample Guided Refinement Abstraction (CEGAR) loop. The approach iteratively switches between CEGIS and CEGAR loops to search for contracts that prove or disprove the property. We first briefly introduce CEGIS and CEGAR in Section 5.1, and propose our ap- proach in Section 5.2. In Section 5.3, we explain how the intermediate language “glues” different languages and simplifies contract synthesis, and in Section 5.4 and 5.5 detail how contracts are synthesized and refined. Finally, we discuss correctness guarantees in Section 5.6. 5.1 Preliminaries Below we briefly describe the main algorithmic concepts used in our approach. Counterexample Guided Inductive Synthesis (CEGIS) is an iterative interaction between a learner and a teacher , where the goal is to synthesize a correct solution that satisfies a given specification. The learner , relying on a synthesis oracle, is responsible for generating candidate solutions that satisfy all known constraints from a constrained search space. The teacher , relying on a verification oracle, checks whether the candidate solution provided by the learner satisfies the full specification. Synthesis succeeds if the solution is correct; otherwise, the teacher provides a counterexample that shows the failure of the candidate. Counterexample Guided Abstract Refinement (CEGAR) is a technique used in model checking to iteratively refine an abstract model of a system. Initially, a coarse abstraction of the system is checked for correctness. If the verification fails, a coun- terexample is analyzed to determine whether it is spurious. If the counterexample is real, the system is deemed incorrect; otherwise, the abstraction is refined to eliminate the spurious counterexample, leading to a more precise model. 5.2 Approach Overview We propose a CEGIS/CEGAR-hybrid approach to solve the polyglot verification prob- lem through the use of contracts, illustrated in Figure 3. The approach is parameterized by a set of synthesis oracles, a set of verification oracles, and an intermediate language ¯Lfor use in contract synthesis. The approach consists of an inner CEGIS loop and an outer CEGAR loop, described in Algorithm 1. The inner CEGIS loop uses synthesis and verification oracles to itera- tively synthesize valid contracts, solving Problem 2. Upon obtaining valid contracts for all procedures, we generate an induced model M′specified entirely in a modeling lan- guage by combining the system model and the contracts. The system property is then checked against the induced model – if the property is proven, the system is verified; otherwise, the counterexample trace is checked for spuriousness. If the trace is spuri- ous, we exclude it by returning to the inner CEGIS loop to refine contracts; otherwise, we report failure with the trace. We assume access to synthesis oracles, verification oracles, and a model checker. Synthesis oracles take as inputs a set of variables V, a piece of code C, a set of positive Page 9: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 9 CEGIS loopSynthesisOracleVerificationOracleModel CheckerSpuriousness Checkerfail / traceinvalid contract /𝑿𝒊←𝑿𝒊 ⋃ 𝒙𝒊spurious trace /𝑿&𝒊←𝑿&𝒊 ⋃ 𝒙&𝒊contractprocedure all contracts valid / 𝑀’CEGAR loop pass / Passtrace not spurious / (violated, trace) Fig. 3: Flow diagram of our approach. The process starts from the synthesis oracle. Contracts are synthesized in the inner CEGIS loop and refined in the outer CEGAR loop.Xand¯Xdenotes positive and negative example sets. examples X, and a set of negative examples ¯X, and outputs a contract (P, Q)in¯Lthat satisfies the examples in Xand dissatisfies those in ¯X. Verification oracles take in a Hoare-triple, and return a pass result if the triple holds or a violated result along with a counterexample that violates the triple. A model checker takes in a model and a property, both specified in the modeling language, and reports pass if the property holds otherwise failure with a trace. 5.3 Intermediate Language for Contracts In Section 4, we discussed how contracts help abstract procedure bodies, which can be encoded and checked by language-specific verifiers. We also stated the challenge where, without loss of generality, given two contracts (P1, Q1)inL1and(P2, Q2) inL2, checking whether JQ1KL1=⇒ JP2KL2holds is challenging because JQ1KL1and JP2KL2cannot be easily encoded. In this section, we discuss how this challenge is solved by connecting contracts using an intermediate language as a glue between different languages. The idea is to check the implication in the underlying modeling language. This is intuitive since a model can only be checked if it is fully specified in the modeling language. In order to achieve this, we use an intermediate language ¯Lthat can be trans- lated into the modeling language and all other target languages. We assume there exists a translation function fL:¯L → L for each language Lthat translates expressions from ¯LtoLwhile preserving semantic equivalence, i.e. ∀e∈¯L.JeK¯L≡JfL(e)KL. Page 10: 10 Chen et al. The intermediate language and translation functions help in two ways. On the verifi- cation side, they allow us to work with expressions in the intermediate language without needing to obtain the actual encoding. This is because the translation functions faith- fully translate expressions to specific languages, and the model checker and verification oracles handle the encoding of the translated expressions. On the synthesis side, they allow us to maintain only a single interface with synthesis oracles, e.g., a grammar in the case of Syntax Guided Synthesis (SyGuS) [5], or examples/data in the case of neural approaches. Users can also provide an intermediate language and translation functions as parameters to the approach, effectively tailoring contract synthesis to specific appli- cations. An example of an intermediate language ¯Land its translations are shown in Fig 4. For the rest of the paper, we use ¯Gas the synthesis grammar and assume semantically consistent translation between languages. LESS_THAN(x, y)UCLID5Rustifx.is_signed() andy.is_signed():returnCodeBlock(f”{str(x)} < {str(y)}")ifx.is_unsigned() and y.is_unsigned():returnCodeBlock(f”{str(x)} <_u {str(y)}")returnCodeBlock(f”{str(x)} < {str(y)}") Fig. 4: Example translation functions for the less-than operator. The intermediate lan- guage ¯Lis in the middle. The modeling language L∗at the top is UCLID5 and the bottom target language Lis Rust. In Rust the operator directly translates to <, while in UCLID5 it depends on the variable types. 5.4 Contract Synthesis via CEGIS Loop To obtain a valid contract for a procedure (V, C,L)∈F, we iteratively synthesize contracts using synthesis oracles and check their validity using verification oracles in a CEGIS loop. In each iteration, given V,C, and initially empty example sets Xand¯X, a synthesis oracle returns a contract (P, Q)whose validity is determined by a verification oracle. If the Hoare-triple {P}C{Q}holds, the contract is valid; otherwise, a positive example generated by the verification oracle is added to Xand the loop continues until a valid contract is found. Provided that results from the verification oracle are correct, this process is guaranteed to produce valid contracts upon termination, thereby solving Problem 2. An example of a CEGIS iteration is shown in Table 1. With valid contracts for all procedures, an induced model entirely specified in the modeling language is created and checked by a model checker. The model checker may return a counterexample trace, which is then checked for spuriousness. Page 11: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 11 CEGIS Output SO iter. 1 P1:⊤, Q1: (resp == (req >= 3)) VO iter. 1 fail /(req,resp ) :d= (1,⊥), d′= (1,⊤) SO iter. 2 P2:⊤, Q2:(req < 3 || resp == ⊤) VO iter. 2 pass /− Table 1: Example of contract synthesis in a CEGIS loop for ProcessQuery . A syn- thesis oracle (SO) first synthesizes (P1, Q1). A verification oracle (VO) checks the contract and returns a positive example. Note that the pre-state dsatisfies P1but the post-state d′violates Q2. SO then refines the contract based on the example and gener- ates(P2, Q2), which passes the VO check. req andresp are short for request and response respectively. 5.5 Spuriousness Check A trace is not spurious if all steps of the trace can be produced by the procedures. To check this, for each procedure (V, C,L)we extract input/output pairs p={(d, d′)}i from the trace, where dandd′are variable assignments to Vbefore and after the pro- cedure call, respectively. For each (d, d′)∈p, we check whether {V=d}C{V̸=d′} holds using a verification oracle. If it holds, the procedure is impossible to produce the pair and thus the trace is spurious. The pair is then added as a negative example to ¯X and is excluded through re-synthesis in the CEGIS loop. Otherwise, if all such triples do not hold, meaning that it is possible for procedures to generate the trace, the trace is a true counterexample. Note that inequality instead of equality is used in the triple to account for non-deterministic procedures. Table 2 shows an example of a spuriousness check performed on ProcessQuery against a pair of assignments. Oracle Output MC in CEGAR (···, d= (2,⊤), d′= (10,⊤),···) VO in CEGAR pass /(req,resp ) :d= (2,⊤), d′= (2,⊤) SO in CEGIS P2:⊤, Q2:(req < 3 || resp == ⊤) && \old(req) == req VO in CEGIS pass /− Table 2: Example of spuriousness check for ProcessQuery . Continuing from Ta- ble 1, assume that a model checker (MC) checks the property using (P2, Q2)and re- turns a trace containing dandd′. To check spuriousness, we check the Hoare-triple {(req,resp ) =d}ProcessQuery {(req,resp )̸=d′}using VO, who returns pass . The pair is thus impossible, making the trace spurious. We then return to the CEGIS loop and re-synthesize a contract, which is proven valid by VO. Page 12: 12 Chen et al. 5.6 Correctness Guarantee We assume that results given by the verification oracle and the model checker are cor- rect, and that translation of the intermediate language preserves semantic equivalence. Provided these assumptions, our approach guarantees partial correctness , meaning that the result is correct when it terminates. A brief explanation is provided below. For soundness, notice that contracts used in the induced model must overapprox- imate procedure behaviors because they are validated by the verification oracle. This means that when the model checker reports pass , there does not exist a trace that is producible by the procedures that violates the property, since all producible behaviors have already been captured by the contracts. For completeness, when the model checker reports violated with a trace, the trace must violate the property. This trace must also be producible by the procedures because each step is checked by the verification oracle. 6 P OLYVERImplementation We implemented a tool called P OLYVERthat automates our approach. P OLYVERbuilds on top of the UCLID5 model checker, which supports procedural verification. It sup- ports the C and Rust programming languages. The infrastructure is shown in Figure 5. The tool is modular in that it can be extended to support other languages and verifiers. Users would only need to provide an oracle along with an oracle interface that parses the oracle output. Below we suggest synthesis and verification oracles and describe their interfaces. For the synthesis grammar ¯G, we use Boolean connectives, arithmetic operations, and comparators. 6.1 Synthesis Oracles and Interface Given a procedure, a set of positive Xand negative examples ¯X, and the synthesis grammar ¯G, the synthesis oracle is expected to return a contract that is consistent with the examples. Below we discuss two candidate synthesis oracles and their interfaces. LLM-based Synthesizer To take advantage of the procedure body C, we developed an LLM synthesizer that accounts for Cand applies chain-of-thought prompting to generate contracts based on the LLM’s reasoning. The interface of the LLM synthesizer consists of a prompt generator and a DSL parser. The LLM is first asked to summarize the behavior of Cand then asked to generate a contract in ¯Lbased on its summary. Inspired by [31], we converted ¯Gto a Python DSL and used it as a building block to help the LLM generate syntactically correct expressions. If an expression lies outside of ¯L, an error message returned by the parser is provided to the LLM for re-prompting. The message provides insightful information by locating the source of error, such as incorrect syntax or inconsistent types. If positive and negative examples are provided, the LLM will be prompted to find a constraint that distinguishes the two sets. Figure 7 shows an example of our prompt. Page 13: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 13 PolyVer (𝑃,𝑄)Verification OracleQuery GeneratorResultParserVerif. Oracle InterfacequeryP/NP + cexSynthesis OracleQuery GeneratorResult ParserSynth. Oracle Interfacequerysynth. resultModel CheckerModelBuilderSystemCex.ParserModel Checker Interface𝑀′system cex(𝑑,𝑑!)(𝑃,𝑄) Fig. 5: Infrastructure of PolyVer. (P, Q)are contracts and (d, d′)is an input/output pair. Note that this neural approach is not fully explored and is used to demonstrate the applicability of LLM-based solutions. Existing synthesis approaches can be easily inte- grated into the system as long as they generate contracts in ¯L. Syntax Guided Synthesis (SyGuS)/Syntax Guided Modulo Oracles (SyMO) Solver Another solution is to treat the problem as a Programming by Example (PBE) problem, asserting that the contract should satisfy examples in Xand dissatisfy those in ¯X. SyGuS [5] solvers can be used to solve PBE problems. A similar approach is to encode as a SyMO [37] problem, a variant of the Sy- GuS problem that allows constraints to be specified using black-box functions. Using a language-specific verifier as the black-box function, the SyMO solver itself effectively forms a CEGIS loop which directly solves Problem 2. The interface of a SyGuS/SyMO solver uses ¯Gas the grammar and builds a Sy- GuS/SyMO query that includes the examples as constraints. An SMT-LIB parser is used to parse the output of the solver, which is then converted to a contract in ¯G. 6.2 Verification Oracles and Interface We use CBMC and Kani as the verification oracles for C and Rust, respectively. The interface of the verification oracle consists of a query generator and a result parser. Given a contract (P, Q)and a procedure body Cin a language L, the query generator generates {P}C{Q}. Concretely, the procedure body has stubs for the contract, and the query generator replaces the stubs with the contract. An example is shown in List- ing 1.1. Receiving results from the oracle, the result parser parses the result and variable assignments before and after the procedure call if the triple does not hold. Page 14: 14 Chen et al. 7 Case Study: Lingua Franca Program Verification To demonstrate P OLYVER’s effectiveness, we developed a verifier for the Lingua Franca coordination language and show evaluation results in Section 8. Lingua Franca (LF) is a polyglot coordination language designed for real-time, concurrent, and distributed sys- tems. Using LF, the user defines concurrent components called reactors and connects them through ports and connections. Within a reactor, the user defines reactive message handlers called reactions , which are written in a target programming language, such as C, C++, Python, Rust, or TypeScript. A reaction is invoked when any of its triggers is present, and when invoked may trigger downstream reactions based on its connections. In this section, we give the LF program for the running example in Section 7.1, dis- cuss the challenges of LF program verification in Section 7.2, and describe our solution in the rest of the section. // Rust reactorreactorTrain{inputresponse:booloutputrequest:i32statecount:i32= 0statepassed:bool= falselogicalactionwait(2minute)logicalactionpass(10minute)reaction Init(startup) ->request{=ctx.set(request, self.count);=}reaction Waited(wait) ->request{=self.count= self.count.saturating_add(1);ctx.set(request, self.count);=}reaction Passed (pass) ->request{=self.passed= true;=}reaction ProcessResponse(response) ->pass,wait{=if ctx.get(response).unwrap() {ctx.schedule(pass, 0);} else{ctx.schedule(wait, 0);}=}}// C reactorreactorUMS{inputrequest:intoutputresponse : boolreaction ProcessQuery(request) ->response{=if(request->value< 3) {lf_set(response, rand() % 2 == 1);} else{lf_set(response, true);}=}}// Property to be verified@property(name="pass_within_12_seconds", spec="F[0, 12 minute](t.passed)", expect=false)mainreactor{// Component instantiationu=newUMS()t=newTrain()// Component interactionst.request->u.requestu.response->t.response} Fig. 6: LF program for a subway U-turn section management system from [17]. 7.1 Running Example in Lingua Franca Figure 6 shows the LF program for the running example, a subway U-turn manage- ment system (UMS). The Train andUMS reactors are implemented in Rust and C respectively. Reaction bodies are enclosed in special brackets {=...=} . The system describes a train asking for permission to pass a railway crossing. The train sends out Page 15: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 15 requests to the UMS, which decides whether to grant permission based on the number of attempts the train has made, described in ProcessQuery . If the train has made more than three attempts, the UMS will grant permission; otherwise, the UMS will ran- domly decide whether to grant permission. If permission is granted, the train will pass which takes ten minutes; otherwise, it waits for two minutes before making another request, described in ProcessResponse . The property under verification is that the train passes in 12 minutes, which is expected to be violated since the train may take as long as 16 minutes to pass. 7.2 Where Existing Solutions Fall Short As LF programs may consist of reactors in multiple languages, using a single language- specific verifier, e.g., CBMC for C and Kani for Rust, would require compiling the reactors into the supported language. This is clearly infeasible for languages with in- compatible types and semantics. Lin et al. present the LFVerifier [28], which performs end-to-end verification of LF programs by translating an LF program with its embedded C code into a monolithic axiomatic model that compiles to SMT. However, it only supports a small subset of the C language. Expanding on the LFVerifier to support other languages or a comprehensive fragment of C would be equivalent to developing encodings from these languages to SMT. This is precisely the challenge we aim to avoid with P OLYVER. By leveraging tools such as CBMC/Kani, we can already support large C/Rust fragments, for example, ANSI-C in the case of CBMC. 7.3 Mapping LF onto EFSM in UCLID5 Template LF is a deterministic model of computation due to its use of logical time [25], which defines a unique sequence of reaction invocations over logical time steps given the same inputs to the program. The determinism property enables a straightforward mapping from an LF program to a polyglot model defined in Definition 4. Similar to [28], an LF program is first mapped onto a State Space Diagram (SSD), which is a graph where each node vicontains the current logical time, a set Riof reac- tions, and a set of pending events, and each edge represents logical time advancement. Figure 8 shows an example SSD generated from the LF program in Figure 6. Once we obtain an SSD that describes the LF program behavior across logical time steps, we automatically map the SSD to the proposed polyglot model using a custom UCLID5 model generator. For an SSD, the process executes the following steps: 1. For each SSD node vi, generate a mode qi; 2. For each reaction, generate a UCLID5 procedure with empty pre/postconditions, to be later filled in by P OLYVER; 3. For each edge in the SSD connecting node vito node vj, generate a guarded tran- sition from mode qito mode qjwith guard g=V t∈TisPresent (t)for each subset Tof reaction triggers obtained from Ri. The associated update function uof the transition is the sequence of reaction procedures triggered by T. Transitions are concisely encoded using conditional statements in UCLID5. Page 16: 16 Chen et al. 7.4 Generating Rust and C Templates for Contract Verification To verify contracts for all procedures in the LF program, i.e., reactions written in Rust and C, we implemented a code generator in the LF compiler to generate a standalone C/Rust program for each procedure. Each program assumes the precondition, invokes the procedure under verification, and asserts the postcondition. For example, the reac- tion in UMS in Figure 6 results in the following C program, to be checked by CBMC. In Listing 1.1, PRECONDITION andPOSTCONDITION are undefined C macros set to a candidate pre/post condition pair passed in from the command line - therefore this program is considered a “template”. Rust templates with a similar structure are generated for Rust procedures, which are checked by the Kani model checker. 7.5 P OLYVER-based Verification of LF Programs We integrate the P OLYVER-based verifier into the LF compiler, which code-generates the C, Rust, and UCLID5 templates. Similar to [28], the compiler converts the system- level property, specified by the user in Safety MTL [33], into first-order logic over traces. A separate configuration JSON file is generated to record all variables in the application and the representation of their types in each target language. Once the tem- plates and the configuration file are generated, P OLYVERexecutes the tool flow de- scribed in Figure 3 until it returns to the user a verdict of pass , orviolated with a counterexample trace to the system-level property. 8 Experimental Evaluation Our experimentation evaluates P OLYVERon both its components: contract synthesis and verification using the synthesized contracts. Through our results, we would like to answer the following research questions: RQ1 How does P OLYVERcompare to previous work on verifying LF? RQ2 Is P OLYVERable to handle full-fledged LF examples? RQ3 Is P OLYVERable to verify multi-language LF systems? Benchmark Setup. We run all experiments on a 3.7GHz Intel Core i9 machine with 62GB RAM. We use a single core for all benchmarks. We use an LLM-based syn- thesizer (the o1-mini model) to generate contracts. For verification, we use CBMC- 6.3.1, Kani-0.56.0, and z3-4.8.7 as the backend SMT engine. Parallel synthesis is im- plemented in P OLYVERbut not used in the experiments. Since LF reactions often include unmodified variables that are important to identify strong/tight contracts, our LLM prompt contains a single instruction that asks the LLM to pay extra attention to unmodified state variables. 8.1 RQ1: How does P OLYVERcompare to previous work on verifying LF? We compare with the LFVerifier [28] using the LF benchmarks proposed by [28], which are arguably toy examples since the LFVerifier only supports a small fragment of C syn- tax. This comparison is not entirely fair because [28] uses a monolithic, axiomatic en- coding specialized for LF programs, while P OLYVERuses a compositional, operational Page 17: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 17 LOC #Iters CPU Time (s) Benchmark #Reaction C LF CEGIS CEGAR SOT VOT UT [28] ADASModel 7 16 73 7 1 220.9 3.2 4.7 22.8 AircraftDoor 3 10 38 3 1 82.1 1.2 1.7 6.6 Alarm 2 12 30 2 1 62.3 0.9 2.0 5.4 CoopSchedule 2 2 37 2 1 54.5 0.8 2.5 21.4 Election 3 7 35 3 1 124.2 1.4 3.8 44.9 Election2 2 6 31 3 2 142.8 1.5 5.9 12.4 Elevator 13 53 206 13 1 413.1 33.4 265.8 1363.2 Factorial 3 11 36 4 1 138.2 1.9 5.2 34.6 Fibonacci 5 14 39 5 1 167.3 2.4 3.2 248.5 PingPong 4 8 62 4 1 131.0 1.8 3.0 13.6 Pipe 5 10 60 5 1 147.4 2.4 163.7 154.5 ProcessMsg 4 10 32 4 1 116.6 1.8 2.6 14.0 ProcessSync 1 2 18 1 1 35.2 0.4 1.5 5.0 Railroad 10 64 66 17 2 678.9 9.0 74.3 162.5 Ring 4 7 45 4 1 105.6 1.8 3.3 15.9 RoadsideUnit 8 26 81 8 1 259.0 3.8 3.2 74.7 SafeSend 4 12 53 4 1 124.3 1.8 1.9 6.5 Subway 10 20 64 10 1 266.5 4.5 11.6 30.3 Thermostat 6 29 45 6 1 192.6 2.6 2.3 13.6 TrafficLight 4 46 42 7 1 396.1 3.8 8.9 - TrainDoor 3 7 50 3 1 74.5 1.2 1.8 5.7 UnsafeSend 4 12 53 4 1 122.1 1.8 2.7 6.5 Table 3: The synthesis and verification oracle times (SOT and VOT) are the total time across all CEGIS iterations, while the verification oracle time (VOT) is the total time spent on verification oracles, including contract validation and spuriousness checks. The verification time (UT) is the total time spent on UCLID5, which is called once per CEGAR iteration. The total runtime is SOT + VOT + UT. Page 18: 18 Chen et al. encoding that is more general. However, [28] is the only prior work to our knowledge that performs end-to-end automated verification of LF programs, so we compare to it. The proposed compositional approach requires P OLYVERto synthesize contracts for all reactions and use them to verify the system-level property. In the benchmark eval- uation, P OLYVERsuccessfully synthesizes the contracts and verifies allbenchmarks automatically, while [28] is unable to verify the TrafficLight example. We report on the synthesis iteration counts as well as runtimes in Table 3. In 18 out of the 22 benchmarks, the contracts were synthesized correctly in a single CEGIS loop, which takes around 33 seconds on average. For the rest, we observe that all but one reaction still requires one CEGIS iteration while one (hard) reaction requires all the remainder of iterations. The hard reaction often includes complex control flow and data dependencies, which makes it difficult even for human experts to reason about. We record the synthesis oracle time (SOT), verification oracle time (VOT), and ver- ification runtime (UT), and we note that the time to perform contract synthesis (SOT + VOT) far dominates the verification runtime (UT). Since [28] (and previous work generally) does not perform contract synthesis for multi-language systems, we can- not compare with prior work w.r.t. contract synthesis. However, given valid contracts POLYVERis up to ∼77x faster than [28] w.r.t. the verification time alone (VT vs. [28]). Qualitatively, P OLYVERimproves on [28] in a major way: There is no longer a need to develop a translator from a target language to the modeling language (e.g., C to UCLID5). Since we leverage robust language-specific verifiers (e.g., CBMC, Kani), we are also able to support a much more comprehensive fragment of C/Rust. For example, CBMC officially supports ANSI-C. In contrast, [28] develops a custom C-to-UCLID5 translator for a relatively small C subset (which is what these benchmarks use). We also experimented with SyGuS/SyMO solvers but found that these approaches required a large number of CEGAR iterations and failed to terminate in most bench- marks. We attribute this to the solvers’ lack of access to procedure bodies, relying solely on positive and negative examples for guidance. 8.2 RQ2: Is P OLYVERable to handle full-fledged LF examples? The benchmarks in Section 8.1 only use the C subset supported by [28], while P OLYVER (like CBMC) supports ANSI-C programs. Hence, we now evaluate P OLYVERon ex- amples requiring a broader fragment of C that [28] is unable to verify. Two examples from embedded systems (an LED controller and a hill climbing robot) and a satellite attitude controller [8,27] are used to evaluate the performance of our approach. The LED controller tracks whether the LED is blinking, and the hill climbing robot checks if it is in climbing mode when it is on a ramp, detected by ac- celerometers. These examples are resource-constrained embedded systems. When the implementation of a library function is unavailable, we assign non-deterministic values to input/output variables to over-approximate the function’s behaviors. The satellite ex- ample uses a PID controller and ensures that motors are not activated when the user input falls outside an acceptable range. The results in Table 4 show that our approach is able to handle language complexi- ties and effectively generate contracts for actual implementation code. In the LED and hill climbing examples, the LLM-based synthesizer was able to summarize procedure Page 19: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 19 behavior while ignoring details about low-level code when they are irrelevant. In the satellite example, the LLM struggled to synthesize a valid contract for the PID con- troller. However, we noticed that providing a simple comment in the code that captures the logic of the PID controller greatly helped synthesis. The synthesized contracts can all be validated by language-specific verifiers. This shows the potential applicability of POLYVERto aid verification of real-world systems. LOC #Iters CPU Time (s) Benchmark #Reaction C/Rust LF CEGIS CEGAR SOT VOT UT Blink 1 123 35 1 1 31.0 0.45 1.26 HillClimb 12 75 102 20 1 685.4 74.9 9.7 Satellite 6 424 166 9 2 729.0 32.2 7.0 Submit 7 43 66 7 1 276.7 4.8 9.4 TrainPass 5 14 44 5 1 160 8.8 14.1 Table 4: Results for full-fledged and multi-language examples. The Blink, HillClimb, and Satellite examples include real implementation code in C. Submit, TrainPass, and other 22 variants of the benchmarks in Table 3 are multi-language systems that can only be solved by P OLYVER. 8.3 RQ3: Is P OLYVERable to verify multi-language LF systems? The third set of benchmarks is the same as the first set, except that half of the C reactions are replaced with Rust reactions. On this set, our approach performed similarly to the first set. The LLM-based synthesizer is able to reason about Rust code as well as library functions in both C and Rust. The example shown (TrainPass) in Figure 6 was verified with one CEGIS iteration per reaction and one CEGAR iteration. Note that this set of benchmarks is not solvable by previous works as they involve multiple languages. This demonstrates the flexibility of our approach in supporting different languages. 9 Related Work The majority of approaches to verifying systems which involve multiple implementa- tion languages rely on manually translating some parts of the system until everything is written in one common language. For instance, Cook et al. [10] translate fragments of assembly code into C in order to achieve verification of the Xen hypervisor. Where it is common to combine small parts of one language with another, tools may provide automated translation for a fragment of that language, for example, prior work [28] verifying programs in Lingua Franca is capable of automatically translating a fragment of C into the verification language, and tools like CBMC [23,22] support small frag- ments of assembly code. Another approach to this problem is to treat the fragments of Page 20: 20 Chen et al. code that cannot be analyzed as black boxes and generate summaries for them based on observed input-output behavior of the code [34]. This is similar to our approach but re- lies on generating a sufficiently comprehensive set of input-output examples, and lacks guarantees that fully integrating the language-specific verifiers provides. We employ a style of compositional verification based on Hoare logic [19]. This is commonly applied to verification problems that, whilst written in only one language, are too large to be tackled in a single verification query [7]. To automate this, previ- ous work, e.g.,[12,32,16,39], has used techniques such as abstract interpretation and Counter-Example Guided Abstraction Refinement [9] to learn pre/post-conditions. There are also property-guided approaches to inferring contracts, while we aim to discover a contract that is just good enough to prove a given property. Here, previ- ous work uses syntactic patterns [14], Constrained Horn Clause solvers [4], and even CEGIS [40,3]. Recently, LLMs have been applied to learning specifications, including pre/post-conditions, for code [44]. The work closest to ours uses CEGIS and LLMs to infer post-conditions, but neither of these approaches addresses the challenge of multi- language verification. Our approach to learning contracts is based on an integration of CEGIS and CE- GAR. This integration can be considered a form of oracle-guided synthesis [21,37], with the verification oracle and model checker treated as oracles that guide the synthe- sis search. In the hardware domain, prior work [47] uses synthesis to generate environ- ment invariants for modular hardware verification. These invariants can be thought of as pre-conditions in our context. 10 Conclusions In conclusion, we formulate the polyglot verification problem and propose a contract- based approach that uses language-specific verifiers and a model checker to perform verification. We implemented our approach in a tool called P OLYVERthat automati- cally iterates between CEGAR and CEGIS loops to synthesize contracts using an LLM- based synthesizer. Results show that P OLYVER, when given valid contracts, is able to verify systems with faster runtime compared to previous work, and is able to verify full-fledged and multi-language Lingua Franca systems while previous work could not. For future work, we aim to enhance synthesis by exploring how incorporating sys- tem and property context into the synthesis oracle can enable property-directed contract generation. Symbolic techniques, such as program slicing, may help synthesizers focus on relevant procedure segments, leading to more precise abstractions. On the verifi- cation side, we plan to develop methods for identifying which contracts contribute to errors when a system-level counterexample is encountered. Acknowledgements. This work was supported in part by the DARPA Provably Cor- rect Design of Adaptive Hybrid Neuro-Symbolic Cyber Physical Systems (ANSR) program award number FA8750-23-C-0080; the National Science Foundation (NSF), award #CNS-2233769 (Consistency vs. Availability in Cyber-Physical Systems); Nis- san and Toyota under the iCyPhy Center, Intel, Berkeley Deep Drive, and the RDI Frontier Fellows Program. Page 21: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 21 References 1. GitHub - eclipse-cyclonedds/cyclonedds: Eclipse Cyclone DDS project, https:// github.com/eclipse-cyclonedds/cyclonedds 2. Alagar, V .S., Periyasamy, K.: Extended Finite State Machine, pp. 105–128. Springer London, London (2011). https://doi.org/10.1007/978-0-85729-277-3_7 ,https: //doi.org/10.1007/978-0-85729-277-3_7 3. Albarghouthi, A., Dillig, I., Gurfinkel, A.: Maximal specification synthesis. In: POPL. pp. 789–801. ACM (2016) 4. Alshnakat, A., Gurov, D., Lidström, C., Rümmer, P.: Constraint-based contract inference for deductive verification. In: 20 Years of KeY , Lecture Notes in Computer Science, vol. 12345, pp. 149–176. Springer (2020) 5. Alur, R., Bodik, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: 2013 Formal Methods in Computer-Aided Design. pp. 1–8 (2013). https://doi.org/10.1109/ FMCAD.2013.6679385 6. Barrett, C., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere, A., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability, chap. 26, pp. 825–885. IOS Press (2009) 7. Beyer, D., Kettl, M., Lemberger, T.: Decomposing software verification using distributed summary synthesis. Proc. ACM Softw. Eng. 1(FSE), 1307–1329 (2024) 8. Cardoso, J., Chanel, C., Chauvin, P., Hostallier, A., Lamaison, J., Mascarenas-Gonzales, A., Metral, E., Alloza, L.: Lab on the real-time control of reaction wheel (2022), unpublished lab exercise for the course 1MAE803: Real-time Control of Aerospace Systems, M.Sc in Aerospace Engineering, ISAE-SUPAERO 9. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y ., Veith, H.: Counterexample-guided abstraction refinement. In: CA V. Lecture Notes in Computer Science, vol. 1855, pp. 154–169. Springer (2000) 10. Cook, B., Döbel, B., Kroening, D., Manthey, N., Pohlack, M., Polgreen, E., Tautschnig, M., Wieczorkiewicz, P.: Using model checking tools to triage the severity of security bugs in the xen hypervisor. In: FMCAD. pp. 185–193. IEEE (2020) 11. Cordeiro, L., Kesseli, P., Kroening, D., Schrammel, P., Trtik, M.: Jbmc: A bounded model checking tool for verifying java bytecode. In: International Conference on Computer Aided Verification. pp. 183–190. Springer (2018) 12. Cousot, P., Cousot, R., Fähndrich, M., Logozzo, F.: Automatic inference of necessary precon- ditions. In: VMCAI. Lecture Notes in Computer Science, vol. 7737, pp. 128–148. Springer (2013) 13. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V ., Signoles, J., Yakobowski, B.: Frama-c: A software analysis perspective. In: International conference on software engineering and formal methods. pp. 233–247. Springer (2012) 14. Denney, E., Fischer, B.: A generic annotation inference algorithm for the safety certification of automatically generated code. In: GPCE. pp. 121–130. ACM (2006) 15. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.: Extended static checking for java. In: Proceedings of the ACM SIGPLAN 2002 Conference on Pro- gramming language design and implementation (PLDI). pp. 234–245 (2002) 16. Gordon, M., Collavizza, H.: Forward with hoare. In: Reflections on the Work of C. A. R. Hoare, pp. 101–121. Springer (2010) 17. Halbwachs, N., Lagnier, F., Ratel, C.: Programming and verifying real-time systems by means of the synchronous data-flow language lustre. IEEE transactions on software engi- neering 18(9), 785–793 (1992) Page 22: 22 Chen et al. 18. Havelund, K., Pressburger, T.: Model checking java programs using java pathfinder. Interna- tional Journal on Software Tools for Technology Transfer 2, 366–381 (2000) 19. Hoare, C.A.R.: An axiomatic basis for computer programming. Commun. ACM 12(10), 576–580 (1969) 20. Indrasiri, K., Kuruppu, D.: gRPC: up and running: building cloud native applications with Go and Java for Docker and Kubernetes. O’Reilly Media (2020) 21. Jha, S., Seshia, S.A.: A theory of formal synthesis via inductive learning. Acta Informatica 54(7), 693–726 (2017) 22. Kroening, D., Schrammel, P., Tautschnig, M.: CBMC: the C bounded model checker. CoRR abs/2302.02384 (2023) 23. Kroening, D., Tautschnig, M.: CBMC - C bounded model checker - (competition contri- bution). In: Ábrahám, E., Havelund, K. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 20th International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings. Lecture Notes in Computer Science, vol. 8413, pp. 389–391. Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_ 26,https://doi.org/10.1007/978-3-642-54862-8_26 24. Kroening, D., Tautschnig, M.: Cbmc–c bounded model checker: (competition contribution). In: Tools and Algorithms for the Construction and Analysis of Systems: 20th International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings 20. pp. 389–391. Springer (2014) 25. Lamport, L.: Time, clocks, and the ordering of events in a distributed system, p. 179–196. Association for Computing Machinery, New York, NY , USA (2019), https://doi.org/ 10.1145/3335772.3335934 26. Lattuada, A., Hance, T., Cho, C., Brun, M., Subasinghe, I., Zhou, Y ., Howell, J., Parno, B., Hawblitzel, C.: Verus: Verifying rust programs using linear ghost types. Proceedings of the ACM on Programming Languages 7(OOPSLA1), 286–315 (2023) 27. Lin, S., Jellum, E., Theile, M., Tanneberger, T., Sun, B., Jerad, C., Xu, R., Feng, G., Menard, C., Lohstroh, M., Castrillon, J., Seshia, S., Lee, E.: Pretvm: Predictable, efficient virtual machine for real-time concurrency (2024), https://arxiv.org/abs/2406.06253 28. Lin, S., Manerkar, Y .A., Lohstroh, M., Polgreen, E., Yu, S.J., Jerad, C., Lee, E.A., Seshia, S.A.: Towards building verifiable cps using lingua franca. ACM Trans. Embed. Comput. Syst. 22(5s) (Sep 2023). https://doi.org/10.1145/3609134 ,https://doi. org/10.1145/3609134 29. Lohstroh, M., Menard, C., Bateni, S., Lee, E.A.: Toward a lingua franca for determinis- tic concurrent systems. ACM Transactions on Embedded Computing Systems 20(4) (May 2021). https://doi.org/10.1145/3448128 ,https://doi.org/10.1145/ 3448128 30. Macenski, S., Foote, T., Gerkey, B., Lalancette, C., Woodall, W.: Robot operating sys- tem 2: Design, architecture, and uses in the wild. Science Robotics 7(66), eabm6074 (2022). https://doi.org/10.1126/scirobotics.abm6074 ,https://www. science.org/doi/abs/10.1126/scirobotics.abm6074 31. Mora, F., Wong, J., Lepe, H., Bhatia, S., Elmaaroufi, K., Varghese, G., Gonzalez, J.E., Pol- green, E., Seshia, S.A.: Synthetic programming elicitation for text-to-code in very low- resource programming and formal languages. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS) (2024), https://openreview. net/forum?id=kQPzFiwVIu 32. Moy, Y .: Sufficient preconditions for modular assertion checking. In: VMCAI. Lecture Notes in Computer Science, vol. 4905, pp. 188–202. Springer (2008) Page 23: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 23 33. Ouaknine, J., Worrell, J.: Safety metric temporal logic is fully decidable. In: International conference on tools and algorithms for the construction and analysis of systems. pp. 411– 425. Springer (2006) 34. Penninckx, W., Jacobs, B., Piessens, F.: Sound, modular and compositional verification of the input/output behavior of programs. In: ESOP. Lecture Notes in Computer Science, vol. 9032, pp. 158–182. Springer (2015) 35. Pereira, M., Ravara, A.: Cameleer: A deductive verification tool for ocaml. In: Computer Aided Verification: 33rd International Conference, CA V 2021, Virtual Event, July 20–23, 2021, Proceedings, Part II. p. 677–689. Springer-Verlag, Berlin, Heidelberg (2021). https: //doi.org/10.1007/978-3-030-81688-9_31 36. Polgreen, E., Cheang, K., Gaddamadugu, P., Godbole, A., Laeufer, K., Lin, S., Manerkar, Y .A., Mora, F., Seshia, S.A.: UCLID5: multi-modal formal modeling, verification, and syn- thesis. In: 34th International Conference on Computer Aided Verification (CA V). Lecture Notes in Computer Science, vol. 13371, pp. 538–551. Springer (2022) 37. Polgreen, E., Reynolds, A., Seshia, S.A.: Satisfiability and synthesis modulo oracles. In: VMCAI. Lecture Notes in Computer Science, vol. 13182, pp. 263–284. Springer (2022) 38. Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, J., Wheeler, R., Ng, A.Y ., et al.: ROS: An open-source robot operating system. In: ICRA workshop on open source software. vol. 3, p. 5 (2009) 39. Seghir, M.N., Kroening, D.: Counterexample-guided precondition inference. In: ESOP. Lec- ture Notes in Computer Science, vol. 7792, pp. 451–471. Springer (2013) 40. Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V .A.: Combinatorial sketch- ing for finite programs. In: ASPLOS. pp. 404–415. ACM (2006) 41. VanHattum, A., Schwartz-Narbonne, D., Chong, N., Sampson, A.: Verifying dynamic trait objects in rust. In: Proceedings of the 44th International Conference on Software Engi- neering: Software Engineering in Practice. p. 321–330. ICSE-SEIP ’22, Association for Computing Machinery, New York, NY , USA (2022). https://doi.org/10.1145/ 3510457.3513031 ,https://doi.org/10.1145/3510457.3513031 42. VanHattum, A., Schwartz-Narbonne, D., Chong, N., Sampson, A.: Verifying dynamic trait objects in rust. In: Proceedings of the 44th International Conference on Software Engi- neering: Software Engineering in Practice. p. 321–330. ICSE-SEIP ’22, Association for Computing Machinery, New York, NY , USA (2022). https://doi.org/10.1145/ 3510457.3513031 ,https://doi.org/10.1145/3510457.3513031 43. Viswanathan, M., Viswanathan, R.: Foundations for Circular Compositional Reasoning. In: Proceedings of the 28th International Colloquium on Automata, Languages and Program- ming,. pp. 835–847. ICALP ’01, Springer-Verlag, Berlin, Heidelberg (2001) 44. Wen, C., Cao, J., Su, J., Xu, Z., Qin, S., He, M., Li, H., Cheung, S., Tian, C.: Enchanting program specification synthesis by large language models using static analysis and program verification. In: CA V (2). Lecture Notes in Computer Science, vol. 14682, pp. 302–328. Springer (2024) 45. Yang, H., Nong, Y ., Wang, S., Cai, H.: Multi-language software development: Issues, chal- lenges, and solutions. IEEE Transactions on Software Engineering 50(3), 512–533 (2024). https://doi.org/10.1109/TSE.2024.3358258 46. Yang, H., Nong, Y ., Zhang, T., Luo, X., Cai, H.: Learning to detect and localize multilingual bugs. Proceedings of the ACM on Software Engineering 1(FSE), 2190–2213 (2024) 47. Zhang, H., Yang, W., Fedyukovich, G., Gupta, A., Malik, S.: Synthesizing environment in- variants for modular hardware verification. In: VMCAI. Lecture Notes in Computer Science, vol. 11990, pp. 202–225. Springer (2020) Page 24: 24 Chen et al. A Appendix Algorithm 1 PolyVer: CEGIS/CEGAR-hybrid approach Require: M= (Q, X, I 0, E, G, U, T ): a polyglot model, ϕ: a property Ensure: res: verification result, cex: a sequence of assignments to X 1:for all ei∈Edo 2: Xi,¯Xi← ∅,∅ 3:end for 4:while⊤do 5: C ← { synthesizeContract (ei,Xi,¯Xi)|ei∈E} ▷CEGIS loop 6: M′←induce (M,C) 7: (res,cex)←check (M′, ϕ) 8: ifres=pass then 9: return pass , - 10: else 11: isSpurious ← ⊥ 12: for all ei∈Edo 13: xi,¯xi←checkSpurious (cex, ei) 14: if¯xi̸=∅then 15: isSpurious ← ⊤ 16: end if 17: Xi←Xi∪xi 18: ¯Xi←¯Xi∪¯xi 19: end for 20: if¬isSpurious then 21: return violated ,cex 22: end if 23: end if 24:end while 1// ... Definitions of structs and LF APIs ... 2void UMS_reaction_1() { 3 if (query->value < 3) { 4 lf_set(response, rand() % 2 == 1); 5 } else { 6 lf_set(response, true); 7 } 8} 9int main() { 10 // calloc for inputs and the self struct. 11 query = calloc(1, sizeof(UMS_query_t)); 12 response = calloc(1, sizeof(UMS_response_t)); 13 // Assume that there are no NULL pointers. 14 __CPROVER_assume( 15 query && response 16 ); Page 25: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 25 Algorithm 2 synthesizeContract Require: e= (f, V, C, L): an external procedure, Xi: a set of possible examples, ¯Xi: a set of negative examples Ensure: (P, Q): a valid contract for e 1:while⊤do 2: (P, Q)←synthesisOracle (V, C,Xi,¯Xi,¯G) 3: res,ex←verificationOracle (J{P}C{Q}KL) 4: ifres=pass then 5: return (P, Q) 6: else 7: Xi←Xi∪ {ex} 8: end if 9:end while Algorithm 3 checkSpurious Require: cex: a sequence of assignments to X,e= (f, V, C, L): an external procedure Ensure: xi: a set of valid behaviors of e,¯xi: a set of invalid behaviors of e 1:p←collectPrestatePoststatePairsFromTrace (cex) 2:for all p= (d, d′)∈pdo 3: res ,_←verificationOracle (J{V=d}C{V̸=d′}KL) 4: ifres=pass then ▷cex is spurious 5: res,ex←verificationOracle (J{V=d}C{V=d′}KL) 6: xi←xi∪ {ex} 7: ¯xi←¯xi∪ {(d, d′)} 8: end if 9:end for 10:return xi,¯xi Your are a logician and your task is to create two Boolean expressions, a precondition and a postcondition, for a given function named ‘ProcessQuery'. Instructions below.-The precondition should depend solely on function inputs.-The postcondition can depend on both inputs and outputs of the function.-Only use constants that are defined in the code.-Express the pre/postconditions using the following set of Python API.# Addition operationdef Add(a, b): return a + b…First describe in words what the function `UMS_reaction_1`1. may assume before the function is executed, and2. guarantees after the function is executed. Also,3. think about constraints to add to the postcondition to satisfy (resp. eliminate) positive (resp. negative) examples, if provided.Output an itemized list of the above points and nothing else.The function has the following inputs and outputs:Inputs: ‘request'Outputs: 'response'Code below:…Based on your analysis, complete the following two Python functions PRECONDITION and POSTCONDITION that returns pre/postconditions.Requirements:-Ignore null and None inputs and outputs.-Do not write comments and explanations.-The function body should be a single return statement. -Do not use intermediate variables.…Use the provided Python API to synthesize the pre/postconditions.Complete the code below and put it in a Python code block.```pythondef PRECONDITION(query):return # TODO: Fill in using the provided APIdef POSTCONDITION(query, response):return # TODO: Fill in using the provided API```Below is an example of using the API to generate pre/postconditions.```pythondef PRECONDITION(in):return TrueExpr()def POSTCONDITION(in, out):return And(Eq(Select(out, 'count'), Select(in, 'count')),# Use TrueExpr() instead of TrueEq(Select(out, 'is_present), TrueExpr()),)``` Fig. 7: Prompt used in the LLM-based synthesizer. Page 26: 26 Chen et al. S0 (0, 0) Reactions invoked: t.reaction_4 t.reaction_1 u.reaction_1 Pending events: (t.wait, (2 minute, 0)) (t.pass, (10 minute, 0))S1 (2 minute, 0) Reactions invoked: t.reaction_4 t.reaction_2 u.reaction_1 Pending events: (t.wait, (4 minute, 0)) (t.pass, (10 minute, 0)) (t.pass, (12 minute, 0))+2 minuteS2 (4 minute, 0) Reactions invoked: t.reaction_4 t.reaction_2 u.reaction_1 Pending events: (t.wait, (6 minute, 0)) (t.pass, (12 minute, 0)) (t.pass, (10 minute, 0)) (t.pass, (14 minute, 0))+2 minuteS3 (6 minute, 0) Reactions invoked: t.reaction_4 t.reaction_2 u.reaction_1 Pending events: (t.wait, (8 minute, 0)) (t.pass, (10 minute, 0)) (t.pass, (14 minute, 0)) (t.pass, (12 minute, 0)) (t.pass, (16 minute, 0))+2 minute (a) The initialization phase of SSD (including S4) S4 (8 minute, 0) Reactions invoked: t.reaction_4 t.reaction_2 u.reaction_1 Pending events: (t.pass, (10 minute, 0)) (t.wait, (10 minute, 0)) (t.pass, (14 minute, 0)) (t.pass, (16 minute, 0)) (t.pass, (12 minute, 0)) (t.pass, (18 minute, 0))S5 (10 minute, 0) Reactions invoked: t.reaction_4 t.reaction_3 t.reaction_2 u.reaction_1 Pending events: (t.pass, (12 minute, 0)) (t.wait, (12 minute, 0)) (t.pass, (14 minute, 0)) (t.pass, (18 minute, 0)) (t.pass, (16 minute, 0)) (t.pass, (20 minute, 0))+2 minuteS6 (12 minute, 0) Reactions invoked: t.reaction_4 t.reaction_3 t.reaction_2 u.reaction_1 Pending events: (t.pass, (14 minute, 0)) (t.wait, (14 minute, 0)) (t.pass, (20 minute, 0)) (t.pass, (18 minute, 0)) (t.pass, (16 minute, 0)) (t.pass, (22 minute, 0)) +2 minuteS7 (14 minute, 0) Reactions invoked: t.reaction_4 t.reaction_3 t.reaction_2 u.reaction_1 Pending events: (t.pass, (16 minute, 0)) (t.wait, (16 minute, 0)) (t.pass, (20 minute, 0)) (t.pass, (22 minute, 0)) (t.pass, (18 minute, 0)) (t.pass, (24 minute, 0))+2 minute +2 minute -6 minute (b) The periodic phase of SSD (excluding S4)) Fig. 8: The State Space Diagram (SSD) of the LF program in Figure 6. The SSD is a lasso, with an initialization phase in Figure 8a and a periodic phase in Figure 8b. Page 27: POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 27 17 // Initialize input ports, input logical actions, and self structs with nondeterministic values. 18 *query = nondet_UMS_query_t(); 19 // CBMC checks pre/post-conditions. 20 __CPROVER_assume(PRECONDITION); 21 UMS_reaction_1(); 22 assert(POSTCONDITION); 23} Listing 1.1: C file generated for CBMC

---