Authors: Pei-Wei Chen, Shaokai Lin, Adwait Godbole, Ramneet Singh, Elizabeth Polgreen, Edward A. Lee, Sanjit A. Seshia
Paper Content:
Page 1:
POLYVER: A Compositional Approach for Polyglot
System Modeling and Verification
Pei-Wei Chen1, Shaokai Lin1, Adwait Godbole1, Ramneet Singh2,
Elizabeth Polgreen3, Edward A. Lee1, and Sanjit A. Seshia1
1University of California, Berkeley, USA,
{pwchen,shaokai,adwait,eal,sseshia}@berkeley.edu
2Indian Institute of Technology, Delhi, India, ramneet2001@gmail.com
3University of Edinburgh, UK, elizabeth.polgreen@ed.ac.uk
Abstract. Several software systems are polyglot ; that is, they comprise programs
implemented in a combination of programming languages. Verifiers that directly
run on mainstream programming languages are currently customized for single
languages. Thus, to verify polyglot systems, one usually translates them into a
common verification language or formalism on which the verifier runs. In this pa-
per, we present an alternative approach, P OLYVER, which employs abstraction,
compositional reasoning, and synthesis to directly perform polyglot verification .
POLYVERconstructs a formal model of the original polyglot system as a tran-
sition system where the update functions associated with transitions are imple-
mented in target languages such as C or Rust. To perform verification, P OLYVER
then connects a model checker for transition systems with language-specific ver-
ifiers (e.g. for C or Rust) using pre/post-condition contracts for the update func-
tions. These contracts are automatically generated by synthesis oracles based on
syntax-guided synthesis or large language models (LLMs), and checked by the
language-specific verifiers. The contracts form abstractions of the update func-
tions using which the model checker verifies the overall system-level property on
the polyglot system model. P OLYVERiterates between counterexample-guided
abstraction-refinement (CEGAR) and counterexample-guided inductive synthe-
sis (CEGIS) until the property is verified or a true system-level counterexample
is found. We demonstrate the utility of P OLYVERfor verifying programs in the
Lingua Franca polyglot language using the UCLID5 model checker connected
with the CBMC and Kani verifiers for C and Rust respectively.
Keywords: Polyglot languages · verification · synthesis · contracts · large lan-
guage models.
1 Introduction
Most modern software systems comprise code written in a combination of program-
ming languages [45]. For example, the Robot Operation System 2 (ROS2) [38,30] al-
lows users to specify processes that communicate with each other, called nodes, where
each node can be implemented in a different programming language. Other languages
and frameworks provide similar cross-language functionality for various domains, like
Lingua Franca [29] for real-time, distributed embedded systems, and Eclipse CyclonearXiv:2503.03207v2 [cs.PL] 12 Mar 2025
Page 2:
2 Chen et al.
DDS [1] and gRPC [20] for distributed systems and networking. We refer to these
multi-language software systems as polyglot systems.
Polyglot systems allow developers to pick and choose the best programming lan-
guage for specific tasks. Unfortunately, this flexibility comes at a cost, with multiple
studies indicating that the increased complexity of polyglot systems can lead to an in-
crease in bugs [45]. Recent work has addressed these issues through testing and other in-
complete methods. For example, Yang et al. [46] develop a deep-learning-based method
that helps localize bugs specifically due to the interaction between multiple languages
in a single system. These tools and techniques can help developers build safer systems,
but they cannot provide the guarantees needed in safety-critical settings and other vital
settings in which polyglot systems appear.
In this paper, we study the problem of formally verifying polyglot systems. Just
like developing polyglot systems introduces new challenges in the development cycle,
verifying polyglot systems introduces new challenges in the verification process. Over
the past few decades, there has been substantial work on program verifiers that operate
directly on individual programming languages. We refer to these tools as language-
specific verifiers . For example, CBMC [24] and Frama-C [13] can be used to verify
C code, Kani [41] and Verus [26] for Rust, ESC/Java [15], Java PathFinder [18], and
JBMC for Java [11], etc. At the same time, few approaches have been proposed for
multi-language systems, so polyglot verification challenges remain largely unexplored.
The major existing approach to verify polyglot systems involves translating all code
to one common language and applying existing reasoning engines to that common rep-
resentation. However, this approach is challenging due to the complexity of handling
diverse syntax and semantics across languages, and arguably, it loses the benefits of
having highly optimized language-specific verifiers. For instance, prior work such as
Lin et al. [28], only handles very restricted C code within the Lingua Franca language,
highlighting the difficulties of such monolithic approaches. Encoding a multi-language
system in a single verification model often results in scalability issues, as the generated
verification problem can become prohibitively large. To our knowledge, no existing ap-
proach enables automated verification of multi-language software systems by allowing
language-specific verifiers to cooperate to solve a problem.
We propose a compositional verification approach that leverages contracts, synthe-
sis, and language-specific verifiers to address this gap. Contracts, which abstract in-
dividual components, eliminate the need for full language translation and act as glue
between components. In our approach, language-specific verifiers are used to validate
contracts for components written in the corresponding language, while system-level
verification is performed using these contracts within a formal modeling language.
We employ a large language model (LLM)-based contract synthesizer to generate con-
tracts by summarizing programs and reasoning across multiple languages. This enables
scalable and automated verification of multi-language systems without requiring a full
translation into a single verification framework.
In summary, the contributions of this paper are as follows.
1.Polyglot verification problem. We formalize the polyglot verification problem,
and propose P OLYVER, an approach that uses contracts to perform compositional
verification by integrating multiple language-specific verifiers.
Page 3:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 3
2.Automatic synthesis of contracts. We propose a method for automatically syn-
thesizing and refining pre/post-condition contracts needed for the polyglot verifica-
tion problem, combining counterexample-guided inductive synthesis (CEGIS) and
counterexample-guided abstraction refinement (CEGAR).
3.Our tool. We implemented a tool, called P OLYVER, that builds upon UCLID5 [36]
and supports C and Rust through CBMC [22] and Kani [41]. We developed an
LLM-based synthesis oracle that outperforms pure symbolic approaches. The tool
is modular in that synthesis and verification oracles can be easily added or replaced.
4.Case study and evaluation. We developed a verifier for the Lingua Franca coor-
dination language [29] and show that our approach is able to verify multi-language
systems with reasonable synthesis time. We also introduce a new set of polyglot
benchmarks and realistic applications that are only solved by our approach.
2 Motivation and Background
We start by presenting a motivating (running) example of a multi-language verification
problem. Figure 1 shows a model of a train management system as an Extended Finite
State Machine (EFSM) [2], i.e., a transition system where transitions have procedures
attached to them. Our example has procedures in multiple languages, e.g., the Waited
procedure is a Rust function while ProcessQuery is a C function.
𝑞!𝑞"t := 0;initialize;ProcessQuery;ProcessResponse;wait /t := t + 2 Waited; ProcessQuery; ProcessResponse;pass /t := t + 10;Passed;state vars: t, wait, pass, response, request, count, passedfnWaited(request: &mut Port,count: &mut u32) {*count= count.saturating_add(1);ctx.set(request, count);}void ProcessQuery(Port request, Port response) {if(request->value< 3) {lf_set(response, rand() % 2);} else{lf_set(response, true);}}
Fig. 1: An Extended Finite State Machine (EFSM) describing a train control system.
The procedures in blue andorange are Rust and C procedures respectively. Imple-
mentations of two procedures are shown on the right.
Encodings must match language semantics. Formal verification of such a system
requires encoding the procedures and the transition logic in a formal modeling/ver-
ification language (e.g., either directly as an SMT or an intermediate language such
as Boogie). This encoding must respect the language semantics. For example, instead
of wrapping around, the x.saturating_add(1) Rust function in Waited satu-
rates addition overflows to the maximum integer value (for that type). Figure 2 shows
the SMT encoding of this function. In general, a valid encoding must account for all
language-specific constructs (e.g., types, operators) which can be very complex for
modern languages such as Rust.
Page 4:
4 Chen et al.
ifx.type=="i32":returnCodeBlock(f"if(x + 1 < 2147483647bv32) then x + 1 else 2147483647bv32")elifx.type=="u32":returnCodeBlock(f"if(x + 1 <_u 4294967295bv32) then x + 1 else 4294967295bv32")elif...:...
Fig. 2: UCLID5 translation of x.saturating_add(1) in Rust. Notice that the
translation depends on the type of x. In UCLID5 and SMT-LIB, bitvectors do not have
implicit type interpretation. Instead, they are only interpreted when specific operators
are applied. For example, <(bvslt ) and <_u (bvult ) stands for the signed and un-
signed less than comparators in UCLID (SMT-LIB), respectively.
Language-specific verification tools such as CBMC [23] for C, Kani [42] for Rust,
and Cameleer [35] for OCaml already perform the heavy lifting of faithfully captur-
ing language semantics. However, these verification engines might use different proof
techniques. For example, CBMC and Kani perform symbolic bounded model checking,
while Cameleer uses deductive verification. Even if they perform a symbolic encod-
ing to a common language, combining these encodings requires identifying symbol
mappings between encodings. This can be challenging to reverse-engineer and imposes
restrictions on the verifiers one can combine. Thus, solving the multi-language verifi-
cation problem, even with language-specific verifiers, is challenging.
Contract-based verification operates by: (a) identifying contracts , which are abstrac-
tions of individual system components, (b) checking contracts independently for each
component, and (c) composing contracts to check that they together imply the sys-
tem level property. In P OLYVER, language-specific verifiers reason about individual
procedures while contracts abstract procedures and thus act as glue specifications be-
tween the verifiers. In this sense, verification in P OLYVERoperates much like assume-
guarantee/requires-ensures reasoning [32,43]. However, manually identifying contracts
is challenging. Not only must they be a valid abstraction of the procedure, but they
should also be strong enough to imply the overall system property. Even the relatively
simple ProcessQuery procedure in Figure 1 requires a non-trivial postcondition:
((request->value < 3) || (response->value == true))
&& (request == \old(request))
To address this, POLYVERperforms automated contract synthesis by using refinement-
based neural (LLM-based) search for a contract candidate in combination with symbolic
verification of that candidate. For the example in Figure 1, P OLYVERconverges on the
contract above. Using the synthesized contracts P OLYVERcan piggyback on existing
single-language verifiers to solve the multi-language verification, all the time avoiding
the challenges of encoding the entire system into a single language.
3 Problem Formulation
Polyglot systems are compositions of components where each component is a proce-
dure in some target language L(e.g., C, Rust, Java). Such a setup is widely seen in
Page 5:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 5
distributed systems where nodes execute code in a specific language and communicate
with other nodes using, for example, remote procedure calls (RPCs). In this section, we
formalize such systems and define the verification problem.
We view polyglot systems as a variant of Extended Finite State Machines (EFSM)
[2]. Informally, EFSMs are suitable abstractions for polyglot systems because compo-
nent coordination can be modeled as transitions between modes , representing polyglot
system states, where the update function of a transition represents the actual coordina-
tion done as a sequence of procedure calls. EFSMs operate on a set of typed variables
V. We refer to the (typed) domain of variable v∈VasD(v). Further, by D(V), we
denote the Cartesian product of the domains of all variables (and thus forms the set of
assignments to variables in V). Unless otherwise noted, we use the same set of typed
variables Vfor the rest of the paper. We now formally define EFSMs.
Definition 1 (Extended Finite-State Machine). AnExtended Finite-State Machine
(EFSM) is a tuple (Q, V, I 0, T)where
1.Qis a set of modes,
2.Vis a set of typed variables,
3.I0⊆Q×D(V)is a set of initial states,
4.T⊆(Q×Q)×(G×U)is a relation that maps a pair of modes to a transition
characterized by a guard in G=D(V)→Band an update function in U=
D(V)→D(V).
The translation relation is characterized such that, if the guard condition gof an edge
transition from qtoq′is satisfied, variables will be updated by a corresponding update
function u. Formally,
(q, q′, g, u)∈T∧g(d)∧d′=u(d)
where dandd′are variable assignments in qandq′respectively.
While the update functions ( u) that act in each mode of an EFSM are kept abstract,
polyglot systems specialize them to be sequences of procedures. We first formally de-
fine languages, which will then be used to define procedures.
Definition 2 (Language). Alanguage Lis a set of expressions and predicates con-
structed through syntactic rules over a set of variables V. The interpretation of an
expression e∈ L is defined as JeKL:D(V)→D(V), and the interpretation of a
predicate p∈ L is defined as JpKL:D(V)→B.
We assume the same set of variables Vfor each language and that variables in each
language have matching types, hence using the same D(V)for different languages. In
cases where a type is ambiguous, we assume we have access to a type mapping that
gives additional information about the variable. For example, UCLID5 does not have
signed bitvectors but instead has signed bitvector operations (similar to SMT-LIB [6]),
and so when translating a 32-bit bitvector xin the UCLID5 language to Rust, the
signed/unsignedness of the bitvector must be determined by propagating the signs from
the bitvector operators through to the operands. We assume this simple type-inference
is available to us. Next, we define procedures.
Page 6:
6 Chen et al.
Definition 3 (Procedure). Aprocedure fis a tuple (V, C,L)where C, the procedure
body, is an expression in Lover a set of variables V.
We assume that procedure bodies ( C) are deterministic and free of side effects (e.g.,
changes to global variables). Note that nondeterministic procedures can be introduced
by changing the interpretation of expressions from a function to a relation. We remark
that only the expression Cis known to us, and obtaining the interpretation JCKLas an
encoding is difficult because it requires encoding the semantics of the full language.
We denote ⃝k
j=1fj(·) =Ck◦···◦ C1(·)to represent the composition of a sequence
of procedure bodies where each element fj= (Vj, Cj,Lj). We now formulate polyglot
models which serve as our abstraction of polyglot systems. Essentially, a polyglot model
is an EFSM where update functions ( u) are compositions of procedure invocations.
Definition 4 (Polyglot Model). Apolyglot model Mis a tuple (Q, V, I 0, T, F )where
(Q, V, I 0, T)form an EFSM and Fis a set of procedures. Update functions u(·) =
⃝jfj(·)are characterized as sequences of procedure invocations where fj∈F.
An example of a polyglot model is shown in Figure 1, where Q={q0, q1},Vis
the set of state variables, I0is the state after executing the procedures on the initial
transition, Tis shown by the transitions, and Fis the set of procedures in colored text.
A trace of a model Mis a sequence of assignments to variables in V, related by
the transition relation, that can be obtained by executing M. A property ϕis a formula
overVthat characterizes a set of traces. Let the set of all traces that can be produced
by a model MbeT(M). Given a property ϕ, we write M |=ϕifT(M)⊆ϕ. Below
we define the polyglot verification problem.
Problem Statement 1 (Polyglot verification problem) Given a polyglot model M=
(Q, V, I 0, T, F )and a system property ϕdetermine whether M |=ϕ. IfM ̸|=ϕ,
provide a counterexample trace that violates ϕand is producible by procedures in F.
Naively, to verify a polyglot model one has to translate procedures in the update
functions to a common modeling language, e.g. SMT-LIB. However, this is a tedious
and error-prone task for each language. To avoid the need for translation, we propose
using contracts as abstractions of procedure behaviors instead to verify the model.
4 Polyglot Verification through Contracts
In this section, we introduce the key ingredient in our approach to Problem 1: a contract-
based decomposition.
How contracts help. Contracts help in two ways. Firstly, contracts abstract (over-
approximate) the internal complexity of the procedure, leading to smaller queries/more
performant verification. However, more crucially, contracts avoid having to encode the
internal procedure body in our verification query . Instead, we only need to be able to
encode the contract grammar. We begin by formally defining contracts.
Page 7:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 7
Definition 5 (Contract). Acontract of a procedure (V, C,L)is a tuple (P, Q)where
PandQare precondition and postcondition predicates over Vin language L. The
contract is valid if and only if the Hoare-triple {P}C{Q}holds, meaning
∀d, d′∈D(V).(JPKL(d)∧d′=JCKL(d)) =⇒ JQKL(d′). (1)
Equation (1) can be encoded and checked by a language-specific verifier for lan-
guageL, thereby removing the translation burden from users. Once we have obtained a
valid contract for a procedure, we can safely replace the procedure without losing any
possible behaviors.
Given a polyglot model M, we define an abstract (EFSM) model M′that replaces
each update in Mwith a valid contract. Formally, for each update u=⃝j∈1...kfj(·)
we generate contracts (Pj, Qj)for each fjin language Ljsuch that, JQjKLj=⇒
JPj+1KLj+1. Then M′replaces uwith the contract-derived relation u′={(a, b)|
JP1(a)KL1=⇒ JQk(b)KLk}. Since the replacement contract u′overapproximates u,
we have the following theorem.
Theorem 1. IfM′|=ϕ, thenM |=ϕ.
Proof. SinceM′|=ϕandu⊆u′,T(M)⊆ T(M′)⊆ϕ. Thus M |=ϕ.
Although checking the model M′holds on ϕimplies that the original model also
holds, we note that checking the implication JQjKLj=⇒ JPj+1KLj+1is non-trivial.
Since the implication contains predicate interpretations in two languages LjandLj+1,
different from Equation (1) we are unable to use a language-specific verifier to check
the implication. We address this problem by using an intermediate language to connect
different languages, discussed in Section 5.3.
Counterexample guided contract refinement. While a property holding in M′im-
plies that it holds in M, the converse however does not hold. A counterexample trace
found in M′may be a spurious counterexample that cannot be produced by M. In this
case, the contracts need to be strengthened to exclude this counterexample. Coming up
with valid contracts that are still strong enough to prove the system-level property is
challenging, often involving iterative contract refinement. To address this, we introduce
the valid contract synthesis problem, which aims to generate a contract that avoids a set
of spurious traces provided as negative examples.
Problem Statement 2 (Valid Contract Synthesis Problem) Given a procedure f=
(V, C,L)and positive and negative example sets E={X,Y}iand¯E={¯X,¯Y}i,
synthesize a valid contract for fthat includes the positive examples and excludes the
negative examples.
In the next section, we first develop an approach for Problem 2 and then use that to
solve Problem 1 by iteratively synthesizing (and improving) contracts.
5 An Abstraction Synthesis Approach for Polyglot Model
Verification
We propose a contract-based approach to solve the polyglot verification problem. Con-
tracts are synthesized using an intermediate language in a Counterexample Guided In-
Page 8:
8 Chen et al.
ductive Synthesis (CEGIS) loop. However, since contracts may include spurious be-
haviors that violate the property, they are further refined in a Counterexample Guided
Refinement Abstraction (CEGAR) loop. The approach iteratively switches between
CEGIS and CEGAR loops to search for contracts that prove or disprove the property.
We first briefly introduce CEGIS and CEGAR in Section 5.1, and propose our ap-
proach in Section 5.2. In Section 5.3, we explain how the intermediate language “glues”
different languages and simplifies contract synthesis, and in Section 5.4 and 5.5 detail
how contracts are synthesized and refined. Finally, we discuss correctness guarantees
in Section 5.6.
5.1 Preliminaries
Below we briefly describe the main algorithmic concepts used in our approach.
Counterexample Guided Inductive Synthesis (CEGIS) is an iterative interaction
between a learner and a teacher , where the goal is to synthesize a correct solution that
satisfies a given specification. The learner , relying on a synthesis oracle, is responsible
for generating candidate solutions that satisfy all known constraints from a constrained
search space. The teacher , relying on a verification oracle, checks whether the candidate
solution provided by the learner satisfies the full specification. Synthesis succeeds if the
solution is correct; otherwise, the teacher provides a counterexample that shows the
failure of the candidate.
Counterexample Guided Abstract Refinement (CEGAR) is a technique used in
model checking to iteratively refine an abstract model of a system. Initially, a coarse
abstraction of the system is checked for correctness. If the verification fails, a coun-
terexample is analyzed to determine whether it is spurious. If the counterexample is
real, the system is deemed incorrect; otherwise, the abstraction is refined to eliminate
the spurious counterexample, leading to a more precise model.
5.2 Approach Overview
We propose a CEGIS/CEGAR-hybrid approach to solve the polyglot verification prob-
lem through the use of contracts, illustrated in Figure 3. The approach is parameterized
by a set of synthesis oracles, a set of verification oracles, and an intermediate language
¯Lfor use in contract synthesis.
The approach consists of an inner CEGIS loop and an outer CEGAR loop, described
in Algorithm 1. The inner CEGIS loop uses synthesis and verification oracles to itera-
tively synthesize valid contracts, solving Problem 2. Upon obtaining valid contracts for
all procedures, we generate an induced model M′specified entirely in a modeling lan-
guage by combining the system model and the contracts. The system property is then
checked against the induced model – if the property is proven, the system is verified;
otherwise, the counterexample trace is checked for spuriousness. If the trace is spuri-
ous, we exclude it by returning to the inner CEGIS loop to refine contracts; otherwise,
we report failure with the trace.
We assume access to synthesis oracles, verification oracles, and a model checker.
Synthesis oracles take as inputs a set of variables V, a piece of code C, a set of positive
Page 9:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 9
CEGIS loopSynthesisOracleVerificationOracleModel CheckerSpuriousness Checkerfail / traceinvalid contract /𝑿𝒊←𝑿𝒊 ⋃ 𝒙𝒊spurious trace /𝑿&𝒊←𝑿&𝒊 ⋃ 𝒙&𝒊contractprocedure
all contracts valid / 𝑀’CEGAR loop
pass / Passtrace not spurious / (violated, trace)
Fig. 3: Flow diagram of our approach. The process starts from the synthesis oracle.
Contracts are synthesized in the inner CEGIS loop and refined in the outer CEGAR
loop.Xand¯Xdenotes positive and negative example sets.
examples X, and a set of negative examples ¯X, and outputs a contract (P, Q)in¯Lthat
satisfies the examples in Xand dissatisfies those in ¯X. Verification oracles take in a
Hoare-triple, and return a pass result if the triple holds or a violated result along
with a counterexample that violates the triple. A model checker takes in a model and a
property, both specified in the modeling language, and reports pass if the property holds
otherwise failure with a trace.
5.3 Intermediate Language for Contracts
In Section 4, we discussed how contracts help abstract procedure bodies, which can
be encoded and checked by language-specific verifiers. We also stated the challenge
where, without loss of generality, given two contracts (P1, Q1)inL1and(P2, Q2)
inL2, checking whether JQ1KL1=⇒ JP2KL2holds is challenging because JQ1KL1and JP2KL2cannot be easily encoded. In this section, we discuss how this challenge
is solved by connecting contracts using an intermediate language as a glue between
different languages.
The idea is to check the implication in the underlying modeling language. This
is intuitive since a model can only be checked if it is fully specified in the modeling
language. In order to achieve this, we use an intermediate language ¯Lthat can be trans-
lated into the modeling language and all other target languages. We assume there exists
a translation function fL:¯L → L for each language Lthat translates expressions from
¯LtoLwhile preserving semantic equivalence, i.e. ∀e∈¯L.JeK¯L≡JfL(e)KL.
Page 10:
10 Chen et al.
The intermediate language and translation functions help in two ways. On the verifi-
cation side, they allow us to work with expressions in the intermediate language without
needing to obtain the actual encoding. This is because the translation functions faith-
fully translate expressions to specific languages, and the model checker and verification
oracles handle the encoding of the translated expressions. On the synthesis side, they
allow us to maintain only a single interface with synthesis oracles, e.g., a grammar in
the case of Syntax Guided Synthesis (SyGuS) [5], or examples/data in the case of neural
approaches. Users can also provide an intermediate language and translation functions
as parameters to the approach, effectively tailoring contract synthesis to specific appli-
cations.
An example of an intermediate language ¯Land its translations are shown in Fig 4.
For the rest of the paper, we use ¯Gas the synthesis grammar and assume semantically
consistent translation between languages.
LESS_THAN(x, y)UCLID5Rustifx.is_signed() andy.is_signed():returnCodeBlock(f”{str(x)} < {str(y)}")ifx.is_unsigned() and y.is_unsigned():returnCodeBlock(f”{str(x)} <_u {str(y)}")returnCodeBlock(f”{str(x)} < {str(y)}")
Fig. 4: Example translation functions for the less-than operator. The intermediate lan-
guage ¯Lis in the middle. The modeling language L∗at the top is UCLID5 and the
bottom target language Lis Rust. In Rust the operator directly translates to <, while in
UCLID5 it depends on the variable types.
5.4 Contract Synthesis via CEGIS Loop
To obtain a valid contract for a procedure (V, C,L)∈F, we iteratively synthesize
contracts using synthesis oracles and check their validity using verification oracles in a
CEGIS loop. In each iteration, given V,C, and initially empty example sets Xand¯X, a
synthesis oracle returns a contract (P, Q)whose validity is determined by a verification
oracle. If the Hoare-triple {P}C{Q}holds, the contract is valid; otherwise, a positive
example generated by the verification oracle is added to Xand the loop continues until
a valid contract is found. Provided that results from the verification oracle are correct,
this process is guaranteed to produce valid contracts upon termination, thereby solving
Problem 2. An example of a CEGIS iteration is shown in Table 1.
With valid contracts for all procedures, an induced model entirely specified in the
modeling language is created and checked by a model checker. The model checker may
return a counterexample trace, which is then checked for spuriousness.
Page 11:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 11
CEGIS Output
SO iter. 1 P1:⊤, Q1: (resp == (req >= 3))
VO iter. 1 fail /(req,resp ) :d= (1,⊥), d′= (1,⊤)
SO iter. 2 P2:⊤, Q2:(req < 3 || resp == ⊤)
VO iter. 2 pass /−
Table 1: Example of contract synthesis in a CEGIS loop for ProcessQuery . A syn-
thesis oracle (SO) first synthesizes (P1, Q1). A verification oracle (VO) checks the
contract and returns a positive example. Note that the pre-state dsatisfies P1but the
post-state d′violates Q2. SO then refines the contract based on the example and gener-
ates(P2, Q2), which passes the VO check. req andresp are short for request and
response respectively.
5.5 Spuriousness Check
A trace is not spurious if all steps of the trace can be produced by the procedures. To
check this, for each procedure (V, C,L)we extract input/output pairs p={(d, d′)}i
from the trace, where dandd′are variable assignments to Vbefore and after the pro-
cedure call, respectively. For each (d, d′)∈p, we check whether {V=d}C{V̸=d′}
holds using a verification oracle. If it holds, the procedure is impossible to produce the
pair and thus the trace is spurious. The pair is then added as a negative example to ¯X
and is excluded through re-synthesis in the CEGIS loop. Otherwise, if all such triples
do not hold, meaning that it is possible for procedures to generate the trace, the trace is
a true counterexample. Note that inequality instead of equality is used in the triple to
account for non-deterministic procedures. Table 2 shows an example of a spuriousness
check performed on ProcessQuery against a pair of assignments.
Oracle Output
MC in CEGAR (···, d= (2,⊤), d′= (10,⊤),···)
VO in CEGAR pass /(req,resp ) :d= (2,⊤), d′= (2,⊤)
SO in CEGIS P2:⊤, Q2:(req < 3 || resp == ⊤) && \old(req) == req
VO in CEGIS pass /−
Table 2: Example of spuriousness check for ProcessQuery . Continuing from Ta-
ble 1, assume that a model checker (MC) checks the property using (P2, Q2)and re-
turns a trace containing dandd′. To check spuriousness, we check the Hoare-triple
{(req,resp ) =d}ProcessQuery {(req,resp )̸=d′}using VO, who returns
pass . The pair is thus impossible, making the trace spurious. We then return to the
CEGIS loop and re-synthesize a contract, which is proven valid by VO.
Page 12:
12 Chen et al.
5.6 Correctness Guarantee
We assume that results given by the verification oracle and the model checker are cor-
rect, and that translation of the intermediate language preserves semantic equivalence.
Provided these assumptions, our approach guarantees partial correctness , meaning that
the result is correct when it terminates. A brief explanation is provided below.
For soundness, notice that contracts used in the induced model must overapprox-
imate procedure behaviors because they are validated by the verification oracle. This
means that when the model checker reports pass , there does not exist a trace that is
producible by the procedures that violates the property, since all producible behaviors
have already been captured by the contracts.
For completeness, when the model checker reports violated with a trace, the
trace must violate the property. This trace must also be producible by the procedures
because each step is checked by the verification oracle.
6 P OLYVERImplementation
We implemented a tool called P OLYVERthat automates our approach. P OLYVERbuilds
on top of the UCLID5 model checker, which supports procedural verification. It sup-
ports the C and Rust programming languages. The infrastructure is shown in Figure 5.
The tool is modular in that it can be extended to support other languages and verifiers.
Users would only need to provide an oracle along with an oracle interface that parses
the oracle output. Below we suggest synthesis and verification oracles and describe
their interfaces. For the synthesis grammar ¯G, we use Boolean connectives, arithmetic
operations, and comparators.
6.1 Synthesis Oracles and Interface
Given a procedure, a set of positive Xand negative examples ¯X, and the synthesis
grammar ¯G, the synthesis oracle is expected to return a contract that is consistent with
the examples. Below we discuss two candidate synthesis oracles and their interfaces.
LLM-based Synthesizer To take advantage of the procedure body C, we developed
an LLM synthesizer that accounts for Cand applies chain-of-thought prompting to
generate contracts based on the LLM’s reasoning. The interface of the LLM synthesizer
consists of a prompt generator and a DSL parser.
The LLM is first asked to summarize the behavior of Cand then asked to generate
a contract in ¯Lbased on its summary. Inspired by [31], we converted ¯Gto a Python
DSL and used it as a building block to help the LLM generate syntactically correct
expressions. If an expression lies outside of ¯L, an error message returned by the parser
is provided to the LLM for re-prompting. The message provides insightful information
by locating the source of error, such as incorrect syntax or inconsistent types. If positive
and negative examples are provided, the LLM will be prompted to find a constraint that
distinguishes the two sets. Figure 7 shows an example of our prompt.
Page 13:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 13
PolyVer
(𝑃,𝑄)Verification OracleQuery GeneratorResultParserVerif. Oracle InterfacequeryP/NP + cexSynthesis OracleQuery GeneratorResult ParserSynth. Oracle Interfacequerysynth. resultModel CheckerModelBuilderSystemCex.ParserModel Checker Interface𝑀′system cex(𝑑,𝑑!)(𝑃,𝑄)
Fig. 5: Infrastructure of PolyVer. (P, Q)are contracts and (d, d′)is an input/output pair.
Note that this neural approach is not fully explored and is used to demonstrate the
applicability of LLM-based solutions. Existing synthesis approaches can be easily inte-
grated into the system as long as they generate contracts in ¯L.
Syntax Guided Synthesis (SyGuS)/Syntax Guided Modulo Oracles (SyMO) Solver
Another solution is to treat the problem as a Programming by Example (PBE) problem,
asserting that the contract should satisfy examples in Xand dissatisfy those in ¯X.
SyGuS [5] solvers can be used to solve PBE problems.
A similar approach is to encode as a SyMO [37] problem, a variant of the Sy-
GuS problem that allows constraints to be specified using black-box functions. Using a
language-specific verifier as the black-box function, the SyMO solver itself effectively
forms a CEGIS loop which directly solves Problem 2.
The interface of a SyGuS/SyMO solver uses ¯Gas the grammar and builds a Sy-
GuS/SyMO query that includes the examples as constraints. An SMT-LIB parser is
used to parse the output of the solver, which is then converted to a contract in ¯G.
6.2 Verification Oracles and Interface
We use CBMC and Kani as the verification oracles for C and Rust, respectively. The
interface of the verification oracle consists of a query generator and a result parser.
Given a contract (P, Q)and a procedure body Cin a language L, the query generator
generates {P}C{Q}. Concretely, the procedure body has stubs for the contract, and
the query generator replaces the stubs with the contract. An example is shown in List-
ing 1.1. Receiving results from the oracle, the result parser parses the result and variable
assignments before and after the procedure call if the triple does not hold.
Page 14:
14 Chen et al.
7 Case Study: Lingua Franca Program Verification
To demonstrate P OLYVER’s effectiveness, we developed a verifier for the Lingua Franca
coordination language and show evaluation results in Section 8. Lingua Franca (LF) is a
polyglot coordination language designed for real-time, concurrent, and distributed sys-
tems. Using LF, the user defines concurrent components called reactors and connects
them through ports and connections. Within a reactor, the user defines reactive message
handlers called reactions , which are written in a target programming language, such as
C, C++, Python, Rust, or TypeScript. A reaction is invoked when any of its triggers is
present, and when invoked may trigger downstream reactions based on its connections.
In this section, we give the LF program for the running example in Section 7.1, dis-
cuss the challenges of LF program verification in Section 7.2, and describe our solution
in the rest of the section.
// Rust reactorreactorTrain{inputresponse:booloutputrequest:i32statecount:i32= 0statepassed:bool= falselogicalactionwait(2minute)logicalactionpass(10minute)reaction Init(startup) ->request{=ctx.set(request, self.count);=}reaction Waited(wait) ->request{=self.count= self.count.saturating_add(1);ctx.set(request, self.count);=}reaction Passed (pass) ->request{=self.passed= true;=}reaction ProcessResponse(response) ->pass,wait{=if ctx.get(response).unwrap() {ctx.schedule(pass, 0);} else{ctx.schedule(wait, 0);}=}}// C reactorreactorUMS{inputrequest:intoutputresponse : boolreaction ProcessQuery(request) ->response{=if(request->value< 3) {lf_set(response, rand() % 2 == 1);} else{lf_set(response, true);}=}}// Property to be verified@property(name="pass_within_12_seconds", spec="F[0, 12 minute](t.passed)", expect=false)mainreactor{// Component instantiationu=newUMS()t=newTrain()// Component interactionst.request->u.requestu.response->t.response}
Fig. 6: LF program for a subway U-turn section management system from [17].
7.1 Running Example in Lingua Franca
Figure 6 shows the LF program for the running example, a subway U-turn manage-
ment system (UMS). The Train andUMS reactors are implemented in Rust and C
respectively. Reaction bodies are enclosed in special brackets {=...=} . The system
describes a train asking for permission to pass a railway crossing. The train sends out
Page 15:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 15
requests to the UMS, which decides whether to grant permission based on the number
of attempts the train has made, described in ProcessQuery . If the train has made
more than three attempts, the UMS will grant permission; otherwise, the UMS will ran-
domly decide whether to grant permission. If permission is granted, the train will pass
which takes ten minutes; otherwise, it waits for two minutes before making another
request, described in ProcessResponse . The property under verification is that the
train passes in 12 minutes, which is expected to be violated since the train may take as
long as 16 minutes to pass.
7.2 Where Existing Solutions Fall Short
As LF programs may consist of reactors in multiple languages, using a single language-
specific verifier, e.g., CBMC for C and Kani for Rust, would require compiling the
reactors into the supported language. This is clearly infeasible for languages with in-
compatible types and semantics.
Lin et al. present the LFVerifier [28], which performs end-to-end verification of LF
programs by translating an LF program with its embedded C code into a monolithic
axiomatic model that compiles to SMT. However, it only supports a small subset of the
C language. Expanding on the LFVerifier to support other languages or a comprehensive
fragment of C would be equivalent to developing encodings from these languages to
SMT. This is precisely the challenge we aim to avoid with P OLYVER. By leveraging
tools such as CBMC/Kani, we can already support large C/Rust fragments, for example,
ANSI-C in the case of CBMC.
7.3 Mapping LF onto EFSM in UCLID5 Template
LF is a deterministic model of computation due to its use of logical time [25], which
defines a unique sequence of reaction invocations over logical time steps given the same
inputs to the program. The determinism property enables a straightforward mapping
from an LF program to a polyglot model defined in Definition 4.
Similar to [28], an LF program is first mapped onto a State Space Diagram (SSD),
which is a graph where each node vicontains the current logical time, a set Riof reac-
tions, and a set of pending events, and each edge represents logical time advancement.
Figure 8 shows an example SSD generated from the LF program in Figure 6.
Once we obtain an SSD that describes the LF program behavior across logical time
steps, we automatically map the SSD to the proposed polyglot model using a custom
UCLID5 model generator. For an SSD, the process executes the following steps:
1. For each SSD node vi, generate a mode qi;
2. For each reaction, generate a UCLID5 procedure with empty pre/postconditions, to
be later filled in by P OLYVER;
3. For each edge in the SSD connecting node vito node vj, generate a guarded tran-
sition from mode qito mode qjwith guard g=V
t∈TisPresent (t)for each subset
Tof reaction triggers obtained from Ri. The associated update function uof the
transition is the sequence of reaction procedures triggered by T. Transitions are
concisely encoded using conditional statements in UCLID5.
Page 16:
16 Chen et al.
7.4 Generating Rust and C Templates for Contract Verification
To verify contracts for all procedures in the LF program, i.e., reactions written in Rust
and C, we implemented a code generator in the LF compiler to generate a standalone
C/Rust program for each procedure. Each program assumes the precondition, invokes
the procedure under verification, and asserts the postcondition. For example, the reac-
tion in UMS in Figure 6 results in the following C program, to be checked by CBMC.
In Listing 1.1, PRECONDITION andPOSTCONDITION are undefined C macros
set to a candidate pre/post condition pair passed in from the command line - therefore
this program is considered a “template”. Rust templates with a similar structure are
generated for Rust procedures, which are checked by the Kani model checker.
7.5 P OLYVER-based Verification of LF Programs
We integrate the P OLYVER-based verifier into the LF compiler, which code-generates
the C, Rust, and UCLID5 templates. Similar to [28], the compiler converts the system-
level property, specified by the user in Safety MTL [33], into first-order logic over
traces. A separate configuration JSON file is generated to record all variables in the
application and the representation of their types in each target language. Once the tem-
plates and the configuration file are generated, P OLYVERexecutes the tool flow de-
scribed in Figure 3 until it returns to the user a verdict of pass , orviolated with a
counterexample trace to the system-level property.
8 Experimental Evaluation
Our experimentation evaluates P OLYVERon both its components: contract synthesis
and verification using the synthesized contracts. Through our results, we would like to
answer the following research questions:
RQ1 How does P OLYVERcompare to previous work on verifying LF?
RQ2 Is P OLYVERable to handle full-fledged LF examples?
RQ3 Is P OLYVERable to verify multi-language LF systems?
Benchmark Setup. We run all experiments on a 3.7GHz Intel Core i9 machine with
62GB RAM. We use a single core for all benchmarks. We use an LLM-based syn-
thesizer (the o1-mini model) to generate contracts. For verification, we use CBMC-
6.3.1, Kani-0.56.0, and z3-4.8.7 as the backend SMT engine. Parallel synthesis is im-
plemented in P OLYVERbut not used in the experiments. Since LF reactions often
include unmodified variables that are important to identify strong/tight contracts, our
LLM prompt contains a single instruction that asks the LLM to pay extra attention to
unmodified state variables.
8.1 RQ1: How does P OLYVERcompare to previous work on verifying LF?
We compare with the LFVerifier [28] using the LF benchmarks proposed by [28], which
are arguably toy examples since the LFVerifier only supports a small fragment of C syn-
tax. This comparison is not entirely fair because [28] uses a monolithic, axiomatic en-
coding specialized for LF programs, while P OLYVERuses a compositional, operational
Page 17:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 17
LOC #Iters CPU Time (s)
Benchmark #Reaction C LF CEGIS CEGAR SOT VOT UT [28]
ADASModel 7 16 73 7 1 220.9 3.2 4.7 22.8
AircraftDoor 3 10 38 3 1 82.1 1.2 1.7 6.6
Alarm 2 12 30 2 1 62.3 0.9 2.0 5.4
CoopSchedule 2 2 37 2 1 54.5 0.8 2.5 21.4
Election 3 7 35 3 1 124.2 1.4 3.8 44.9
Election2 2 6 31 3 2 142.8 1.5 5.9 12.4
Elevator 13 53 206 13 1 413.1 33.4 265.8 1363.2
Factorial 3 11 36 4 1 138.2 1.9 5.2 34.6
Fibonacci 5 14 39 5 1 167.3 2.4 3.2 248.5
PingPong 4 8 62 4 1 131.0 1.8 3.0 13.6
Pipe 5 10 60 5 1 147.4 2.4 163.7 154.5
ProcessMsg 4 10 32 4 1 116.6 1.8 2.6 14.0
ProcessSync 1 2 18 1 1 35.2 0.4 1.5 5.0
Railroad 10 64 66 17 2 678.9 9.0 74.3 162.5
Ring 4 7 45 4 1 105.6 1.8 3.3 15.9
RoadsideUnit 8 26 81 8 1 259.0 3.8 3.2 74.7
SafeSend 4 12 53 4 1 124.3 1.8 1.9 6.5
Subway 10 20 64 10 1 266.5 4.5 11.6 30.3
Thermostat 6 29 45 6 1 192.6 2.6 2.3 13.6
TrafficLight 4 46 42 7 1 396.1 3.8 8.9 -
TrainDoor 3 7 50 3 1 74.5 1.2 1.8 5.7
UnsafeSend 4 12 53 4 1 122.1 1.8 2.7 6.5
Table 3: The synthesis and verification oracle times (SOT and VOT) are the total time
across all CEGIS iterations, while the verification oracle time (VOT) is the total time
spent on verification oracles, including contract validation and spuriousness checks.
The verification time (UT) is the total time spent on UCLID5, which is called once per
CEGAR iteration. The total runtime is SOT + VOT + UT.
Page 18:
18 Chen et al.
encoding that is more general. However, [28] is the only prior work to our knowledge
that performs end-to-end automated verification of LF programs, so we compare to it.
The proposed compositional approach requires P OLYVERto synthesize contracts
for all reactions and use them to verify the system-level property. In the benchmark eval-
uation, P OLYVERsuccessfully synthesizes the contracts and verifies allbenchmarks
automatically, while [28] is unable to verify the TrafficLight example.
We report on the synthesis iteration counts as well as runtimes in Table 3. In 18
out of the 22 benchmarks, the contracts were synthesized correctly in a single CEGIS
loop, which takes around 33 seconds on average. For the rest, we observe that all but
one reaction still requires one CEGIS iteration while one (hard) reaction requires all the
remainder of iterations. The hard reaction often includes complex control flow and data
dependencies, which makes it difficult even for human experts to reason about.
We record the synthesis oracle time (SOT), verification oracle time (VOT), and ver-
ification runtime (UT), and we note that the time to perform contract synthesis (SOT
+ VOT) far dominates the verification runtime (UT). Since [28] (and previous work
generally) does not perform contract synthesis for multi-language systems, we can-
not compare with prior work w.r.t. contract synthesis. However, given valid contracts
POLYVERis up to ∼77x faster than [28] w.r.t. the verification time alone (VT vs. [28]).
Qualitatively, P OLYVERimproves on [28] in a major way: There is no longer a need
to develop a translator from a target language to the modeling language (e.g., C to
UCLID5). Since we leverage robust language-specific verifiers (e.g., CBMC, Kani), we
are also able to support a much more comprehensive fragment of C/Rust. For example,
CBMC officially supports ANSI-C. In contrast, [28] develops a custom C-to-UCLID5
translator for a relatively small C subset (which is what these benchmarks use).
We also experimented with SyGuS/SyMO solvers but found that these approaches
required a large number of CEGAR iterations and failed to terminate in most bench-
marks. We attribute this to the solvers’ lack of access to procedure bodies, relying solely
on positive and negative examples for guidance.
8.2 RQ2: Is P OLYVERable to handle full-fledged LF examples?
The benchmarks in Section 8.1 only use the C subset supported by [28], while P OLYVER
(like CBMC) supports ANSI-C programs. Hence, we now evaluate P OLYVERon ex-
amples requiring a broader fragment of C that [28] is unable to verify.
Two examples from embedded systems (an LED controller and a hill climbing
robot) and a satellite attitude controller [8,27] are used to evaluate the performance
of our approach. The LED controller tracks whether the LED is blinking, and the hill
climbing robot checks if it is in climbing mode when it is on a ramp, detected by ac-
celerometers. These examples are resource-constrained embedded systems. When the
implementation of a library function is unavailable, we assign non-deterministic values
to input/output variables to over-approximate the function’s behaviors. The satellite ex-
ample uses a PID controller and ensures that motors are not activated when the user
input falls outside an acceptable range.
The results in Table 4 show that our approach is able to handle language complexi-
ties and effectively generate contracts for actual implementation code. In the LED and
hill climbing examples, the LLM-based synthesizer was able to summarize procedure
Page 19:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 19
behavior while ignoring details about low-level code when they are irrelevant. In the
satellite example, the LLM struggled to synthesize a valid contract for the PID con-
troller. However, we noticed that providing a simple comment in the code that captures
the logic of the PID controller greatly helped synthesis. The synthesized contracts can
all be validated by language-specific verifiers. This shows the potential applicability of
POLYVERto aid verification of real-world systems.
LOC #Iters CPU Time (s)
Benchmark #Reaction C/Rust LF CEGIS CEGAR SOT VOT UT
Blink 1 123 35 1 1 31.0 0.45 1.26
HillClimb 12 75 102 20 1 685.4 74.9 9.7
Satellite 6 424 166 9 2 729.0 32.2 7.0
Submit 7 43 66 7 1 276.7 4.8 9.4
TrainPass 5 14 44 5 1 160 8.8 14.1
Table 4: Results for full-fledged and multi-language examples. The Blink, HillClimb,
and Satellite examples include real implementation code in C. Submit, TrainPass, and
other 22 variants of the benchmarks in Table 3 are multi-language systems that can only
be solved by P OLYVER.
8.3 RQ3: Is P OLYVERable to verify multi-language LF systems?
The third set of benchmarks is the same as the first set, except that half of the C reactions
are replaced with Rust reactions. On this set, our approach performed similarly to the
first set. The LLM-based synthesizer is able to reason about Rust code as well as library
functions in both C and Rust. The example shown (TrainPass) in Figure 6 was verified
with one CEGIS iteration per reaction and one CEGAR iteration. Note that this set of
benchmarks is not solvable by previous works as they involve multiple languages. This
demonstrates the flexibility of our approach in supporting different languages.
9 Related Work
The majority of approaches to verifying systems which involve multiple implementa-
tion languages rely on manually translating some parts of the system until everything
is written in one common language. For instance, Cook et al. [10] translate fragments
of assembly code into C in order to achieve verification of the Xen hypervisor. Where
it is common to combine small parts of one language with another, tools may provide
automated translation for a fragment of that language, for example, prior work [28]
verifying programs in Lingua Franca is capable of automatically translating a fragment
of C into the verification language, and tools like CBMC [23,22] support small frag-
ments of assembly code. Another approach to this problem is to treat the fragments of
Page 20:
20 Chen et al.
code that cannot be analyzed as black boxes and generate summaries for them based on
observed input-output behavior of the code [34]. This is similar to our approach but re-
lies on generating a sufficiently comprehensive set of input-output examples, and lacks
guarantees that fully integrating the language-specific verifiers provides.
We employ a style of compositional verification based on Hoare logic [19]. This is
commonly applied to verification problems that, whilst written in only one language,
are too large to be tackled in a single verification query [7]. To automate this, previ-
ous work, e.g.,[12,32,16,39], has used techniques such as abstract interpretation and
Counter-Example Guided Abstraction Refinement [9] to learn pre/post-conditions.
There are also property-guided approaches to inferring contracts, while we aim to
discover a contract that is just good enough to prove a given property. Here, previ-
ous work uses syntactic patterns [14], Constrained Horn Clause solvers [4], and even
CEGIS [40,3]. Recently, LLMs have been applied to learning specifications, including
pre/post-conditions, for code [44]. The work closest to ours uses CEGIS and LLMs to
infer post-conditions, but neither of these approaches addresses the challenge of multi-
language verification.
Our approach to learning contracts is based on an integration of CEGIS and CE-
GAR. This integration can be considered a form of oracle-guided synthesis [21,37],
with the verification oracle and model checker treated as oracles that guide the synthe-
sis search. In the hardware domain, prior work [47] uses synthesis to generate environ-
ment invariants for modular hardware verification. These invariants can be thought of
as pre-conditions in our context.
10 Conclusions
In conclusion, we formulate the polyglot verification problem and propose a contract-
based approach that uses language-specific verifiers and a model checker to perform
verification. We implemented our approach in a tool called P OLYVERthat automati-
cally iterates between CEGAR and CEGIS loops to synthesize contracts using an LLM-
based synthesizer. Results show that P OLYVER, when given valid contracts, is able to
verify systems with faster runtime compared to previous work, and is able to verify
full-fledged and multi-language Lingua Franca systems while previous work could not.
For future work, we aim to enhance synthesis by exploring how incorporating sys-
tem and property context into the synthesis oracle can enable property-directed contract
generation. Symbolic techniques, such as program slicing, may help synthesizers focus
on relevant procedure segments, leading to more precise abstractions. On the verifi-
cation side, we plan to develop methods for identifying which contracts contribute to
errors when a system-level counterexample is encountered.
Acknowledgements. This work was supported in part by the DARPA Provably Cor-
rect Design of Adaptive Hybrid Neuro-Symbolic Cyber Physical Systems (ANSR)
program award number FA8750-23-C-0080; the National Science Foundation (NSF),
award #CNS-2233769 (Consistency vs. Availability in Cyber-Physical Systems); Nis-
san and Toyota under the iCyPhy Center, Intel, Berkeley Deep Drive, and the RDI
Frontier Fellows Program.
Page 21:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 21
References
1. GitHub - eclipse-cyclonedds/cyclonedds: Eclipse Cyclone DDS project, https://
github.com/eclipse-cyclonedds/cyclonedds
2. Alagar, V .S., Periyasamy, K.: Extended Finite State Machine, pp. 105–128. Springer London,
London (2011). https://doi.org/10.1007/978-0-85729-277-3_7 ,https:
//doi.org/10.1007/978-0-85729-277-3_7
3. Albarghouthi, A., Dillig, I., Gurfinkel, A.: Maximal specification synthesis. In: POPL. pp.
789–801. ACM (2016)
4. Alshnakat, A., Gurov, D., Lidström, C., Rümmer, P.: Constraint-based contract inference for
deductive verification. In: 20 Years of KeY , Lecture Notes in Computer Science, vol. 12345,
pp. 149–176. Springer (2020)
5. Alur, R., Bodik, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh,
R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: 2013 Formal
Methods in Computer-Aided Design. pp. 1–8 (2013). https://doi.org/10.1109/
FMCAD.2013.6679385
6. Barrett, C., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo theories. In: Biere,
A., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability, chap. 26, pp. 825–885. IOS
Press (2009)
7. Beyer, D., Kettl, M., Lemberger, T.: Decomposing software verification using distributed
summary synthesis. Proc. ACM Softw. Eng. 1(FSE), 1307–1329 (2024)
8. Cardoso, J., Chanel, C., Chauvin, P., Hostallier, A., Lamaison, J., Mascarenas-Gonzales, A.,
Metral, E., Alloza, L.: Lab on the real-time control of reaction wheel (2022), unpublished
lab exercise for the course 1MAE803: Real-time Control of Aerospace Systems, M.Sc in
Aerospace Engineering, ISAE-SUPAERO
9. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y ., Veith, H.: Counterexample-guided abstraction
refinement. In: CA V. Lecture Notes in Computer Science, vol. 1855, pp. 154–169. Springer
(2000)
10. Cook, B., Döbel, B., Kroening, D., Manthey, N., Pohlack, M., Polgreen, E., Tautschnig, M.,
Wieczorkiewicz, P.: Using model checking tools to triage the severity of security bugs in the
xen hypervisor. In: FMCAD. pp. 185–193. IEEE (2020)
11. Cordeiro, L., Kesseli, P., Kroening, D., Schrammel, P., Trtik, M.: Jbmc: A bounded model
checking tool for verifying java bytecode. In: International Conference on Computer Aided
Verification. pp. 183–190. Springer (2018)
12. Cousot, P., Cousot, R., Fähndrich, M., Logozzo, F.: Automatic inference of necessary precon-
ditions. In: VMCAI. Lecture Notes in Computer Science, vol. 7737, pp. 128–148. Springer
(2013)
13. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V ., Signoles, J., Yakobowski, B.: Frama-c:
A software analysis perspective. In: International conference on software engineering and
formal methods. pp. 233–247. Springer (2012)
14. Denney, E., Fischer, B.: A generic annotation inference algorithm for the safety certification
of automatically generated code. In: GPCE. pp. 121–130. ACM (2006)
15. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.: Extended
static checking for java. In: Proceedings of the ACM SIGPLAN 2002 Conference on Pro-
gramming language design and implementation (PLDI). pp. 234–245 (2002)
16. Gordon, M., Collavizza, H.: Forward with hoare. In: Reflections on the Work of C. A. R.
Hoare, pp. 101–121. Springer (2010)
17. Halbwachs, N., Lagnier, F., Ratel, C.: Programming and verifying real-time systems by
means of the synchronous data-flow language lustre. IEEE transactions on software engi-
neering 18(9), 785–793 (1992)
Page 22:
22 Chen et al.
18. Havelund, K., Pressburger, T.: Model checking java programs using java pathfinder. Interna-
tional Journal on Software Tools for Technology Transfer 2, 366–381 (2000)
19. Hoare, C.A.R.: An axiomatic basis for computer programming. Commun. ACM 12(10),
576–580 (1969)
20. Indrasiri, K., Kuruppu, D.: gRPC: up and running: building cloud native applications with
Go and Java for Docker and Kubernetes. O’Reilly Media (2020)
21. Jha, S., Seshia, S.A.: A theory of formal synthesis via inductive learning. Acta Informatica
54(7), 693–726 (2017)
22. Kroening, D., Schrammel, P., Tautschnig, M.: CBMC: the C bounded model checker. CoRR
abs/2302.02384 (2023)
23. Kroening, D., Tautschnig, M.: CBMC - C bounded model checker - (competition contri-
bution). In: Ábrahám, E., Havelund, K. (eds.) Tools and Algorithms for the Construction
and Analysis of Systems - 20th International Conference, TACAS 2014, Held as Part of the
European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble,
France, April 5-13, 2014. Proceedings. Lecture Notes in Computer Science, vol. 8413, pp.
389–391. Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_
26,https://doi.org/10.1007/978-3-642-54862-8_26
24. Kroening, D., Tautschnig, M.: Cbmc–c bounded model checker: (competition contribution).
In: Tools and Algorithms for the Construction and Analysis of Systems: 20th International
Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and
Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings 20. pp.
389–391. Springer (2014)
25. Lamport, L.: Time, clocks, and the ordering of events in a distributed system, p. 179–196.
Association for Computing Machinery, New York, NY , USA (2019), https://doi.org/
10.1145/3335772.3335934
26. Lattuada, A., Hance, T., Cho, C., Brun, M., Subasinghe, I., Zhou, Y ., Howell, J., Parno, B.,
Hawblitzel, C.: Verus: Verifying rust programs using linear ghost types. Proceedings of the
ACM on Programming Languages 7(OOPSLA1), 286–315 (2023)
27. Lin, S., Jellum, E., Theile, M., Tanneberger, T., Sun, B., Jerad, C., Xu, R., Feng, G., Menard,
C., Lohstroh, M., Castrillon, J., Seshia, S., Lee, E.: Pretvm: Predictable, efficient virtual
machine for real-time concurrency (2024), https://arxiv.org/abs/2406.06253
28. Lin, S., Manerkar, Y .A., Lohstroh, M., Polgreen, E., Yu, S.J., Jerad, C., Lee, E.A., Seshia,
S.A.: Towards building verifiable cps using lingua franca. ACM Trans. Embed. Comput.
Syst. 22(5s) (Sep 2023). https://doi.org/10.1145/3609134 ,https://doi.
org/10.1145/3609134
29. Lohstroh, M., Menard, C., Bateni, S., Lee, E.A.: Toward a lingua franca for determinis-
tic concurrent systems. ACM Transactions on Embedded Computing Systems 20(4) (May
2021). https://doi.org/10.1145/3448128 ,https://doi.org/10.1145/
3448128
30. Macenski, S., Foote, T., Gerkey, B., Lalancette, C., Woodall, W.: Robot operating sys-
tem 2: Design, architecture, and uses in the wild. Science Robotics 7(66), eabm6074
(2022). https://doi.org/10.1126/scirobotics.abm6074 ,https://www.
science.org/doi/abs/10.1126/scirobotics.abm6074
31. Mora, F., Wong, J., Lepe, H., Bhatia, S., Elmaaroufi, K., Varghese, G., Gonzalez, J.E., Pol-
green, E., Seshia, S.A.: Synthetic programming elicitation for text-to-code in very low-
resource programming and formal languages. In: The Thirty-eighth Annual Conference
on Neural Information Processing Systems (NeurIPS) (2024), https://openreview.
net/forum?id=kQPzFiwVIu
32. Moy, Y .: Sufficient preconditions for modular assertion checking. In: VMCAI. Lecture Notes
in Computer Science, vol. 4905, pp. 188–202. Springer (2008)
Page 23:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 23
33. Ouaknine, J., Worrell, J.: Safety metric temporal logic is fully decidable. In: International
conference on tools and algorithms for the construction and analysis of systems. pp. 411–
425. Springer (2006)
34. Penninckx, W., Jacobs, B., Piessens, F.: Sound, modular and compositional verification of the
input/output behavior of programs. In: ESOP. Lecture Notes in Computer Science, vol. 9032,
pp. 158–182. Springer (2015)
35. Pereira, M., Ravara, A.: Cameleer: A deductive verification tool for ocaml. In: Computer
Aided Verification: 33rd International Conference, CA V 2021, Virtual Event, July 20–23,
2021, Proceedings, Part II. p. 677–689. Springer-Verlag, Berlin, Heidelberg (2021). https:
//doi.org/10.1007/978-3-030-81688-9_31
36. Polgreen, E., Cheang, K., Gaddamadugu, P., Godbole, A., Laeufer, K., Lin, S., Manerkar,
Y .A., Mora, F., Seshia, S.A.: UCLID5: multi-modal formal modeling, verification, and syn-
thesis. In: 34th International Conference on Computer Aided Verification (CA V). Lecture
Notes in Computer Science, vol. 13371, pp. 538–551. Springer (2022)
37. Polgreen, E., Reynolds, A., Seshia, S.A.: Satisfiability and synthesis modulo oracles. In:
VMCAI. Lecture Notes in Computer Science, vol. 13182, pp. 263–284. Springer (2022)
38. Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, J., Wheeler, R., Ng, A.Y .,
et al.: ROS: An open-source robot operating system. In: ICRA workshop on open source
software. vol. 3, p. 5 (2009)
39. Seghir, M.N., Kroening, D.: Counterexample-guided precondition inference. In: ESOP. Lec-
ture Notes in Computer Science, vol. 7792, pp. 451–471. Springer (2013)
40. Solar-Lezama, A., Tancau, L., Bodík, R., Seshia, S.A., Saraswat, V .A.: Combinatorial sketch-
ing for finite programs. In: ASPLOS. pp. 404–415. ACM (2006)
41. VanHattum, A., Schwartz-Narbonne, D., Chong, N., Sampson, A.: Verifying dynamic trait
objects in rust. In: Proceedings of the 44th International Conference on Software Engi-
neering: Software Engineering in Practice. p. 321–330. ICSE-SEIP ’22, Association for
Computing Machinery, New York, NY , USA (2022). https://doi.org/10.1145/
3510457.3513031 ,https://doi.org/10.1145/3510457.3513031
42. VanHattum, A., Schwartz-Narbonne, D., Chong, N., Sampson, A.: Verifying dynamic trait
objects in rust. In: Proceedings of the 44th International Conference on Software Engi-
neering: Software Engineering in Practice. p. 321–330. ICSE-SEIP ’22, Association for
Computing Machinery, New York, NY , USA (2022). https://doi.org/10.1145/
3510457.3513031 ,https://doi.org/10.1145/3510457.3513031
43. Viswanathan, M., Viswanathan, R.: Foundations for Circular Compositional Reasoning. In:
Proceedings of the 28th International Colloquium on Automata, Languages and Program-
ming,. pp. 835–847. ICALP ’01, Springer-Verlag, Berlin, Heidelberg (2001)
44. Wen, C., Cao, J., Su, J., Xu, Z., Qin, S., He, M., Li, H., Cheung, S., Tian, C.: Enchanting
program specification synthesis by large language models using static analysis and program
verification. In: CA V (2). Lecture Notes in Computer Science, vol. 14682, pp. 302–328.
Springer (2024)
45. Yang, H., Nong, Y ., Wang, S., Cai, H.: Multi-language software development: Issues, chal-
lenges, and solutions. IEEE Transactions on Software Engineering 50(3), 512–533 (2024).
https://doi.org/10.1109/TSE.2024.3358258
46. Yang, H., Nong, Y ., Zhang, T., Luo, X., Cai, H.: Learning to detect and localize multilingual
bugs. Proceedings of the ACM on Software Engineering 1(FSE), 2190–2213 (2024)
47. Zhang, H., Yang, W., Fedyukovich, G., Gupta, A., Malik, S.: Synthesizing environment in-
variants for modular hardware verification. In: VMCAI. Lecture Notes in Computer Science,
vol. 11990, pp. 202–225. Springer (2020)
Page 24:
24 Chen et al.
A Appendix
Algorithm 1 PolyVer: CEGIS/CEGAR-hybrid approach
Require: M= (Q, X, I 0, E, G, U, T ): a polyglot model, ϕ: a property
Ensure: res: verification result, cex: a sequence of assignments to X
1:for all ei∈Edo
2: Xi,¯Xi← ∅,∅
3:end for
4:while⊤do
5: C ← { synthesizeContract (ei,Xi,¯Xi)|ei∈E} ▷CEGIS loop
6: M′←induce (M,C)
7: (res,cex)←check (M′, ϕ)
8: ifres=pass then
9: return pass , -
10: else
11: isSpurious ← ⊥
12: for all ei∈Edo
13: xi,¯xi←checkSpurious (cex, ei)
14: if¯xi̸=∅then
15: isSpurious ← ⊤
16: end if
17: Xi←Xi∪xi
18: ¯Xi←¯Xi∪¯xi
19: end for
20: if¬isSpurious then
21: return violated ,cex
22: end if
23: end if
24:end while
1// ... Definitions of structs and LF APIs ...
2void UMS_reaction_1() {
3 if (query->value < 3) {
4 lf_set(response, rand() % 2 == 1);
5 } else {
6 lf_set(response, true);
7 }
8}
9int main() {
10 // calloc for inputs and the self struct.
11 query = calloc(1, sizeof(UMS_query_t));
12 response = calloc(1, sizeof(UMS_response_t));
13 // Assume that there are no NULL pointers.
14 __CPROVER_assume(
15 query && response
16 );
Page 25:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 25
Algorithm 2 synthesizeContract
Require: e= (f, V, C, L): an external procedure, Xi: a set of possible examples, ¯Xi: a set of
negative examples
Ensure: (P, Q): a valid contract for e
1:while⊤do
2: (P, Q)←synthesisOracle (V, C,Xi,¯Xi,¯G)
3: res,ex←verificationOracle (J{P}C{Q}KL)
4: ifres=pass then
5: return (P, Q)
6: else
7: Xi←Xi∪ {ex}
8: end if
9:end while
Algorithm 3 checkSpurious
Require: cex: a sequence of assignments to X,e= (f, V, C, L): an external procedure
Ensure: xi: a set of valid behaviors of e,¯xi: a set of invalid behaviors of e
1:p←collectPrestatePoststatePairsFromTrace (cex)
2:for all p= (d, d′)∈pdo
3: res ,_←verificationOracle (J{V=d}C{V̸=d′}KL)
4: ifres=pass then ▷cex is spurious
5: res,ex←verificationOracle (J{V=d}C{V=d′}KL)
6: xi←xi∪ {ex}
7: ¯xi←¯xi∪ {(d, d′)}
8: end if
9:end for
10:return xi,¯xi
Your are a logician and your task is to create two Boolean expressions, a precondition and a postcondition, for a given function named ‘ProcessQuery'. Instructions below.-The precondition should depend solely on function inputs.-The postcondition can depend on both inputs and outputs of the function.-Only use constants that are defined in the code.-Express the pre/postconditions using the following set of Python API.# Addition operationdef Add(a, b): return a + b…First describe in words what the function `UMS_reaction_1`1. may assume before the function is executed, and2. guarantees after the function is executed. Also,3. think about constraints to add to the postcondition to satisfy (resp. eliminate) positive (resp. negative) examples, if provided.Output an itemized list of the above points and nothing else.The function has the following inputs and outputs:Inputs: ‘request'Outputs: 'response'Code below:…Based on your analysis, complete the following two Python functions PRECONDITION and POSTCONDITION that returns pre/postconditions.Requirements:-Ignore null and None inputs and outputs.-Do not write comments and explanations.-The function body should be a single return statement. -Do not use intermediate variables.…Use the provided Python API to synthesize the pre/postconditions.Complete the code below and put it in a Python code block.```pythondef PRECONDITION(query):return # TODO: Fill in using the provided APIdef POSTCONDITION(query, response):return # TODO: Fill in using the provided API```Below is an example of using the API to generate pre/postconditions.```pythondef PRECONDITION(in):return TrueExpr()def POSTCONDITION(in, out):return And(Eq(Select(out, 'count'), Select(in, 'count')),# Use TrueExpr() instead of TrueEq(Select(out, 'is_present), TrueExpr()),)```
Fig. 7: Prompt used in the LLM-based synthesizer.
Page 26:
26 Chen et al.
S0
(0, 0)
Reactions invoked:
t.reaction_4
t.reaction_1
u.reaction_1
Pending events:
(t.wait, (2 minute, 0))
(t.pass, (10 minute, 0))S1
(2 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_2
u.reaction_1
Pending events:
(t.wait, (4 minute, 0))
(t.pass, (10 minute, 0))
(t.pass, (12 minute, 0))+2 minuteS2
(4 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_2
u.reaction_1
Pending events:
(t.wait, (6 minute, 0))
(t.pass, (12 minute, 0))
(t.pass, (10 minute, 0))
(t.pass, (14 minute, 0))+2 minuteS3
(6 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_2
u.reaction_1
Pending events:
(t.wait, (8 minute, 0))
(t.pass, (10 minute, 0))
(t.pass, (14 minute, 0))
(t.pass, (12 minute, 0))
(t.pass, (16 minute, 0))+2 minute
(a) The initialization phase of SSD (including S4)
S4
(8 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_2
u.reaction_1
Pending events:
(t.pass, (10 minute, 0))
(t.wait, (10 minute, 0))
(t.pass, (14 minute, 0))
(t.pass, (16 minute, 0))
(t.pass, (12 minute, 0))
(t.pass, (18 minute, 0))S5
(10 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_3
t.reaction_2
u.reaction_1
Pending events:
(t.pass, (12 minute, 0))
(t.wait, (12 minute, 0))
(t.pass, (14 minute, 0))
(t.pass, (18 minute, 0))
(t.pass, (16 minute, 0))
(t.pass, (20 minute, 0))+2 minuteS6
(12 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_3
t.reaction_2
u.reaction_1
Pending events:
(t.pass, (14 minute, 0))
(t.wait, (14 minute, 0))
(t.pass, (20 minute, 0))
(t.pass, (18 minute, 0))
(t.pass, (16 minute, 0))
(t.pass, (22 minute, 0))
+2 minuteS7
(14 minute, 0)
Reactions invoked:
t.reaction_4
t.reaction_3
t.reaction_2
u.reaction_1
Pending events:
(t.pass, (16 minute, 0))
(t.wait, (16 minute, 0))
(t.pass, (20 minute, 0))
(t.pass, (22 minute, 0))
(t.pass, (18 minute, 0))
(t.pass, (24 minute, 0))+2 minute
+2 minute -6 minute
(b) The periodic phase of SSD (excluding S4))
Fig. 8: The State Space Diagram (SSD) of the LF program in Figure 6. The SSD is a
lasso, with an initialization phase in Figure 8a and a periodic phase in Figure 8b.
Page 27:
POLYVER: A Compositional Approach for Polyglot System Modeling and Verification 27
17 // Initialize input ports, input logical actions, and self
structs with nondeterministic values.
18 *query = nondet_UMS_query_t();
19 // CBMC checks pre/post-conditions.
20 __CPROVER_assume(PRECONDITION);
21 UMS_reaction_1();
22 assert(POSTCONDITION);
23}
Listing 1.1: C file generated for CBMC