List of publications

SMT 2025A Conjecture Regarding SMT InstabilityCan Cebeci, Nikolaj Bjørner, George Candea, Clément Pit-Claudel.
23rd International Workshop on Satisfiability Modulo Theories (SMT 2025).Automated verifiers rely on SMT solvers as a backend for proof automation. This simplifies verification in the common case, but also creates usability challenges due to instability. Previous work proposes mitigation strategies motivated by the assumption that instability is a fundamental theoretical limitation. We conjecture that the instability experienced by verifiers today is often caused by fixable engineering problems, and is thus not fundamental. Should this be true, the more consequential approach to addressing instability would be to identify root causes and connect them with solver improvements.
We conducted case studies on 11 deemed-unstable queries from existing verification projects and from issues reported on the Z3 repository, with the goal of diagnosing the root causes of instability. For all of them, instability is attributable to solver bugs, misconfigurations, or misaligned expectations. We present our analysis and draw conclusions regarding better instability metrics, interfaces that make instability explicit, systematic debugging methods, and easier-to-debug SMT solvers.

CAV 2025Verified and Optimized Implementation of Orthologic Proof SearchSimon Guilloud, Clément Pit-Claudel.
Proceedings of the 37th International Conference on Computer Aided Verification (CAV 2025).We report on the development of an optimized and verified decision procedure for orthologic equalities and inequalities. This decision procedure is quadratic-time and is used as a sound, efficient and predictable approximation to classical propositional logic in automated reasoning tools. We formalize, in the Coq proof assistant, a proof system in sequent-calculus style for orthologic. We then prove its soundness and completeness with respect to the algebraic variety of ortholattices, and we formalize a cut-elimination theorem. In doing so, we discover and fix a missing case in a previously published proof.
We then implement and verify a complete proof search procedure for orthologic. A naive implementation is exponential; to obtain an optimal quadratic run time, we optimize the implementation by memoizing its results and simulating reference equality testing. We leverage the resulting correctness theorem to implement a reflective Coq tactic. We present benchmarks showing that the procedure, under various optimizations, matches its theoretical complexity.
Finally, we develop a collection of tactics, including normalization with respect to orthologic and a boolean solver, which we also benchmark. We make tactics available as a standalone Coq plugin.

ICSE 2025Formally Verified Cloud-Scale AuthorizationAleks Chakarov, Jaco Geldenhuys, Matthew Heck, Michael Hicks, Sam Huang, Georges-Axel Jaloyan, Anjali Joshi, K. Rustan M. Leino, Mikael Mayer, Sean McLaughlin, Akhilesh Mritunjai, Clement Pit-Claudel, Sorawee Porncharoenwase, Florian Rabe, Marianna Rapoport, Giles Reger, Cody Roux, Neha Rungta, Robin Salkeld, Matthias Schlaipfer, Daniel Schoepe, Johanna Schwartzentruber, Serdar Tasiran, Aaron Tomb, Emina Torlak, Jean-Baptiste Tristan, Lucas Wagner, Michael W. Whalen, Remy Willems, Tongtong Xiang, Tae Joon Byun, Joshua Cohen, Ruijie Fang, Junyoung Jang, Jakob Rath, Hira Taqdees Syeda, Dominik Wagner, Yongwei Yuan.
Proceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE 2025).
Distinguished paper.All critical systems must evolve to meet the needs of a growing and diversifying user base. But supporting that evolution is challenging at increasing scale: Maintainers must find a way to ensure that each change does only what is intended, and will not inadvertently change behavior for existing users. This paper presents how we addressed this challenge for the Amazon Web Services (AWS) authorization engine, invoked 1 billion times per second, by using formal verification. Over a period of four years, we built a new authorization engine, one that behaves functionally the same as its predecessor, using the verification-aware programming language Dafny. We can now confidently deploy enhancements and optimizations while maintaining the highest assurance of both correctness and backward compatibility. We deployed the new engine in 2024 without incident and customers immediately enjoyed a threefold performance improvement. The methodology we followed to build this new engine was not an off-the-shelf application of an existing verification tool, and this paper presents several key insights: First, rather than prove correct the existing engine, written in Java, we found it more effective to write a new engine in Dafny, a language built for verification from the ground up, and then compile the result to Java. Second, to ensure performance, debuggability, and to gain trust from stakeholders, we needed to generate readable, idiomatic Java code, essentially a transliteration of the source Dafny. And third, to ensure that the specification matches the system's actual behavior, we performed extensive differential and shadow testing throughout the development process, ultimately comparing against one quadrillion of production samples prior to deployment. Our approach demonstrates how formal verification can be effectively applied to evolve critical legacy software at scale.

SOSP 2024Practical Verification of System-Software Components Written in Standard CCan Cebeci, Yonghao Zou, Diyu Zhou, George Candea, Clément Pit-Claudel.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles (SOSP 2024).Systems code is challenging to verify, because it uses constructs (like raw pointers, pointer arithmetic, and bit twiddling) that are hard for tools to reason about. Existing approaches either sacrifice programmer friendliness, by demanding significant manual effort and verification expertise, or generality, by restricting the programming language or requiring that the code adapt to the verification tool.
We identify a new point in the design space of verifiers that enables a different set of trade-offs, which prove practical in the context of verifying critical system-components. We use several novel techniques to develop the TPot verification framework, targeted specifically at systems code written in C. With TPot, developers verify critical components they implement in standard, unrestricted C, using a C-based language to write “proof-oriented tests” (POTs) that specify the desired behavior. TPot then runs the POTs to prove correctness. We evaluate TPot on 6 different systems-code bases, and show that it can verify them at the same level as 4 state-of-the-art verifiers, while reducing the annotation burden by a factor of 1.36-3.46x. TPot does not require these code bases to be adapted for verification and achieves verification times compatible with typical continuous-integration workflows.

CCS 2024Specification and Verification of Strong Timing Isolation of Hardware EnclavesStella Lau, Thomas Bourgeat, Clément Pit-Claudel, Adam Chlipala.
Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (CCS 2024).
Distinguished artifact.The process isolation enforceable by commodity hardware and operating systems is too weak to protect secrets from malicious code running on the same machine: Spectre-era attacks exploit timing side channels derived from contention on shared microarchitectural resources to extract secrets. With appropriate hardware support, however, we can construct isolated enclaves and safeguard independent processes from interference through timing side channels, a necessary step towards integrity and confidentiality guarantees.
In this paper, we describe our work on formally specifying and verifying that a synthesizable hardware architecture implements strong timing isolation for enclaves. We reason about the cycle-accurate semantics of circuits with respect to a trustworthy formulation of strong isolation based on “air-gapped machines” and develop a modular proof strategy that sidesteps the need to prove functional correctness of processors. We apply our method on a synthesizable, multicore, pipelined RISC-V design formalized in Coq.

OOPSLA 2024Gradient: Gradual Compartmentalization via Object Capabilities Tracked in TypesAleksander Boruch-Gruszecki, Adrien Ghosn, Mathias Payer, Clément Pit-Claudel.
Proceedings of the ACM on Programming Languages (OOPSLA 2024).Modern software needs fine-grained compartmentalization, i.e., intra-process isolation. A particularly important reason for it are supply-chain attacks, the need for which is aggravated by modern applications depending on hundreds or even thousands of libraries. Object capabilities (ocaps) are a particularly salient approach to compartmentalization, but they require the entire program to assume a lack of ambient authority. Most of existing code was written under no such assumption; effectively, existing applications need to undergo a rewrite-the-world migration to reap the advantages of ocap. We propose gradual compartmentalization, an approach which allows gradually migrating an application to object capabilities, component by component in arbitrary order, all the while continuously enjoying security guarantees. The approach relies on runtime authority enforcement and tracking the authority of objects the type system. We present Gradient, a proof-of-concept gradual compartmentalization extension to Scala which uses Enclosures and Capture Tracking as its key components. We evaluate our proposal by migrating the standard XML library of Scala to Gradient.

ICFP 2024A Coq Mechanization of JavaScript Regular Expression SemanticsNoé De Santo, Aurèle Barrière, Clément Pit-Claudel.
Proceedings of the ACM on Programming Languages (ICFP 2024).We present an executable, proven-safe, faithful, and future-proof Coq mechanization of JavaScript regular expression (regex) matching, as specified by the last published edition of ECMA-262 section 22.2. This is, to our knowledge, the first time that an industrial-strength regex language has been faithfully mechanized in an interactive theorem prover. We highlight interesting challenges that arose in the process (including issues of encoding, corner cases, and executability), and we document the steps that we took to ensure that the result is straightforwardly auditable and that our understanding of the spec aligns with existing implementations.
We demonstrate the usability and versatility of the mechanization through a broad collection of analyses, case studies, and experiments: we prove that JavaScript regex matching always terminates and is safe (no assertion failures); we identify subtle corner cases that led to mistakes in previous publications; we verify an optimization extracted from a state-of-the-art regex engine; we show that some classic properties described in automata textbooks and used in derivatives-based matchers do not hold in JavaScript regexes; and we demonstrate that the cost of updating the mechanization to account for changes in the original specification is reasonably low.
Our mechanization can be extracted to OCaml and linked with Unicode libraries to produce an executable engine that passes the relevant parts of the official Test262 conformance test suite.

PLDI 2024Linear Matching of JavaScript Regular ExpressionsAurèle Barrière, Clément Pit-Claudel.
Proceedings of the ACM on Programming Languages (PLDI 2024).Modern regex languages have strayed far from well-understood traditional regular expressions: they include features that fundamentally transform the matching problem. In exchange for these features, modern regex engines at times suffer from exponential complexity blowups, a frequent source of denial-of-service vulnerabilities in JavaScript applications. Worse, regex semantics differ across languages, and the impact of these divergences on algorithmic design and worst-case matching complexity has seldom been investigated.
This paper provides a novel perspective on JavaScript's regex semantics by identifying a larger-than-previously-understood subset of the language that can be matched with linear time guarantees. In the process, we discover several cases where state-of-the-art algorithms were either wrong (semantically incorrect), inefficient (suffering from superlinear complexity) or excessively restrictive (assuming certain features could not be matched linearly). We introduce novel algorithms to restore correctness and linear complexity. We further advance the state-of-the-art in linear regex matching by presenting the first nonbacktracking algorithms for matching lookarounds in linear time: one supporting captureless lookbehinds in any regex language, and another leveraging a JavaScript property to support unrestricted lookaheads and lookbehinds. Finally, we describe new time and space complexity tradeoffs for regex engines. All of our algorithms are practical: we validated them in a prototype implementation, and some have also been merged in the V8 JavaScript implementation used in Chrome and Node.js.

PLDI 2024Foundational Integration Verification of a Cryptographic ServerAndres Erbsen, Jade Philipoom, Dustin Jamner, Ashley Lin, Samuel Gruetter, Clément Pit-Claudel, Adam Chlipala.
Proceedings of the ACM on Programming Languages (PLDI 2024).We present verification of a bare-metal server built using diverse implementation techniques and languages against a whole-system input-output specification in terms of machine code, network packets, and mathematical specifications of elliptic-curve cryptography. We used very different formal-reasoning techniques throughout the stack, ranging from computer algebra, symbolic execution, and verification-condition generation to interactive verification of a compiler for a C-like language. All these component specifications and domain-specific reasoning techniques are defined and justified against common foundations in the Coq proof assistant. Common between these components is a minimalistic specification style based on functional programs operating on simple objects, omnisemantics for program execution, and assertions in simple separation logic. This design enables us to bring the components together in a top-level correctness theorem that can be audited without understanding or trusting the internal interfaces and tools. Our case study is a simple cryptographic server for flipping of a bit of state through authenticated network messages, and the proof about it shows total functional correctness including static bounds about memory usage. We believe it is the first formal verification covering a full software stack that models both network input and output. This paper also describes our experiences with the specific verification tools we build upon, along with detailed analysis of reasons behind the widely varying levels of productivity we experienced between combinations of tools and tasks.

Dafny 2024Incremental Proof Development in Dafny with Module-Based InductionSon Ho, Clément Pit-Claudel.
Dafny 2024 (Dafny 2024).Highly automated theorem provers like Dafny allow users to prove simple properties with little effort, making it easy to quickly sketch proofs. The drawback is that such provers leave users with little control about the proof search, meaning that the small changes inherent to the iterative process of writing a proof often lead to unpredictable variations in verification time, and eventually hard-to-diagnose proof failures. This sometimes turns the boon of high automation into a curse, as instead of breaking early and showing unsolved goals to the user like in Coq, proofs tend to gradually become unstable until their verification time explodes. At this point, the absence of a proof context to investigate often leaves the user to a painful debugging session. In this paper, we show how to use Dafny modules to encode Coq-like induction principles to dramatically improve the stability and maintainability of proofs about inductive data structures.

HATRA 2023Diagrammatic notations for interactive theorem provingShardul Chiplunkar, Clément Pit-Claudel.
4th International Workshop on Human Aspects of Types and Reasoning Assistants (HATRA 2023).Diagrams are ubiquitous in the development and presentation of proofs, yet surprisingly uncommon in computerized mathematics. Instead, authors and developers rely almost exclusively on line-oriented notations (textual abbreviations and symbols). How might we enrich interactive theorem provers with on-the-fly visual aids that are just as usable? We answer this question by identifying a key challenge: designing declarative languages for composable diagram templates, that provide good-looking implementations of common patterns, and allow for rapid prototyping of diagrams that remain stable across transformations and proof steps.

PLARCH 2023Hardware Verification of Timing Side Channel Freedom in the Spectre EraStella Lau, Thomas Bourgeat, Clément Pit-Claudel, Adam Chlipala.
Workshop on Programming Languages for Architecture (PLARCH 2023).The process isolation enforceable by commodity hardware is too weak to protect secrets from malicious code running on the same machine: Spectre-era attacks have exploited timing side channels derived from contention on shared microarchitectural resources to extract secrets. We describe our work in progress on specifying and verifying strong timing isolation for synthesizable hardware. We reason about the cycle-accurate semantics of circuits with respect to a trustworthy formulation of strong isolation based on “air-gapped machines” and develop a modular proof strategy that sidesteps the need to prove functional correctness of processors. We demonstrate our method on a multicore, pipelined, RISC-V design formalized in Coq. Finally, we discuss future directions toward composing isolation guarantees with speculative, constant-time guarantees to obtain a software-to-hardware, end-to-end guarantee of timing side channel freedom.

PLDI 2022Relational Compilation for Performance-Critical Applications: Extensible Proof-Producing Translation of Functional Models into Low-Level CodeClément Pit-Claudel, Jade Philipoom, Dustin Jamner, Andres Erbsen, Adam Chlipala.
Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI 2022).There are typically two ways to compile and run a purely functional program verified using an interactive theorem prover (ITP): automatically extracting it to a similar language (typically an unverified process, like Coq to OCaml) or manually proving it equivalent to a lower-level reimplementation (like a C program). Traditionally, only the latter produced both excellent performance and end-to-end proofs.
This paper shows how to recast program extraction as a proof-search problem to automatically derive correct-by-construction, high-performance code from purely functional programs. We call this idea relational compilation — it extends recent developments with novel solutions to loop-invariant inference and genericity in kinds of side effects.
Crucially, relational compilers are incomplete, and unlike traditional compilers, they generate good code not because of a fixed set of clever built-in optimizations but because they allow experts to plug in domain–specific extensions that give them complete control over the compiler's output.
We demonstrate the benefits of this approach with Rupicola, a new compiler-construction toolkit designed to extract fast, verified, idiomatic low-level code from annotated functional models. Using case studies and performance benchmarks, we show that it is extensible with minimal effort and that it achieves performance on par with that of handwritten C programs.

PhD thesisRelational Compilation: Functional-to-Imperative Code Generation for Performance-Critical Applicationsvideo, bibClément Pit-Claudel.
PhD thesis at MIT.Purely functional programs verified using interactive theorem provers typically need to be translated to run: either by extracting them to a similar language (like Coq to OCaml) or by proving them equivalent to deeply embedded implementations (like C programs). Traditionally, the first approach is automated but produces unverified programs with average performance, and the second approach is manual but produces verified, high-performance programs.
This thesis shows how to recast program extraction as a proof-search problem to automatically derive correct-by-construction, high-performance code from shallowly embedded functional programs. It introduces a unifying framework, *relational compilation*, to capture and extend recent developments in program extraction, with a focus on modularity and sound extensibility. To demonstrate the value of this approach, it then presents Rupicola, a relational compiler-construction toolkit designed to extract fast, verified, idiomatic low-level code from annotated functional models.
The originality of this approach lies in its combination of foundational proofs, extensibility, and performance, backed by an unconventional take on compiler extensions: unlike traditional compilers, Rupicola generates good code not because of clever built-in optimizations, but because it allows expert users to plug in domain- and sometimes program-specific extensions that allow them to generate exactly the low-level code that they want. This thesis demonstrates the benefits of this approach through case studies and performance benchmarks that highlight how easy Rupicola makes it to create domain-specific compilers that generate code with performance comparable to that of handwritten C programs.

ASPLOS 2021Effective Simulation and Debugging for a High-Level Hardware Language Using Software CompilersClément Pit-Claudel, Thomas Bourgeat, Stella Lau, Arvind, Adam Chlipala.
Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021).Rule-based hardware-design languages (RHDLs) promise to enhance developer productivity by offering convenient abstractions. Advanced compiler technology keeps the cost of these abstractions low, generating circuits with excellent area and timing properties.
Unfortunately, comparatively little effort has been spent on building simulators and debuggers for these languages, so users often simulate and debug their designs at the RTL level. This is problematic because generated circuits typically suffer from poor readability, as compiler optimizations can break high-level abstractions. Worse, optimizations that operate under the assumption that concurrency is essentially free yield faster circuits but often actively hurt simulation performance on platforms with limited concurrency, like desktop computers or servers.
This paper demonstrates the benefits of completely separating the simulation and synthesis pipelines. We propose a new approach, yielding the first compiler designed for effective simulation and debugging of a language in the Bluespec family. We generate cycle-accurate C++ models that are readable, compatible with a wide range of traditional software-debugging tools, and fast (often two to three times faster than circuit-level simulation). We achieve these results by optimizing for sequential performance and using static analysis to minimize redundant work. The result is a vastly improved hardware-design experience, which we demonstrate on embedded processor designs and DSP building blocks using performance benchmarks and debugging case studies.

CoqPL 2021Automated Synthesis of Verified FirewallsShardul Chiplunkar, Clément Pit-Claudel, Adam Chlipala.
The 7th International Workshop on Coq for PL (CoqPL 2021).
Workshop paper.We demonstrate correct-by-construction firewalls—stateful packet filters for TCP/IP packets—using the Fiat synthesis library. We present a general DSL for specifying their behavior independent of algorithmic implementation. We outline the design of a verified compiler in Coq, detail a few verified efficiency optimizations, and show how the compiler can easily be extended to support custom optimizations for user-defined policies.

CoqPL 2021An experience report on writing usable DSLs in CoqClément Pit-Claudel, Thomas Bourgeat.
The 7th International Workshop on Coq for PL (CoqPL 2021).
Workshop paper.Features added to Coq over the last 5 years have made it possible to create drastically more usable domain-specific languages (DSLs). We report on our experience building and using a hardware-description language embedded within Coq, highlighting how recent Coq improvements make it possible to solve longstanding pain points with Coq DSLs.

SLE 2020Untangling Mechanized ProofsClément Pit-Claudel.
Proceedings of the 13th ACM SIGPLAN International Conference on Software Language Engineering (SLE 2020).
Distinguished artifact.Proof assistants like Coq, Lean, or HOL4 rely heavily on stateful meta-programs called scripts to assemble proofs. Unlike pen-and-paper proofs, proof scripts only describe the steps to take (induct on x, apply a theorem, …), not the states that these steps lead to; as a result, plain proof scripts are essentially incomprehensible without the assistance of an interactive user interface able to run the script and show the corresponding proof states.
Until now, the standard process to communicate a proof without forcing readers to execute its script was to manually copy-paste intermediate proof states into the script, as source code comments — a tedious and error-prone exercise. Additional prose (such as for a book or tutorial) was likewise embedded in comments, preserving executability at the cost of a mediocre text-editing experience.
This paper describes a new approach to the development and dissemination of literate proof scripts, with a focus on the Coq proof assistant. Specifically, we describe two contributions: a compiler that interleaves Coq's output with the original proof script to produce interactive webpages that are complete, self-contained presentations of Coq proofs; and a new literate programming toolkit that allows authors to switch seamlessly between prose- and code-oriented views of the same sources, by translating back and forth between reStructuredText documents and literate Coq source files. In combination, these tools offer a new way to write, communicate, and preserve proofs, combining the flexibility of procedural proof scripts and the intelligibility of declarative proofs.

IJCAR 2020Extensible Extraction of Efficient Imperative Programs with Foreign Functions, Manually Managed Memory, and ProofsClément Pit-Claudel, Peng Wang, Benjamin Delaware, Jason Gross, Adam Chlipala.
Proceedings of the 10th International Joint Conference on Automated Reasoning (IJCAR 2020).We present an original approach to sound program extraction in a proof assistant, using syntax-driven automation to derive correct-by-construction imperative programs from nondeterministic functional source code. Our approach does not require committing to a single inflexible compilation strategy and instead makes it straightforward to create domain-specific code translators. In addition to a small set of core definitions, our framework is a large, user-extensible collection of compilation rules each phrased to handle specific language constructs, code patterns, or data manipulations. By mixing and matching these pieces of logic, users can easily tailor extraction to their own domains and programs, getting maximum performance and ensuring correctness of the resulting assembly code.
Using this approach, we complete the first proof-generating pipeline that goes automatically from high-level specifications to assembly code. In our main case study, the original specifications are phrased to resemble SQL-style queries, while the final assembly code does manual memory management, calls out to foreign data structures and functions, and is suitable to deploy on resource-constrained platforms. The pipeline runs entirely within the Coq proof assistant, leading to final, linked assembly code with overall full-functional-correctness proofs in separation logic.

PLDI 2020The Essence of Bluespec: A Core Language for Rule-Based Hardware DesignThomas Bourgeat, Clément Pit-Claudel, Adam Chlipala, Arvind.
Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020).The Bluespec hardware-description language presents a significantly higher-level view than hardware engineers are used to, exposing a simpler concurrency model that promotes formal proof, without compromising on performance of compiled circuits. Unfortunately, the cost model of Bluespec has been unclear, with performance details depending on a mix of user hints and opaque static analysis of potential concurrency conflicts within a design. In this paper we present Kôika, a derivative of Bluespec that preserves its desirable properties and yet gives direct control over the scheduling decisions that determine performance. Kôika has a novel and deterministic operational semantics that uses dynamic analysis to avoid concurrency anomalies. Our implementation includes Coq definitions of syntax, semantics, key metatheorems, and a verified compiler to circuits. We argue that most of the extra circuitry required for dynamic analysis can be eliminated by compile-time BSV-style static analysis.

ICFP 2019Narcissus: Correct-by-Construction Derivation of Decoders and Encoders from Binary FormatsBenjamin Delaware, Sorawit Suriyakarn, Clément Pit-Claudel, Qianchuan Ye, Adam Chlipala.
Proceedings of the ACM on Programming Languages (ICFP 2019).It is a neat result from functional programming that libraries of parser combinators can support rapid construction of decoders for quite a range of formats. With a little more work, the same combinator program can denote both a decoder and an encoder. Unfortunately, the real world is full of gnarly formats, as with the packet formats that make up the standard Internet protocol stack. Most past parser-combinator approaches cannot handle these formats, and the few exceptions require redundancy – one part of the natural grammar needs to be hand-translated into hints in multiple parts of a parser program. We show how to recover very natural and nonredundant format specifications, covering all popular network packet formats and generating both decoders and encoders automatically. The catch is that we use the Coq proof assistant to derive both kinds of artifacts using tactics, automatically, in a way that guarantees that they form inverses of each other. We used our approach to reimplement packet processing for a full Internet protocol stack, inserting our replacement into the OCaml-based MirageOS unikernel, resulting in minimal performance degradation.

ESOP 2019Meta-F⋆ : Proof Automation with SMT, Tactics, and MetaprogramsGuido Martínez, Danel Ahman, Victor Dumitrescu, Nick Giannarakis, Chris Hawblitzel, Cătălin Hriţcu, Monal Narasimhamurthy, Zoe Paraskevopoulou, Clément Pit-Claudel, Jonathan Protzenko, Tahina Ramananandro, Aseem Rastogi, Nikhil Swamy.
Proceedings of the 28th European Symposium on Programming (ESOP 2019).We introduce Meta-F⋆, a tactics and metaprogramming framework for the F⋆ program verifier. The main novelty of Meta-F⋆ is allowing the use of tactics and metaprogramming to discharge assertions not solvable by SMT, or to just simplify them into well-behaved SMT fragments. Plus, Meta-F⋆ can be used to generate verified code automatically.
Meta-F⋆ is implemented as an F⋆ effect, which, given the powerful effect system of F⋆, heavily increases code reuse and even enables the lightweight verification of metaprograms. Metaprograms can be either interpreted, or compiled to efficient native code that can be dynamically loaded into the F⋆ type-checker and can interoperate with interpreted code.
Evaluation on realistic case studies shows that Meta-F⋆ provides substantial gains in proof development, efficiency, and robustness.

SETTA 2018Correct-by-Construction Implementation of Runtime Monitors Using Stepwise RefinementTeng Zhang, John Wiegley, Theophilos Giannakopoulos, Gregory Eakman, Clément Pit-Claudel, Insup Lee, Oleg Sokolsky.
Proceedings of the 4th International Symposium on Dependable Software Engineering: Theories, Tools, and Applications (SETTA 2018).Runtime verification (RV) is a lightweight technique for verifying traces of computer systems. One challenge in applying RV is to guarantee that the implementation of a runtime monitor correctly detects and signals unexpected events. In this paper, we present a method for deriving correct-by-construction implementations of runtime monitors from high-level specifications using Fiat, a Coq library for stepwise refinement. SMEDL (Scenario-based Meta-Event Definition Language), a domain specific language for event-driven RV, is chosen as the specification language. We propose an operational semantics for SMEDL suitable to be used in Fiat to describe the behavior of a monitor in a relational way. Then, by utilizing Fiat’s refinement calculus, we transform a declarative monitor specification into an executable runtime monitor with a proof that the behavior of the implementation is strictly a subset of that provided by the specification. Moreover, we define a predicate on the syntax structure of a monitor definition to ensure termination and determinism. Most of the proof work required to generate monitor code has been automated.

ML 2018ML as a Tactic Language, AgainGuido Martínez, Danel Ahman, Victor Dumitrescu, Nick Giannarakis, Chris Hawblitzel, Cătălin Hriţcu, Monal Narasimhamurthy, Zoe Paraskevopoulou, Clément Pit-Claudel, Jonathan Protzenko, Tahina Ramananandro, Aseem Rastogi, Nikhil Swamy.
ML Family Workshop (ML 2018).
Workshop paper.ML originated as the meta language for constructing proofs in the LCF theorem prover. Due to its many virtues, it became a role model for object-level languages as well, inspiring an entire family of strict functional programming languages.
A member of this family is F⋆, a verification-oriented dialect of ML including dependent types and a monadic effect system, the combination of which allows to express arbitrarily complex properties of (effectful!) programs within its type system itself. To prove these properties, F⋆ relies on querying an SMT solver (usually Z3). For properties in well-supported SMT theories (e.g. linear arithmetic or uninterpreted functions) this approach works very well, allowing programmers to verify programs with relative ease. However, for poorly-supported theories (e.g. non-linear arithmetic), the SMT solver might require a prohibitive amount of resources or be wildly unpredictable. Furthermore, these rotten apples spoil the barrel, since combining even a basic provable fact from a poorly-supported theory with a well-supported one might stump the solver.
Motivated by these issues, we have recently extended F⋆ with a tactics and metaprogramming engine, which allows to attack proof obligations with user-defined procedures. Tactics are written in Meta-F⋆, which is simply the set of those F⋆ programs of a particular Tac effect, defined within the existing type-and-effect system. As such, they readily interoperate with the rest of language and can themselves be verified (to a degree). Using tactics allows the programmer to massage proof obligations before querying the SMT solver (possibly simplifying them into the well-supported fragments) or even solving them completely.
Here, we present an overview of Meta-F⋆, focusing on its design and its execution models. In addition, we discuss the benefits of tactics and metaprogramming in F⋆ through some idiomatic case studies.

SNAPL 2017The End of History? Using a Proof Assistant to Replace Language Design with Library DesignAdam Chlipala, Benjamin Delaware, Samuel Duchovni, Jason Gross, Clément Pit-Claudel, Sorawit Suriyakarn, Peng Wang, Katherine Ye.
The 2nd Summit on Advances in Programming Languages (SNAPL 2017).Functionality of software systems has exploded in part because of advances in programming-language support for packaging reusable functionality as libraries. Developers benefit from the uniformity that comes of exposing many interfaces in the same language, as opposed to stringing together hodgepodges of command-line tools. Domain-specific languages may be viewed as an evolution of the power of reusable interfaces, when those interfaces become so flexible as to deserve to be called programming languages. However, common approaches to domain-specific languages give up many of the hard-won advantages of library-building in a rich common language, and even the traditional approach poses significant challenges in learning new APIs. We suggest that instead of continuing to develop new domain-specific languages, our community should embrace library-based ecosystems within very expressive languages that mix programming and theorem proving. Our prototype framework Fiat, a library for the Coq proof assistant, turns languages into easily comprehensible libraries via the key idea of modularizing functionality and performance away from each other, the former via macros that desugar into higher-order logic and the latter via optimization scripts that derive efficient code from logical programs.

CAV 2016Trigger Selection Strategies to Stabilize Program VerifiersK. Rustan M. Leino and Clément Pit-Claudel.
Proceedings of the 28th International Conference on Computer Aided Verification (CAV 2016).SMT-based program verifiers often suffer from the so-called butterfly effect, in which minor modifications to the program source cause significant instabilities in verification times, which in turn may lead to spurious verification failures and a degraded user experience. This paper identifies matching loops (ill-behaved quantifiers causing an SMT solver to repeatedly instantiate a small set of quantified formulas) as a significant contributor to these instabilities, and describes some techniques to detect and prevent them. At their core, the contributed techniques move the trigger selection logic away from the SMT solver and into the high-level verifier: this move allows authors of verifiers to annotate, rewrite, and analyze user-written quantifiers to improve the solver’s performance, using information that is easily available at the source level but would be hard to extract from the heavily encoded terms that the solver works with. The paper demonstrates three core techniques (quantifier splitting, trigger sharing, and matching loop detection) by extending the Dafny verifier with its own trigger selection routine, and demonstrates significant predictability and performance gains on both Dafny’s test suite and large verification efforts using Dafny.

CoqPL 2016Company-Coq: Taking Proof General one step closer to a real IDEClément Pit-Claudel, Pierre Courtieu.
The 2nd International Workshop on Coq for PL (CoqPL 2016).
Workshop paper.Company-Coq is a new Emacs package that extends Proof General with a contextual auto-completion engine for Coq proofs and many additional facilities to make writing proofs easier and more efficient. Beyond fuzzy auto-completion of tactics, options, module names, and local definitions, company-coq offers offline in-editor documentation, convenient snippets, and multiple other Coq-specific IDE features.

MS thesisCompilation Using Correct-by-Construction Program Synthesishtml, bibClément Pit-Claudel.
Master's thesis at MIT.
William A. Martin Memorial Thesis Award for Outstanding Thesis in CS.Extracting and compiling certified programs may introduce bugs in otherwise proven-correct code, reducing the extent of the guarantees that proof assistants and correct-by-construction program-derivation frameworks provide. We explore a novel approach to extracting and compiling embedded domain-specific languages developed in a proof assistant (Coq), showing how it allows us to extend correctness guarantees all the way down to a verification-aware assembly language. Our core idea is to phrase compilation of shallowly embedded programs to a lower-level deeply embedded language as a synthesis problem, solved using simple proof-search techniques. This technique is extensible (support for individual language constructs is provided by a user-extensible database of compilation tactics and lemmas) and allows the source programs to depend on axiomatically specified methods of externally implemented data structures, delaying linking to the assembly stage. Composed with the Fiat and Bedrock frameworks, our new method provides the first proof-generating automatic translation from SQL-style relational programs into executable assembly code.

POPL 2015Fiat: Deductive Synthesis of Abstract Data Types in a Proof AssistantBenjamin Delaware, Clément Pit-Claudel, Jason Gross, Adam Chlipala.
Proceedings of the 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2015).We present Fiat, a library for the Coq proof assistant supporting refinement of declarative specifications into efficient functional programs with a high degree of automation. Each refinement process leaves a proof trail, checkable by the normal Coq kernel, justifying its soundness. We focus on the synthesis of abstract data types that package methods with private data. We demonstrate the utility of our framework by applying it to the synthesis of query structures — abstract data types with SQL-like query and insert operations. Fiat includes a library for writing specifications of query structures in SQL-inspired notation, expressing operations over relations (tables) in terms of mathematical sets. This library includes a suite of tactics for automating the refinement of specifications into efficient, correct-by-construction OCaml code. Using these tactics, a programmer can generate such an implementation completely automatically by only specifying the equivalent of SQL indexes, data structures capturing useful views of the abstract data. Throughout we speculate on the new programming modularity possibilities enabled by an automated refinement system with proved-correct rules.

Tech reportOutlier Detection in Heterogeneous Datasets using Automatic Tuple ExpansionClément Pit-Claudel, Zelda Mariet, Rachael Harding, Sam Madden.
Technical report.Rapidly developing areas of information technology are generating massive amounts of data. Human errors, sensor failures, and other unforeseen circumstances unfortunately tend to undermine the quality and consistency of these datasets by introducing outliers -- data points that exhibit surprising behavior when compared to the rest of the data. Characterizing, locating, and in some cases eliminating these outliers offers interesting insight about the data under scrutiny and reinforces the confidence that one may have in conclusions drawn from otherwise noisy datasets. In this paper, we describe a tuple expansion procedure which reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating point numbers. We then use this procedure as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples. We show that this novel approach achieves good classification performance, both in traditional numerical datasets and in highly non-numerical contexts such as mostly textual datasets.