Keynote 1: An Exploration of Automated Software Testing, Verification, and Repair Strategies using ESBMC and ChatGPT (slides)
Speaker: Lucas C. Cordeiro, University of Manchester

In this talk, I'll discuss automated testing, verification, and repair techniques that establish a robust foundation for building secure software systems. We'll begin by exploring security concepts, demystifying traditional notions and vulnerabilities unique to (low-level) software systems. Building on this, we'll dive into cutting-edge methods, covering state-of-the-art testing, verification, and repair techniques. This dynamic spectrum spans fuzzing, model checking, constraint programming, and abstract interpretation for vulnerability detection using ESBMC; we'll also touch on leveraging large language models for root cause analysis and program repair using ChatGPT. Lastly, I'll showcase recent achievements highlighting a hybrid approach for safeguarding against embedded software's memory safety and concurrency vulnerabilities. This forward-looking combined approach employs a logic-based automated reasoning framework as its main ingredient, thoughtfully examining search, learning, memory, and parallelization aspects. Throughout the talk, practical examples will bring concepts to life, spanning scenarios such as inspecting firmware security vulnerabilities in modern processors and ensuring the integrity of communication protocols.

Coffee Break
Latest Analysis Techniques around CPAchecker
Can ChatGPT Support CPAchecker with Useful Loop Invariants? (slides)
Speaker: Cedric Richter

Generative language models have made tremendous progress in software engineering tasks such as code generation, code refinement and debugging. Many of these tasks require strong reasoning skills over code and code executions. This raises the question whether we can utilize the reasoning skills of ChatGPT in the formal setting of software verification and complement existing automatic verifiers such as CPAchecker. To answer this question, we study the abilities of ChatGPT in the context of automatic software verification. More specifically, we focus on the task of loop invariant generation. Loop invariant generation requires reasoning over the code semantics to infer invariants that hold for all loop executions and good loop invariants can significantly simplify the verification process. Our initial investigation on the SV-COMP Loops category shows that ChatGPT can effectively annotate C programs with valid loop invariants, many of which can be useful for automatic software verification. Interestingly, we find that many generated loop invariants are far from trivial and have a similar quality as invariants produced by a human annotator.

CPA-DF: A Tool for Configurable Interval Analysis to Boost Program Verification (slides)
Speaker: Po-Chun Chien

Software verification is challenging, and auxiliary program invariants are used to improve the effectiveness of verification approaches. For instance, the k-induction implementation in CPAchecker, an award-winning framework for program analysis, uses invariants produced by a configurable data-flow analysis to strengthen induction hypotheses. This invariant generator, CPA-DF, uses arithmetic expressions over intervals as its abstract domain and is able to prove some safe verification tasks alone. After extensively evaluating CPA-DF on SV-Benchmarks, the largest publicly available suite of C safety-verification tasks, we discover that its potential as a stand-alone analysis or a sub-analysis in a parallel portfolio for combined verification approaches has been significantly underestimated: (1) As a stand-alone analysis, CPA-DF finds almost as many proofs as the plain k-induction implementation without auxiliary invariants. (2) As a sub-analysis running in parallel to the plain k-induction implementation, CPA-DF boosts the portfolio verifier to solve a comparable amount of tasks as the heavily-optimized k-induction implementation with invariant injection. Our detailed analysis reveals that dynamic precision adjustment is crucial to the efficiency and effectiveness of CPA-DF. To generalize our results beyond CPAchecker, we use CoVeriTeam, a platform for cooperative verification, to compose three portfolio verifiers that execute CPA-DF and three other software verifiers in parallel, respectively. Surprisingly, running CPA-DF merely in parallel to these state-of-the-art tools further boosts the number of correct results up to more than 20%.

A Unifying Approach for Control-Flow-Based Loop Abstraction (slides)
Speaker: Marian Lingsch-Rosenfeld

Loop abstraction is a central technique for program analysis, because loops can cause large state spaces if they are unfolded. In many cases, simple tricks can accelerate the program analysis significantly. There are several successful techniques for loop abstraction, but they are hard-wired into different tools and therefore difficult to compare and experiment with. We present a framework that allows us to implement loop abstractions in one common environment, where each technique can be freely switched on and off on-the-fly during the analysis. We treat loops as part of the abstract model of the program, and use counterexample-guided abstraction refinement to increase the precision of the analysis by dynamically selecting particular techniques for loop abstraction. The framework is independent from the underlying abstract domain of the program analysis, and can therefore be used for several different program analyses. Furthermore, our framework can transform the input program to a modified C program that is unsafe if the input program is unsafe. This allows loop abstraction to be used by other verifiers and our improvements are not `locked in' to our verifier. We implemented several existing approaches and evaluate their effects on the program analysis.

Backward Bounded Model Checking in CPAchecker (slides)
Speaker: Bas Laarakker

Backward bounded model checking (BBMC) aims to solve the error location reachability problem by searching for execution paths from an error location to the initial location and determining their feasibility. Backward analyses can complement forward analyses, as backward analyses may be able to determine the reachability of error locations on programs where forward analyses may fail. An implementation of BBMC requires general support for backward analyses in CPAchecker, which is currently limited. We investigate the existing support for backward analyses in CPAchecker, and build on this to be able to implement the BBMC algorithm in CPAchecker. Moreover, we develop support for pointer aliasing in backward Predicate CPA, allowing backward analyses to be used on a larger set of programs and language features. An existing paper states that BBMC may be equivalent to k-induction. To test the comparative performance of the algorithms in practice, we evaluate the algorithms on a benchmark to see when they are indeed equivalent and in what cases they differ. Finally, we consider further work to be done and future improvements both for the BBMC algorithm and the support for backward analyses.

Lunch Break
Keynote 2: Software Model Checking: 20 Years and Beyond (slides)
Speaker: Dirk Beyer, LMU Munich

We give an overview of the development of software model checking, a general approach to algorithmic program verification that integrates static analysis, model checking, and deduction. We start with a look backwards and briefly cover some of the important steps in the past decades. The general approach has become a research topic on its own, with a wide range of tools that are based on the approach. Therefore, we discuss the maturity of the research area of software model checking in terms of looking at competitions, at citations, and most importantly, at the tools that were build in this area: we count 76 verification systems for software written in C or Java. We conclude that software model checking has quickly grown to a significant field of research with a high impact on current research directions and tools in software verification. This talk is based on the following joint work with Andreas Podelski:

Lightning Talks
Enhancing CPAchecker: A Framework for Distributed Analyses (slides)
Speaker: Matthias Kettl

There are many approaches for automated software verification, but they are either imprecise, or they do not scale well to large systems and do not sufficiently leverage parallelization. This leads to large response times, which hinders the integration of software model checking into the development process (continuous integration). We propose a framework for the tool CPAchecker that allows distributing arbitrary CPAs by extending them with only two operators: serialize and deserialize. Serialize takes abstract states and transforms them into messages. Deserialize is the inverse operation and creates abstract states from messages. For successfully deploying a new distributed approach, developers only need to implement a protocol. The talk gives an abstract overview of the framework.

Scaling Formal Verification: Parallel Analysis of Functions (slides)
Speaker: George Granberry
Dual Approximated Reachability Model Checking in CPAchecker
Speaker: Marek Jankola

Hardware verification algorithms were previously modified to suit software verification and implemented in CPAchecker. These algorithms iteratively compute interpolants to overapproximate reachable state space and eventually converge to invariant. They use interpolation in a so-called forward fashion to overapproximate reachable states from initial states. Dual Approximated Reachability algorithm also uses interpolants to overapproximate states that can reach assertion-violating states. It combines information from both forward and backward interpolation sequences to converge faster. In the talk, we briefly explain the main idea of the algorithm, show practical obstacles in the CPAchecker implementation that we overcame and present the final results of the algorithm's performance.

Current State of Memory-Safety Analysis in CPAchecker
Speaker: Daniel Baier

CPAchecker uses a Symbolic Memory Graph (SMG) based analysis to argue about the safety of memory operations. CPAchecker has good results in the MemorySafety category in the SV-COMP23, but no development on the analysis itself has taken place over a long period of time. As a result, the current analysis, based on symbolic execution, is being shifted out and replaced with a new analysis. The new analysis is aimed to be more robust and versatile, as it allows to be run as a explicit-value analysis as well as with symbolic execution and will be extended to be used with CEGAR soon. Preliminary results support this claim.

CPAdaemon: Progress and Demo
Speaker: Henrik Wachowitz
Coffee Break
Insights from Using CPAchecker
Timeout Prediction for Software Analyses
Speaker: Nicola Thoben

As different analyses are good at different sorts of verification tasks, state-of-the-art tools often employ sequential compositions in which every analysis gets a fixed time slot assigned for verification. As a consequence, however, one analysis might consume parts of the overall available time although it does not finish within its time slot. Timeout prediction is supposed to determine when an analysis should get its full-time slot and when to prematurely stop it. Our technique for timeout prediction employs machine learning to predict whether a given analysis will terminate on a given verification task (within a time limit) or will time out. To this end, we develop static as well as dynamic features of verification tasks for predicate and value analyses. This is a follow-up to last year's presentation.

Real-World Software Verification with CPAchecker
Speaker: Thomas Lemberger
Using CPAchecker in Teaching (slides)
Speaker: Jan Haltermann

The CPAchecker is successfully used in verification and has a lot of active developers around the world. Due to its configurable design, it offers many possibilities for developing new analyses. We use the CPAchecker in a Bachelor and a Master course to enrich theoretical lectures with practical hands-on exercises. In this talk, I will give an overview of our usage within the lecture as well as a personal experience report.

Handling Flaky Regression Tests in CPAchecker (slides)
Speaker: Philipp Wendler

Flaky tests that non-deterministically pass or fail are a major impediment to regression testing. Existing work on identifying flaky tests is mostly based on test re-executions, which may be a feasible solution for short-running unit tests, but can become prohibitively expensive for long-running integration and system tests. To alleviate the problem, we propose a novel heuristic for identifying flaky test results by analyzing run-length encodings of regression test histories. For flagging tests as flaky, the proposed heuristic solely requires knowledge about previous test results and neither requires test re-executions nor expensive model fitting. We apply this heuristic in the context of the CPAchecker project, which heavily relies on integration tests for regression testing.

Discussion and Closing
Workshop Dinner at Bella Ciao (self-paid)