Any programmer productivity tool presupposes a specification of the task that it is supposed to perform. However, formal specifications tend to be hard to write, and the need for specifications remains a key impediment to the adoption of tools for program analysis and synthesis. In the recent past, a number of groups, including ours, have studied a novel response to this challenge: the automatic extraction of specifications from “Big Code”, or the many repositories of open-source code available on the internet. Such repositories contain rich statistical knowledge about what programs should and should not do, and what programs that implement various requirements look like. Our recent work has developed a deep learning framework for automatically discovering this sort of knowledge, as well as methods for using this framework to guide systems for automatic program analysis, repair, and synthesis. This tutorial will present our framework, called Bayou, using two concrete tools: BayouSynth, a system for synthesizing API-heavy Java programs, and BayouDebug, a system for the automatic detection of API misuse.

The tutorial will start with an introduction to deep learning over code. Next, we will present the probabilistic model of program design that lies at the heart of Bayou. This model relates the “true” specification of a programming task to the form and function of programs that implement the task, and to the ambiguous, incomplete language in which users of programming tools might describe such tasks. We will show how this model has a neural implementation supporting efficient inference, how it can be used to direct and complement classic methods for symbolic reasoning about programs, and how the resulting “neurosymbolic” algorithms can be used to synthesize programs from highly ambiguous specifications and find bugs without any explicit specifications. The tutorial will be accompanied by hands-on demonstrations of how to use and extend BayouSynth, BayouDebug, and the datasets that can be used to train these systems.

Mon 18 Jun

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:40 (1h40m)
Bayou: Deep Learning over “Big Code” for Program Analysis and Synthesis
PLDI Tutorials at Innovation
Swarat Chaudhuri (Rice University), Vijayaraghavan Murali (Rice University, USA), Chris Jermaine (Rice University)

16:10 - 17:35 (1h25m)
Bayou: Deep Learning over “Big Code” for Program Analysis and Synthesis
PLDI Tutorials at Innovation