SPLASH 2021
Sun 17 - Fri 22 October 2021 Chicago, Illinois, United States
Thu 21 Oct 2021 11:05 - 11:20 at Zurich G - PLDI 2021, PLDI 2020, and OOPSLA 2020 Papers 1 Chair(s): James Koppel

In this paper, we propose a new technique based on program synthesis for extracting information from webpages. Given a natural language query and a few labeled webpages, our method synthesizes a program that can be used to extract similar types of information from other unlabeled webpages. To handle websites with diverse structure, our approach employs a neurosymbolic DSL that incorporates both neural NLP models and standard language constructs for tree navigation and string manipulation. We also propose an optimal synthesis algorithm that generates all DSL programs that achieve optimal $F_1$ score on the training examples. Our synthesis technique is compositional, prunes the search space by exploiting a monotonicity property of the DSL, and uses self-supervision to select programs with good generalization power. We have implemented these ideas in a new tool called WebQA and evaluated it on 25 different tasks across multiple domains. Our experiments show that WebQA significantly outperforms existing tools such as state-of-the-art question answering models and wrapper induction systems.
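The notion of "all programs that achieve optimal $F_1$ score on the training examples" can be illustrated with a small sketch. This is not WebQA's algorithm: the abstract describes a compositional search that prunes using a monotonicity property, whereas the code below simply scores a fixed list of candidates by brute force. The candidate extractors, the page representation (a list of text lines), the set-based $F_1$ definition, and all function names are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

Page = List[str]                      # hypothetical page model: a list of text lines
Extractor = Callable[[Page], Set[str]]


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Set-based F1 between extracted strings and labeled strings (an assumption)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def optimal_programs(
    candidates: Dict[str, Extractor],
    examples: List[Tuple[Page, Set[str]]],
) -> Tuple[List[str], float]:
    """Return the names of every candidate whose mean F1 over the examples is maximal."""
    scores = {
        name: sum(f1_score(program(page), gold) for page, gold in examples) / len(examples)
        for name, program in candidates.items()
    }
    best = max(scores.values())
    winners = [name for name, score in scores.items() if math.isclose(score, best)]
    return winners, best


# One labeled page: the task is to extract the lines naming course staff.
examples = [
    (["Instructor: Ada", "Room 12", "TA: Bob"], {"Instructor: Ada", "TA: Bob"}),
]

# Hypothetical candidate programs; two of them behave identically on the example.
candidates: Dict[str, Extractor] = {
    "all_lines": lambda page: set(page),
    "with_colon": lambda page: {line for line in page if ":" in line},
    "has_label_prefix": lambda page: {line for line in page if line.split(":")[0] != line},
    "first_line": lambda page: {page[0]},
}

winners, best = optimal_programs(candidates, examples)
print(winners, best)
```

Several candidates can tie at the optimum on a handful of labeled pages, which is why a separate selection step (self-supervision, in the abstract's wording) is needed to choose among them.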

Thu 21 Oct

Displayed time zone: Central Time (US & Canada)

10:50 - 12:10
PLDI 2021, PLDI 2020, and OOPSLA 2020 Papers 1 (SIGPLAN Papers) at Zurich G
Chair(s): James Koppel (Massachusetts Institute of Technology, USA)
10:50
15m
Talk
Example-Guided Synthesis of Relational Queries
SIGPLAN Papers
Aalok Thakkar (University of Pennsylvania), Aaditya Naik (University of Pennsylvania), Nathaniel Sands (University of Southern California), Mukund Raghothaman (University of Southern California), Mayur Naik (University of Pennsylvania), Rajeev Alur (University of Pennsylvania)
11:05
15m
Talk
Web Question Answering with Neurosymbolic Program Synthesis
SIGPLAN Papers
Qiaochu Chen (University of Texas at Austin, USA), Aaron Lamoreaux (University of Texas at Austin), Xinyu Wang (University of Michigan), Greg Durrett (University of Texas at Austin, USA), Osbert Bastani (University of Pennsylvania), Isil Dillig (University of Texas at Austin)
11:20
15m
Talk
Reactive Probabilistic Programming
SIGPLAN Papers
Guillaume Baudart (IBM Research, USA), Louis Mandel (IBM Research), Eric Atkinson (Massachusetts Institute of Technology), Benjamin Sherman (Massachusetts Institute of Technology, USA), Marc Pouzet (École normale supérieure), Michael Carbin (Massachusetts Institute of Technology)
11:35
15m
Talk
A Sparse Iteration Space Transformation Framework for Sparse Tensor Algebra
SIGPLAN Papers
Ryan Senanayake (Reservoir Labs), Changwan Hong (Massachusetts Institute of Technology), Ziheng Wang (Massachusetts Institute of Technology), Amalee Wilson (Stanford University), Stephen Chou (Massachusetts Institute of Technology), Shoaib Kamil (Adobe Research), Saman Amarasinghe (Massachusetts Institute of Technology), Fredrik Kjolstad (Stanford University)
11:50
20m
Live Q&A
Discussion, Questions and Answers
SIGPLAN Papers