Papers | Shreya Shankar

3 representative papers (for job market)

🏆 Won an award

Co-first author is my mentee

Semantic Data Processing with Holistic Data Understanding

Youran Sun*, Sepanta Zeighami*, Bhavya Chopra, Shreya Shankar, Aditya G. Parameswaran

Preprint

Can AI Agents Answer Your Data Questions? A Benchmark for Data Agents

Ruiying Ma*, Shreya Shankar*, Ruiqi Chen, Yiming Lin, Sepanta Zeighami, Rajoshi Ghosh, Abhinav Gupta, Anushrut Gupta, Tanmai Gopal, Aditya G. Parameswaran

Preprint

Co-first author is my mentee

Multi-Objective Agentic Rewrites for Unstructured Data Processing

Lindsey Linxi Wei*, Shreya Shankar*, Sepanta Zeighami, Yeounoh Chung, Fatma Ozcan, Aditya G. Parameswaran

To appear at VLDB 2026

Co-first author is my mentee

Featurized-Decomposition Join: Low-Cost Semantic Joins with Guarantees

Sepanta Zeighami, Shreya Shankar, Aditya G. Parameswaran

To appear at VLDB 2026

Task Cascades for Efficient Unstructured Data Processing

Shreya Shankar, Sepanta Zeighami, Aditya G. Parameswaran

To appear at SIGMOD 2026

Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees

Sepanta Zeighami, Shreya Shankar, Aditya G. Parameswaran

To appear at SIGMOD 2026

RAG Without the Lag: Interactive Debugging for Retrieval-Augmented Generation Pipelines

Quentin Romero Lauro*, Shreya Shankar*, Sepanta Zeighami, Aditya G. Parameswaran

CHI 2026

🏆 Best Paper

Co-first author is my mentee

Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First

Shu Liu, Soujanya Ponnapalli, Shreya Shankar, Sepanta Zeighami, Alan Zhu, Shubham Agarwal, Ruiqi Chen, Samion Suwito, Shuo Yuan, Ion Stoica, Matei Zaharia, Alvin Cheung, Natacha Crooks, Joseph E. Gonzalez, Aditya G. Parameswaran

CIDR 2026

Steering Semantic Data Processing with DocWrangler

Shreya Shankar*, Bhavya Chopra*, Mawil Hasan, Stephen Lee, Björn Hartmann, Joseph M. Hellerstein, Aditya G. Parameswaran, Eugene Wu

UIST 2025

🏆 Best Paper Honorable Mention

Rethinking Dataset Discovery with DataScout

Rachel Lin*, Bhavya Chopra*, Wenjing Lin, Shreya Shankar, Madelon Hulsebos, Aditya G. Parameswaran

UIST 2025

Co-first author is my mentee

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

Shreya Shankar, Tristan Chambers, Tarak Shah, Aditya G. Parameswaran, Eugene Wu

VLDB 2025

LLM-Powered Proactive Data Systems

Sepanta Zeighami, Yiming Lin, Shreya Shankar, Aditya G. Parameswaran

IEEE Data Engineering Bulletin 2025

Querying Templatized Document Collections with Large Language Models

Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Zeighami, Aditya G. Parameswaran, Eugene Wu

ICDE 2025

PromptEvals: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines

Reya Vir*, Shreya Shankar*, Harrison Chase, William Hinthorn, Aditya G. Parameswaran

NAACL 2025

🏆 Selected for Oral Presentation

Co-first author is my mentee

Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences

Shreya Shankar, J.D. Zamfirescu-Pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo

UIST 2024

SPADE: Synthesizing Data Quality Assertions for Large Language Model Pipelines

Shreya Shankar, Haotian Li, Parth Asawa, Madelon Hulsebos, Yiming Lin, J.D. Zamfirescu-Pereira, Harrison Chase, Will Fu-Hinthorn, Aditya G. Parameswaran, Eugene Wu

VLDB 2024

What We've Learned From a Year of Building with LLMs

Eugene Yan, Bryan Bischof, Charles Frye, Hamel Husain, Jason Liu, Shreya Shankar

O'Reilly Radar

Building Reactive Large Language Model Pipelines with Motion

Shreya Shankar, Aditya G. Parameswaran

SIGMOD 2024 (Demo)

It Took Longer Than I Was Expecting: Why Is Dataset Search Still So Hard?

Madelon Hulsebos, Wenjing Lin, Shreya Shankar, Aditya G. Parameswaran

HILDA 2024 (Workshop on Human-in-the-Loop Data Analytics)

Revisiting Prompt Engineering via Declarative Crowdsourcing

Aditya G. Parameswaran, Shreya Shankar, Parth Asawa, Naman Jain, Yujie Wang

CIDR 2024

Operationalizing Machine Learning: An Interview Study

Shreya Shankar*, Rolando Garcia*, Joseph M. Hellerstein, Aditya G. Parameswaran

CSCW 2024

Towards Observability for Production Machine Learning Pipelines

Shreya Shankar, Aditya G. Parameswaran

VLDB 2023

Bolt-on, Compact, and Rapid Program Slicing for Notebooks

Shreya Shankar*, Stephen Macke*, Sarah Chasins, Andrew Head, Aditya G. Parameswaran

VLDB 2023

Automatic and Precise Data Validation for Machine Learning

Shreya Shankar, Labib Fawaz, Karl Gyllstrom, Aditya G. Parameswaran

CIKM 2023

Rethinking Streaming Machine Learning Evaluation

Shreya Shankar, Bernease Herman, Aditya G. Parameswaran

ICLR 2022: Workshop on ML Evaluation Standards

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy R Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy S Liang, Pushmeet Kohli

NeurIPS 2020

Adversarial examples that fool both computer vision and time-limited humans

Gamalelden F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alexey Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein

NIPS 2018

No classification without representation: Assessing geodiversity issues in open data sets for the developing world

Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley

NIPS 2017: Workshop on Machine Learning for the Developing World