Zezhou Huang

zh2408@columbia.edu

About Me

I'm Zezhou Huang. You can call me Zachary.

I'm a PhD student at Columbia University advised by Professor Eugene Wu. My work centers on developing a semantic layer for large join graphs in cloud data warehouses. If you look at the image below, you'll see what a join graph from a dataset like IMDB can look like. It's a bit messy, right? My job is to clean up that mess.

My previous projects built interactive dashboards, ML systems, and data discovery tools on top of join graphs. Now, I'm also looking into how to solve data problems with large language models (LLMs) and how to speed up query processing with GPU acceleration.

The Google PhD Fellowship and the Avanessian Fellowship provide generous funding for my research.

Publications

  1. Relationalizing Tables with Large Language Models: The Promise and Challenges
    Zezhou Huang, Eugene Wu.
    DBML@ICDE 2024.
  2. The Fast and the Private: Task-based Dataset Search
    Zezhou Huang, Jiaxiang Liu, Haonan Wang, Eugene Wu.
    CIDR 2024.
  3. Lightweight Materialization for Fast Dashboards Over Joins
    Zezhou Huang and Eugene Wu.
    SIGMOD 2024.
  4. Data Ambiguity Strikes Back: How Documentation Improves GPT's Text-to-SQL
    Zezhou Huang, Pavan Kalyan Damalapati, and Eugene Wu.
    TRL@NeurIPS 2023 (selected for a spotlight talk), Video.
  5. Saibot: A Differentially Private Data Search Platform.
    Zezhou Huang, Jiaxiang Liu, Daniel Gbenga Alabi, Raul Castro Fernandez, and Eugene Wu.
    VLDB 2023.
  6. Kitana: Efficient Data Augmentation Search for AutoML.
    Zezhou Huang, Pranav Subramaniam, Raul Castro Fernandez, and Eugene Wu.
    arXiv.
  7. Random Forests over normalized data in CPU-GPU DBMSes.
    Zezhou Huang, Pavan Kalyan Damalapati, Rathijit Sen, and Eugene Wu.
    DaMoN@SIGMOD 2023, Slides.
  8. JoinBoost: Grow Trees Over Normalized Data Using Only SQL.
    Zezhou Huang, Rathijit Sen, Jiaxiang Liu, and Eugene Wu.
    VLDB 2023, Video 1, Video 2.
  9. Aggregation Consistency Errors in Semantic Layers and How to Avoid Them.
    Zezhou Huang, Pavan Kalyan Damalapati, and Eugene Wu.
    HILDA@SIGMOD 2023, Slides.
  10. Reptile: Aggregation-level Explanations for Hierarchical Data.
    Zezhou Huang and Eugene Wu.
    SIGMOD 2022, Video, News, Interview.
  11. Calibration: A Simple Trick for Wide-table Delta Analytics
    Zezhou Huang and Eugene Wu.
    arXiv.
  12. Spatial and hedonic analysis of housing prices in Shanghai
    Zezhou Huang, Ruishan Chen, Di Xu, Wei Zhou.
    Habitat International 2017.

Service

Random

I once developed a game that was live on the App Store. It is now offline because of the Apple Developer Program fee. Someone made a gameplay video about it. If there's interest, I can bring it back.