You can also call me Zachary.
I'm currently a Ph.D. student at Columbia University, advised by Eugene Wu.
My research areas are DBMS, Machine Learning, Data Integration, Data Wrangling, and HCI.
I'm currently building a factorized DBMS (database management system) to manage databases with large join graphs and to efficiently execute semiring aggregation queries, which are at the heart of data analytics (e.g., Pearson correlation, PCA, SVD) and machine learning tasks (e.g., linear regression, support vector machines, regression trees).
The key insight is to apply theoretical ideas from PGMs (probabilistic graphical models) to DBMSs, including variable elimination, greedy ordering, and junction trees. There is a strong mapping between inference tasks in PGMs and aggregation queries in DBMSs:
- An inference task in a PGM asks: given prior probabilities and a Bayesian network, what is the marginal probability over the joint probability distribution?
- An aggregation query in a DBMS asks: given individual tables and a join graph, what is the aggregation result over the joined table?
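As a minimal sketch of this mapping, the toy example below computes SUM(x) over a three-way join in two ways: by materializing the join, and by eliminating one variable at a time, in the spirit of variable elimination. The relations `R`, `S`, `T`, their schemas, and the function names are illustrative assumptions, not part of the actual system.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy relations, stored as lists of tuples.
R = [(1, 10), (2, 10), (3, 20)]            # R(a, b)
S = [(10, 100), (10, 200), (20, 300)]      # S(b, c)
T = [(100, 5.0), (200, 7.0), (300, 1.0)]   # T(c, x)


def naive_sum(R, S, T):
    """Materialize R JOIN S JOIN T row by row, then aggregate SUM(x)."""
    total = 0.0
    for (a, b), (b2, c), (c2, x) in product(R, S, T):
        if b == b2 and c == c2:
            total += x
    return total


def factorized_sum(R, S, T):
    """Push the aggregation through the joins, one variable at a time.

    Each step marginalizes out a variable and passes a small "message"
    to the next relation, so the full join is never materialized.
    """
    # Eliminate x: for each c, the sum of x in T.
    msg_c = defaultdict(float)
    for c, x in T:
        msg_c[c] += x

    # Eliminate c: for each b, the sum of messages from matching S rows.
    msg_b = defaultdict(float)
    for b, c in S:
        msg_b[b] += msg_c.get(c, 0.0)

    # Eliminate a and b: each R row contributes the message for its b.
    return sum(msg_b.get(b, 0.0) for _a, b in R)


print(naive_sum(R, S, T), factorized_sum(R, S, T))
```

Both functions return the same total here, but the second one only touches each relation once. The elimination order (x, then c, then b) is chosen by hand for this chain-shaped join; choosing a good order for larger join graphs is where ideas such as greedy ordering come in.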
Calibrated Junction Hypertree: Data Structure for Exploratory Queries over Join Result. Zezhou Huang, Eugene Wu, in preparation. Intro
Greedy Algorithm for Marginalization Ordering of Hypertree Decompositions. Zezhou Huang, Eugene Wu, in preparation. Intro
Reptile: Aggregation-level Explanations for Hierarchical Data. Zezhou Huang, Eugene Wu, submitted to SIGMOD. arXiv
Spatial and hedonic analysis of housing prices in Shanghai. Zezhou Huang, Ruishan Chen, Di Xu, and Wei Zhou. Habitat International 67 (2017): 69-78. PDF
Aggregation-level Explanations for Hierarchical Data
An explanation system for hierarchical data.
Managed Storage Hierarchy in WiscKey
Combine WiscKey with the range-query performance advantages of LevelDB by storing frequently range-queried data directly in the LSM tree
Python Based Labeler
Build a Python-based GUI for manual labeling of candidate pairs
Micro Cloud Labeler
Build a cloud-based labeler that can be deployed on a remote web server
Python Missing Value Detector
Build the Python Missing Value Detector
Distributed Deep Neural Inspector
Build the Distributed Deep Neural Inspector