I am a Data Science Tech Lead at Wayfair, leading the competitor matching team.
My projects involve solving business problems and building scalable ML pipelines.
I received a Ph.D. in Applied Mathematics from Brown University; my advisor was Prof. Stuart Geman.
My background is in probability, statistics, and some mathematics.
I enjoy building and repairing things, from algorithms to carpentry.
We have a simple algorithm for data visualization and dimension reduction. The left plot shows the original data points, which roughly follow a spiral structure. The right plot shows our visualization of the spiral data. The same algorithm can also be used for data reduction.
We built an algorithm that finds “linear structures” (like the lines in the right plot) to summarize data. We characterize a property of the minimizer of our loss function and built an algorithm that approximates it really well.
We developed a lossless image compression algorithm with an analytic guarantee. As the context size grows, our algorithm reaches the optimal compression rate a.e. under very light assumptions. In practice, we do not need a large context to outperform the CALIC algorithm, one of the best schemes so far.
This is a deep learning (LSTM) model built with the Keras module in Python. The model generates text in the style of Shakespeare. The tutorial helps readers quickly understand Keras and the LSTM model.
This tutorial describes deep learning (CNN) models for learning MNIST digits. The Keras module lets users quickly set up neural network structures without too much hassle. It is easy to reach about 99% accuracy.
Detecting spam in SMS text messages with an NLP model. In this post, I compare 7 machine learning tools.
Predicting New York City taxi trip durations with a Gradient Boosting Machine. The data consists of 1.5 million samples with features such as pickup and dropoff coordinates, the hour of pickup, etc.