About
I'm Bing, a Machine Learning Engineer specializing in taking models from research to production.
My work sits at the intersection of ML engineering, infrastructure, and systems design. I've built and operated inference platforms handling millions of requests, designed training pipelines for large-scale models, and implemented MLOps practices that keep systems reliable.
I care about the details that matter: reducing p99 latency, building robust monitoring, designing APIs that scale, and writing code that other engineers can maintain.
Inference & Serving
Low-latency GPU inference, dynamic batching, model optimization, autoscaling
Training & Pipelines
Distributed training, feature engineering, data pipelines, experiment tracking
MLOps & Reliability
CI/CD for ML, monitoring, A/B testing, gradual rollouts, incident response