About

I'm Bing, a Machine Learning Engineer specializing in taking models from research to production.

My work sits at the intersection of ML engineering, infrastructure, and systems design. I've built and operated inference platforms handling millions of requests, designed training pipelines for large-scale models, and implemented MLOps practices that keep systems reliable.

I care about the details that matter: reducing p99 latency, building robust monitoring, designing APIs that scale, and writing code that other engineers can maintain.

Inference & Serving

Low-latency GPU inference, dynamic batching, model optimization, autoscaling

Training & Pipelines

Distributed training, feature engineering, data pipelines, experiment tracking

MLOps & Reliability

CI/CD for ML, monitoring, A/B testing, gradual rollouts, incident response