Real-Time Big Data Analytics: From Deployment to Production

Presentation by David Smith (VP Marketing and Community, Revolution Analytics) at Strata Santa Clara, February 26, 2013. More at

Talk overview:

Putting data science into action requires deploying statistical models into production environments, usually with real-time processing requirements. Every company that relies on predictive models to drive its applications and operations has a different process for model deployment, but by working with many such companies I've seen a common pattern emerge. The real-time model deployment process can be broken down into these five stages:

  • Data distillation
  • Model development
  • Model validation and deployment
  • Model refresh
  • Real-time model scoring

In this talk, I'll describe the five stages of real-time analytics deployment and the technologies supporting each stage, including Hadoop, R, and database warehousing systems. I'll share some best practices for setting up the technology stack and processes for model deployment, based on real-life case studies.
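
As a rough illustration of how the five stages might fit together, here is a minimal Python sketch. It is not taken from the talk: the function names, the record fields (`amount`, `label`), the toy threshold model, and the 0.8 accuracy cutoff are all assumptions made for the example. In practice the distillation step might run on Hadoop and the model might be built in R.

```python
from statistics import mean

# All names, fields, and thresholds below are hypothetical illustrations.

def distill(raw_events):
    """Stage 1, data distillation: keep only complete (feature, label) pairs."""
    return [(e["amount"], e["label"]) for e in raw_events
            if e.get("amount") is not None and e.get("label") is not None]

def develop(rows):
    """Stage 2, model development: a toy model that places a threshold
    halfway between the mean feature value of each class."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return {"threshold": (mean(pos) + mean(neg)) / 2}

def score(model, amount):
    """Stage 5, real-time scoring: classify one incoming record."""
    return 1 if amount > model["threshold"] else 0

def validate(model, holdout, min_accuracy=0.8):
    """Stage 3, validation: approve deployment only if holdout accuracy
    reaches the (assumed) minimum."""
    correct = sum(score(model, x) == y for x, y in holdout)
    return correct / len(holdout) >= min_accuracy

def refresh(current_model, new_rows, holdout):
    """Stage 4, model refresh: refit on newer data, but keep the current
    model if the candidate fails validation."""
    candidate = develop(new_rows)
    return candidate if validate(candidate, holdout) else current_model

raw = [
    {"amount": 10, "label": 0},
    {"amount": 12, "label": 0},
    {"amount": 90, "label": 1},
    {"amount": 110, "label": 1},
    {"amount": None, "label": 1},  # incomplete record, dropped by distill
]
rows = distill(raw)
model = develop(rows)
holdout = [(8, 0), (95, 1)]
deployed = model if validate(model, holdout) else None
```

Once a model passes validation, `score(deployed, amount)` can be called per incoming record, while `refresh` runs on a slower schedule as new data arrives. Separating the two reflects the pattern in the list above: model building happens offline, and only the lightweight scoring step sits on the real-time path.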