Revolution R Enterprise for Big Data Analysis and Predictive Analytics
Revolution Analytics has taken the popular R language to unprecedented new levels of capacity and performance for statistical analysis of very large data sets. Using the built-in RevoScaleR package, R users can process, visualize and model terabyte-class data sets in a fraction of the time of legacy products – without requiring expensive or specialized hardware.
RevoScaleR: Big-Data Statistical Analysis with Revolution R Enterprise
Import “Big Data”: Import your largest data sets from ASCII, SAS, SPSS, relational databases or data warehouses into R, without being constrained by memory limitations.
Powerful “Data Step”: Use the power of the R language to select records, transform variables, and sort and merge data. Thanks to scalable, out-of-memory parallel processing, there’s no need to leave the Revolution R environment to quickly prepare Big Data for analysis in R.
- White Paper: Revolution R Enterprise Data Step
- White Paper: Big Data Analysis with Revolution R Enterprise
Big-Data statistical algorithms make use of all available computing resources for high-performance analysis, without data size limitations. Revolution R Enterprise includes distributed, multi-threaded implementations of the following algorithms, with more planned for future updates:
Descriptive Statistics and Cross Tabs on very large data sets
- Basic summary statistics of data
- Quantile approximations
- Cross Tabulations (standard tables and long form)
- Pairwise cross tabulations
- Marginal summaries of cross-tabulations
- Chi-squared Test and Fisher's Exact Test
- Kendall's Tau Rank Correlation Coefficient
- Risk Ratio and Odds Ratio on two-by-two tables
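To illustrate how summary statistics can be computed without holding a whole data set in memory, here is a minimal Python sketch of a chunk-wise mean and variance. It is not RevoScaleR code and does not claim to show how Revolution R Enterprise implements these statistics; the `RunningStats` class and its methods are hypothetical names introduced for this example. It uses Welford's update within a chunk and the standard pairwise-merge formula to combine partial results from separate chunks or workers.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Hypothetical helper: count, mean and sum of squared deviations (m2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update_chunk(self, chunk: Iterable[float]) -> None:
        # Welford's update: only one value is examined at a time,
        # so the chunk never needs to be combined with earlier data.
        for x in chunk:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial results, e.g. from two chunks or two workers.
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize(chunks: Iterable[Iterable[float]]) -> RunningStats:
    """Process each chunk independently, then merge the partial results."""
    total = RunningStats()
    for chunk in chunks:
        part = RunningStats()
        part.update_chunk(chunk)
        total = total.merge(part)
    return total
```

Because each chunk produces a small, mergeable summary, the same pattern works whether the chunks are read sequentially from disk or processed on separate cores or nodes.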
Statistical Modeling on very large data sets
- Multiple Regression
- Stepwise Regression
- Covariance and correlation matrices
- Sum of squares and cross-product matrices
- Generalized Linear Models
  - All Exponential Family Distributions including
    - Binomial
    - Gamma
    - Gaussian
    - Inverse Gaussian
    - Poisson
  - Standard Link Functions including
    - Cauchit
    - Identity
    - Log
    - Logit
    - Probit
  - Tweedie Distributions
  - User-defined distributions and link functions
- Receiver Operating Characteristic (ROC) computations
- Predictions for fitted models
- K-Means Clustering
- Classification and Regression Trees
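The list above mentions sum of squares and cross-product matrices alongside regression. As a rough illustration of why those quantities matter for large data, the following Python sketch fits a one-predictor linear regression by accumulating sums and cross-products chunk by chunk, so no more than one chunk is in memory at a time. This is a sketch of the general technique only, not the RevoScaleR implementation; the function name `fit_line_from_chunks` and the `(x, y)` row layout are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Chunk = List[Tuple[float, float]]  # assumed layout: one (x, y) pair per row


def fit_line_from_chunks(chunks: Iterable[Chunk]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = 0
    sx = sy = sxx = sxy = 0.0
    for chunk in chunks:
        # Each chunk only contributes to five running totals,
        # so memory use does not grow with the number of rows.
        for x, y in chunk:
            n += 1
            sx += x
            sy += y
            sxx += x * x
            sxy += x * y

    denom = n * sxx - sx * sx
    if n < 2 or denom == 0:
        raise ValueError("need at least two rows with distinct x values")

    # Closed-form solution of the normal equations for one predictor.
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope
```

Raw sums like these are simple to combine across chunks but can lose precision when values are large relative to their spread; a production implementation would typically center or otherwise stabilize the accumulation.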
Out-of-memory, multi-threaded algorithms in Revolution R Enterprise are faster and more scalable than corresponding functions in open-source R.
Learn more in these video demonstrations:
- Video demo: High-Performance GLM with R: An auto insurance example
- Video demo: RevoScaleR demo: Old Wives
Distributed Computing for clusters, grids and the Cloud. Deploy the power of a Windows-based Microsoft HPC Server cluster, or a Linux-based grid managed with Platform LSF. Revolution R Enterprise Server makes it easy to cut down the computation time for Big Data analytics simply by scaling with compute nodes. And with Microsoft HPC Server, you can seamlessly transition computations from local resources to the Azure Cloud.
- Video demo: Distributed Data Analysis: A Billion Row Logistic Regression
- Video demo: Cloud Computing with Revolution R Enterprise 6 and Azure Burst
In-Database Analytics: When data locality is critical, bring Revolution R to your data for massively scalable analytics. Use RevoConnectR for Hadoop to distribute R computations across Hadoop nodes with the power of Map-Reduce. Or use Revolution R Enterprise for IBM Netezza for in-database analytics using the power of the IBM Netezza data warehouse appliance.
- Revolution R Enterprise with RevoConnectR for Hadoop
- Revolution R Enterprise for IBM PureData System for Analytics
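For readers unfamiliar with the Map-Reduce pattern mentioned above, the following Python sketch shows the idea on a small cross-tabulation: each data partition is counted independently (the map step) and the partial counts are then summed (the reduce step). It runs in a single process and does not use Hadoop or the RevoConnectR API; the function names and the two-column row layout are hypothetical choices for this illustration.

```python
from collections import Counter
from functools import reduce
from typing import Iterable, List, Tuple

Row = Tuple[str, str]  # assumed layout: (row factor level, column factor level)


def map_partition(rows: Iterable[Row]) -> Counter:
    """Map step: count each (row level, column level) pair in one partition."""
    return Counter(rows)


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: add two partial cross-tabulations cell by cell."""
    return left + right


def cross_tab(partitions: Iterable[List[Row]]) -> Counter:
    """Cross-tabulate data that is split across several partitions."""
    partial_tables = map(map_partition, partitions)
    return reduce(reduce_counts, partial_tables, Counter())
```

Since the map step touches only its own partition and the reduce step is associative, the partitions could in principle be counted on separate nodes and combined in any order.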
Learn More about Big Data Analysis with Revolution R Enterprise:
General overview:
- On-demand webinar: Big Data Analysis Starts with R
- White paper: RevoScaleR Speed and Scalability
- On-demand webinar: Scalable Data Analysis in R
- White paper: Big Data Analysis with Revolution R Enterprise
Case Studies
- Achieving High-Performing, Simulation-Based Operational Risk Measurement with RevoScaleR: Estimate operational risk and satisfy the capital requirements mandated for banks by the Basel II Accord.
- How Big Data is Changing Retail Marketing Analytics: How smart retailers are using advanced revenue attribution and customer-level response modeling to optimize their marketing spend.
- Actuarial Analytics in R: Advanced analytical modeling in the actuarial and insurance sectors.
- Financial Services Sector: Asia Capital Reinsurance chose RevoDeployR to analyze complex data and generate business insights.
- Complex Data Sets in Genomic Diagnostics Require Multiple Analytic Methods: Complex clinical data from thousands of patients were analyzed and the results leveraged to build diagnostic algorithms.
- High Performance Analytics Improves Productivity for Busy Research Institution: Michigan State University used Revolution R Enterprise with Microsoft HPC Server 2008 to outperform alternatives.
- Leading Research Center Speeds up Analysis and Simplifies Complex Analysis on Very Large Data Sets: Multiple sclerosis (MS) researchers at State University of New York (SUNY) at Buffalo crunched through immense data sets to build analytical models.
- UpStream Software’s Big Data Analytics Platform for Marketing Optimization Helps Clients Understand Buying Behavior and Improve Customer Targeting: Revolution R Enterprise and Hadoop deployed to a real-time production environment.
- [x+1] Completes Next-Generation POE; Its Origin Enterprise Data Management Platform for Automated, Big Data-Driven Marketing Optimization: [x+1] analysts satisfied the need for real-time analytics and automated model updates without sacrificing performance.

