Fully featured, commercially supported machine learning suites that can build Decision Trees in Hadoop are few and far between. Addressing this gap, Revolution Analytics recently enhanced its entire scalable analytics suite to run in Hadoop. In this talk, I will explain how our Decision Tree implementation exploits recent research reducing the computational complexity of decision tree estimation, allowing linear scalability with data size and number of nodes.
Revolution Free Webinars
In this webinar we will demonstrate the value of advanced analytics by computing price volatility and working through an option pricing business case. Using inputs from U.S. equity and options markets, we measure and calculate volatility, Option Greeks and theoretical option values. Data analysis reveals distinctive observations, patterns and the possibility of predicting future values. Precise analytics, and the exploitation of price deviations between theoretical option values and the market, are at the core of modern option pricing and trading.
Intelligent marketing requires intelligent analytics. Building, validating and deploying customer behavior models has, historically, been a long, complicated and resource-intensive process. With the volume and velocity of data continuing to grow, marketers have long been playing catch-up, waiting months for data scientists and IT departments to develop and deploy models and processes that are already out of date by the time they reach production.
Predictive analytics is on the rise and there are critical power users within most organizations that are tasked with supporting the business. However, traditional solutions make it impossible for them to scale the value of predictive analytics to all the needs of their organization. Alteryx and Revolution Analytics combine the easy-to-use Alteryx analytics platform with the production-ready Revolution R Enterprise solution, delivering a deployment platform for Big Data and predictive analytics across the organization.
Big Data Analytics just got easier for Teradata users. Revolution Analytics and Teradata will present joint solutions for maximizing the value of Big Data by running R analytics inside the Teradata database. Come hear how our two companies are shattering past limitations and how leading companies are:
The Revolution R Enterprise Big Data Big Analytics (BDBA) platform moves organizations beyond the status quo, expediting discovery, accelerating growth and sharpening operations.
Companies are doing an ever-better job of collecting data that explains why consumers behave the way they do. These diverse data sets cause us to rethink some of the workhorse algorithms for data analysis. Specifically, the traditional binary response model leaves much room for improvement in how it handles time: cross-sectional models let much of this rich data fall through the cracks.
The business cases for Hadoop can be made on the tremendous operational cost savings that it affords. But why stop there? The integration of R-powered analytics in Hadoop presents a totally new value proposition. Organizations can write R code and deploy it natively in Hadoop without data movement or the need to write their own MapReduce. Bringing R-powered predictive analytics into Hadoop will accelerate Hadoop’s value to organizations by allowing them to break through performance and scalability challenges and solve new analytic problems.
Hadoop, being a disruptive data processing framework, has made a large impact in the data ecosystems of today. Enabling business users to translate existing skills to Hadoop is necessary to encourage the adoption and allow businesses to get value out of their Hadoop investment quickly. R, being a prolific and rapidly growing data analysis language, now has a place in the Hadoop ecosystem.
We will cover:
Hadoop is rapidly being adopted as a major platform for storing and managing massive amounts of data, and for computing descriptive and query types of analytics on that data. However, it has a reputation for not being a suitable environment for high performance complex iterative algorithms such as logistic regression, generalized linear models, and decision trees.
A central question in advertising is how to measure the effectiveness of different ad campaigns. In online advertising, including social media, it is possible to create thousands of different variations on an ad, and serve millions of impressions to targeted audiences each day. Too often, digital advertisers use the last-click attribution model to evaluate the success of campaigns. In other words, when a user clicks on an ad impression, only the very last event is deemed significant. This is convenient, but it doesn't support good marketing decisions.
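The contrast between last-click and multi-touch attribution can be sketched in a few lines of R. This is a toy illustration only; the channel names and customer journeys below are invented, and real attribution models (as discussed in the talk) are far richer:

```r
# Toy clickstream: each element is one converting user's ordered ad
# touchpoints. Channels and journeys are hypothetical.
journeys <- list(
  c("display", "social", "search"),
  c("social", "search"),
  c("display", "display", "social")
)

# Last-click attribution: all credit for a conversion goes to the
# final touchpoint before the conversion.
last_click <- table(sapply(journeys, tail, n = 1))

# Linear multi-touch attribution: one conversion's credit is split
# evenly across every touchpoint on the path.
channels <- c("display", "search", "social")
linear <- Reduce(`+`, lapply(journeys, function(j) {
  credit <- table(factor(j, levels = channels))
  credit / length(j)
}))

last_click  # search and social get all the credit
linear      # display now receives credit for assists
```

Even on this tiny example the two models disagree: last-click gives the "display" channel zero credit, while linear attribution recognizes its role early in the journey.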
Gaming is one of the hottest and most innovative industries and the stakes have never been higher. It has experienced many disruptive factors in recent years that challenge anyone responsible for game development, generating revenue, the customer experience and running the business.
The latest update to Revolution R Enterprise 6.2 is now available to customers. In this 30-minute webinar, David Smith will provide a brief overview of Revolution R Enterprise. Then, Thomas Dinsmore will introduce existing users to the new features of Revolution R Enterprise 6.2.
R users already know why the R language is the lingua franca of statisticians today: because it's the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise.
We at Revolution Analytics are often asked “What is the best way to learn R?” While acknowledging that there may be as many effective learning styles as there are people, we have identified three factors that greatly facilitate learning R.
R and Hadoop are changing the way organizations manage and utilize big data. Think Big Analytics and Revolution Analytics are helping clients plan, build, test and implement innovative solutions based on the two technologies that allow clients to analyze data in new ways, exposing new insights for the business. Join us as Jeffrey Breen explains the core technology concepts and illustrates how to utilize R and Revolution Analytics’ RevoR in Hadoop environments.
As the Big Data market has evolved, the focus has shifted from data operations (storage, access and processing of data) to data science (understanding, analyzing and forecasting from data). And as new models are developed, organizations need a process for deploying analytics from research into the production environment.
Revolution R Enterprise 6.1 includes two important advances in high performance predictive analytics with R: (1) big data decision trees, and (2) the ability to easily extract and perform predictive analytics on data stored in the Hadoop Distributed File System (HDFS).
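Revolution's big-data decision trees (rxDTree) are part of the commercial product, but the idea can be previewed at small scale with the open-source rpart package that ships with standard R distributions. This is a stand-in sketch, not the Revolution R Enterprise API; rxDTree exposes a similar formula interface while scaling to out-of-memory data and HDFS:

```r
library(rpart)  # recommended package bundled with R

# Fit a classification tree on the built-in iris data set.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict classes for the training data and check in-sample accuracy.
pred <- predict(fit, iris, type = "class")
mean(pred == iris$Species)
```

The same formula-driven workflow carries over to the big-data implementation; the difference is where the computation runs and how much data it can touch.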
Statistical analysis has been invaluable to manufacturing quality assurance for decades. More recently, valid statistical analysis has also been shown to radically improve a company’s ability to weather extreme peaks and valleys in customer demand. John Deere has adjusted to commodity spikes and housing downturns much better than its competitors, in part due to its implementation of statistical analysis and the use of R software in the order fulfillment function.
The reason why Big Data is important is because we want to use it to make sense of our world. It’s tempting to think there’s some “magic bullet” for analyzing big data, but simple “data distillation” often isn’t enough, and unsupervised machine-learning systems can be dangerous. (Like, bringing-down-the-entire-financial-system dangerous.) Data Science is the key to unlocking insight from Big Data: by combining computer science skills with statistical analysis and a deep understanding of the data and problem we can not only make better predictions, but also fill in gaps in our knowledge, and even find answers to questions we hadn’t even thought of yet.
In this talk, David will
Under the Basel II Accord, financial institutions are required for the first time to determine capital requirements for a new class of risk – operational risk. Large and internally active banks are required to estimate operational risk exposure using the Advanced Measurement Approach (AMA), which relies on advanced empirical models. As banks continue to develop and enhance their own AMA models for operational risk measurement, they are increasingly utilizing R to perform various modeling tasks.
Everything happens somewhere and spatial analysis attempts to use location as an explanatory variable. Such analysis is made complex by the very many ways we habitually record spatial location, the complexity of spatial data structures, and the wide variety of possible domain-driven questions we might ask. One option is to develop and use software for specific types of spatial data, another is to use a purpose-built geographical information system (GIS), but determined work by R enthusiasts has resulted in a multiplicity of packages in the R environment that can also be used.
The Institute for Statistics Education at Statistics.com offers a graduate-level certificate program in R for those who want to use the R statistical programming environment for statistical analysis, visualization and modeling.
RHadoop is an open source project spearheaded by Revolution Analytics to grant data scientists access to Hadoop’s scalability from their favorite language, R. It allows users to write general MapReduce programs, offering the full power and ecosystem of an existing, established programming language.
Smart retailers are using advanced revenue attribution and customer-level response modeling to optimize their marketing spends. This new technique pioneered by Upstream Software employs survival analysis on retail data, with a strong emphasis on time to event modeling. By attending this session you will gain insight into the changing world of retail analytics.
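Time-to-event modeling of the kind described here can be sketched with the survival package that is bundled with most R installations. The data below are invented for illustration (Upstream Software's actual models are proprietary): each observation is the number of days until a customer's next purchase, with some customers censored because no purchase has been seen yet:

```r
library(survival)  # recommended package bundled with R

# Hypothetical retail data: days until next purchase, with censoring.
days  <- c(5, 12, 20, 20, 33, 40, 45, 60)
event <- c(1, 1, 0, 1, 1, 0, 1, 0)   # 1 = purchased, 0 = censored

# Kaplan-Meier estimate of the "no repurchase yet" curve.
km <- survfit(Surv(days, event) ~ 1)
summary(km)
```

Censoring is the key point: a customer who simply hasn't purchased again yet still contributes information, which a naive binary-response model would discard.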
With data analysis showing up in domains as varied as baseball, evidence-based medicine, predicting recidivism and child support lapses, judging wine quality, credit scoring, supermarket scanner data analysis, and “genius” recommendation engines, “business analytics” is part of the zeitgeist. This is a good moment for actuaries to remember that their discipline is arguably the first – and a quarter of a millennium old – example of business analytics at work.
Data scientists sometimes lament, "Why can't I get anyone to use my predictions?" Great models that make accurate predictions are sometimes disconnected from organizational decision-making. This hurts the business and reduces the data scientists’ perceived value within the organization. But it doesn't have to be this way.
Dr. Sanjiv Das has held positions at Citibank, as a professor at Harvard University, and as Program Director at the FDIC’s Center for Financial Research. His research relies heavily on R for analysis and decision-making. In this webinar, Dr. Das will present a mix of his more current and topical research that uses R-based models, along with some pedagogical applications of R.
Everyone involved in high-stakes analytics wants power, speed and flexibility regardless of the size of the data set and complexity of the analysis. Trailblazing organizations that have deployed IBM Netezza Analytics with their IBM Netezza data warehouse appliances (TwinFin) with Revolution R Enterprise are getting all three.
ggplot2 is one of R’s most popular, widely used packages, developed by Rice University’s Hadley Wickham. ggplot2’s exploratory graphics capabilities are driving the use of R as a complement to legacy analytics tools such as SAS. SAS is well-regarded for its strength in data management and "production" statistics, where you know what you want to do and need to do it repeatedly. On the other hand, R is strong in data analysis and exploration in situations where figuring out what is needed is the biggest challenge. In this important way, SAS and R are strong companions.
Join us for a 30-minute executive Webinar to find out how companies of all types and sizes can integrate “R” into their “big data” analytics infrastructure strategy.
For the past several decades the rising tide of technology -- especially the increasing speed of single processors -- has allowed the same data analysis code to run faster and on bigger data sets. That happy era is ending. The size of data sets is increasing much more rapidly than the speed of single cores, of I/O, and of RAM. To deal with this, we need software that can use multiple cores, multiple hard drives, and multiple computers. That is, we need scalable data analysis software.
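The "use multiple cores" half of this argument is already easy to demonstrate with the parallel package included in base R. A minimal sketch (the workload here is an arbitrary stand-in for a real analysis):

```r
library(parallel)  # part of base R since 2.14

# A stand-in for an expensive per-task computation.
f <- function(i) mean(rnorm(1e4))

# Serial version.
serial <- lapply(1:8, f)

# Fork-based parallel version. mclapply with mc.cores > 1 is not
# supported on Windows, so fall back to one core there; parLapply
# with makeCluster() works on all platforms.
cores <- if (.Platform$OS.type == "windows") 1 else 2
par_res <- mclapply(1:8, f, mc.cores = cores)

length(par_res)
```

Scaling beyond one machine (multiple hard drives, multiple computers) is where this simple model stops and external-memory, distributed software of the kind discussed in the talk takes over.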
Hong Ooi’s analysis supports bottom-line-impacting decisions made by a wide spectrum of groups at Australia and New Zealand Banking Group (ANZ). He has broad experience with both SAS and R, and depends on R for the bulk of his analysis. In this webinar, he will discuss his challenges, how he’s using R along with SAS and Excel to overcome them in areas such as:
R is free software for data analysis and graphics that is similar to SAS and SPSS. Two million people are part of the R Open Source Community. Its use is growing very rapidly and Revolution Analytics distributes a commercial version of R that adds capabilities that are not available in the Open Source version. This 60-minute webinar is for people who are familiar with SAS or SPSS who want to know how R can strengthen their analytics strategy.
This webcast is for statisticians, analysts and IT teams responsible for big data analytics looking for ways to achieve greater innovation and leapfrog current performance. On May 6, 2010, at 2:45 PM, the Dow Jones Industrial Average plummeted approximately 900 points and rebounded within a matter of minutes. This temporary disappearance of $1 trillion in market value would later become known as the Flash Crash and prompt hearings by the U.S. Congressional House Subcommittee on Capital Markets, Insurance, and Government Sponsored Enterprises. As a result of those hearings, the Financial Industry Regulatory Authority (FINRA) instituted rules to regulate trading in the event of a precipitous drop in stock price.
Traditional IT infrastructure is simply unable to meet the demands of the new “Big Data Analytics” landscape. Many enterprises are turning to the “R” statistical programming language and Hadoop (both open source projects) as a potential solution. This webinar will introduce the statistical capabilities of R within the Hadoop ecosystem.
You’ve heard about “R”; now learn what it can do for you. Join us for a 30-minute executive Webinar to find out how companies of all types and sizes can integrate “R” into their modern, enterprise analytics infrastructure strategy.
The rule in the past was that whenever a predictive model was built in a particular development environment, it remained in that environment forever, unless it was manually recoded to work somewhere else. This rule has been shattered with the advent of PMML (Predictive Modeling Markup Language). By providing a uniform standard to represent predictive models, PMML allows for the exchange of predictive solutions between different applications and various vendors.
R, the most powerful statistical language in the world, is ideally suited for creating custom data analysis, statistical models, and data visualizations. But how can application developers make the results of these dynamic R-based computations easily accessible to business users accustomed to spreadsheets like Microsoft Excel or business intelligence tools such as Jaspersoft?
Revolution Analytics is proud to present a new webinar from some of the leading researchers in portfolio design: Diethelm Würtz and Mahendra Mehta for the Rmetrics Association. This webinar will give an overview on current and recent developments and tools for portfolio design, optimization and stability analysis with the R/Rmetrics software environment.
R is a popular and powerful system for creating custom data analysis, statistical models, and data visualizations. But how can you make the results of these R-based computations easily accessible to others? A PhD statistician could use R directly to run the forecasting model on the latest sales data, and email a report on request, but then the process is just going to have to be repeated again next month, even if the model hasn't changed. Wouldn't it be better to empower the Sales manager to run the model on demand from within the BI application she already uses—daily, even!—and free up the statistician to build newer, better models for others?
The R language is well-established as the modern language for predictive analytics. However, given the deluge of data that must be processed and analyzed today, some organizations have been reluctant to deploy R beyond research into production applications. Additionally, R's in-memory design offers great flexibility, but can be limiting when processing multi-gigabyte or terabyte-class datasets.
Business intelligence is about providing reporting and analysis solutions that show business users what happened and why. On the other hand, advanced analytics solutions deliver deeper insight into what might happen in the future, based upon high volumes of historical data and sophisticated modeling techniques. Traditionally, advanced analytics has been reserved for a highly technical audience in fields such as life sciences and academia. However, with the explosion in data in nearly every sector, advanced analytics is now becoming useful to more mainstream business users. This webinar introduces commercial open source business intelligence solutions from Jaspersoft, advanced analytics solutions with the popular open source project R and Revolution Analytics, and how those separate products mesh together in a demonstration by OpenBI, the expert commercial open source system integrator.
If you analyze data for a living, you've probably heard the buzz about the open-source statistical language R in major articles in the New York Times, Forbes and InformationWeek. Business data analysts are now discovering what academia and research statisticians already knew: R's flexibility and power make it simple to do more with your data in a fraction of the time and cost.
Statistical data analysis is a key part of the operations of just about every business today. But as data sets get larger, analyzing trends or generating predictions becomes more and more of a challenge. If you're doing predictive modeling today and find that you can no longer use all of your data because of size limitations, or the computations are taking too long for you to take action on the results, then the parallel-processing capabilities of Windows HPC Server can help.
The R language is quickly evolving from an open-source academic research tool into a commercial application for industrial use. And as R programs become more and more complex, there is an increasing need for developer features that increase productivity and improve performance. Discover how easy it is to increase your productivity with the new R Integrated Development Environment for Windows.
If you analyze data for a living, you've probably heard the buzz about open-source R in major articles in the New York Times and a new "animal" book from O'Reilly. Business data analysts are now discovering what academia and research statisticians already knew: R's flexibility and power make it simple to do more with your data in a fraction of the time and cost.
If you're currently using SAS, SPSS, or Excel to analyze your data, join our webinar and learn how to: