R + Hadoop = Big Data Analytics.
How Revolution Analytics' RHadoop Project Allows All Developers to Leverage the MapReduce Framework

Presented: Wednesday, May 2, 2012
Presenter: Antonio Piccolboni, Big Data Scientist, Revolution Analytics
 

RHadoop is an open-source project spearheaded by Revolution Analytics to grant data scientists access to Hadoop’s scalability from their favorite language, R. It allows users to write general MapReduce programs, offering the full power and ecosystem of an existing, established programming language. RHadoop comprises three packages:

  1. RHDFS, which provides file-level manipulation for HDFS, the Hadoop file system
  2. RHBASE, which provides access to HBase, the Hadoop database
  3. rmr, which allows you to write MapReduce programs in R

In this webinar, Antonio will provide a brief introduction to Hadoop and R. He will describe how rmr allows R developers to program in the MapReduce framework and gives all developers an alternative way to implement MapReduce programs, one that strikes a careful compromise between power and usability. He’ll explain:

  • Its simplicity: you don’t need to replace the R interpreter with a special run-time, because rmr is just a library
  • How rmr’s handful of functions (with a modest number of arguments and sensible defaults) can be combined in many useful ways
  • The power of the API, illustrated with examples from machine learning and statistics
  • Ways you can contribute to the further development of the RHadoop project

View the presentation on SlideShare.

View the webinar on YouTube.

About the Speaker

Antonio Piccolboni
Data Scientist
Revolution Analytics

Antonio is a data scientist with experience in big data analytics. He is the lead developer of Revolution Analytics’ big data package for R (RHadoop/rmr). With specialties in algorithm design and implementation and in big data analytics, he has experience in information filtering, social network analysis, web analytics, and more. His work has been cited more than 3,500 times in the scientific literature. http://piccolboni.info/