Decision Trees built in Hadoop plus more Big Data Analytics with Revolution R Enterprise


Presenters:

Mario Inchiosa, US Chief Scientist, Revolution Analytics

Few fully featured, commercially supported machine learning suites can build Decision Trees in Hadoop. To address this gap, Revolution Analytics recently enhanced its entire scalable analytics suite to run in Hadoop. In this talk, I will explain how our Decision Tree implementation exploits recent research that reduces the computational complexity of decision tree estimation, allowing linear scalability with data size and number of nodes. Because this streaming algorithm processes data in chunks, it scales without being constrained by aggregate cluster memory. The implementation supports both classification and regression and is fully integrated with the R statistical language, the rest of our advanced analytics and machine learning algorithms, and our interactive Decision Tree visualizer.
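To make the chunked, streaming idea concrete, here is a minimal Python sketch of how a single regression split could be chosen from per-bin summary statistics gathered one chunk at a time. It is an illustration only and not the Revolution R Enterprise implementation: the function names (accumulate, best_split), the fixed bin edges, and the squared-error criterion are all assumptions made for this example. The point it shows is that memory use depends on the number of bins rather than the number of rows, and that the per-bin statistics are simple sums, so they could be computed separately per chunk or per node and then added together.

```python
from bisect import bisect_right

def accumulate(chunks, edges):
    """One streaming pass: per-bin [count, sum(y), sum(y^2)]. Hypothetical helper."""
    stats = [[0, 0.0, 0.0] for _ in range(len(edges) + 1)]
    for chunk in chunks:                # only one chunk is held in memory at a time
        for x, y in chunk:
            b = bisect_right(edges, x)  # bin index = number of edges <= x
            stats[b][0] += 1
            stats[b][1] += y
            stats[b][2] += y * y
    return stats

def sse(n, s, ss):
    """Sum of squared errors around the mean, from count, sum, and sum of squares."""
    return ss - s * s / n if n else 0.0

def best_split(stats, edges):
    """Scan candidate thresholds (the bin edges) and return (edge, cost).

    A split at `edge` sends rows with x < edge to the left child.
    """
    total_n = sum(s[0] for s in stats)
    total_s = sum(s[1] for s in stats)
    total_ss = sum(s[2] for s in stats)
    best_edge, best_cost = None, float("inf")
    ln, ls, lss = 0, 0.0, 0.0
    for i, edge in enumerate(edges):
        n, s, ss = stats[i]
        ln, ls, lss = ln + n, ls + s, lss + ss
        rn = total_n - ln
        if ln == 0 or rn == 0:          # skip splits that leave one side empty
            continue
        cost = sse(ln, ls, lss) + sse(rn, total_s - ls, total_ss - lss)
        if cost < best_cost:
            best_edge, best_cost = edge, cost
    return best_edge, best_cost

# Toy data arriving in two chunks of (x, y) rows; the bin edges are assumed known.
chunks = [[(1.0, 10.0), (2.0, 10.0)], [(8.0, 20.0), (9.0, 20.0)]]
edges = [2.5, 5.0, 7.5]
print(best_split(accumulate(chunks, edges), edges))  # (2.5, 0.0)
```

Each row is touched once per pass, so the cost of a pass grows linearly with the number of rows. A full tree builder would repeat this for every feature and every node being grown, and a classification tree would accumulate per-class counts instead of sums of y.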

About the speakers:


Mario Inchiosa’s passion for data science and high performance computing drives his work at Revolution Analytics, where he focuses on delivering parallelized, scalable advanced analytics integrated with the R language. Previously, Mario served as Analytics Architect in IBM’s Big Data organization, working on Social and Machine Data analytics for the BigInsights Hadoop platform. Prior to that, he was US Chief Scientist in Netezza Labs, bringing advanced analytics and R integration to Netezza’s SQL-based data warehouse appliances. Their success led to Netezza’s acquisition by IBM. Mario also served as US Chief Science Officer at NuTech Solutions, a computer science consultancy specializing in simulation, optimization, and data mining, and Senior Scientist at BiosGroup, a complexity science spin-off of the Santa Fe Institute.

Dr. Inchiosa holds Bachelor's, Master's, and PhD degrees from Harvard University. He has been awarded four patents and has published over 30 research papers, earning Publication of the Year and Open Literature Publication Excellence awards.