This repository has been archived by the owner on May 22, 2024. It is now read-only.
Qian, Hai edited this page Jul 22, 2013 · 25 revisions

This page describes the master branch only.

Newest features in master

  • A graphical user interface based on the shiny package

  • sample with or without replacement

  • generic.cv for cross-validation

  • generic.bagging for bagging

  • crossprod for multiplication of two matrices

  • Full support for columns with array values


PivotalR is an R package that you can download from CRAN. The GitHub repository has the latest code, which offers more functionality but is less stable.

GUI

Table's summary view

Linear regression

Big Data

PivotalR is an R front-end to PostgreSQL and PostgreSQL-like databases such as Pivotal Inc.'s Greenplum Database (GPDB) and Pivotal HD / HAWQ.

When running on Pivotal Inc.'s products, PivotalR uses the parallel computation and distributed storage built into Greenplum Database or HAWQ, and thus gives the ordinary R user access to Big Data.

Machine Learning

PivotalR also provides an R wrapper for MADlib. MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning algorithms for structured and unstructured data. The number of machine learning algorithms that MADlib covers is increasing quickly.

Easily Exploring Data Using the Familiar R Syntax

PivotalR mimics the regular R syntax for manipulating a data.frame, so the same syntax can operate on tables stored in a database. We strive to make PivotalR's learning curve as smooth as possible.

PivotalR also brings R's powerful graphics functionality to Big Data stored in a database or in Hadoop.

Quick Prototype of Data-Parallel Machine Learning Algorithms

PivotalR enables the user to create prototypes of machine learning algorithms quickly using the regular R syntax. These prototypes acquire parallel computation power automatically when running on GPDB or HAWQ.

To convert an existing analysis, first copy your R script and then make the changes needed for it to run in PivotalR. The goal of PivotalR is to minimize the number of changes needed to convert a normal R script into a parallel R script.
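As a rough illustration of such a conversion, the sketch below fits the same linear model twice: once with plain R on an in-memory data.frame, and once through PivotalR on a database table. It assumes a reachable PostgreSQL, GPDB or HAWQ instance with MADlib installed. The host, port, database name and the table name abalone_demo are placeholders, not values taken from this page.

```r
# Sketch only: requires a running database with MADlib installed.
library(PivotalR)

# Plain R: fit on an in-memory data.frame.
# The abalone data set ships with the PivotalR package.
fit_local <- lm(rings ~ length + diameter, data = abalone)

# PivotalR: connect to the database.
# host, port and dbname are placeholders (assumptions); adjust to your setup.
cid <- db.connect(host = "localhost", port = 5432, dbname = "mydb")

# Copy the data.frame into a table ("abalone_demo" is a hypothetical name)
# and get back a db.data.frame object that points at that table.
x <- as.db.data.frame(abalone, "abalone_demo", conn.id = cid)

# Familiar data.frame-style inspection, evaluated against the table.
dim(x)
names(x)
lk(x, 10)   # pull the first 10 rows into R for a quick look

# Same formula as above; the fit is computed in the database by MADlib.
fit_db <- madlib.lm(rings ~ length + diameter, data = x)
fit_db

db.disconnect(cid)
```

The only changes from the plain R version are the connection setup, wrapping the data as a db.data.frame, and calling madlib.lm in place of lm. The formula itself is unchanged.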

See here for some examples.

Minimizing the data flow between R and databases

PivotalR class hierarchy structure

The demo script of PivotalR
