Development Guide
This is a short guide to developing PivotalR wrapper functions for MADlib.
-
Useful Utility Functions in PivotalR
schema.madlib
cbind
db.array
db.data.frame
as.db.data.frame
conn.id
names
delete
content
conn.eql
as.factor
eql
lookat
arraydb.to.arrayr
-
Useful Internal Utility Functions in PivotalR
Call a hidden function from command line for testing
.db.getQuery
.load.func
.suppress.warnings
.restore.warnings
.check.madlib.version
.get.params
.get.res
.is.conn.id.valid
.unique.string
.strip
.get.dbms.str
.madlib.version.number
-
Raise an error
Output a string
Join multiple strings
Exception handling
Set the result class
Check whether an object belongs to a class
Get the command call as a language object
Get the command call as a string
Check whether an argument is missing
Regular expression and gsub
The for loop
Check NULL value
The following functions are exposed to users. For details, refer to the PivotalR user manual.
schema.madlib: Returns the MADlib schema name.
cbind: Combines two db.obj objects.
db.array: Combines multiple columns into a single array column.
db.data.frame: Creates a wrapper of a table.
as.db.data.frame: Creates a copy of a table, data.frame, or file.
conn.id: Returns the connection ID of an object.
names: Returns all the column names.
delete: Deletes all tables related to an object (db.data.frame, table name, ...).
content: Returns the table name or SQL query of a db.obj object.
conn.eql: Tests whether two connection IDs are equal.
as.factor: Converts a column into a categorical variable.
eql: Tests whether two db.obj objects are equal.
lookat: Loads part or all of a table into memory. Try lookat(table_name, "all", array = FALSE) and lookat(table_name, "all") to see the difference.
arraydb.to.arrayr: Can be used together with lookat(table_name, "all", array = FALSE) to parse the results of an execution.
Call a hidden function from the command line for testing:
PivotalR:::.unique.string() # NOTE: three ":" here
The following functions are "hidden" from the user: they are not exported, so by name they can be called only from within a PivotalR function definition, not directly from R's command line. When you are developing wrapper functions for MADlib, however, it helps to experiment with them from the command line, and the ::: form shown above lets you do that.
.db.getQuery: Executes the query string in connection conn.id and returns the result of the SQL query as a data.frame. Prefer .get.res, which adds exception handling, over this function.
.load.func: Loads a SQL function definition from inst/sql/.
.suppress.warnings: Suppresses all warnings and returns the original warning levels.
.restore.warnings: Restores the original warning levels.
.check.madlib.version: Raises an error when the MADlib version is lower than allowed.version.
.get.params: Analyzes a formula and extracts the dependent, independent and grouping variables. Does pivoting if a factor column is specified. Creates an intermediate table for db.Rquery and db.view objects.
.get.res: Similar to .db.getQuery but with exception handling. Returns the execution result.
.is.conn.id.valid: Checks whether conn.id represents a valid existing connection.
.unique.string: Generates a unique string.
.strip: Removes the string rm from the beginning and end of the string str.
.get.dbms.str: Returns the DBMS name (Greenplum, Postgres, or HAWQ).
.madlib.version.number: Returns the MADlib version number of the connected database as a double.
Raise an error:
stop("We have ",
     "an error at line ", 365, "!")
Output a string:
cat("We have ",
    "a string here ", 365, sep = "")
Join multiple strings:
paste("We have", "something at", 365) # default sep = " "
paste("We have ", "something at ", 365, sep = "")
paste0("We have ", "something at ", 365) # same as sep = ""
a <- c(1, 2, 3)
paste(a, "is a", collapse = " + ", sep = "") # "1is a + 2is a + 3is a"
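Wrapper functions spend much of their time assembling SQL text, so the difference between sep and collapse matters in practice. The following is a minimal sketch; the table and column names are hypothetical and not taken from PivotalR or MADlib.

```r
# Hypothetical inputs: a table name and the columns to pack into an array
tbl  <- "my_table"
cols <- c("x1", "x2", "x3")

# collapse joins the elements of ONE vector into a single string
col_list <- paste(cols, collapse = ", ")

# paste0 joins its ARGUMENTS with no separator
sql <- paste0("select array[", col_list, "] as features from ", tbl)

print(sql)
```

Building the column list first and the full statement second keeps each paste call responsible for one kind of joining.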
Exception handling:
res <- try(.db.getQuery(sql, conn.id(x)), silent = TRUE) # returns an error object instead of stopping
if (is(res, .err.class))
    stop("Could not do the summary!")
Note that .get.res already has exception handling built in.
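The same try pattern can be exercised in plain R without a database connection. In this sketch risky_sqrt is a made-up function, and the check uses inherits(res, "try-error"), the class that base R's try() attaches to a failed result; whether .err.class holds exactly that value is an assumption.

```r
# Hypothetical function that fails on negative input
risky_sqrt <- function(x) {
    if (x < 0) stop("negative input")
    sqrt(x)
}

# silent = TRUE suppresses the error message; the failure comes back as a value
res <- try(risky_sqrt(-1), silent = TRUE)
failed <- inherits(res, "try-error")

if (failed) {
    # the original condition object is stored as an attribute of the result
    msg <- conditionMessage(attr(res, "condition"))
    cat("Caught:", msg, "\n")
}

# a successful call simply returns its normal value
ok <- try(risky_sqrt(4), silent = TRUE)
```

Checking the class of the result, instead of letting the error propagate, lets a wrapper replace a low-level database message with one that makes sense to the user.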
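Set the result class:
class(rst) <- "arima.css.madlib"
Check whether an object belongs to a class:
is(x, "db.Rquery")
is(res, .err.class)
is(res, "data.frame")
Get the command call as a language object:
call <- match.call()
Get the command call as a string:
paste(deparse(match.call()), collapse = "\n") # cat() cannot print a language object directly
Check whether an argument is missing:
if (missing(j)) {
    stop("Error")
}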
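A small self-contained sketch of match.call(), deparse() and missing() working together; describe_call is a hypothetical function written only for illustration.

```r
describe_call <- function(i, j) {
    j_missing <- missing(j)                              # TRUE when the caller omitted j
    call <- match.call()                                 # the call as a language object
    call_text <- paste(deparse(call), collapse = "\n")   # the same call as one string
    list(call = call, text = call_text, j_missing = j_missing)
}

res <- describe_call(1 + 2)
cat(res$text, "\n")
```

Note that match.call() records the arguments as they were written and fills in the argument names, so the expression 1 + 2 is kept unevaluated in the stored call.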
Regular expression and gsub:
gsub(pattern, replacement, your.string) # pattern is a regular expression
gsub("\\d+", "digits", "1233535 is the number") # returns "digits is the number", note the double backslashes
Inside an R string literal the backslash must itself be escaped, so write \\ where the regular expression needs \.
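As a further illustration, trimming characters from both ends of a string is a typical gsub job. The helper strip_ends below is hypothetical and only mimics the behavior described for .strip; it is not PivotalR's implementation, and rm is treated here as a regular expression.

```r
strip_ends <- function(str, rm = "\\s") {
    # "^(rm)+" matches a run at the start, "(rm)+$" a run at the end
    pattern <- paste0("^(", rm, ")+|(", rm, ")+$")
    gsub(pattern, "", str)
}

trimmed  <- strip_ends("  select 1  ")        # whitespace removed at both ends only
unquoted <- strip_ends("\"abc\"", rm = "\"")  # surrounding double quotes removed
```

Anchoring the two alternatives with ^ and $ is what keeps the space inside "select 1" untouched.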
The for loop:
for (i in seq_len(n)) print(i) # seq_len(0) is integer(0), so the loop body is not executed when n is 0
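The reason to prefer seq_len(n) over 1:n shows up when n is 0. A minimal sketch that counts how often each loop body runs:

```r
n <- 0

runs_colon <- 0
for (i in 1:n) runs_colon <- runs_colon + 1             # 1:0 is c(1, 0), so the body runs twice

runs_seq_len <- 0
for (i in seq_len(n)) runs_seq_len <- runs_seq_len + 1  # integer(0), so the body never runs
```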
Check NULL value:
if (is.null(x))
    stop("Error: cannot be NULL!")
- Add export("madlib.newwrapper") to the NAMESPACE file to expose the function to users.
- Double-check that the names of all new internal functions start with ".".
- Add user documentation to the man/ folder. You can use the existing documentation files as templates.
- From the directory above PivotalR/, run (for example, for version 0.1.100):
  $ R CMD build --resave-data PivotalR 2>/dev/null
  $ R CMD check --as-cran PivotalR_0.1.100.tar.gz
  $ R CMD INSTALL PivotalR_0.1.100.tar.gz
  Correct all errors, warnings and notes reported in the second step.
- File the pull request. MADlib team members can push the code directly to a new branch.