---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#",
  fig.path = "tools/images/README-"
)
library(sparkts)
```

# sparkts

[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/sparkts)](http://cran.r-project.org/package=sparkts)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

The goal of `sparkts` is to provide a test bed of `sparklyr` extensions for the [`spark-ts`](https://github.com/srussell91/SparkTS) framework, a modified version of the [`spark-timeseries`](https://github.com/sryza/spark-timeseries) framework.

## Installation

You can install `sparkts` from GitHub with:

```{r installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("nathaneastwood/sparkts")
```

For details on setting up a development environment for the package, please see the development vignette.
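
Once the package is installed you can locate its vignettes from R. A minimal sketch, assuming the vignettes were built at install time (`build_vignettes = TRUE` when installing from GitHub):

```{r vignettes, eval = FALSE}
# Vignettes are only built if requested when installing from GitHub
devtools::install_github("nathaneastwood/sparkts", build_vignettes = TRUE)

# Browse the installed vignettes, including the development vignette
browseVignettes(package = "sparkts")
```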

## Example

This is a basic example which shows how to calculate the standard error for some time series data:

```{r example, cache = TRUE, message = FALSE}
library(sparklyr)
library(sparkts)

# Set up a Spark connection
sc <- spark_connect(
  master = "local",
  version = "2.2.0",
  config = list(sparklyr.gateway.address = "127.0.0.1")
)

# Read in some example data shipped with the package
std_data <- spark_read_json(
  sc,
  "std_data",
  path = system.file(
    "data_raw/StandardErrorDataIn.json",
    package = "sparkts"
  )
) %>%
  spark_dataframe()

# Calculate the standard error, appending it as a new column
p <- sdf_standard_error(
  sc = sc, data = std_data,
  x_col = "xColumn", y_col = "yColumn", z_col = "zColumn",
  new_column_name = "StandardError"
)
p %>% dplyr::collect()

# Disconnect from the Spark connection
spark_disconnect(sc = sc)
```
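
The connection above assumes a local Spark 2.2.0 installation is available. If one is not, `sparklyr` can download and install it; a minimal setup sketch:

```{r spark-install, eval = FALSE}
# Install a local copy of Spark 2.2.0 for sparklyr to connect to
sparklyr::spark_install(version = "2.2.0")
```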