This repository has been archived by the owner on Nov 8, 2022. It is now read-only.

adds task documentation
pittma committed Dec 2, 2015
1 parent f335f6d commit d0b5981
Showing 2 changed files with 172 additions and 3 deletions.
4 changes: 1 addition & 3 deletions README.md
@@ -115,8 +115,6 @@ Checkout the [tribe](docs/TRIBE.md) doc for more info.
## Load Plugins
snap gets its power from the use of plugins. The [Plugin Catalog](#plugin-catalog) is a collection of all known plugins for snap.

- [ ] TODO - guide or pointer to building one of our plugins...

Next, let's load a few of the demo plugins. You can do this via cURL, or `snapctl`, snap's CLI.

Using cURL
@@ -170,7 +168,7 @@ $ ./bin/snapctl task watch 8b9babad-b3bc-4a16-9e06-1f35664a7679
```

### Building Tasks
TODO
Documentation for building a task can be found [here](docs/TASKS.md).

### Plugin Catalog
All known Plugins are tracked in the [Plugin Catalog](https://github.com/intelsdi-x/snap/blob/master/docs/PLUGIN_CATALOG.md) and are tagged as consumers, processors and publishers.
171 changes: 171 additions & 0 deletions docs/TASKS.md
@@ -0,0 +1,171 @@
Tasks
=====

A task describes how, what, and when a __snap__ job runs. A task is defined in a task _manifest_, which can be written in either JSON or YAML<sup>1</sup>.

_Skip to the TL;DR example [here](#tldr)_.

The manifest can be divided into two parts: the header and the workflow.

### The Header

```yaml
---
version: 1
schedule:
  type: "simple"
  interval: "1s"
```
#### Version
The header contains a version, used to differentiate between versions of the task manifest parser. Right now, there is only one version: `1`.

#### Schedule

The schedule describes the schedule type and interval for running the task. The simplest type is a "run forever" schedule, shown above as `"simple"`; other types can be more complex. __snap__ is designed so that custom schedulers can easily be dropped in, and a custom schedule may require additional key/value pairs in the schedule section of the manifest. At the time of this writing, __snap__ provides the simple schedule described above and a window schedule, which adds a start and stop time.
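
As a rough illustration of the header rules above, the following Python sketch checks a parsed manifest for a supported version and the two schedule keys shown in the example. The `check_header` function and its messages are hypothetical and are not part of __snap__; the real parser may check more or behave differently.

```python
import json

# The text above states that version 1 is currently the only manifest version.
SUPPORTED_VERSIONS = {1}


def check_header(manifest):
    """Return a list of problems found in the manifest header (empty if none).

    Hypothetical helper for illustration only; not snap's own validation.
    """
    problems = []
    version = manifest.get("version")
    if version not in SUPPORTED_VERSIONS:
        problems.append("unsupported version: {!r}".format(version))

    schedule = manifest.get("schedule")
    if not isinstance(schedule, dict):
        problems.append("missing schedule section")
        return problems

    # The simple schedule in the example carries a type and an interval.
    for key in ("type", "interval"):
        if key not in schedule:
            problems.append("schedule is missing {!r}".format(key))
    return problems


header = json.loads('{"version": 1, "schedule": {"type": "simple", "interval": "1s"}}')
print(check_header(header))
```

A custom schedule type would likely need extra keys, so a check like this would have to be extended per schedule type.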

### The Workflow

```yaml
---
collect:
  metrics:
    /intel/mock/foo: {}
    /intel/mock/bar: {}
    /intel/mock/*/baz: {}
  config:
    /intel/mock:
      user: "root"
      password: "secret"
  process:
    -
      plugin_name: "passthru"
      publish:
        -
          plugin_name: "file"
          config:
            file: "/tmp/published"
```

The workflow is a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) which describes the how and what of a task. It is always rooted at a `collect` node, which can then contain any number of `process` and `publish` nodes.

#### collect

The collect section describes which metrics to collect. Metrics can be enumerated explicitly via a concrete _namespace_, or a wildcard (`*`) can be used<sup>2</sup>. Each namespace is a key whose value is a nested object that may specify a particular version of a plugin, e.g.:

```yaml
---
/foo/bar/baz:
  version: 4
```

If a version is not given, __snap__ will select the latest for you.
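
The version rule can be pictured with a small Python sketch. It assumes the set of loaded plugin versions is available as a plain list of integers; `select_version` and `available_versions` are hypothetical names used only for illustration, and how __snap__ actually resolves versions internally is not described here.

```python
def select_version(metric_entry, available_versions):
    """Pick a plugin version for one metric entry from the manifest.

    metric_entry is the nested object under a namespace key, e.g. {} or
    {"version": 4}. available_versions is an assumed list of loaded versions.
    """
    requested = metric_entry.get("version")
    if requested is None:
        # No version pinned in the manifest: fall back to the latest one.
        return max(available_versions)
    if requested not in available_versions:
        raise ValueError("version {} is not loaded".format(requested))
    return requested


print(select_version({}, [1, 3, 4]))              # no version given
print(select_version({"version": 3}, [1, 3, 4]))  # version pinned to 3
```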

The config section describes configuration data for metrics. Since metric namespaces form a tree, config can be described at a branch, and all leaves of that branch will receive the given config. For example, say a task is going to collect `/intel/perf/foo`, `/intel/perf/bar`, and `/intel/perf/baz`, all of which require a username and password to collect. That config could be described like so:

```yaml
---
metrics:
  /intel/perf/foo: {}
  /intel/perf/bar: {}
  /intel/perf/baz: {}
config:
  /intel/perf:
    username: jerr
    password: j3rr
```

Applying the config at `/intel/perf` means that all leaves of `/intel/perf` (`/intel/perf/foo`, `/intel/perf/bar`, and `/intel/perf/baz` in this case) will receive the config.
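
One way to picture how branch-level config reaches the leaves is the Python sketch below. It walks from the root of a metric namespace toward the leaf and merges any config found along the way. The `resolve_config` helper is hypothetical, and the rule that deeper branches override shallower ones is an assumption made for this example, not something the text above specifies.

```python
def resolve_config(namespace, config):
    """Merge config from every branch that is a prefix of the namespace.

    Assumption for illustration: values set on deeper branches override
    values set closer to the root.
    """
    parts = namespace.strip("/").split("/")
    merged = {}
    # Visit /intel, then /intel/perf, then /intel/perf/foo, and so on.
    for depth in range(1, len(parts) + 1):
        branch = "/" + "/".join(parts[:depth])
        merged.update(config.get(branch, {}))
    return merged


config = {"/intel/perf": {"username": "jerr", "password": "j3rr"}}
for metric in ("/intel/perf/foo", "/intel/perf/bar", "/intel/perf/baz"):
    print(metric, resolve_config(metric, config))
```

All three metrics resolve to the same username and password because each sits under the `/intel/perf` branch.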

A collect node can also contain any number of process or publish nodes. These nodes describe what to do next.

#### process

A process node describes which plugin to use to process data coming from either a collection or another process node. The config section describes config data which may be needed for the chosen plugin.

A process node may have any number of process or publish nodes.

#### publish

A publish node describes which plugin to use to publish data coming from either a collection or a process node. The config section describes config data which may be needed for the chosen plugin.

A publish node is a [pendant vertex (a leaf)](http://mathworld.wolfram.com/PendantVertex.html). It cannot contain collect, process, or publish nodes.
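
The nesting rules for collect, process, and publish nodes can be sketched as a small tree walk in Python. This is only an illustration under the rules stated above: the `walk` function is hypothetical, it operates on a manifest already parsed into dictionaries, and it is not how __snap__ itself validates or schedules a workflow.

```python
def walk(node, kind, path=()):
    """Yield every path from the collect root to a publish leaf.

    Raises ValueError if a publish node contains child nodes, since a
    publish node must be a leaf.
    """
    label = kind if kind == "collect" else "{}:{}".format(kind, node["plugin_name"])
    path = path + (label,)
    if kind == "publish":
        for child_kind in ("collect", "process", "publish"):
            if node.get(child_kind):
                raise ValueError(
                    "publish node {} must not contain {}".format(label, child_kind)
                )
        yield path
        return
    # collect and process nodes may hold any number of process/publish nodes;
    # a missing or null entry is treated as "no children".
    for child_kind in ("process", "publish"):
        for child in node.get(child_kind) or []:
            yield from walk(child, child_kind, path)


workflow = {
    "collect": {
        "metrics": {"/intel/mock/foo": {}},
        "process": [
            {
                "plugin_name": "passthru",
                "process": None,
                "publish": [
                    {"plugin_name": "file", "config": {"file": "/tmp/published"}}
                ],
            }
        ],
    }
}

for p in walk(workflow["collect"], "collect"):
    print(" -> ".join(p))
```

Each yielded path starts at `collect` and ends at a publish node, which mirrors the shape of the DAG described in this section.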

## TL;DR

Below is a complete example task.

### YAML

```yaml
---
version: 1
schedule:
  type: "simple"
  interval: "1s"
workflow:
  collect:
    metrics:
      /intel/mock/foo: {}
      /intel/mock/bar: {}
      /intel/mock/*/baz: {}
    config:
      /intel/mock:
        user: "root"
        password: "secret"
    process:
      -
        plugin_name: "passthru"
        process: null
        publish:
          -
            plugin_name: "file"
            config:
              file: "/tmp/published"
```

### JSON

```json
{
    "version": 1,
    "schedule": {
        "type": "simple",
        "interval": "1s"
    },
    "workflow": {
        "collect": {
            "metrics": {
                "/intel/mock/foo": {},
                "/intel/mock/bar": {},
                "/intel/mock/*/baz": {}
            },
            "config": {
                "/intel/mock": {
                    "user": "root",
                    "password": "secret"
                }
            },
            "process": [
                {
                    "plugin_name": "passthru",
                    "process": null,
                    "publish": [
                        {
                            "plugin_name": "file",
                            "config": {
                                "file": "/tmp/published"
                            }
                        }
                    ]
                }
            ]
        }
    }
}
```

#### footnotes

1. YAML is only supported via the snapctl CLI. Only JSON is accepted via the REST API.
2. The wildcard must be supported by the target plugin.
