adds task documentation

intelsdi-x · Dec 2, 2015 · d0b5981
1 parent f335f6d
commit d0b5981

Showing 2 changed files with 172 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -115,8 +115,6 @@ Checkout the [tribe](docs/TRIBE.md) doc for more info.
 ## Load Plugins
 snap gets its power from the use of plugins. The [Plugin Catalog](#plugin-catalog) is a collection of all known plugins for snap.
 
-- [ ] TODO - guide or pointer to building one of our plugins...
-
 Next, let's load a few of the demo plugins.  You can do this via cURL, or `snapctl`, snap's CLI.
 
 Using cURL
@@ -170,7 +168,7 @@ $ ./bin/snapctl task watch 8b9babad-b3bc-4a16-9e06-1f35664a7679
 ```
 
 ### Building Tasks
-TODO
+Documentation for building a task can be found [here](docs/TASKS.md).
 
 ### Plugin Catalog
 All known Plugins are tracked in the [Plugin Catalog](https://github.com/intelsdi-x/snap/blob/master/docs/PLUGIN_CATALOG.md) and are tagged as consumers, processors and publishers.

diff --git a/docs/TASKS.md b/docs/TASKS.md
@@ -0,0 +1,171 @@
+Tasks
+=====
+
+A task describes the how, what, and when of a __snap__ job.  A task is defined in a task _manifest_, which can be written in either JSON or YAML<sup>1</sup>.
+
+_Skip to the TL;DR example [here](#tldr)_.
+
+The manifest is divided into two parts: the header and the workflow.
+
+### The Header
+
+```yaml
+---
+  version: 1
+  schedule: 
+    type: "simple"
+    interval: "1s"
+```
+
+#### Version
+The header contains a version, used to differentiate between versions of the task manifest parser.  Right now, there is only one version: `1`.
+
+#### Schedule
+
+The schedule describes the schedule type and the interval at which the task runs.  The type can be a simple "run forever" schedule, shown above as `"simple"`, or something more complex.  __snap__ is designed so that custom schedulers can easily be dropped in.  A custom schedule may require additional key/value pairs in the schedule section of the manifest.  At the time of this writing, __snap__ has two schedules: the simple schedule described above and a window schedule, which adds a start and stop time.
+
+### The Workflow
+
+```yaml
+---
+  collect:
+    metrics:
+      /intel/mock/foo: {}
+      /intel/mock/bar: {}
+      /intel/mock/*/baz: {}
+    config:
+      /intel/mock:
+        user: "root"
+        password: "secret"
+    process:
+      -
+        plugin_name: "passthru"
+        publish:
+          -
+            plugin_name: "file"
+            config:
+              file: "/tmp/published"
+```
+
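+The workflow is a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) which describes the how and what of a task.  It is always rooted at a `collect` node, which can be followed by any number of `process` and `publish` nodes.  In a complete manifest, the workflow sits under the top-level `workflow` key, as in the [TL;DR](#tldr) example.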
+
+#### collect
+
+The collect section describes which metrics to collect.  Metrics can be enumerated explicitly via a concrete _namespace_, or a wildcard (`*`) can be used<sup>2</sup>.  Each namespace is a key whose value is a nested object, which may specify the plugin version to use, e.g.:
+
+```yaml
+---
+/foo/bar/baz:
+  version: 4
+```
+
+If a version is not given, __snap__ will __select__ the latest for you.
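+
+To show where the version sits within a `metrics` block, here is a minimal sketch that pins one metric and leaves another unpinned.  The version number `2` is hypothetical and chosen only for illustration:
+
+```yaml
+---
+metrics:
+  # pinned: use version 2 of the plugin providing this metric (hypothetical version number)
+  /intel/mock/foo:
+    version: 2
+  # not pinned: snap selects the latest version
+  /intel/mock/bar: {}
+```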
+
+The config section describes configuration data for metrics.  Since metric namespaces form a tree, config can be described at a branch, and all leaves of that branch will receive the given config.  For example, say a task is going to collect `/intel/perf/foo`, `/intel/perf/bar`, and `/intel/perf/baz`, all of which require a username and password to collect.  That config could be described like so:
+
+```yaml
+---
+metrics:
+  /intel/perf/foo: {}
+  /intel/perf/bar: {}
+  /intel/perf/baz: {}
+config:
+  /intel/perf:
+    username: jerr
+    password: j3rr
+```
+
+Applying the config at `/intel/perf` means that all leaves of `/intel/perf` (`/intel/perf/foo`, `/intel/perf/bar`, and `/intel/perf/baz` in this case) will receive the config.
+
+A collect node can also contain any number of process or publish nodes.  These nodes describe what to do next.
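+
+As a sketch of the simplest such workflow, a publish node can hang directly off the collect node with no process step in between.  This assumes a publish entry under `collect` takes the same keys as the publish entries shown elsewhere in this document:
+
+```yaml
+---
+collect:
+  metrics:
+    /intel/mock/foo: {}
+  # publish directly from collect; no process node in between
+  publish:
+    -
+      plugin_name: "file"
+      config:
+        file: "/tmp/published"
+```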
+
+#### process
+
+A process node describes which plugin to use to process data coming from either a collect node or another process node.  The config section holds any configuration data the chosen plugin needs.
+
+A process node may have any number of process or publish nodes.
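+
+For instance, two process nodes can be chained before publishing.  This is a sketch that reuses the `passthru` and `file` plugins from the examples in this document; chaining `passthru` twice serves no practical purpose and only illustrates the nesting:
+
+```yaml
+---
+process:
+  -
+    plugin_name: "passthru"
+    # a process node nested inside another process node
+    process:
+      -
+        plugin_name: "passthru"
+        publish:
+          -
+            plugin_name: "file"
+            config:
+              file: "/tmp/published"
+```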
+
+#### publish
+
+A publish node describes which plugin to use to publish data coming from either a collect node or a process node.  The config section holds any configuration data the chosen plugin needs.
+
+A publish node is a [pendant vertex (a leaf)](http://mathworld.wolfram.com/PendantVertex.html).  It cannot contain collect, process, or publish nodes.
+
+## TL;DR
+
+Below is a complete example task.
+
+### YAML
+
+```yaml
+---
+  version: 1
+  schedule:
+    type: "simple"
+    interval: "1s"
+  workflow:
+    collect:
+      metrics:
+        /intel/mock/foo: {}
+        /intel/mock/bar: {}
+        /intel/mock/*/baz: {}
+      config:
+        /intel/mock:
+          user: "root"
+          password: "secret"
+      process:
+        -
+          plugin_name: "passthru"
+          process: null
+          publish:
+            -
+              plugin_name: "file"
+              config:
+                file: "/tmp/published"
+```
+
+### JSON
+
+```json
+{
+    "version": 1,
+    "schedule": {
+        "type": "simple",
+        "interval": "1s"
+    },
+    "workflow": {
+        "collect": {
+            "metrics": {
+                "/intel/mock/foo": {},
+                "/intel/mock/bar": {},
+                "/intel/mock/*/baz": {}
+            },
+            "config": {
+                "/intel/mock": {
+                    "user": "root",
+                    "password": "secret"
+                }
+            },
+            "process": [
+                {
+                    "plugin_name": "passthru",
+                    "process": null,
+                    "publish": [
+                        {
+                            "plugin_name": "file",
+                            "config": {
+                                "file": "/tmp/published"
+                            }
+                        }
+                    ]
+                }
+            ]
+        }
+    }
+}
+```
+
+#### footnotes
+
+1. YAML is supported only via the `snapctl` CLI.  The REST API accepts only JSON.
+2. The wildcard must be supported by the target plugin.