[DOCS] Amends data frame analytics resources, GET, and PUT API docs (elastic#44806)

This PR addresses the feedback in elastic/ml-team#175 (comment).

* Adds an example to `analyzed_fields`
* Includes `source` and `dest` objects inline in the resource page
* Lists `model_memory_limit` in the PUT API page
* Amends the `analysis` section in the resource page
* Removes Properties headings in subsections
szabosteve committed Jul 26, 2019
1 parent 1eb0958 commit a1f4c83
Showing 4 changed files with 102 additions and 68 deletions.
91 changes: 44 additions & 47 deletions docs/reference/ml/df-analytics/apis/dfanalyticsresources.asciidoc
@@ -18,10 +18,36 @@
(object) You can specify both `includes` and/or `excludes` patterns. If
`analyzed_fields` is not set, only the relevant fields will be included. For
example, all the numeric fields for {oldetection}.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics
{
"source": {
"index": "logdata"
},
"dest": {
"index": "logdata_out"
},
"analysis": {
"outlier_detection": {
}
},
"analyzed_fields": {
"includes": [ "request.bytes", "response.counts.error" ],
"excludes": [ "source.geo" ]
}
}
--------------------------------------------------
// CONSOLE
// TEST[setup:setup_logdata]

`dest`::
(object) The destination configuration of the analysis. For more information,
see <<dfanalytics-dest-resources>>.
(object) The destination configuration of the analysis. The `index` property
(string) is the name of the index in which to store the results of the
{dfanalytics-job}. The `results_field` (string) property defines the name of
the field in which to store the results of the analysis. The default value is
`ml`.

`id`::
(string) The unique identifier for the {dfanalytics-job}. This identifier can
@@ -38,25 +64,29 @@
that setting. For more information, see <<ml-settings>>.

`source`::
(object) The source configuration, consisting of `index` and optionally a
`query`. For more information, see <<dfanalytics-source-resources>>.
(object) The source configuration, consisting of `index` (array), which is an
array of index names on which to perform the analysis. It can be a single
index or index pattern as well as an array of indices or patterns. Optionally,
`source` can have a `query` (object) property, written in the {es} query
domain-specific language (DSL). This value corresponds to the query object in
an {es} search POST body. All the options that are supported by {es} can be
used, as this object is passed verbatim to {es}. By default, this property has
the following value: `{"match_all": {}}`.
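
As an illustration, a `source` object that reads from more than one index and
restricts the analysis to a subset of documents might look like the following
sketch. The `logdata-archive-*` pattern and the `range` query on
`request.bytes` are hypothetical; they only extend the names used in the
`analyzed_fields` example above.

[source,js]
--------------------------------------------------
"source": {
  "index": [ "logdata", "logdata-archive-*" ],
  "query": {
    "range": {
      "request.bytes": { "gte": 1024 }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE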

[[dfanalytics-types]]
==== Analysis objects

{dfanalytics-cap} resources contain `analysis` objects. For example, when you
create a {dfanalytics-job}, you must define the type of analysis it performs.
create a {dfanalytics-job}, you must define the type of analysis it performs.
Currently, `outlier_detection` is the only available type of analysis; however,
other types will be added, for example `regression`.

[discrete]
[[oldetection-resources]]
===== {oldetection-cap} configuration objects
==== {oldetection-cap} configuration objects

An {oldetection} configuration object has the following properties:

[discrete]
[[oldetection-properties]]
==== {api-definitions-title}

`n_neighbors`::
(integer) Defines the value for how many nearest neighbors each method of
{oldetection} will use to calculate its {olscore}. When the value is
@@ -65,44 +95,11 @@ An {oldetection} configuration object has the following properties:
`method`::
(string) Sets the method that {oldetection} uses. If the method is not set,
{oldetection} uses an ensemble of different methods and normalises and
combines their individual {olscores} to obtain the overall {olscore}.
Available methods are `lof`, `ldof`, `distance_kth_nn`, `distance_knn`.
combines their individual {olscores} to obtain the overall {olscore}. We
recommend using the ensemble method. Available methods are `lof`, `ldof`,
`distance_kth_nn`, `distance_knn`.

`feature_influence_threshold`::
(double) The minimum {olscore} that a document needs to have in order to
calculate its {fiscore}.
Value range: 0-1 (`0.1` by default).

[[dfanalytics-dest-resources]]
==== Dest configuration objects

{dfanalytics-cap} resources contain `dest` objects. For example, when you
create a {dfanalytics-job}, you must define its destination.

[discrete]
[[dfanalytics-dest-properties]]
==== {api-definitions-title}

`index`::
(string) The name of the index in which to store the results of the
{dfanalytics-job}.

`results_field`::
(string) The name of the field in which to store the results of the analysis.
The default value is `ml`.

[[dfanalytics-source-resources]]
==== Source configuration objects

The `source` configuration object has the following properties:

`index`::
(array) An array of index names on which to perform the analysis. It can be a
single index or index pattern as well as an array of indices or patterns.

`query`::
(object) The {es} query domain-specific language (DSL). This value
corresponds to the query object in an {es} search POST body. All the
options that are supported by {es} can be used, as this object is
passed verbatim to {es}. By default, this property has the following
value: `{"match_all": {}}`.
Value range: 0-1 (`0.1` by default).
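
For illustration, an `analysis` object that sets all three {oldetection}
properties explicitly could look like the following sketch. The values are
arbitrary examples rather than recommendations, and leaving out `method` keeps
the ensemble behavior described above.

[source,js]
--------------------------------------------------
"analysis": {
  "outlier_detection": {
    "n_neighbors": 20,
    "method": "lof",
    "feature_influence_threshold": 0.1
  }
}
--------------------------------------------------
// NOTCONSOLE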
22 changes: 19 additions & 3 deletions docs/reference/ml/df-analytics/apis/get-dfanalytics-stats.asciidoc
@@ -43,9 +43,18 @@ information, see {stack-ov}/security-privileges.html[Security privileges] and
==== {api-query-parms-title}

`allow_no_match`::
(Optional, boolean) If `false` and the `data_frame_analytics_id` does not
match any {dfanalytics-job} an error will be returned. The default value is
`true`.
(Optional, boolean) Specifies what to do when the request:
+
--
* Contains wildcard expressions and there are no {dfanalytics-jobs} that match.
* Contains the `_all` string or no identifiers and there are no matches.
* Contains wildcard expressions and there are only partial matches.

The default value is `true`, which returns an empty `data_frame_analytics` array
when there are no matches and the subset of results when there are partial
matches. If this parameter is `false`, the request returns a `404` status code
when there are no matches or only partial matches.
--

`from`::
(Optional, integer) Skips the specified number of {dfanalytics-jobs}. The
@@ -64,6 +73,13 @@ The API returns the following information:
(array) An array of statistics objects for {dfanalytics-jobs}, which are
sorted by the `id` value in ascending order.

[[ml-get-dfanalytics-stats-response-codes]]
==== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.
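
As a sketch, the following request asks for statistics with a wildcard
expression and `allow_no_match` set to `false`. The `log*` pattern is a
hypothetical identifier pattern. If no {dfanalytics-jobs} match it, the request
is expected to return the `404` status code described above instead of an
empty `data_frame_analytics` array.

[source,js]
--------------------------------------------------
GET _ml/data_frame/analytics/log*/_stats?allow_no_match=false
--------------------------------------------------
// NOTCONSOLE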

[[ml-get-dfanalytics-stats-example]]
==== {api-examples-title}

39 changes: 25 additions & 14 deletions docs/reference/ml/df-analytics/apis/get-dfanalytics.asciidoc
@@ -33,38 +33,42 @@ information, see {stack-ov}/security-privileges.html[Security privileges] and
==== {api-description-title}

You can get information for multiple {dfanalytics-jobs} in a single API request
by using a comma-separated list of {dfanalytics-jobs} or a wildcard expression.
You can get information for all {dfanalytics-jobs} by using _all, by specifying
`*` as the `<data_frame_analytics_id>`, or by omitting the
`<data_frame_analytics_id>`.
by using a comma-separated list of {dfanalytics-jobs} or a wildcard expression.

[[ml-get-dfanalytics-path-params]]
==== {api-path-parms-title}

`<data_frame_analytics_id>`::
(Optional, string) Identifier for the {dfanalytics-job}. If you do not specify
one of these options, the API returns information for the first hundred
{dfanalytics-jobs}.

`allow_no_match` (Optional)::
(boolean) If `false` and the `data_frame_analytics_id` does not match any
{dfanalytics-job} an error will be returned. The default value is `true`.
{dfanalytics-jobs}. You can get information for all {dfanalytics-jobs} by
using `_all`, by specifying `*` as the `<data_frame_analytics_id>`, or by
omitting the `<data_frame_analytics_id>`.

[[ml-get-dfanalytics-query-params]]
==== {api-query-parms-title}

`allow_no_match`::
(Optional, boolean) If `false` and the `data_frame_analytics_id` does not
match any {dfanalytics-job} an error will be returned. The default value is
`true`.
(Optional, boolean) Specifies what to do when the request:
+
--
* Contains wildcard expressions and there are no {dfanalytics-jobs} that match.
* Contains the `_all` string or no identifiers and there are no matches.
* Contains wildcard expressions and there are only partial matches.

The default value is `true`, which returns an empty `data_frame_analytics` array
when there are no matches and the subset of results when there are partial
matches. If this parameter is `false`, the request returns a `404` status code
when there are no matches or only partial matches.
--

`from`::
(Optional, integer) Skips the specified number of {dfanalytics-jobs}. The
default value is `0`.

`size`::
(Optional, integer) Specifies the maximum number of {dfanalytics-jobs} to obtain. The
default value is `100`.
(Optional, integer) Specifies the maximum number of {dfanalytics-jobs} to
obtain. The default value is `100`.
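
For example, assuming that more than ten {dfanalytics-jobs} exist, the
following sketch pages through them ten at a time by combining `from` and
`size`. The page size is an arbitrary choice for illustration.

[source,js]
--------------------------------------------------
GET _ml/data_frame/analytics/_all?from=0&size=10

GET _ml/data_frame/analytics/_all?from=10&size=10
--------------------------------------------------
// NOTCONSOLE

The first request returns the first ten {dfanalytics-jobs}. The second request
skips those ten and returns the next ten.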

[[ml-get-dfanalytics-results]]
==== {api-response-body-title}
@@ -73,6 +77,13 @@ You can get information for all {dfanalytics-jobs} by using _all, by specifying
(array) An array of {dfanalytics-job} resources. For more information, see
<<ml-dfanalytics-resources>>.

[[ml-get-dfanalytics-response-codes]]
==== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.

[[ml-get-dfanalytics-example]]
==== {api-examples-title}

18 changes: 14 additions & 4 deletions docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc
@@ -67,12 +67,22 @@ and mappings.
example, all the numeric fields for {oldetection}.

`dest`::
(Required, object) The destination configuration, consisting of `index` and optionally
`results_field` (`ml` by default). See <<dfanalytics-dest-resources>>.
(Required, object) The destination configuration, consisting of `index` and
optionally `results_field` (`ml` by default). See
<<ml-dfanalytics-properties,{dfanalytics} properties>>.

`model_memory_limit`::
(Optional, string) The approximate maximum amount of memory resources that are
permitted for analytical processing. The default value for {dfanalytics-jobs}
is `1gb`. If your `elasticsearch.yml` file contains an
`xpack.ml.max_model_memory_limit` setting, an error occurs when you try to
create {dfanalytics-jobs} that have `model_memory_limit` values greater than
that setting. For more information, see <<ml-settings>>.

`source`::
(Required, object) The source configuration, consisting of `index` and optionally a
`query`. See <<dfanalytics-source-resources>>.
(Required, object) The source configuration, consisting of `index` and
optionally a `query`. See
<<ml-dfanalytics-properties,{dfanalytics} properties>>.

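As a minimal sketch, the `dest`, `model_memory_limit`, and `source` properties
described above can be combined in a single request. The job identifier, the
index names, and the `500mb` limit are illustrative assumptions. The limit must
not exceed `xpack.ml.max_model_memory_limit` if that setting is present.

[source,js]
--------------------------------------------------
PUT _ml/data_frame/analytics/loganalytics
{
  "source": {
    "index": "logdata"
  },
  "dest": {
    "index": "logdata_out",
    "results_field": "ml"
  },
  "analysis": {
    "outlier_detection": {
    }
  },
  "model_memory_limit": "500mb"
}
--------------------------------------------------
// NOTCONSOLE
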
[[ml-put-dfanalytics-example]]
==== {api-examples-title}