[native] Prestissimo worker metrics documentation
karteekmurthys authored and yingsu00 committed Aug 21, 2024
1 parent 61a01f4 commit dec6c14
Showing 3 changed files with 61 additions and 8 deletions.
49 changes: 41 additions & 8 deletions presto-docs/src/main/sphinx/presto_cpp/features.rst
@@ -22,15 +22,36 @@ HTTP endpoints related to tasks are registered to Proxygen in

Other HTTP endpoints include:

* POST: v1/memory
* Reports memory, but no assignments are adjusted unlike in Java workers.
* GET: v1/info
* GET: v1/status
* POST: v1/memory: Reports memory, but unlike in Java workers, no assignments are adjusted.
* GET: v1/info/metrics: Returns worker-level metrics in the Prometheus data format. See the section `Worker Metrics Collection <#worker-metrics-collection>`_ for more information. Here is a sample of the metrics data returned by this API:

The request/response flow of Presto C++ is identical to Java workers. The
tasks or new splits are registered via `TaskUpdateRequest`. Resource
utilization and query progress are sent to the coordinator via task endpoints.
  .. code-block:: text

     # TYPE presto_cpp_num_http_request counter
     presto_cpp_num_http_request{cluster="testing",worker=""} 0
     # TYPE presto_cpp_num_http_request_error counter
     presto_cpp_num_http_request_error{cluster="testing",worker=""} 0
     # TYPE presto_cpp_memory_pushback_count counter
     presto_cpp_memory_pushback_count{cluster="testing",worker=""} 0
     # TYPE velox_driver_yield_count counter
     velox_driver_yield_count{cluster="testing",worker=""} 0
     # TYPE velox_cache_shrink_count counter
     velox_cache_shrink_count{cluster="testing",worker=""} 0
     # TYPE velox_memory_cache_num_stale_entries counter
     velox_memory_cache_num_stale_entries{cluster="testing",worker=""} 0
     # TYPE velox_arbitrator_requests_count counter
     velox_arbitrator_requests_count{cluster="testing",worker=""} 0

* GET: v1/info: Returns basic information about the worker. Here is an example:

  .. code-block:: text

     {"coordinator":false,"environment":"testing","nodeVersion":{"version":"testversion"},"starting":false,"uptime":"49.00s"}

* GET: v1/status: Returns memory pool information.

The request/response flow of Presto C++ is identical to that of Java workers. Tasks and new splits are registered via ``TaskUpdateRequest``. Resource utilization and query progress are sent to the coordinator via task endpoints.
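
As an illustration of consuming the ``v1/info/metrics`` output shown above, the following is a minimal Python sketch that parses the Prometheus text format into a dictionary. It is not part of Presto; the function names ``parse_metrics`` and ``fetch_metrics`` and the ``base_url`` argument are hypothetical, and the parser handles only simple ``name{labels} value`` lines like those in the sample.

```python
import re
from urllib.request import urlopen

# Matches one sample line such as: metric_name{label="value",...} 0
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text output into {metric_name: (labels, value)}.

    Simplification: if a metric name appears more than once, the last
    sample wins. Comment lines such as '# TYPE ...' are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        metrics[match.group("name")] = (labels, float(match.group("value")))
    return metrics


def fetch_metrics(base_url):
    """Fetch and parse metrics from a worker.

    base_url is a hypothetical worker address supplied by the caller,
    for example the scheme, host, and HTTP port of a running worker.
    """
    with urlopen(f"{base_url}/v1/info/metrics") as response:
        return parse_metrics(response.read().decode("utf-8"))


# Two entries copied from the sample output in this section.
SAMPLE = """\
# TYPE presto_cpp_num_http_request counter
presto_cpp_num_http_request{cluster="testing",worker=""} 0
# TYPE velox_driver_yield_count counter
velox_driver_yield_count{cluster="testing",worker=""} 0
"""

parsed = parse_metrics(SAMPLE)
```

Here ``parsed["velox_driver_yield_count"]`` holds the label dictionary and the numeric value of that counter. A production setup would more likely point a Prometheus server at the endpoint than parse the text by hand.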

Remote Function Execution
-------------------------
@@ -169,7 +190,7 @@ Size of the SSD cache when async data cache is enabled.
* **Default value:** ``true``
* **Presto on Spark default value:** ``false``

Enable periodic cleanup of old tasks. The default value is ``true`` for Presto C++.
For Presto on Spark this property defaults to ``false``, as zombie or stuck tasks
are handled by Spark through speculative execution.

@@ -185,6 +206,18 @@ Old task is defined as a PrestoTask which has not received heartbeat for at leas
``old-task-cleanup-ms``, or is not running and has an end time more than
``old-task-cleanup-ms`` ago.

Worker Metrics Collection
-------------------------

Users can enable collection of worker-level metrics by setting the property:

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

Setting this property to ``true`` is a no-op by default: additional setup must be completed before enabling this flag. To enable
metrics collection in the Prometheus data format, follow the build instructions `here <https://github.com/prestodb/presto/tree/master/presto-native-execution#build-prestissimo>`_.
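
As a quick pre-flight check, the flag can be read back from the worker's configuration text. The sketch below is a hypothetical Python helper, not a Presto tool; it assumes the configuration is a plain list of ``key=value`` lines with ``#`` comments and treats a missing entry as the documented default of ``false``.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' separator
            props[key.strip()] = value.strip()
    return props


def metrics_collection_enabled(props):
    """Return True only if the flag is explicitly set to true.

    The documented default is false, so a missing key counts as disabled.
    """
    value = props.get("runtime-metrics-collection-enabled", "false")
    return value.lower() == "true"


# Hypothetical configuration content used only for illustration.
EXAMPLE_CONFIG = """\
# worker configuration (illustrative)
runtime-metrics-collection-enabled=true
"""

enabled = metrics_collection_enabled(parse_properties(EXAMPLE_CONFIG))
```

Note that this only confirms the runtime flag; it cannot tell whether the worker binary was built with the Prometheus reporter that the linked build instructions describe.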

Session Properties
------------------
7 changes: 7 additions & 0 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
@@ -124,6 +124,13 @@ The configuration properties of Presto C++ workers are described here, in alphab
1) the non-reserved space in ``query-memory-gb`` is used up; and 2) the amount
it tries to get is less than ``memory-pool-reserved-capacity``.

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

Enables collection of worker-level metrics.

``system-memory-gb``
^^^^^^^^^^^^^^^^^^^^

13 changes: 13 additions & 0 deletions presto-native-execution/README.md
@@ -67,6 +67,7 @@ Compilers (and versions) not mentioned are known to not work or have not been tr
| CentOS 9/RHEL 9 | `gcc12` |

### Build Prestissimo
#### Parquet and S3 Support
To enable Parquet and S3 support, set `PRESTO_ENABLE_PARQUET = "ON"`,
`PRESTO_ENABLE_S3 = "ON"` in the environment.

@@ -76,6 +77,7 @@ This dependency can be installed by running the script below from the

`./velox/scripts/setup-adapters.sh aws`

#### JWT Authentication
To enable JWT authentication support, set `PRESTO_ENABLE_JWT = "ON"` in
the environment.

@@ -85,6 +87,17 @@ This dependency can be installed by running the script below from the

`./scripts/setup-adapters.sh jwt`

#### Worker Metrics Collection

To enable worker-level metrics collection and the REST API `v1/info/metrics`,
follow these steps:

*Pre-build setup:* `./scripts/setup-adapters.sh prometheus`

*CMake flags:* `PRESTO_STATS_REPORTER_TYPE=PROMETHEUS`

*Runtime configuration:* `runtime-metrics-collection-enabled=true`

* After installing the above dependencies, from the
`presto/presto-native-execution` directory, run `make`
* For development, use