[native] Prestissimo worker metrics documentation
karteekmurthys committed Aug 20, 2024
1 parent be29974 commit 14ba5f8
Showing 3 changed files with 448 additions and 259 deletions.
49 changes: 41 additions & 8 deletions presto-docs/src/main/sphinx/presto_cpp/features.rst
@@ -22,15 +22,36 @@ HTTP endpoints related to tasks are registered to Proxygen in

Other HTTP endpoints include:

* POST: v1/memory
* Reports memory, but no assignments are adjusted unlike in Java workers.
* GET: v1/info
* GET: v1/status
* POST: v1/memory: Reports memory, but unlike in Java workers, no assignments are adjusted.
* GET: v1/info/metrics: Returns worker-level metrics in Prometheus data format. Refer to the `Worker Metrics Collection <#worker-metrics-collection>`_ section for more information. Here is sample metrics data returned by this API:


  .. code-block:: text

     # TYPE presto_cpp_num_http_request counter
     presto_cpp_num_http_request{cluster="testing",worker=""} 0
     # TYPE presto_cpp_num_http_request_error counter
     presto_cpp_num_http_request_error{cluster="testing",worker=""} 0
     # TYPE presto_cpp_memory_pushback_count counter
     presto_cpp_memory_pushback_count{cluster="testing",worker=""} 0
     # TYPE velox_driver_yield_count counter
     velox_driver_yield_count{cluster="testing",worker=""} 0
     # TYPE velox_cache_shrink_count counter
     velox_cache_shrink_count{cluster="testing",worker=""} 0
     # TYPE velox_memory_cache_num_stale_entries counter
     velox_memory_cache_num_stale_entries{cluster="testing",worker=""} 0
     # TYPE velox_arbitrator_requests_count counter
     velox_arbitrator_requests_count{cluster="testing",worker=""} 0

* GET: v1/info: Returns basic information about the worker. Here is an example:

  .. code-block:: text

     {"coordinator":false,"environment":"testing","nodeVersion":{"version":"testversion"},"starting":false,"uptime":"49.00s"}

* GET: v1/status: Returns memory pool information.
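
As an illustration of how the ``v1/info/metrics`` output shown above might be consumed, here is a minimal Python sketch that parses Prometheus text-format lines into metric types and labeled samples. The function name ``parse_metrics`` and the embedded ``SAMPLE`` string are hypothetical and exist only for this example. The parser handles only the simple ``name{labels} value`` shape seen in the sample, not the full Prometheus exposition format.

```python
import re

# Two metrics copied from the sample response above; a real worker returns more.
SAMPLE = """\
# TYPE presto_cpp_num_http_request counter
presto_cpp_num_http_request{cluster="testing",worker=""} 0
# TYPE velox_driver_yield_count counter
velox_driver_yield_count{cluster="testing",worker=""} 0
"""

# Metric name, optional {labels}, then a single value.
# Assumption: no trailing timestamp and no escaped quotes inside label values.
_SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return (types, samples) parsed from Prometheus text-format output.

    types maps a metric name to its declared type (for example "counter").
    samples is a list of (name, labels_dict, value) tuples in input order.
    """
    types = {}
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only "# TYPE <name> <type>" comments are recorded.
            parts = line.split()
            if len(parts) == 4 and parts[1] == "TYPE":
                types[parts[2]] = parts[3]
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            # Skip lines this simplified parser does not understand.
            continue
        name, labels, value = match.groups()
        samples.append((name, dict(_LABEL.findall(labels or "")), float(value)))
    return types, samples


if __name__ == "__main__":
    types, samples = parse_metrics(SAMPLE)
    for name, labels, value in samples:
        print(name, types.get(name), labels, value)
```

In practice the text would come from an HTTP GET against the worker's ``v1/info/metrics`` endpoint, and a production setup would more likely let a Prometheus server scrape the endpoint directly.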

The request/response flow of Presto C++ is identical to that of Java workers. Tasks and new splits are registered via ``TaskUpdateRequest``. Resource utilization and query progress are sent to the coordinator via task endpoints.
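
The ``v1/info`` response listed among the endpoints above is plain JSON, so a client can inspect it with a standard JSON parser. The sketch below is only an illustration: ``summarize_info`` is a hypothetical helper, the field names are taken from the sample response in this document, and the meaning attached to each field is an assumption rather than documented behavior.

```python
import json

# Sample v1/info response copied from the documentation above.
SAMPLE_INFO = (
    '{"coordinator":false,"environment":"testing",'
    '"nodeVersion":{"version":"testversion"},'
    '"starting":false,"uptime":"49.00s"}'
)


def summarize_info(body):
    """Extract a few fields from a v1/info JSON response body."""
    info = json.loads(body)
    return {
        # Assumption: "starting": false means the worker has finished starting.
        "started": not info["starting"],
        "is_coordinator": info["coordinator"],
        "environment": info["environment"],
        "version": info["nodeVersion"]["version"],
        # Kept as the raw string (for example "49.00s"); no unit parsing here.
        "uptime": info["uptime"],
    }


if __name__ == "__main__":
    print(summarize_info(SAMPLE_INFO))
```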

Remote Function Execution
-------------------------
@@ -169,7 +190,7 @@ Size of the SSD cache when async data cache is enabled.
* **Default value:** ``true``
* **Presto on Spark default value:** ``false``

Enable periodic clean up of old tasks. The default value is ``true`` for Presto C++.
For Presto on Spark this property defaults to ``false``, as zombie or stuck tasks
are handled by Spark by speculative execution.

@@ -185,6 +206,18 @@ Old task is defined as a PrestoTask which has not received heartbeat for at leas
``old-task-cleanup-ms``, or is not running and has an end time more than
``old-task-cleanup-ms`` ago.

Worker Metrics Collection
-------------------------

Users can enable collection of worker level metrics by setting the property:

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

When true, the default behavior is a no-op. Additional setup is required before enabling this flag. To enable
metrics collection in Prometheus data format, refer to the `Prestissimo build instructions <https://github.com/prestodb/presto/tree/master/presto-native-execution#build-prestissimo>`_.
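
As a rough sketch of how a deployment script might verify this setting, the Python snippet below reads ``key=value`` configuration text and reports whether ``runtime-metrics-collection-enabled`` is set to ``true``. The ``key=value`` file layout and the helper names ``parse_properties`` and ``metrics_collection_enabled`` are assumptions made for this example. The fallback to ``false`` mirrors the default value documented above.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def metrics_collection_enabled(text):
    """Return True only if the property is explicitly set to true."""
    props = parse_properties(text)
    # Documented default is false, so a missing key counts as disabled.
    value = props.get("runtime-metrics-collection-enabled", "false")
    return value.lower() == "true"


if __name__ == "__main__":
    example = "# worker configuration (illustrative)\nruntime-metrics-collection-enabled=true\n"
    print(metrics_collection_enabled(example))
```

This check only confirms the property value. It does not confirm that the worker was built with the setup described in the linked instructions.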

Session Properties
------------------
7 changes: 7 additions & 0 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
@@ -124,6 +124,13 @@ The configuration properties of Presto C++ workers are described here, in alphab
1) the non-reserved space in ``query-memory-gb`` is used up; and 2) the amount
it tries to get is less than ``memory-pool-reserved-capacity``.

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

Enables collection of worker level metrics.

``system-memory-gb``
^^^^^^^^^^^^^^^^^^^^
