[native] Prestissimo worker metrics documentation #23107

Merged
merged 1 commit into from
Aug 21, 2024
49 changes: 41 additions & 8 deletions presto-docs/src/main/sphinx/presto_cpp/features.rst
@@ -22,15 +22,36 @@ HTTP endpoints related to tasks are registered to Proxygen in

Other HTTP endpoints include:

* POST: v1/memory
* Reports memory, but no assignments are adjusted unlike in Java workers.
* GET: v1/info
* GET: v1/status
* POST: v1/memory: Reports memory, but unlike in Java workers, no assignments are adjusted.

Contributor: I think this is not clear enough. Will you be able to add more explanations here?

Contributor: @karteekmurthys, could you respond to @yingsu00's comment?

Contributor Author: I have added more details now.

* GET: v1/info/metrics: Returns worker-level metrics in Prometheus data format. Refer to the section `Worker Metrics Collection <#worker-metrics-collection>`_ for more information. Here is sample metrics data returned by this API:

.. code-block:: text

   # TYPE presto_cpp_num_http_request counter
   presto_cpp_num_http_request{cluster="testing",worker=""} 0
   # TYPE presto_cpp_num_http_request_error counter
   presto_cpp_num_http_request_error{cluster="testing",worker=""} 0
   # TYPE presto_cpp_memory_pushback_count counter
   presto_cpp_memory_pushback_count{cluster="testing",worker=""} 0
   # TYPE velox_driver_yield_count counter
   velox_driver_yield_count{cluster="testing",worker=""} 0
   # TYPE velox_cache_shrink_count counter
   velox_cache_shrink_count{cluster="testing",worker=""} 0
   # TYPE velox_memory_cache_num_stale_entries counter
   velox_memory_cache_num_stale_entries{cluster="testing",worker=""} 0
   # TYPE velox_arbitrator_requests_count counter
   velox_arbitrator_requests_count{cluster="testing",worker=""} 0


* GET: v1/info: Returns basic information about the worker. Here is an example:

.. code-block:: text

   {"coordinator":false,"environment":"testing","nodeVersion":{"version":"testversion"},"starting":false,"uptime":"49.00s"}

* GET: v1/status: Returns memory pool information.
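
The sample ``v1/info/metrics`` output shown in the list above is plain text in the Prometheus exposition format, so it can be read with a few lines of code. The following is a minimal sketch and not part of Presto: the function name ``parse_metrics`` is hypothetical, and the parser assumes only the simple ``name{labels} value`` lines seen in the sample (no escaped quotes inside label values, no timestamps, one sample per metric name).

```python
import re

# Matches lines such as: presto_cpp_num_http_request{cluster="testing",worker=""} 0
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return {metric_name: {"type": ..., "labels": {...}, "value": float}}."""
    types = {}
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # "# TYPE <name> <type>" declares the metric type; other comments are skipped.
            if len(parts) == 4 and parts[1] == "TYPE":
                types[parts[2]] = parts[3]
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # Ignore lines this simplified parser does not understand.
        name = match.group("name")
        # A later sample with the same name overwrites an earlier one in this sketch.
        metrics[name] = {
            "type": types.get(name),
            "labels": dict(LABEL.findall(match.group("labels") or "")),
            "value": float(match.group("value")),
        }
    return metrics


sample = """\
# TYPE presto_cpp_num_http_request counter
presto_cpp_num_http_request{cluster="testing",worker=""} 0
# TYPE velox_driver_yield_count counter
velox_driver_yield_count{cluster="testing",worker=""} 0
"""

parsed = parse_metrics(sample)
print(parsed["presto_cpp_num_http_request"])
```

For production use, an established Prometheus client library is likely a better choice than a hand-written parser, since the full exposition format has more cases than this sketch covers.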

The request/response flow of Presto C++ is identical to that of Java workers. Tasks and new splits are registered via `TaskUpdateRequest`. Resource utilization and query progress are sent to the coordinator via task endpoints.

Remote Function Execution
-------------------------
@@ -169,7 +190,7 @@ Size of the SSD cache when async data cache is enabled.
* **Default value:** ``true``
* **Presto on Spark default value:** ``false``

Enables periodic cleanup of old tasks. The default value is ``true`` for Presto C++.
For Presto on Spark this property defaults to ``false``, as zombie or stuck tasks
are handled by Spark through speculative execution.

@@ -185,6 +206,18 @@ Old task is defined as a PrestoTask which has not received heartbeat for at leas
``old-task-cleanup-ms``, or is not running and has an end time more than
``old-task-cleanup-ms`` ago.

Worker metrics collection
-------------------------

Users can enable collection of worker-level metrics by setting the following property:

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

When ``true``, the default behavior is a no-op. Additional setup must be completed before enabling this flag. To enable
metrics collection in Prometheus data format, refer to the `build instructions <https://github.com/prestodb/presto/tree/master/presto-native-execution#build-prestissimo>`_.
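
To illustrate the default described above, the sketch below parses ``key=value`` configuration text and reports whether metrics collection is turned on, falling back to ``false`` when the property is absent. Only the property name and its default come from this documentation; the helper names and the sample configuration text are hypothetical, and this is not how the worker itself reads its configuration.

```python
PROPERTY = "runtime-metrics-collection-enabled"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def metrics_collection_enabled(props):
    """Return the flag value, using the documented default of false when absent."""
    return props.get(PROPERTY, "false").lower() == "true"


config_text = """\
# Hypothetical worker configuration used only for this example.
runtime-metrics-collection-enabled=true
"""

print(metrics_collection_enabled(parse_properties(config_text)))
print(metrics_collection_enabled(parse_properties("")))
```

A check like this only confirms the runtime flag; it cannot tell whether the worker was built with the required setup, without which the flag has no effect.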

Session Properties
------------------
7 changes: 7 additions & 0 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
@@ -124,6 +124,13 @@ The configuration properties of Presto C++ workers are described here, in alphab
1) the non-reserved space in ``query-memory-gb`` is used up; and 2) the amount
it tries to get is less than ``memory-pool-reserved-capacity``.

``runtime-metrics-collection-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* **Type:** ``boolean``
* **Default value:** ``false``

Enables collection of worker-level metrics.

``system-memory-gb``
^^^^^^^^^^^^^^^^^^^^

13 changes: 13 additions & 0 deletions presto-native-execution/README.md
@@ -67,6 +67,7 @@ Compilers (and versions) not mentioned are known to not work or have not been tr
| CentOS 9/RHEL 9 | `gcc12` |

### Build Prestissimo
#### Parquet and S3 Support
To enable Parquet and S3 support, set `PRESTO_ENABLE_PARQUET = "ON"`,
`PRESTO_ENABLE_S3 = "ON"` in the environment.

@@ -76,6 +77,7 @@ This dependency can be installed by running the script below from the

`./velox/scripts/setup-adapters.sh aws`

#### JWT Authentication
To enable JWT authentication support, set `PRESTO_ENABLE_JWT = "ON"` in
the environment.

@@ -85,6 +87,17 @@ This dependency can be installed by running the script below from the

`./scripts/setup-adapters.sh jwt`

#### Worker Metrics Collection

To enable worker-level metrics collection and the REST API `v1/info/metrics`,
follow these steps:

*Pre-build setup:* `./scripts/setup-adapters.sh prometheus`

*CMake flags:* `PRESTO_STATS_REPORTER_TYPE=PROMETHEUS`

*Runtime configuration:* `runtime-metrics-collection-enabled=true`
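
Once a worker has been built and configured with the steps above, the endpoint can be checked with a small client. The sketch below uses only the Python standard library. The host `localhost` and port `7777` are placeholders for wherever the worker's HTTP server listens, and the helper names `metrics_url` and `fetch_metrics` are hypothetical.

```python
import urllib.request


def metrics_url(host, port):
    """Build the URL of the worker metrics endpoint named in the steps above."""
    return f"http://{host}:{port}/v1/info/metrics"


def fetch_metrics(host, port, timeout=5.0):
    """GET v1/info/metrics and return the response body as text."""
    with urllib.request.urlopen(metrics_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder address; replace with the worker's actual host and HTTP port.
print(metrics_url("localhost", 7777))
```

Calling `fetch_metrics("localhost", 7777)` against a running worker should return the Prometheus-format text. If the worker is unreachable, `urllib` raises an `OSError` subclass that the caller can catch.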

* After installing the above dependencies, from the
`presto/presto-native-execution` directory, run `make`
* For development, use