From 41d4cb8c6b9f2c17acd1ecd7eee3b948bbd53c7c Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Mon, 31 Jul 2023 16:54:03 -0700 Subject: [PATCH 1/7] Add backward compatibility issues to doc --- docs/metrics.md | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/docs/metrics.md b/docs/metrics.md index 9993948683..249065edfb 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -10,6 +10,7 @@ * [Custom Metrics API](#custom-metrics-api) * [Logging custom metrics](#log-custom-metrics) * [Metrics YAML Parsing and Metrics API example](#Metrics-YAML-File-Parsing-and-Metrics-API-Custom-Handler-Example) +* [Backwards compatibility warnings](#backwards-compatibility-warnings) ## Introduction @@ -28,7 +29,7 @@ Metrics are collected by default at the following locations in `log` mode: The location of log files and metric files can be configured in the [log4j2.xml](https://github.com/pytorch/serve/blob/master/frontend/server/src/main/resources/log4j2.xml) file -In `prometheus` mode, all metrics are made available in prometheus format via the [metrics](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md) API endpoint. +In `prometheus` mode, all metrics are made available in prometheus format via the [metrics API endpoint](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md). ## Frontend Metrics @@ -187,7 +188,10 @@ model_metrics: # backend metrics ``` -Default metrics are provided in the [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) file, but the user can either delete them to their liking / ignore them altogether, because these metrics will not be emitted unless they are edited. +Note that **only** the metrics defined in the **metrics configuration file** can be emitted to logs or made available via the metrics API endpoint. This is done to ensure that the metrics configuration file serves as a central inventory of all the metrics that Torchserve can emit. 
+ +Default metrics are provided in the [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) file, but the user can either delete them to their liking / ignore them altogether, because these metrics will not be emitted unless they are edited.\ +When adding custom `model_metrics` in the metrics cofiguration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the custom metrics API. ### How it works @@ -622,3 +626,23 @@ class CustomHandlerExample: # except this time with gauge metric type object metrics.add_size("GaugeModelMetricNameExample", 42.5) ``` + +## Backwards compatibility warnings +1. Starting [v0.6.1](https://github.com/pytorch/serve/releases/tag/v0.6.1), the `add_metric` API signature changed\ + from [add_metric(name, value, unit, idx=None, dimensions=None)](https://github.com/pytorch/serve/blob/61f1c4182e6e864c9ef1af99439854af3409d325/ts/metrics/metrics_store.py#L184)\ + to [add_metric(metric_name, unit, dimension_names, metric_type)](https://github.com/pytorch/serve/blob/35ef00f9e62bb7fcec9cec92630ae757f9fb0db0/ts/metrics/metric_cache_abstract.py#L272).\ + Usage of the new API is shown [above](#specifying-metric-types).\ + There are two approaches available when migrating to the new custom metrics API: + - Replace the call to `add_metric` in versions prior to v0.6.1 with calls to the following methods: + ``` + metric1 = metrics.add_metric("GenericMetric", unit=unit, dimension_names=["name1", "name2", ...], metric_type=MetricTypes.GAUGE) + metric1.add_or_update(value, dimension_values=["value1", "value2", ...]) + ``` + - Replace the call to `add_metric` in versions prior to v0.6.1 with one of the suitable custom metrics APIs where applicable: [add_counter](#add-counter-based-metrics), [add_time](#add-time-based-metrics), + [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics) +2. 
Starting [v0.8.0](https://github.com/pytorch/serve/releases/tag/v0.8.0), only metrics that are defined in the metrics config file (default: [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml))
+   are either all logged to `ts_metrics.log` and `model_metrics.log` or made available via the [metrics API endpoint](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md)
+   based on the `metrics_mode` configuration as described [above](#introduction).\
+   The default `metrics_mode` is `log` mode.\
+   This is unlike previous versions, where all metrics were only logged to `ts_metrics.log` and `model_metrics.log` except for `ts_inference_requests_total`, `ts_inference_latency_microseconds` and `ts_queue_latency_microseconds`,
+   which were only available via the metrics API endpoint.

From e26d9205e4beb54a1ca83aeda09355228f966739 Mon Sep 17 00:00:00 2001
From: Naman Nandan
Date: Fri, 4 Aug 2023 00:46:14 -0700
Subject: [PATCH 2/7] Update example for custom metrics

---
 docs/metrics.md                           |  26 ++-
 examples/custom_metrics/README.md         | 205 ++++++++++++------
 examples/custom_metrics/config.properties |  12 +
 examples/custom_metrics/metrics.yaml      |  97 +++++++++
 examples/custom_metrics/mnist_handler.py  |  83 ++++++-
 .../custom_metrics/torchserve_custom.mtail |  24 --
 6 files changed, 335 insertions(+), 112 deletions(-)

diff --git a/docs/metrics.md b/docs/metrics.md
index 249065edfb..7ddde3dbd7 100644
--- a/docs/metrics.md
+++ b/docs/metrics.md
@@ -191,7 +191,8 @@ model_metrics: # backend metrics
 ```

Note that **only** the metrics defined in the **metrics configuration file** can be emitted to logs or made available via the metrics API endpoint. This is done to ensure that the metrics configuration file serves as a central inventory of all the metrics that Torchserve can emit.
Default metrics are provided in the [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) file, but the user can either delete them to their liking / ignore them altogether, because these metrics will not be emitted unless they are edited.\ -When adding custom `model_metrics` in the metrics cofiguration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the custom metrics API. +When adding custom `model_metrics` in the metrics cofiguration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the following custom metrics APIs: [add_counter](#add-counter-based-metrics), +[add_time](#add-time-based-metrics), [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics). ### How it works @@ -377,7 +378,7 @@ Add time-based by invoking the following method: Function API ```python - def add_time(self, metric_name: str, value: int or float, idx=None, unit: str = 'ms', dimensions: list = None, + def add_time(self, name: str, value: int or float, idx=None, unit: str = 'ms', dimensions: list = None, metric_type: MetricTypes = MetricTypes.GAUGE): """ Add a time based metric like latency, default unit is 'ms' @@ -385,7 +386,7 @@ Function API Parameters ---------- - metric_name : str + name : str metric name value: int value of metric @@ -422,7 +423,7 @@ Add size-based metrics by invoking the following method: Function API ```python - def add_size(self, metric_name: str, value: int or float, idx=None, unit: str = 'MB', dimensions: list = None, + def add_size(self, name: str, value: int or float, idx=None, unit: str = 'MB', dimensions: list = None, metric_type: MetricTypes = MetricTypes.GAUGE): """ Add a size based metric @@ -430,7 +431,7 @@ Function API Parameters ---------- - metric_name : str + name : str metric name value: int, float value of 
metric @@ -467,7 +468,7 @@ Percentage based metrics can be added by invoking the following method: Function API ```python - def add_percent(self, metric_name: str, value: int or float, idx=None, dimensions: list = None, + def add_percent(self, name: str, value: int or float, idx=None, dimensions: list = None, metric_type: MetricTypes = MetricTypes.GAUGE): """ Add a percentage based metric @@ -475,7 +476,7 @@ Function API Parameters ---------- - metric_name : str + name : str metric name value: int, float value of metric @@ -489,6 +490,8 @@ Function API ``` +**Inferred unit**: `percent` + To add custom percentage-based metrics: ```python @@ -507,14 +510,13 @@ Counter based metrics can be added by invoking the following method Function API ```python - def add_counter(self, metric_name: str, value: int or float, idx=None, dimensions: list = None, - metric_type: MetricTypes = MetricTypes.COUNTER): + def add_counter(self, name: str, value: int or float, idx=None, dimensions: list = None): """ Add a counter metric or increment an existing counter metric Default metric type is counter Parameters ---------- - metric_name : str + name : str metric name value: int or float value of metric @@ -522,11 +524,11 @@ Function API request_id index in batch dimensions: list list of dimensions for the metric - metric_type: MetricTypes - type for defining different operations, defaulted to counter metric type for Counter metrics """ ``` +**Inferred unit**: `count` + ### Getting a metric Users can get a metric from the cache. The Metric object is returned, so the user can access the methods of the Metric: (i.e. 
`Metric.update(value)`, `Metric.__str__`) diff --git a/examples/custom_metrics/README.md b/examples/custom_metrics/README.md index 149a71cf8d..6a199c5f41 100644 --- a/examples/custom_metrics/README.md +++ b/examples/custom_metrics/README.md @@ -1,93 +1,164 @@ -# Monitoring Torchserve custom metrics with mtail metrics exporter and prometheus +# Torchserve custom metrics with prometheus support -In this example, we show how to use a pre-trained custom MNIST model and export the custom metrics using mtail and prometheus +In this example, we show how to use a pre-trained custom MNIST model and export custom metrics using prometheus. -We used the following pytorch example to train the basic MNIST model for digit recognition : https://github.com/pytorch/examples/tree/master/mnist +We use the following pytorch example of MNIST model for digit recognition : https://github.com/pytorch/examples/tree/master/mnist -Run the commands given in following steps from the parent directory of the root of the repository. For example, if you cloned the repository into /home/my_path/serve, run the steps from /home/my_path +Run the commands given in following steps from the root directory of the repository. For example, if you cloned the repository into /home/my_path/serve, run the steps from /home/my_path/serve ## Steps -- Step 1: In this example we introduce a new custom metric `SizeOfImage` in the custom handler and export it using mtail. 
+- Step 1: In this example we add the following custom metrics and access them in prometheus format via the [metrics API endpoint](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md): + - InferenceRequestCount + - PostprocessCallCount + - RequestBatchSize + - SizeOfImage + - HandlerMethodTime + - ExamplePercentMetric - ```python - def preprocess(self, data): - metrics = self.context.metrics - input = data[0].get('body') - metrics.add_size('SizeOfImage', len(input) / 1024, None, 'kB') - return ImageClassifier.preprocess(self, data) - ``` + The custom metrics configuration file `metrics.yaml` in this example builds on top of the [default metrics configuration file](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) to include the custom metrics listed above. + The `config.properties` file in this example configures torchserve to use the custom metrics configuration file and sets the `metrics_mode` to `prometheus`. The custom handler + `mnist_handler.py` updates the metrics listed above. - Refer: [Custom Metrics](https://github.com/pytorch/serve/blob/master/docs/metrics.md#custom-metrics-api) + Refer: [Custom Metrics](https://github.com/pytorch/serve/blob/master/docs/metrics.md#custom-metrics-api)\ Refer: [Custom Handler](https://github.com/pytorch/serve/blob/master/docs/custom_service.md#custom-handlers) -- Step 2: Create a torch model archive using the torch-model-archiver utility to archive the above files. +- Step 2: Create a torch model archive using the torch-model-archiver utility. ```bash torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler examples/custom_metrics/mnist_handler.py ``` -- Step 3: Register the model on TorchServe using the above model archive file. +- Step 3: Register the model to torchserve using the above model archive file. 
```bash mkdir model_store mv mnist.mar model_store/ - torchserve --start --model-store model_store --models mnist=mnist.mar + torchserve --ncs --start --model-store model_store --ts-config examples/custom_metrics/config.properties --models mnist=mnist.mar ``` -- Step 4: Install [mtail](https://github.com/google/mtail/releases) - - ```bash - wget https://github.com/google/mtail/releases/download/v3.0.0-rc47/mtail_3.0.0-rc47_Linux_x86_64.tar.gz - tar -xvzf mtail_3.0.0-rc47_Linux_x86_64.tar.gz - chmod +x mtail - ``` - -- Step 5: Create a mtail program. In this example we using a program to export default custom metrics. - - Refer: [mtail Programming Guide](https://google.github.io/mtail/Programming-Guide.html). - -- Step 6: Start mtail export by running the below command - - ```bash - ./mtail --progs examples/custom_metrics/torchserve_custom.mtail --logs logs/model_metrics.log - ``` - - The mtail program parses the log file extracts info by matching patterns and presents as JSON, Prometheus and other databases. https://google.github.io/mtail/Interoperability.html - -- Step 7: Make Inference request +- Step 4: Make Inference request ```bash curl http://127.0.0.1:8080/predictions/mnist -T examples/image_classifier/mnist/test_data/0.png ``` - The inference request logs the time taken for prediction to the model_metrics.log file. - Mtail parses the file and is served at 3903 port - - `http://localhost:3903` - -- Step 8: Sart Prometheus with mtailtarget added to scarpe config - - - Download [Prometheus](https://prometheus.io/download/) - - - Add mtail target added to scrape config in the config file - - ```yaml - scrape_configs: - # The job name is added as a label `job=` to any timeseries scraped from this config. - - job_name: "prometheus" - - # metrics_path defaults to '/metrics' - # scheme defaults to 'http'. 
- - static_configs: - - targets: ["localhost:9090", "localhost:3903"] - ``` - - - Start Prometheus with config file - - ```bash - ./prometheus --config.file prometheus.yml - ``` - - The exported logs from mtail are scraped by prometheus on 3903 port. +- Step 5: Install prometheus using the instructions [here](https://prometheus.io/download/#prometheus). + +- Step 6: Create a minimal `prometheus.yaml` config file as below and run `./prometheus --config.file=prometheus.yaml`. + +```yaml +global: + scrape_interval: 15s + evaluation_interval: 15s + +scrape_configs: + - job_name: 'prometheus' + static_configs: + - targets: ['localhost:9090'] + - job_name: 'torchserve' + static_configs: + - targets: ['localhost:8082'] #TorchServe metrics endpoint +``` + +- Step 7: Test metrics API endpoint +```console +curl http://127.0.0.1:8082/metrics + +# HELP PredictionTime Torchserve prometheus gauge metric with unit: ms +# TYPE PredictionTime gauge +PredictionTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 23.3 +# HELP GPUMemoryUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE GPUMemoryUtilization gauge +# HELP ts_queue_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds +# TYPE ts_queue_latency_microseconds counter +ts_queue_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 164.607 +# HELP WorkerLoadTime Torchserve prometheus gauge metric with unit: Milliseconds +# TYPE WorkerLoadTime gauge +WorkerLoadTime{WorkerName="W-9000-mnist_1.0",Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 5818.0 +# HELP SizeOfImage Torchserve prometheus gauge metric with unit: kB +# TYPE SizeOfImage gauge +SizeOfImage{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 0.265625 +# HELP PostprocessCallCount Torchserve prometheus counter metric with unit: count +# TYPE PostprocessCallCount counter 
+PostprocessCallCount{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP GPUUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE GPUUtilization gauge +# HELP Requests5XX Torchserve prometheus counter metric with unit: Count +# TYPE Requests5XX counter +# HELP HandlerMethodTime Torchserve prometheus gauge metric with unit: ms +# TYPE HandlerMethodTime gauge +HandlerMethodTime{MethodName="preprocess",ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 13.740777969360352 +# HELP MemoryAvailable Torchserve prometheus gauge metric with unit: Megabytes +# TYPE MemoryAvailable gauge +MemoryAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 4584.23828125 +# HELP InferenceRequestCount Torchserve prometheus counter metric with unit: count +# TYPE InferenceRequestCount counter +InferenceRequestCount{Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP ts_inference_requests_total Torchserve prometheus counter metric with unit: Count +# TYPE ts_inference_requests_total counter +ts_inference_requests_total{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP HandlerTime Torchserve prometheus gauge metric with unit: ms +# TYPE HandlerTime gauge +HandlerTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 23.17 +# HELP ExamplePercentMetric Torchserve prometheus histogram metric with unit: percent +# TYPE ExamplePercentMetric histogram +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.005",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.01",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.025",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.05",} 0.0 
+ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.075",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.1",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.25",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.5",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.75",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="1.0",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="2.5",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="5.0",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="7.5",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="10.0",} 0.0 +ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="+Inf",} 1.0 +ExamplePercentMetric_count{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 +ExamplePercentMetric_sum{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 50.0 +# HELP WorkerThreadTime Torchserve prometheus gauge metric with unit: Milliseconds +# TYPE WorkerThreadTime gauge +WorkerThreadTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 3.0 +# HELP Requests2XX Torchserve prometheus counter metric with unit: Count +# TYPE Requests2XX counter +Requests2XX{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP QueueTime Torchserve prometheus gauge metric with unit: Milliseconds +# TYPE QueueTime gauge 
+QueueTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 0.0 +# HELP MemoryUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE MemoryUtilization gauge +MemoryUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 72.0 +# HELP GPUMemoryUsed Torchserve prometheus gauge metric with unit: Megabytes +# TYPE GPUMemoryUsed gauge +# HELP ts_inference_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds +# TYPE ts_inference_latency_microseconds counter +ts_inference_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 26736.381 +# HELP DiskAvailable Torchserve prometheus gauge metric with unit: Gigabytes +# TYPE DiskAvailable gauge +DiskAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 306.9124526977539 +# HELP RequestBatchSize Torchserve prometheus gauge metric with unit: count +# TYPE RequestBatchSize gauge +RequestBatchSize{ModelName="mnist",Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP DiskUsage Torchserve prometheus gauge metric with unit: Gigabytes +# TYPE DiskUsage gauge +DiskUsage{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8.438858032226562 +# HELP Requests4XX Torchserve prometheus counter metric with unit: Count +# TYPE Requests4XX counter +# HELP MemoryUsed Torchserve prometheus gauge metric with unit: Megabytes +# TYPE MemoryUsed gauge +MemoryUsed{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8699.48046875 +# HELP CPUUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE CPUUtilization gauge +CPUUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 33.3 +# HELP DiskUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE DiskUtilization gauge +DiskUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 2.7 +``` + +- Step 8: Navigate to `http://localhost:9090/` on a browser to execute queries and create graphs + +Screenshot 2023-08-03 at 6 46 47 PM 
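The samples in the Step 7 output above follow the Prometheus text exposition format: `MetricName{label1="value1",...} value`. As a side note (not part of the example's files), here is a minimal sketch of pulling one such sample apart; it assumes simple samples with no escaped quotes in label values, which holds for the output shown in this README.

```python
import re

# Matches a simple exposition-format sample: name, {label list}, numeric value.
# Tolerates TorchServe's trailing comma inside the braces.
SAMPLE_RE = re.compile(r'^(\w+)\{(.*?)\}\s+([-+0-9.eE]+)$')

def parse_sample(line: str):
    """Parse one Prometheus text-format sample into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a simple metric sample: {line!r}")
    name, label_body, value = match.groups()
    labels = {}
    for pair in filter(None, label_body.split(",")):  # drop empty trailing entry
        key, _, val = pair.partition("=")
        labels[key] = val.strip('"')
    return name, labels, float(value)

line = 'SizeOfImage{ModelName="mnist",Level="Model",Hostname="host-1",} 0.265625'
name, labels, value = parse_sample(line)
print(name, labels["ModelName"], value)
```

In practice a Prometheus server (Step 6) or an official client library does this parsing for you; the sketch is only meant to make the label/value structure of the scraped output concrete.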
diff --git a/examples/custom_metrics/config.properties b/examples/custom_metrics/config.properties new file mode 100644 index 0000000000..02607ac36d --- /dev/null +++ b/examples/custom_metrics/config.properties @@ -0,0 +1,12 @@ +metrics_mode=prometheus +metrics_config=examples/custom_metrics/metrics.yaml +models={\ + "mnist": {\ + "1.0": {\ + "defaultVersion": true,\ + "marName": "mnist.mar",\ + "minWorkers": 1,\ + "maxWorkers": 1\ + }\ + }\ +} diff --git a/examples/custom_metrics/metrics.yaml b/examples/custom_metrics/metrics.yaml new file mode 100644 index 0000000000..699ca19aea --- /dev/null +++ b/examples/custom_metrics/metrics.yaml @@ -0,0 +1,97 @@ +dimensions: + - &model_name "ModelName" + - &worker_name "WorkerName" + - &level "Level" + - &device_id "DeviceId" + - &hostname "Hostname" + +ts_metrics: + counter: + - name: Requests2XX + unit: Count + dimensions: [*level, *hostname] + - name: Requests4XX + unit: Count + dimensions: [*level, *hostname] + - name: Requests5XX + unit: Count + dimensions: [*level, *hostname] + - name: ts_inference_requests_total + unit: Count + dimensions: ["model_name", "model_version", "hostname"] + - name: ts_inference_latency_microseconds + unit: Microseconds + dimensions: ["model_name", "model_version", "hostname"] + - name: ts_queue_latency_microseconds + unit: Microseconds + dimensions: ["model_name", "model_version", "hostname"] + gauge: + - name: QueueTime + unit: Milliseconds + dimensions: [*level, *hostname] + - name: WorkerThreadTime + unit: Milliseconds + dimensions: [*level, *hostname] + - name: WorkerLoadTime + unit: Milliseconds + dimensions: [*worker_name, *level, *hostname] + - name: CPUUtilization + unit: Percent + dimensions: [*level, *hostname] + - name: MemoryUsed + unit: Megabytes + dimensions: [*level, *hostname] + - name: MemoryAvailable + unit: Megabytes + dimensions: [*level, *hostname] + - name: MemoryUtilization + unit: Percent + dimensions: [*level, *hostname] + - name: DiskUsage + unit: Gigabytes + 
dimensions: [*level, *hostname] + - name: DiskUtilization + unit: Percent + dimensions: [*level, *hostname] + - name: DiskAvailable + unit: Gigabytes + dimensions: [*level, *hostname] + - name: GPUMemoryUtilization + unit: Percent + dimensions: [*level, *device_id, *hostname] + - name: GPUMemoryUsed + unit: Megabytes + dimensions: [*level, *device_id, *hostname] + - name: GPUUtilization + unit: Percent + dimensions: [*level, *device_id, *hostname] + +model_metrics: + # Dimension "Hostname" is automatically added for model metrics in the backend + counter: + - name: InferenceRequestCount + unit: count + dimensions: [] + - name: PostprocessCallCount + unit: count + dimensions: [*model_name, *level] + gauge: + - name: HandlerTime + unit: ms + dimensions: [*model_name, *level] + - name: PredictionTime + unit: ms + dimensions: [*model_name, *level] + - name: RequestBatchSize + unit: count + dimensions: ["ModelName"] + - name: SizeOfImage + unit: kB + dimensions: [*model_name, *level] + - name: HandlerMethodTime + unit: ms + dimensions: ["MethodName", *model_name, *level] + histogram: + - name: ExamplePercentMetric + unit: percent + dimensions: [*model_name, *level] diff --git a/examples/custom_metrics/mnist_handler.py b/examples/custom_metrics/mnist_handler.py index 919b3a8f83..db162d753d 100644 --- a/examples/custom_metrics/mnist_handler.py +++ b/examples/custom_metrics/mnist_handler.py @@ -1,6 +1,9 @@ -import io -from PIL import Image +import time + from torchvision import transforms + +from ts.metrics.dimension import Dimension +from ts.metrics.metric_type_enum import MetricTypes from ts.torch_handler.image_classifier import ImageClassifier @@ -8,13 +11,29 @@ class MNISTDigitClassifier(ImageClassifier): """ MNISTDigitClassifier handler class. This handler extends class ImageClassifier from image_classifier.py, a default handler. This handler takes an image and returns the number in that image. 
- - Here method postprocess() has been overridden while others are reused from parent class. """ image_processing = transforms.Compose( - [transforms.ToTensor(), - transforms.Normalize((0.1307,), (0.3081,))]) + [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))] + ) + + def initialize(self, context): + super().initialize(context) + metrics = context.metrics + + # Usage of "add_metric" + self.inf_request_count = metrics.add_metric( + metric_name="InferenceRequestCount", + unit="count", + dimension_names=[], + metric_type=MetricTypes.COUNTER, + ) + metrics.add_metric( + metric_name="RequestBatchSize", + unit="count", + dimension_names=["ModelName"], + metric_type=MetricTypes.GAUGE, + ) def preprocess(self, data): """ @@ -27,10 +46,43 @@ def preprocess(self, data): Returns: tensor: Returns the tensor data of the input """ + preprocess_start = time.time() + metrics = self.context.metrics - input = data[0].get('body') - metrics.add_size('SizeOfImage', len(input) / 1024, None, 'kB') - return ImageClassifier.preprocess(self, data) + + # Usage of "add_or_update" + self.inf_request_count.add_or_update(value=1, dimension_values=[]) + + # Usage of "get_metric" + request_batch_size_metric = metrics.get_metric( + metric_name="RequestBatchSize", metric_type=MetricTypes.GAUGE + ) + request_batch_size_metric.add_or_update( + value=len(data), dimension_values=[self.context.model_name] + ) + + input = data[0].get("body") + + # Usage of "add_size" + metrics.add_size( + name="SizeOfImage", value=len(input) / 1024, idx=None, unit="kB" + ) + + preprocessed_image = ImageClassifier.preprocess(self, data) + + preprocess_stop = time.time() + + # usage of add_time + metrics.add_time( + name="HandlerMethodTime", + value=(preprocess_stop - preprocess_start) * 1000, + idx=None, + unit="ms", + dimensions=[Dimension(name="MethodName", value="preprocess")], + metric_type=MetricTypes.GAUGE, + ) + + return preprocessed_image def postprocess(self, data): """The post process of MNIST 
converts the predicted output response to a label. @@ -41,4 +93,17 @@ def postprocess(self, data): Returns: list : A list of dictionary with predictons and explanations are returned. """ + # usage of add_counter + self.context.metrics.add_counter( + name="PostprocessCallCount", value=1, idx=None, dimensions=[] + ) + # usage of add_percent + self.context.metrics.add_percent( + name="ExamplePercentMetric", + value=50, + idx=None, + dimensions=[], + metric_type=MetricTypes.HISTOGRAM, + ) + return data.argmax(1).flatten().tolist() diff --git a/examples/custom_metrics/torchserve_custom.mtail b/examples/custom_metrics/torchserve_custom.mtail deleted file mode 100644 index 15e642d762..0000000000 --- a/examples/custom_metrics/torchserve_custom.mtail +++ /dev/null @@ -1,24 +0,0 @@ -counter request_count -gauge image_size -gauge model_name -gauge level -gauge host_name -gauge request_id -gauge time_stamp - -# Sample log -# 2021-08-27 21:15:03,376 - PredictionTime.Milliseconds:109.74|#ModelName:bert,Level:Model|#hostname:ubuntu-ThinkPad-E14,requestID:09ed6c2c-9380-480d-a61a-66bfea958c1d,timestamp:1630079103 -# 2021-08-27 21:15:03,376 - HandlerTime.Milliseconds:109.74|#ModelName:bert,Level:Model|#hostname:ubuntu-ThinkPad-E14,requestID:09ed6c2c-9380-480d-a61a-66bfea958c1d,timestamp:1630079103 -# 2021-09-02 00:24:34,001 - InferenceTime.Milliseconds:3.05|#ModelName:mnist,Level:Model|#hostname:ubuntu-ThinkPad-E14,requestID:ce9a3631-e509-4a82-91c4-482cd2a15cd9,timestamp:1630522474 - -const PATTERN /SizeOfImage\.Kilobytes:(\d+\.\d+)\|#ModelName:([a-zA-Z]+),Level:([a-zA-Z]+)\|#hostname:([a-zA-Z0-9-]+),requestID:([a-zA-Z0-9-]+),timestamp:([0-9]+)/ - -PATTERN{ - request_count++ - image_size = $1 - model_name = $2 - level = $3 - host_name = $4 - request_id = $5 - time_stamp = $6 -} From e1339e1792756d180c1bdfaafe370359043ee0bb Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Fri, 4 Aug 2023 00:57:15 -0700 Subject: [PATCH 3/7] fix lint error --- docs/metrics.md | 2 +- 
ts_scripts/spellcheck_conf/wordlist.txt | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/metrics.md b/docs/metrics.md index 7ddde3dbd7..168418d1d2 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -191,7 +191,7 @@ model_metrics: # backend metrics Note that **only** the metrics defined in the **metrics configuration file** can be emitted to logs or made available via the metrics API endpoint. This is done to ensure that the metrics configuration file serves as a central inventory of all the metrics that Torchserve can emit. Default metrics are provided in the [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) file, but the user can either delete them to their liking / ignore them altogether, because these metrics will not be emitted unless they are edited.\ -When adding custom `model_metrics` in the metrics cofiguration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the following custom metrics APIs: [add_counter](#add-counter-based-metrics), +When adding custom `model_metrics` in the metrics configuration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the following custom metrics APIs: [add_counter](#add-counter-based-metrics), [add_time](#add-time-based-metrics), [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics). 
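The dimension-ordering requirement in the hunk above can be made concrete with a toy model. This is an illustration only, not TorchServe source: it assumes, as the doc text implies, that the custom metrics APIs append the default dimension values (model name, level) after the user-supplied ones, so the configured dimension names must end with `ModelName` and `Level` for names and values to pair up.

```python
def pair_dimensions(yaml_dimension_names, user_values, model_name="mnist", level="Model"):
    """Toy model of dimension pairing: default values are appended last,
    then zipped positionally against the names from the metrics config."""
    values = list(user_values) + [model_name, level]
    return dict(zip(yaml_dimension_names, values))

# Correct ordering: defaults listed last in the config -> labels line up.
ok = pair_dimensions(["MethodName", "ModelName", "Level"], ["preprocess"])
print(ok)   # {'MethodName': 'preprocess', 'ModelName': 'mnist', 'Level': 'Model'}

# Wrong ordering: defaults listed first -> every label gets the wrong value.
bad = pair_dimensions(["ModelName", "Level", "MethodName"], ["preprocess"])
print(bad)
```

The `HandlerMethodTime` entry in this example's `metrics.yaml` (`dimensions: ["MethodName", *model_name, *level]`) follows exactly this ordering.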
diff --git a/ts_scripts/spellcheck_conf/wordlist.txt b/ts_scripts/spellcheck_conf/wordlist.txt index 902439747a..7692ca780d 100644 --- a/ts_scripts/spellcheck_conf/wordlist.txt +++ b/ts_scripts/spellcheck_conf/wordlist.txt @@ -1068,3 +1068,8 @@ chatGPT baseimage cuDNN Xformer +ExamplePercentMetric +HandlerMethodTime +InferenceRequestCount +PostprocessCallCount +RequestBatchSize From 9af12bfd1fd9e0cc11637b559cfda31a31270175 Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Thu, 17 Aug 2023 13:39:56 -0700 Subject: [PATCH 4/7] Update custom metrics example to work with backwards compatible API --- examples/custom_metrics/README.md | 110 ++++++++++++----------- examples/custom_metrics/metrics.yaml | 6 ++ examples/custom_metrics/mnist_handler.py | 46 +++++++--- 3 files changed, 100 insertions(+), 62 deletions(-) diff --git a/examples/custom_metrics/README.md b/examples/custom_metrics/README.md index 6a199c5f41..53199ab3b3 100644 --- a/examples/custom_metrics/README.md +++ b/examples/custom_metrics/README.md @@ -10,6 +10,8 @@ Run the commands given in following steps from the root directory of the reposit - Step 1: In this example we add the following custom metrics and access them in prometheus format via the [metrics API endpoint](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md): - InferenceRequestCount + - InitializeCallCount + - PreprocessCallCount - PostprocessCallCount - RequestBatchSize - SizeOfImage @@ -65,42 +67,49 @@ scrape_configs: ```console curl http://127.0.0.1:8082/metrics +# HELP Requests2XX Torchserve prometheus counter metric with unit: Count +# TYPE Requests2XX counter +Requests2XX{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 1.0 # HELP PredictionTime Torchserve prometheus gauge metric with unit: ms # TYPE PredictionTime gauge -PredictionTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 23.3 -# HELP GPUMemoryUtilization Torchserve prometheus gauge metric with unit: Percent -# TYPE 
GPUMemoryUtilization gauge -# HELP ts_queue_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds -# TYPE ts_queue_latency_microseconds counter -ts_queue_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 164.607 +PredictionTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 62.78 +# HELP DiskUsage Torchserve prometheus gauge metric with unit: Gigabytes +# TYPE DiskUsage gauge +DiskUsage{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8.438858032226562 # HELP WorkerLoadTime Torchserve prometheus gauge metric with unit: Milliseconds # TYPE WorkerLoadTime gauge -WorkerLoadTime{WorkerName="W-9000-mnist_1.0",Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 5818.0 -# HELP SizeOfImage Torchserve prometheus gauge metric with unit: kB -# TYPE SizeOfImage gauge -SizeOfImage{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 0.265625 -# HELP PostprocessCallCount Torchserve prometheus counter metric with unit: count -# TYPE PostprocessCallCount counter -PostprocessCallCount{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 -# HELP GPUUtilization Torchserve prometheus gauge metric with unit: Percent -# TYPE GPUUtilization gauge +WorkerLoadTime{WorkerName="W-9000-mnist_1.0",Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 7425.0 # HELP Requests5XX Torchserve prometheus counter metric with unit: Count # TYPE Requests5XX counter -# HELP HandlerMethodTime Torchserve prometheus gauge metric with unit: ms -# TYPE HandlerMethodTime gauge -HandlerMethodTime{MethodName="preprocess",ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 13.740777969360352 -# HELP MemoryAvailable Torchserve prometheus gauge metric with unit: Megabytes -# TYPE MemoryAvailable gauge -MemoryAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 4584.23828125 -# HELP InferenceRequestCount Torchserve prometheus 
counter metric with unit: count -# TYPE InferenceRequestCount counter -InferenceRequestCount{Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP CPUUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE CPUUtilization gauge +CPUUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 100.0 +# HELP WorkerThreadTime Torchserve prometheus gauge metric with unit: Milliseconds +# TYPE WorkerThreadTime gauge +WorkerThreadTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 3.0 +# HELP DiskAvailable Torchserve prometheus gauge metric with unit: Gigabytes +# TYPE DiskAvailable gauge +DiskAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 308.94310760498047 # HELP ts_inference_requests_total Torchserve prometheus counter metric with unit: Count # TYPE ts_inference_requests_total counter ts_inference_requests_total{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP GPUMemoryUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE GPUMemoryUtilization gauge # HELP HandlerTime Torchserve prometheus gauge metric with unit: ms # TYPE HandlerTime gauge -HandlerTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 23.17 +HandlerTime{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 62.64 +# HELP ts_inference_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds +# TYPE ts_inference_latency_microseconds counter +ts_inference_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 64694.367 +# HELP MemoryUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE MemoryUtilization gauge +MemoryUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 53.1 +# HELP MemoryAvailable Torchserve prometheus gauge metric with unit: Megabytes +# TYPE MemoryAvailable gauge 
+MemoryAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 7677.29296875 +# HELP PostprocessCallCount Torchserve prometheus counter metric with unit: count +# TYPE PostprocessCallCount counter +PostprocessCallCount{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 # HELP ExamplePercentMetric Torchserve prometheus histogram metric with unit: percent # TYPE ExamplePercentMetric histogram ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="0.005",} 0.0 @@ -120,43 +129,42 @@ ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f ExamplePercentMetric_bucket{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",le="+Inf",} 1.0 ExamplePercentMetric_count{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 ExamplePercentMetric_sum{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 50.0 -# HELP WorkerThreadTime Torchserve prometheus gauge metric with unit: Milliseconds -# TYPE WorkerThreadTime gauge -WorkerThreadTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 3.0 -# HELP Requests2XX Torchserve prometheus counter metric with unit: Count -# TYPE Requests2XX counter -Requests2XX{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 1.0 +# HELP GPUUtilization Torchserve prometheus gauge metric with unit: Percent +# TYPE GPUUtilization gauge +# HELP MemoryUsed Torchserve prometheus gauge metric with unit: Megabytes +# TYPE MemoryUsed gauge +MemoryUsed{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 7903.734375 # HELP QueueTime Torchserve prometheus gauge metric with unit: Milliseconds # TYPE QueueTime gauge QueueTime{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 0.0 -# HELP MemoryUtilization Torchserve prometheus gauge metric with unit: Percent -# TYPE MemoryUtilization gauge -MemoryUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 72.0 -# HELP GPUMemoryUsed 
Torchserve prometheus gauge metric with unit: Megabytes -# TYPE GPUMemoryUsed gauge -# HELP ts_inference_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds -# TYPE ts_inference_latency_microseconds counter -ts_inference_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 26736.381 -# HELP DiskAvailable Torchserve prometheus gauge metric with unit: Gigabytes -# TYPE DiskAvailable gauge -DiskAvailable{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 306.9124526977539 +# HELP ts_queue_latency_microseconds Torchserve prometheus counter metric with unit: Microseconds +# TYPE ts_queue_latency_microseconds counter +ts_queue_latency_microseconds{model_name="mnist",model_version="default",hostname="88665a372f4b.ant.amazon.com",} 115.79 +# HELP PreprocessCallCount Torchserve prometheus counter metric with unit: count +# TYPE PreprocessCallCount counter +PreprocessCallCount{ModelName="mnist",Hostname="88665a372f4b.ant.amazon.com",} 1.0 # HELP RequestBatchSize Torchserve prometheus gauge metric with unit: count # TYPE RequestBatchSize gauge RequestBatchSize{ModelName="mnist",Hostname="88665a372f4b.ant.amazon.com",} 1.0 -# HELP DiskUsage Torchserve prometheus gauge metric with unit: Gigabytes -# TYPE DiskUsage gauge -DiskUsage{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8.438858032226562 +# HELP SizeOfImage Torchserve prometheus gauge metric with unit: kB +# TYPE SizeOfImage gauge +SizeOfImage{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 0.265625 # HELP Requests4XX Torchserve prometheus counter metric with unit: Count # TYPE Requests4XX counter -# HELP MemoryUsed Torchserve prometheus gauge metric with unit: Megabytes -# TYPE MemoryUsed gauge -MemoryUsed{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 8699.48046875 -# HELP CPUUtilization Torchserve prometheus gauge metric with unit: Percent -# TYPE CPUUtilization gauge 
-CPUUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 33.3 +# HELP HandlerMethodTime Torchserve prometheus gauge metric with unit: ms +# TYPE HandlerMethodTime gauge +HandlerMethodTime{MethodName="preprocess",ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 25.554895401000977 +# HELP InitializeCallCount Torchserve prometheus counter metric with unit: count +# TYPE InitializeCallCount counter +InitializeCallCount{ModelName="mnist",Level="Model",Hostname="88665a372f4b.ant.amazon.com",} 1.0 # HELP DiskUtilization Torchserve prometheus gauge metric with unit: Percent # TYPE DiskUtilization gauge DiskUtilization{Level="Host",Hostname="88665a372f4b.ant.amazon.com",} 2.7 +# HELP GPUMemoryUsed Torchserve prometheus gauge metric with unit: Megabytes +# TYPE GPUMemoryUsed gauge +# HELP InferenceRequestCount Torchserve prometheus counter metric with unit: count +# TYPE InferenceRequestCount counter +InferenceRequestCount{Hostname="88665a372f4b.ant.amazon.com",} 1.0 ``` - Step 8: Navigate to `http://localhost:9090/` on a browser to execute queries and create graphs diff --git a/examples/custom_metrics/metrics.yaml b/examples/custom_metrics/metrics.yaml index 699ca19aea..a4f31fdfe3 100644 --- a/examples/custom_metrics/metrics.yaml +++ b/examples/custom_metrics/metrics.yaml @@ -72,6 +72,12 @@ model_metrics: - name: InferenceRequestCount unit: count dimensions: [] + - name: InitializeCallCount + unit: count + dimensions: [*model_name, *level] + - name: PreprocessCallCount + unit: count + dimensions: [*model_name] - name: PostprocessCallCount unit: count dimensions: [*model_name, *level] diff --git a/examples/custom_metrics/mnist_handler.py b/examples/custom_metrics/mnist_handler.py index db162d753d..632afd5a82 100644 --- a/examples/custom_metrics/mnist_handler.py +++ b/examples/custom_metrics/mnist_handler.py @@ -21,18 +21,31 @@ def initialize(self, context): super().initialize(context) metrics = context.metrics - # Usage of "add_metric" - 
self.inf_request_count = metrics.add_metric( + # "add_metric_to_cache" will only register/override(if already present) a metric object in the metric cache and will not emit it + self.inf_request_count = metrics.add_metric_to_cache( metric_name="InferenceRequestCount", unit="count", dimension_names=[], metric_type=MetricTypes.COUNTER, ) - metrics.add_metric( - metric_name="RequestBatchSize", + metrics.add_metric_to_cache( + metric_name="PreprocessCallCount", unit="count", dimension_names=["ModelName"], - metric_type=MetricTypes.GAUGE, + metric_type=MetricTypes.COUNTER, + ) + + # "add_metric" will register the metric if not already present in metric cache, + # include the "ModelName" and "Level" dimensions by default and emit it + metrics.add_metric( + name="InitializeCallCount", + value=1, + unit="count", + dimensions=[ + Dimension(name="ModelName", value=context.model_name), + Dimension(name="Level", value="Model"), + ], + metric_type=MetricTypes.COUNTER, ) def preprocess(self, data): @@ -50,10 +63,17 @@ def preprocess(self, data): metrics = self.context.metrics - # Usage of "add_or_update" + # "add_or_update" will emit the metric self.inf_request_count.add_or_update(value=1, dimension_values=[]) - # Usage of "get_metric" + # "get_metric" will fetch the corresponding metric from metric cache if present + preprocess_call_count_metric = metrics.get_metric( + metric_name="PreprocessCallCount", metric_type=MetricTypes.COUNTER + ) + preprocess_call_count_metric.add_or_update( + value=1, dimension_values=[self.context.model_name] + ) + request_batch_size_metric = metrics.get_metric( metric_name="RequestBatchSize", metric_type=MetricTypes.GAUGE ) @@ -63,7 +83,8 @@ def preprocess(self, data): input = data[0].get("body") - # Usage of "add_size" + # "add_size" will register the metric if not already present in metric cache, + # include the "ModelName" and "Level" dimensions by default and emit it metrics.add_size( name="SizeOfImage", value=len(input) / 1024, idx=None, 
unit="kB" ) @@ -72,7 +93,8 @@ def preprocess(self, data): preprocess_stop = time.time() - # usage of add_time + # "add_time" will register the metric if not already present in metric cache, + # include the "ModelName" and "Level" dimensions by default and emit it metrics.add_time( name="HandlerMethodTime", value=(preprocess_stop - preprocess_start) * 1000, @@ -93,11 +115,13 @@ def postprocess(self, data): Returns: list : A list of dictionary with predictons and explanations are returned. """ - # usage of add_counter + # "add_counter" will register the metric if not already present in metric cache, + # include the "ModelName" and "Level" dimensions by default and emit it self.context.metrics.add_counter( name="PostprocessCallCount", value=1, idx=None, dimensions=[] ) - # usage of add_percent + # "add_percent" will register the metric if not already present in metric cache, + # include the "ModelName" and "Level" dimensions by default and emit it self.context.metrics.add_percent( name="ExamplePercentMetric", value=50, From d7fe4eb9cddfe63c290802747f8b107a47d95300 Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Thu, 17 Aug 2023 18:19:01 -0700 Subject: [PATCH 5/7] Update custom metrics API documentation --- docs/metrics.md | 124 ++++++++++++++++++++------- ts/metrics/metric_cache_yaml_impl.py | 2 +- 2 files changed, 94 insertions(+), 32 deletions(-) diff --git a/docs/metrics.md b/docs/metrics.md index 168418d1d2..9c350c9dc6 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -10,7 +10,7 @@ * [Custom Metrics API](#custom-metrics-api) * [Logging custom metrics](#log-custom-metrics) * [Metrics YAML Parsing and Metrics API example](#Metrics-YAML-File-Parsing-and-Metrics-API-Custom-Handler-Example) -* [Backwards compatibility warnings](#backwards-compatibility-warnings) +* [Backwards compatibility warnings](#backwards-compatibility-warnings-and-upgrade-guide) ## Introduction @@ -197,7 +197,7 @@ When adding custom `model_metrics` in the metrics configuration file, ensure 
to ### How it works -Whenever torchserve starts, the [backend worker](https://github.com/pytorch/serve/blob/master/ts/model_service_worker.py) initializes `service.context.metrics` with the [MetricsCache](https://github.com/pytorch/serve/blob/master/ts/metrics/metric_cache_yaml_impl.py) object. The `model_metrics` (backend metrics) section within the specified yaml file will be parsed, and Metric objects will be created based on the parsed section and added that are added to the cache. +Whenever torchserve starts, the [backend worker](https://github.com/pytorch/serve/blob/master/ts/model_service_worker.py) initializes `service.context.metrics` with the [MetricsCache](https://github.com/pytorch/serve/blob/master/ts/metrics/metric_cache_yaml_impl.py) object. The `model_metrics` (backend metrics) section within the specified yaml file will be parsed, and Metric objects will be created based on the parsed section and added to the cache. This is all done internally, so the user does not have to do anything other than specifying the desired yaml file. @@ -248,7 +248,7 @@ When adding any metric via Metrics API, users have the ability to override the m `metric_type=MetricTypes.[COUNTER/GAUGE/HISTOGRAM]`. ```python -metric1 = metrics.add_metric("GenericMetric", unit=unit, dimension_names=["name1", "name2", ...], metric_type=MetricTypes.GAUGE) +metric1 = metrics.add_metric_to_cache("GenericMetric", unit=unit, dimension_names=["name1", "name2", ...], metric_type=MetricTypes.GAUGE) metric.add_or_update(value, dimension_values=["value1", "value2", ...]) # Backwards compatible, combines the above two method calls @@ -316,31 +316,35 @@ dimN= Dimension(name_n, value_n) One can add metrics with generic units using the following function. 
-Function API +#### Function API to add generic metrics without default dimensions ```python - def add_metric(self, metric_name: str, unit: str, idx=None, dimension_names: list = None, - metric_type: MetricTypes = MetricTypes.COUNTER) -> None: + def add_metric_to_cache( + self, + metric_name: str, + unit: str, + dimension_names: list = [], + metric_type: MetricTypes = MetricTypes.COUNTER, + ) -> CachingMetric: """ - Create a new metric and add into cache. - Add a metric which is generic with custom metrics + Create a new metric and add into cache. Override existing metric if already present. Parameters ---------- - metric_name: str + metric_name str Name of metric - value: int, float - value of metric - unit: str - unit of metric - idx: int - request_id index in batch - dimensions: list - list of dimensions for the metric - metric_type: MetricTypes - Type of metric + unit str + unit can be one of ms, percent, count, MB, GB or a generic string + dimension_names list + list of dimension name strings for the metric + metric_type MetricTypes + Type of metric Counter, Gauge, Histogram + Returns + ------- + newly created Metrics object """ + def add_or_update( self, value: int or float, @@ -365,10 +369,52 @@ Function API # Add Distance as a metric # dimensions = [dim1, dim2, dim3, ..., dimN] # Assuming batch size is 1 for example -metric = metrics.add_metric('DistanceInKM', unit='km', dimension_names=[...]) +metric = metrics.add_metric_to_cache('DistanceInKM', unit='km', dimension_names=[...]) metric.add_or_update(distance, dimension_values=[...]) ``` +Note that calling `add_metric_to_cache` will not emit the metric, `add_or_update` will need to be called on the metric object as shown above. 
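The register-then-emit split described above can be illustrated with a minimal stdlib-only mock (the real `MetricsCache` and `CachingMetric` classes live in `ts.metrics` and differ in detail; this sketch only shows that registration and emission are separate steps):

```python
class CachingMetric:
    """Mock of a cached metric: emission pairs cached dimension names with per-call values."""

    def __init__(self, metric_name, unit, dimension_names, metric_type):
        self.metric_name = metric_name
        self.unit = unit
        self.dimension_names = dimension_names
        self.metric_type = metric_type
        self.emitted = []

    def add_or_update(self, value, dimension_values):
        # Emission happens here, not at registration time.
        self.emitted.append((value, dict(zip(self.dimension_names, dimension_values))))


class MetricsCache:
    """Mock of the metric cache: add_metric_to_cache registers/overrides but never emits."""

    def __init__(self):
        self._cache = {}

    def add_metric_to_cache(self, metric_name, unit, dimension_names=None, metric_type="counter"):
        metric = CachingMetric(metric_name, unit, dimension_names or [], metric_type)
        self._cache[(metric_name, metric_type)] = metric  # register or override, no emission
        return metric


metrics = MetricsCache()
metric = metrics.add_metric_to_cache("DistanceInKM", unit="km", dimension_names=["ModelName", "Level"])
assert metric.emitted == []  # registration alone emits nothing
metric.add_or_update(10, dimension_values=["mnist", "Model"])  # emission happens here
assert metric.emitted == [(10, {"ModelName": "mnist", "Level": "Model"})]
```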
+
+#### Function API to add generic metrics with default dimensions
+
+```python
+    def add_metric(
+        self,
+        name: str,
+        value: int or float,
+        unit: str,
+        idx: str = None,
+        dimensions: list = [],
+        metric_type: MetricTypes = MetricTypes.COUNTER,
+    ):
+        """
+        Add a generic metric
+        Default metric type is counter
+
+        Parameters
+        ----------
+        name : str
+            metric name
+        value: int or float
+            value of the metric
+        unit: str
+            unit of metric
+        idx: str
+            request id to be associated with the metric
+        dimensions: list
+            list of Dimension objects for the metric
+        metric_type MetricTypes
+            Type of metric Counter, Gauge, Histogram
+        """
+```
+
+```python
+# Add Distance as a metric
+# dimensions = [dim1, dim2, dim3, ..., dimN]
+# Unlike "add_metric_to_cache", "add_metric" emits the metric directly and does not return a metric object
+metrics.add_metric('DistanceInKM', value=10, unit='km', dimensions=[...])
+```
+
+
 ### Add time-based metrics
 
 **Time-based metrics are defaulted to a `GAUGE` metric type**
@@ -629,22 +675,38 @@ class CustomHandlerExample:
         metrics.add_size("GaugeModelMetricNameExample", 42.5)
 ```
 
-## Backwards compatibility warnings
+## Backwards compatibility warnings and upgrade guide
 1. 
Starting [v0.6.1](https://github.com/pytorch/serve/releases/tag/v0.6.1), the `add_metric` API signature changed\ - from [add_metric(name, value, unit, idx=None, dimensions=None)](https://github.com/pytorch/serve/blob/61f1c4182e6e864c9ef1af99439854af3409d325/ts/metrics/metrics_store.py#L184)\ - to [add_metric(metric_name, unit, dimension_names, metric_type)](https://github.com/pytorch/serve/blob/35ef00f9e62bb7fcec9cec92630ae757f9fb0db0/ts/metrics/metric_cache_abstract.py#L272).\ + from: [add_metric(name, value, unit, idx=None, dimensions=None)](https://github.com/pytorch/serve/blob/61f1c4182e6e864c9ef1af99439854af3409d325/ts/metrics/metrics_store.py#L184)\ + to: [add_metric(metric_name, unit, dimension_names=None, metric_type=MetricTypes.COUNTER)](https://github.com/pytorch/serve/blob/35ef00f9e62bb7fcec9cec92630ae757f9fb0db0/ts/metrics/metric_cache_abstract.py#L272).\ + In versions greater than v0.8.1 the `add_metric` API signature was updated to support backwards compatibility:\ + from: [add_metric(metric_name, unit, dimension_names=None, metric_type=MetricTypes.COUNTER)](https://github.com/pytorch/serve/blob/35ef00f9e62bb7fcec9cec92630ae757f9fb0db0/ts/metrics/metric_cache_abstract.py#L272)\ + to: `add_metric(name, value, unit, idx=None, dimensions=[], metric_type=MetricTypes.COUNTER)`\ Usage of the new API is shown [above](#specifying-metric-types).\ + **Upgrade paths**: + - **[< v0.6.1] to [v0.6.1 - v0.8.1]**\ There are two approaches available when migrating to the new custom metrics API: - - Replace the call to `add_metric` in versions prior to v0.6.1 with calls to the following methods: - ``` - metric1 = metrics.add_metric("GenericMetric", unit=unit, dimension_names=["name1", "name2", ...], metric_type=MetricTypes.GAUGE) - metric1.add_or_update(value, dimension_values=["value1", "value2", ...]) - ``` - - Replace the call to `add_metric` in versions prior to v0.6.1 with one of the suitable custom metrics APIs where applicable: 
[add_counter](#add-counter-based-metrics), [add_time](#add-time-based-metrics), - [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics) + - Replace the call to `add_metric` with calls to the following methods: + ```python + metric1 = metrics.add_metric("GenericMetric", unit=unit, dimension_names=["name1", "name2", ...], metric_type=MetricTypes.GAUGE) + metric1.add_or_update(value, dimension_values=["value1", "value2", ...]) + ``` + - Replace the call to `add_metric` in versions prior to v0.6.1 with one of the suitable custom metrics APIs where applicable: [add_counter](#add-counter-based-metrics), [add_time](#add-time-based-metrics), + [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics) + - **[< v0.6.1] to [> v0.8.1]**\ + The call to `add_metric` is backwards compatible but the metric type is inferred to be `COUNTER`. If the metric is of a different type, an additional argument `metric_type` will need to be provided to the `add_metric` + call shown below + ```python + metrics.add_metric(name='GenericMetric', value=10, unit='count', dimensions=[...], metric_type=MetricTypes.GAUGE) + ``` + - **[v0.6.1 - v0.8.1] to [> v0.8.1]**\ + Replace the call to `add_metric` with `add_metric_to_cache`. 2. 
Starting [v0.8.0](https://github.com/pytorch/serve/releases/tag/v0.8.0), only metrics that are defined in the metrics config file(default: [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml)) are either all logged to `ts_metrics.log` and `model_metrics.log` or made available via the [metrics API endpoint](https://github.com/pytorch/serve/blob/master/docs/metrics_api.md) based on the `metrics_mode` configuration as described [above](#introduction).\ The default `metrics_mode` is `log` mode.\ This is unlike in previous versions where all metrics were only logged to `ts_metrics.log` and `model_metrics.log` except for `ts_inference_requests_total`, `ts_inference_latency_microseconds` and `ts_queue_latency_microseconds` - which were only available via the metrics API endpoint. + which were only available via the metrics API endpoint.\ + **Upgrade paths**: + - **[< v0.8.0] to [>= v0.8.0]**\ + Specify all the custom metrics added to the custom handler in the metrics configuration file as shown [above](#central-metrics-yaml-file-definition). diff --git a/ts/metrics/metric_cache_yaml_impl.py b/ts/metrics/metric_cache_yaml_impl.py index 7206c83c30..fa170dd816 100644 --- a/ts/metrics/metric_cache_yaml_impl.py +++ b/ts/metrics/metric_cache_yaml_impl.py @@ -109,7 +109,7 @@ def add_metric_to_cache( metric_type: MetricTypes = MetricTypes.COUNTER, ) -> CachingMetric: """ - Create a new metric and add into cache + Create a new metric and add into cache. Override existing metric with same name if present. 
Parameters ---------- From c282d5f839c725e348cc898923ba19328b77800a Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Thu, 17 Aug 2023 18:28:21 -0700 Subject: [PATCH 6/7] Fix linter error --- ts_scripts/spellcheck_conf/wordlist.txt | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ts_scripts/spellcheck_conf/wordlist.txt b/ts_scripts/spellcheck_conf/wordlist.txt index 7692ca780d..7618579767 100644 --- a/ts_scripts/spellcheck_conf/wordlist.txt +++ b/ts_scripts/spellcheck_conf/wordlist.txt @@ -1073,3 +1073,5 @@ HandlerMethodTime InferenceRequestCount PostprocessCallCount RequestBatchSize +InitializeCallCount +PreprocessCallCount From c4965dd798fb74564077484b12141b6f64e92a9f Mon Sep 17 00:00:00 2001 From: Naman Nandan Date: Fri, 18 Aug 2023 15:35:42 -0700 Subject: [PATCH 7/7] fix documentation --- docs/metrics.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/metrics.md b/docs/metrics.md index 9c350c9dc6..48b2065feb 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -10,7 +10,7 @@ * [Custom Metrics API](#custom-metrics-api) * [Logging custom metrics](#log-custom-metrics) * [Metrics YAML Parsing and Metrics API example](#Metrics-YAML-File-Parsing-and-Metrics-API-Custom-Handler-Example) -* [Backwards compatibility warnings](#backwards-compatibility-warnings-and-upgrade-guide) +* [Backwards compatibility warnings and upgrade guide](#backwards-compatibility-warnings-and-upgrade-guide) ## Introduction @@ -191,7 +191,8 @@ model_metrics: # backend metrics Note that **only** the metrics defined in the **metrics configuration file** can be emitted to logs or made available via the metrics API endpoint. This is done to ensure that the metrics configuration file serves as a central inventory of all the metrics that Torchserve can emit. 
 Default metrics are provided in the [metrics.yaml](https://github.com/pytorch/serve/blob/master/ts/configs/metrics.yaml) file, but the user can either delete them to their liking / ignore them altogether, because these metrics will not be emitted unless they are edited.\
-When adding custom `model_metrics` in the metrics configuration file, ensure to include `ModelName` and `Level` dimension names towards the end of the list of dimensions since they are included by default by the following custom metrics APIs: [add_counter](#add-counter-based-metrics),
+When adding custom `model_metrics` in the metrics configuration file, be sure to include the `ModelName` and `Level` dimension names towards the end of the list of dimensions, since they are included by default by the following custom metrics APIs:
+[add_metric](#function-api-to-add-generic-metrics-with-default-dimensions), [add_counter](#add-counter-based-metrics),
 [add_time](#add-time-based-metrics), [add_size](#add-size-based-metrics) or [add_percent](#add-percentage-based-metrics).
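As an illustrative sketch of the guidance in the hunk above, a custom model metric entry in the metrics configuration file could look like the following (the metric name and the `MethodName` dimension are hypothetical; `*model_name` and `*level` are the YAML anchors already defined in the default metrics.yaml):

```yaml
model_metrics:
  - name: ExampleHandlerMetric   # hypothetical metric emitted from a custom handler
    unit: count
    # Custom dimension names come first; ModelName and Level come last, because the
    # add_metric/add_counter/add_time/add_size/add_percent APIs append those two
    # dimensions by default.
    dimensions: ["MethodName", *model_name, *level]
```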