Add simple report generator #3602

Merged: 5 commits merged on Jan 29, 2024

@dbutenhof dbutenhof commented Jan 21, 2024

This will report on the state of the ARCHIVE, BACKUP, and CACHE on-disk trees in addition to the state of the SQL database. (I'm going to leave analyzing and reporting on the OpenSearch database for another time, since this is "off books" weekend upstream work!)
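
For context, the on-disk part of such a report can be sketched as a single walk over a tree. This is only an illustration, not the code in `lib/pbench/cli/server/report.py`: the function name `scan_tarballs`, the `TreeStats` tuple, and the `*.tar.xz` pattern are all assumptions made for the example.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class TreeStats(NamedTuple):
    """Summary of one on-disk tree (hypothetical structure for this sketch)."""

    count: int
    total_bytes: int
    smallest: Optional[Path]
    biggest: Optional[Path]


def scan_tarballs(root: Path, pattern: str = "*.tar.xz") -> TreeStats:
    """Walk `root` recursively and summarize the files matching `pattern`."""
    count = 0
    total = 0
    smallest = biggest = None
    smallest_size = biggest_size = None
    for path in root.rglob(pattern):
        if not path.is_file():
            continue
        size = path.stat().st_size
        count += 1
        total += size
        # Track the extremes as we go so the tree is walked only once.
        if smallest_size is None or size < smallest_size:
            smallest, smallest_size = path, size
        if biggest_size is None or size > biggest_size:
            biggest, biggest_size = path, size
    return TreeStats(count, total, smallest, biggest)
```

Calling `scan_tarballs(Path("/some/archive/root"))` on a real tree would give the count, total size, and the two extremes that the "Archive report" lines summarize; the path here is a placeholder.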

This packages the ad hoc SQL queries I've been doing to monitor the server as a CLI utility, plus some more.

Here's the output of `pbench-report-generator --all` on the production server:

```
Archive report:
  117,446 tarballs consuming 21.7 TB
  The smallest tarball is 1.0 kB, pbench-user-benchmark__2020.04.03T11.05.44
  The biggest tarball is 41.1 GB, uperf_Azure_RHEL-8.10.0-20240116.45_x86_64_gen2_pci_netvsc_quick_D240125T014727_2024.01.25T01.47.28
Backup report:
  117,447 tarballs consuming 21.7 TB
Cache report:
  97,464 datasets consuming 45.6 TB
  4 datasets have never been unpacked, 0 are missing reference timestamps, 0 have bad size metadata
  The smallest cache is 24.6 kB, pbench-user-benchmark__2020.04.03T11.05.44
  The biggest cache is 110.5 GB, trafficgen_RHOSP16.2-RHEL8.3-nrt-OVS-OFFLOAD-PVP-LossTests_tg:trex_r:none_fs:64,128,256,512,1024,1500_nf:1024_fm:si_td:bi_ml:0.002,0.0005,0.0001_tt:bs__2020-12-26T03:16:38
  The least recently used cache was referenced Dec 11, specjbb2005__2023.09.22T00.22.28
  The most recently used cache was referenced today, uperf_rhel84_4.18.0.277_kernel_10gb_jumbo_2021.01.26T09.51.18
SQL storage report:
  Table                Rows       Storage
  -------------------- ---------- ----------
  alembic_version               1    57.3 kB
  audit                   683,922   224.7 MB
  datasets                117,449    34.3 MB
  templates                    12   221.2 kB
  server_settings               0    24.6 kB
  users                        11    81.9 kB
  dataset_metadata        352,344   217.9 MB
  dataset_operations      340,986    29.1 MB
  api_keys                      5    81.9 kB
  indexmaps               291,510    79.7 GB
Operational states:
  UPLOAD states:
          OK  117,449
  TOOLINDEX states:
       READY  106,112
  INDEX states:
          OK  106,112
      FAILED      494
           CODE  7:    365  Bad metadata.log file encountered
           CODE  1:    128  Operational error while indexing
           CODE 12:      1  Unexpected error encountered
       READY   10,819
```
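
The "Operational states" section above is essentially a grouped count of (operation, state) pairs. A minimal sketch of that tally follows, assuming the pairs have already been fetched from the database; `tally_states` and `format_states` are hypothetical names, and the column widths are only an approximation of the real layout.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def tally_states(rows: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count (operation, state) pairs, grouped by operation name."""
    by_op: Dict[str, Counter] = defaultdict(Counter)
    for operation, state in rows:
        by_op[operation][state] += 1
    return by_op


def format_states(by_op: Dict[str, Counter]) -> str:
    """Render the tally as indented text, one block per operation."""
    lines = ["Operational states:"]
    for operation, states in by_op.items():
        lines.append(f"  {operation} states:")
        for state, count in states.items():
            # Right-align both columns; counts get thousands separators.
            lines.append(f"    {state:>8s} {count:>8,d}")
    return "\n".join(lines)
```

For example, `print(format_states(tally_states([("UPLOAD", "OK"), ("INDEX", "FAILED")])))` prints one block per operation in first-seen order.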

@dbutenhof dbutenhof added the Server, Database, and Operations labels Jan 21, 2024
@dbutenhof dbutenhof requested a review from webbnh January 21, 2024 20:42
@dbutenhof dbutenhof self-assigned this Jan 21, 2024
@dbutenhof dbutenhof left a comment

> I forgot to add...are you going to update the PR description with the new output?

Good point, although I'm not sure it needs to be merged into the repo -- it was just to give you some context.

This will report on the state of the ARCHIVE, BACKUP, and CACHE on-disk trees
in addition to the state of the SQL database. (I'm going to leave analyzing
and reporting on the OpenSearch database for another time, since this is "off
books" weekend upstream work!)

This packages the ad hoc SQL queries I've been doing to monitor the server as
a CLI utility, plus some more.

Here's the output of `pbench-report-generator --all` on the production server:

```
Archive report:
  117286 tarballs: 21.5 TB
  The smallest tarball, pbench-user-benchmark__2020.04.03T11.05.44, is 1.0 kB
  The biggest tarball, uperf_osp16_1_ml2ovs_25g_ew_2020.11.16T08.05.28, is 26.1 GB
Backup report:
  117286 tarballs are backed up, consuming 21.5 TB
Cache report:
  103904 datasets are cached, consuming 44.9 TB
  8 datasets have never been unpacked, 3 are missing reference timestamps, 0 have bad size metadata
  The smallest cache, pbench-user-benchmark__2020.04.03T11.05.44, is 24.6 kB
  The biggest cache, trafficgen_RHOSP16.2-RHEL8.3-nrt-OVS-OFFLOAD-PVP-LossTests_tg:trex_r:none_fs:64,128,256,512,1024,1500_nf:1024_fm:si_td:bi_ml:0.002,0.0005,0.0001_tt:bs__2020-12-26T03:16:38, is 110.5 GB
  The least recently used cache, uperf__2023.12.02T00.33.06, was referenced Dec 07
  The most recently used cache, uperf_tuned_virtual-guest_sys_file_none_2020.06.11T10.37.30, was referenced today
Operational states:
  UPLOAD states:
          OK   117285
  TOOLINDEX states:
       READY   103561
  INDEX states:
          OK   103561
      FAILED      376
       READY    13324
SQL storage report:
  Table                Rows       Storage
  -------------------- ---------- ----------
  alembic_version               1 57.3 kB
  audit                    672249 221.8 MB
  datasets                 117285 34.3 MB
  templates                    12 221.2 kB
  server_settings               0 24.6 kB
  users                        10 81.9 kB
  dataset_metadata         351852 217.6 MB
  dataset_operations       338107 28.9 MB
  api_keys                      5 81.9 kB
  indexmaps                283670 74.5 GB
```
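
The sizes in the report use decimal suffixes (kB, MB, GB, TB), and the SQL storage table is a fixed-width layout. Here is a rough sketch of both, assuming power-of-1000 units and per-table `(name, rows, bytes)` tuples as input; `natural_size` and `format_table` are illustrative helpers, not the functions used by the PR.

```python
from typing import Iterable, Tuple


def natural_size(num_bytes: int) -> str:
    """Format a byte count with decimal (power-of-1000) units."""
    if num_bytes < 1000:
        return f"{num_bytes} Bytes"
    value = float(num_bytes)
    unit = "kB"
    for unit in ("kB", "MB", "GB", "TB", "PB"):
        value /= 1000.0
        if value < 1000.0:
            break
    return f"{value:.1f} {unit}"


def format_table(rows: Iterable[Tuple[str, int, int]]) -> str:
    """Render (table name, row count, size in bytes) tuples as fixed-width text."""
    lines = [
        f"  {'Table':<20} {'Rows':<10} Storage",
        f"  {'-' * 20} {'-' * 10} {'-' * 10}",
    ]
    for name, count, size in rows:
        # Names are left-aligned; the numeric columns are right-aligned.
        lines.append(f"  {name:<20} {count:>10,d} {natural_size(size):>10}")
    return "\n".join(lines)
```

Right-aligning the storage column keeps the units lined up when the magnitudes differ, which makes a large table easier to spot at a glance.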
The `Watcher` is now asynchronous, so it can indicate progress even when we're
not in control. I also tweaked the formatting, and enhanced the detailed SQL
dump.
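
To illustrate the idea of an asynchronous watcher (not the actual `Watcher` class in this PR), here is a minimal sketch that reports status from a background thread, so progress keeps appearing while the main thread is blocked in a long query or filesystem walk. The constructor parameters, the `update` method, and the context-manager interface are all assumptions made for the example.

```python
import threading
import time


class Watcher:
    """Hypothetical progress reporter running on a background thread."""

    def __init__(self, interval: float = 1.0, emit=print):
        self.interval = interval
        self.emit = emit
        self.status = "starting"
        self._lock = threading.Lock()
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def update(self, status: str) -> None:
        """Record what the main thread is currently doing."""
        with self._lock:
            self.status = status

    def _run(self) -> None:
        start = time.monotonic()
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop reports once per interval until __exit__ is called.
        while not self._done.wait(self.interval):
            with self._lock:
                status = self.status
            self.emit(f"[{time.monotonic() - start:.1f}s] {status}")

    def __enter__(self) -> "Watcher":
        self._thread.start()
        return self

    def __exit__(self, *exc) -> bool:
        self._done.set()
        self._thread.join()
        return False
```

Typical use would be `with Watcher(interval=2.0) as w: w.update("scanning archive"); ...`, with the long-running work inside the `with` block.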

I'm resisting the temptation to add colors for detailed errors and extended
info.

@webbnh webbnh left a comment

👍

@dbutenhof dbutenhof merged commit 369736b into distributed-system-analysis:main Jan 29, 2024
4 checks passed
@dbutenhof dbutenhof deleted the report branch January 29, 2024 19:01