
HOWTO: Make complex queries from the database


First, read this documentation and this documentation, then proceed below. CloudBench recently (in 2020) introduced MySQL support. We use exactly the same database schema as before, which (while currently unversioned) has remained essentially unchanged since the project's inception in 2011. The MySQL schema is exactly the same as the MongoDB schema, except that we use MySQL's JSON support, which keeps the operation of CloudBench 100% backward-compatible with MongoDB.

In the future, we may create a migration tool. Regardless, below we walk through the layout of the tables in MySQL as an example of how to make a complex query of timeseries information in the database.

Connecting via Private mode vs. Shared mode:

In the configuration file, you can have the following:

[METRICSTORE]
KIND = mysql # or mongodb
# along with the associated passwords, port information and hostnames.

But, you can also do the following:

[METRICSTORE]
USAGE = shared # default is private
KIND = mysql # or mongodb

By default, upon an initial install of CloudBench, we transparently start up a local copy of the database for you. It should "just work". CloudBench automatically manages the startup of the database as your local username (not as root). When we do this, we configure the database without any authentication at all: it runs as your local username, without any password.
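
For example, in private mode you can connect to the local database directly, with no password at all (this assumes the default port and database name described below):

cbtool$ mysql -h127.0.0.1 -P 33306 -uroot metrics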

Alternatively, as you scale up your benchmarks, you will eventually switch to "shared" mode, which tells CloudBench to stop doing that so that your database can be located elsewhere. Furthermore, whether it's MySQL or MongoDB, the name of the database holding your data can be changed like this:

[METRICSTORE]
USAGE = shared
KIND = mysql
HOST = xxxxx
DATABASE = xxxxxx # default 'metrics'
MYSQL_USERNAME = xxxx # or MONGODB_USERNAME
MYSQL_PORT = xxxx # or MONGODB_PORT
             # Port 33306 (NOT 3306) is the default port we use, so as not to conflict
             # with any existing databases while running in "private" mode

As you can see above, the default database name is *metrics*.

The runtime tables where the CloudBench timeseries data is located:


cbtool$ mysql -h127.0.0.1 -P 33306 -uroot metrics -e "show tables LIKE '%runtime%'" | grep -v "metric_names"
Tables_in_metrics (%runtime%)
latest_runtime_app_VM_mrhines
latest_runtime_os_HOST_mrhines
latest_runtime_os_VM_mrhines
runtime_app_VM_mrhines
runtime_os_HOST_mrhines
runtime_os_VM_mrhines

latest_ simply refers to the most recently produced metric. You can skip these tables for the most part, as they are just optimizations: they are used by the user interface and are not meant for complex queries.
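
If you are curious, you can peek at one anyway (a sketch; this assumes the latest_ tables carry the same document layout as their runtime_ counterparts):

mysql> select * from latest_runtime_app_VM_mrhines limit 1;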

Identifying the right table/collection:

If you're installing CloudBench for the first time, the two tables that matter the most are the application-level metrics and the guest operating system metrics, which ship with CloudBench by default:

runtime_app_VM_mrhines
runtime_os_VM_mrhines

app means application metrics and os means guest operating system metrics.

Primary keys

Every timeseries metric we produce has the following two primary keys associated with it:

uuid
expid

The expid corresponds directly to the expid API command, which is used to distinguish one CloudBench session from another until the cloud is detached.

The uuid corresponds to each unique object (e.g. an application or a virtual machine, which are distinct objects).

The UUID comes directly from the vmshow and appshow API commands and should be used accordingly to look up metrics.
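
Together, these two keys scope any query to a single object within a single session. For example, a quick sanity check (a sketch; the placeholders are hypothetical, substitute your own identifiers):

mysql> select count(*) as 'Samples collected' from runtime_app_VM_mrhines where expid = '<YOUR_EXPID>' and uuid = '<YOUR_UUID>';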

Example of a complex timeseries query:

First, let's create an application, such as netperf. Start the CloudBench CLI:
./cb
The experiment identifier is EXP-06-19-2020-03-18-35-AM-UTC

Take note of the experiment identifier, which can also be retrieved from the command line or API like this:

cbtool$ ./cb cldshow time | grep experiment_id
|time                             |experiment_id                  |EXP-06-19-2020-03-18-35-AM-UTC
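
If you've lost track of an older experiment identifier, you can also recover it from the database itself, since expid is a regular column (a sketch):

mysql> select distinct expid from runtime_app_VM_mrhines;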

Now, make the application:

cbtool> aiattach netperf
AI object 0C53A2F0-7931-507F-8734-3DAF37F72B8D (named "ai_2") successfully attached to this experiment.
# After completion, go the UI and see that metrics are being reported.
# Now, lookup the constituent VMs that belong to this AI:
cbtool> vmlist
The following ATTACHED VMs are in this experiment :
|name          |role                |size        |cloud_ip        |host_name                  |vmc_pool      |netname    |ai      |aidrs      |uuid
|vm_2          |netclient           |2gb         |10.108.0.2      |cb-mrhines-MYCLOUD-vm2-netclient-ai-2|SUT           |private    |ai_2    |none       |EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C
|vm_3          |netserver           |2gb         |10.108.0.3      |cb-mrhines-MYCLOUD-vm3-netserver-ai-2|SUT           |private    |ai_2    |none       |9FA821BB-E285-5BC2-98EC-845CF101F702

Here, the "leader" VM (e.g. netclient) is the the VM object where the timeseries data for the netperf benchmark will be reported from, represented by UUID EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C. Now, let's lookup the application metrics for that UUID: EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C

First, let's see what keys are available to us:

mysql> select JSON_KEYS(document) as 'Keys available' from runtime_app_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C' limit 1;

| Keys available                                                                                                                                                                                                            

| ["time", "uuid", "expid", "time_h", "app_errors", "app_load_id", "time_cbtool", "app_bandwidth", "time_cbtool_h", "app_load_level", "app_load_profile", "app_load_duration", "app_quiescent_time", "app_completion_time"] 
1 row in set (0.00 sec)

NOTE: The list above changes dramatically for different application benchmarks.

OK, let's choose bandwidth from netperf:

mysql> select STR_TO_DATE(document->>'$.time_h', '%m/%d/%Y %r UTC') as date, document->>'$.app_bandwidth.val' as 'Throughput', document->>'$.app_bandwidth.units' as 'Unit' from runtime_app_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C';

+---------------------+------------+------+
| date                | Throughput | Unit |
+---------------------+------------+------+
| 2020-06-20 17:21:31 | 1705.87    | Mbps |
| 2020-06-20 17:23:19 | 1490.76    | Mbps |
| 2020-06-20 17:25:04 | 1730.20    | Mbps |
| 2020-06-20 17:26:47 | 1564.05    | Mbps |

... snip ...

And those are the last few timeseries samples reported by the application.
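
From here, ordinary SQL aggregates apply. For example, a summary of the whole run (a sketch; the CAST forces numeric rather than string semantics on the JSON value):

mysql> select count(*) as 'Samples', avg(CAST(document->>'$.app_bandwidth.val' AS DECIMAL(10,2))) as 'Avg Mbps', max(CAST(document->>'$.app_bandwidth.val' AS DECIMAL(10,2))) as 'Max Mbps' from runtime_app_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C';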

Now, let's take a look at guest operating system metrics and see what keys are available to us:

mysql> select JSON_KEYS(document) as 'Keys available' from runtime_os_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C' limit 1;
| Keys available

| ["time", "uuid", "expid", "time_h", "cpu_num", "cpu_wio", "pkts_in", "boottime", "bytes_in", "cpu_idle", "cpu_intr", "cpu_nice", "cpu_user", "load_one", "mem_free", "pkts_out", "proc_run", "bytes_out", "cpu_aidle", "cpu_sintr", "cpu_speed", "cpu_steal", "disk_free", "load_five", "mem_dirty", "mem_total", "swap_free", "cpu_system", "disk_total", "ds_KB_read", "mem_cached", "mem_mapped", "mem_shared", "proc_total", "swap_total", "tcp_closed", "tcp_listen", "ds_KB_write", "ds_ios_read", "mem_buffers", "tcp_closing", "tcp_lastack", "tcp_synrecv", "tcp_synsent", "tcp_synwait", "tcp_unknown", "ds_in_flight", "ds_ios_write", "load_fifteen", "swap_KB_read", "tcp_finwait1", "tcp_finwait2", "tcp_timewait", "latest_update", "mem_anonpages", "mem_writeback", "part_max_used", "swap_KB_write", "swap_ios_read", "tcp_closewait", "swap_in_flight", "swap_ios_write", "tcp_established", "ds_procs_blocked", "ds_procs_running", "swap_procs_blocked", "swap_procs_running", "mem_hardware_corrupted"] |

There is a lot to choose from. Let's pull out CPU idle time while netperf is running:

mysql> select STR_TO_DATE(document->>'$.time_h', '%m/%d/%Y %r UTC') as date, document->>'$.cpu_idle.val' as 'CPU Idle time', document->>'$.cpu_idle.units' as 'Unit' from runtime_os_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C' order by date asc limit 20;
+---------------------+---------------+------+
| date                | CPU Idle time | Unit |
+---------------------+---------------+------+
| 2020-06-20 17:20:13 | 88.5          | %    |
| 2020-06-20 17:20:24 | 88.5          | %    |
| 2020-06-20 17:20:35 | 88.5          | %    |
| 2020-06-20 17:20:51 | 88.5          | %    |
| 2020-06-20 17:21:09 | 88.5          | %    |
| 2020-06-20 17:21:28 | 88.5          | %    |
| 2020-06-20 17:21:42 | 90.2          | %    |
| 2020-06-20 17:21:57 | 90.2          | %    |
| 2020-06-20 17:22:16 | 90.2          | %    |
| 2020-06-20 17:22:29 | 90.2          | %    |
| 2020-06-20 17:22:48 | 91.7          | %    |
| 2020-06-20 17:22:59 | 91.7          | %    |
| 2020-06-20 17:23:17 | 91.7          | %    |
| 2020-06-20 17:23:30 | 91.7          | %    |
| 2020-06-20 17:23:43 | 91.7          | %    |
| 2020-06-20 17:24:01 | 91.7          | %    |
| 2020-06-20 17:24:16 | 91.7          | %    |
| 2020-06-20 17:24:31 | 91.7          | %    |
| 2020-06-20 17:24:47 | 91.7          | %    |
| 2020-06-20 17:24:59 | 91.7          | %    |
+---------------------+---------------+------+
20 rows in set (0.00 sec)

As you can see above, the metric's value is refreshed approximately every minute. This is currently hard-coded. I actually don't know why it's hard-coded, since the metric itself is reported every 15 seconds. We will need to get that fixed.

Now that you have timestamps and metric key names, you can use standard SQL techniques to cross-reference metrics between tables.
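
For example, here is one way to line up the two tables (a sketch: the app and OS samples arrive at different rates, so this truncates both timestamps to the minute and joins on that; expect approximate alignment and occasional duplicate rows):

mysql> select
    DATE_FORMAT(STR_TO_DATE(a.document->>'$.time_h', '%m/%d/%Y %r UTC'), '%Y-%m-%d %H:%i') as minute,
    a.document->>'$.app_bandwidth.val' as 'Throughput (Mbps)',
    o.document->>'$.cpu_idle.val' as 'CPU Idle (%)'
    from runtime_app_VM_mrhines a
    join runtime_os_VM_mrhines o
      on o.expid = a.expid and o.uuid = a.uuid
     and DATE_FORMAT(STR_TO_DATE(o.document->>'$.time_h', '%m/%d/%Y %r UTC'), '%Y-%m-%d %H:%i')
       = DATE_FORMAT(STR_TO_DATE(a.document->>'$.time_h', '%m/%d/%Y %r UTC'), '%Y-%m-%d %H:%i')
    where a.expid = 'EXP-06-19-2020-03-18-35-AM-UTC'
      and a.uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C';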

Querying API (management) metrics:

You can also query information about interactions with the cloud itself.

Here is the list of available VM provisioning metric keys:

mysql> select JSON_KEYS(document) as 'Keys available' from management_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC' and uuid = 'EC0FE3FB-E43A-51A6-B0C4-8251547F2B6C' limit 1;

| ["ai", "_id", "name", "role", "size", "type", "uuid", "abort", "aidrs", "expid", "model", "state", "tenant", "ai_name", "netname", "pattern", "cloud_ip", "comments", "imageid1", "obj_type", "vmc_name", "host_name", "aidrs_name", "cloud_name", "vmc_cloud_ip", "cloud_hostname", "last_known_state", "utc_offset_delta", "mgt_005_file_transfer", "application_start_on_vm", "mgt_004_network_acessible", "mgt_007_application_start", "instance_preparation_on_vm", "dostaging_005_get_size_time", "mgt_006_instance_preparation", "dostaging_004_get_sshkey_time", "dostaging_003_get_imageid_time", "dostaging_001_authenticate_time", "mgt_002_provisioning_request_sent", "dostaging_007_instance_active_time", "dostaging_006_instance_creation_time", "dostaging_008_instance_reachable_time", "mgt_003_provisioning_request_completed", "mgt_001_provisioning_request_originated", "dostaging_002_check_existing_instance_time"] |

Let's extract mgt_004_network_acessible ("acessible" is a typo, but it's been that way for many years, so we cannot change it =):

mysql> select FROM_UNIXTIME(document->>'$.mgt_001_provisioning_request_originated') as 'Time VM was created', (document->>'$.mgt_003_provisioning_request_completed' + document->>'$.mgt_004_network_acessible') as 'Time to SSH Ready' from management_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC';
+----------------------------+-------------------+
| Time VM was created        | Time to SSH Ready |
+----------------------------+-------------------+
| 2020-06-20 12:11:10.000000 |                92 |
| 2020-06-20 12:11:10.000000 |                92 |
+----------------------------+-------------------+

There are, of course, many other metrics above, such as when the VM was removed, how long the application took to start, and so forth.
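
For instance, how long each VM took before its application started, using the mgt_007_application_start key from the list above (a sketch; presumably seconds, following the pattern of the other mgt_ durations):

mysql> select document->>'$.name' as 'VM', document->>'$.mgt_007_application_start' as 'Application start' from management_VM_mrhines where expid = 'EXP-06-19-2020-03-18-35-AM-UTC';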

Good luck!
