Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

records.config to records.yaml #9264

Merged
merged 5 commits into from
Feb 2, 2023
Merged

Conversation

brbzull0
Copy link
Contributor

@brbzull0 brbzull0 commented Dec 20, 2022

Description

This PR introduces the change from a records style configuration structure to a YAML structure(records.yaml).

There are a few commits in this PR which are separated so we can split this into smaller PRs needed, also to be able to read the change in small chunks.

A few things that needs to be highlighted

New YAML structure

We are introducing a root node ts for the records.yaml, so all existing records will belong to the ts node.
From the current records.config structure we have dropped the prefix proxy.config and proxy.local and use the rest to build a YAML structure.
The logic is basically to walk down the record name separated by the dots and build a new YAML map from each name(separated by dots), so for instance, the following records:

CONFIG proxy.config.exec_thread.autoconfig.scale FLOAT 1.0
CONFIG proxy.config.exec_thread.limit INT 2
CONFIG proxy.config.accept_threads INT 1
CONFIG proxy.config.task_threads INT 2
CONFIG proxy.config.cache.threads_per_disk INT 8
CONFIG proxy.config.exec_thread.affinity INT 1
CONFIG proxy.config.diags.debug.enabled INT 0
CONFIG proxy.config.diags.debug.tags STRING http|dns

We first drop the proxy.config. and walk down each field, so we will end up with the following:

ts:
  accept_threads: 1
  cache:
    threads_per_disk: 8
  diags:
    debug:
      enabled: 0
      tags: http|dns
  exec_thread:
    affinity: 1
    autoconfig:
      scale: 1.0
    limit: 2
  task_threads: 2

There were a few things that needed to be done on the names and types that I will be describing next.

Records renaming

There are few changes that I had to perform in order to be able to keep the similar structure used in the records.config into a YAML based configuration.

An example of this would be a record key holding a value which also contains children fields:

proxy.config.exec_thread.autoconfig INT 1
proxy.config.exec_thread.autoconfig.scale FLOAT 1.0 << children of autoconfig

As this cannot be done in YAML we need to rename proxy.config.exec_thread.autoconfig to proxy.config.exec_thread.autoconfig.enabled so we no longer have a map with values other than some children fields. So in YAML we can go from something like this:

proxy.config.exec_thread.autoconfig INT 1
proxy.config.exec_thread.autoconfig.scale FLOAT 1.0

to

exec_thread:
    autoconfig:
      enabled: 1 << new field.
      scale: 1.0

All this record renaming will be reflected in the ATS core as well.

NOTE: The convert to yaml script will work away all this.

Naming convention

All this renamed records are subject to the following reasoning:

If a value is meant to be used as a toggle on/off, I’ve used the nameenabled.
If a value is meant to be used a an enumeration, I’ve used the name value

Possible issue with this is that if we scale the field meaning and we need to move from a toggle to an enumeration then this logic will make no sense.
Should we just use one or the other? Current diags.debug.enabled is an example of an enable field holding more than 0,1 value.

records remaning

The following is the list records that needs to be changed:

proxy.config.output.logfile -> proxy.config.output.logfile.name
proxy.config.exec_thread.autoconfig -> proxy.config.exec_thread.autoconfig.enabled
proxy.config.hostdb -> proxy.config.hostdb.enabled
proxy.config.tunnel.prewarm -> proxy.config.tunnel.prewarm.enabled
proxy.config.ssl.origin_session_cache -> proxy.config.ssl.origin_session_cache.enabled
proxy.config.ssl.session_cache -> proxy.config.ssl.session_cache.value
proxy.config.ssl.TLSv1_3 -> proxy.config.ssl.TLSv1_3.enabled
proxy.config.ssl.client.TLSv1_3 -> proxy.config.ssl.client.TLSv1_3.enabled

Record types

Currently the way for ATS to know the record type is by specifying it in the config file (proxy.config.exec_thread.autoconfig INT 1) and also inside RecordsConfig.cc (not only the type but also name, type, default value, validation regex, etc). This internally lets us validate the config record set in the records.config among the one registered internally; this is done to avoid mismatched types, etc.

In YAML you do not need to explicitly set the type as some of the types can be deduced, using the YAML-CPP library we are not able to deduce this type unless we use tags for a field with the corresponding type (!!int, !!float, etc).

NOTE: Is important to note that for core registered records we do not need to set the type as we have them already in the RecordsConfig.cc array.

Non core records.

Now the problem arises when a plugin register a new field which is not registered in the core, the legacy implementation would read the record type from the config file and then once the record is registered by a plugin it will be properly registered inside ATS( * mismatched type gets the current value reset to the default one*)

As said before in ATS when using YAML without explicit saying the type we have no way to know the type till the registration happens(after reading the config). We need to overcome this and to me there seems to be two options here:

  1. Expect non core records to set the type (!!int, !!float, etc).
ts:
   plugin_x:
      my_field_1: !!int '1'
      my_field_2: !!float '1.2'
      my_field_3: 'my string'
  1. Non core records will be all treated as strings and once registered by a plugin we set the type right(this is already done) but we do NOT reset the value to default(as done currently), again this is a change that may cause issues if anyone is expecting the value to be reset.

This PR implements option 1.

NOTE: The convert tool provided to migrate from records.config to records.yaml has an option to include types in the converted file. All existing records.config files that will be migrated with this tool won't need to set the type, but new fields added to the already migrated file would.

Just to be 100% clear, there is no need to specify the type on all the fields, just the ones that aren’t present in the core.

Why are there no tags in any of the current yaml config files handled by ATS?

ATS is parsing all the YAML files by hand, this means that each field is known by the decoder so it deals with the type properly. For records as we keep the internal librecords structure we do not deal with any particular field as this is already done by the librecords.

NOTE: Further down the line a schema file should be introduced and replace the RecordsConfig.cc ; this may not help with non-core records, unless it’s changed by the user.

Code changes

There is no change in the current librecords implementation, I have just added a bunch of functions to read the yaml (instead of parsing the legacy text file) and then register the records(as currently done). Some of these functions are the base for a future TS API that can be used by conf_remap.(proposal in progress) and also they will be used by the JSON-RPC to be able to inject records.yaml.

As said, we keep the internal record names without a change, so querying and setting records keeps the same.

How do we keep the record name from the YAML structure?

As described above this just just build down the YAML structure base on the record name, so to build up the record name from a YAML field we just need to walk it down and build the name, so the records.yaml parser will flatten the following:

ts:
  exec_thread:
    autoconfig:
      scale: 1.0

to proxy.config.exec_thread.autoconfig.scale for instance. So this code is just a thin layer on top of the existing librecords API which still deals with records names as before.

Few notes:

  • Traffic_ctl still works with record names, all existing API remains unchanged. Further down the line I will be adding a new JSON-RPC API to push YAML config. The existing code will work fine with this.
  • All existing internal TS API remains unchanged.

Important

This implementation will only accept a records.yaml config file, if not found the defaults will be used. If a record.config file is found in the config directory a warning will be shown. This is done to avoid someone having the legacy file thinking it is the current one.

[Dec 20 18:42:12.075] traffic_server WARNING: Found a legacy config file. /home/to/ats/config/records.config

Unit test

Beside the records that were renamed I had changed the trafficserver.ext to keep the same interface for records update:

       ts.Disk.records_config.update({
            'proxy.config.diags.logfile.filename': 'stdout',
            'proxy.config.error.logfile.filename': 'stdout',
        })

The above remains the same.

It usually takes a dict with the name and the value, this internally will be parsed and converted to a YAML file and then injected into the record.yaml file for the particular unit test.

There is also the possibility to just use yaml instead of a dict and update just the nodes passed in the YAML string.

ts.Disk.records_config.update(
    '''
    diags:
      debug:
        enabled: 1
        tags: rpc|rec
    '''
 )

For the unit tests all this is transparent and they can either keep using the records style of just the YAML, I would recommend for new tests just use YAML straight away.

I have introduced a new file type ats:config:yaml which can be used to build any YAML style file.

Documentation

This area is still WIP, but I am introducing a new records.yaml RST (in replace of the old one) which keeps the record name in the documentation but also includes the YAML representation of each record.

[ ] I still need to complete a guide to migrate from one to another.

conf_remap plugin.

There will be a proposal API for plugins like conf_remap that needs to parse YAML files and work away from the record name as we do in core, this will be added after.

@brbzull0 brbzull0 self-assigned this Dec 20, 2022
@brbzull0 brbzull0 force-pushed the conf_rec_to_yaml branch 3 times, most recently from 52c6cc1 to 64c98d8 Compare December 20, 2022 16:36
Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should probably gather feedback for setting names and the detailed format (I'm fine with YAML but see my inline comment) from users (or people who don't review this PR).

'proxy.config.hostdb': 'proxy.config.hostdb.enabled',
'proxy.config.tunnel.prewarm': 'proxy.config.tunnel.prewarm.enabled',
'proxy.config.ssl.origin_session_cache': 'proxy.config.ssl.origin_session_cache.enabled',
'proxy.config.ssl.session_cache': 'proxy.config.ssl.session_cache.value',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why this one uses "value" instead of "enabled".

Since you're applying some naming rules, it would be nice if you could provide a naming convention or guideline for setting names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some details on the PR description. Thanks.

@@ -76,7 +76,7 @@ static const RecordElement RecordsConfig[] =
,
{RECT_CONFIG, "proxy.config.cache.max_disk_errors", RECD_INT, "5", RECU_DYNAMIC, RR_NULL, RECC_NULL, nullptr, RECA_NULL}
,
{RECT_CONFIG, "proxy.config.output.logfile", RECD_STRING, "traffic.out", RECU_RESTART_TM, RR_REQUIRED, RECC_NULL, nullptr,
{RECT_CONFIG, "proxy.config.output.logfile.name", RECD_STRING, "traffic.out", RECU_RESTART_TM, RR_REQUIRED, RECC_NULL, nullptr,
RECA_NULL}
,
{RECT_CONFIG, "proxy.config.output.logfile_perm", RECD_STRING, "rw-r--r--", RECU_RESTART_TS, RR_NULL, RECC_NULL, nullptr, RECA_NULL}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change setting names, I'd like to rename this to roxy.config.output.logfile.permission.

##############################################################################

ts:
accept_threads: !!int '1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, !!int doesn't look great to me, although I understand what it does... Do we do this for other ATS config files? I've never seen other software that requires explicit type tags.

Copy link
Contributor Author

@brbzull0 brbzull0 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this here, there is only 1 scenario where this would be needed and this is when you register configuration inside a plugin which core has no idea about the type at loading time(before the plugin gets to say RecRegisterConfigXX). I will add a description on this PR about this in the next hours.

@brbzull0
Copy link
Contributor Author

brbzull0 commented Dec 20, 2022

I feel like we should probably gather feedback for setting names and the detailed format (I'm fine with YAML but see my inline comment) from users (or people who don't review this PR).

Hello!!

yes definitelly, that's the plan.
I'll clarify all this once I complete the description with the reasoning behind the types/names,etc and of course the idea is to gather feedback, I'll be also sending an email around with this renaming as I some may consider them breaking changes.

names will be proposed and discussed, I won't make this an official PR till all agreed.
Thanks.

@brbzull0 brbzull0 force-pushed the conf_rec_to_yaml branch 2 times, most recently from fbbf65d to 6f68814 Compare December 21, 2022 14:45
@brbzull0 brbzull0 changed the title Draft - records.yaml records.config to records.yaml Dec 21, 2022
@brbzull0 brbzull0 force-pushed the conf_rec_to_yaml branch 3 times, most recently from f761c3e to ed84a45 Compare December 22, 2022 13:20
@brbzull0 brbzull0 added the YAML label Dec 22, 2022
@apache apache deleted a comment from brbzull0 Dec 22, 2022
@brbzull0
Copy link
Contributor Author

[approve ci debian]

@brbzull0 brbzull0 marked this pull request as ready for review January 16, 2023 14:36
@zwoop zwoop added this to the 10.0.0 milestone Jan 17, 2023
maskit
maskit previously approved these changes Jan 26, 2023
Copy link
Member

@maskit maskit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should merge this sooner rather than later, if we all agree to proceed this way at a minimum. Otherwise we may introduce new setting names that don't work with this, and then it requires updating this PR.

We can make setting name adjustments later. We may have some hiccups but we can find and fix them during 10.0 development. I'm approving this because I don't see any critical issue.

##############################################################################
ram_cache:
size: -1
ram_cache_cutoff: 4194304
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ram_cache:
  size: -1
  cutoff: 4194304

Or maybe I should do separate PRs for these minor changes.

# We want to make the type right when inserting the element into the dict.
if type == 'FLOAT':
return float(value)
elif type == 'INT':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be better to do a conversion and see if that works? Or convert to int and back and see if there's a difference?

// CfgNode(YAML::Node const &n, YAML::Node const &v) : node{n}, value_node{v} {}

/// @brief Construct a configuration node using just the @c YAML::Node and the @c record_name
/// @param n The parsed @c YAML::node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There aren't @param for all of the parameters. What's @a v?

void
append_field_name() const
{
if (!_legacy.record_name.empty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!_legacy.record_name.empty()) {
  _legacy.record_name.append(".");
}
_legacy.record_name.append(node.as<std::tsring>());

@@ -1530,3 +1530,15 @@ RecordsConfigIterate(RecordElementCallback callback, void *data)
callback(&RecordsConfig[i], data);
}
}

const RecordElement *
GetRecordElementByName(std::string_view name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this called? Is there a possible performance issue?

@@ -53,11 +53,11 @@ handle_file_reload(std::string const &fileName, std::string const &configName)
ts::Errata ret;
// TODO: make sure records holds the name after change, if not we should change it.
if (fileName == ts::filename::RECORDS) {
if (RecReadConfigFile() == REC_ERR_OKAY) {
if (auto zret = RecReadYamlConfigFile(); zret) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is zret captured, if it's not used?

std::pair<RecDataT, std::string> try_deduce_type(YAML::Node const &node);

// Helper class to make the code less verbose when lock is needed.
struct scoped_cond_lock {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah. Better to switch to std::mutex with std::shared_lock and std::unique_lock.

{
[[maybe_unused]] detail::scoped_cond_lock cond_lock(lock);

swoc::Errata errata;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can dump this and use return {}; as the final return statement.

namespace detail
{
std::pair<RecDataT, std::string>
try_deduce_type(YAML::Node const &node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a Lexicon would work here.

Copy link
Member

@SolidWallOfCode SolidWallOfCode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to land this and fix the minor issues later.

yaml.

To avoid issues with fields which the main root key holds value and also children fields I had to rename them in order to make it compatible with YAML.

So fields like this:

proxy.config.exec_thread.autoconfig INT 1 << can’t have value and children fields
proxy.config.exec_thread.autoconfig.scale FLOAT 1.0

So this now looks like this:

proxy.config.exec_thread.autoconfig.enabled INT 1

So when migrated to YAML it will looks like this:

exec_thread:
    autoconfig:
      enabled: 1 << new field.
      scale: 1.0
format instead.
- This change keeps the original internals and only adds a new layer for parsing YAML instead of records.
- YAML nodes are converted to record names (dot(.) separated) and use the same internal features.
- traffic_ctl still works with record names.
- All the internal API to query and set records remains unchanged.
@brbzull0 brbzull0 changed the base branch from 10-Dev to master February 1, 2023 19:24
@brbzull0 brbzull0 dismissed stale reviews from SolidWallOfCode and maskit February 1, 2023 19:24

The base branch was changed.

@brbzull0
Copy link
Contributor Author

brbzull0 commented Feb 2, 2023

[approve ci autest]

1 similar comment
@brbzull0
Copy link
Contributor Author

brbzull0 commented Feb 2, 2023

[approve ci autest]

@brbzull0 brbzull0 merged commit aebef45 into apache:master Feb 2, 2023
moleksy pushed a commit to moleksy/trafficserver that referenced this pull request Feb 8, 2023
This commit contains:

* records.yaml - Rename records that will have issues when migrated to
yaml.

To avoid issues with fields which the main root key holds value and also children fields I had to rename them in order to make it compatible with YAML.

So fields like this:

proxy.config.exec_thread.autoconfig INT 1 << can’t have value and children fields
proxy.config.exec_thread.autoconfig.scale FLOAT 1.0

So this now looks like this:

proxy.config.exec_thread.autoconfig.enabled INT 1

So when migrated to YAML it will looks like this:

exec_thread:
    autoconfig:
      enabled: 1 << new field.
      scale: 1.0

* records.yaml: replace records.config configuration file and use YAML
format instead.
- This change keeps the original internals and only adds a new layer for parsing YAML instead of records.
- YAML nodes are converted to record names (dot(.) separated) and use the same internal features.
- traffic_ctl still works with record names.
- All the internal API to query and set records remains unchanged.

* record.yaml - Add basic unit test for some specific cases.
* Add test for fields with multipliers
* Docs update.
cmcfarlen pushed a commit to cmcfarlen/trafficserver that referenced this pull request Jun 3, 2024
* asf/master: (623 commits)
  records.config to records.yaml (apache#9264)
  Updates the release roadmap, adjusting for delays (apache#9360)
  Upgrades master branch to use clang-format v15.0.7 (apache#9355)
  Disable merging on GitHub (apache#9354)
  Clang-format 15.0.7 is finicky, and does not like these old school array inits (apache#9356)
  Enable merging for 10-Dev merge (apache#9353)
  Fix an error on SSL config reload (plus some cleanup). (apache#9334)
  Cleanup of legacy, makes newer clang-format crash right now (apache#9350)
  Update the roadmap / branch management doc page (apache#9340)
  Proxy Protocol out fixes (apache#9341)
  Memory leaks with storing configuration filenames (apache#9324)
  s3_auth autest: convert from gold file to file contains (apache#9337)
  Make 204 cacheable again (apache#9333)
  Add param to forward headers from the auth server to the origin (apache#9271)
  s3_auth: Fix assertion failure of TSActionCancel (apache#9329)
  Added http connect Autest with proxy verifier (apache#9315)
  s3_auth: Schedule reloading config event on TASK thread (apache#9328)
  Register ET_UDP thread type even if no UDP threads are requested (apache#9314)
  Don't send response body on status 204 No Content (apache#9330)
  Limit the serching range of static table by the first letter of header name (apache#9298)
  ...
@bryancall bryancall mentioned this pull request Aug 14, 2024
91 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants