Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor semantic conventions #1977

Closed
wants to merge 11 commits into from

Conversation

tedsuo
Copy link
Contributor

@tedsuo tedsuo commented Sep 29, 2021

Based on discussions within the Semantic Conventions SIG, I am proposing a refactor to how the semantic conventions are laid out within the spec.

Proposed changes:

  • Centralize the location of all semantic conventions, rather than split them across many subsections of the spec.
  • Create one folder per topic (HTTP, faas, messaging, etc). All relevant tracing, metric, and resource conventions are combined into these topics.
  • Move all implementation-specific conventions (HBase, AWS lambda, etc) to an /implementation folder under each topic.

The new layout can viewed here, on the branch in GitHub.

Example layout:

-- semantic_conventions/
   -- database/
        database.md
        sql.md
        --implementations/
            cassandra.md
            dynamodb.md
            hbase.md
            mongodb.md
            mysql.md
            redis.md

Goals:
This new layout should make it easier to browse and comprehend all of the available semantic conventions. This should also make it easier to assign domain experts as codeowners for each convention. It also makes it easier to see how most implementations are underspecified. This will help the Semantic Conventions / Instrumentation SIG move forwards.

This PR only attempts to take separate and recombine the concepts currently described in the spec, while changing the wording as little as possible. It does not attempt to adjust any concepts or clarify existing any descriptions. Once the refactor is complete, approvers should be assigned to each convention so that we can begin processing improvements to the spec.

This PR is currently a draft, but it is ready for initial feedback.

Completed so far:

  • Refactored trace semantics

TODO:

  • Refactor metric semantics
  • Refactor resource semantics

@tedsuo tedsuo added the area:semantic-conventions Related to semantic conventions label Sep 29, 2021
Copy link
Contributor

@pyohannes pyohannes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a great start!

@@ -9,22 +9,9 @@ This document defines how to describe remote procedure calls

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to name the directory rpc and not rpc.md.

@@ -0,0 +1,20 @@
# Cassandra

**Status**: [Experimental](../../../document-status.md)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like having implementations in separate documents. This will allow us to have experimental documents for implementations, while the overall conventions could be stable.

@lmolkova
Copy link
Contributor

lmolkova commented Sep 29, 2021

Looks great!

What about yaml files? It looks like many changes should be done through them and the instrumentation SIG expertise area should cover both yaml and md files

@lmolkova
Copy link
Contributor

lmolkova commented Sep 29, 2021

I suggest keeping yaml and md together.
Since we probably want to cover metrics later on, would it make sense to keep current conventions in the trace folder?

-- semantic_conventions/
   -- database/
       -- trace/
           database.yml
           database.md
           sql.md
           --implementations/
                  cassandra.md
                  ...

@tedsuo
Copy link
Contributor Author

tedsuo commented Sep 29, 2021

@lmolkova I'm not familiar with how the yaml files work, I thought they were generated from the markdown.

I agree we need to make them work properly before committing this. But if we want to change their behavior/location, perhaps that can be a follow up PR? I don't know who currently consumes them.

@lmolkova
Copy link
Contributor

@lmolkova I'm not familiar with how the yaml files work, I thought they were generated from the markdown.

I agree we need to make them work properly before committing this. But if we want to change their behavior/location, perhaps that can be a follow up PR? I don't know who currently consumes them.

md files (<semconv> blocks) are generated from yaml files with this tool https://github.com/open-telemetry/build-tools/tree/main/semantic-conventions#usage.

Anyway, I'm fine with a follow-up PR.

@Oberon00
Copy link
Member

Oberon00 commented Sep 29, 2021

I have heard the misconception that the YAML is generated from the markdown for the second time from different persons now. But I wonder how this occurs. Isn't it be far more usual to generate text from structured data than the other way round? Is there something we could change to make it more obvious how this works, maybe in the syntax of the marker comments, or elsewhere?

There is documentation about these also in the spec BTW: https://github.com/open-telemetry/opentelemetry-specification/tree/main/semantic_conventions

@tedsuo
Copy link
Contributor Author

tedsuo commented Sep 29, 2021

@lmolkova for metrics vs trace, I want to see if we can combine them rather than keep them separate. As an instrumentation author, you will need to understand the whole model - configuration, span structure, and how metrics relate to these.

Some semantics, like faas, have also have resource definitions. I want to see if we can present this as a single model.

My hope is that by moving out all of the implementation-specific details, there will now be room to keep all of this info in a single file. If not, I'd like to try using suffixes like -trace.md and -metric.md to keep the all of the details next to each other in the same folder.

@jamesmoessis
Copy link
Contributor

Amazing! I really like the idea of bringing all semantic convention related things together, and of course splitting the implementation specific conventions out. This will help the semantic convention towards stability.

The YAML is probably the first source of truth that needs changing to make this structure a reality, as others have mentioned. I also imagine that a fair amount of re-work would be required in build-tools to generate this structure. I am very happy to help out with these pieces of work.

@kenfinnigan
Copy link
Member

I like the layout and its separation of implementations, and how all signal conventions are in one place.

However, as @lmolkova mentioned, this should also be done for the YAML files otherwise it will be very confusing to find the right file to change


## Requirements

`db.system` MUST be set to `jdmc`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JDMC or JDBC?

Also, so I understand, "database" implies database-clients and database-servers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was just clients, and the span kind is CLIENT.

@@ -0,0 +1,70 @@
# gRPC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the directory mean to be rpc or rpc.md?

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this shift!


**Status**: [Experimental](../document-status.md)

Spans and metrics often represent well-known protocols such as HTTP requests, database calls, or message queues.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Telemetry signals? Otherwise you will have to list Logs (and other stuff) in the future.

@carlosalberto
Copy link
Contributor

Great work, thanks for working on this!

@joaopgrassi
Copy link
Member

Is there something we could change to make it more obvious how this works, maybe in the syntax of the marker comments, or elsewhere?

@Oberon00 I think if the suggestion structure in this PR gets accepted/merged, the top-level README there should probably point out that the .md files here are "auto-generated" by the yaml files. I think maybe that will help already? Another thing would be to put also the yaml files together in the same folder, but I don't know if that's even possible.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@jamesmoessis
Copy link
Contributor

Hey folks, I've raised a PR to refactor the YAML so it follows the same structure as this. This is a pre-requisite of refactoring the markdown like this PR shows.

#2019

@github-actions github-actions bot removed the Stale label Oct 15, 2021
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 23, 2021
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@tigrannajaryan
Copy link
Member

@tedsuo any chance we can revive this? I think this is very important.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants