Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Monitoring] Support for unlinked deployments #28278

Merged

Conversation

chrisronline
Copy link
Contributor

@chrisronline chrisronline commented Jan 8, 2019

Resolves #27595

Note

@elastic/stack-monitoring I'm hoping to get a solid yes/no on the direction of this before writing tests and getting reviewers so please chime in on thoughts.

Summary

This PR adds support for viewing monitoring data for stack products that aren't directly related to an elasticsearch cluster, or more specifically, monitoring documents that do not contain a cluster_uuid. See the ticket for more historical information about this.

We've decided to call this situation an unlinked deployment meaning it can contain more than one stack product that are theoretically related to each other in purpose. Therefore, we're using a cluster_uuid of __unlinked_deployment__ which should never be actually used by a cluster in a production environment.

The rest of the monitoring UI should work appropriately. It should only show stack products which contain no cluster_uuid. We only expect this to happen for beats and logstash (and possibly apm?) so if a kibana or ES monitoring document is missing a cluster_uuid, we will not show it. (This can be changed if there is a strong reason to support it)

Testing

  1. Run a beat and/or logstash and configure monitoring appropriately. Beats monitoring and Logstash monitoring
  2. Write some script that will automatically remove the cluster_uuid from the monitoring documents, like this one I use
  3. Notice the multiple clusters now available on the monitoring home page

Screenshots

screen shot 2019-01-08 at 2 23 42 pm

screen shot 2019-01-08 at 2 24 11 pm

screen shot 2019-01-08 at 2 24 22 pm

TODO

  • Add tests
  • Support localization of Unlinked Deployment

@chrisronline chrisronline added the Team:Monitoring Stack Monitoring team label Jan 8, 2019
@chrisronline chrisronline self-assigned this Jan 8, 2019
@ycombinator
Copy link
Contributor

@chrisronline I'm +1 on the direction so my feedback here is more of the nitpicky kind (which I know you love 😄).

  • It's a bit weird to see "deployments" mixed with "clusters". I know we had a discussion on the naming and said it could be it's own PR. Do you intend to get that other PR in before this one? If not, then for consistency sake I'd just call it "unlinked clusters" for now, then change "cluster" -> "deployment" en masse later.

  • In the first screenshot (Cluster Listing page), WDYT about placing an (i) icon next to the "Unlinked Deployment" so users can understand what we mean by that? Alternatively, what about removing that row from the table altogether and having more of a callout like, "hey btw, you also have instances that aren't linked to any cluster/deployment" above or below the table?

@elasticmachine
Copy link
Contributor

💔 Build Failed

Copy link
Member

@pickypg pickypg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Driveby comments, but I am super excited to see this functionality arrive.

{
term: {
cluster_uuid: {
value: ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side note, care should be taken as this is implemented on the sending side given that cluster_uuid can be an array in the future. cluster_uuid: [ "xyz", "" ] would match this filter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting we change this to

{
  cluster_uuid: ['']
}

now? or to know about this for the future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intended it for the future, but maybe we should change this here too.

The term query/filter just needs to match a single term in the field's index to match the document. A document with cluster_uuid: [ "xyz", "" ] would have two terms in the cluster_uuid index pointing back to it: "xyz" and "". The sender should not send "" to avoid this type of scenario.

However, perhaps we should only look for cluster_uuid that do not exist, like your secondary query/filter does quite appropriately. By doing that, we'll never look for "" and therefore we won't encourage that type of data being sent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not sure which cause is happening, or more likely to happen. Is it that the cluster_uuid field is empty, or just missing? @ycombinator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you updated the query to use must_not exists. I'm good with this and will make sure Beats simply doesn't send the cluster_uuid field so this query will work as intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query handles both scenarios right now, an empty cluster_uuid or one that simply does not exist. I'd love to simplify if we can to @pickypg's proposal, but I just want to make sure that will actually work with monitoring works internally with each of our stack products that can be affected (mainly Logstash and Beats).

It seems beats can/will be updated to not send the cluster_uuid (I can create an issue if you haven't) but we should verify that works with Logstash too

Copy link
Contributor

@ycombinator ycombinator Jan 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems beats can/will be updated to not send the cluster_uuid (I can create an issue if you haven't) but we should verify that works with Logstash too

Today its simply not possible for Beats or LS to omit the cluster_uuid, as that field is being added by the production cluster that proxies the monitoring data to the monitoring cluster.

Going forward (with direct shipping to the monitoring cluster), the work of not sending a cluster_uuid for Beats is already being done in elastic/beats#9260 (which is actually what led to this UI issue/PR if you recall 😄). For Logstash as well the work has already started (elastic/beats#9163) but it hasn't advanced as much as Beats yet. In either case, we have issues/PRs already so we're good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


export function Overview(props) {
const isFromUnlinkedDeployment = props.cluster.cluster_uuid === UNLINKED_DEPLOYMENT_CLUSTER_UUID;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic appears to be appropriately reused in a bunch of places. Is the thought here that used a predefined constant is better than undefined because it ends up in the global scope, and we don't want to add logic to the global scope handling?

(I'm asking to better understand your reasoning rather than "this is not right!")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup basically. I didn't want to use an empty string for cluster_uuid and I felt something like undefined or null misrepresented what's going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt something like undefined or null misrepresented what's going on.

But looking at the name, cluster_uuid, I feel like that's exactly what it means though?

I do think one problem with using undefined/null in the short term is that we redirect to the cluster listing page if multiple clusters are detected and it's not supplied / found. I wouldn't want you to change that behavior in this PR, but I do think we should stop redirecting automatically from the cluster listing page, which would then open up using it for that purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any concerns with using this constant as the identifier for this? While I see your side of things, I feel like null or undefined, while technically true for this, make it seem like some sort of error scenario/state where it's not really for the user. I feel pretty good about this name, though we should change it to Unlinked Cluster

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any problem with it. cluster_uuid is a fixed length identifier in ES, so you shouldn't have any unlikely collision opportunity. The constant's name is great, although I agree with the rename until "Deployment" is more widely applied.

x-pack/plugins/monitoring/server/lib/create_query.js Outdated Show resolved Hide resolved
@elasticmachine
Copy link
Contributor

💔 Build Failed

@chrisronline
Copy link
Contributor Author

@ycombinator How does this look?

screen shot 2019-01-09 at 2 44 42 pm

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@ycombinator
Copy link
Contributor

That looks awesome! ❤️

@ycombinator
Copy link
Contributor

Minor question, just noticed it: should that callout be outside the table panel? That way the distinction is even clearer, visually?

@pickypg
Copy link
Member

pickypg commented Jan 9, 2019

should that callout be outside the table panel? That way the distinction is even clearer, visually?

Probably worth getting in touch with @elastic/kibana-design to help with that rather than us debating it. :)

@cchaos
Copy link
Contributor

cchaos commented Jan 9, 2019

Honestly, either way works, depends on context. Though @snide might have a stronger opinion.

The one thing I would say is that I think we try to avoid wordings like "Click here..." I'd just use a link that says "View these instances". But you can also double-check with @gchaps .

@pickypg
Copy link
Member

pickypg commented Jan 9, 2019

I think it looks fine (wording aside), but it seems like something that I wouldn't mind seeing the first time that I visited this page and feel like a secondary feature thereafter.

On subsequent visits, it seems like having a separate columnar listing like the other "clusters" (aka deployments) show the number of instances would be useful. At a certain point though, calculating that accurately (for Beats in particular) may become impractical.

@chrisronline
Copy link
Contributor Author

@cachedout Do you think you'll have cycles tomorrow to run through this and verify the functionality? I just tested it again and it works like we discussed.

@cachedout
Copy link
Contributor

@chrisronline Apologies for the delay. I just finished looking this over. It appears from my perspective that it's doing what we agreed upon.

@@ -78,7 +78,7 @@ export function flagSupportedClusters(req, kbnIndexPattern) {
// if multi cluster
if (clusters.length > 1) {
const basicLicenseCount = clusters.reduce((accumCount, cluster) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could we update the variable name here, since it's no longer just about the license?

Copy link
Contributor

@ycombinator ycombinator Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, further below in this same function, could we update the comments and corresponding serverLog(...) messages as well? I can't comment on those lines because they haven't been changed by this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the header comment on the flagSupportedClusters function should be updated as well.

Copy link
Contributor

@ycombinator ycombinator Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the logic of the "all basic licenses" block a few lines below:

      // if all basic licenses
      if (clusters.length === basicLicenseCount) {
        const kibanaUuid = config.get('server.uuid');
        return await findSupportedBasicLicenseCluster(req, clusters, kbnIndexPattern, kibanaUuid, serverLog);
      }

If I'm following this right, in a scenario where there's an unlinked cluster and a basic cluster (to which the current kibana is attached), only the basic cluster will be supported. But what we want is for the unlinked cluster to be supported as well, right?

Copy link
Contributor

@ycombinator ycombinator Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the previous comment in this thread, I'm not sure the "some non-basic licenses" block works as we want either, if I'm following the logic correctly.

Consider the scenario of 1 unlinked cluster, 1 basic cluster, and 1 trial cluster. clusters.length === 3 and basicLicenseCount === 2. So we end up in the "some non-basic licenses" code block, whose logic is thus:

      // if some non-basic licenses
      serverLog('Found some basic license clusters in monitoring data. Only non-basic will be supported.');
      clusters.forEach(cluster => {
        if (cluster.license && cluster.license.type !== 'basic') {
          cluster.isSupported = true;
        }

I'd expect cluster.license to be falsy for the unlinked cluster, and as a result this cluster's isSupported wouldn't be set to true.

Am I following this right?

Copy link
Contributor

@ycombinator ycombinator Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion: might it be cleaner to do something like this instead?

  1. Find if there is an unlinked cluster in clusters. If so, set it's isSupported flag to true.
  2. Create a new variable, linkedClusters which is clusters minus the unlinked one.
  3. Use the same logic as today, starting with the // if multi clusters line, but on linkedClusters instead of clusters.

Copy link
Contributor

@ycombinator ycombinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisronline It would be good to add some tests to this PR that test a few of the license scenarios, e.g.

  • when there's 1 cluster: an unlinked one ---> should be supported
  • when there's 2 clusters: an unlinked one and a trial one ---> both should be supported
  • when there's 2 clusters: an unlinked one and a basic one ---> both should be supported
  • when there's 3 clusters: an unlinked one, a basic one, and a trial one ---> unliked and trial clusters should be supported but basic one should not be supported

I say this because just reading through the logic of the flagSupportedClusters I think I might've found a few bugs (see my comments there).

Looks like there are already unit tests around the existing (pre-PR) logic here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/monitoring/server/lib/cluster/__tests__/flag_supported_clusters.js

So it's hopefully just a matter of adding to these same tests?

@chrisronline
Copy link
Contributor Author

@ycombinator Amazing feedback (as usual)! I fixed up the source code to make it more clear, then added more tests to ensure it's working as expected

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@chrisronline
Copy link
Contributor Author

I'm +1 for renaming the unlinked cluster/deployment as "stand alone", for the reason @cachedout stated but also because "unlinked" also brings the notion of an ES center back to prominence, something we want to move away from. Naming is hard, however, so I'm happy to have this discussion / change in a follow up PR 😄.

Following up here, do we feel the naming in the UI is the only part of this we should be worried about? or also the route and api urls? Those actually still say deployment

@chrisronline
Copy link
Contributor Author

screen shot 2019-01-24 at 10 39 04 am

Here's how it looks with the name change

@ycombinator
Copy link
Contributor

Following up here, do we feel the naming in the UI is the only part of this we should be worried about? or also the route and api urls? Those actually still say deployment

I think there's two parts to the naming here 😄

  1. We started discussing whether to rename clusters to deployments here: Monitoring UI should support data with no cluster UUID #27595 (comment).

  2. We started discussing whether to rename unlinked to standalone (spelling?) here: [Monitoring] Support for unlinked deployments #28278 (comment)

I'm good with making just the change in (2) in this PR and deferring the change in (1) to a future PR since its a larger change across the Monitoring UI.

* 3. Make a query to the monitored kibana data to find the "supported" cluster
* UUID, which is the cluster associated with *this* Kibana instance.
* 4. Flag the cluster object with an `isSupported` boolean
* 1. Detect if there any unlinked clusters and ignore those for these calculations as they are auto supported
Copy link
Contributor

@ycombinator ycombinator Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, so feel free to ignore:

From what I can tell the code in this flagSupportedClusters function doesn't explicitly set the unlinked cluster's isSupported property to true. I assume this works because you're handling the "supported-ness" logic for the unlinked cluster elsewhere. You might consider setting the flag for the unlinked cluster here, just so all the logic about which clusters should be supported or not is in one place; it might make debugging easier in the future.

@chrisronline
Copy link
Contributor Author

I'm going to change all references of Unlinked Cluster and Unlinked Deployment introduced in this PR to Standalone Cluster. I think that will solve everyone's concerns for now

@ycombinator
Copy link
Contributor

ycombinator commented Jan 24, 2019

Left a nit comment which you can totally ignore. Other than that, the changes from the previous round of feedback look good to me. I'm good with then naming changes you proposed as well for the scope of this PR.

Will do a final review once you've made the naming changes.

@chrisronline
Copy link
Contributor Author

@ycombinator and others. Name changes are in. Final review ready!

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@cachedout
Copy link
Contributor

+1 from me as well. This looks pretty much ready to roll.

Copy link
Contributor

@ycombinator ycombinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Awesome to see this feature land! 🎉

@chrisronline chrisronline merged commit 9f37c1f into elastic:master Jan 25, 2019
@chrisronline chrisronline deleted the monitoring/unlinked_deployment branch January 25, 2019 14:34
chrisronline added a commit to chrisronline/kibana that referenced this pull request Jan 25, 2019
* Unlinked deployment working for beats

* Use better constant

* Show N/A for license

* Rename to Unlinked Cluster

* Use callout to mention unlinked cluster

* PR feedback

* Use fragment

* Speed up the query by using terminate_after

* Handle failures more defensively

* Remove unnecessary msearch

* PR feedback

* PR feedback and a bit of light refactor

* Updated text

* Add api integration tests

* Localize call out

* Update loc pattern

* Fix improper i18n.translate usage

* Revert "Fix improper i18n.translate usage"

This reverts commit 0e2e760.

* Revert "Update loc pattern"

This reverts commit cc99fe8.

* Ensure the unlinked deployment cluster counts as a valid cluster

* Sometimes, you miss the smallest things

* Ensure the unlinked cluster is supported, in that users can click the link and load it

* Update tests

* PR feedback. Simplifying the flag supported code and adding more tests

* Update naming

* Rename to Standalone Cluster

* Remove unnecessary file

* Move logic for setting isSupported to exclusively in flag supported clusters code, update tests
chrisronline added a commit that referenced this pull request Jan 29, 2019
* [Monitoring] Support for unlinked deployments (#28278)

* Unlinked deployment working for beats

* Use better constant

* Show N/A for license

* Rename to Unlinked Cluster

* Use callout to mention unlinked cluster

* PR feedback

* Use fragment

* Speed up the query by using terminate_after

* Handle failures more defensively

* Remove unnecessary msearch

* PR feedback

* PR feedback and a bit of light refactor

* Updated text

* Add api integration tests

* Localize call out

* Update loc pattern

* Fix improper i18n.translate usage

* Revert "Fix improper i18n.translate usage"

This reverts commit 0e2e760.

* Revert "Update loc pattern"

This reverts commit cc99fe8.

* Ensure the unlinked deployment cluster counts as a valid cluster

* Sometimes, you miss the smallest things

* Ensure the unlinked cluster is supported, in that users can click the link and load it

* Update tests

* PR feedback. Simplifying the flag supported code and adding more tests

* Update naming

* Rename to Standalone Cluster

* Remove unnecessary file

* Move logic for setting isSupported to exclusively in flag supported clusters code, update tests

* Fix up the data for these tests
@chrisronline
Copy link
Contributor Author

Backport:

6.x: 97e5be6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants