[APM] Encode spaces when creating ML job #63683

Merged
merged 1 commit into from
Apr 16, 2020
@@ -57,7 +57,8 @@ export class MachineLearningFlyout extends Component<Props, State> {
};

public addErrorToast = () => {
-const core = this.context;
+const { core } = this.context;
+
const { urlParams } = this.props;
const { serviceName } = urlParams;

10 changes: 7 additions & 3 deletions x-pack/legacy/plugins/apm/public/services/rest/ml.ts
@@ -12,7 +12,8 @@ import {
} from '../../../../../../plugins/apm/common/elasticsearch_fieldnames';
import {
getMlJobId,
-getMlPrefix
+getMlPrefix,
+encodeForMlApi
} from '../../../../../../plugins/apm/common/ml_job_constants';
import { callApi } from './callApi';
import { ESFilter } from '../../../../../../plugins/apm/typings/elasticsearch';
@@ -53,13 +54,16 @@ export async function startMLJob({
http: HttpSetup;
}) {
const transactionIndices = await getTransactionIndices(http);
-const groups = ['apm', serviceName.toLowerCase()];
+const groups = [
+'apm',
+encodeForMlApi(serviceName),
+encodeForMlApi(transactionType)
+];
const filter: ESFilter[] = [
{ term: { [SERVICE_NAME]: serviceName } },
{ term: { [PROCESSOR_EVENT]: 'transaction' } },
{ term: { [TRANSACTION_TYPE]: transactionType } }
];
-groups.push(transactionType.toLowerCase());
return callApi<StartedMLJobApiResponse>(http, {
method: 'POST',
pathname: `/api/ml/modules/setup/apm_transaction`,
6 changes: 6 additions & 0 deletions x-pack/plugins/apm/common/ml_job_constants.test.ts
@@ -21,6 +21,12 @@ describe('ml_job_constants', () => {
expect(getMlJobId('myServiceName', 'myTransactionType')).toBe(
'myservicename-mytransactiontype-high_mean_response_time'
);
+expect(getMlJobId('my service name')).toBe(
+'my_service_name-high_mean_response_time'
+);
+expect(getMlJobId('my service name', 'my transaction type')).toBe(
+'my_service_name-my_transaction_type-high_mean_response_time'
+);
});

it('getMlIndex', () => {
6 changes: 5 additions & 1 deletion x-pack/plugins/apm/common/ml_job_constants.ts
@@ -6,7 +6,7 @@

export function getMlPrefix(serviceName: string, transactionType?: string) {
const maybeTransactionType = transactionType ? `${transactionType}-` : '';
-return `${serviceName}-${maybeTransactionType}`.toLowerCase();
+return encodeForMlApi(`${serviceName}-${maybeTransactionType}`);
}

export function getMlJobId(serviceName: string, transactionType?: string) {
@@ -16,3 +16,7 @@ export function getMlJobId(serviceName: string, transactionType?: string) {
export function getMlIndex(serviceName: string, transactionType?: string) {
return `.ml-anomalies-${getMlJobId(serviceName, transactionType)}`;
}
+
+export function encodeForMlApi(value: string) {
+return value.replace(/\s+/g, '_').toLowerCase();
Member

What about using encodeURIComponent? With this change, "service a" and "service_a" are now treated as the same service (although it's very much an edge case).

Btw, a comment would probably be useful later (e.g. "ML doesn't allow spaces in group properties").
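
To make the edge case concrete, here is a minimal sketch that copies `encodeForMlApi` from this diff and runs it on two distinct service names. The sample names and the variable names are illustrative only.

```typescript
// Copied from this PR: runs of whitespace become '_', then everything is lowercased.
function encodeForMlApi(value: string): string {
  return value.replace(/\s+/g, '_').toLowerCase();
}

// Two different service names (illustrative) ...
const withSpace = encodeForMlApi('service a');
const withUnderscore = encodeForMlApi('service_a');

// ... that end up with the same encoded value, so their ML groups and job IDs collide.
console.log(withSpace, withUnderscore, withSpace === withUnderscore);
```

The same applies to casing: because of `toLowerCase()`, names that differ only in case also share one encoded value.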

Member

Btw, are there other characters that ML doesn't support and that we should be aware of?

Member Author

I didn't consider that! But unfortunately % is not supported either. Not sure how we could solve this issue. See https://github.com/elastic/elasticsearch/blob/95a7eed9aa35f47b228e402508709b5bd6703cf4/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/utils/MlStrings.java#L20-L26 for the regex.
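
A rough sketch of why `encodeURIComponent` does not help here. `ASSUMED_ML_ID_PATTERN` and `looksLikeValidMlId` are hypothetical: the pattern only approximates the rule described in this thread (lowercase alphanumerics, hyphens and underscores, starting and ending with an alphanumeric character) and is not copied from `MlStrings.java`, so check the linked source for the real regex.

```typescript
// ASSUMPTION: approximation of the ML ID rule discussed in this thread,
// not the actual regex from MlStrings.java.
const ASSUMED_ML_ID_PATTERN = /^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$/;

// Hypothetical helper for illustration only.
function looksLikeValidMlId(id: string): boolean {
  return ASSUMED_ML_ID_PATTERN.test(id);
}

// Percent-encoding turns the space into '%20', which contains '%'.
const percentEncoded = encodeURIComponent('my service').toLowerCase();

// The approach taken in this PR turns the space into '_'.
const underscored = 'my service'.replace(/\s+/g, '_').toLowerCase();

console.log(percentEncoded, looksLikeValidMlId(percentEncoded)); // my%20service false
console.log(underscored, looksLikeValidMlId(underscored)); // my_service true
```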

Member

Ah, okay. Not sure how to fix then. Let's leave it for now.

Member @jgowdyelastic, Apr 27, 2020

This is the function we use to validate an ML job/group ID in the UI.

export function isJobIdValid(jobId) {

Member @sorenlouv, Apr 27, 2020

Thanks for chiming in @jgowdyelastic.

We use a slightly more liberal regex: ^[a-zA-Z0-9 _-]+$ (alphanumeric characters, spaces, underscores, and dashes).
https://www.elastic.co/guide/en/apm/get-started/current/agents.html#choose-service-name

// it must also start and end with an alphanumeric character'

Sounds like we might run into problems with service names that start or end with a non-alphanumeric character, e.g. service-a- or _internal_service.

Either we should find a fix on our side, or perhaps you can allow this?
What's the background for this limitation?
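
A small sketch of the mismatch, for illustration. `APM_SERVICE_NAME` is the regex quoted above; `ASSUMED_ML_ID` is a hypothetical approximation of the ML rule (lowercase alphanumerics, hyphens and underscores, starting and ending with an alphanumeric character), not the actual `isJobIdValid` implementation; `encode` mirrors `encodeForMlApi` from this PR.

```typescript
// Service-name pattern quoted in this comment.
const APM_SERVICE_NAME = /^[a-zA-Z0-9 _-]+$/;

// ASSUMPTION: approximation of the ML ID rule, not the real isJobIdValid.
const ASSUMED_ML_ID = /^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$/;

// Same transformation as encodeForMlApi in this PR.
const encode = (value: string): string => value.replace(/\s+/g, '_').toLowerCase();

// Illustrative service names, all accepted by the APM pattern.
const sampleNames = ['opbeans-node', 'my service name', 'service-a-', '_internal_service'];

// Names that APM accepts but whose encoded form the assumed ML rule rejects.
const problematic = sampleNames.filter(
  (name) => APM_SERVICE_NAME.test(name) && !ASSUMED_ML_ID.test(encode(name))
);

console.log(problematic); // [ 'service-a-', '_internal_service' ]
```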

Member @jgowdyelastic, Apr 27, 2020

There are a few reasons why we don't allow non-alphanumeric chars at the start and end of the ID. They mainly have to do with the ID being used as part of our endpoint URIs.
We follow Elasticsearch's naming convention and do not allow - or _ at the start. . is used for "hidden" indices etc.
Some URL parsing systems don't like . chars at the end, e.g. Slack will not automatically include it in a link.
It's for reasons like these that we've gone with this fairly strict rule.

Member

Makes sense. I don't anticipate that this will be common in service names either, but we currently do allow it, meaning we'll have to handle it somehow.

The easy solution would be to strip non-alphanumeric chars at the beginning and end, but that would create the same problem I mentioned before, where _service-A and service-A become identical (very much an edge case as well).
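
A minimal sketch of that "easy solution" and the collision it introduces. `stripEdges` is a hypothetical helper written for this illustration and is not part of the PR.

```typescript
// Hypothetical helper: remove non-alphanumeric characters from both ends only,
// leaving characters in the middle untouched.
function stripEdges(value: string): string {
  return value.replace(/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+$/g, '');
}

const strippedA = stripEdges('_service-A');
const strippedB = stripEdges('service-A');

// Both names reduce to 'service-A', so they would share ML jobs.
console.log(strippedA, strippedB, strippedA === strippedB);
```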

+}