Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMLII-933 - Linux log file tailing e2e test #20202
AMLII-933 - Linux log file tailing e2e test #20202
Changes from 3 commits
c014d79
a540004
2334aad
64c6cb1
9da23b0
51135e0
93d335e
2063e69
9f869f3
1f705ac
e21eeeb
16e007f
66b5c1e
7862cca
91675df
399f3c9
59a8d8e
d14b700
128c1b3
20cc38c
2f6d298
cc0095a
52702e1
399c938
46cb70e
4492f32
87d576e
4faa457
5d81b64
38dd4b7
8f1f3b5
2b1aae8
5621419
6d9964f
16d3070
c02e5da
6f770cf
957e1f4
833f859
2ce3d15
4074606
0846fa8
7416bd8
b25d5b9
caf9b62
9496578
57d3ef4
36b7e50
d3ba4d7
c89fbaf
6427e90
76e7371
b4fd09e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filters for any logs from the service 'hello'. Should we simplify this and just assert that we don't find the log we emit below?
My thought is that by being more specific we would reduce any chances of false-positives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So currently we do have different log content for different scenarios. Mainly:
hello-world
access-denied
access-granted
hello-world-new-content
We could check for all those 4 specific logs but i feel like that is less efficient vs just checking for the intake for all those logs sent via the same service tag. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that is the case, then this assertion should be part of the parent suite as a precondition to running the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This
EventuallyWithT
is asserting that there are no logs present already (before the logs were accessible to the agent), correct?If that's the case, shouldn't that be a very short (maybe even synchronous) check? Do these last two arguments mean that it will retry every 10s for 5min to assert that the logs are continuously empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a pending feature in
Eventually
that is early returning on error onrequire
. Current behaviour is a panic that stops the test. Once that is implemented, we can change this function to retry onfakeintake.FilterLogs("hello")
and not onassert.Empty
.This code retries on failures on
assert.NoError
and onassert.Empty
. This means if will keep on failing if logs are not empty, as we don't flush them - and we don't want toThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it, thanks for the explanation, I had misunderstood when it retries.
This makes sense to me then, in the normal case this should run once, find that neither of the asserts fails, and proceed with the rest of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not get a fresh filesystem in this test? If this is relying on
/var/log/hello-world.log
to exist from the previous test, we should make that dependency explicit in some way. Maybe we assert on the files existence?Right now if this is just relying on the ordering of the cases in
TestLinuxLogTailing
, that is very subtle and feels easy to break in the future.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we technically check for the existence of the log file via error from the chmod and generate log since both will generate errors if the log file does not exist. I can put a sanity check at the beginning of the function to see if the log file exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💬 suggestion
You can also use the
Status
command from thes.Env().Agent
clientThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ question
Why do you need to restart the agent ? Does the agent stops checking when a file is unaccessible ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be way way back when the agent was in 7.10x or something but that used to be the behavior where users have to manually restart the agent when the permission changes 😅 . I just did a quick test and it looks like the agent now will rescan just fine when the permission changes so I'll get rid of this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a flaky issue, but it looks like even though the agent should rescan, the test fails to get new logs unless the agent is restarted. I can bring it up to OH next week
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, is this an expected feature ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this you could add a call to wait for the agent to be ready, there should be an helper on the agent client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add a blocking call to wait for the agent to be ready as @pducolin suggests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should have a Part 7 that checks if the log actually made it to fake-intake, shouldn't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💬 suggestion
You could use the stack definition with VM, fakeintake and agent client, to help running status command - the gain is tiny
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any built-in support to query the status object before its turned into template text?
Ie, instead of using
agent status
to query, query the status HTTP endpoint directly and expose the returned JSON?I'm hesitant to rely too much on
agent status
for tests in general, but it would be less error-prone if we're querying and asserting on a specific json field vsassert.Contains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move this implementation by @KevinFairise2 to common helpers, it executes the status command on the agent client (s.Env().Agent) and parses it using json library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed you can skip status, except if you want to test status, and check fakeintake directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we repeat the 'content' 10 times? This is slightly surprising to me as a user of
generateLog
, I would expectgenerateLog
to write exactly the log that I pass as an argument.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of in the future if we have tests that require to send big payload. So for now I'll just pass in a parameter that allow users to repeat contents X amount of time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 praise
Nice !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check will give incorrectly pass if you ask for the same content multiple times. The first content addition may succeed, and the second one may fail, but if the second one fails, this check will pass because it was added in the first call to
generateLogs
.This is also not a blocking comment for this review, just pointing this out for future improvement.