-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[extract_log] Improve extract_log script #559
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -90,9 +90,14 @@ def extract_line(directory, filename, target_string): | |
file = gzip.GzipFile(path) | ||
else: | ||
file = open(path) | ||
result = None | ||
result = [] | ||
with file: | ||
result = [(filename, line) for line in file if target_string in line and 'nsible' not in line] | ||
# This might be a gunzip file or logrotate issue, there has | ||
# been '\x00's in front of the log entry timestamp which | ||
# messes up with the comparator. | ||
# Prehandle lines to remove these sub-strings | ||
result = [(filename, line.replace('\x00', '')) for line in file if target_string in line and 'nsible' not in line] | ||
|
||
return result | ||
|
||
|
||
|
@@ -159,9 +164,12 @@ def extract_latest_line_with_string(directory, filenames, start_string): | |
for filename in filenames: | ||
target_lines.extend(extract_line(directory, filename, start_string)) | ||
|
||
sorted_target_lines = sorted(target_lines, cmp=comparator) | ||
target = target_lines[0] if len(target_lines) > 0 else None | ||
for line in target_lines: | ||
if comparator(line, target) > 0: | ||
target = line | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the difference from
? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sort is sorting the whole array. The best of sorting algorithm is O(N * Log(N)). We only need the latest entry, which can be obtained by the loop in O(N). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not a huge optimization by itself. I had to open the sort call to debug. And since we could get what we need faster, I figured why not :-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in this case you can use max() function from python which is O(N) and just one-liner There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As we discussed in person. max doesn't take cmp=... parameter and the (key= lamda) method needs to extend string class. |
||
|
||
return sorted_target_lines[-1] | ||
return target | ||
|
||
|
||
def calculate_files_to_copy(filenames, file_with_latest_line): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not required. The empty list will be overwritten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, Either way is fine. The append function in caller takes both None and [] and the result is the same (in case no entry were found).