Clean up merge from iev-data gem, update README #39
   a&.sub(%r{<br/>.*$}, "")
-    &.sub(%r{, <.*$}, "")
-    &.gsub(%r{<[^<>]*>}, "")&.strip
+    &.sub(/, <.*$/, "")
+    &.gsub(/<[^<>]*>/, "")&.strip
Check failure
Code scanning / CodeQL
Incomplete multi-character sanitization High
The sanitized string may still contain `<script`.
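A minimal sketch of what this alert is about, using a made-up input string (not taken from the IEV data): a single pass of the tag-stripping pattern can remove inner tags in a way that makes the leftover pieces join into a new tag.

```ruby
# Hypothetical input, chosen for illustration; it is not taken from the IEV data.
TAG_PATTERN = /<[^<>]*>/

nested = "<<b>script>alert(1)<</b>/script>"

# One pass removes only the inner <b> and </b> tags; the leftover pieces
# join up into a complete <script> element.
single_pass = nested.gsub(TAG_PATTERN, "")
puts single_pass  # => <script>alert(1)</script>

# A second pass removes the tags that the first pass created.
second_pass = single_pass.gsub(TAG_PATTERN, "")
puts second_pass  # => alert(1)
```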
Copilot Autofix AI 19 days ago
To fix the problem, we need to ensure that all instances of the targeted pattern are removed from the string. We can achieve this by applying the regular expression replacement repeatedly until no more replacements can be performed. This ensures that the unsafe text does not re-appear in the sanitized input.
We will modify the code to repeatedly apply the `sub` method until the string no longer changes. This will be done for both the `%r{<br/>.*$}` and `, <.*$` patterns.
@@ -40,5 +40,13 @@
   a = doc&.at(xpath)&.children&.to_xml
-  a&.sub(%r{<br/>.*$}, "")
-    &.sub(/, <.*$/, "")
-    &.gsub(/<[^<>]*>/, "")&.strip
+  previous = nil
+  while a != previous
+    previous = a
+    a = a&.sub(%r{<br/>.*$}, "")
+  end
+  previous = nil
+  while a != previous
+    previous = a
+    a = a&.sub(/, <.*$/, "")
+  end
+  a&.gsub(/<[^<>]*>/, "")&.strip
 end
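The repeat-until-stable idea described in the autofix can be sketched as a standalone helper. This is an illustration only, not the code proposed in this PR: the helper name `strip_tags_until_stable` and the sample inputs are hypothetical, and the loop is applied here to the tag-stripping pattern rather than to the two `sub` calls.

```ruby
# Hypothetical helper (not part of the PR): re-apply the substitution until
# the string stops changing, so tags formed by an earlier pass are removed too.
def strip_tags_until_stable(text, pattern = /<[^<>]*>/)
  return nil if text.nil?

  loop do
    stripped = text.gsub(pattern, "")
    # A pass that changes nothing means a fixed point has been reached.
    return stripped if stripped == text

    text = stripped
  end
end

puts strip_tags_until_stable("<<b>script>alert(1)<</b>/script>")  # => alert(1)
puts strip_tags_until_stable("plain <i>text</i>")                 # => plain text
```

Note that an unterminated fragment such as `<script` never matches the pattern and therefore survives any number of passes; if the output is later rendered as HTML, escaping any remaining `<` is likely the more robust option.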
text.gsub(
  %r{<a href="?(IEV)\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)</?a>},
  '{{\3, \1:\2}}',
).gsub(
  %r{<a href="?\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)</?a>},
  '{{\3, IEV:\2}}',
).gsub(
  # To handle <a> tags without ending tag like
  # `Voir <a href=IEV103-05-21>IEV 103-05-21`
  # for concept '702-03-11' in `fr`
  /<a href="?(IEV)?\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)$/,
  '{{\3, IEV:\2}}',
).gsub(
  %r{<a href="?([^<>]*?)"?>(.*?)</a>},
  '\1[\2]',
).gsub(
  Regexp.new([SIMG_PATH_REGEX, '\\s*', FIGURE_TWO_REGEX].join),
  "#{IMAGE_PATH_PREFIX}/#{term_domain}/\\1[Figure \\2 - \\3; \\6]",
).gsub(
  Regexp.new([SIMG_PATH_REGEX, '\\s*', FIGURE_ONE_REGEX].join),
  "#{IMAGE_PATH_PREFIX}/#{term_domain}/\\1[Figure \\2 - \\3]",
).gsub(
  /<img\s+([^<>]+?)\s*>/,
  "#{IMAGE_PATH_PREFIX}/#{term_domain}/\\1[]",
).gsub(
  /<br>/,
  "\n",
).gsub(
  %r{<b>(.*?)</b>},
  '*\\1*',
)
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
This regular expression depends on library input (reported for three regular expressions).
# ... (preceding substitutions in the chain, as quoted above)
).gsub(
  /<img\s+([^<>]+?)\s*>/,
  "#{IMAGE_PATH_PREFIX}/#{term_domain}/\\1[]",
).gsub(
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
This regular expression depends on library input (reported for six regular expressions).
# ... (preceding substitutions in the chain, as quoted above)
).gsub(
  %r{<a href="?([^<>]*?)"?>(.*?)</a>},
  '\1[\2]',
).gsub(
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
This regular expression depends on library input.
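One common way to address this kind of alert is to replace the lazy `.*?` groups with negated character classes, so a failed match cannot keep scanning to the end of the line. The sketch below applies that idea to the generic `<a href=...>` pattern; it is an assumption about a possible mitigation, not a change made in this PR, and it changes behaviour slightly (link text containing nested tags no longer matches). The constant name `LINK_PATTERN` and the sample inputs are hypothetical.

```ruby
# Hypothetical rewrite of the generic link pattern: each group is a negated
# character class, so every match attempt stops at the next "<" rather than
# scanning ahead for a closing </a> that may never come.
LINK_PATTERN = %r{<a href="?([^<>"]*)"?>([^<]*)</a>}

quoted   = '<a href="http://example.com">example</a>'
unquoted = "See <a href=page.html>this page</a>."

puts quoted.gsub(LINK_PATTERN, '\1[\2]')    # => http://example.com[example]
puts unquoted.gsub(LINK_PATTERN, '\1[\2]')  # => See page.html[this page].

# Many opening tags with no closing tag: the input is returned unchanged,
# and each failed attempt ends right after its own opening tag.
unclosed = "<a href=x>" * 5_000
puts unclosed.gsub(LINK_PATTERN, '\1[\2]') == unclosed  # => true
```

On Ruby 3.2 or later, setting `Regexp.timeout` can additionally bound how long any single match is allowed to run.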
# ... (preceding substitutions in the chain, as quoted above)
).gsub(
  # To handle <a> tags without ending tag like
  # `Voir <a href=IEV103-05-21>IEV 103-05-21`
  # for concept '702-03-11' in `fr`
  /<a href="?(IEV)?\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)$/,
  '{{\3, IEV:\2}}',
).gsub(
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
This regular expression depends on library input.
# ... (preceding substitution in the chain, as quoted above)
).gsub(
  %r{<a href="?\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)</?a>},
  '{{\3, IEV:\2}}',
).gsub(
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
This regular expression depends on library input.
text.gsub(
  %r{<a href="?(IEV)\s*(\d\d\d-\d\d-\d\d\d?)"?>(.*?)</?a>},
  '{{\3, \1:\2}}',
).gsub(
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
Fixes #37