Link checker should be able to prohibit unknown redirects #6525
Comments
If one can tell in advance where they are redirected, might as well use the direct link in the docs and skip the redirect.
I provided a reason why I want to be able to link to a redirect, unless you think the base URL of Sphinx itself should not be linkable?
I misread the issue originally. I was hoping all redirects could be replaced by the final version of the URL, but that’s not true. What do you think of a mapping in the config?
For the sphinx-doc.org case I would not expect to specify the exact final URL because I don't care where it redirects to when I link to If So
Of course, when you start allowing regex in the
There may be multiple conflicting mappings; if any one of them matches then the link is ok.
This is something I have just come across myself, and such a setting would be helpful to ignore the fact that a redirect happened - in other words, set the state as "working" instead of "redirected" as long as the target page is available. Another example of a case where this would be helpful is wanting to ignore redirects in the case of documentation versions, e.g. I could see a configuration along the following lines (very much what @nomis has specified above):
# Check that the link is "working" but don't flag as "redirected" unless the target doesn't match.
# Keys match the original URL; values match the redirect target, with \N referring to groups captured by the key.
linkcheck_redirects_ignore = {
    r'^https://([^/?#]+)/$': r'^https://\1/(?:home|index)\.html?$',
    r'^https://(nodejs\.org)/$': r'^https://\1/[-a-z]+/$',
    r'^https://(pip\.pypa\.io)/$': r'^https://\1/[-a-z]+/stable/$',
    r'^https://(www\.sphinx-doc\.org)/$': r'^https://\1/[-a-z]+/master/$',
    r'^https://(pytest\.org)/$': r'^https://docs\.\1/[-a-z]+/\d+\.\d+\.x/$',
    r'^https://github\.com/([^/?#]+)/([^/?#]+)/blob/(.*)$': r'^https://github\.com/\1/\2/tree/\3$',
    r'^https://([^/?#\.]+)\.readthedocs\.io/$': r'^https://\1\.readthedocs\.io/[-a-z]+/(?:master|latest|stable)/$',
    r'^https://dev\.mysql\.com/doc/refman/': r'^https://dev\.mysql\.com/doc/refman/\d+\.\d+/',
    r'^https://docs\.djangoproject\.com/': r'^https://docs\.djangoproject\.com/[-a-z]+/\d+\.\d+/',
    r'^https://docs\.djangoproject\.com/([-a-z]+)/stable/': r'^https://docs\.djangoproject\.com/\1/\d+\.\d+/',
}
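For illustration, here is a minimal sketch of how a mapping like the one above could be evaluated. It is not how Sphinx implements anything: `linkcheck_redirects_ignore` is only the name proposed in this comment, and `expand_target` and `redirect_allowed` are hypothetical helpers. The sketch assumes that `\N` in a value refers to group N captured by the key, and that a redirect is acceptable if any one entry matches.

```python
import re


def expand_target(template, source_match):
    r"""Replace \N in the target template with the escaped text of group N.

    Assumption: group references are single digits (\1 to \9).
    """
    def substitute(ref):
        group_text = source_match.group(int(ref.group(1)))
        # Escape so that dots etc. in the captured host match literally.
        return re.escape(group_text)

    return re.sub(r"\\([1-9])", substitute, template)


def redirect_allowed(source, destination, mapping):
    """Return True if any (source pattern, target template) entry accepts the redirect."""
    for source_pattern, target_template in mapping.items():
        source_match = re.match(source_pattern, source)
        if source_match is None:
            continue
        if re.match(expand_target(target_template, source_match), destination):
            return True
    return False


# One entry taken from the proposed configuration.
mapping = {
    r'^https://(www\.sphinx-doc\.org)/$': r'^https://\1/[-a-z]+/master/$',
}

print(redirect_allowed(
    "https://www.sphinx-doc.org/",
    "https://www.sphinx-doc.org/en/master/",
    mapping,
))
```

A redirect whose source matches no key, or whose destination fails every matching entry, would still be reported as "redirected" under this scheme.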
Add a new confval; `linkcheck_warn_redirects` to emit a warning when the hyperlink is redirected. It's useful to detect unexpected redirects under the warn-is-error mode.
Add a new confval; linkcheck_ignore_redirects to ignore hyperlinks that are redirected as expected.
I have now posted #9234 to resolve this issue. Please let me know your opinion if you have time.
Close #6525: linkcheck: Add linkcheck_ignore_redirects and linkcheck_warn_redirects
Is your feature request related to a problem? Please describe.
A lot of links become stale or move. Good websites will provide redirects to the correct new location or return an HTTP error code. Bad websites will redirect to an unrelated page or the root of the website.
Preventing all redirects does not allow links to URLs like https://www.sphinx-doc.org/ which redirects to https://www.sphinx-doc.org/en/master/. It needs to be possible to allow these redirects but disallow others.
Describe the solution you'd like
It should be possible to prohibit unknown redirects by listing all of the allowed redirects as pairs of URLs.
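As a rough sketch of what "pairs of URLs" could mean in practice (the names `ALLOWED_REDIRECTS` and `classify_redirect` are hypothetical and not part of Sphinx), the check could be an exact lookup of the (original, destination) pair, with every unlisted redirect treated as an error:

```python
# Hypothetical allow-list: each entry is an exact (original URL, redirect destination) pair.
ALLOWED_REDIRECTS = {
    ("https://www.sphinx-doc.org/", "https://www.sphinx-doc.org/en/master/"),
}


def classify_redirect(source, destination):
    """Return "working" for a listed redirect and "broken" for any other redirect."""
    if (source, destination) in ALLOWED_REDIRECTS:
        return "working"
    return "broken"


print(classify_redirect(
    "https://www.sphinx-doc.org/",
    "https://www.sphinx-doc.org/en/master/",
))
```

Exact pairs are strict by design: if the destination of a listed redirect changes, the link is flagged again until the pair is reviewed and updated.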
Describe alternatives you've considered
Post-process linkcheck/output.txt by removing filenames and line numbers, then sorting it and comparing it with known good output.
Additional context
A link to https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/ (which used to work) now redirects to https://blogs.windows.com/windowsdeveloper/. Linkcheck allows this but the original link is not valid and needs to be updated to the article's new URL of https://blogs.windows.com/windowsdeveloper/2016/12/02/symlinks-windows-10/.
Linkcheck should be able to report an error for this redirect.
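The post-processing alternative mentioned earlier could look roughly like the sketch below, applied to this example. It assumes each line of linkcheck/output.txt starts with a `<file>:<line>: ` prefix; that layout, the sample lines, and the helper names `normalize` and `unexpected_results` are assumptions made for illustration.

```python
import re

# Assumed prefix of each output line: "<file>:<line>: "
LOCATION_PREFIX = re.compile(r"^[^:]+:\d+:\s*")


def normalize(output_text):
    """Strip file names and line numbers, drop blank lines, de-duplicate and sort."""
    stripped = (LOCATION_PREFIX.sub("", line) for line in output_text.splitlines())
    return sorted({line for line in stripped if line})


def unexpected_results(current, known_good):
    """Return normalized lines of the current run that are absent from the known good output."""
    good = set(normalize(known_good))
    return [line for line in normalize(current) if line not in good]


known_good = (
    "index.rst:3: [redirected permanently] https://www.sphinx-doc.org/"
    " to https://www.sphinx-doc.org/en/master/\n"
)
current = (
    "index.rst:5: [redirected permanently] https://www.sphinx-doc.org/"
    " to https://www.sphinx-doc.org/en/master/\n"
    "usage.rst:10: [redirected permanently]"
    " https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/"
    " to https://blogs.windows.com/windowsdeveloper/\n"
)

for line in unexpected_results(current, known_good):
    print(line)
```

With these sample inputs only the blogs.windows.com redirect is reported, because the sphinx-doc.org line differs from the known good output only in its line number.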