Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add link checking for markdown in CI #2145

Merged
merged 6 commits into from
Jun 8, 2023
Merged

Add link checking for markdown in CI #2145

merged 6 commits into from
Jun 8, 2023

Conversation

wolf99
Copy link
Contributor

@wolf99 wolf99 commented Nov 2, 2022

Fixes #2139

@wolf99 wolf99 added CI x:action/create Work on something from scratch x:knowledge/elementary Little Exercism knowledge required x:type/ci Work on Continuous Integration (e.g. GitHub Actions workflows) x:size/small Small amount of work x:rep/small Small amount of reputation labels Nov 2, 2022
@wolf99 wolf99 requested a review from a team as a code owner November 2, 2022 20:09
Copy link
Member

@kytrinyx kytrinyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@wolf99
Copy link
Contributor Author

wolf99 commented Nov 2, 2022

I see a couple of things in the output that I need to adjust for before this is merged

@wolf99
Copy link
Contributor Author

wolf99 commented Nov 2, 2022

I actually don't know why these warnings are occurring.
The raw files are valid and are not as the checker seems to be interpreting them.
I will look into this further later on.

@jiegillet
Copy link
Contributor

It's not happy with the email link [abuse@exercism.org](mailto:abuse@exercism.org?subject=%5BCoC%5D)

## Errors per input
### Errors in CODE_OF_CONDUCT.md
* [file:///home/runner/work/problem-specifications/problem-specifications/mailto:abuse@exercism.org](file:///home/runner/work/problem-specifications/problem-specifications/mailto:abuse@exercism.org): Failed: Cannot find file (status code: ERR)

@Stargator
Copy link
Contributor

Stargator commented Nov 4, 2022

Sorry I am late to the party. The dart track does use lychee for our Link workflow.

We had it disabled a month or so ago because it was failing for mailto links which it should support.

@kytrinyx
Copy link
Member

kytrinyx commented Nov 4, 2022

@Stargator pointed us to the open issue on the lychee repository that is related to the failing mailto check: lycheeverse/lychee#653

@wolf99
Copy link
Contributor Author

wolf99 commented Nov 4, 2022

This is now correctly failing for a an incorrectly detected bad link (the mailto mentioned by @Stargator ).

Until Lychee fixes that issue and subsequently makes a new release of lychee and the lychee-action, this PR is blocked.

@mre
Copy link

mre commented Nov 5, 2022

I've worked on a fix here: lycheeverse/lychee#809.
Some more 👀 would be helpful to get this merged.

@Stargator
Copy link
Contributor

@wolf99 @kytrinyx lychee just merged in a fix for this. I anticipate the next version for the lychee-action will include that fix. The dart track has dependabot set to open a PR when that update is available, I believe the same is so for this repo.

@mre
Copy link

mre commented Nov 6, 2022

Yup, I'll release a new version soon. Want to merge in lycheeverse/lychee#810 as well.
Thanks for the support on this one y'all. Much appreciated. ❤️

@mre
Copy link

mre commented Nov 9, 2022

New version of lychee-action with the fix is out.

@wolf99
Copy link
Contributor Author

wolf99 commented Nov 9, 2022

Think you so much for the help @mre !
I have updated the action version in the workflow here and the mailto is no longer failing.

There is still the issue with the timeout for the one link that we have not been able to solve. I will add a commit with an ignore of some sort for this URL.

@mre
Copy link

mre commented Nov 9, 2022

No worries.
Just fyi, I also released a new version to fix the issues you described in lycheeverse/lychee-action#169. https://github.com/lycheeverse/lychee-action/releases/tag/v1.5.4
So feel free to update to 1.5.4 to (hopefully) resolve that problem as well.

@Stargator
Copy link
Contributor

No worries.
Just fyi, I also released a new version to fix the issues you described in lycheeverse/lychee-action#169. https://github.com/lycheeverse/lychee-action/releases/tag/v1.5.4
So feel free to update to 1.5.4 to (hopefully) resolve that problem as well.

Thanks, @mre!

@wolf99 The check initially failed due to a timeout on one website. I re-ran it and everything is passing. it looks good!

@Stargator
Copy link
Contributor

There is a merge conflict that needs to be resolved

@mre
Copy link

mre commented May 23, 2023

From what I can see, the only missing thing is to fix the merge conflicts, right? @wolf99 can you have a look?

@wolf99
Copy link
Contributor Author

wolf99 commented May 27, 2023

Ill look at this this weekend I hope.

@wolf99
Copy link
Contributor Author

wolf99 commented May 28, 2023

Seems like there are issues with 2 URLs.

  1. ✗ [403] https://www.collinsdictionary.com/dictionary/english/lackadaisical | Failed: Network error: Forbidden
  2. ⧖ [TIMEOUT] https://papyr.com/hypertextbooks/grammar/ph_noun.htm | Timeout

Testing the first one locally with curl returns the same resultunless provided with an invalid (contains a :) user-agent.
So it seems like it may be an issue with how that site has configured its server?

Testing the second locally with curl returns 200; so I am unable to reproduce that issue

@wolf99
Copy link
Contributor Author

wolf99 commented May 28, 2023

Re-ran.
Timeout is gone, but 403 remains.

Looks like that site is doing some check for JS and cookie support, not sure if lycheee supports that. More research required.

$ curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0
.0 Safari/537.36" --referer https://foo.com https://www.collinsdictionary.com/dictionary/english/lackadaisical

<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="/cdn-cgi/styles/challenges.css" rel="stylesheet">


</head>
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <noscript>
            <div id="challenge-error-title">
                <div class="h2">
                    <span class="icon-wrapper">
                        <div class="heading-icon warning-icon"></div>
                    </span>
                    <span id="challenge-error-text">
                        Enable JavaScript and cookies to continue
                    </span>
                </div>
            </div>
        </noscript>
#...

@mre
Copy link

mre commented May 29, 2023

Nice. You made some great progress! Thanks for looking into that.
The remaining issue for https://www.collinsdictionary.com/dictionary/english/lackadaisical points towards Cloudflare bot-detection. That's my guess, at least, because they use the same page layout (compare here). The URL works locally, but on GitHub we seem to trigger the bot detector and there isn't much we can do about it at the moment. lychee does not support JavaScript or cookies. However, you can add a .lycheeignore file and add the URL there, to skip it from being tested. That should resolve the issue.

@wolf99
Copy link
Contributor Author

wolf99 commented May 29, 2023

Adding the .lycheeignore solved it it seems.

@mre thanks for all your help!

@kytrinyx
Copy link
Member

kytrinyx commented Jun 8, 2023

Yay, this is looking fantastic. Thanks so much for your work on this @wolf99, and your help with it @mre!

@kytrinyx kytrinyx merged commit 5bc4c0e into main Jun 8, 2023
@kytrinyx kytrinyx deleted the link-checker branch June 8, 2023 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI x:action/create Work on something from scratch x:knowledge/elementary Little Exercism knowledge required x:rep/small Small amount of reputation x:size/small Small amount of work x:type/ci Work on Continuous Integration (e.g. GitHub Actions workflows)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add link checking to CI
7 participants