Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch spec info from W3C API, Specref or the spec #22

Merged
merged 3 commits into from
Apr 1, 2020

Conversation

tidoust
Copy link
Member

@tidoust tidoust commented Mar 31, 2020

This update adds the logic needed to complete specification metadata with the spec's title (#10) and the TR/ED URLs (#11).

This additional information gets fetched from the W3C API when possible, from Specref when the W3C API does not know anything about the spec, or from the spec itself when neither the W3C API nor Specref knows anything about the spec.

Fetching logic is in src/fetch-info.js. By definition, fetching this information requires sending network requests and, as such, cannot be done in realtime on hundreds of specs. Instead, the commit adds a GitHub action set to run every 6 hours to fetch the information and update the specs-info.json file.

In turn, this file is imported by the index.js script, which now also returns the information.

New properties returned for each specification are:

  • title: the title of the spec
  • trUrl: the URL of the TR document for specs published in TR space
  • edUrl: the URL of the latest Editor's Draft when known
  • source: one of w3c, specref, or spec to give the source of the info

Other updates:

  • Refactored JSON schemas in a schema folder with a common dfns.json file to avoid duplicating type definitions.
  • Extended the filtering logic in index.js to also look for titles, source, and URLs.

Note that the code and the tests now expect to find a config.json file in the root folder that contains an object with a w3cApiKey property set to a valid W3C API key (this file is not part of the commit).

Also note that tests current fail if one tries to add a new spec to specs.json and does not run npm run fetch-info, because the new spec won't have a title property, which is defined as required in the JSON schema. This means that updates to specs-info.json need to be part of pull requests on the repo. Depending on the envisioned workflow for updates (and packaging), we might want to relax that rule later on.

This update adds the logic needed to complete specification metadata with the
spec's title (w3c#10) and the TR/ED URLs (w3c#11).

This additional information gets fetched from the W3C API when possible, from
Specref when the W3C API does not know anything about the spec, or from the
spec itself when neither the W3C API nor Specref knows anything about the spec.

Fetching logic is in `src/fetch-info.js`. By definition, fetching this
information requires sending network requests and, as such, cannot be done in
realtime on hundreds of specs. Instead, the commit adds a GitHub action set to
run every 6 hours to fetch the information and update the `specs-info.json`
file.

In turn, this file is imported by the `index.js` script, which now also returns
the information.

New properties returned for each specification are:
- `title`: the title of the spec
- `trUrl`: the URL of the TR document for specs published in TR space
- `edUrl`: the URL of the latest Editor's Draft when known
- `source`: one of `w3c`, `specref`, or `spec` to give the source of the info

Other updates:
- Refactored JSON schemas in a schema folder with a common `dfns.json` file to
avoid duplicating type definitions.
- Extended the filtering logic in `index.js` to also look for titles, source,
and URLs.

Note that the code and the tests now expect to find a `config.json` file in the
root folder that contains an object with a `w3cApiKey` property set to a valid
W3C API key (this file is *not* part of the commit).

Also note that tests current fail if one tries to add a new spec to `specs.json`
and does not run `npm run fetch-info`, because the new spec won't have a `title`
property, which is defined as `required` in the JSON schema. This means that
updates to `specs-info.json` need to be part of pull requests on the repo.
Depending on the envisioned workflow for updates (and packaging), we might want
to relax that rule later on.
@tidoust tidoust requested a review from dontcallmedom March 31, 2020 21:27
@tidoust
Copy link
Member Author

tidoust commented Mar 31, 2020

Checks fail because linting workflow does not seem to have access to the CONFIG_JSON secret. This could be a feature if I read the doc correctly: "With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository":
https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets#using-encrypted-secrets-in-a-workflow

Can we run tests on a pull request that need a secret? If not, we'll have to create a PR test suite that only runs tests that can be run without a W3C API key.

@dontcallmedom
Copy link
Member

A similar set up works on pull requests in validate-repos, so I don't think the issue is with secrets not being available.

@dontcallmedom
Copy link
Member

when looking at the results of a run, it shows "***" where the secret should be in validate-repos, and nothing here.

@tidoust
Copy link
Member Author

tidoust commented Apr 1, 2020

A similar set up works on pull requests in validate-repos, so I don't think the issue is with secrets not being available.

But that set up is for a cron job and actual "push" actions, not for pull requests?

@dontcallmedom
Copy link
Member

no, it also works on pull requests, e.g. https://github.com/w3c/validate-repos/pull/86/checks?check_run_id=507710946

@dontcallmedom
Copy link
Member

hmm... but you're right that it's "not from a fork" (i.e. a local pull request only)

@dontcallmedom
Copy link
Member

maybe that's an acceptable limitation?

schema/dfns.json Outdated Show resolved Hide resolved

const url = `https://api.w3.org/specifications/${spec.name}`;
return new Promise((resolve, reject) => {
const request = https.get(url, options, res => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason you're not using a higher level library (e.g. requests) for HTTP fetches?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm trying to introduce dependencies only when needed. We'll probably need to change that later on but network requests seem manageable with core node.js modules for now.

Copy link
Member

@dontcallmedom dontcallmedom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with 2 comments / questions

Tests that require a W3C API key cannot run on pull requests from forked
repositories. This update splits them to a separate file, and create two
separate workflows:
- one for push actions that runs all tests
- one for pull requests that runs all tests that don't require a W3C API key

dfns.json renamed to definitions.json, and top-level key in that file renamed
to `proptype`.
@tidoust tidoust merged commit e334600 into w3c:master Apr 1, 2020
@tidoust tidoust deleted the title branch May 7, 2020 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants