Add trailing slash to auto generated sitemap.xml for directories only #10044

Closed
4 of 7 tasks
John-fg opened this issue Apr 14, 2024 · 3 comments
Labels

  • "bug": An error in the Docusaurus core causing instability or issues with its execution.
  • "closed: can't repro": This bug is because of some local setup that can't be reproduced.

Comments

John-fg commented Apr 14, 2024

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

Bing.com shows a redirection message with HTTP 301 for every page because each link in sitemap.xml is missing a trailing slash. The message shown in the Bing.com search results is "Moved Permanently. The document has moved here." The actual page content is not displayed.

I'd like to:

  • prevent 301 redirects and use direct links in the generated sitemap.xml
  • have Docusaurus generate trailing slashes in the generated sitemap.xml for directories.

Also see #4134

Reproducible demo

No response

Steps to reproduce

Add trailingSlash: true to docusaurus.config.js.

Expected behavior

Trailing slashes should only be used for actual directories.

Actual behavior

When trailingSlash is added to docusaurus.config.js inside const config = {...}, the build is refused with error messages:

const config = {
  title: 'Mysite',
  tagline: 'tagline',
  favicon: 'img/favicon.ico',

  // Set the production url of your site here
  url: 'https://mysite.tld',
  // Set the /<baseUrl>/ pathname under which your site is served
  // For GitHub pages deployment, it is often '/<projectName>/'
  baseUrl: '/',
  trailingSlash: true,
  // ... (rest of the config omitted)
};

This creates a sitemap.xml with trailing slashes, but the build then fails due to broken links to anchors:

Error: Unable to build website for locale xx.
    at tryToBuildLocale (/home/user/mytopic/node_modules/@docusaurus/core/lib/commands/build.js:55:19)
    at async mapAsyncSequential (/home/user/mytopic/node_modules/@docusaurus/utils/lib/jsUtils.js:44:24)
    at async Command.build (/home/user/mytopic/node_modules/@docusaurus/core/lib/commands/build.js:82:21) {
  [cause]: Error: Docusaurus found broken links!

  Please check the pages of your site in the list below, and make sure you don't reference any path that does not exist.
  Note: it's possible to ignore broken links with the 'onBrokenLinks' Docusaurus configuration, and let the build pass.

  Exhaustive list of all broken links found:
  - Broken link on source page path = /docs/sub1/:
     -> linking to ./mydoc/#table-1 (resolved as: /docs/sub1/mydoc/#table-1)
     -> linking to mydoc2/#table-2 (resolved as: /docs/sub1/mydoc2/#table-2)
 (removed a list of more broken links)

It looks like links to anchors are not created properly. The directories here are mydoc and mydoc2; the anchors referenced on the index pages are #table-1 and #table-2.

The links in the Markdown file look like this:

[table 1](./mydoc#table-1) 
[table 2](mydoc2#table-2)

Your environment

  • Public source code: Docusaurus
  • Public site URL: n/a
  • Docusaurus version used: 3.2.1
  • Environment name and version (e.g. Chrome 89, Node.js 16.4): node v18.19.0
  • Operating system and version (e.g. Ubuntu 20.04.2 LTS): Debian 12 (bookworm).

Self-service

  • I'd be willing to fix this bug myself.
John-fg added the "bug" and "status: needs triage" labels on Apr 14, 2024

slorber (Collaborator) commented Apr 15, 2024

> Trailing slashes should only be used for actual directories.

No, that's not how this feature is designed, sorry. There's not even a concept of "directory" in Docusaurus, only "docs categories".


FYI we recently fixed a bug related to trailing slash not being applied to sitemap:
#9920


301 redirect is a server/host concern, not a Docusaurus concern. If your host serves 301 instead of 200, then you have to configure your host so that it serves 200 instead of 301.


Those links are standard HTML relative links. If you want your pages to end with / then your links must contain that trailing slash too, that's how HTML links work.

[table 1](./mydoc#table-1) 
[table 2](mydoc2#table-2)

We have a whole doc section explaining why we don't recommend those kinds of links, in particular because of trailingSlash portability.

https://docusaurus.io/docs/markdown-features/links
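To illustrate the point about relative links, here is a minimal sketch using the standard WHATWG URL class that ships with Node.js. The page URLs and the resolveLink helper are assumptions modelled on the paths in the build error above, not anything taken from Docusaurus itself.

```javascript
// Hypothetical page URLs: the same page served without and with a trailing slash.
const pageWithoutSlash = 'https://mysite.tld/docs/sub1';
const pageWithSlash = 'https://mysite.tld/docs/sub1/';

// Hypothetical helper: resolve a relative link the way a browser would
// and return only the path and the anchor.
function resolveLink(link, pageUrl) {
  const resolved = new URL(link, pageUrl);
  return resolved.pathname + resolved.hash;
}

// Without a trailing slash, "sub1" is treated as a file name and dropped.
console.log(resolveLink('./mydoc#table-1', pageWithoutSlash)); // /docs/mydoc#table-1

// With a trailing slash, "sub1/" is treated as a directory and kept.
console.log(resolveLink('./mydoc#table-1', pageWithSlash)); // /docs/sub1/mydoc#table-1
console.log(resolveLink('mydoc2#table-2', pageWithSlash)); // /docs/sub1/mydoc2#table-2
```

The same relative link therefore points to a different page depending on whether the current URL ends with /, which is why such links are not portable across trailingSlash settings.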



I'm closing because no concrete repro was provided, this issue is quite messy, and to me this works as intended unless proven otherwise.

If you want to discuss things further please create a runnable https://docusaurus.new/stackblitz repro

slorber closed this as not planned on Apr 15, 2024
slorber added the "closed: can't repro" label and removed the "status: needs triage" label on Apr 15, 2024

John-fg (Author) commented Apr 15, 2024

So how would you configure the most used web server, Apache 2, not to use 301 redirects? Apache adds slashes by default: DirectorySlash On.

The root cause seems to be that sitemap.xml does not contain slashes while Apache requires slashes. A workaround would be to create sitemap.xml with trailing slashes as an option.

Stackblitz obviously does not replicate a real world setup. Do you mean you can't replicate that sitemap.xml does not generate trailing slashes?
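
Whether a generated sitemap.xml contains entries without a trailing slash could be checked with a small script along these lines. This is only a sketch: the sitemap excerpt is hypothetical (a real one would be read from the build output), and locsWithoutTrailingSlash is an illustrative helper, not a Docusaurus API.

```javascript
// Hypothetical sitemap excerpt standing in for the generated sitemap.xml.
const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://mysite.tld/docs/sub1</loc></url>
  <url><loc>https://mysite.tld/docs/sub1/mydoc/</loc></url>
</urlset>`;

// Hypothetical helper: collect every <loc> entry whose path lacks a trailing slash.
function locsWithoutTrailingSlash(xml) {
  const locs = [...xml.matchAll(/<loc>([^<]+)<\/loc>/g)].map((m) => m[1].trim());
  return locs.filter((loc) => !new URL(loc).pathname.endsWith('/'));
}

console.log(locsWithoutTrailingSlash(sitemapXml));
// [ 'https://mysite.tld/docs/sub1' ]
```

An empty result with trailingSlash: true would mean every sitemap entry already ends with a slash.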

slorber (Collaborator) commented Apr 16, 2024

> So how would you configure the most used web server, Apache 2, not to use 301 redirects?

This is not an option we recommend using. I'd suggest using Vercel or Netlify and, if you cannot, GitHub Pages.

If you want to use Apache2, then it's your responsibility to figure out how to configure it to serve a static deployment appropriately. I don't use Apache and I can't advise you on how to configure it, although I'm pretty sure I have already seen people using it successfully.

Docusaurus is only responsible for building a static deployment, not hosting it.

If you think our sitemap has a bug, then provide a repro and show the actual sitemap and the expected sitemap, given a fixed set of options. The expected behavior is that the sitemap contains URLs with or without / depending on the trailingSlash config. The sitemap is expected to target the exact canonical URL of each page, so if pages have / in their canonical URL, the sitemap should also contain a trailing slash.
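
The expected mapping described above can be sketched as follows. The expectedSitemapPath helper is hypothetical and only illustrates the rule (true adds a trailing slash, false removes it, undefined leaves the path untouched); it is not Docusaurus's actual implementation and ignores special cases such as paths ending in a file extension.

```javascript
// Hypothetical helper illustrating the expected sitemap path for a page,
// given the trailingSlash option (true, false, or undefined).
function expectedSitemapPath(pathname, trailingSlash) {
  // Leave the path untouched when the option is unset; the root is always "/".
  if (trailingSlash === undefined || pathname === '/') {
    return pathname;
  }
  const stripped = pathname.replace(/\/+$/, '');
  return trailingSlash ? `${stripped}/` : stripped;
}

console.log(expectedSitemapPath('/docs/sub1', true)); // /docs/sub1/
console.log(expectedSitemapPath('/docs/sub1/', false)); // /docs/sub1
console.log(expectedSitemapPath('/docs/sub1', undefined)); // /docs/sub1
```

A repro would then compare each path in the generated sitemap against this expectation for a fixed trailingSlash value.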
