chore(build): remove the redirection of legacy images that start with /@api/deki/ #11228

Merged (1 commit) on Jun 5, 2024

@yin1999 yin1999 commented May 31, 2024

Summary

The usage of legacy images that start with /@api/deki/ has been removed from content and translated-content (this can be checked by searching for the string /@api/deki in those two repos), so the redirection can now be safely removed.

Related issue: mdn/translated-content#10487
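
The check described above amounts to a plain substring search over the Markdown sources. A minimal sketch, assuming the file contents have already been read into a map from path to text; `findLegacyImageRefs` and the sample documents are hypothetical and not part of yari:

```typescript
const LEGACY_PREFIX = "/@api/deki/";

// Return the paths of all documents whose text still mentions the legacy prefix.
function findLegacyImageRefs(docs: Map<string, string>): string[] {
  const hits: string[] = [];
  for (const [filePath, text] of docs) {
    if (text.includes(LEGACY_PREFIX)) {
      hits.push(filePath);
    }
  }
  return hits;
}

// Hypothetical sample data, for illustration only.
const sampleDocs = new Map<string, string>([
  ["a/index.md", "![old](/@api/deki/files/1/=old.png)"],
  ["b/index.md", "![new](new.png)"],
]);

console.log(findLegacyImageRefs(sampleDocs)); // ["a/index.md"]
```

An empty result for both repos is what makes removing the redirection safe.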

The replacement of the = prefix in image file names has also been removed. I wrote a script to find all image references that contain an equal sign:

search.ts
import { fromMarkdown } from "mdast-util-from-markdown";
import { visit } from "unist-util-visit";
import fs from "node:fs/promises";
import * as path from "node:path";
import { fdir } from "fdir";
import ora from "ora";
import yargs from "yargs";
import { hideBin } from "yargs/helpers";

const ESCAPE_CHARS_RE = /[.*+?^${}()|[\]\\]/g;

function findMatchesInText(
  haystack: string,
  { attribute = null }: { attribute?: string | null } = {}
): Set<string> {
  // The value pattern is already a regex fragment, so it must not be escaped.
  // Only the attribute name is escaped, because it is interpolated verbatim
  // into a manually constructed regex.
  const valuePattern = "[^'\"]+";
  const rex = attribute
    ? new RegExp(
        `${attribute.replace(ESCAPE_CHARS_RE, "\\$&")}=['"](${valuePattern})['"]`,
        "g"
      )
    : new RegExp(`(${valuePattern})`, "g");
  const matches = new Set<string>();
  for (const match of haystack.matchAll(rex)) {
    // Test only the captured value: the full match includes `attribute=`,
    // which always contains an equal sign.
    if (match[1].includes("=")) {
      matches.add(match[1]);
    }
  }
  return matches;
}

export function findImagesInMarkdown(rawContent: string): Set<string> {
  const matches = new Set<string>();
  const type = "image";
  const attributeType = "src";
  const tree = fromMarkdown(rawContent);
  // Find all the images in the markdown, including any raw HTML nodes
  // that may contain <img> elements.
  visit(tree, [type, "html"], (node) => {
    if (node.type === "html") {
      const matchesInHtml = findMatchesInText(node.value, {
        attribute: attributeType,
      });
      for (const match of matchesInHtml) {
        matches.add(match);
      }
    } else if (node.type === type && node.url.includes("=")) {
      // Otherwise this is a markdown image.
      matches.add(node.url);
    }
  });
  return matches;
}

const spinner = ora().start();

async function main() {
  const { argv } = yargs(hideBin(process.argv)).command(
    "$0 [files..]",
    "Find image references that contain an equal sign in the given files",
    (yargs) => {
      yargs.positional("files", {
        describe:
          "The files to check (relative to the current working directory)",
        type: "string",
        array: true,
        default: ["../content/files/", "../translated-content/files/"],
      });
    }
  );

  const files: string[] = [];

  spinner.text = "Crawling files...";

  for (const fp of (argv as any).files as string[]) {
    const fstats = await fs.stat(fp);

    if (fstats.isDirectory()) {
      files.push(
        ...new fdir()
          .withBasePath()
          // Named `filePath` to avoid shadowing the imported `path` module.
          .filter((filePath) => filePath.endsWith(".md"))
          .crawl(fp)
          .sync()
      );
    } else if (fstats.isFile()) {
      files.push(fp);
    }
  }

  const results = new Map<string, Set<string>>();

  for (const [i, file] of files.entries()) {
    // Report 1-based progress while the spinner keeps running.
    spinner.text = `${i + 1}/${files.length}: ${file}...`;

    const relativePath = path.relative(process.cwd(), file);

    const originContent = await fs.readFile(relativePath, "utf8");
    const images = findImagesInMarkdown(originContent);
    if (images.size > 0) {
      results.set(relativePath, images);
    }
  }

  spinner.stop();

  for (const [file, images] of results) {
    console.log(`\n${file}`);
    for (const image of images) {
      console.log(`  ${image}`);
    }
  }
}

await main();

The result shows:

../translated-content/files/es/web/api/web_components/index.md
  https://pbs.twimg.com/media/EOW1l5dVAAADJuF?format=jpg&name=large

../translated-content/files/es/web/api/window/open/index.md
  menusystemcommands.png?size=webview

../translated-content/files/es/web/css/transform-function/index.md
  transform_functions_generic_transformation_cart.png?size=webview
  transform_functions_transform_composition_cart.png?size=webview

../translated-content/files/es/web/html/content_categories/index.md
  content_categories_venn.png?size=webview

../translated-content/files/es/web/html/element/input/index.md
  mozactionhint.png?size=webview

../translated-content/files/pt-br/conflicting/web/javascript/inheritance_and_the_prototype_chain/index.md
  =figure8.3.png

../translated-content/files/ru/learn/javascript/building_blocks/image_gallery/index.md
  https://github.com/ConstantineZz/javaScript/blob/master/gallery.png?raw=true

../translated-content/files/zh-tw/web/api/battery_status_api/index.md
  http://x.co/qr/batstat?s=165

Only =figure8.3.png in files/pt-br/conflicting/web/javascript/inheritance_and_the_prototype_chain/index.md has the = prefix, and the file stored in the repo is itself named with the = prefix, so the replacement is unnecessary and can safely be removed.
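
The conclusion above can be double-checked mechanically. A minimal sketch, assuming the references reported by the script are copied into an array; `hasEqualsPrefix` is a hypothetical helper and not part of yari:

```typescript
// Image references reported by the script (copied from the result above).
const reported: string[] = [
  "https://pbs.twimg.com/media/EOW1l5dVAAADJuF?format=jpg&name=large",
  "menusystemcommands.png?size=webview",
  "transform_functions_generic_transformation_cart.png?size=webview",
  "transform_functions_transform_composition_cart.png?size=webview",
  "content_categories_venn.png?size=webview",
  "mozactionhint.png?size=webview",
  "=figure8.3.png",
  "https://github.com/ConstantineZz/javaScript/blob/master/gallery.png?raw=true",
  "http://x.co/qr/batstat?s=165",
];

// True when the file name itself (ignoring any query string) starts with "=".
function hasEqualsPrefix(ref: string): boolean {
  const queryStart = ref.indexOf("?");
  const pathPart = queryStart === -1 ? ref : ref.slice(0, queryStart);
  const fileName = pathPart.slice(pathPart.lastIndexOf("/") + 1);
  return fileName.startsWith("=");
}

const prefixed = reported.filter(hasEqualsPrefix);
console.log(prefixed); // ["=figure8.3.png"]
```

Every other reference contains the equal sign only in its query string, so it was never affected by the prefix replacement.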


How did you test this change?

No additional tests were run.

@yin1999 yin1999 requested a review from a team as a code owner May 31, 2024 13:58
@github-actions github-actions bot added the python Pull requests that update Python code label May 31, 2024
Comment on lines +95 to 97
const absoluteURL = /^\/files\/\d+/.test(src)
  ? new URL(`https://mdn.mozillademos.org${src}`)
  : new URL(src, baseURL);
Contributor

I don't think we need to support the other case either, as mdn.mozillademos.org has been decommissioned. Or do we still have links to /files?

Suggested change
- const absoluteURL = /^\/files\/\d+/.test(src)
-   ? new URL(`https://mdn.mozillademos.org${src}`)
-   : new URL(src, baseURL);
+ const absoluteURL = new URL(src, baseURL);
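
For reference, this is how the simplified form would resolve different kinds of src values. A minimal sketch using the standard URL constructor; the `baseURL` value, the `resolveImage` wrapper, and the /files/123/example.png path are assumptions for illustration, not values taken from yari:

```typescript
// Hypothetical base URL of the page whose images are being resolved.
const baseURL = "https://developer.mozilla.org/en-US/docs/Web/API/Window/open/";

// Resolve an image src against the page URL, as the suggested change does.
function resolveImage(src: string): string {
  return new URL(src, baseURL).href;
}

// A relative file name resolves next to the page.
console.log(resolveImage("menusystemcommands.png"));
// → https://developer.mozilla.org/en-US/docs/Web/API/Window/open/menusystemcommands.png

// A root-relative /files/<id>/ path now resolves against the page's own
// origin instead of mdn.mozillademos.org.
console.log(resolveImage("/files/123/example.png"));
// → https://developer.mozilla.org/files/123/example.png

// An absolute URL ignores the base entirely.
console.log(resolveImage("http://x.co/qr/batstat?s=165"));
// → http://x.co/qr/batstat?s=165
```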

Member Author

Contributor

@fiji-flo Will rari support this? (I guess not, which means we should migrate those in translated-content.)

@caugner caugner merged commit 8ec146e into mdn:main Jun 5, 2024
8 checks passed
@yin1999 yin1999 deleted the legacy-images branch June 5, 2024 12:17