Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wombat breaks native JavaScript modules #116

Closed
Rich-Harris opened this issue Mar 28, 2023 · 3 comments
Closed

Wombat breaks native JavaScript modules #116

Rich-Harris opened this issue Mar 28, 2023 · 3 comments

Comments

@Rich-Harris
Copy link

I was looking at archived copies of https://kit.svelte.dev on the Wayback Machine (example), and discovered that the JavaScript errors when it loads:

Uncaught (in promise) SyntaxError: import declarations may only appear at top level of a module

This is happening because the file in question is a JavaScript module (i.e. it uses import/export syntax):

var _____WB$wombat$assign$function_____ = function(name) {return (self._wb_wombat && self._wb_wombat.local_init && self._wb_wombat.local_init(name)) || self[name]; };
if (!self.__WB_pmw) { self.__WB_pmw = function(obj) { this.__WB_source = obj; return this; } }
{
  let window = _____WB$wombat$assign$function_____("window");
  let self = _____WB$wombat$assign$function_____("self");
  let document = _____WB$wombat$assign$function_____("document");
  let location = _____WB$wombat$assign$function_____("location");
  let top = _____WB$wombat$assign$function_____("top");
  let parent = _____WB$wombat$assign$function_____("parent");
  let frames = _____WB$wombat$assign$function_____("frames");
  let opener = _____WB$wombat$assign$function_____("opener");

import{o as Ce,t as se}from"../chunks/index.1cc2e0ff.js"; // rest of the code omitted


}
/*
     FILE ARCHIVED ON 16:57:32 Mar 27, 2023 AND RETRIEVED FROM THE
     INTERNET ARCHIVE ON 00:11:11 Mar 28, 2023.
     JAVASCRIPT APPENDED BY WAYBACK MACHINE, COPYRIGHT INTERNET ARCHIVE.

     ALL OTHER CONTENT MAY ALSO BE PROTECTED BY COPYRIGHT (17 U.S.C.
     SECTION 108(a)(3)).
*/
/*
playback timings (ms):
  captures_list: 89.103
  exclusion.robots: 0.064
  exclusion.robots.policy: 0.055
  RedisCDXSource: 5.547
  esindex: 0.007
  LoadShardBlock: 67.337 (3)
  PetaboxLoader3.datanode: 85.078 (4)
  load_resource: 84.5
  PetaboxLoader3.resolve: 47.163
*/

As the error message says, import and export can only appear at the top level of a module, not inside a function body. Unfortunately this means that Wombat's transformation breaks every site that uses native modules. Today, these sites are in the minority, but since modules are supported in all modern browsers and many tools emit them automatically, the number is growing rapidly.

Afraid I don't have a great suggestion for a solution handy! Perhaps something like this, if import/export is detected in the code...

import { window, self, etc... } from 'data:application/javascript;base64,xyz123';

...where xyz123 is a base64-encoded module containing the relevant overrides?

@ikreymer
Copy link
Member

@Rich-Harris Thanks for making a note of this. We (Webrecorder project) have actually implemented module detection + different rewriting path in some of our tools (see: webrecorder/wabac.js#101, webrecorder/pywb#810) which does something similar to what you suggest.

For example, oldweb.today use our system for the full stack https://oldweb.today/?browser=ruffle#20230327165731/https://kit.svelte.dev/ and https://express.archiveweb.page/#https://kit.svelte.dev/
would attempt to determine if the JS in a module and apply the correct rewriting.
In testing these, I did notice another error (from incorrect assumption that dynamic import implies the code is in a module, marking the inline script as a module), but that's something that should be fixable.

The IA wayback machine only uses wombat.js from Webrecorder, the rest of their stack is their own proprietary implementation, so they would need to implement the module detection on their end, but it is possible.

As long as Svelte doesn't use implicit cross-file globals (see #82), things should generally work. In general, don't think we've seen any replaying Svelte based sites :)

ikreymer added a commit that referenced this issue Mar 28, 2023
…b_rewrite_import__(base, url)

where base is either empty or the 'import.meta.url' (if used in a module)
if base is provided, resolve url with base to ensure relative urls resolve to correct module-relative url
bump to 3.5.0
part of fix for issue raised in #116
@Rich-Harris
Copy link
Author

Good to know, thank you! I contacted IA to let them know about this issue, hopefully they'll be able to implement the module detection.

ikreymer added a commit that referenced this issue Mar 28, 2023
#117)

* import relative rewrite fix: change __wb_rewrite_import__(url) -> __wb_rewrite_import__(base, url)
where base is either empty or the 'import.meta.url' (if used in a module)
if base is provided, resolve url with base to ensure relative urls resolve to correct module-relative url
bump to 3.5.0
part of fix for issue raised in #116
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Mar 28, 2023
- support dynamic 'import()' in module or non-module with different base url (presence of import doesn't imply script is in module)
- simplify isModule check logic
- if script not in module, rewrite dynamic import to '__wb_rewrite_import_('', url), no base resolution
- if script is in module, rewrite dynamic import to '__wb_rewrite_import_(import.mata.url, ...)' to pass curr script url as base url
- remove duplicate rewrite call!
bump to 2.15.4
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Mar 28, 2023
…cripts (#112)

* module rewrite fix: (fix for issue found in webrecorder/wombat#116)
- support dynamic 'import()' in module or non-module with different base url (presence of import doesn't imply script is in module)
- simplify isModule check logic
- if script not in module, rewrite dynamic import to '__wb_rewrite_import_('', url), no base resolution
- if script is in module, rewrite dynamic import to '__wb_rewrite_import_(import.mata.url, ...)' to pass curr script url as base url
- remove duplicate rewrite call!
bump to 2.15.4
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Mar 28, 2023
…cripts (#112)

* module rewrite fix: (fix for issue found in webrecorder/wombat#116)
- support dynamic 'import()' in module or non-module with different base url (presence of import doesn't imply script is in module)
- simplify isModule check logic
- if script not in module, rewrite dynamic import to '__wb_rewrite_import_('', url), no base resolution
- if script is in module, rewrite dynamic import to '__wb_rewrite_import_(import.meta.url, ...)' to pass curr script url as base url
- remove duplicate rewrite call!
bump to 2.15.4
@ikreymer
Copy link
Member

ikreymer commented Apr 1, 2023

In testing these, I did notice another error (from incorrect assumption that dynamic import implies the code is in a module, marking the inline script as a module), but that's something that should be fixable.

This is now fixed on our end, see this archive
and oldweb.today live version. The video can replay if it is archived and the module rewriting is applied

@ikreymer ikreymer closed this as completed Apr 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
@ikreymer @Rich-Harris and others