Docs for shot-scraper-har
simonw committed Feb 13, 2025
1 parent 26a8b19 commit d5f2f01
Showing 2 changed files with 77 additions and 0 deletions.
76 changes: 76 additions & 0 deletions docs/har.md
@@ -0,0 +1,76 @@
# Saving a web page to an HTTP Archive

An HTTP Archive (HAR) file captures the full details of a series of HTTP requests and responses as JSON.

The `shot-scraper har` command can save a `*.har.zip` file that contains both that JSON data and the content of any assets that were loaded by the page.
```bash
shot-scraper har https://datasette.io/
```
This will save to `datasette-io.har.zip`. You can use `-o` to specify a filename:
```bash
shot-scraper har https://datasette.io/tutorials/learn-sql \
-o learn-sql.har.zip
```
You can list the contents of the resulting `.har.zip` archive using `unzip -l`:
```bash
unzip -l datasette-io.har.zip
```
```
Archive:  datasette-io.har.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
    39067  02-13-2025 10:33   41824dbd0c51f584faf0e2c4e88de01b8a5dcdcd.html
     4052  02-13-2025 10:33   34972651f161f0396c697c65ef9aaeb2c9ac50c4.css
     2501  02-13-2025 10:33   9f612e71165058f0046d8bf8fec12af7eb15f39d.css
    10916  02-13-2025 10:33   2737174596eafba6e249022203c324605f023cdd.svg
     5557  02-13-2025 10:33   427504aa6ef5a8786f90fb2de636133b3fc6d1fe.js
     1393  02-13-2025 10:33   25c68a82b654c9d844c604565dab4785161ef697.js
     1170  02-13-2025 10:33   31c073551ef5c84324073edfc7b118f81ce9a7d2.svg
     1158  02-13-2025 10:33   1e0c64af7e6a4712f5e7d1917d9555bbc3d01f7a.svg
     1161  02-13-2025 10:33   ec8282b36a166d63fae4c04166bb81f945660435.svg
     3373  02-13-2025 10:33   5f85a11ef89c0e3f237c8e926c1cb66727182102.svg
     1134  02-13-2025 10:33   3b9d8109b919dfe9393dab2376fe03267dadc1f1.svg
    31670  02-13-2025 10:33   469f0d28af6c026dcae8c81731e2b0484aeac92c.jpeg
     1157  02-13-2025 10:33   b7786336bfce38a9677d26dc9ef468bb1ed45e19.svg
    50494  02-13-2025 10:33   har.har
---------                     -------
   154803                     14 files
```
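
The archive can also be inspected programmatically. The following is a minimal Python sketch, not part of `shot-scraper` itself: the helper name `list_har_requests` is hypothetical, it assumes the JSON file inside the zip is called `har.har` (as in the listing above) and follows the standard HAR layout of `log.entries`, and it assumes that each response's `content` object may carry a `_file` key naming the attached asset. That last key is an assumption about how the recorder links entries to the hashed filenames, so the code treats it as optional.

```python
import json
import zipfile


def list_har_requests(path, har_name="har.har"):
    """Return (method, status, url, attached_file) tuples from a .har.zip file.

    har_name defaults to "har.har", the JSON file shown in the unzip listing.
    """
    with zipfile.ZipFile(path) as archive:
        har = json.loads(archive.read(har_name))

    rows = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        response = entry["response"]
        # Assumption: "_file" (if present) names the asset stored in the zip.
        # It is read with .get() so entries without it yield None.
        attached = response.get("content", {}).get("_file")
        rows.append((request["method"], response["status"], request["url"], attached))
    return rows


if __name__ == "__main__":
    for method, status, url, attached in list_har_requests("datasette-io.har.zip"):
        print(method, status, url, attached or "-")
```

Run as a script next to `datasette-io.har.zip`, this prints one line per recorded request; entries with no attached file show `-` in the last column.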

## `shot-scraper har --help`

Full `--help` for this command:

<!-- [[[cog
import cog
from shot_scraper import cli
from click.testing import CliRunner
runner = CliRunner()
result = runner.invoke(cli.cli, ["har", "--help"])
help = result.output.replace("Usage: cli", "Usage: shot-scraper")
cog.out(
    "```\n{}\n```\n".format(help.strip())
)
]]] -->
```
Usage: shot-scraper har [OPTIONS] URL

  Record a HAR file for the specified page

  Usage:

      shot-scraper har https://datasette.io/

Options:
  -a, --auth FILENAME   Path to JSON authentication context file
  -o, --output FILE     HAR filename
  --timeout INTEGER     Wait this many milliseconds before failing
  --log-console         Write console.log() to stderr
  --fail                Fail with an error code if a page returns an HTTP error
  --skip                Skip pages that return HTTP errors
  --bypass-csp          Bypass Content-Security-Policy
  --auth-password TEXT  Password for HTTP Basic authentication
  --auth-username TEXT  Username for HTTP Basic authentication
  --help                Show this message and exit.
```
<!-- [[[end]]] -->
1 change: 1 addition & 0 deletions docs/index.md
@@ -26,6 +26,7 @@ multi
javascript
pdf
html
har
accessibility
github-actions
contributing