Changelog
Nick Sweeting edited this page Nov 14, 2023 · 28 revisions
ArchiveBox was previously named Pocket Archive Stream and then Bookmark Archiver.
THIS PAGE HAS BEEN MOVED:
See the releases page for versioned source downloads and full changelog.
Many thanks to our 100+ contributors and everyone in the web archiving community!
Old release notes:
- v0.4.9 released
- pip install archivebox (https://pypi.org/project/archivebox/)
- docker run archivebox/archivebox (https://hub.docker.com/r/archivebox/archivebox)
- https://archivebox.readthedocs.io/en/latest/
- https://github.com/ArchiveBox/ArchiveBox/releases
- easy migration from previous versions
cd path/to/your/archive/folder
archivebox init
archivebox add 'https://example.com'
archivebox add 'https://getpocket.com/users/USERNAME/feed/all' --depth=1
- full transition to Django Sqlite DB with migrations (making upgrades between versions much safer now)
- maintains an intuitive and helpful CLI that's backwards-compatible with all previous archivebox data versions
- uses argparse instead of hand-written CLI system: see archivebox/cli/archivebox.py
- new subcommands-based CLI for archivebox (see below)
- new Web UI with pagination, better search, filtering, permissions, and more
- 30+ assorted bugfixes, new features, and tickets closed
- for more info, see: https://github.com/ArchiveBox/ArchiveBox/releases/tag/v0.4.9
- v0.2.4 released
- better archive corruption guards (check structure invariants on every parse & save)
- remove title prefetching in favor of new FETCH_TITLE archive method
- slightly improved CLI output for parsing and remote url downloading
- re-save index after archiving completes to update titles and urls
- remove redundant derivable data from link json schema
- markdown link parsing support
- faster link parsing and better symbol handling using a new compiled URL_REGEX
- v0.2.3 released
- fixed issues with parsing titles including trailing tags
- fixed issues with titles defaulting to URLs instead of attempting to fetch
- fixed issue where bookmark timestamps from RSS would be ignored and current ts used instead
- fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones
- fixed lots of issues with URL parsing by using urllib.parse instead of hand-written lambdas
- ignore robots.txt when using wget (ssshhh don't tell anyone)
- fix RSS parser bailing out when there's whitespace around XML tags
- fix issue with browser history export trying to run ls on wrong directory
- v0.2.2 released
- Shaarli RSS export support
- Fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs
- add USER_AGENT to archive.org submissions so they can track archivebox usage
- remove all icons similar to archive.org branding from archive UI
- hide some of the noisier youtubedl and wget errors
- set permissions on youtubedl media folder
- fix chrome data dir incorrect path and quoting
- better chrome binary finding
- show which parser is used when importing links, show progress when fetching titles
- v0.2.1 released with new logo
- ability to import plain lists of links and almost all other raw filetypes
- WARC saving support via wget
- Git repository downloading with git clone
- Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)
- v0.2.0 released with new name
- renamed from Bookmark Archiver -> ArchiveBox
- v0.1.0 released
- support for browser history exporting added with ./bin/archivebox-export-browser-history
- support for chrome --dump-dom to output full page HTML after JS executes
- v0.0.3 released
- support for chrome --user-data-dir to archive sites that need logins
- fancy individual html & json indexes for each link
- smartly append new links to existing index instead of overwriting
- v0.0.2 released
- proper HTML templating instead of format strings (thanks to https://github.com/bardisty!)
- refactored into separate files, wip audio & video archiving
- v0.0.1 released
- Index links now work without nginx url rewrites, archive can now be hosted on github pages
- added setup.sh script & docstrings & help commands
- made Chromium the default instead of Google Chrome (yay free software)
- added env-variable configuration (thanks to https://github.com/hannah98!)
- renamed from Pocket Archive Stream -> Bookmark Archiver
- added Netscape-format export support (thanks to https://github.com/ilvar!)
- added Pinboard-format export support (thanks to https://github.com/sconeyard!)
- front-page of HN, oops! apparently I have users to support now?
- added Pocket-format export support
- v0.0.0 released: created Pocket Archive Stream 2017/05/05
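
The v0.2.3 notes above mention replacing hand-written lambdas with urllib.parse for URL parsing. A minimal sketch of that idea, using only the Python standard library; the helper name split_url and the keys it returns are hypothetical and are not taken from the ArchiveBox source:

```python
from urllib.parse import urlparse


def split_url(url: str) -> dict:
    """Break a URL into named components (hypothetical helper, for illustration only)."""
    parts = urlparse(url)  # standard-library parser; handles scheme, host, path, query, fragment
    return {
        "scheme": parts.scheme,
        "domain": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


info = split_url("https://example.com/a/b.html?x=1&y=2#top")
print(info["domain"], info["path"])
```

Relying on the standard parser rather than ad hoc string splitting means edge cases such as query strings and fragments are handled in one place.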