Python scripts to generate static navigation pages from collection list and insert Web Archives records using the Archive-It CDX
Parse CDXJ (https://github.com/oduwsdl/ORS/wiki/CDXJ) files with Node.js
ArchiveSpark DataSpec to analyze the Internet Archive's Web archive through temporal search results returned by Tempas (v2)
Data for testing the Offtopic detection software
This module builds our Waybacks in the various configurations we require.
宁波凯思奥教育科技有限公司
A collection of the scripts and notebooks I wrote as part of my Data Science Bootcamp capstone project
A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (mementos).
A Python utility for publishing a social media story built from archived web pages to multiple services.
Create Robust Links from within Zotero
Add-On for Google Sheets to help those working with web archives.
Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archives Unleashed Toolkit.
Parse and create Web ARChive (WARC) files with Node.js
Send records from an EPrints server to the Internet Archive and other web archives
Process web archives (WARC format) with StormCrawler and index content into Elasticsearch or Solr
WARC tools for joining files, finding and fetching missing resources, accessing metadata, converting to ZIM, and viewing web archives offline