
Access Points

Public access infrastructure for the End of Term 2024 Web Archive. The collection is distributed across multiple platforms to ensure resilience, permanence, and open availability.

The full End of Term 2024 Web Archive, including all partner contributions, is available as a single aggregate collection on each of the following platforms. Each provides complete, unrestricted access to the entire collection.

Amazon Web Services Open Data Program

The complete End of Term Web Archive, spanning presidential transitions from 2008 through 2024, is hosted as a public dataset through the AWS Open Data Sponsorship Program. The S3 bucket (eotarchive) can be read without an AWS account, for example with the AWS CLI and its --no-sign-request option.

S3 | WARC | Bulk Download | CC0
$ aws s3 ls --no-sign-request s3://eotarchive/
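
The listing above can also be requested without the AWS CLI. The following is a minimal Python sketch, not an official client: it assumes the bucket answers anonymous ListObjectsV2 requests on the generic https://eotarchive.s3.amazonaws.com/ endpoint, and the function names (listing_url, parse_listing, list_bucket) are made up for illustration. It reads only the first page of results and does not follow continuation tokens.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import List, Tuple, Union

BUCKET = "eotarchive"  # bucket name given in the text above
# Assumption: the bucket is reachable on the generic virtual-hosted S3 endpoint.
ENDPOINT = f"https://{BUCKET}.s3.amazonaws.com/"
# XML namespace used by S3 list responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def listing_url(prefix: str = "", delimiter: str = "/") -> str:
    """Build an unsigned ListObjectsV2 URL for one 'directory' level."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "delimiter": delimiter}
    )
    return f"{ENDPOINT}?{query}"


def parse_listing(xml_data: Union[str, bytes]) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Return (sub-prefixes, [(key, size_in_bytes), ...]) from a list response."""
    root = ET.fromstring(xml_data)
    prefixes = [
        el.findtext(f"{S3_NS}Prefix")
        for el in root.findall(f"{S3_NS}CommonPrefixes")
    ]
    objects = [
        (el.findtext(f"{S3_NS}Key"), int(el.findtext(f"{S3_NS}Size")))
        for el in root.findall(f"{S3_NS}Contents")
    ]
    return prefixes, objects


def list_bucket(prefix: str = "") -> Tuple[List[str], List[Tuple[str, int]]]:
    """Fetch and parse the first page of results (network access required)."""
    with urllib.request.urlopen(listing_url(prefix), timeout=30) as resp:
        return parse_listing(resp.read())
```

Calling list_bucket() with no arguments should correspond to the top-level listing printed by the CLI command; passing one of the returned prefixes descends one level. For bulk transfers the AWS CLI remains the more practical tool.
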
Filecoin Foundation — Democracy's Library

The Filecoin Foundation for the Decentralized Web (FFDW) partners with the Internet Archive to preserve End of Term collections on the Filecoin decentralized storage network as part of the Democracy's Library initiative. The goal is tamper-proof, redundant, long-term access to government web archives.

Decentralized | IPFS | Filecoin | Redundant

Internet Archive — archive.org

The Internet Archive hosts the authoritative parent collection for the End of Term 2024 Web Archive. This includes all partner crawls, the Wayback Machine full-text search interface, and bulk download of the data in Parquet, CDX, and WARC formats.

Wayback Machine | WARC | CDX | Parquet | Full-Text Search
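
For readers who have not worked with WARC files before, the sketch below shows roughly what a downloaded file contains. It is a simplified, standard-library-only reader written for illustration, not the tooling used by the project. It assumes the usual WARC record layout (a WARC/1.x version line, header fields, a blank line, then Content-Length bytes of payload) and per-record gzip compression. The function names and any file path passed to them are hypothetical. For real analysis a maintained WARC library is the safer choice.

```python
import gzip
from typing import BinaryIO, Dict, Iterator, Tuple


def iter_warc_records(stream: BinaryIO) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, block) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank lines separate consecutive records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers: Dict[str, str] = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the header section
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()
        # The block is exactly Content-Length bytes long.
        block = stream.read(int(headers["Content-Length"]))
        yield headers, block


def summarize(path: str, limit: int = 5) -> None:
    """Print type, target URI, and payload size of the first few records."""
    # gzip.open reads concatenated gzip members as one continuous stream.
    with gzip.open(path, "rb") as stream:
        for i, (headers, block) in enumerate(iter_warc_records(stream)):
            if i >= limit:
                break
            print(headers.get("WARC-Type"), headers.get("WARC-Target-URI"), len(block))
```

Running summarize on a locally downloaded .warc.gz file would print one line per record. Response records carry the captured HTTP response, status line and headers included, as their block.
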

Individual partner organizations contributed specific crawl segments to the broader EOT 2024 effort. Each segment is independently accessible through the contributor's own infrastructure and mirrored on the Internet Archive.

Contributor: Harvard LIL
Tags: HLIL | Data Vault | Research
Description: Harvard's Library Innovation Lab contributed high-fidelity crawls of select federal government websites to the EOT 2024 collection. Their broader Data Vault project also independently preserves federal datasets from Data.gov.
Collection: Archive.org

Contributor: Common Crawl
Tags: Broad Crawl | CC | .gov/.mil
Description: The Common Crawl Foundation contributed their broad-spectrum web crawl data, filtered for federal government domains, to the EOT 2024 collection. Their infrastructure enables large-scale capture of .gov and .mil content.
Collection: Archive.org

Contributor: Webrecorder
Tags: Browsertrix | High-Fidelity | WACZ
Description: Webrecorder joined the EOT 2024 effort as a new partner, using their Browsertrix platform to perform high-fidelity, browser-based captures of complex interactive government websites that traditional crawlers often miss. Their collections are also browsable at GovArchive.us.
Collection: Archive.org

Contributor: UNT
Tags: Founding Partner | URL Nominations | Targeted Crawls
Description: UNT Libraries is a founding partner of the End of Term Web Archive, active since 2008. For EOT 2024, UNT developed and hosted the URL Nomination Tool and contributed targeted crawls of priority federal content.
Collection: Archive.org

Contributor: Archive Team
Tags: Volunteer | Warrior | VOA
Description: Archive Team, a distributed volunteer collective, contributed massive crawls of federal government content using their Warrior infrastructure. Their EOT 2024 contributions include dedicated Voice of America, DVIDS (Defense Visual Information), and broad US Government crawls.
Collection: Archive.org

Data Licensing & Attribution

All End of Term Web Archive data is released under Creative Commons Zero (CC0). There are no restrictions on use, access, or redistribution. The project asks users to cite the End of Term Web Archive and its contributing partners when using this data in research or publications.
