Access Points
Public access infrastructure for the End of Term 2024 Web Archive. The collection is distributed across multiple platforms to ensure resilience, permanence, and open availability.
The full End of Term 2024 Web Archive—including all partner contributions—is available through the following platforms as a unified aggregate. Each provides complete, unrestricted access to the entire collection.
The complete End of Term Web Archive, spanning presidential transitions from 2008 through 2024, is hosted as a public dataset through the AWS Open Data Sponsorship Program. The S3 bucket (eotarchive) can be read without an AWS account, for example with the AWS CLI.
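As a minimal sketch of anonymous access, the snippet below assembles an AWS CLI listing command for the eotarchive bucket. It assumes the AWS CLI is installed locally; `--no-sign-request` is the CLI option that sends unsigned requests, so no credentials are looked up. The helper name `build_listing_command` and its `prefix` argument are illustrative only, and nothing here assumes a particular layout inside the bucket.

```python
import shlex
from typing import List

BUCKET = "eotarchive"  # bucket name as stated in the text above


def build_listing_command(prefix: str = "") -> List[str]:
    """Return an argv list for an unsigned `aws s3 ls` against the bucket.

    `prefix` is a hypothetical key prefix; the bucket layout is not assumed.
    """
    target = f"s3://{BUCKET}/{prefix}"
    # --no-sign-request skips request signing, so no AWS account is required.
    return ["aws", "s3", "ls", target, "--no-sign-request"]


if __name__ == "__main__":
    # Print the command for copy-paste; to execute it instead, pass the
    # list to subprocess.run(...) on a machine with the AWS CLI installed.
    print(shlex.join(build_listing_command()))
```

Building the command as a list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting concerns.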
The Filecoin Foundation for the Decentralized Web (FFDW) partners with the Internet Archive to preserve End of Term collections on the Filecoin decentralized storage network as part of the Democracy's Library initiative, ensuring tamper-proof, redundant, long-term access to government web archives.
The Internet Archive hosts the authoritative parent collection for the End of Term 2024 Web Archive, including all partner crawls, the Wayback Machine full-text search interface, and bulk data downloads in Parquet, CDX, and WARC formats.
Individual partner organizations contributed specific crawl segments to the broader EOT 2024 effort. Each segment is independently accessible through the contributor's own infrastructure and mirrored on the Internet Archive.
| Contributor | Tags | Description | Collection | Homepage |
|---|---|---|---|---|
| Harvard LIL | HLIL, Data Vault, Research | Harvard's Library Innovation Lab contributed high-fidelity crawls of select federal government websites to the EOT 2024 collection. Their broader Data Vault project also independently preserves federal datasets from Data.gov. | Archive.org | Visit |
| Common Crawl | Broad Crawl, CC, .gov/.mil | The Common Crawl Foundation contributed their broad-spectrum web crawl data, filtered for federal government domains, to the EOT 2024 collection. Their infrastructure enables large-scale capture of .gov and .mil content. | Archive.org | Visit |
| Webrecorder | Browsertrix, High-Fidelity, WACZ | Webrecorder joined the EOT 2024 effort as a new partner, using their Browsertrix platform to perform high-fidelity, browser-based captures of complex interactive government websites that traditional crawlers often miss. Their collections are also browsable at GovArchive.us. | Archive.org | Visit |
| UNT | Founding Partner, URL Nominations, Targeted Crawls | UNT Libraries is a founding partner of the End of Term Web Archive, active since 2008. For EOT 2024, UNT developed and hosted the URL Nomination Tool and contributed targeted crawls of priority federal content. | Archive.org | Visit |
| Archive Team | Volunteer, Warrior, VOA | Archive Team, a distributed volunteer collective, contributed massive crawls of federal government content using their Warrior infrastructure. Their EOT 2024 contributions include dedicated Voice of America, DVIDS (Defense Visual Information), and broad US Government crawls. | Archive.org | Visit |
Data Licensing & Attribution
All End of Term Web Archive data is released under Creative Commons Zero (CC0). There are no restrictions on use, access, or redistribution. The project asks that users cite the End of Term Web Archive and its contributing partners when using this data in research or publications.