Archiving Instagram posts

Stanford Libraries’ Web Archiving Program uses Archive-It as its preferred solution for curating and archiving topical web archive collections. It is the best of the available options for 1) data capture efficacy and 2) support for our curatorial workflow. Websites crawled with Archive-It can be accessioned to the Stanford Digital Repository (SDR), made available through the Stanford Web Archiving Portal (SWAP), and made discoverable in SearchWorks using a dedicated web archiving workflow developed by the Libraries over the past few years (see https://consul.stanford.edu/display/WARC/Web+Archiving).
However, there are some sites which cannot be archived properly by Archive-It, including Instagram, Facebook, and sites created using the popular Wix service. While some of these sites can be captured in high fidelity using other services such as Webrecorder and Archive Web.Page, our current web archiving workflow cannot handle files created using those services.
A recent collaboration among Stanford University Press, Webrecorder, and DLSS has enabled websites captured using the Webrecorder toolset to be stored in the SDR and made available via the Archive Web.Page interface. This new capability enables us to create a new workflow that provides access to archived Instagram sites. The main differences between our existing web archiving workflow (using Archive-It) and the newly created workflow are as follows:
| | Existing workflow | New workflow |
| --- | --- | --- |
| Web archive tool | Archive-It (subscription) | Archive Web.Page (free) |
| Social media, Wix sites | Not properly archived | Archived in high fidelity |
| Curation support | Extensive | Minimal |
| Archiving process | Mostly automated | Mostly manual |
| Thumbnails in SearchWorks | Generated by system | Created manually |
| Accession to SDR | Tailor made for web archive | Generic process for image and file |
| Viewing environment | SWAP (Stanford Web Archiving Portal) | Archive Web.Page interface |
Here is an example of Instagram posts archived using the new workflow: https://searchworks.stanford.edu/catalog?f%5Bcollection%5D%5B%5D=jz413tt7854
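The example link is a SearchWorks catalog search filtered to a single collection, with the square brackets in the filter name percent-encoded. For readers who want to build or read such links, here is a minimal sketch using only the Python standard library. The function names are hypothetical, and the parameter name `f[collection][]` is simply what the example URL decodes to, not a documented SearchWorks API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base catalog address taken from the example link above.
SEARCHWORKS_CATALOG = "https://searchworks.stanford.edu/catalog"

# Filter name as it appears (percent-decoded) in the example link.
# Assumption: other collections are linked the same way.
COLLECTION_PARAM = "f[collection][]"


def collection_url(collection_id):
    """Build a catalog URL filtered to one collection (hypothetical helper)."""
    # urlencode percent-encodes the brackets: [ -> %5B, ] -> %5D
    query = urlencode({COLLECTION_PARAM: collection_id})
    return f"{SEARCHWORKS_CATALOG}?{query}"


def collection_ids(url):
    """Return the collection identifiers found in a catalog URL."""
    # parse_qs decodes the percent-encoded parameter names and values.
    return parse_qs(urlsplit(url).query).get(COLLECTION_PARAM, [])


if __name__ == "__main__":
    url = collection_url("jz413tt7854")
    print(url)
    print(collection_ids(url))
```

Calling `collection_url("jz413tt7854")` reproduces the example link exactly, and `collection_ids` recovers the identifier from it.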
Ideally, Archive-It would enhance its system to properly archive Instagram, Facebook, and sites created using the Wix service, and our existing web archiving workflow would also support the files created for those sites. Until then, we can use the new workflow to archive these sites, which are critical to our collections.
I would like to thank Ilya Kreymer (Webrecorder), Jasmine Mulliken (Stanford University Press), Andrew Berger (DLSS), Josh Schneider (University Archives) and Jessica Cebra (Metadata) for making the new workflow possible.