I'm fiddling around with the Altmetric API (since we have access at $work). I sampled 12,109 DOIs (not at random) from our database of publications by Stanford authors; 8,797 (73%) have Altmetric data. Altmetric tracks different categories of citations: Facebook, blogs, Twitter, Bluesky, news, etc.
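A minimal sketch of how a coverage number like the one above could be tallied, assuming each DOI has already been looked up and stored as either a dict of Altmetric details or `None` when Altmetric has nothing. The `cited_by_<source>_count` key pattern, the `summarize` helper, and the sample records are assumptions for illustration, not output from our database.

```python
from collections import Counter


def summarize(results):
    """Tally coverage and per-category mention counts.

    `results` maps DOI -> dict of Altmetric details, or None when no
    Altmetric data was found. Keys shaped like `cited_by_<source>_count`
    are an assumption about the response layout.
    """
    found = {doi: rec for doi, rec in results.items() if rec is not None}
    categories = Counter()
    for rec in found.values():
        for key, value in rec.items():
            if key.startswith("cited_by_") and key.endswith("_count"):
                source = key[len("cited_by_"):-len("_count")]
                categories[source] += value
    coverage = len(found) / len(results) if results else 0.0
    return len(found), coverage, categories


# Hypothetical sample records, made up for illustration.
sample = {
    "10.1000/a": {"cited_by_tweeters_count": 3, "cited_by_feeds_count": 1},
    "10.1000/b": None,
    "10.1000/c": {"cited_by_tweeters_count": 2},
    "10.1000/d": {"cited_by_msm_count": 4},
}
n_found, coverage, categories = summarize(sample)
print(f"{n_found} of {len(sample)} DOIs ({coverage:.0%}) have Altmetric data")
print(categories.most_common())
```

The same arithmetic on the real numbers, 8,797 out of 12,109, rounds to 73%.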
ed
Preservation is access, in the future!
The web is a preservation medium.
What would we be without wishful thinking?
See also: @ink@merveilles.town
Posts
#TIL that Georges Perec worked as an archivist.
@acdha@code4lib.social nice, happy to jump on a zoom sometime to chat about it -- browsertrix-crawler gets a lot of attention from Webrecorder since their https://webrecorder.net/browsertrix/ service depends on it.
@acdha@code4lib.social and yes, the LLM bot defenses are a real problem for web archives. I saw in IIPC Slack yesterday that they are considering this as a topic for a technical meeting at the next conference in a few weeks (meeting may be open to remote participation).
Cloudflare has a Verified Bot registry, which I think some web archiving orgs have gotten on?
@acdha@code4lib.social for single page I really like Harvard LIL's Scoop, which I believe is used by their flagship perma.cc service via a celery job that talks to scoop-api, a (closed source?) web service that wraps Scoop:
https://github.com/harvard-lil/scoop#readme
For crawls of more than one page I think that @webrecorder@digipres.club's browsertrix-crawler is still the best thing out there. I helped write this howto, which I stand by:
https://sciop.net/docs/scraping/webpages/
PS. thanks for asking me about my favorite $work thing :-)
@acdha@code4lib.social do you just need one page, or do you need to crawl?
Bernie is smart to have AOC by his side, because she actually gets the political & economic significance of AI.
https://www.youtube.com/live/6B2x2FrJa6w?si=LAB2iHygiVi91n4A
Have we learned nothing?
https://thedailyrecord.com/2026/03/27/coinbase-brings-token-backed-down-payments-to-housing-market/
TIL that ORCID identifiers are available as #LinkedData :
https://gist.github.com/edsu/9be9658f9c6d300c569bae9b1016e108
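A rough sketch of what asking for an ORCID iD as Linked Data might look like, assuming it is served via HTTP content negotiation on the `https://orcid.org/<id>` URL. The `text/turtle` media type and the `orcid_rdf_request` helper are assumptions for illustration and are not taken from the gist; the iD shown is the fictitious example researcher that ORCID uses in its documentation.

```python
import urllib.request


def orcid_rdf_request(orcid_id, mime="text/turtle"):
    """Build (but do not send) a content-negotiated request for an ORCID iD."""
    url = f"https://orcid.org/{orcid_id}"
    return urllib.request.Request(url, headers={"Accept": mime})


# 0000-0002-1825-0097 is ORCID's fictitious example researcher.
req = orcid_rdf_request("0000-0002-1825-0097")
print(req.full_url, req.get_header("Accept"))

# To actually fetch the RDF (needs network access):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable offline; uncomment the last lines to try it against the live service.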
@stuartyeates@cloudisland.nz even more even more confusing! No wonder your stomach is turning :neocat_dizzy:
@stuartyeates@cloudisland.nz even more confusing!
@stuartyeates@cloudisland.nz is that pointing at the same resource as https://viaf.org/en/viaf/158359701 ?