Canadian Repositories Community of Practice October Call – Repositories in the Age of AI: The Attack of the Bots

Date: October 30, 2025
Time: 1pm-2pm ET

Registration

AI bots are aggressively harvesting content from the internet. Less well known are the especially devastating effects this scraping has had on libraries, archives, and museums (LAMs). The sudden and overwhelming increase in bot traffic to their repositories has cultural heritage and academic institutions grappling with technical strain on their systems and with ethical questions about ownership and how their resources are used to train large language models.

These bots, automated agents that collect data to train the large language models that power artificial intelligence, interact with repositories by crawling interfaces, parsing metadata, and extracting digital assets, often at scales and speeds that strain infrastructure and bypass curatorial intent. Aggressive harvesting can degrade system performance, skew usage metrics, violate usage terms, and strip cultural materials of context. All of these behaviors pose risks to both the availability of core online library services and the ability to manage public distribution of digital resources.

This session will examine the impact of unregulated AI scraping on the LAM ecosystem and, as a result, on library services. The presenters will discuss emerging mitigation strategies, including rate limiting, bot detection, changes to architecture and functionality, machine-readable licensing, and community-driven best practices to regulate AI scraping.

Arran Griffith is the Program Manager for Fedora, an open-source digital repository platform dedicated to long-term digital preservation. In this role, she leads community engagement, aligns global user priorities, and serves as a strategic liaison between Fedora Governance and its stakeholders. She also facilitates cross-community working groups that foster collaboration and maintain alignment across open-source technologies. In addition, Arran is a founding stakeholder of the AI Discussions Working Group, which organizes the monthly Solutions Showcase Series.

Rosalyn Metz is Chief Technology Officer at Emory University Libraries and Museum, where she leads a team of 24 professionals and manages a $4.5 million technology budget. She advances open infrastructure, digital preservation, and user-centered discovery, drawing on experience in both higher education and industry. Nationally and internationally, Rosalyn has held leadership roles in global open-source communities, including Fedora, Samvera, and the Oxford Common File Layout Editorial Group. She is a frequent speaker on AI, technology, and open infrastructure, and is the author of The Digital Shift, a widely read Substack. Recently she was invited to give a keynote at iPres 2025, held in New Zealand.

This session will be recorded.