The wisdom of preservation

Alicia Wise

Alicia Wise, CLOCKSS executive director, reflects on her career and explains the importance of robustly preserving academic resources

Tell us a little about your background and qualifications…

I started at university when I was 13, through the University of Washington early entrance programme, and meandered through undergraduate courses for the next six years.

I popped out the other end with a passion for archaeology, and an eclectic mix of office skills from myriad odd jobs. My PhD was on the Roman invasion of Scotland, and I was given my first-ever proper job at the Archaeology Data Service. From there I’ve been lucky to be able to follow my nose to interesting challenges: a spell doing UK consortial licensing at Jisc, service as chief executive of the Publishers Licensing Society, a spell at the Publishers Association, and eight years at Elsevier leading on open access.

The reason that this background is a good fit for the CLOCKSS digital archive is that I’ve met many talented librarians and publishers along the way, and have retained focus on expanding access to digital information.

How was CLOCKSS founded and what is its purpose?

In the 1990s concerns began to crystallise about the long-term preservation of digital information. Traditionally libraries have preserved materials in print format, but in the digital age libraries often license access to books and journals that are available only digitally, stored remotely, and accessed over the web. Although convenient for immediate access and usability, this can create real challenges for long-term preservation, access, and use. If a publisher ceases to publish, the library cancels its licence, or the publisher's website is down, then the content that a library has paid to access is no longer available.

Under the leadership of organisations such as the Commission on Preservation and Access, OCLC, and the Research Libraries Group, academic libraries began to systematically explore how digital preservation of academic resources could be accomplished. Various projects launched in the late 1990s, and some of these have led to the development of preservation services.

CLOCKSS launched as a project in 1999, led by Stanford University working in partnership with international research libraries and academic publishers. We have since evolved to become an independent charity jointly governed by libraries and publishers united to protect scholarly content.

To date we’ve been entrusted to preserve nearly 50 million journal articles and more than 350,000 books. We are making a concerted effort to welcome more book publishers into our community and are keen to work with publishers of all shapes and sizes. 

Scholarly content is archived in a network of carefully controlled servers distributed around the world at leading academic institutions, and the nodes in this network are in constant contact to check and, if necessary, restore the authenticity of the content protected within. When content entrusted to us permanently disappears from the web, we then make it accessible to everyone.
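
As a rough illustration of how distributed nodes might check and restore one another's copies, the Python sketch below compares SHA-256 digests across nodes and overwrites any copy that disagrees with a strict majority. It is a simplified assumption for illustration only, not the actual CLOCKSS protocol; the function names and node names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a stored copy."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Repair copies that disagree with the majority.

    `copies` maps a (hypothetical) node name to the bytes it holds.
    Returns the names of the nodes whose copies were replaced.
    """
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    # Require a strict majority before trusting any version.
    if count * 2 <= len(copies):
        raise ValueError("no majority agreement; manual review needed")
    good = next(c for c in copies.values() if digest(c) == winner)
    repaired = []
    for node, content in list(copies.items()):
        if digest(content) != winner:
            copies[node] = good  # restore from the agreed version
            repaired.append(node)
    return repaired


# Illustrative data: one of three nodes holds a damaged copy.
nodes = {
    "node_a": b"article text",
    "node_b": b"article text",
    "node_c": b"article t3xt",
}
repaired_nodes = audit_and_repair(nodes)
```

In this example only `node_c` is repaired, and all three nodes end up holding identical bytes. A production system would also need to handle authentication between nodes, scheduling, and cases where no majority exists.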

How has the world of preservation changed in recent years, and how do you see its future?

The scholarly record is at risk without long-term preservation, and here I mean much more than a remote back-up copy. Long-term preservation requires active management to ensure that the content remains healthy, and vigilance in the face of changing technology, disk failures, hacking, and worse. Sadly, failure to preserve at all (or until it is too late) is the key challenge. This challenge is greatest for less formally published scholarly communications, but it remains firmly in scope for formal published content and established publishers too.

Last year when I took up the role of executive director, I rather naively imagined that most books and journals – at least academic ones – would now be safely preserved in digital archives. But this is sadly very far from being the case, and more is needed. According to the ISSN International Centre, in 2021 there were c. 2.8 million ISSNs issued, of which nearly 300,000 were assigned to digital resources. Fewer than 69,000 of these titles are fully preserved, which means that they are archived in at least three independent digital archives. No one even knows the equivalent figures for books. So, there is a very great deal to do to encourage and enable publishers to ensure content reaches archives.
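
To put those figures in proportion, the short Python sketch below applies the "at least three independent archives" rule to a single title and works out the approximate share of digital titles that meet it. The inputs are the rounded numbers quoted above, so the result is only a rough estimate; the function names are hypothetical.

```python
def is_fully_preserved(archives: set[str], minimum: int = 3) -> bool:
    """True when a title is held by at least `minimum` distinct archives."""
    return len(archives) >= minimum


def preserved_share(fully_preserved: int, digital_titles: int) -> float:
    """Fraction of digital titles that count as fully preserved."""
    return fully_preserved / digital_titles


# Rounded figures from the interview: fewer than 69,000 of nearly 300,000.
share = preserved_share(69_000, 300_000)
print(f"{share:.0%}")  # prints 23%
```

On these rounded numbers, fewer than a quarter of digital titles with an ISSN are fully preserved.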

Let me illustrate this point with reference to the JASPER project (see https://doaj.org/preservation/). CLOCKSS is a partner in this project along with the Directory of Open Access Journals (DOAJ), the ISSN International Centre’s KEEPERS registry, the Internet Archive, and the Public Knowledge Project private LOCKSS network (PKP-PN). The project was established in response to studies by Mikael Laakso and others showing that hundreds of open access journals have disappeared entirely from the web in the last 20 years, and that more than 7,000 titles registered with DOAJ have no preservation policy or archive in place. We’ve created a content pipeline from DOAJ to Internet Archive to CLOCKSS and are providing encouragement and support to enable publishers to archive these titles. 

What’s the biggest issue facing scholarly communications at the moment, and how do you hope the industry will change over the next decade?

Digital preservation is also becoming more challenging as artificial intelligence, dynamic databases, interactive resources, knowledge graphs and more become embedded in the scholarly ecosystem. How do you preserve an entire online ecosystem in which scholars collaborate, discover, and share new knowledge? In truth, at present we cannot preserve the entire online ecosystem of research. We can take snapshots and preserve meaningful points in a scholar’s journey through this ecosystem, and it is essential that we do so. Digital preservation as a profession is therefore becoming more complicated. It requires an understanding of the social, organisational, and intellectual environments where metadata and content exist, of the content and technology itself, and of both the creators and users of research information. Acceleration of change in all these spheres is going to drive changes in the way we organise to do the job of digital preservation.

Finally, do you have any fascinating facts, hobbies or pastimes that you’d like to admit to?

Post-lockdown there is so much incentive to revive old hobbies and explore new ones too. My teenage son says I’m now doing ‘all the side quests’ – which, in this game called my life, include improv comedy, rally marshalling, singing, and boxercise. I'm not convinced about the boxercise… I think I’ll stick to running.

Interview by Tim Gillett