CAIRSS

CAIRSS assistance in ERA-SEER repository testing 2010

The CAIRSS website has information for CAIRSS institutions on ERA-SEER repository testing 2010. Please see: http://cairss.caul.edu.au/www/era_seer/configuring_repositories_era.htm

The Repository Testing Strategy (open from 12 March to 7 May 2010) aims to facilitate a more comprehensive, systematic, uniform and independent approach to depositing and accessing research outputs for the Excellence in Research for Australia (ERA) initiative for 2010.

To achieve the desired outcome for this testing, institutions need to:

Register all repositories that will be used for ERA 2010

Populate these repositories with at least 10 research outputs that will subsequently be submitted for ERA 2010

Test all aspects of each research output, and record the results, in SEER Manage Repository Testing

The login details and URL required for testing were emailed by the ARC to each institution's ERA Liaison Officer on 12 March 2010. The subject of the email was ‘ERA SEER login details and notification’. Each ERA Liaison Officer was advised to use these details to create a Repository Manage role within SEER and assign that role to the relevant person at the institution (i.e. the repository manager).

Repository managers who have not received this login will need to follow it up with their institution's main ERA contact officer, who will have received these emails and details, so that they can commence testing.

Key ARC documents are also linked from the CAIRSS website, including:

ERA-SEER Repository Testing Strategy for 2010.pdf

2009 ERA HCA Trial Repository Testing.pdf

Please contact Tim at CAIRSS for further information.

How to export contents from an Institutional Repository to a Spreadsheet

The idea

A short time ago, a Repository Manager from within the CAIRSS community approached CAIRSS for help with exporting the contents of their repository to a spreadsheet. Doing so would greatly assist with Institutional Repository management tasks and, most importantly, ERA-related work.

The tools

There are many ways to extract, move and convert data. The wisest choice is to use tools that are interoperable. For example, using OAI-PMH to extract data is preferable to communicating directly with an individual repository's data store or database.
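To illustrate why OAI-PMH is the interoperable choice, the sketch below builds a standard ListRecords request. The same request shape works against any OAI-PMH compliant repository, whatever database sits behind it. The endpoint address is a made-up placeholder and the class and method names are ours for illustration only; they are not taken from FoREveR or The Fascinator.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class OaiPmhRequest {

    // Builds an OAI-PMH ListRecords request URL.
    // "verb" and "metadataPrefix" are arguments defined by the OAI-PMH protocol.
    static String listRecordsUrl(String baseUrl, String metadataPrefix) {
        return baseUrl + "?verb=ListRecords&metadataPrefix="
                + URLEncoder.encode(metadataPrefix, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical endpoint: substitute your own repository's OAI-PMH base URL.
        String baseUrl = "http://repository.example.edu.au/oai";
        // oai_dc (unqualified Dublin Core) is the format every OAI-PMH repository must support.
        System.out.println(listRecordsUrl(baseUrl, "oai_dc"));
    }
}
```

A harvester sends this request and follows any resumption tokens returned, so no knowledge of the repository's internal storage is needed.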

The solution

Our CAIRSS Technical Officer, Tim McCallum, has completed a solution to this task in the form of a Java web application called FoREveR (Flexible Repository Export Reporter).

Extracting the data

The data extraction is carried out using an OAI-PMH harvester; in this instance, The Fascinator was used. Given recent trends in Institutional Repository development and the use of SOLR, the next step was an easy choice: simply extract the data from The Fascinator using a SOLR query. As an added bonus, SOLR is able to supply the data in JSON (JavaScript Object Notation) format.
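As a rough sketch of what such a query can look like, the example below builds a SOLR select URL that asks for every record in JSON format. The parameters q, wt and rows are standard SOLR query parameters, but the server address, the row limit and the class and method names are assumptions made for illustration; the address at which The Fascinator exposes its SOLR index may well differ.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SolrQueryUrl {

    // Builds a SOLR select URL that returns up to "rows" matching records as JSON.
    static String selectUrl(String solrBase, String query, int rows) {
        return solrBase + "/select?q="
                + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&wt=json&rows=" + rows;
    }

    public static void main(String[] args) {
        // Hypothetical SOLR location; "*:*" is the SOLR query that matches every document.
        System.out.println(selectUrl("http://localhost:8080/solr", "*:*", 100));
    }
}
```

Replacing "*:*" with a narrower query (a single field and value, for example) would restrict the report to a subset of the repository.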

Converting the data

Overview

Different methods of converting the data were tested, including XSLT and Python. Further research then revealed some excellent JSON libraries written in Java. The final choice was Java, given that the JSON libraries could meet the requirements for this application and that the OAI-PMH harvesting software, The Fascinator and SOLR were all already written in Java.

Technical

The JSON data is returned in response to an HTTP request (the application can be set to fetch all records by default). This data is converted to Java Maps and Java ArrayLists for further processing. The application loops through every record returned and builds a Java Set of the metadata fields it finds (a master list with no duplicates). This Set is then displayed in the user's browser, giving a last-minute chance to select or deselect metadata before the final report is written. It is sometimes best to leave out a metadata field containing a large amount of content, as it can make the spreadsheet unmanageable from an end user's perspective.
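The master list and selection steps can be sketched as follows. This is a minimal illustration, not the FoREveR source: it assumes the JSON has already been parsed into a List of Maps, and the class name, method names, field names and sample records are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FieldSelection {

    // Collects every distinct field name across all records, in first-seen order.
    static Set<String> masterFieldList(List<Map<String, Object>> records) {
        Set<String> fields = new LinkedHashSet<>();
        for (Map<String, Object> record : records) {
            fields.addAll(record.keySet());
        }
        return fields;
    }

    // Returns a copy of each record containing only the selected fields.
    static List<Map<String, Object>> keepSelected(List<Map<String, Object>> records,
                                                  Set<String> selected) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> record : records) {
            Map<String, Object> filtered = new LinkedHashMap<>(record);
            filtered.keySet().retainAll(selected);
            result.add(filtered);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records standing in for parsed JSON.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("title", "Paper A");
        first.put("creator", Arrays.asList("Smith, J.", "Lee, K."));
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("title", "Paper B");
        second.put("description", "A very long abstract");
        List<Map<String, Object>> records = Arrays.asList(first, second);

        Set<String> fields = masterFieldList(records);
        System.out.println(fields);

        // Simulates the user deselecting a bulky field before the report is written.
        fields.remove("description");
        System.out.println(keepSelected(records, fields));
    }
}
```

A Set suits this step because records rarely share exactly the same fields, and each field name should be offered to the user only once.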

Reporting the data

Once the selection is approved, the application creates an HTML file with all of the data saved to a table. The table includes table headings, table rows, table data cells and unordered lists for repeating information. This file can be opened in the Microsoft Excel and OpenOffice spreadsheet applications or viewed in a browser.
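A report of this shape could be produced along the following lines. This is a sketch under our own assumptions rather than the actual FoREveR code: the class and method names and the sample record are invented, and repeating values are assumed to arrive as Java Collections.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HtmlReport {

    // Escapes the characters that would otherwise be read as HTML markup.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Renders one table cell; repeating values become an unordered list.
    static String cell(Object value) {
        if (value == null) {
            return "<td></td>";
        }
        if (value instanceof Collection) {
            StringBuilder list = new StringBuilder("<td><ul>");
            for (Object item : (Collection<?>) value) {
                list.append("<li>").append(escape(String.valueOf(item))).append("</li>");
            }
            return list.append("</ul></td>").toString();
        }
        return "<td>" + escape(String.valueOf(value)) + "</td>";
    }

    // Renders a heading row followed by one row per record.
    static String table(List<String> fields, List<Map<String, Object>> records) {
        StringBuilder html = new StringBuilder("<table>\n<tr>");
        for (String field : fields) {
            html.append("<th>").append(escape(field)).append("</th>");
        }
        html.append("</tr>\n");
        for (Map<String, Object> record : records) {
            html.append("<tr>");
            for (String field : fields) {
                html.append(cell(record.get(field)));
            }
            html.append("</tr>\n");
        }
        return html.append("</table>").toString();
    }

    public static void main(String[] args) {
        // Made-up sample record with one repeating field.
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("title", "Paper A");
        record.put("creator", Arrays.asList("Smith, J.", "Lee, K."));
        String body = table(Arrays.asList("title", "creator"),
                Collections.singletonList(record));
        System.out.println("<html><body>\n" + body + "\n</body></html>");
    }
}
```

Records that lack a selected field produce an empty cell, which keeps the columns aligned when the file is opened in a spreadsheet application.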

Screen Shots

Optional SOLR Query

[Screenshot]

Note: It is not necessary to know SOLR query syntax; the application can be set to retrieve everything by default. This may be an area to address with the community, and feedback is welcome.

Feedback

[Screenshot]

Small sample of spreadsheet output

[Screenshot]

Using the Flexible Repository Export Reporter (FoREveR)

As this software is in the very early stages of its life cycle, CAIRSS can create reports and email them to you. Please contact CAIRSS Central if you think that your institution can benefit from the use of this tool.

The source code is available at http://cairss.caul.edu.au/trac/browser/code/FoREveR for your interest; however, it has not been extensively tested. All feedback is welcome. CAIRSS will endeavour to improve and enhance the software to meet your needs.

Research Repository Managers Symposium @ Educause

Educause Australasia Conference (3-6 May, Perth)

Some of the case studies from this repository event are now available online at: http://researchspace.auckland.ac.nz/handle/2292/3368
Please note they are still being added to.

It would be great to hear from any CAUL Repository Managers who attended.

I have picked up the following comments from participants:

  • For many, ERA took over from ‘traditional’ repository establishment activities in 2009
  • Very honest discussions took place
  • Discussion on the need to ‘act together’ rather than go it alone (esp. for lobbying)
  • Concerns about the sustainability of a CAIRSS-like service (what happens when the funding runs out in 2 years?)
  • Repository Managers feel isolated and want more opportunities to discuss topics with their Repository Manager peers at other institutions
  • CAIRSS to organise a similar event at the 2011 Educause Conference