LoCloud Hackathon winner!

Vangelis Banos, Future Library, Greece

MoRe Quality by Vangelis Banos (Future Library, Greece) is the winning prototype application of the LoCloud hackathon, which took place on 11 February 2015 at the premises of the Google Cultural Institute in Paris, France.

The hackathon was organised by LoCloud and Europeana in the context of EuropeanaTech 2015.

LoCloud Hackathon, Paris 

Concept

The Metadata & Object Repository (MoRe) is an easy-to-use and powerful tool that aggregates information and harvests metadata from multiple sources in multiple schemas. Aggregating from such heterogeneous sources often leads to quality problems in the harvested metadata.

Metadata may pass the standard Europeana XML validity tests and still contain problematic values. For instance:

* a dc:date value could be formatted incorrectly:

<dc:date>approximately 18th century</dc:date>

This free-text value does not follow an established date format such as ISO 8601.

* an author name could be incomplete according to bibliographic standards:

<dc:creator>Mike</dc:creator>

* a URL may be invalid:

<ese:isShownAt>http://invalidurl.com/error-url</ese:isShownAt>

The aim of the MoRe Quality tool is to implement a validation system that catches these errors and produces useful reports for collection administrators.
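To illustrate why XML validity alone is not enough, here is a minimal sketch using only the Python standard library. It parses a well-formed record without complaint and extracts the field values that a quality check would then need to inspect. The `record` wrapper element and the `field_values` helper are assumptions made for this illustration; they are not taken from the MoRe Quality source code.

```python
import xml.etree.ElementTree as ET

# Dublin Core elements namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: well-formed XML, yet both values are questionable.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:date>approximately 18th century</dc:date>
  <dc:creator>Mike</dc:creator>
</record>"""


def field_values(xml_text, namespace, name):
    """Return the stripped text of every element with the given qualified name."""
    root = ET.fromstring(xml_text)  # raises ParseError only for malformed XML
    tag = "{%s}%s" % (namespace, name)
    return [el.text.strip() for el in root.iter(tag) if el.text]


dates = field_values(RECORD, DC, "date")
creators = field_values(RECORD, DC, "creator")
print(dates, creators)
```

The parser accepts the record, so any judgement about the values themselves has to happen in a separate step.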

MoRe Quality application

The prototype functionality

MoRe Quality communicates with the LoCloud MoRe instance using a specific user API key, retrieves the ingested metadata and performs evaluations to identify common errors such as:

* Invalid date formats (ISO 8601 standard)

* Invalid hyperlinks

* Invalid language codes (ISO 639 standard)

* Invalid author names

The results are presented to the user in a simple report.
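The checks listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype's actual code: it uses only the standard library instead of the iso8601 and pycountry modules, accepts only a calendar-date subset of ISO 8601, tests URLs syntactically without contacting the server, uses a tiny sample of ISO 639-1 codes, and treats "at least two name parts" as a stand-in heuristic for author names. All function names, the `dc:language` field choice and the report format are hypothetical.

```python
import re

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7

# Calendar dates only (YYYY, YYYY-MM, YYYY-MM-DD): a subset of ISO 8601.
ISO_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")

# Tiny illustrative sample of ISO 639-1 codes, not the full list.
SAMPLE_LANGUAGE_CODES = {"de", "el", "en", "es", "fr", "it"}


def is_iso_date(value):
    return bool(ISO_DATE.match(value.strip()))


def is_wellformed_url(value):
    # Syntax check only; finding dead links would need an HTTP request.
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_known_language(value):
    return value.strip().lower() in SAMPLE_LANGUAGE_CODES


def looks_like_full_name(value):
    # Assumed heuristic: a usable author name has at least two parts.
    return len(value.replace(",", " ").split()) >= 2


# (field, check function, message used in the report)
CHECKS = [
    ("dc:date", is_iso_date, "invalid date format"),
    ("ese:isShownAt", is_wellformed_url, "invalid hyperlink"),
    ("dc:language", is_known_language, "invalid language code"),
    ("dc:creator", looks_like_full_name, "incomplete author name"),
]


def build_report(record):
    """record maps a field name to a list of values; returns report lines."""
    report = []
    for field, check, message in CHECKS:
        for value in record.get(field, []):
            if not check(value):
                report.append("%s: %s (%r)" % (field, message, value))
    return report


record = {
    "dc:date": ["approximately 18th century", "1789-07-14"],
    "ese:isShownAt": ["htp:/broken-link"],
    "dc:language": ["xx", "el"],
    "dc:creator": ["Mike"],
}
report = build_report(record)
for line in report:
    print(line)
```

Note that a URL such as http://invalidurl.com/error-url is syntactically fine, so a purely syntactic check like this one would not flag it; detecting it would require actually requesting the page.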

The application is designed so that developers can easily add extra evaluation rules by implementing simple functions (plugins).
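One common way to support function-based plugins in Python is a small registry filled by a decorator. The sketch below shows that general pattern; it is an assumption about how such a mechanism could look, not a description of the plugin interface in the MoRe Quality source. The names `rule`, `RULES` and `run_rules`, and the two example rules, are hypothetical.

```python
# Registry of (field name, rule function) pairs, filled by the decorator below.
RULES = []


def rule(field):
    """Decorator that registers a function as an evaluation rule for a field."""
    def register(func):
        RULES.append((field, func))
        return func
    return register


@rule("dc:creator")
def creator_has_two_parts(value):
    # A rule returns an error message, or None when the value looks fine.
    if len(value.split()) < 2:
        return "author name looks incomplete"
    return None


@rule("dc:title")
def title_not_empty(value):
    if not value.strip():
        return "title is empty"
    return None


def run_rules(record):
    """Apply every registered rule to the matching values of a record."""
    problems = []
    for field, func in RULES:
        for value in record.get(field, []):
            message = func(value)
            if message:
                problems.append((field, value, message))
    return problems


problems = run_rules({
    "dc:creator": ["Mike", "Vangelis Banos"],
    "dc:title": ["  "],
})
for field, value, message in problems:
    print("%s %r: %s" % (field, value, message))
```

With this layout, adding a rule means writing one decorated function; the code that runs the rules and builds the report does not need to change.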

Technical information

MoRe Quality is implemented in Python 2.7 and runs on Linux.

Some common Python modules and tools are used:

* Virtual environments

* Flask

* Python Requests

* BeautifulSoup4

* pycountry

* iso8601

The prototype is not currently running on a production server, but the full source code is freely available at: https://bitbucket.org/vbanos/more-quality/

Anyone interested in MoRe Quality should feel free to contact the author for more information.