HOWTO Migrate Your Old Documents Into Invenio

Overview

Invenio comes as a suite of independent, flexible modules that let you convert your data from any existing format and upload it into the Invenio system. This document briefly describes how to proceed.

Quick instructions for the impatient Invenio admin

$ cd /tmp
$ cp /my/own/doc/system/datadump.txt .
$ vi datadump.cfg
$ bibconvert -cdatadump.cfg < datadump.txt > datadump.xml
$ bibupload -i datadump.xml
$ bibsched

Detailed instructions for the patient Invenio admin

$ cd /tmp
Go to a temporary directory.
$ cp /my/own/doc/system/datadump.txt .
Copy your old data into a text file or some other format of your choice. Preferably, the data should be well structured, for example as XML. (That said, you may attempt to match even a free-text format.)
$ vi datadump.cfg
Describe the format of your data in the BibConvert language. (You may also use XSLT if your dumped file is in XML format.)

This step lets you transform your data into the MARCXML format that Invenio uses internally for handling bibliographic data. (If you have not yet chosen your MARC scheme, please read the MARC HOWTO.)

When writing the transformation configuration file, you may want to add a collection identifier to your records so that they belong to a particular WebSearch collection; otherwise they might not be visible through the search interface. To do this, make the output contain 980 MARC tags of the form:

   <datafield tag="980" ind1=" " ind2=" ">
     <subfield code="a">ARTICLE</subfield>
   </datafield>
For example, this will place your record in the Articles demo collection. You can define your collections later via the WebSearch Admin; for now, the important point is only to create appropriate 980 collection identifier tags for the records you are going to upload.
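To make the shape of the 980 tag concrete, here is a minimal Python sketch that appends it to every record of a MARCXML document. It is an illustration, not part of Invenio: the function name add_collection_tag is hypothetical, the sample input is invented, and the sketch assumes the XML carries no namespace declaration. In practice you would normally emit the tag directly from your BibConvert or XSLT configuration.

```python
import xml.etree.ElementTree as ET


def add_collection_tag(record, collection_id):
    """Append a 980 $a datafield to one <record>, unless it is already there.

    Hypothetical helper for illustration; assumes un-namespaced MARCXML.
    Returns True if a tag was added, False if the record already had it.
    """
    for field in record.findall("datafield"):
        if field.get("tag") != "980":
            continue
        for sub in field.findall("subfield"):
            if sub.get("code") == "a" and (sub.text or "").strip() == collection_id:
                return False
    datafield = ET.SubElement(
        record, "datafield", {"tag": "980", "ind1": " ", "ind2": " "}
    )
    subfield = ET.SubElement(datafield, "subfield", {"code": "a"})
    subfield.text = collection_id
    return True


# Invented sample input: one record without a collection identifier.
sample = """<collection>
  <record>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">An example title</subfield>
    </datafield>
  </record>
</collection>"""

root = ET.fromstring(sample)
# findall returns a list, so adding children while looping is safe.
changed = [add_collection_tag(rec, "ARTICLE") for rec in root.findall("record")]
print(ET.tostring(root, encoding="unicode"))
```

Calling the helper a second time on the same record leaves it unchanged, so the script can be re-run without producing duplicate 980 tags.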

If your dumped file is in the XML format, you can also consult the example oaimarc2marcxml.xsl stylesheet that is used to manipulate metadata harvested via OAI.

You may also want to preserve any OAI identifiers in your old records, if you had any.

$ bibconvert -cdatadump.cfg < datadump.txt > datadump.xml
Convert the data from your own format into MARCXML, using the configuration you wrote in the previous step.
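Before uploading, it may be worth checking that every converted record actually carries a 980 collection identifier, since records without one might not show up in the search interface. The following is a sketch under assumptions, not an Invenio tool: the function names are hypothetical, the sample data is invented, and element names are matched with any namespace prefix stripped so that both namespaced and plain MARCXML are accepted.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a namespace prefix such as '{http://www.loc.gov/MARC21/slim}record'."""
    return tag.rsplit("}", 1)[-1]


def records_missing_collection(xml_text):
    """Return 1-based positions of <record> elements with no non-empty 980 $a."""
    root = ET.fromstring(xml_text)
    records = [el for el in root.iter() if local_name(el.tag) == "record"]
    missing = []
    for position, record in enumerate(records, start=1):
        has_980a = any(
            local_name(field.tag) == "datafield"
            and field.get("tag") == "980"
            and any(
                local_name(sub.tag) == "subfield"
                and sub.get("code") == "a"
                and (sub.text or "").strip()
                for sub in field
            )
            for field in record
        )
        if not has_980a:
            missing.append(position)
    return missing


# Invented sample: the second record lacks a collection identifier.
sample = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="980" ind1=" " ind2=" ">
      <subfield code="a">ARTICLE</subfield>
    </datafield>
  </record>
  <record>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">Untagged record</subfield>
    </datafield>
  </record>
</collection>"""

missing = records_missing_collection(sample)
print(missing)
```

To check a real conversion result, read datadump.xml into a string and pass it to records_missing_collection; an empty list means every record has a 980 $a subfield.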
$ bibupload -i datadump.xml
Upload the converted metadata into the Invenio bibliographic databases.
$ bibsched
Watch the progress as the metadata are uploaded, indexed, and formatted.
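Once the conversion and upload steps work by hand, they could be wrapped in a small script for repeated runs. The sketch below is only an illustration: the function names and the dry_run parameter are hypothetical, and it uses only the bibconvert -c and bibupload -i invocations shown in this document. The bibsched step is left out so that progress can still be inspected manually.

```python
import subprocess


def convert_command(config_path):
    """Build the bibconvert argv; the config path follows -c with no space."""
    return ["bibconvert", "-c" + config_path]


def upload_command(marcxml_path):
    """Build the bibupload argv using the -i option shown in this document."""
    return ["bibupload", "-i", marcxml_path]


def migrate(config_path, dump_path, marcxml_path, dry_run=True):
    """Convert the dump, then upload it; with dry_run, only return the commands."""
    commands = [convert_command(config_path), upload_command(marcxml_path)]
    if dry_run:
        return commands
    # Equivalent of: bibconvert -c<config> < dump > marcxml
    with open(dump_path, "rb") as source, open(marcxml_path, "wb") as target:
        subprocess.run(commands[0], stdin=source, stdout=target, check=True)
    subprocess.run(commands[1], check=True)
    return commands


# Dry run: show what would be executed without touching Invenio.
planned = migrate("datadump.cfg", "datadump.txt", "datadump.xml")
for argv in planned:
    print(" ".join(argv))
```

Passing dry_run=False runs both commands and stops at the first failure, because check=True raises an exception on a non-zero exit status.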

Congratulations! At this point you should have successfully migrated your old data into the Invenio system.