How to install EPrints 2.3.3 for a multi-language environment (or for a language other than the default English)?

  1. If you are reading this, you have probably already unpacked the package. If not, the first step is to unpack the package into a separate directory.
  2. There are several patches here. They apply to EPrints version 2.3.3; some of them might be incorporated in later versions. Please check the CHANGELOG file there. Decide which patches you want to apply. They fall into the following categories:
    1. Corrections to the EPrints library. These might already be corrected in a newer version; please consult the patches file.
    2. Set_lang1, a new cgi script to set the language. This requires the new flag icons in general/images/flags, and some new entries in the system phrases file.
    3. Modifications to support non-latin1 characters. A huge utf-8 conversion map is also included, as the \l and \L commands are not working properly.
    4. Archive-specific routines that default to multi-language features and to the "family name, first name" style.
    5. Hungarian language template files.
    For details, please consult the README.MULTILANG file.
  3. Copy the patched and new files to the unpacked EPrints 2.3.3 directory, or apply the patches manually. Instead of copying the files one by one, you can unpack the whole package in the main 2.3.3 directory.
  4. If necessary, edit the cfg/languages.xml file: uncomment the languages you want to use, and change the supported="no" attribute to supported="yes" (if the language will be supported). Comment out Hungarian if you do not intend to support Hungarian language pages.
  5. For all supported languages the following files must exist. Replace "xx" by the two-letter code of the language (for example, "fr" for French, "it" for Italian, "hu" for Hungarian, "de" for German):

    cfg/system-phrases-xx.xml -- system phrases
    defaultcfg/template-xx.xml -- template file for all pages
    defaultcfg/phrases-xx.xml -- archive dependent phrases
    defaultcfg/subjects.xml -- subjects in all languages
    defaultcfg/citations-xx.xml -- citation file
    defaultcfg/static/xx/index.xpage -- main page for the archive
    defaultcfg/static/xx/contact.xpage -- whom to contact
    defaultcfg/static/xx/error401.xpage -- general error page
    defaultcfg/static/xx/information.xpage -- info
    defaultcfg/static/xx/vlit.xpage -- vlit
    defaultcfg/static/xx/help/index.xpage -- help

    These files use a special XML format; the first two lines must be as follows:
        <?xml version="1.0" encoding="iso-8859-2" standalone="no" ?>
        <!DOCTYPE phrases SYSTEM "entities-xx.dtd"> 

    The encoding can be any iso-8859-?, or it can be "utf-8". The file MUST actually be saved in the declared encoding. In the second line, `xx' must be replaced by the actual language code. Please do not use numeric character references (things like &#ddd;).
  6. Read the documentation.
  7. In the main 2.3.3 directory, run ./configure with parameters. `./configure --help' lists all possible parameters. The most important ones are:
  8. If the previous run was successful, run ./install.pl. You might get further error messages; correct them, and run ./configure again. If certain supporting programs are missing, install them.

    *** IMPORTANT! ***

    Before running ./configure again, remove `config.cache'
  9. Before editing the Apache config files (as advised by the `install.pl' script), you must first create at least one EPrints archive. Go to the installed EPrints directory (the one you gave after the --prefix= flag), choose a name for your archive, and issue
            bin/configure_archive ARCHIVENAME

    You will be asked several questions, which you have to answer correctly. The most important ones concern MySQL database access; see the documentation for details. You can run bin/configure_archive several times; however, after the first time accept the default NO to the "create config files" question.
    If you change the data structure (after adding new fields or removing old ones), it is worth erasing all database content by calling
           bin/erase_archive ARCHIVENAME

    and running bin/configure_archive ARCHIVENAME again, letting the program recreate the database. However, even in this case you should say "NO" to "create config files".
    The "cfg/apache.conf" file, which you must include in the Apache config file, is created by
           bin/generate_apacheconf

    (note: it takes no argument). It should be called whenever you create a new archive (there can be several at the same site).
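
Before running ./configure, it can save time to verify that every supported language has the files listed in step 5. The following is a minimal sketch in Python, not part of EPrints: the function names (required_files, missing_files, numeric_char_refs) are hypothetical helpers, and the path list is simply copied from step 5. It also flags numeric character references of the &#ddd; kind, which the phrase files should avoid.

```python
import os
import re

# Per-language files from step 5; {lang} stands for the two-letter code ("xx").
PER_LANGUAGE_FILES = [
    "cfg/system-phrases-{lang}.xml",
    "defaultcfg/template-{lang}.xml",
    "defaultcfg/phrases-{lang}.xml",
    "defaultcfg/citations-{lang}.xml",
    "defaultcfg/static/{lang}/index.xpage",
    "defaultcfg/static/{lang}/contact.xpage",
    "defaultcfg/static/{lang}/error401.xpage",
    "defaultcfg/static/{lang}/information.xpage",
    "defaultcfg/static/{lang}/vlit.xpage",
    "defaultcfg/static/{lang}/help/index.xpage",
]

# Files shared by all languages.
SHARED_FILES = ["defaultcfg/subjects.xml"]


def required_files(lang):
    """Return the relative paths that must exist for one language code."""
    return [path.format(lang=lang) for path in PER_LANGUAGE_FILES]


def missing_files(root, langs):
    """Return the required relative paths that do not exist under root."""
    wanted = list(SHARED_FILES)
    for lang in langs:
        wanted.extend(required_files(lang))
    return [
        rel for rel in wanted
        if not os.path.isfile(os.path.join(root, *rel.split("/")))
    ]


def numeric_char_refs(text):
    """Return all numeric character references (&#ddd; or &#xhh;) in text."""
    return re.findall(r"&#(?:[0-9]+|[xX][0-9A-Fa-f]+);", text)
```

For example, missing_files("/path/to/eprints-2.3.3", ["en", "hu"]) returns the list of files still to be created for English and Hungarian; an empty list means all files from step 5 are present. The directory given here is only a placeholder for your unpacked source tree.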

Now that you have created an empty archive, here are the steps to make it multilingual.

  1. Locate the archives/ARCHIVENAME.xml file, and edit it. It is an XML file. In the first line you can change the encoding (for example, if you want to enter the archive name in the local language). However, you must use encoding="utf-8" if no single encoding suffices for all supported languages. As this file is regenerated automatically, it will use utf-8 encoding after calling bin/configure_archive.
    The encoding is declared in the first line:
         <?xml version="1.0" encoding="iso-8859-2"?>
         ------------------------------^^^^^^^^^^^

    For each supported language, insert a line such as
         <language>en</language>
         <language>hu</language>

    change the default language to one of the supported ones:
        <defaultlanguage>hu</defaultlanguage>

    and give the archive name in all supported languages:
        <archivename language="en">Sample Archive</archivename>
        <archivename language="hu">Any text in the given encoding</archivename>
  2. Run bin/configure_archive ARCHIVENAME again. You can recreate the database files, but say "no" to "create config files".
  3. Run bin/create_tables ARCHIVENAME
  4. Check the subjects.xml file. It must contain subject names in all supported languages. The file can use any encoding that can represent all of these languages. The default encoding is "utf-8". See the subjects.xml file supplied in this package.
  5. Run bin/import_subjects --xml ARCHIVENAME subjects.xml
  6. Adjust ArchiveTextIndexingConfig.pm. As a minimum, you must supply your local stopwords, i.e. the words which should not be indexed. You can also redefine the way encoded letters are treated and converted to lowercase. The present method works for accented Latin letters, but not for Cyrillic, Chinese or Hebrew.
  7. Edit the Apache configuration file, typically httpd.conf in /etc/apache/. As the last line, insert
            Include /the/full/path/for/eprints2/cfg/apache.conf

    where /the/full/path/for/eprints2/ is the full path you gave after the --prefix= flag in the ./configure invocation.
  8. Stop and restart Apache.
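
The language settings edited in step 1 are easy to get out of sync, so a quick consistency check of archives/ARCHIVENAME.xml may help before restarting Apache. The following is a minimal sketch in Python, not part of EPrints: check_archive_languages is a hypothetical helper, and it assumes only the three element names shown in step 1 (language, defaultlanguage, archivename). It makes no assumption about how these elements are nested in the file.

```python
import xml.etree.ElementTree as ET


def check_archive_languages(xml_source):
    """Return a list of problems found in the archive XML (empty if none).

    xml_source is the content of archives/ARCHIVENAME.xml, as bytes or str.
    """
    root = ET.fromstring(xml_source)

    # Codes listed as <language>xx</language>, in document order.
    languages = [el.text.strip() for el in root.iter("language") if el.text]

    # Value of <defaultlanguage>xx</defaultlanguage>, if present.
    default = root.findtext(".//defaultlanguage")

    # Codes that have an <archivename language="xx"> entry.
    named = {el.get("language") for el in root.iter("archivename")}

    problems = []
    if not languages:
        problems.append("no <language> elements found")
    if default is None or default.strip() not in languages:
        problems.append("default language %r is not a supported language"
                        % default)
    for lang in languages:
        if lang not in named:
            problems.append("no <archivename> for language %r" % lang)
    return problems
```

A typical use is to read the file in binary mode, so that the parser honours the encoding declared in the first line, and to print whatever check_archive_languages returns. An empty list means that every supported language has an archive name and that the default language is one of the supported ones.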