This registration script requests all data, not just the e-mail address, username and password. It warns if the username or password contains national characters (browsers and Apache handle them differently, so authentication does not work). Here is a sample registration page.
It would be nice to change the confirmation page, too: it should check the credentials (username/password), and then present /users/home with some extra lines (Your registration was successful, etc.).
I found the following user classification more helpful than the original one. There are five user types:
The confirmation page, after checking that all data is in order, calls confirm_user($user,$session) defined in ArchiveValidateConfig.pm. This subroutine can then decide, depending on various user data, whether the registered user gets into the viewer or the contributor category:
sub confirm_user
{
    my( $user, $session ) = @_;
    if( $user->get_type() eq "templateuser" || $user->get_type() eq "viewer" )
    {
        $user->set_value( "usertype",
            ( $user->get_value("email") =~ /[\.\@]ceu\.hu$/i ) ? "user" : "viewer" );
        $user->commit();
    }
}
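To see which addresses the pattern in confirm_user accepts, the test can be pulled out into a small stand-alone function. This is only a sketch: the name usertype_for_email and the sample addresses are made up for illustration and are not part of the archive code.

```perl
use strict;
use warnings;

# Hypothetical helper that repeats the test used inside confirm_user():
# an address at ceu.hu, or at any subdomain of it, counts as local.
sub usertype_for_email {
    my( $email ) = @_;
    return ( $email =~ /[\.\@]ceu\.hu$/i ) ? "user" : "viewer";
}

# Sample addresses, invented for the demonstration.
for my $email ( 'a@ceu.hu', 'b@math.CEU.hu', 'c@notceu.hu', 'd@ceu.hu.example.org' ) {
    printf "%-24s %s\n", $email, usertype_for_email( $email );
}
```

The character class in front of "ceu" is what keeps a domain such as notceu.hu out, and the trailing $ rejects addresses that merely contain ceu.hu somewhere in the middle.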
The fields for templateuser, viewer and user are the same, except for the password field, which is required in the first case only. In the other cases the password field must not be required, as otherwise a personal record update would be accepted only if the password field is also filled in (which it usually is not).
Changing the password during personal record updating is allowed; I hope it will cause no problem later. Archive contributors and editors cannot change their e-mail address: this is a little paranoia, as anyone can submit material and later vanish into thin air. Of course, the admin can change that address.
Finally all user scripts check whether the user has a valid record. If not (possible only when the admin creates a new user, or changes his/her credentials), the home_fill script is called instead.
Submitting a document is not a trivial task; thus it should be as simple as possible. Users usually have no deep knowledge of the format of their submission; furthermore, Apache serves files not according to the specification (what the file was claimed to be) but rather according to the extension of the file name (even Apache does not consult the content of the file). Thus the "document_type" field can be -- and should be -- determined automatically, and not by the submitter. This is done by the new archive call get_document_type($main_filename).
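A minimal sketch of how such a call could look is given below. Only the name get_document_type and its argument come from the text; the extension table, the type names and the "other" fallback are assumptions chosen for illustration, and a real archive would list its own types.

```perl
use strict;
use warnings;

# Hypothetical extension table (an assumption, not the archive's real list).
my %TYPE_BY_EXTENSION = (
    pdf  => "pdf",
    ps   => "ps",
    html => "html",
    htm  => "html",
    txt  => "ascii",
    link => "link",
);

# Decide the document type from the extension of the main file name,
# the same piece of information Apache uses when it serves the file.
sub get_document_type {
    my( $main_filename ) = @_;
    # Take the text after the last dot, as long as it contains no "/".
    my( $ext ) = $main_filename =~ /\.([^.\/]+)$/;
    return "other" unless defined $ext;
    return $TYPE_BY_EXTENSION{ lc $ext } || "other";
}

print get_document_type( "thesis.PDF" ), "\n";
```

Unknown or missing extensions fall through to a catch-all type, so the submitter is never asked to name the format.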
Uploading may come from two sources: either from a local file, or from an internet address. Whether the uploaded material should be uncompressed or not is independent of its source, and can be given by a checkbox. The method of uncompressing can be decided locally; requesting the submitter to know the exact method is unnecessary. Thus "uncompress" can be a check box only. I have chosen an even simpler method: the very first file uploaded for a format is uncompressed if necessary, all the rest are not. This is what the average user might expect; a knowledgeable user could use it to upload any file she wishes.
Certain web addresses can only be used as links (no problem with the metadata, but they want to keep the file). Thus we have introduced the "link" document type. The link should also be specified in the "url" window, and ticking the "use as link only" box will prevent downloading the specified URL. Internally, links are stored as the content of a file with extension ".link". When rendering, the content is copied into the href field (in ArchiveRenderConfig.pm):
my $fmt = $doc->get_value( "format" );
my $link = "";
if( $fmt eq "link" && open( TMP, $doc->local_path()."/".$doc->get_main() ) )
{
    $link = <TMP>;
    chomp $link;
    close TMP;
}
if( $link eq "" )
{
    $link = $doc->get_url();
}
The full process of submission has five stages; sample pages are available here
Stage 1 has an alternate form when the document is edited, and not created.
All stages but stage 4 are relatively straightforward. The upload stage splits into two subcases: either the document has (one or more) formats, or it has none. In the former case a table is presented with a row for each format showing the type (determined from the main file), the commentary, a link to the main file (to preview in a separate window), the number of files belonging to that format, and two buttons: "Edit" and "Delete". Below the table there are three action buttons: "Back", "Next", and "Add New Format".
If no format is defined for the eprint, then the upload page is shown instead of the summary page. A similar, but slightly different page appears when the "Add New Format" button is pressed. Both pages let the user choose a local file (via the "Browser" field), enter a URL (text field), specify whether the URL should be downloaded or used as a link, and also fill in the field of extra format commentary. Clicking on "Next" starts the grabbing process and a new format is produced. The successor page is the page of stage 4.
In the EPrints::SubmissionForm script the code for an uncompress button is commented out. It can be used to instruct the downloader to uncompress the grabbed file (the uncompressing method can be decided locally). At present the very first uploaded file is uncompressed if possible.
On the format list, clicking on "Edit" next to a format leads to the edit format page. Here all files belonging to that format are listed, with buttons to delete any individual file or to make it the "main" file (which, in turn, determines the format's type). The main file cannot be deleted. It is possible to edit the format's commentary, and to upload a new file to the format via a mechanism similar to the one for creating a new format. In this case the "link" button is not available. Clicking on "Next" goes to the page of stage 4.
On the "upload-first" and "edit" pages the format description, language and security fields are presented depending on the settings in ArchiveConfig.pm. If the appropriate value is 0, the field is not shown; if the value is 1, it always appears. If the value is 2, however, it is shown only when the page is edited by an editor. To keep the submission page as simple as possible, only the format description is presented, and the security field is for editors only. Thus only editors can limit the availability of the document.
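The three-valued setting can be sketched as follows. The hash %FIELD_VISIBILITY, its keys and the function show_field are hypothetical names used only for this illustration; the text does not give the actual configuration keys used in ArchiveConfig.pm. The values mirror the choice described above: description always, security for editors only.

```perl
use strict;
use warnings;

# Hypothetical settings: 0 = never shown, 1 = always shown,
# 2 = shown only when the page is edited by an editor.
my %FIELD_VISIBILITY = (
    formatdesc => 1,
    language   => 0,
    security   => 2,
);

# Returns 1 if the field should appear on the page, 0 otherwise.
# Fields without a setting are treated as hidden.
sub show_field {
    my( $fieldname, $is_editor ) = @_;
    my $setting = $FIELD_VISIBILITY{ $fieldname } || 0;
    return 1 if $setting == 1;
    return( $is_editor ? 1 : 0 ) if $setting == 2;
    return 0;
}

for my $field ( sort keys %FIELD_VISIBILITY ) {
    printf "%-10s submitter=%d editor=%d\n",
        $field, show_field( $field, 0 ), show_field( $field, 1 );
}
```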
In our case it has been requested that documents -- whenever possible -- should be converted into pdf. The upload procedures in the EPrints::Document library have been modified so that new files are automatically converted (whenever possible), and the resulting pdf file is made the main file. Using the format's Edit option this can be undone, and the pdf (or the original) file can be erased if necessary. We found this mechanism quite satisfactory, as postscript files regularly shrank by over 50%.
The submission pages have much more help information than before. Also, several new pins were introduced to refer to the document under submission. At stages 2 and 3 the document type is available; at stage 4 the standard one-line document rendering is available.
Availability of documents can now be restricted. In "metadata-types.xml" the "security" dataset has only two entries: "locally" and "staffonly". As security is not required any more (modified in EPrints::Documents), an undefined value means unlimited access. In ArchiveConfig.pm the subroutine can_user_view_document was modified a little: the new trivial case
return( 1 ) if( !defined $security );
was added, as well as the following text just before the last line:
if( $security eq "locally" )
{
    return $user->has_priv( "local_view_docs" ) ? 1 : 0;
}
Those users who can view such documents also have the "local_view_docs" privilege in the userauth table a little above. As a local e-mail address automatically ensures the archive contributor type, archive contributors, editors, and the admin are granted this privilege. This ensures local view only.
Setting the security to "staffonly" makes it available to editors and admin only. This feature can be used to hide the document rather than deleting it from the archive.
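The three cases (undefined, "locally", "staffonly") can be tried out with a stand-in user object. This is a simplified sketch and not the real can_user_view_document: MockUser, can_view and the privilege name "editor" are assumptions made for the example; only has_priv and "local_view_docs" are taken from the text.

```perl
use strict;
use warnings;

# Minimal stand-in for an EPrints user object, for illustration only.
package MockUser;

sub new {
    my( $class, @privs ) = @_;
    return bless { privs => { map { ( $_ => 1 ) } @privs } }, $class;
}

sub has_priv {
    my( $self, $priv ) = @_;
    return exists $self->{privs}->{$priv};
}

package main;

# Simplified decision: takes the security value directly instead of a
# document object. "editor" is an assumed name for the staff privilege.
sub can_view {
    my( $security, $user ) = @_;
    return 1 if !defined $security;          # no restriction set
    if( $security eq "locally" ) {
        return $user->has_priv( "local_view_docs" ) ? 1 : 0;
    }
    if( $security eq "staffonly" ) {
        return $user->has_priv( "editor" ) ? 1 : 0;
    }
    return 0;                                # unknown value: deny
}

my $viewer      = MockUser->new();
my $contributor = MockUser->new( "local_view_docs" );
print can_view( "locally", $viewer ), can_view( "locally", $contributor ), "\n";
```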
Restricting document availability is reserved for editors (and admins) only, mainly because introducing a new box into the submission page made it confusing. Editors, however, can restrict the availability of the document at any time by re-editing it.
In the ArchiveRenderConfig.pm file those documents which have "staffonly" security are not listed at all. Those which have restricted availability are marked by a locked icon, thus warning the casual viewer.
The search pages are rendered in a table format. The first column contains the name and help separated by a break, and the second column contains the input field. The name and help are formed similarly to other entries, and are not put together. This makes it possible to give separate help for fields of the same type (for example, two text fields might require quite different help); also, the name of multiple fields can be anything. In our case one field contains words from a relatively large list. The help info contains a button which presents all the words in a separate window to copy and paste from. This is achieved by the following entries in the phrase-en.xml file:
<ep:phrase ref="eprint_searchname_country">Country</ep:phrase>
<ep:phrase ref="eprint_searchhelp_country">Click on &&<input type="button" value="List of countries" onclick="open('&&&base_url;/help/countries.html&&', '_blank','status=no,toolbar=no,resizable=yes,scrollbars=yes')">&& to get a list to copy & paste.</ep:phrase>
(See also the next section on escaping.)
If a field is too wide, the second column would dominate the whole table. When the field has the (new) property one_column then it is rendered to occupy both columns.
Material in the phrases.xml file finds its way into the final web page via two different mechanisms. The first one is used for big chunks of data, and is performed by the html_phrase() procedure. The material is parsed by an XML parser (that is why it should be properly formatted), and then it is copied with its formatting. Tags between < and > appear with no modification; the text between them is escaped, for example quotation marks are replaced by "&quot;". For example,
<p align = "center"> "Text1"</p> <em> Emphasized </em> "Text2"
becomes
<p align="center"> &quot;Text1&quot;</p> <em> Emphasized </em>&quot;Text2&quot;
Observe that spaces collapsed and the line breaks disappeared.
The second method, performed by the phrase() procedure, evaluates the HTML tags, and then the resulting text is escaped. This means that in the above example the <p> introduces a new line, and <em> disappears as no emphasized text is available in ASCII. In this case the above becomes
"Text1"
Emphasized "Text2"
in two lines, where the first line is centered in a line of 80 characters.
As the name and help given for fields are processed by the second method, it is impossible to make certain words italic or bold there. To overcome this difficulty, the hack in the XML/DOM.pm library file can be used. Text between two double ampersands (&&) is not interpreted, and goes to the final text "as is". For example, to get "Text1 <em>Emphasized</em> Text2" in a help, the phrase should be
Text1 &&<em>&&Emphasized&&</em>&& Text2
If you want a quotation mark " to appear in the final text, it should also be surrounded by double ampersand signs:
&&"&&
You get the same result if the quotation mark is replaced by "&quot;", as the text is parsed first.
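The idea behind the double ampersand can be illustrated with a few lines of Perl. This is not the actual hack in XML/DOM.pm, whose code is not shown here; escape_html and render_with_raw_sections are hypothetical names, and the sketch only shows the principle of escaping everything except the text enclosed between pairs of &&.

```perl
use strict;
use warnings;

# Escape the characters that have a special meaning in HTML text.
sub escape_html {
    my( $text ) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Split the phrase on "&&". Pieces at even positions are ordinary text
# and are escaped; pieces at odd positions sit between a pair of "&&"
# and are copied to the output unchanged.
sub render_with_raw_sections {
    my( $phrase ) = @_;
    my @pieces = split /&&/, $phrase, -1;
    my $out = "";
    for my $i ( 0 .. $#pieces ) {
        $out .= ( $i % 2 ) ? $pieces[$i] : escape_html( $pieces[$i] );
    }
    return $out;
}

print render_with_raw_sections( 'Text1 &&<em>&&Emphasized&&</em>&& Text2' ), "\n";
```

The limit of -1 passed to split keeps trailing empty pieces, so a phrase that ends in a raw section is handled the same way as one that ends in ordinary text.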
In the citation file this escaping comes in handy. When a conference paper is rendered, the authors are typeset in bold face, the proceedings title in italics, and the volume number in bold face as well. This is achieved by the following extract:
<ep:citation type="eprint_confpaper"><span class="citation">
&&<b>&&@authors@&&</b>&& (@year@): <ep:linkhere>@title@</ep:linkhere>, in:
<ep:ifset name="editors">@editors@, Eds. </ep:ifset>
<ep:ifset name="conference">&&<i>&&Proceedings of @conference@&&</i>&&</ep:ifset>
<ep:ifset name="volume">Vol &&<b>&&@volume@&&</b>&&</ep:ifset>
<ep:ifset name="number">(@number@)</ep:ifset>
<ep:ifset name="pages"> pp. @pages@</ep:ifset><ep:ifset name="confloc">, @confloc@</ep:ifset></span></ep:citation>
Each page generated by EPrints has a unique identifier, the pageid. It is used for different page hooks, and it can also be used for a context-sensitive help system. Adding this identifier to the URL of the help page makes it possible for every page to have its own help. This is done by introducing a new pin when generating pages: <ep:pin ref="help" />. In the "template-en.xml" file the reference to the help page should be changed to
<a target="help" href="&base_url;/help/index.html#<ep:pin ref="help" />">HELP</a>
We expect this to appear in the final HTML file as
<a target="help" href="/html/archive/en/help/index.html#submission">HELP</a>
Clicking on "HELP" positions the given page at the "submission" label in the separate, designated window. If that window is not open yet, a new window is opened. Unfortunately there are problems with the above line. First, the window will be a full-fledged browser window; it would be nicer not to have the upper lines, and we could also limit the size of the window. The second and bigger problem is that the XML parser does not allow embedded XML tags. The first problem can be solved by using a small JavaScript function, the second by using the escape mechanism. The following line is inserted into the <head> part of the page:
<script language="JavaScript">&&<!--&&
function ow(win,prm){window.open(&&"&&&base_url;/help/index#<ep:pin ref="help" />&&"&&,win,prm,null);}
&&//-->&&
</script>
We quoted the start and end of the comment, and also the quotation marks, as otherwise they would be replaced by the string &quot;. The "HELP" reference then is the following:
<a href="javascript:ow('HELP','menubar=no,location=no,scrollbars=yes,resizable=yes,width=620,height=470')">HELP</a>
which works.
To differentiate between statically generated pages, the utility generate_static now calls build_page() with the last part of the name of the file being generated as the pageid, instead of "static".
The import utility tries to import a full eprint database. The metadata is taken from the exported archive and document databases; the files come from a copy of the disk00/ filesystem. This makes it possible to have a backup independently of mysql.