EPrints Romeo AJAX lookup widget

The EPrints AJAX lookup uses the publisher copyright policies from SHERPA/RoMEO and is based on work by Ian Stuart at Edina. http://lucas.ucs.ed.ac.uk/test/ajax-romeo.html

The lookup combines the AJAX autocomplete built into EPrints with Ian Stuart's Perl script, which queries the SHERPA/RoMEO API to help a depositor supply valid publisher details during the repository submission process.
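
To make the request concrete, here is a rough shell sketch of the query URL the script assembles for a typed title. The sample title is invented for illustration, and the endpoint is copied from the script further down this page; it may well have moved since.

```shell
# Hypothetical title, as a depositor might type it (the script lower-cases input).
title="journal of applied physics"

# Replace whitespace with '+' so the title can sit in a query string.
encoded="$(printf '%s' "$title" | sed 's/[[:space:]][[:space:]]*/+/g')"

# Endpoint as it appears in the script on this page; treat it as an assumption.
query="http://www.sherpa.ac.uk/romeoapi11.php?qtype=starts&jtitle=$encoded"
echo "$query"
```

Fetching that URL with a tool such as curl should show the XML that the script then parses, provided the API is still available at that address.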

How to install

Install EPrints following the instructions at http://wiki.eprints.org/w/EPrints_Manual

Once EPrints is installed and set up, log in to the server as the eprints user and cd to the following directory:

<eprints install location>/archives/ARCHIVEID/cfg/workflows/eprint/

Back up default.xml, then edit it. Find the following section:

<epc:if test="type = 'article'">
    <field ref="publication" required="yes" input_lookup_url="{$config{rel_cgipath}}/users/lookup/journal_by_name" />
    <field ref="issn" input_lookup_url="{$config{rel_cgipath}}/users/lookup/journal_by_issn" />
    <field ref="publisher" />
    <field ref="official_url"/>
    <field ref="volume"/>
    <field ref="number"/>
    <field ref="pagerange"/>
    <field ref="date"/>
    <field ref="date_type"/>
    <field ref="id_number"/>
</epc:if>

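Change it to the following, so that the publication field uses the new get_journals lookup and the issn field no longer has a lookup of its own: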

<epc:if test="type = 'article'">
    <field ref="publication" required="yes" input_lookup_url="{$config{rel_cgipath}}/users/lookup/get_journals" />
    <field ref="issn" />
    <field ref="publisher" />
    <field ref="official_url"/>
    <field ref="volume"/>
    <field ref="number"/>
    <field ref="pagerange"/>
    <field ref="date"/>
    <field ref="date_type"/>
    <field ref="id_number"/>
</epc:if>
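
The back-up-then-edit step might look roughly like this in shell. WORKFLOW_DIR and the one-line sample default.xml are stand-ins invented so the sketch can be run safely anywhere. On a real server the directory would be the workflows/eprint directory shown above, and the edit would be made by hand in a text editor rather than with sed.

```shell
# WORKFLOW_DIR is a stand-in for
# <eprints install location>/archives/ARCHIVEID/cfg/workflows/eprint/
WORKFLOW_DIR="$(mktemp -d)"

# One-line sample standing in for the real default.xml.
printf '%s\n' '<field ref="publication" required="yes" input_lookup_url="{$config{rel_cgipath}}/users/lookup/journal_by_name" />' > "$WORKFLOW_DIR/default.xml"

# Keep a copy of the original, preserving timestamps and permissions.
cp -p "$WORKFLOW_DIR/default.xml" "$WORKFLOW_DIR/default.xml.orig"

# Stand-in for the manual edit: point the lookup at get_journals.
sed 's#lookup/journal_by_name#lookup/get_journals#' \
    "$WORKFLOW_DIR/default.xml.orig" > "$WORKFLOW_DIR/default.xml"

# Review the change; diff exits non-zero when the files differ, which is expected here.
diff "$WORKFLOW_DIR/default.xml.orig" "$WORKFLOW_DIR/default.xml" || true

# Count the lines that now reference the new lookup.
grep -c 'get_journals' "$WORKFLOW_DIR/default.xml"
```

Keeping the .orig copy next to the edited file makes it easy to revert if the workflow fails to load.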

Next cd to the following directory.

<eprints install location>/cgi/users/lookup

Using the touch command create a file called get_journals

$ touch get_journals

Now change the file permissions so that a directory listing shows the following:

-rwxrwxr-x 1 eprints eprints 2967 2008-11-26 14:00 get_journals

Use the chmod command to do this:

$ chmod 775 get_journals
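
The create-and-chmod steps can be tried out together and checked as in the sketch below. LOOKUP_DIR is a temporary stand-in (an assumption, so the commands are safe to run anywhere); on a real server it would be the cgi/users/lookup directory given above.

```shell
# LOOKUP_DIR stands in for <eprints install location>/cgi/users/lookup
LOOKUP_DIR="$(mktemp -d)"

# Create the empty script file and make it executable (rwxrwxr-x).
touch "$LOOKUP_DIR/get_journals"
chmod 775 "$LOOKUP_DIR/get_journals"

# Confirm the result: the listing should begin with -rwxrwxr-x.
ls -l "$LOOKUP_DIR/get_journals"
```

The owner and group in the listing will be whoever ran the commands, which is why the steps are carried out as the eprints user.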

Open a text editor, then copy and paste the following code into the get_journals file:

use strict;
use HTTP::Request;
use LWP::UserAgent;
use XML::Twig;

use Data::Dumper;
use EPrints;

my $journal_data = {};

sub urldecode{
  my ($url) = @_;
  $url =~ s/\+/ /g; # swap every '+' for ' ' first, so an encoded %2B stays a '+'
  $url =~ s/%([0-9a-f][0-9a-f])/pack("C",hex($1))/egi;
  return $url;
}

# XML::Twig's routine for dealing with a journal entry
sub process_journal {
  my ( $twig, $journal ) = @_;

  # get the components
  my $title = urldecode( $journal->first_child('jtitle')->text );

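  # "my $x = ... if COND" has undefined behaviour in Perl when COND is false,
  # so declare the variables first and assign only when the element exists
  my ( $zetoc, $romeo, $issn );
  $zetoc = urldecode( $journal->first_child('zetocpub')->text )
                  if $journal->first_child('zetocpub');
  $romeo = urldecode( $journal->first_child('romeopub')->text )
                  if $journal->first_child('romeopub');
  $issn  = urldecode( $journal->first_child('issn')->text )
                  if $journal->first_child('issn');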

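  my $publisher = $romeo;
  # fall back to the Zetoc publisher only when RoMEO did not supply one
  # ('not' binds more loosely than '&&', so use '!' here)
  $publisher = $zetoc if (!$publisher && $zetoc);
  
  # build a blob of html based on the components
  my $html = "<li>$title";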
  $html .= "<br />published by $publisher" if $publisher;
  
  $html .= "<ul>";
  if ($title) {
      $html .= "<li id='for:value:component:_publication'>$title</li>";
  }
  if ($publisher) {
      $html .= "<li id='for:value:component:_publisher'>$publisher</li>";
  }
  if ($issn) {
    $html .= "<li id='for:value:component:_issn'>$issn</li>";
  }
  $html .= "</ul></li>\n";
  warn "\n\n$html\n\n";
  # save the html
  $journal_data->{$title} = $html;

  1; 
} ## end process_journal

# get a list of journals that match the query
sub get_journals {
  my $journal = shift;
  my @html = ();

  if ($journal) {
    return ("<ul><li>keep typing....</li></ul>") if (length($journal) < 3);

    $journal =~ s/\s+/+/g; # replace every run of whitespace with '+'
    my $query = "http://www.sherpa.ac.uk/romeoapi11.php?qtype=starts&jtitle=$journal";

    my $request = HTTP::Request->new( GET => "$query" );

    my $ua = LWP::UserAgent->new();
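    my $response = $ua->request($request);
    # give up quietly if the RoMEO API could not be reached
    return "<!-- RoMEO lookup failed -->\n" unless $response->is_success;
    my $content = $response->content();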

    my $twig = XML::Twig->new(
                       'keep_encoding' => 1,
                       'TwigRoots' => { 'journals' => 1 },
                       'TwigHandlers' => { 'journal' => \&process_journal, }
                      );
    $twig->parse($content);
    if (scalar keys %{$journal_data}) {
      push @html, "<ul class='journals'>\n";
      foreach my $title (sort keys %{$journal_data}) {
        push @html, "$journal_data->{$title}\n";
      } ## end of  foreach my $title (sort keys %{$journal_data})
      push @html, "</ul>\n";
    } ## end of if (scalar keys %{$journal_data}) ...
  } else {
    push @html, "<!-- No journal name supplied -->\n";
  }

  return (join "\n", @html);

} ## end get_journals

my $session = EPrints::Session->new();

# we need to send an initial XML declaration
print <<END;
<?xml version="1.0" encoding="UTF-8" ?>

END

# then we send the fragment of html for the autocompleter
print get_journals( lc $session->param( "q" ) );

$session->terminate;

For the changes to take effect you will need to restart Apache. This would normally be done as the root user.

Log in to your EPrints repository and create a new deposit. In the Edit Item > Details section, under Publication Details, typing a few letters into the "Journal or Publication Title" text box (the script waits for at least three characters) should now bring up a drop-down list from which you can confirm your journal or publication title. Once you have selected the correct one, the title box will be filled in with the correct details and, if the RoMEO database has them, the ISSN and publisher fields will also be completed.

There is an example of this script running here

  1. Jan 29, 2009

    Anonymous says:

    The SHERPA/RoMEO API has been updated - the URL you need to use has changed from:

    my $query = http://www.sherpa.ac.uk/romeoapi.php?qtype=starts&jtitle=$journal;

    to

    my $query = http://www.sherpa.ac.uk/romeoapi11.php?qtype=starts&jtitle=$journal;

    Cheers,

    John Salter

    White Rose Research Online

  2. Oct 30, 2009

    Anonymous says:

    The latest version of the SHERPA/RoMEO API is V.2.4.2, as of 29-Oct-2009 - the URL therefore needs to be changed from:

    my $query = http://www.sherpa.ac.uk/romeoapi.php?qtype=starts&jtitle=$journal;

    or

    my $query = http://www.sherpa.ac.uk/romeoapi11.php?qtype=starts&jtitle=$journal;

    to

    my $query = http://www.sherpa.ac.uk/romeo/api24.php?qtype=starts&jtitle=$journal;

    From a quick scan of the above code, I reckon the new version should work without any further changes, although I have not tested it as I am not a user of the widget. Would someone like to try this and report back here?

    Cheers

    Peter Millington

    SHERPA Technical Development Officer

    University of Nottingham

    peter.millington@nottingham.ac.uk
