Wordcloud Software now available on SourceForge

December 10th, 2010

The software written to create the wordclouds based on the Voyager Library Catalogue search data is now available on SourceForge. We’d welcome the software being used by other libraries and would be happy to discuss possible changes and enhancements to the software.

EnCLaVE Final Blog Post

October 22nd, 2010

Background

EnCLaVE took place in a large research-intensive institution, the University of Edinburgh (UoE), but very much within the context of a consortium of libraries (the SDLC) using the same LMS (Ex Libris Voyager). So the focus was not on any special nature of UoE users but on widely applicable benefits, or potential benefits, across SDLC sites and, therefore, of relevance to a wide community of Voyager users elsewhere.

The project identified Voyager as the focal point, but looked at a range of ideas which could add value, either in terms of the richness of the experience, or by improving catalogue access and visibility.  The project was an excellent opportunity to improve discovery functionality which had been neglected to some extent because of the research-intensive nature of the lead institution; priority, and project funding, had tended to focus on the research management and publication agenda through institutional repository development.

Our project partner, the National Library of Scotland (NLS), was an important contributor as joint lead institution for the SDLC, and also provided a useful perspective as a different kind of institution with a different audience and set of strategic drivers.

The timing of the call was fortunate, as the latest version (7) of the Voyager system had recently been released, with an OPAC module running under Apache Tomcat (‘Tomcat WebVoyage’), built on XML, XSL and CSS, meaning that it was much more customisable than previously. EnCLaVE allowed us to prioritise development work on the OPAC in order to test enrichment options that were previously difficult or impossible to implement.

The rather dull 'unenriched' OPAC

Intended outcomes

EnCLaVE was divided into four workpackages. Workpackages 1 and 2 were about evaluating Voyager against AquaBrowser. UoE had implemented AquaBrowser mainly because it allowed us to deliver a more functional catalogue interface to our users. We felt we needed to test the ability of Voyager to replicate AquaBrowser functionality, as this could result in system savings, and reduced user confusion, by no longer running two systems to do, broadly speaking, the same thing. For NLS, who had put more effort into developing AquaBrowser as a frontend for a range of services/content types, the issues were different, and more about focusing thought on understanding ways forward for the wider digital architecture, discovery experience and strategy.

Workpackage 3 informed the project name, as it was about embedding the OPAC module within a different system, to raise its visibility and ease of use for an important community: WebCT Virtual Learning Environment (VLE) users. For technical reasons, work on this workpackage was delayed and we have been granted JISC approval for an extension to see this work through.  WP3 will be reported on separately in due course (estimated as being late November 2010).

Workpackage 4 was very much about innovation. It aimed to build on recent work to use search data to generate wordclouds. To date, these wordclouds have been used for display purposes only, but to good effect, bringing the digital library into the new physical space that was created as part of the UoE Main Library Refurbishment Project.

What we have done through EnCLaVE is develop our thinking on wordclouds to enable them to become search tools by associating links with the terms, and to think about other types of clouds, in addition to those created by user generated searches.

The challenge

The project sought to understand the differences between Voyager and AquaBrowser, both in terms of technical functionality and user behaviour:

  • Do our users use AquaBrowser in preference to Voyager? Why/why not?
  • What are the key benefits of AquaBrowser?
  • What functionality is it now possible to replicate in Voyager?

We conducted our own testing, and have also followed the work of the AquabrowserUX project, coincidentally also at UoE, which carried out testing with our own users.

We recognised that enrichment was only part of the challenge. A richer search tool is much more valuable if highly visible to the community, and available within an appropriate context, which is why we wanted to do more than we have been able to previously to embed the LMS into the VLE.

Finally, the wordcloud element of the project was trying to take advantage of what was at the outset almost a ‘throwaway’ idea to find something to make use of the new plasma screens in the refurbished main library concourse. This idea has gone down extremely well with both the user and professional community; could we tap into this and develop useful new tools on top of it?

Established practice

UoE wanted to move away from a situation where we effectively forced the user to make a decision (see image) or had to make that decision for them (e.g. by defaulting to a particular choice of catalogue interface on a kiosk machine in a library building).

'Choose your weapon': users are required to pick an OPAC

NLS were interested to know how the Voyager OPAC fitted into an environment where AquaBrowser was increasingly high profile as an entry point to the digital library.

In terms of access, we were keen to develop greater integration than is presently possible within WebCT.

The rather uninspiring connectivity between Library and VLE at present

Mockup of Voyager embedded into WebCT

Regarding our plans for word clouds, AquaBrowser already makes use of this functionality, so we wanted to look at user practice with this tool, and compare the approach it uses to ways in which Voyager generated clouds might be deployed.

The LMS advantage

Workpackage 1 found significant overlap between AquaBrowser and Voyager 7 OPAC functionality.  It was recommended that we implement:

  • a simple search box and search refine options
  • clear layout of search results and full record display
  • spell checking functionality
  • a basic mobile version
  • word cloud functionality
  • content enrichment and social networking tools
  • additional help

Full detail is available in the Workpackage 1 Report (PDF file).

As a result, we have worked, in consultation with the SDLC LMS team, to upgrade Voyager sites to version 7, and have done initial testing of the additional functionality identified in WP1.

Book jackets and filters: new Voyager OPAC interface (on test server)

Content enrichment is available from various providers, and we looked at both Syndetic Solutions and Google Books. Syndetics offers us book covers and tables of contents; Google Books offers more, sometimes including book chapters, reviews and comments from the author. The implementation was fairly straightforward and the services were then evaluated by our Subject Librarians.

We implemented the Shelf Browse feature from the University of Wisconsin, Oshkosh in our test environment, using Voyager web services, and compared it to the Google Books Shelf Browse. The users who reviewed the functionality really liked the way it was presented in the catalogue, as it recreated some of the experience of serendipitous browsing. The Google Books shelf browse is less valuable as it shows associated books, but not how they are co-located on our library shelves.

We need to do some further work on this before releasing to users, to customise the metadata elements displayed and also to seek to improve its reliability in returning results. It does add some value beyond simply having book covers associated with bibliographic records in the catalogue.

Wordcloud and VLE integration remain at the prototyping stage, but we will undertake some user testing post-project.

Key points for effective practice

From our experience, we would highlight the following points:

  • Undertake user testing to test assumptions, reveal unknown issues and drive development according to user need
  • Content enrichment: Comments on both services were broadly positive but there were some concerns raised that having two sources for this information was potentially confusing for users, and that the Google Books service often presented different book covers and editions to those in the Library Catalogue. We will investigate Google Books provision and quality a little more and have decided just to go live with Syndetics content for the moment.
  • Relating to embedding the OPAC into WebCT, we encountered some issues with configuring the Voyager web services for use by the project. This was further complicated by a security patch being applied to Voyager just as we started working on the problem, which seems to have broken our access to the web services. We still intend to complete this work once the problem has been overcome, and this will be reported on as soon as we can, alongside any generally applicable advice for Voyager sites interested in doing similar work.
  • It is clear that word clouds generated simplistically from search terms lack value for searching.  Further work is being done on this as blogged recently.

Voyager OPAC with embedded wordcloud

Conclusions and recommendations

The key conclusions of the project were:

  • Voyager 7 is a much more functional OPAC module, and could, if necessary, provide a satisfactory single route into the catalogue with the enrichment options until recently only available through the application of a separate third-party interface system like AquaBrowser.
  • There are differences in how AquaBrowser and Voyager handle some functionality, and some AquaBrowser functionality is not native to Voyager but may be developed on top (such development work may already have been done by Voyager sites and made available to others to use – see the WP1 report for details).
  • Wordclouds generated from usage data have attracted much professional interest and, if nothing else, demonstrably serve to raise the profile of the library to internal audiences and external peer groups. However, more work is required to understand the value of word clouds as search tools. It is noted that some usability testing has demonstrated that the AquaBrowser word cloud may be helpful, but the AquabrowserUX study recorded no use of this function.

Additional information

Workpackage 1 Report

UoE commissioned Usability Testing Report (AquaBrowser slides) (PowerPoint file)

AquabrowserUX commissioned Usability Test Report

Voyager 7 Tomcat OPAC documentation <to follow>

Wordcloud documentation <to follow>

EnCLaVE presentation to IGeLU (PowerPoint file)

EnCLaVE pecha kucha presentation to RLUK 2010 Conference (forthcoming, November 2010)

Simon Bains
Project Manager

Is the future of search in the cloud?

October 21st, 2010

EnCLaVE has directed one workpackage at the notion that word clouds are a potentially useful search tool for the LMS. AquaBrowser already offers a word cloud, although its graphical representation is quite different from the increasingly popular ‘Wordle’ approach, which we have adopted at Edinburgh as a way of depicting search activity in Voyager.

Voyager word cloud

The AquaBrowser approach is based not on previous searches but on word associations, translations, spelling variations and thesaurus terms:

AquaBrowser word cloud

After the recent JISC LMS meeting  in Glasgow, and some internal discussion around the first phase of wordcloud development, some issues were identified for examination:

1.    Single word vs phrase searching

Phase 1 of development had stripped all association from the search terms, including breaking up phrases into single words. This was initially done to allow the frequency of search words to be counted so that we could then represent the most searched for words more prominently in the resultant word cloud. This was okay for the original purpose, which was a static display in the Main Library entrance hall.
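The Phase 1 counting step described above can be sketched roughly as follows. This is an illustration only, written in Python rather than the PHP or Java used by the project; the function names, the tokenising rule and the pixel range are assumptions made for the example, not the project's actual code.

```python
import re
from collections import Counter


def count_search_words(searches, top_n=50):
    """Split each logged search into single words and count them, ignoring case."""
    counts = Counter()
    for search in searches:
        # Lower-casing stops 'History' and 'history' being counted separately.
        counts.update(re.findall(r"[a-z0-9']+", search.lower()))
    return counts.most_common(top_n)


def font_sizes(word_counts, min_px=12, max_px=48):
    """Map each word's count linearly onto a font size in pixels."""
    if not word_counts:
        return {}
    highest = max(count for _, count in word_counts)
    lowest = min(count for _, count in word_counts)
    spread = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        word: round(min_px + (count - lowest) * (max_px - min_px) / spread)
        for word, count in word_counts
    }


# Hypothetical sample of logged searches.
log = ["History of Scotland", "scotland maps", "English history"]
print(font_sizes(count_search_words(log)))
```

Because every phrase is broken into single words, the most prominent entries are inevitably very general terms, which is fine for a display but less helpful once the words become links.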

Now we have integrated the word cloud tool with the LMS, and made the search terms clickable by users to allow searches to be repeated, it is clear that the entries in the word cloud are not hugely useful for this purpose. The most frequently counted words tend to be words like ‘English’, ‘Scotland’ and ‘History’, and these return many thousands of hits in our catalogue.

The word cloud on the LMS is, therefore, more of a decorative item at present, bringing variety to the search pages, but not adding any value to the user’s search. This is very different from the same process in AquaBrowser: while a search for ‘history’ returns many hits, the word cloud provides information on ways to refine that search.

We are working to change the word cloud from using single words for display to using phrases, as entered by the user. As the incidence of repeated phrases is quite low, we intend to stop counting duplicate searches and just display all the phrases at the same value and level. We currently parse the last 2000 searches on the Library catalogue to create the wordcloud; for a phrase-based word cloud we aim to present just the last 20 searches instead.
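A minimal sketch of the phrase-based selection, again in Python and purely illustrative. It assumes the logged searches arrive ordered oldest to newest, and it reads "stop counting duplicate searches" as showing each repeated phrase once; both points are assumptions, as is the function name.

```python
def recent_phrases(searches, limit=20):
    """Return the most recent distinct search phrases, newest first.

    `searches` is assumed to be ordered oldest to newest.
    """
    seen = set()
    phrases = []
    for search in reversed(searches):
        phrase = " ".join(search.split())  # collapse stray whitespace
        key = phrase.lower()               # treat case variants as duplicates
        if phrase and key not in seen:
            seen.add(key)
            phrases.append(phrase)
        if len(phrases) == limit:
            break
    return phrases
```

Every phrase returned would be displayed at the same size, so no frequency count is kept.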

2.    Search context

The other area we are investigating is looking at the context of a search. The current word cloud is generated after we have stripped everything but search terms from the search logs. When the word cloud term is clicked it does a keyword search for that term, whether it was originally an author search or a title search. This affects the quality of the search returns, and the value of using the word cloud.

As we move to phrase searching we will be looking at using the last 20 searches as they are, rather than breaking these up to facilitate word counting and emphasis. We will also look to retain the type of search index used originally and embed that in the presentation via the word cloud. This should make the word cloud more useful to users, as it will recreate much more of what the original users intended when making their search.
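One way of carrying the original search type through to the clickable link might look like the sketch below. The endpoint URL and the `q` and `index` parameter names are hypothetical placeholders, not Voyager's real URL syntax, and the `LoggedSearch` structure is invented for the example.

```python
from dataclasses import dataclass
from html import escape
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; a real catalogue will differ.
SEARCH_URL = "https://catalogue.example.org/search"


@dataclass(frozen=True)
class LoggedSearch:
    phrase: str  # the search exactly as the user typed it
    index: str   # the index originally searched, e.g. "author", "title", "keyword"


def cloud_link(search):
    """Build an HTML link that repeats the search against its original index."""
    query = urlencode({"q": search.phrase, "index": search.index})
    href = escape(f"{SEARCH_URL}?{query}")
    label = escape(search.phrase)
    return f'<a href="{href}" title="{escape(search.index)} search">{label}</a>'


print(cloud_link(LoggedSearch("Scott, Walter", "author")))
```

Keeping the index alongside the phrase means an author search is repeated as an author search, rather than being flattened into a keyword search.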

We aim to have this work completed, and to report further, by the end of October. We hope to evaluate what users think of the finished word cloud as a tool for searching the Library Catalogue, ideally by connecting with the UX2.0 team who ran the AquabrowserUX JISC LMS project.

Project meeting notes: 24 Sept 2010

September 29th, 2010

Notes of EnCLaVE project team meeting of 24 September  2010

1. Apologies

  • None
  • In attendance: Simon Bains (SB), Gill Hamilton (GH), Stephen Vickers (SV), Morag Watson (MW)
  • Noted DF has handed over programme mgt to Ben Flanders

2. Elevator pitch

  • Noted GH/SV comments on SB’s draft.
  • Action: GH to upload the document to the blog. [available]

3. Work Packages progress

WP1

  • Noted GH has met with Boon Low (#aquabrowserux #jisclms project)
  • Draft report discussion
  • Noted some of the recommendations have already been implemented:
    • Simple search box
    • Refine options (not the full functionality of AquaBrowser; there are some limitations)
    • GH to identify which fields should be in results list and full display
    • GH to look at matching terminology with that in Aquabrowser.
    • Spell-checking is in test
    • The wordcloud is being modified to avoid issues with uppercase being given more significance, and to include phrases and other indexes.
  • Noted interest in the wordcloud code at the #jisclms programme meeting in Glasgow
  • Action: GH will explain how the priority list was drawn up.
  • Noted that WP1 findings match the ongoing development work.
  • Noted report needs to distinguish between work being done, and work we’d like to do post-project, e.g. Aberystwyth-style recommender service.
  • Wordcloud: Noted #aquabrowserux study found that users found it useful once it was explained to them. Action: WP1 report needs to clarify this.
  • WP1 needs to identify the difference between the wordclouds, and possibility of further research on this.
  • Mobile: noted need to be clear about what the survey tells us about use of services on mobile devices, and need for further studies.
  • Action: GH will redraft to a close to final draft by 1 October. Noted that locations of config in Voyager to be reused by other Voyager users won’t be ready in that timescale.

WP2

Voyager OPAC 7 development.

  • Noted this has been running in tandem with WP1
  • Noted some functionality has gone live
  • Some aspects aren’t as functional as AquaBrowser, e.g. table of contents isn’t searchable
  • Google Books/Shelf browse have been investigated.
  • The project team has found some useful results from this which will force some decisions on how to implement, and a need for more investigation.
  • Noted that report on what it was like to configure might be helpful, as some of this work has been done by staff not involved in #enclavelms
  • Noted that config info could be placed in WP2 output
  • Noted that we should be producing a toolkit, so we need to record configuration settings etc so other institutions could replicate.
  • Action: MW will follow up with Colin Watt, and will document putting some functionality back into test (e.g. Google Books).

WP3

WebCT/OPAC

  • Noted it’s proved difficult to prioritise this as the work has had to focus on WP2 before start of semester
  • MW will turn on web services in the next couple of days. ACTION: MW will alert SV so he can test that it’s working correctly.
  • Noted that there will now not be time to do testing with students in the time left to the project. However, this will be done post-project, as we have buy-in from an academic in the relevant School.
  • It was felt that it would be more realistic instead to look at wordcloud generation within WebCT, which was not originally in scope.
  • Noted that we should look at LSE Voyager widgets. ACTION: SB to send SV details.

WP4

Wordcloud

  • ACTIONS:
    • MW will write up the work to date and post to the Blog.
    • MW/GH will post a report on IGeLU and link to the presentation

4. Date of next meeting

12 October

EnCLaVE elevator pitch

September 27th, 2010

The ‘Only to the First Floor’ pitch

With appropriate acknowledgements to The Guardian’s John Crace and his ‘Digested read, digested’, the one floor elevator pitch is:

“EnCLaVE aims to understand Library Management System (LMS) users better, reach them more effectively, provide them with a richer experience, and provide the wider community with the ability to do the same. Embedding the LMS into the Virtual Learning Environment adds user and course context to any search”.

The ‘All the way to the top of Edinburgh’s Appleton Tower’ pitch

The University of Edinburgh and the National Library of Scotland are the joint lead institutions for the Scottish Digital Library Consortium (SDLC), with a long record of working together for mutual benefit. Since the late 90s, the SDLC has operated a shared LMS infrastructure, running the Ex Libris Voyager application on hardware based at, and supported by, the University of Edinburgh. Both organisations also use the Aquabrowser discovery system.

Visualisation of the EnCLaVE project

SDLC partners have been considering how to provide the best possible user experience, given the assumption that a ‘next generation’ LMS is still several years away, and the JISC LMS programme offered an opportunity to investigate a number of opportunities:

  • Whether the recently upgraded Voyager OPAC module could be enhanced to replicate Aquabrowser functionality
  • Whether more could be done to expose the catalogue more effectively in key user spaces (the WebCT Virtual Learning Environment was identified for the purposes of the project)
  • Whether new routes to discovery would add value to the user experience, building on previous work to generate wordclouds dynamically from catalogue searches

The intended outcome of our work is:

  • A richer experience for the user
  • A more relevant experience for the user
  • A better understanding of what the user expects and requires
  • Transferable functionality which Edinburgh University Library can offer to SDLC partners
  • Transferable functionality with appropriate documentation which other Voyager users can deploy in their own environments
  • An improved understanding of user behaviours
  • An assessment of digital library architecture, supporting decision-making on how to deploy different discovery layers, and whether extra layers, at additional cost, add significant value

Some evaluation work has been done to understand how users use AquaBrowser, and what they particularly like about it. We have also benefited from having a relevant partner JISC LMS project on our doorstep (AquaBrowserUX) conducting user testing with Edinburgh University’s students on our AquaBrowser installation.

Project team notes: 20 August 2010

May 16th, 2012

Notes of EnCLaVE project team meeting of 20 August 2010

1. In attendance

  • Simon Bains (SB), Gill Hamilton (GH), Stephen Vickers (SV), Morag Watson (MW)
  • No apologies

2. Workpackage reports

WP1

  • Draft report on WP1 has been prepared for review.
  • Report will be ready by the meeting with DF next week.
  • ACTION: GH to circulate draft
  • ACTION: All to comment over the next week

WP2

  • Voyager 7 OPAC has been created for library staff testing
  • ACTION: copy the config across to EnCLaVE test OPAC by Friday next week – MW to arrange
  • We have now done integration with third party content.
  • WP1 draft report has tagging examples we will look at for integration. ACTION: MW

WP3

  • SV has reviewed Voyager API information and is now considering the most effective approach
  • Team members met to discuss this in August
  • ACTION: SV will aim to have some of the APIs working by next week’s meeting
  • ACTION: SV, MW and Java developer to meet Monday next week to discuss.
  • Testing may not be possible until start of semester. We would hope to use UX2/JISCLMS Aquabrowser project team for this.

WP4

  • Code has been rewritten in Java and embedded on the front page of the EnCLaVE OPAC.
  • Links are now clickable.
  • Thinking is ongoing about other sorts of wordcloud and how to represent them, e.g. subject coverage based on top searches or number of uses of subject terms.
  • Looking at the possibility of course-specific wordclouds for WebCT.
  • SV and MW will look at how to get the data into WebCT
  • MW also thinking about the value of wordclouds across different services vs application-centric approaches.
  • Noted that MW and GH are doing a presentation on the EnCLaVE project at IGeLU  at end of August (Ex Libris European User Group meeting).

3. Programme manager visit

  • ACTION: SV to look at possible tours of new university spaces (Business School, Medical building)
  • ACTION: SB to see if an Informatics Forum tour is possible
  • ACTION: All to ensure they are up to date with the activities of other JISC LMS projects next week.
  • ACTION: MW to catch up with Boon Low about engagement between our two LMS projects.

4. ECDL LMS meeting

  • Need to ask DF what will be required of us for this when he visits.

5. AOB

  • Blog PHP bug needs to be fixed, as dates aren’t working correctly on blogs. ACTION: MW

Wordcloud functionality for Voyager Library Catalogue

May 16th, 2012

Morag Watson and Robin Taylor met to review the existing word cloud tool built in PHP. This wordcloud tool was initially built for the reopening of the ground floor of the University of Edinburgh Main Library and was designed to be displayed on Holopro screens in the Main Entrance Hall. A presentation on the development was given at the International Users Group Meeting for Ex Libris; this presentation is available at http://tiny.cc/22ha1

There were some obvious initial issues that needed to be addressed. The wordcloud tool for the Library Catalogue had originally been built in PHP but the interface for the Catalogue had now been updated to run on Tomcat. Therefore it seemed sensible as a first step to consider its migration to JavaScript. We had used JavaScript as part of our initial development but that was for a word cloud based on the institutional repository. As the wordclouds had initially been designed to be projected, there was no need to make them interactive; this was now an opportunity to investigate this.

We have therefore focussed our first stage of development on, firstly, changing the code to JavaScript to work with the Tomcat interface and, secondly, making the search terms clickable. You can see the output of this development at http://eddb-enclave.lib.ed.ac.uk which is a test version of our Library Catalogue.

The next steps are to liaise with colleagues in E-Learning to consider how to embed this in the VLE, and also to investigate whether the word clouds can add value for other search types in the Library Catalogue, such as author or subject.

Identifying search functionality for the VLE

May 16th, 2012

Morag, Gill and Stephen met to discuss the types of integration with the VLE which we should try to explore as part of this project. Whilst the main focus will be on adding functionality within WebCT (now owned by Blackboard), as this is The University of Edinburgh’s centrally supported VLE, the intention is to be as VLE-independent as possible with the design, and the development of a Building Block for Learn 9 will provide evidence of the reusability of this design and the code libraries.

The value being added by embedding library search functionality within a VLE is:

  • convenience – the VLE is a common place for students to visit on a regular basis
  • context – knowledge about where enquiries are being made from
  • community – ability to aggregate data collected from users with a common interest

Project ideas arising from the discussion were:

  • basic catalogue search functionality
  • access to course-specific search options set by the instructor
  • clickable word clouds aggregating recent searches performed by different groups of users; for example, course, school/department, institution
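As a rough illustration of the last idea, recent searches could be grouped by the context they were made in before a cloud is drawn for each group. The sketch below is in Python, and the record layout (dictionaries with "phrase", "course", "school" and "institution" keys) is hypothetical; it is not something WebCT or Voyager provides in this form.

```python
from collections import defaultdict


def group_recent_searches(events, level, limit=20):
    """Group search phrases by a context field, keeping the newest `limit` per group.

    `events` is an iterable of dicts ordered oldest to newest; the keys
    "phrase", "course", "school" and "institution" are hypothetical.
    """
    groups = defaultdict(list)
    for event in events:
        groups[event[level]].append(event["phrase"])
    # Keep only the most recent phrases in each group, newest first.
    return {name: phrases[-limit:][::-1] for name, phrases in groups.items()}
```

Calling the function with `level="course"`, `level="school"` or `level="institution"` would give the three kinds of cloud mentioned above from the same search log.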

This is still a work in progress, so please add your own views and reflections to this site.

Project team notes: 2 June 2010

May 16th, 2012

Notes of EnCLaVE project team meeting of 2 June 2010

1. In attendance

  • Simon Bains (SB), Gill Hamilton (GH), Stephen Vickers (SV), Morag Watson (MW)
  • No apologies

2. Matters arising

  • Core resources form: completed
  • Collaboration agreement: completed and signed
  • Project plan: picked up under individual WPs below.
  • Blog: Logo fine
    • ACTION SV to try and upload but may need support if no admin access.
    • ACTION: SV to arrange for a version for use elsewhere e.g. PPT
  • Time recording: carried forward: ACTION SB.

3. Work packages

WP1:

  • GH and MW have met to discuss methodology
  • GH has reviewed UoE usability reports
  • GH starting to discuss with UoE colleagues
  • Most of WP1 work is expected to have been completed by the next meeting
  • MW has met with Boon Low, PM of the UX2.0 and Aquabrowser #jisclms projects
  • ACTION: GH to arrange to meet with Boon and visiting Aquabrowser engineer w/c 7 June [completed]
  • Noted: Aquabrowser study can link with ours so that user evaluation informs our study.
  • ACTION: SB to meet with Boon Low shortly to discuss this.
  • ACTION: MW to supply a list of relevant questions for asking students through Aquabrowser study.

WP2:

  • An instance of Voyager for project development has been requested but won’t be available until mid-June.
  • MW looking at the mobile skin for Voyager and other new functionality which will be investigated as part of EnCLaVE.

WP3:

  • SV and MW are meeting very shortly to discuss WP3. GH will attend for additional expertise in the new OPAC.

WP4:

  • We will be using the Aquabrowser jisclms project to inform us of the value of the Aquabrowser wordcloud
  • We are investigating options for wordcloud searches
  • UoE developers are presently fully committed, so resource will not be available until 1 July at the earliest.
  • EnCLaVE will be presenting a paper on this at the Ex Libris User Group (IGeLU)  meeting in Ghent in September

WP5:

  • All JISC paperwork has been completed, and information posted to the blog and tagged as required.

4. Issues and risks:

  • ACTION: GH to set up blog so tags are showing, and we will use ‘risk’ and ‘issue’ tags.
  • Noted that Summer absences need to be recorded as a risk.
  • Developer availability has not yet been confirmed, although it is scheduled. A risk associated with the levels of commitments for developers in the Digital Library Section needs to be recorded.
  • Voyager development OPAC has not yet been provided – this should be recorded as a (low) risk.
  • Noted that there is a risks table in the proposal, which we should review: ACTION: SB

5. AOB

None.

Project Plan post 7: Budget

May 16th, 2012

EnCLaVE Project Budget