Issues on search ranking from within a repository and evaluation methods
Survey on current practices in relevancy ranking
----------- Notes from the 4th meeting on 9 March 2017 -----------
Siri J. Khalsa (U. Colo.)
Mingfang Wu (ANDS)
- MW to email the IG members about the survey questionnaire and get their feedback
- MW to draft the task progress report for RDA9
----------- Notes from the 3rd meeting on 23 Feb. 2017 ---------------
Dawei Lin (DAIT, NIAID/NIH)
Peter Cotroneo (Elsevier)
Jens Klump (CSIRO)
Siri J. Khalsa (U. Colo.)
Mingfang Wu (ANDS)
In the last two meetings, participants identified many variations that may affect relevancy ranking. However, there is a concern that conducting any experiment might be beyond this group’s capacity.
In this meeting, the participants proposed two activities:
- Carry out an environmental scan of search challenges organised by the IR (information retrieval) research community (e.g. TREC and INEX), the data search community (e.g. the bioCADDIE 2016 Dataset Retrieval Challenge) and so on. We should especially look for challenges whose test collections resemble those of data repositories (structured metadata + linked information), and for the outcomes of addressing these challenges.
- Conduct a survey of data repositories covering their search benchmarks, their evaluation methods, the search algorithms/strategies being tested, etc., and asking whether the owners of these repositories are willing to provide their collections for testing or to participate in any search challenge experiments. The collected information could help data repositories benchmark against each other, learn from each other’s lessons, and/or address common challenges.
Action: MW and PC to put together an initial set of questions next week; the group will discuss the questions offline and try to finalise the survey questionnaire at the next meeting on Thursday, 9 March.
Siri supplemented the notes with his further recollections of the discussion:
Two driving questions are: "what can the group realistically accomplish?" and "what goal are we trying to achieve with our outputs?". Towards the latter, some thoughts were:
- help people choose appropriate technologies when implementing or improving search functionality at their repositories
- categorize solutions based on type of holdings/metadata/users(?)
- provide a means or forum for sharing experiences with relevancy ranking
- to support the above, capture the aspirations, successes and challenges encountered from repository managers
There is a difference between searching structured data/metadata with controlled vocabularies and searching unstructured data, where analytics/information retrieval methods are needed to extract data that can be enumerated/indexed. In fact, searching unstructured data, which can/should involve following links to other content, brings in technologies like Apache Nutch.
----------- Notes from the 2nd meeting on 2 Feb. 2017 ----------------
Date: 02 Feb., 2017
- Beth Huffer (LLC)
- M. Wu (ANDS)
- S.J. Khalsa (U. Colo.)
- Do environmental scanning of similar activities from other forums. (ALL)
- Collect relevancy criteria and fill in the spreadsheet. (ALL)
Both Peter (Cotroneo) and Mingfang (Wu) have checked with their organisations, Elsevier and ANDS respectively, about providing a test bed, and both got the green light to go ahead. In particular:
- Elsevier can provide AWS EC2 instances for a relevancy test bed. The Elsevier team could probably clone the machines that they used during the recent bioCADDIE Challenge.
- ANDS could also provide search logs for developing search topics.
We discussed what outputs are expected from this task force and what it should focus on: recommending indexing and searching algorithms in response to search use cases, or actually carrying out the activities discussed in this brainstorming document. In either case, we need an environmental scan of similar activities in other forums such as ESIP, NASA and bioCADDIE.
If we do experiments around ranking algorithms, we could first set up baselines as follows: develop search topics from search logs; run the search topics through three ranking models (vector space, BM25 and language model); pool the first (say) 100 matches from each model for relevance judgement; and measure the performance of each model.
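The pooling and measurement steps of such a baseline could be sketched roughly as below. This is only an illustrative sketch, not an agreed design: the function names, the toy document and topic identifiers, the pool depth of 2 (instead of 100) and the choice of precision at k as the measure are all assumptions made for the example. The ranked lists are written by hand here; in a real experiment they would come from the three ranking models.

```python
from typing import Dict, List, Set

# A "run" maps each search topic id to the ranked list of document ids
# returned by one ranking model (hypothetical structure for this sketch).
Run = Dict[str, List[str]]


def build_pool(runs: Dict[str, Run], depth: int = 100) -> Dict[str, Set[str]]:
    """Union the top-`depth` results of every model, per topic.

    The resulting pool is what assessors would judge for relevance.
    """
    pool: Dict[str, Set[str]] = {}
    for run in runs.values():
        for topic, ranking in run.items():
            pool.setdefault(topic, set()).update(ranking[:depth])
    return pool


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k


def mean_precision_at_k(run: Run, judgements: Dict[str, Set[str]], k: int) -> float:
    """Average precision@k over all topics in a run."""
    scores = [
        precision_at_k(ranking, judgements.get(topic, set()), k)
        for topic, ranking in run.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy ranked lists standing in for the output of the three models.
runs: Dict[str, Run] = {
    "vector_space": {"t1": ["d1", "d2", "d3"]},
    "bm25": {"t1": ["d2", "d4", "d1"]},
    "language_model": {"t1": ["d5", "d2", "d6"]},
}

# Pool the top 2 results per model (the notes suggest about 100 in practice).
pool = build_pool(runs, depth=2)

# Hypothetical relevance judgements made over the pooled documents.
judgements: Dict[str, Set[str]] = {"t1": {"d2", "d4"}}

# Measure each model against the same judgements.
results = {
    name: mean_precision_at_k(run, judgements, k=2) for name, run in runs.items()
}
```

Only pooled documents are judged, so a document that none of the models ranks within the pool depth is treated as not relevant; this is a known limitation of pooling and one reason to pool from several different models.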
This interest group includes representatives from data search providers such as Elsevier, DataCite, DataONE, ANDS, …. We may ask them what relevancy ranking issues they are facing, and identify common problems to work on.
We may need to understand what the relevancy criteria are and what people expect from their search. We could map relevancy criteria and searched objects against each search domain, as in the matrix shown in this Google spreadsheet. This understanding may contribute to both activities outlined in the second dot point. We may collect relevancy criteria from the use case collection document and from use cases collected in other forums.
Next meeting: Thursday 23 Feb. 2017
----------- Notes from the 1st meeting on 19 Jan. 2017 --------------
Date: Jan. 19/20, 2017
Peter Cotroneo, Elsevier (PC)
Siri Jodha Khalsa, NSIDC (SJK)
Mingfang Wu, ANDS (MFW)
Polish document, especially scope; create PDF with link to online Google Doc; Siri will propagate doc (via RDA list, which is on Wiki), but will check with Anita first
Can one or more participants provide a test bed?
- Discuss the scope of this topic.
- Brainstorm ideas and possible activities, which are documented in this online document. The attached PDF file shows its state as of Jan. 19, 2017.