One of the goals of MUMIA is to establish solid, verifiable and comparable evaluations of research in multilingual/multifaceted search. Such evaluations rely on the availability of benchmark data and on comparable, repeatable processes. Yet a range of challenges inhibits progress in this domain, for example:
- legal aspects of sharing benchmark data and evaluations
- size aspects: how to host, distribute, and provide access to terabytes of data
- metadata issues: how to annotate and describe benchmark data, its collection principles, underlying bias, etc.
  - how to identify/cite in a machine-processable way the exact subset (out of a huge, growing database of benchmark data) that was used in a specific study
  - how to share not only data and results, but also the tools used in the (pre-)processing of data and results for comparability and repeatability of results
- costing issues: what are the financial implications, and which cost and business models are suitable for hosting and provisioning such benchmark data collections
The meeting is inspired by articles such as:
- Armstrong, Timothy G.; Moffat, Alistair; Webber, William; Zobel, Justin: Improvements That Don't Add Up: Ad-hoc Retrieval Results Since 1998, ACM CIKM 2009, http://dl.acm.org/citation.cfm?id=1646031
- The Economist: Unreliable Research: Trouble at the Lab. Oct 19 2013, http://www.economist.com/news/briefing/21588057-scientists-think-science...
- Jason B: Reproducible Machine Learning Results by Default: http://machinelearningmastery.com/reproducible-machine-learning-results-by-default/
- and many others along the same lines, aiming at well-documented, repeatable workflows, comparable results, etc.
The meeting will comprise a small number of short, focused presentations highlighting challenges in different benchmarking/evaluation settings, followed by world-cafe style sessions to brainstorm and elaborate means to address selected sub-problems, and then by break-out and reporting-back sessions.
Pre-registration is required because places are limited, and because we would like to distribute some material in advance and consider participants' topic suggestions for specific world cafe sessions.