Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-3485
Title: A study of untrained models for multimodal information retrieval
Authors: Imhof, Melanie
Braschler, Martin
Published in: Information Retrieval Journal
Publisher / Ed. Institution: Springer
Publisher / Ed. Institution: Netherlands
Publication date: 3-Nov-2017
License (according to publishing contract): License according to publishing contract
Type of review: Editorial review
Language: English
Subject (DDC): 020: Library and information sciences
Abstract: Operational multimodal information retrieval systems have to deal with increasingly complex document collections and queries that are composed of a large set of textual and non-textual modalities such as ratings, prices, timestamps, geographical coordinates, etc. The resulting combinatorial explosion of modality combinations makes it intractable to treat each modality individually and to obtain suitable training data. As a consequence, instead of finding and training new models for each individual modality or combination of modalities, it is crucial to establish unified models, and fuse their outputs in a robust way. Since the most popular weighting schemes for textual retrieval have in the past generalized well to many retrieval tasks, we demonstrate how they can be adapted to be used with non-textual modalities, which is a first step towards finding such a unified model. We demonstrate that the popular weighting scheme BM25 is suitable to be used for multimodal IR systems and analyze the underlying assumptions of the BM25 formula with respect to merging modalities under the so-called raw-score merging hypothesis, which requires no training. We establish a multimodal baseline for two multimodal test collections, show how modalities differ with respect to their contribution to relevance and the difficulty of treating modalities with overlapping information. Our experiments demonstrate that our multimodal baseline with no training achieves a significantly higher retrieval effectiveness than using just the textual modality for the social book search 2016 collection and lies in the range of a trained multimodal approach using the optimal linear combination of the modality scores.
Department: School of Engineering
Organisational unit: Institut für Angewandte Informationstechnologie (InIT)
Publication type: Article in scientific journal
DOI: 10.21256/zhaw-3485
10.1007/s10791-017-9322-x
ISSN: 1386-4564
URI: https://digitalcollection.zhaw.ch/handle/11475/2169
Restricted until: 2023-01-01
Appears in collections: Publikationen School of Engineering

Files in this resource:
File: 10.1007_s10791-017-9322-x.pdf
Description: Until 2023-01-01
Size: 446.22 kB
Format: Adobe PDF


All resources in this repository are protected by copyright, unless otherwise indicated.