Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-3485
Full metadata record
DC Field | Value | Language
dc.contributor.author | Imhof, Melanie | -
dc.contributor.author | Braschler, Martin | -
dc.date.accessioned | 2018-01-24T12:38:04Z | -
dc.date.available | 2018-01-24T12:38:04Z | -
dc.date.issued | 2017-11-03 | -
dc.identifier.issn | 1386-4564 | de_CH
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/2169 | -
dc.description.abstract | Operational multimodal information retrieval systems have to deal with increasingly complex document collections and queries that are composed of a large set of textual and non-textual modalities such as ratings, prices, timestamps, geographical coordinates, etc. The resulting combinatorial explosion of modality combinations makes it intractable to treat each modality individually and to obtain suitable training data. As a consequence, instead of finding and training new models for each individual modality or combination of modalities, it is crucial to establish unified models, and fuse their outputs in a robust way. Since the most popular weighting schemes for textual retrieval have in the past generalized well to many retrieval tasks, we demonstrate how they can be adapted to be used with non-textual modalities, which is a first step towards finding such a unified model. We demonstrate that the popular weighting scheme BM25 is suitable to be used for multimodal IR systems and analyze the underlying assumptions of the BM25 formula with respect to merging modalities under the so-called raw-score merging hypothesis, which requires no training. We establish a multimodal baseline for two multimodal test collections, show how modalities differ with respect to their contribution to relevance and the difficulty of treating modalities with overlapping information. Our experiments demonstrate that our multimodal baseline with no training achieves a significantly higher retrieval effectiveness than using just the textual modality for the social book search 2016 collection and lies in the range of a trained multimodal approach using the optimal linear combination of the modality scores. | de_CH
dc.language.iso | en | de_CH
dc.publisher | Springer | de_CH
dc.relation.ispartof | Information Retrieval Journal | de_CH
dc.rights | Licence according to publishing contract | de_CH
dc.subject.ddc | 020: Library and information sciences | de_CH
dc.title | A study of untrained models for multimodal information retrieval | de_CH
dc.type | Article in scientific journal | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Institut für Angewandte Informationstechnologie (InIT) | de_CH
zhaw.publisher.place | Netherlands | de_CH
dc.identifier.doi | 10.21256/zhaw-3485 | de_CH
dc.identifier.doi | 10.1007/s10791-017-9322-x | -
zhaw.funding.eu | No | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.publication.status | publishedVersion | de_CH
zhaw.embargo.end | 2023-01-01 | -
zhaw.publication.review | Editorial review | de_CH
zhaw.webfeed | Datalab | de_CH
Appears in Collections: Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
10.1007_s10791-017-9322-x.pdf (Until 2023-01-01) | - | 446.22 kB | Adobe PDF

