Title: Fully convolutional neural networks for newspaper article segmentation
Authors: Meier, Benjamin
Stadelmann, Thilo
Stampfli, Jan
Arnold, Marek
Cieliebak, Mark
Proceedings: Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)
Conference details: 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, November 13-15, 2017
Publisher / Ed. Institution: CPS
Publisher / Ed. Institution: Kyoto, Japan
Issue Date: 2017
License (according to publishing contract): Licence according to publishing contract
Type of review: Peer review (Publication)
Language: English
Subjects: Semantic segmentation; CNN; Deep learning; Datalab
Subject (DDC): 004: Computer science
005: Computer programming, programs and data
Abstract: Segmenting newspaper pages into articles that semantically belong together is a necessary prerequisite for article-based information retrieval on print media collections such as archives and libraries. It is challenging due to vastly differing layouts of papers, various content types and different languages, but commercially very relevant, e.g., for media monitoring. We present a semantic segmentation approach based on the visual appearance of each page. We apply a fully convolutional neural network (FCN) that we train in an end-to-end fashion to transform the input image into a segmentation mask in one pass. We show experimentally that the FCN performs very well: it outperforms a deep learning-based commercial solution by a large margin in terms of segmentation quality while in addition being computationally two orders of magnitude more efficient.
Department: School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Publication type: Conference Paper
DOI: 10.21256/zhaw-1533
Appears in Collections: Publikationen School of Engineering

Files in This Item:
File: 212962.pdf
Size: 523.17 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.