Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-25554
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Schmitt-Koopmann, Felix M. | - |
dc.contributor.author | Huang, Elaine M. | - |
dc.contributor.author | Hutter, Hans-Peter | - |
dc.contributor.author | Stadelmann, Thilo | - |
dc.contributor.author | Darvishy, Alireza | - |
dc.date.accessioned | 2022-09-01T12:53:11Z | - |
dc.date.available | 2022-09-01T12:53:11Z | - |
dc.date.issued | 2022 | - |
dc.identifier.issn | 2169-3536 | de_CH |
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/25554 | - |
dc.description.abstract | One unsolved sub-task of document analysis is mathematical formula detection (MFD). Research by ourselves and others has shown that existing MFD datasets with inline and display formula labels are small and have insufficient labeling quality. There is therefore an urgent need for datasets with better quality labeling for future research in the MFD field, as they have a high impact on the performance of the models trained on them. We present an advanced labeling pipeline and a new dataset called FormulaNet in this paper. At over 45k pages, we believe that FormulaNet is the largest MFD dataset with inline formula labels. Our experiments demonstrate substantially improved labeling quality for inline and display formulae detection over existing datasets. Additionally, we provide a math formula detection baseline for FormulaNet with an mAP of 0.754. Our dataset is intended to help address the MFD task and may enable the development of new applications, such as making mathematical formulae accessible in PDFs for visually impaired screen reader users. | de_CH |
dc.language.iso | en | de_CH |
dc.publisher | IEEE | de_CH |
dc.relation.ispartof | IEEE Access | de_CH |
dc.rights | http://creativecommons.org/licenses/by/4.0/ | de_CH |
dc.subject | Automatic annotation | de_CH |
dc.subject | Dataset | de_CH |
dc.subject | Document analysis | de_CH |
dc.subject | Deep learning | de_CH |
dc.subject | Mathematical formula detection | de_CH |
dc.subject | Page object detection | de_CH |
dc.subject.ddc | 005: Computer programming, programs and data | de_CH |
dc.title | FormulaNet : a benchmark dataset for mathematical formula detection | de_CH |
dc.type | Article in scientific journal | de_CH |
dcterms.type | Text | de_CH |
zhaw.departement | School of Engineering | de_CH |
zhaw.organisationalunit | Centre for Artificial Intelligence (CAI) | de_CH |
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH |
dc.identifier.doi | 10.1109/ACCESS.2022.3202639 | de_CH |
dc.identifier.doi | 10.21256/zhaw-25554 | - |
zhaw.funding.eu | No | de_CH |
zhaw.originated.zhaw | Yes | de_CH |
zhaw.pages.end | 91596 | de_CH |
zhaw.pages.start | 91588 | de_CH |
zhaw.publication.status | publishedVersion | de_CH |
zhaw.volume | 10 | de_CH |
zhaw.publication.review | Peer review (publication) | de_CH |
zhaw.funding.snf | 194677 | de_CH |
zhaw.webfeed | Machine Perception and Cognition | de_CH |
zhaw.webfeed | Human-Centered Computing | de_CH |
zhaw.author.additional | No | de_CH |
zhaw.display.portrait | Yes | de_CH |
Appears in collections: | Publikationen School of Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2022_SchmittKoopmann-etal_FormulaNet-Benchmark-Dataset-Mathematical-Formula-Detection.pdf | | 1.35 MB | Adobe PDF |
Schmitt-Koopmann, F. M., Huang, E. M., Hutter, H.-P., Stadelmann, T., & Darvishy, A. (2022). FormulaNet : a benchmark dataset for mathematical formula detection. IEEE Access, 10, 91588–91596. https://doi.org/10.1109/ACCESS.2022.3202639
Schmitt-Koopmann, F.M. et al. (2022) ‘FormulaNet : a benchmark dataset for mathematical formula detection’, IEEE Access, 10, pp. 91588–91596. Available at: https://doi.org/10.1109/ACCESS.2022.3202639.
F. M. Schmitt-Koopmann, E. M. Huang, H.-P. Hutter, T. Stadelmann, and A. Darvishy, “FormulaNet : a benchmark dataset for mathematical formula detection,” IEEE Access, vol. 10, pp. 91588–91596, 2022, doi: 10.1109/ACCESS.2022.3202639.
SCHMITT-KOOPMANN, Felix M., Elaine M. HUANG, Hans-Peter HUTTER, Thilo STADELMANN and Alireza DARVISHY, 2022. FormulaNet : a benchmark dataset for mathematical formula detection. IEEE Access. 2022. Vol. 10, pp. 91588–91596. DOI 10.1109/ACCESS.2022.3202639
Schmitt-Koopmann, Felix M., Elaine M. Huang, Hans-Peter Hutter, Thilo Stadelmann, and Alireza Darvishy. 2022. “FormulaNet : A Benchmark Dataset for Mathematical Formula Detection.” IEEE Access 10: 91588–96. https://doi.org/10.1109/ACCESS.2022.3202639.
Schmitt-Koopmann, Felix M., et al. “FormulaNet : A Benchmark Dataset for Mathematical Formula Detection.” IEEE Access, vol. 10, 2022, pp. 91588–96, https://doi.org/10.1109/ACCESS.2022.3202639.