Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-30993
Full metadata record
DC Field | Value | Language
dc.contributor.author | Nooralahzadeh, Farhad | -
dc.contributor.author | Zhang, Yi | -
dc.contributor.author | Smith, Ellery | -
dc.contributor.author | Maennel, Sabine | -
dc.contributor.author | Matthey-Doret, Cyril | -
dc.contributor.author | de Fondville, Raphaël | -
dc.contributor.author | Stockinger, Kurt | -
dc.date.accessioned | 2024-07-04T13:47:43Z | -
dc.date.available | 2024-07-04T13:47:43Z | -
dc.date.issued | 2024-08 | -
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/30993 | -
dc.description.abstract | The potential for improvements brought by Large Language Models (LLMs) in Text-to-SQL systems is mostly assessed on monolingual English datasets. However, LLMs' performance for other languages remains largely unexplored. In this work, we release the StatBot.Swiss dataset, the first bilingual benchmark for evaluating Text-to-SQL systems based on real-world applications. The StatBot.Swiss dataset contains 455 natural language/SQL pairs over 35 large databases with varying levels of complexity for both English and German. We evaluate the performance of state-of-the-art LLMs such as GPT-3.5-Turbo and mixtral-8x7b-instruct for the Text-to-SQL translation task using an in-context learning approach. Our experimental analysis illustrates that current LLMs struggle to generalize well in generating SQL queries on our novel bilingual dataset. | de_CH
dc.language.iso | en | de_CH
dc.publisher | Association for Computational Linguistics | de_CH
dc.rights | Licence according to publishing contract | de_CH
dc.subject | Natural language processing | de_CH
dc.subject | Machine learning | de_CH
dc.subject | Database | de_CH
dc.subject | Generative AI | de_CH
dc.subject.ddc | 005: Computer programming, programs and data | de_CH
dc.subject.ddc | 006: Special computer methods | de_CH
dc.title | StatBot.Swiss : bilingual open data exploration in natural language | de_CH
dc.type | Conference: Paper | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH
dc.identifier.doi | 10.21256/zhaw-30993 | -
zhaw.conference.details | 62nd Annual Meeting of the Association for Computational Linguistics (ACL), Bangkok, Thailand, 11-16 August 2024 | de_CH
zhaw.funding.eu | No | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.publication.status | acceptedVersion | de_CH
zhaw.publication.review | Peer review (publication) | de_CH
zhaw.title.proceedings | Findings of the Association for Computational Linguistics: ACL 2024 | de_CH
zhaw.webfeed | Datalab | de_CH
zhaw.webfeed | Intelligent Information Systems | de_CH
zhaw.funding.zhaw | INODE4StatBot.swiss – Application of new algorithms for the automatic translation of natural language into the database query language SQL (NL-to-SQL) | de_CH
zhaw.author.additional | No | de_CH
zhaw.display.portrait | Yes | de_CH
zhaw.relation.references | https://github.com/dscc-admin-ch/statbot.swiss | de_CH
Appears in collections:Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
2024_Nooralahzadeh-etal_StatBot-Swiss_ACL2024.pdf | Accepted Version | 882.9 kB | Adobe PDF
Nooralahzadeh, F., Zhang, Y., Smith, E., Maennel, S., Matthey-Doret, C., de Fondville, R., & Stockinger, K. (2024, August). StatBot.Swiss : bilingual open data exploration in natural language. Findings of the Association for Computational Linguistics: ACL 2024. https://doi.org/10.21256/zhaw-30993
Nooralahzadeh, F. et al. (2024) ‘StatBot.Swiss : bilingual open data exploration in natural language’, in Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics. Available at: https://doi.org/10.21256/zhaw-30993.
F. Nooralahzadeh et al., “StatBot.Swiss : bilingual open data exploration in natural language,” in Findings of the Association for Computational Linguistics: ACL 2024, Aug. 2024. doi: 10.21256/zhaw-30993.
NOORALAHZADEH, Farhad, Yi ZHANG, Ellery SMITH, Sabine MAENNEL, Cyril MATTHEY-DORET, Raphaël DE FONDVILLE and Kurt STOCKINGER, 2024. StatBot.Swiss : bilingual open data exploration in natural language. In: Findings of the Association for Computational Linguistics: ACL 2024. Conference paper. Association for Computational Linguistics. August 2024.
Nooralahzadeh, Farhad, Yi Zhang, Ellery Smith, Sabine Maennel, Cyril Matthey-Doret, Raphaël de Fondville, and Kurt Stockinger. 2024. “StatBot.Swiss : Bilingual Open Data Exploration in Natural Language.” Conference paper. In Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics. https://doi.org/10.21256/zhaw-30993.
Nooralahzadeh, Farhad, et al. “StatBot.Swiss : Bilingual Open Data Exploration in Natural Language.” Findings of the Association for Computational Linguistics: ACL 2024, Association for Computational Linguistics, 2024, https://doi.org/10.21256/zhaw-30993.

