Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-30993
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Nooralahzadeh, Farhad | - |
dc.contributor.author | Zhang, Yi | - |
dc.contributor.author | Smith, Ellery | - |
dc.contributor.author | Maennel, Sabine | - |
dc.contributor.author | Matthey-Doret, Cyril | - |
dc.contributor.author | de Fondville, Raphaël | - |
dc.contributor.author | Stockinger, Kurt | - |
dc.date.accessioned | 2024-07-04T13:47:43Z | - |
dc.date.available | 2024-07-04T13:47:43Z | - |
dc.date.issued | 2024-08 | - |
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/30993 | - |
dc.description.abstract | The potential for improvements brought by Large Language Models (LLMs) in Text-to-SQL systems is mostly assessed on monolingual English datasets. However, LLMs' performance for other languages remains largely unexplored. In this work, we release the StatBot.Swiss dataset, the first bilingual benchmark for evaluating Text-to-SQL systems based on real-world applications. The StatBot.Swiss dataset contains 455 natural language/SQL pairs over 35 large databases with varying levels of complexity for both English and German. We evaluate the performance of state-of-the-art LLMs such as GPT-3.5-Turbo and mixtral-8x7b-instruct for the Text-to-SQL translation task using an in-context learning approach. Our experimental analysis illustrates that current LLMs struggle to generalize well in generating SQL queries on our novel bilingual dataset. | de_CH |
dc.language.iso | en | de_CH |
dc.publisher | Association for Computational Linguistics | de_CH |
dc.rights | Licence according to publishing contract | de_CH |
dc.subject | Natural language processing | de_CH |
dc.subject | Machine learning | de_CH |
dc.subject | Database | de_CH |
dc.subject | Generative AI | de_CH |
dc.subject.ddc | 005: Computer programming, programs and data | de_CH |
dc.subject.ddc | 006: Special computer methods | de_CH |
dc.title | StatBot.Swiss : bilingual open data exploration in natural language | de_CH |
dc.type | Conference: Paper | de_CH |
dcterms.type | Text | de_CH |
zhaw.departement | School of Engineering | de_CH |
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH |
dc.identifier.doi | 10.21256/zhaw-30993 | - |
zhaw.conference.details | 62nd Annual Meeting of the Association for Computational Linguistics (ACL), Bangkok, Thailand, 11-16 August 2024 | de_CH |
zhaw.funding.eu | No | de_CH |
zhaw.originated.zhaw | Yes | de_CH |
zhaw.publication.status | acceptedVersion | de_CH |
zhaw.publication.review | Peer review (publication) | de_CH |
zhaw.title.proceedings | Findings of the Association for Computational Linguistics: ACL 2024 | de_CH |
zhaw.webfeed | Datalab | de_CH |
zhaw.webfeed | Intelligent Information Systems | de_CH |
zhaw.funding.zhaw | INODE4StatBot.swiss – Application of new algorithms for automatically translating natural language into the database query language SQL (NL-to-SQL) | de_CH |
zhaw.author.additional | No | de_CH |
zhaw.display.portrait | Yes | de_CH |
zhaw.relation.references | https://github.com/dscc-admin-ch/statbot.swiss | de_CH |
Appears in collections: | Publications School of Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2024_Nooralahzadeh-etal_StatBot-Swiss_ACL2024.pdf | Accepted Version | 882.9 kB | Adobe PDF |
Nooralahzadeh, F., Zhang, Y., Smith, E., Maennel, S., Matthey-Doret, C., de Fondville, R., & Stockinger, K. (2024, August). StatBot.Swiss : bilingual open data exploration in natural language. Findings of the Association for Computational Linguistics: ACL 2024. https://doi.org/10.21256/zhaw-30993
Nooralahzadeh, F. et al. (2024) ‘StatBot.Swiss : bilingual open data exploration in natural language’, in Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics. Available at: https://doi.org/10.21256/zhaw-30993.
F. Nooralahzadeh et al., “StatBot.Swiss : bilingual open data exploration in natural language,” in Findings of the Association for Computational Linguistics: ACL 2024, Aug. 2024. doi: 10.21256/zhaw-30993.
NOORALAHZADEH, Farhad, Yi ZHANG, Ellery SMITH, Sabine MAENNEL, Cyril MATTHEY-DORET, Raphaël DE FONDVILLE and Kurt STOCKINGER, 2024. StatBot.Swiss : bilingual open data exploration in natural language. In: Findings of the Association for Computational Linguistics: ACL 2024. Conference paper. Association for Computational Linguistics. August 2024.
Nooralahzadeh, Farhad, Yi Zhang, Ellery Smith, Sabine Maennel, Cyril Matthey-Doret, Raphaël de Fondville, and Kurt Stockinger. 2024. “StatBot.Swiss : Bilingual Open Data Exploration in Natural Language.” Conference paper. In Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics. https://doi.org/10.21256/zhaw-30993.
Nooralahzadeh, Farhad, et al. “StatBot.Swiss : Bilingual Open Data Exploration in Natural Language.” Findings of the Association for Computational Linguistics: ACL 2024, Association for Computational Linguistics, 2024, https://doi.org/10.21256/zhaw-30993.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.