Projects funded by the NCN


Large-Scale Text Analysis and Methodological Foundations of Computational Stylistics

2017/26/E/HS2/01019

Keywords:

computational stylistics, stylometry, quantitative linguistics, large corpora, Big Data

Descriptors:

  • HS2_6: General linguistics, theory and methodology of linguistic research
  • HS2_1: History of literature (incl. ancient, modern, contemporary; national and world literature), literary criticism and interpretation
  • HS2_2: Theory of literature, history of literary studies, methodology and trends in literary and cultural studies, anthropology of literature, comparative literature, literary and cultural translatology

Panel:

HS2 - Culture and cultural production: literary theory and comparative literature, history of literature, linguistics, library science, cultural studies, arts, architecture

Host institution:

Instytut Języka Polskiego Polskiej Akademii Nauk

Małopolskie Voivodeship

Principal investigator (from the host institution):

dr hab. Maciej Eder 

Number of co-investigators in the project: 6

Call: SONATA BIS 7 - announced on 2017-06-14

Amount awarded: 1 258 200 PLN

Project start date (Y-m-d): 2018-05-08

Project end date (Y-m-d): 2024-11-07

Project duration: 60 months (the same as in the proposal)

Project status: Pending project

Project description

Note - project descriptions were prepared by the authors of the applications themselves and placed in the system in an unchanged form.

Information in the final report

  • Publication in academic press/journals (11)
  • Articles in post-conference publications (10)
  • Book publications / chapters in book publications (3)
  1. Detecting Ottokar II's 1248–1249 uprising and its instigators in co-witnessing networks
    Authors:
    Jeremi Ochab, Jan Škvrňák, Michael Škvrňák
    Academic press:
    Historical Methods: A Journal of Quantitative and Interdisciplinary History (year: 2022, volume: 55, pages: 189-208), Publisher: Taylor & Francis
    Status:
    Published
    DOI:
    10.1080/01615440.2022.2065397
  2. La prosa de Gustavo Adolfo Bécquer en los límites de la poesía: Análisis estilométrico
    Authors:
    Laura Hernandez Lorenzo
    Academic press:
    apropos [Perspektiven auf die Romania] (year: 2022, volume: 9, pages: 37-56), Publisher: Hamburg Universitat
    Status:
    Published
    DOI:
    10.15460/apropos.9.1875
  3. Słowozbiory "Tekstów Drugich"
    Authors:
    Maciej Maryl, Maciej Eder
    Academic press:
    Teksty Drugie (year: 2023, volume: 33, pages: 346-364), Publisher: IBL PAN
    Status:
    Published
    DOI:
    10.18318/td.2023.1.21
  4. The fall of genres that did not happen: formalising history of the "universal" semantics of Russian iambic tetrameter
    Authors:
    Antonina Martynenko, Artjoms Šeļa
    Academic press:
    Studia Metrica et Poetica (year: 2023, volume: 10, pages: 89-111), Publisher: University of Tartu Press
    Status:
    Published
  5. Topic Modeling, Long Texts and the Best Number of Topics: Some Problems and Solutions
    Authors:
    Stefano Sbalchiero, Maciej Eder
    Academic press:
    Quality & Quantity (year: 2020, volume: 54, pages: 1095–1108), Publisher: Springer
    Status:
    Published
    DOI:
    10.1007/s11135-020-00976-w
  6. Challenging Stylometry: the authorship of the baroque play La Segunda Celestina
    Authors:
    Laura Hernandez Lorenzo, Joanna Byszuk
    Academic press:
    Digital Scholarship in the Humanities, Publisher: Oxford University Press
    Status:
    Accepted for publication
  7. Stylistic fingerprints, POS tags and inflected languages: A case study in Polish
    Authors:
    Maciej Eder, Rafał Górski
    Academic press:
    Journal of Quantitative Linguistics (year: 2023, volume: 30, pages: 86-103), Publisher: Taylor & Francis
    Status:
    Published
    DOI:
    10.1080/09296174.2022.2122751
  8. Erinevused, kaugused ja sõrmejäljed: Stilomeetria ja mitmemõõtmelise tekstianalüüsi alused [Differences, distances and fingerprints: the fundamentals of stylometry and multivariate text analysis]
    Authors:
    Artjoms Šeļa
    Academic press:
    Keel ja Kirjandus (year: 2021, volume: 8-9, pages: 696-718), Publisher: Estonian Academy of Sciences
    Status:
    Published
    DOI:
    10.54013/kk764a3
  9. Stylistic change in early modern Spanish poetry through network analysis (with an especial focus on Fernando de Herrera's role)
    Authors:
    Laura Hernandez Lorenzo
    Academic press:
    Neophilologus (year: 2022, volume: 106, pages: 397–417), Publisher: Springer
    Status:
    Published
    DOI:
    10.1007/s11061-021-09717-2
  10. The Voices of Doctor Who – How Stylometry Can be Useful in Revealing New Information About TV Series
    Authors:
    Joanna Byszuk
    Academic press:
    Digital Humanities Quarterly (year: 2020, volume: 14, pages: 25934), Publisher: Association for Computers and the Humanities
    Status:
    Published
  11. Computational thematics: Comparing algorithms for clustering the genres of literary fiction
    Authors:
    Oleg Sobchuk, Artjoms Šeļa
    Academic press:
    Humanities and Social Sciences Communications (year: 2024), Publisher: Nature
    Status:
    Accepted for publication
  1. Using Word Embeddings for Validation and Enhancement of Spatial Entity Lists
    Authors:
    Berenike Herrmann, Joanna Byszuk, Giulia Grisot
    Conference:
    Digital Humanities 2022 (year: 2022), Publisher: University of Tokyo
    Date:
    conference 25-29.07.2022
    Status:
    Published
  2. Identifying Similarities in Text Analysis: Hierarchical Clustering (Linkage) versus Network Clustering (Community Detection)
    Authors:
    Jeremi K. Ochab, Joanna Byszuk, Steffen Pielström, Maciej Eder
    Conference:
    Digital Humanities 2019: Book of Abstracts (year: 2019), Publisher: University of Utrecht
    Date:
    conference 11.07.2019
    Status:
    Published
  3. Stylometric investigations into translationese: The Baby-Sitters Club across languages
    Authors:
    Joanna Byszuk, Quinn Dombrowski
    Conference:
    Proceedings of the 16th International Conference on Statistical Analysis of Textual Data (year: 2022), Publisher: VadiStat
    Date:
    conference 6-8.07.2022
    Status:
    Published
  4. Weak Genres: Modeling Association Between Poetic Meter and Meaning in Russian Poetry
    Authors:
    Artjoms Šeļa, Boris Orekhov, Roman Leibov
    Conference:
    CHR 2020: Workshop on Computational Humanities Research (year: 2020), Publisher: CEUR-WS.org
    Date:
    conference 18–20.11.2020
    Status:
    Published
  5. Detecting direct speech in multilingual collection of 19th-century novels
    Authors:
    Joanna Byszuk, Michał Woźniak, Mike Kestemont, Albert Leśniak, Wojciech Łukasik, Artjoms Šeļa, Maciej Eder
    Conference:
    Language Resources and Evaluation (LREC) (year: 2020), Publisher: European Language Resources Association (ELRA)
    Date:
    conference 11-16.05.2020
    Status:
    Published
  6. Measuring Rhythm Regularity in Verse: Entropy of Inter-Stress Intervals
    Authors:
    Artjoms Šeļa, Mikhail Gronas
    Conference:
    CHR 2022: Computational Humanities Research Conference (year: 2022), Publisher: CEUR-WS.org
    Date:
    conference 12-14.12.2022
    Status:
    Published
  7. One word to rule them all: Understanding word embeddings for authorship attribution
    Authors:
    Maciej Eder, Artjoms Šeļa
    Conference:
    Digital Humanities 2022 (year: 2022), Publisher: University of Tokyo Press
    Date:
    conference 25–29.07.2022
    Status:
    Published
  8. Boosting word frequencies in authorship attribution
    Authors:
    Maciej Eder
    Conference:
    CHR 2022: Computational Humanities Research Conference (year: 2022), Publisher: CEUR-WS.org
    Date:
    conference 12-14.12.2022
    Status:
    Published
  9. Feature Selection in Authorship Attribution: Ordering the Wordlist
    Authors:
    Joanna Byszuk, Maciej Eder
    Conference:
    Digital Humanities 2019: Book of Abstracts (year: 2019), Publisher: University of Utrecht
    Date:
    conference 12.07.2019
    Status:
    Published
  10. Improving the performance of word frequencies in authorship attribution
    Authors:
    Maciej Eder
    Conference:
    Proceedings of the 16th International Conference on Statistical Analysis of Textual Data (year: 2022), Publisher: VadiStat
    Date:
    conference 6-8.07.2022
    Status:
    Published
  1. On computers in text analysis
    Authors:
    Joanna Byszuk
    Book:
    The Bloomsbury Handbook to the Digital Humanities (year: 2023, volume: n/a, pages: 159–168), Publisher: James O'Sullivan
    Status:
    Published
  2. Tekst w humanistyce cyfrowej. Modelowanie tematyczne
    Authors:
    Maciej Eder
    Book:
    Od Gutenberga do Zuckerberga. Wstęp do humanistyki cyfrowej (year: 2023, volume: n/a, pages: 129-141), Publisher: Universitas
    Status:
    Published
  3. From stage to page: language independent bootstrap measures of distinctiveness in fictional speech
    Authors:
    Artjoms Šeļa, Ben Nagy, Joanna Byszuk, Laura Hernández-Lorenzo, Botond Szemes, Maciej Eder
    Book:
    Workshop on Computational Drama Analysis: Achievements and Opportunities (year: 2022, volume: 1, pages: 45310), Publisher: de Gruyter
    Status:
    Accepted for publication