Projects funded by the NCN


Deep extraction for robust speech recognition

2021/42/E/ST7/00452

Keywords:

speech and audio signal processing, machine learning, deep neural networks, artificial intelligence, statistical signal processing, stochastic processes, optimization methods, speech analysis, speech intelligibility, speech understanding

Descriptors:

  • ST7_007:
  • ST6_011:

Panel:

ST7 - Systems and communication engineering: electronics, communication, optoelectronics

Host institution:

Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie, Wydział Informatyki, Elektroniki i Telekomunikacji

Małopolskie Voivodeship

Principal investigator (from the host institution):

dr hab. Konrad Karol Kowalczyk 

Number of co-investigators in the project: 10

Call: SONATA BIS 11 - announced on 2021-06-15

Amount awarded: 1 878 000 PLN

Project start date (Y-m-d): 2022-11-02

Project end date (Y-m-d): 2027-11-01

Project duration: 60 months (the same as in the proposal)

Project status: Pending project

Project description

Note: project descriptions were prepared by the applicants themselves and are published in the system unchanged.

Information in the final report

  • Publication in academic press/journals (2)
  1. On Ambisonic Source Separation With Spatially Informed Non-Negative Tensor Factorization
    Authors:
    M. Guzik and K. Kowalczyk
    Academic press:
    IEEE/ACM Transactions on Audio, Speech, and Language Processing (year: 2024, volume: 32, pages: 3238-3255), Publisher: IEEE
    Status:
    Published
    DOI:
    10.1109/TASLP.2024.3399618
  2. End-to-End Neural Speaker Diarization With Non-Autoregressive Attractors
    Authors:
    M. Rybicka, J. Villalba, T. Thebaud, N. Dehak and K. Kowalczyk
    Academic press:
    IEEE/ACM Transactions on Audio, Speech, and Language Processing (year: 2024, volume: 32, pages: 3960-3973), Publisher: IEEE
    Status:
    Published
    DOI:
    10.1109/TASLP.2024.3439993
  • Articles in post-conference publications (4)
  1. Causal Signal-Based DCCRN with Overlapped-Frame Prediction for Online Speech Enhancement
    Authors:
    Julitta Bartolewska, Stanisław Kacprzak, Konrad Kowalczyk
    Conference:
    Proc. INTERSPEECH 2023 (year: 2023, volume: Annual Conf. Int. Speech Communication Association (INTERSPEECH), pages: 4039-4043), Publisher: International Speech Communication Association (ISCA)
    Date:
    conference, 20-24 August 2023
    Status:
    Published
    DOI:
    10.21437/Interspeech.2023-2177
  2. Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio
    Authors:
    M. Barański, J. Jasiński, J. Bartolewska, S. Kacprzak, M. Witkowski, and K. Kowalczyk
    Conference:
    Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (year: 2025, volume: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages: 45662), Publisher: IEEE
    Date:
    conference, 6-11 April 2025
    Status:
    Published
    DOI:
    10.1109/ICASSP49660.2025.10890105
  3. Heightceleb – An Enrichment of Voxceleb Dataset With Speaker Height Information
    Authors:
    S. Kacprzak and K. Kowalczyk
    Conference:
    Proc. IEEE Spoken Language Technology Workshop (SLT) (year: 2024, volume: IEEE Spoken Language Technology Workshop (SLT), pages: 857-862), Publisher: IEEE
    Date:
    conference, 2-5 December 2024
    Status:
    Published
    DOI:
    10.1109/SLT61566.2024.10832224
  4. Joint Blind Source Separation and Dereverberation for Automatic Speech Recognition using Delayed-Subsource MNMF with Localization Prior
    Authors:
    Mieszko Fraś, Marcin Witkowski, Konrad Kowalczyk
    Conference:
    Proc. INTERSPEECH 2023 (year: 2023, volume: Annual Conf. Int. Speech Communication Association (INTERSPEECH), pages: 3734-3738), Publisher: International Speech Communication Association (ISCA)
    Date:
    conference, 20-24 August 2023
    Status:
    Published
    DOI:
    10.21437/Interspeech.2023-2520