Projects funded by the NCN


Analysis of the usage of machine learning in spatial audio processing

2017/25/B/ST7/01792

Keywords:

spatial audio; array processing; speaker recognition; acoustic source classification

Descriptors:

  • ST7_7: Signal processing
  • ST6_11: Machine learning, statistical data processing and applications using signal processing (e.g. speech, image, video)
  • ST2_10: Quantum optics and quantum information

Panel:

ST7 - Systems and communication engineering: electronics, communication, optoelectronics

Host institution:

Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie, Wydział Informatyki, Elektroniki i Telekomunikacji

Małopolskie Voivodeship

Principal investigator (from the host institution):

dr hab. Konrad Kowalczyk 

Number of co-investigators in the project: 9

Call: OPUS 13 - announced on 2017-03-15

Amount awarded: 998 400 PLN

Project start date (Y-m-d): 2018-10-31

Project end date (Y-m-d): 2023-05-30

Project duration: 55 months (the same as in the proposal)

Project status: Project settled

Project description

Note - project descriptions were prepared by the authors of the applications themselves and placed in the system in an unchanged form.

Equipment purchased

  1. Audio monitoring (listening) system (18 000 PLN)
  2. 2 microphone arrays with analog-to-digital converters and a multichannel sound card (40 752 PLN)
  3. Databases of recordings (17 000 PLN)
  4. 3 computers / laptops (15 000 PLN)

Information in the final report

  • Publication in academic press/journals (3)
  1. On Ambisonic Source Separation with Spatially Informed Non-negative Tensor Factorization
    Authors:
    Mateusz Guzik and Konrad Kowalczyk
    Academic press:
    IEEE/ACM Transactions on Audio, Speech, and Language Processing (year: 2023, volume: 1, pages: 45305), Publisher: IEEE
    Status:
    Submitted
  2. Data-based spatial audio processing
    Authors:
    Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, and Archontis Politis
    Academic press:
    EURASIP Journal on Audio, Speech, and Music Processing (year: 2022, volume: 13, pages: 45294), Publisher: Springer Open
    Status:
    Published
    DOI:
    10.1186/s13636-022-00248-5
  3. An Overview of Machine Learning and Other Data-Based Methods for Spatial Audio Capture, Processing, and Reproduction
    Authors:
    Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, and Archontis Politis
    Academic press:
    EURASIP Journal on Audio, Speech, and Music Processing (year: 2022, volume: 10, pages: 45312), Publisher: Springer Open
    Status:
    Published
    DOI:
    10.1186/s13636-022-00242-x
  • Articles in post-conference publications (13)
  1. Comparison of Convolution Types in CNN-based Feature Extraction for Sound Source Localization
    Authors:
    Daniel Krause, Archontis Politis, and Konrad Kowalczyk
    Conference:
    European Signal Processing Conference (EUSIPCO) (year: 2020), Publisher: EURASIP
    Date:
    conference, 2021-01-18
    Status:
    Published
  2. End-to-End Neural Speaker Diarization with an Iterative Refinement of Non-Autoregressive Attention-based Attractors
    Authors:
    Magdalena Rybicka, Jesus Villalba, Najim Dehak, and Konrad Kowalczyk
    Conference:
    Annual Conf. Int. Speech Communication Association (INTERSPEECH) (year: 2022), Publisher: Int. Speech Communication Association
    Date:
    conference, 2022-09-18
    Status:
    Published
  3. Data Diversity for Improving DNN-based Localization of Concurrent Sound Events
    Authors:
    Daniel Krause, Archontis Politis, and Konrad Kowalczyk
    Conference:
    European Signal Processing Conference (EUSIPCO) (year: 2021), Publisher: EURASIP
    Date:
    conference, 2021-08-23
    Status:
    Published
  4. NTF of Spectral and Spatial Features for Tracking and Separation of Moving Sound Sources in Spherical Harmonic Domain
    Authors:
    Mateusz Guzik and Konrad Kowalczyk
    Conference:
    Annual Conf. Int. Speech Communication Association (INTERSPEECH) (year: 2022), Publisher: Int. Speech Communication Association
    Date:
    conference, 2022-09-18
    Status:
    Published
  5. Wishart Localization Prior on Spatial Covariance Matrix in Ambisonic Source Separation using Non-negative Tensor Factorization
    Authors:
    Mateusz Guzik and Konrad Kowalczyk
    Conference:
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (year: 2022), Publisher: IEEE
    Date:
    conference, 2022-05-22
    Status:
    Published
  6. Adversarial Domain Adaptation with Paired Examples for Acoustic Scene Classification on Different Recording Devices
    Authors:
    Stanisław Kacprzak and Konrad Kowalczyk
    Conference:
    European Signal Processing Conference (EUSIPCO) (year: 2021), Publisher: EURASIP
    Date:
    conference, 2021-08-23
    Status:
    Published
  7. Convolutive NTF for Ambisonic Source Separation under Reverberant Conditions
    Authors:
    Mateusz Guzik and Konrad Kowalczyk
    Conference:
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (year: 2023), Publisher: IEEE
    Date:
    conference, 2023-06-04
    Status:
    Published
  8. Convolutive Weighted Multichannel Wiener Filter Front-end for Distant Automatic Speech Recognition in Reverberant Multispeaker Scenarios
    Authors:
    Mieszko Fraś, Marcin Witkowski, and Konrad Kowalczyk
    Conference:
    Annual Conf. Int. Speech Communication Association (INTERSPEECH) (year: 2022), Publisher: Int. Speech Communication Association
    Date:
    conference, 2022-09-18
    Status:
    Published
  9. Feature Overview for Joint Modeling of Sound Event Detection and Localization Using a Microphone Array
    Authors:
    Daniel Krause, Archontis Politis, and Konrad Kowalczyk
    Conference:
    European Signal Processing Conference (EUSIPCO) (year: 2020), Publisher: EURASIP
    Date:
    conference, 2021-01-18
    Status:
    Published
  10. Convolutional Weighted Minimum Mean Square Error Filter for Joint Source Separation and Dereverberation
    Authors:
    Mieszko Fraś, Marcin Witkowski, and Konrad Kowalczyk
    Conference:
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (year: 2022), Publisher: IEEE
    Date:
    conference, 2022-05-22
    Status:
    Published
  11. Incorporation of Localization Information for Sound Source Separation in Spherical Harmonic Domain
    Authors:
    Mateusz Guzik, Mieszko Fraś, and Konrad Kowalczyk
    Conference:
    IEEE International Workshop on Multimedia Signal Processing (year: 2021), Publisher: IEEE
    Date:
    conference, 2021-10-06
    Status:
    Published
  12. Refining DNN-based Mask Estimation using CGMM-based EM Algorithm for Multi-channel Noise Reduction
    Authors:
    Julitta Bartolewska, Stanisław Kacprzak, and Konrad Kowalczyk
    Conference:
    Annual Conf. Int. Speech Communication Association (INTERSPEECH) (year: 2022), Publisher: Int. Speech Communication Association
    Date:
    conference, 2022-09-18
    Status:
    Published
  13. Sparse Linear Prediction-based Dereverberation for Signal Enhancement in Distant Speaker Verification
    Authors:
    Marcin Witkowski, Magdalena Rybicka, and Konrad Kowalczyk
    Conference:
    European Signal Processing Conference (EUSIPCO) (year: 2021), Publisher: EURASIP
    Date:
    conference, 2021-08-23
    Status:
    Published