ESPnet: End-to-end speech processing toolkit S Watanabe, T Hori, S Karita, T Hayashi, J Nishitoba, Y Unno, NEY Soplin, ... arXiv preprint arXiv:1804.00015, 2018 | 1737 | 2018 |
A comparative study on transformer vs rnn in speech applications S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ... 2019 IEEE automatic speech recognition and understanding workshop (ASRU …, 2019 | 895 | 2019 |
Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration T Nakatani Proc. INTERSPEECH 2019, 1408-1412, 2019 | 280 | 2019 |
ESPnet-ST: All-in-one speech translation toolkit H Inaguma, S Kiyono, K Duh, S Karita, NEY Soplin, T Hayashi, ... arXiv preprint arXiv:2004.10234, 2020 | 179 | 2020 |
Sound source localization using deep learning models N Yalta, K Nakadai, T Ogata Journal of Robotics and Mechatronics 29 (1), 37-48, 2017 | 158 | 2017 |
Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling J Cho, MK Baskar, R Li, M Wiesner, SH Mallidi, N Yalta, M Karafiat, ... 2018 IEEE Spoken Language Technology Workshop (SLT), 521-527, 2018 | 155 | 2018 |
Weakly-supervised deep recurrent neural networks for basic dance step generation N Yalta, S Watanabe, K Nakadai, T Ogata 2019 International Joint Conference on Neural Networks (IJCNN), 1-8, 2019 | 67 | 2019 |
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ... Proc. CHiME-5, 6-10, 2018 | 54 | 2018 |
The Hitachi-JHU DIHARD III system: Competitive end-to-end neural diarization and x-vector clustering systems combined by DOVER-Lap S Horiguchi, N Yalta, P Garcia, Y Takashima, Y Xue, D Raj, Z Huang, ... arXiv preprint arXiv:2102.01363, 2021 | 43 | 2021 |
CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments N Yalta, S Watanabe, T Hori, K Nakadai, T Ogata 2019 27th European Signal Processing Conference (EUSIPCO), 1-5, 2019 | 17 | 2019 |
HATSUKI: An anime character like robot figure platform with anime-style expressions and imitation learning based action generation PC Yang, M Al-Sada, CC Chiu, K Kuo, TP Tomo, K Suzuki, N Yalta, ... 2020 29th IEEE International Conference on Robot and Human Interactive …, 2020 | 9 | 2020 |
Sequential deep learning for dancing motion generation N Yalta, T Ogata, K Nakadai Proc. the 46th AI Challenge Study Group, 43-49, 2016 | 6 | 2016 |
The Hitachi DCASE 2021 Task 3 system: Handling directive interference with self attention layers N Yalta, Y Sumiyoshi, Y Kawaguchi Technical Report, DCASE 2021 Challenge, 2021 | 5 | 2021 |
Delayed skip connections for music content driven motion generation N Yalta, K Nakadai, T Ogata | 1 | 2018 |