Specaugment: A simple data augmentation method for automatic speech recognition DS Park, W Chan, Y Zhang, CC Chiu, B Zoph, ED Cubuk, QV Le arXiv preprint arXiv:1904.08779, 2019 | 4262 | 2019 |
Conformer: Convolution-augmented transformer for speech recognition A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang, J Yu, W Han, S Wang, ... arXiv preprint arXiv:2005.08100, 2020 | 3329 | 2020 |
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 2155 | 2023 |
State-of-the-art speech recognition with sequence-to-sequence models CC Chiu, TN Sainath, Y Wu, R Prabhavalkar, P Nguyen, Z Chen, ... 2018 IEEE international conference on acoustics, speech and signal …, 2018 | 1472 | 2018 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 676 | 2024 |
W2v-bert: Combining contrastive learning and masked language modeling for self-supervised speech pre-training YA Chung, Y Zhang, W Han, CC Chiu, J Qin, R Pang, Y Wu 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 425 | 2021 |
Pushing the limits of semi-supervised learning for automatic speech recognition Y Zhang, J Qin, DS Park, W Han, CC Chiu, R Pang, QV Le, Y Wu arXiv preprint arXiv:2010.10504, 2020 | 373 | 2020 |
Contextnet: Improving convolutional neural networks for automatic speech recognition with global context W Han, Z Zhang, Y Zhang, J Yu, CC Chiu, J Qin, A Gulati, R Pang, Y Wu arXiv preprint arXiv:2005.03191, 2020 | 321 | 2020 |
Monotonic chunkwise attention CC Chiu, C Raffel arXiv preprint arXiv:1712.05382, 2017 | 308 | 2017 |
Improved noisy student training for automatic speech recognition DS Park, Y Zhang, Y Jia, W Han, CC Chiu, B Li, Y Wu, QV Le arXiv preprint arXiv:2005.09629, 2020 | 271 | 2020 |
Google usm: Scaling automatic speech recognition beyond 100 languages Y Zhang, W Han, J Qin, Y Wang, A Bapna, Z Chen, N Chen, B Li, ... arXiv preprint arXiv:2303.01037, 2023 | 260 | 2023 |
A streaming on-device end-to-end model surpassing server-side conventional model quality and latency TN Sainath, Y He, B Li, A Narayanan, R Pang, A Bruguier, S Chang, W Li, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 229 | 2020 |
Lingvo: a modular and scalable framework for sequence-to-sequence modeling J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ... arXiv preprint arXiv:1902.08295, 2019 | 212 | 2019 |
Monotonic infinite lookback attention for simultaneous machine translation N Arivazhagan, C Cherry, W Macherey, CC Chiu, S Yavuz, R Pang, W Li, ... arXiv preprint arXiv:1906.05218, 2019 | 199 | 2019 |
A comparison of techniques for language model integration in encoder-decoder speech recognition S Toshniwal, A Kannan, CC Chiu, Y Wu, TN Sainath, K Livescu 2018 IEEE spoken language technology workshop (SLT), 369-375, 2018 | 193 | 2018 |
Minimum word error rate training for attention-based sequence-to-sequence models R Prabhavalkar, TN Sainath, Y Wu, P Nguyen, Z Chen, CC Chiu, ... 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 191 | 2018 |
Bigssl: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ... IEEE Journal of Selected Topics in Signal Processing 16 (6), 1519-1532, 2022 | 189 | 2022 |
Leveraging weakly supervised data to improve end-to-end speech-to-text translation Y Jia, M Johnson, W Macherey, RJ Weiss, Y Cao, CC Chiu, N Ari, ... ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 179 | 2019 |
Self-supervised learning with random-projection quantizer for speech recognition CC Chiu, J Qin, Y Zhang, J Yu, Y Wu International Conference on Machine Learning, 3915-3924, 2022 | 166 | 2022 |
Two-pass end-to-end speech recognition TN Sainath, R Pang, D Rybach, Y He, R Prabhavalkar, W Li, M Visontai, ... arXiv preprint arXiv:1908.10992, 2019 | 166 | 2019 |