Ke Tan (谭可)

Research Scientist
Meta Research
Redmond, WA
tan [dot] 650 [at] osu [dot] edu
Google Scholar
ResearchGate
CV LinkedIn

Bio

I am currently a research scientist at Meta Reality Labs Research (f.k.a. Facebook Reality Labs Research, Oculus Research). I received the B.E. degree in electronic information engineering from the University of Science and Technology of China, Hefei, Anhui, China, in 2015, and the M.S. degree in 2019 and the Ph.D. degree in 2021 from The Ohio State University, Columbus, OH, both in computer science and engineering. During my Ph.D. studies at Ohio State, my research focused on speech enhancement and separation. I was advised by Prof. DeLiang Wang, who leads the Perception and Neurodynamics Laboratory.

My research interests include speech enhancement, speech separation, speech dereverberation, microphone array processing, audio-visual speech processing, acoustic echo cancellation, and deep learning. I serve as a reviewer for IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Journal of Selected Topics in Signal Processing, IEEE Signal Processing Letters, IEEE Transactions on Signal Processing, IEEE Communications Letters, The Journal of the Acoustical Society of America, Speech Communication, Neural Networks, Pattern Recognition, and Neurocomputing.

Publications

Journal Articles

[10] H. Taherian, K. Tan, and D. L. Wang, "Multi-Channel Talker-Independent Speaker Separation Through Location-Based Training", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 30, pp. 2791-2800, 2022.
Paper BibTeX

[9] K. Tan, Z.-Q. Wang, and D. L. Wang, "Neural Spectrospatial Filtering", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 30, pp. 605-621, 2022.
Paper BibTeX

[8] E. W. Healy, K. Tan, E. M. Johnson, and D. L. Wang, "An Effectively Causal Deep Learning Algorithm to Increase Intelligibility in Untrained Noises for Hearing-Impaired Listeners", in Journal of the Acoustical Society of America (JASA), vol. 149, pp. 3943-3953, 2021.
Paper BibTeX

[7] K. Tan, X. Zhang, and D. L. Wang, "Deep Learning Based Real-Time Speech Enhancement for Dual-Microphone Mobile Phones", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 29, pp. 1853-1863, 2021.
Paper BibTeX

[6] K. Tan and D. L. Wang, "Towards Model Compression for Deep Learning Based Speech Enhancement", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 29, pp. 1785-1794, 2021.
Paper BibTeX

[5] K. Tan, B. Xu, A. Kumar, E. Nachmani, and Y. Adi, "SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation", in IEEE Signal Processing Letters (IEEE SPL), vol. 28, pp. 26-30, 2021.
Paper BibTeX Demos

[4] K. Tan, Y. Xu, S.-X. Zhang, M. Yu, and D. Yu, "Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network", in IEEE Journal of Selected Topics in Signal Processing (IEEE JSTSP), vol. 14, pp. 542-553, 2020.
Paper BibTeX Demos

[3] K. Tan and D. L. Wang, "Learning Complex Spectral Mapping with Gated Convolutional Recurrent Networks for Monaural Speech Enhancement", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 28, pp. 380-390, 2020.
Paper BibTeX Code

[2] P. Wang, K. Tan, and D. L. Wang, "Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 28, pp. 39-48, 2020.
Paper BibTeX

[1] K. Tan, J. Chen, and D. L. Wang, "Gated Residual Networks with Dilated Convolutions for Monaural Speech Enhancement", in IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE/ACM TASLP), vol. 27, pp. 189-198, 2019.
Paper BibTeX

Conference Papers

[15] A. Kumar, K. Tan, Z. Ni, P. Manocha, X. Zhang, E. Henderson, and B. Xu, "TorchAudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in TorchAudio", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
Preprint

[14] K.-L. Chen, D. D. E. Wong, K. Tan, B. Xu, A. Kumar, and V. K. Ithapu, "Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-Channel Speech Enhancement", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
Preprint

[13] H. Taherian, K. Tan, and D. L. Wang, "Location-Based Training for Multi-Channel Talker-Independent Speaker Separation", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 696-700, 2022.
Paper BibTeX

[12] K. Tan, X. Zhang, and D. L. Wang, "Real-Time Speech Enhancement for Mobile Communication Based on Dual-Channel Complex Spectral Mapping", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6134-6138, 2021.
Paper BibTeX

[11] K. Tan and D. L. Wang, "Compressing Deep Neural Networks for Efficient Speech Enhancement", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8358-8362, 2021.
Paper BibTeX

[10] K. Tan and D. L. Wang, "Improving Robustness of Deep Learning Based Monaural Speech Enhancement Against Processing Artifacts", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6914-6918, 2020.
Paper BibTeX

[9] H. Zhang, K. Tan, and D. L. Wang, "Deep Learning for Joint Acoustic Echo and Noise Cancellation with Nonlinear Distortions", in the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 4255-4259, 2019.
Paper BibTeX

[8] P. Wang, K. Tan, and D. L. Wang, "Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling", in the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 471-475, 2019.
Paper BibTeX

[7] K. Tan and D. L. Wang, "Complex Spectral Mapping with a Convolutional Recurrent Network for Monaural Speech Enhancement", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6865-6869, 2019.
Paper BibTeX

[6] K. Tan, X. Zhang, and D. L. Wang, "Real-Time Speech Enhancement Using an Efficient Convolutional Recurrent Network for Dual-Microphone Mobile Phones in Close-Talk Scenarios", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5751-5755, 2019.
Paper BibTeX

[5] Z.-Q. Wang, K. Tan, and D. L. Wang, "Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 71-75, 2019.
Paper BibTeX

[4] K. Tan and D. L. Wang, "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement", in the 19th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3229-3233, 2018.
Paper BibTeX Code

[3] K. Tan and D. L. Wang, "A Two-Stage Approach to Noisy Cochannel Speech Separation with Gated Residual Networks", in the 19th Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 3484-3488, 2018.
Paper BibTeX

[2] K. Tan, J. Chen, and D. L. Wang, "Gated Residual Networks with Dilated Convolutions for Supervised Speech Separation", in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 21-25, 2018.
Paper BibTeX

[1] S. Zhu, K. Tan, X. Zhang, Z. Liu, and B. Liu, "MICROST: A Mixed Approach for Heart Rate Monitoring During Intensive Physical Exercise Using Wrist-Type PPG Signals", in the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 2347-2350, 2015.
Paper BibTeX

Dissertation

K. Tan, "Convolutional and Recurrent Neural Networks for Real-Time Speech Separation in the Complex Domain", Ph.D. Dissertation, The Ohio State University, Aug. 2021.
Paper

Software

GCRN for monaural speech enhancement: 2020 Tan-Wang paper, and PyTorch code on GitHub.
CRN for monaural speech enhancement: 2018 Tan-Wang paper, and PyTorch code on GitHub.

Presentations

[7] Slides Compressing Deep Neural Networks for Efficient Speech Enhancement, IEEE ICASSP (virtual due to COVID-19 pandemic), Toronto, Ontario, Canada, Jun. 2021.

[6] Slides Real-Time Speech Enhancement for Mobile Communication Based on Dual-Channel Complex Spectral Mapping, IEEE ICASSP (virtual due to COVID-19 pandemic), Toronto, Ontario, Canada, Jun. 2021.

[5] Slides Improving Robustness of Deep Learning Based Monaural Speech Enhancement Against Processing Artifacts, IEEE ICASSP (virtual due to COVID-19 pandemic), Barcelona, Spain, May 2020.

[4] Poster Complex Spectral Mapping with a Convolutional Recurrent Network for Monaural Speech Enhancement, IEEE ICASSP, Brighton, United Kingdom, May 2019.

[3] Slides Real-Time Speech Enhancement Using an Efficient Convolutional Recurrent Network for Dual-Microphone Mobile Phones in Close-Talk Scenarios, IEEE ICASSP, Brighton, United Kingdom, May 2019.

[2] Slides Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective, IEEE ICASSP, Brighton, United Kingdom, May 2019.

[1] Slides Gated Residual Networks with Dilated Convolutions for Supervised Speech Separation, IEEE ICASSP, Calgary, Alberta, Canada, Apr. 2018.

Work Experience

Aug. 2021 - present, Research Scientist at Meta Reality Labs Research (f.k.a. Facebook Reality Labs Research), Redmond, WA, United States

May 2020 - Aug. 2020, Research Intern at Facebook Reality Labs, Redmond, WA, United States

May 2019 - Aug. 2019, Research Intern at Tencent AI Lab, Bellevue, WA, United States

May 2018 - Aug. 2018, Research Intern at KITT.AI group - Baidu DuerOS, Bellevue, WA, United States

Jan. 2017 - Aug. 2021, Graduate Research Associate in Perception and Neurodynamics Laboratory at The Ohio State University, Columbus, OH, United States

Aug. 2015 - Dec. 2016, Graduate Teaching Associate at The Ohio State University, Columbus, OH, United States