Invited Talks

Exploiting top-down source models to improve binaural sound localisation. Institute of Acoustics, Chinese Academy of Sciences, Beijing, 30 September, 2015
Machine hearing exploiting head movements for binaural sound localisation in reverberant conditions. MRC Institute of Hearing Research, Nottingham, 31 March, 2015
Which sounds to listen to? The Hamlyn Centre, Imperial College London, 30 November, 2011
Auditory scene analysis and robust speech recognition. Invited lectures for the Postgraduate Programme, Department of Signal Theory, Networking and Communications, University of Granada, Spain, 16-19 November, 2010
Distant microphone speech recognition in a noisy indoor environment: combining soft missing data and speech fragment decoding. ISCA Workshop on Statistical And Perceptual Audition, Makuhari, Japan, 25 September, 2010
Missing Links between Hearing and Robust Speech Recognition. Department of Signal Theory, Networking and Communications, University of Granada, Spain, 10 May, 2010
Active listening in auditory scenes. Department of Computer Science, University of Sheffield, 29 October, 2008
Using monaural and binaural cues for speaker localization in single- and multi-source environments. International Workshop on Computational and Cognitive Models for Audio-Visual Interactions, Losehill Hall, Peak District National Park, UK, 11 March, 2008

Publications

Refereed journal papers

Ma, N., Gonzalez, J., Brown, G. (2018) Robust Binaural Localization of a Target Sound Source by Combining Spectral Source Models and Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech and Language Processing, 26(11): 2122-2131
Ma, N., May, T., Brown, G. (2017) Exploiting deep neural networks and head movements for robust binaural localisation of multiple sources in reverberant environments, IEEE/ACM Transactions on Audio, Speech and Language Processing, 25(12): 2444-2453
Gonzalez, J. A., Gomez, A. M., Peinado, A. M., Ma, N., and Barker, J. (2017) Spectral reconstruction and noise model estimation based on a masking model for noise robust speech recognition, Circuits, Systems, and Signal Processing, 36(9): 3731-3760
Ma, N., Morris, S., and Kitterick, P. T. (2016) Benefits to speech perception in noise from the binaural integration of electric and acoustic signals in simulated unilateral deafness, Ear and Hearing, 37(3): 248-259
Ma, N., Barker, J., Christensen, H., Green, P. (2013) A hearing-inspired approach for distant-microphone speech recognition in the presence of multiple sound sources, Computer Speech and Language, 27(3): 820-836
Carmona, J.L., Barker, J., Gomez, A. M. and Ma, N. (2013) Speech Spectral Envelope Enhancement by HMM-based Analysis/Resynthesis, IEEE Signal Processing Letters, 20(3): 563-566
Barker, J., Vincent, E., Ma, N., Christensen, H., Green, P. (2013) The PASCAL CHiME Speech Separation and Recognition Challenge, Computer Speech and Language, 27(3): 621-633
Gonzalez, J., Peinado, A., Ma, N., Gomez, A. and Barker, J. (2013) MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 21(3): 624-635
Ma, N., Barker, J., Christensen, H., Green, P. (2012) Combining speech fragment decoding and adaptive noise floor modelling, IEEE/ACM Transactions on Audio, Speech and Language Processing, 20(3): 818-827
Barker, J., Ma, N., Coy, A. and Cooke, M. (2010) Speech fragment decoding techniques for simultaneous speaker identification and speech recognition, Computer Speech and Language, 24(1): 94-111
Ma, N., Green, P., Barker, J. and Coy, A. (2007) Exploiting correlogram structure for robust speech recognition with multiple speech sources, Speech Communication, 49(12): 874-891

Refereed conference papers and abstracts

Vecchiotti, P., Ma, N., Squartini, S. and Brown, G. J. (2019) End-to-end binaural sound localisation from the raw waveform. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), 12th-17th May, pp. 451-455
Romero, H. E., Ma, N., Brown, G. J., Beeston, A. V. and Hasan, M. (2019) Deep learning features for robust detection of acoustic events in sleep-disordered breathing. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019), 12th-17th May, pp. 810-814
Meutzner, H., Ma, N., Nickel, R., Schymura, C. and Kolossa, D. (2017) Improving audio-visual speech recognition using deep neural networks with dynamic stream reliability estimates. In Proceedings of ICASSP 2017, New Orleans, March 2017
Ma, N. and Brown, G. J. (2016) Speech localisation in a multitalker mixture by humans and machines. In Proceedings of Interspeech 2016, San Francisco, CA, 8th-12th September 2016
Guo, Y., Wang, X., Wu, C., Fu, Q., Ma, N. and Brown, G. J. (2016) A robust dual-microphone speech source localization algorithm for reverberant environments. In Proceedings of Interspeech 2016, San Francisco, CA, 8th-12th September 2016
Zeiler, S., Nickel, R., Ma, N., Brown, G. J., and Kolossa, D. (2016) Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis. In Proceedings of ICASSP 2016, Shanghai, China, 22nd-25th March, pp. 2797-2801
Ma, N., Marxer, R., Barker, J. and Brown, G. J. (2015) Exploiting synchrony spectra and deep neural networks for noise-robust automatic speech recognition. In Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, pp. 490-495
Ma, N., Brown, G. J. and Gonzalez, J. A. (2015) Exploiting top-down source models to improve binaural localisation of multiple sources in reverberant environments. In Proceedings of Interspeech 2015, Dresden, Germany, 6th-11th September, pp. 160-164
Ma, N., Brown, G. J. and May, T. (2015) Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions. In Proceedings of Interspeech 2015, Dresden, Germany, 6th-11th September, pp. 3302-3306
Ma, N., May, T., Wierstorf, H. and Brown, G. J. (2015) A machine-hearing system exploiting head movements for binaural sound localisation in reverberant conditions. In Proceedings of ICASSP 2015, Brisbane, Australia, 19th-24th April, pp. 2699-2703
May, T., Ma, N. and Brown, G. J. (2015) Robust localisation of multiple speakers exploiting head movements and multi-conditional training of binaural cues. In Proceedings of ICASSP 2015, Brisbane, Australia, 19th-24th April, pp. 2679-2683
Schymura, C., Ma, N., Brown, G. J., Walther, T., and Kolossa, D. (2014) Binaural sound source localisation using a Bayesian-network-based blackboard system and hypothesis-driven feedback. In Proceedings of 7th FORUM ACUSTICUM 2014, Krakow, Poland, 7th-12th September
Ma, N. and Barker, J. (2013) A fragment-decoding plus missing-data imputation system evaluated on the 2nd CHiME challenge. In Proceedings of CHiME-2013 the 2nd International Workshop on Machine Listening in Multisource Environments, Vancouver, Canada, 1st June, pp. 53-58
Ma, N. and Barker, J. (2012) Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition. In Proceedings of Interspeech 2012, Portland, Oregon, 9th-13th September, pp. 2637-2640
Gonzalez, J., Peinado, A., Gomez, A. and Ma, N. (2012) Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition. In Proceedings of Interspeech 2012, Portland, Oregon, 9th-13th September, pp. 2629-2632
Gonzalez, J., Peinado, A., Gomez, A., Ma, N. and Barker, J. (2012) Combining missing-data reconstruction and uncertainty decoding for robust speech recognition. In Proceedings of ICASSP 2012, Kyoto, Japan, 25th-30th March, pp. 4693-4696
Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Binaural cues for fragment-based speech recognition in reverberant multisource environments. In Proceedings of Interspeech 2011, Florence, Italy, 28th-30th August, pp. 1657-1660
Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Recent advances in fragment-based speech recognition in reverberant multisource environments. In Proceedings of ISCA Workshop on Machine Listening in Multisource Environments, Florence, Italy, pp. 68-73
Morales, J., Ma, N., Sanchez, V., Carmona, J., Peinado, A., and Barker, J. (2011) A pitch based noise estimation technique for robust speech recognition with missing data. In Proceedings of ICASSP 2011, Prague, Czech Republic, 22nd-27th May, pp. 4808-4811
Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Incorporating localisation cues in a fragment decoding framework for distant binaural speech recognition. In Proceedings of IEEE HSCMA-2011, Edinburgh, Scotland, pp. 207-212
Ma, N., Barker, J., Christensen, H. and Green, P. (2010) Distant microphone speech recognition in a noisy indoor environment: combining soft missing data and speech fragment decoding. In Proceedings of ISCA SAPA-2010, Makuhari, Japan, pp. 19-24
Christensen, H., Barker, J., Ma, N. and Green, P. (2010) The CHiME corpus: a resource and a challenge for Computational Hearing in Multisource Environments. In Proceedings of Interspeech 2010, Makuhari, Japan, 26th-30th September, pp. 1918-1921
Ma, N., Bartels, C., Bilmes, J. and Green, P. (2009) Modelling the prepausal lengthening effect for speech recognition: A dynamic Bayesian network approach. In Proceedings of ICASSP 2009, Taipei, Taiwan, 19th-24th April, pp. 4617-4620
Christensen, H., Ma, N., Wrigley, S. and Barker, J. (2009) A speech fragment approach to localising multiple speakers in reverberant environments. In Proceedings of ICASSP 2009, Taipei, Taiwan, 19th-24th April, pp. 4593-4596
Ma, N. and Green, P. (2008) A 'speechiness' measure to improve speech decoding in the presence of other sound sources. In Proceedings of Interspeech 2008, Brisbane, Australia, pp. 1285-1288
Christensen, H., Ma, N., Wrigley, S.N. and Barker, J. (2008) Improving source localisation in multi-source, reverberant conditions: exploiting local spectro-temporal location cues. Journal of the Acoustical Society of America, 123(5): 3294
Ma, N., Barker, J. and Green, P. (2007) Applying word duration constraints by using unrolled HMMs. In Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 1066-1069
Christensen, H., Ma, N., Wrigley, S. and Barker, J. (2007) Integrating pitch and localisation cues at a speech fragment level. In Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 2769-2772
Ma, N., Green, P. and Coy, A. (2006) Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound source. In Proceedings of Interspeech 2006, Pittsburgh, PA, pp. 669-672
Barker, J., Coy, A., Ma, N. and Cooke, M. (2006) Recent advances in speech fragment decoding techniques. In Proceedings of Interspeech 2006, Pittsburgh, PA, pp. 85-88
Ma, N. and Green, P. (2005) Context-dependent word duration modelling for robust speech recognition. In Proceedings of Interspeech 2005, Lisbon, Portugal, pp. 2609-2612
Ma, N. and Green, P. (2005) Informing multisource decoding for robust speech recognition. One Day Meeting for Young Speech Researchers, University College London, 14 April, 2005
Ma, N. and Green, P. (2004) Improving whole-word HMMs in the Aurora 2 connected digits recognition task. One Day Meeting for Young Speech Researchers, University College London, 22 April, 2004
Ma, N., Sargaison, M. and Lawrence, N. (2003) Linear A#: A linear algebra based programming language for the .NET framework. 2nd Microsoft .NET ROTOR workshop, Pisa, Italy, 23-25 April 2003

Thesis

Ma, N. (2008) Informing Multisource Decoding in Robust Automatic Speech Recognition. PhD Thesis, Department of Computer Science, University of Sheffield
Ma, N. (2003) Identification and Elimination of Crosstalk in Audio Recordings. Masters Dissertation, Department of Computer Science, University of Sheffield