Invited Talks

  • Exploiting top-down source models to improve binaural sound localisation. Institute of Acoustics, Chinese Academy of Sciences, Beijing, 30 September, 2015
  • Machine hearing exploiting head movements for binaural sound localisation in reverberant conditions. MRC Institute of Hearing Research, Nottingham, 31 March, 2015
  • Which sounds to listen to? The Hamlyn Centre, Imperial College London, 30 November, 2011
  • Auditory scene analysis and robust speech recognition. Invited lectures for the Postgraduate Programme, Department of Signal Theory, Networking and Communications, University of Granada, Spain, 16-19 November, 2010
  • Distant microphone speech recognition in a noisy indoor environment: combining soft missing data and speech fragment decoding. ISCA Workshop on Statistical And Perceptual Audition, Makuhari, Japan, 25 September, 2010
  • Missing Links between Hearing and Robust Speech Recognition. Department of Signal Theory, Networking and Communications, University of Granada, Spain, 10 May, 2010
  • Active listening in auditory scenes. Department of Computer Science, University of Sheffield, 29 October, 2008
  • Using monaural and binaural cues for speaker localization in single- and multi-source environments. International Workshop on Computational and Cognitive Models for Audio-Visual Interactions, Losehill Hall, Peak District National Park, UK, 11 March, 2008

Publications

Refereed journal papers

  • Ma, N., Gonzalez, J., Brown, G. (2018) Robust Binaural Localization of a Target Sound Source by Combining Spectral Source Models and Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech and Language Processing, 26(11): 2122-2131
  • Ma, N., May, T., Brown, G. (2017) Exploiting deep neural networks and head movements for robust binaural localisation of multiple sources in reverberant environments, IEEE/ACM Transactions on Audio, Speech and Language Processing, 25(12): 2444-2453
  • Gonzalez, J. A., Gomez, A. M., Peinado, A. M., Ma, N., and Barker, J. (2017) Spectral reconstruction and noise model estimation based on a masking model for noise robust speech recognition, Circuits, Systems, and Signal Processing, 36(9): 3731-3760
  • Ma, N., Morris, S., and Kitterick, P. T. (2016) Benefits to speech perception in noise from the binaural integration of electric and acoustic signals in simulated unilateral deafness, Ear and Hearing, 37(3): 248-259
  • Ma, N., Barker, J., Christensen, H., Green, P. (2013) A hearing-inspired approach for distant-microphone speech recognition in the presence of multiple sound sources, Computer Speech and Language, 27(3): 820-836
  • Carmona, J. L., Barker, J., Gomez, A. M. and Ma, N. (2013) Speech Spectral Envelope Enhancement by HMM-based Analysis/Resynthesis, IEEE Signal Processing Letters, 20(3): 563-566
  • Barker, J., Vincent, E., Ma, N., Christensen, H., Green, P. (2013) The PASCAL CHiME Speech Separation and Recognition Challenge, Computer Speech and Language, 27(3): 621-633
  • Gonzalez, J., Peinado, A., Ma, N., Gomez, A. and Barker, J. (2013) MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition, IEEE/ACM Transactions on Audio, Speech and Language Processing, 21(3): 624-635
  • Ma, N., Barker, J., Christensen, H., Green, P. (2012) Combining speech fragment decoding and adaptive noise floor modelling, IEEE/ACM Transactions on Audio, Speech and Language Processing, 20(3): 818-827
  • Barker, J., Ma, N., Coy, A. and Cooke, M. (2010) Speech fragment decoding techniques for simultaneous speaker identification and speech recognition, Computer Speech and Language, 24(1): 94-111
  • Ma, N., Green, P., Barker, J. and Coy, A. (2007) Exploiting correlogram structure for robust speech recognition with multiple speech sources, Speech Communication, 49(12): 874-891

Refereed conference papers and abstracts

  • Meutzner, H., Ma, N., Nickel, R., Schymura, C. and Kolossa, D. (2017) Improving audio-visual speech recognition using deep neural networks with dynamic stream reliability estimates. In Proceedings of ICASSP 2017, New Orleans, March 2017
  • Ma, N. and Brown, G. J. (2016) Speech localisation in a multitalker mixture by humans and machines. In Proceedings of Interspeech 2016, San Francisco, CA, 8th-12th September 2016
  • Guo, Y., Wang, X., Wu, C., Fu, Q., Ma, N. and Brown, G. J. (2016) A robust dual-microphone speech source localization algorithm for reverberant environments. In Proceedings of Interspeech 2016, San Francisco, CA, 8th-12th September 2016
  • Zeiler, S., Nickel, R., Ma, N., Brown, G. J., and Kolossa, D. (2016) Robust audiovisual speech recognition using noise-adaptive linear discriminant analysis. In Proceedings of ICASSP 2016, Shanghai, China, 22nd-25th March, pp. 2797-2801
  • Ma, N., Marxer, R., Barker, J. and Brown, G. J. (2015) Exploiting synchrony spectra and deep neural networks for noise-robust automatic speech recognition. In Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, pp. 490-495
  • Ma, N., Brown, G. J. and Gonzalez, J. A. (2015) Exploiting top-down source models to improve binaural localisation of multiple sources in reverberant environments. In Proceedings of Interspeech 2015, Dresden, Germany, 6th-11th September, pp. 160-164
  • Ma, N., Brown, G. J. and May, T. (2015) Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions. In Proceedings of Interspeech 2015, Dresden, Germany, 6th-11th September, pp. 3302-3306
  • Ma, N., May, T., Wierstorf, H. and Brown, G. J. (2015) A machine-hearing system exploiting head movements for binaural sound localisation in reverberant conditions. In Proceedings of ICASSP 2015, Brisbane, Australia, 19th-24th April, pp. 2699-2703
  • May, T., Ma, N. and Brown, G. J. (2015) Robust localisation of multiple speakers exploiting head movements and multi-conditional training of binaural cues. In Proceedings of ICASSP 2015, Brisbane, Australia, 19th-24th April, pp. 2679-2683
  • Schymura, C., Ma, N., Brown, G. J., Walther, T., and Kolossa, D. (2014) Binaural sound source localisation using a Bayesian-network-based blackboard system and hypothesis-driven feedback. In Proceedings of 7th FORUM ACUSTICUM 2014, Krakow, Poland, 7th-12th September
  • Ma, N. and Barker, J. (2013) A fragment-decoding plus missing-data imputation system evaluated on the 2nd CHiME challenge. In Proceedings of CHiME-2013, the 2nd International Workshop on Machine Listening in Multisource Environments, Vancouver, Canada, 1st June, pp. 53-58
  • Ma, N. and Barker, J. (2012) Coupling identification and reconstruction of missing features for noise-robust automatic speech recognition. In Proceedings of Interspeech 2012, Portland, Oregon, 9th-13th September, pp. 2637-2640
  • Gonzalez, J., Peinado, A., Gomez, A. and Ma, N. (2012) Log-spectral feature reconstruction based on an occlusion model for noise robust speech recognition. In Proceedings of Interspeech 2012, Portland, Oregon, 9th-13th September, pp. 2629-2632
  • Gonzalez, J., Peinado, A., Gomez, A., Ma, N. and Barker, J. (2012) Combining missing-data reconstruction and uncertainty decoding for robust speech recognition. In Proceedings of ICASSP 2012, Kyoto, Japan, 25th-30th March, pp. 4693-4696
  • Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Binaural cues for fragment-based speech recognition in reverberant multisource environments. In Proceedings of Interspeech 2011, Florence, Italy, 28th-30th August, pp. 1657-1660
  • Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Recent advances in fragment-based speech recognition in reverberant multisource environments. In Proceedings of ISCA Workshop on Machine Listening in Multisource Environments, Florence, Italy, pp. 68-73
  • Morales, J., Ma, N., Sanchez, V., Carmona, J., Peinado, A., and Barker, J. (2011) A pitch based noise estimation technique for robust speech recognition with missing data. In Proceedings of ICASSP 2011, Prague, Czech Republic, 22nd-27th May, pp. 4808-4811
  • Ma, N., Barker, J., Christensen, H. and Green, P. (2011) Incorporating localisation cues in a fragment decoding framework for distant binaural speech recognition. In Proceedings of IEEE HSCMA-2011, Edinburgh, Scotland, pp. 207-212
  • Ma, N., Barker, J., Christensen, H. and Green, P. (2010) Distant microphone speech recognition in a noisy indoor environment: combining soft missing data and speech fragment decoding. In Proceedings of ISCA SAPA-2010, Makuhari, Japan, pp. 19-24
  • Christensen, H., Barker, J., Ma, N. and Green, P. (2010) The CHiME corpus: a resource and a challenge for Computational Hearing in Multisource Environments. In Proceedings of Interspeech 2010, Makuhari, Japan, 26th-30th September, pp. 1918-1921
  • Ma, N., Bartels, C., Bilmes, J. and Green, P. (2009) Modelling the prepausal lengthening effect for speech recognition: A dynamic Bayesian network approach. In Proceedings of ICASSP 2009, Taipei, Taiwan, 19th-24th April, pp. 4617-4620
  • Christensen, H., Ma, N., Wrigley, S. and Barker, J. (2009) A speech fragment approach to localising multiple speakers in reverberant environments. In Proceedings of ICASSP 2009, Taipei, Taiwan, 19th-24th April, pp. 4593-4596
  • Ma, N. and Green, P. (2008) A 'speechiness' measure to improve speech decoding in the presence of other sound sources. In Proceedings of Interspeech 2008, Brisbane, Australia, pp. 1285-1288
  • Christensen, H., Ma, N., Wrigley, S.N. and Barker, J. (2008) Improving source localisation in multi-source, reverberant conditions: exploiting local spectro-temporal location cues. Journal of the Acoustical Society of America, 123(5): 3294
  • Ma, N., Barker, J. and Green, P. (2007) Applying word duration constraints by using unrolled HMMs. In Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 1066-1069
  • Christensen, H., Ma, N., Wrigley, S. and Barker, J. (2007) Integrating pitch and localisation cues at a speech fragment level. In Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 2769-2772
  • Ma, N., Green, P. and Coy, A. (2006) Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound source. In Proceedings of Interspeech 2006, Pittsburgh, PA, pp. 669-672
  • Barker, J., Coy, A., Ma, N. and Cooke, M. (2006) Recent advances in speech fragment decoding techniques. In Proceedings of Interspeech 2006, Pittsburgh, PA, pp. 85-88
  • Ma, N. and Green, P. (2005) Context-dependent word duration modelling for robust speech recognition. In Proceedings of Interspeech 2005, Lisbon, Portugal, pp. 2609-2612
  • Ma, N. and Green, P. (2005) Informing multisource decoding for robust speech recognition. One Day Meeting for Young Speech Researchers, University College London, 14 April, 2005
  • Ma, N. and Green, P. (2004) Improving whole-word HMMs in the Aurora 2 connected digits recognition task. One Day Meeting for Young Speech Researchers, University College London, 22 April, 2004
  • Ma, N., Sargaison, M. and Lawrence, N. (2003) Linear A#: A linear algebra based programming language for the .NET framework. 2nd Microsoft .NET ROTOR workshop, Pisa, Italy, 23-25 April 2003

Thesis

  • Ma, N. (2008) Informing Multisource Decoding in Robust Automatic Speech Recognition. PhD Thesis, Department of Computer Science, University of Sheffield
  • Ma, N. (2003) Identification and Elimination of Crosstalk in Audio Recordings. Masters Dissertation, Department of Computer Science, University of Sheffield