
Jie Gao BSc (Hons) MSc (Hons) MBCS CSM

I am a research assistant and PhD candidate in the OAK Research Group at The University of Sheffield, Department of Computer Science. My supervisor is Professor Fabio Ciravegna.

Learn about what I do

Large Scale Data and Information Normalisation, Extraction and Integration

My current research focuses on designing and implementing methods for large-scale information and data integration. In particular, I explore how semantic technologies can be leveraged for model integration in networks of human and physical sensors, with particular consideration of real-time big data stream integration and optimisation mechanisms for heterogeneous monitoring networks. As my research connects directly to the European FP7 WeSenseIt project and the EPPICS project, I pay particular attention to researching and developing theoretical and conceptual semantic models and integration approaches in the field of Multisensor Data Fusion. The aim is to enhance participatory decision support in environmental monitoring and city-wide event management by fusing complex, large-scale heterogeneous sensor data.

WeSenseIt – Citizen Water Observatories

WeSenseIt is a multi-site, multi-disciplinary project involving researchers and industrial partners in web technologies, environment, sensing and social media monitoring. Working together with the EU and our sister projects, the project aims to develop a new concept of citizen observatories of water, creating a two-way communication channel between authorities and citizens who cooperate to monitor rivers, covering everything from water quality to flooding.

SPEEAK-PC – Sustained Process Excellence through Embedding of Analytics and Knowledge Management into Process Chain

This project directly addresses the challenge organisations face in realising actionable knowledge from an ever-increasing flood of potentially valuable data. Currently, the need for skilled ‘data scientists’ is a major bottleneck in this regard. The project will apply existing technologies and develop new ones to create an ICT tool set that alleviates this bottleneck by providing a collaborative platform with tools for data integration and analytics deployment that are accessible to non-ICT specialists. The project is funded by Innovate UK, started on 1.10.2014 and runs for 18 months.

SETA – ubiquitous data and service ecosystem for better metropolitan mobility

The objective of the SETA project is to provide effective solutions for intelligent and sustainable mobility, i.e. the smarter, greener and more efficient movement of people and goods. SETA will provide a radical change from transport as a series of separate modal journeys to an integrated, reactive, intelligent mobility system. It will provide always-on, pervasive services to citizens and businesses, as well as to decision makers, to support safe, sustainable, effective, efficient and resilient mobility. The project lasts 3 years, with €5.5M of funding from EU Horizon 2020, of which €1.2M is for Sheffield. Professor Ciravegna is the project director (2016-2019).


In my previous position as a KTP Associate with the University of Westminster and ActiveStandards, I was involved in a two-year research project (KTP Project 007326). I worked alongside academics from the Westminster Semantic Computing Group within the ActiveStandards Application Development team to research, plan, design, develop and benchmark a brand-new in-house semantic annotation system. The project leveraged Natural Language Processing and Semantic Web technologies to implement automatic semantic annotation and semantics-based content retrieval on the ActiveStandards™ platform for its clients. My research and development focus in the KTP project was to enhance the relevance and findability of content so as to facilitate multisite content management for large corporations. During the semantic product R&D stage, I had the opportunity to collaborate closely with the academic community and leading semantic technology providers, and to apply academic best practice to real business applications. The partnership project was graded at the second-highest level and will be shortlisted as one of the KTP case studies among 250 nationally funded projects.

In addition to the experience gained on the KTP project, I am graduating with an Honours Master's degree from the University of Southampton under the supervision of Prof. Leslie Carr. My research interests at Southampton lie in the area of Semantic Web technologies and Enterprise Content Management, ranging from theory to design to implementation, with a focus on the potential of Linked Data technology for sharing, reusing and merging knowledge across heterogeneous platforms (a.k.a. "data silo" issues).

My solid knowledge of computer science technologies and project management also comes from my years of experience in the software engineering industry. I was a senior software engineer at a top-5 US-Chinese software service company (Dextrys) and worked on many large-scale enterprise application development projects with complicated business requirements, such as the Chartis Brazil Insurance System, the Hanover Insurance system for Hanover, the Mayban Fortis Insurance System, the EA Back Office System for Electronic Arts Inc. and the Non-Domestic Management System for the Hong Kong Housing Authority. My role during those four years in industry was to design and enhance application solutions and work closely with the business analysis team; transform clients' ideas and business concepts into carefully designed and well-authorised software engineering solutions; work closely with the UI team to ensure high software usability; and coordinate with the QA team to improve software quality throughout the whole software development lifecycle. This valuable industry experience improved not only my professional skills in software engineering, but also my ability to cooperate, my team spirit, my analytical ability and my ability to respond to emergencies.

In my spare time, I enjoy running, swimming, cycling, movies, travelling, good food and cooking at home. I am also a big fan of beer and love going out for a drink with friends.

My Research Interests

My research interests broadly include:

  • Natural Language Processing
  • Terminology/Entity Recognition
  • Linked Data
      * in particular, its application in knowledge disambiguation and information integration
      * and its use as background knowledge for supervised machine learning
  • Information Extraction
  • Ontology Engineering
  • Enterprise Knowledge Management


  • Gao, Jie, and Suvodeep Mazumdar. "Exploiting linked open data to uncover entity types." In Semantic Web Evaluation Challenge, pp. 51-62. Springer International Publishing, 2015.
  • Zhang, Z., Gao, J. and Ciravegna, F. (2016) JATE2.0: Java Automatic Term Extraction with Apache Solr. In: The LREC 2016 Proceedings. The International Conference on Language Resources and Evaluation, 23-28 May 2016, Slovenia. LREC 2016.
  • Zhang, Z., Gao, J., and Gentile, A.L. The LODIE team (University of Sheffield) Participation at the TAC2015 Entity Discovery Task of the Cold Start KBP Track. In: Proceedings of the 2015 Text Analysis Conference. TAC Knowledge Base Population (KBP) 2015, 16-17 Nov 2015, Gaithersburg, Maryland USA.

Contact Details

Jie Gao
Room G017
Regent Court
211 Portobello
Department of Computer Science
University of Sheffield
Sheffield S1 4DP

Phone: +44 (0)114 222 1875
Fax: +44 (0)114 278 1810
