The University of Sheffield
Natural Language Processing Group

Hands-on QuEst++

QuEst++ is open source software for Quality Estimation (QE) of machine translation. It was developed by Professor Lucia Specia's team at the University of Sheffield and includes contributions from a number of researchers.

It has two main modules: a Java module that extracts a number of word-, sentence- and document-level features, and a Python module that interfaces with the scikit-learn toolkit for machine learning. It also includes a few Python and shell scripts for small auxiliary tasks.
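The two-module split can be illustrated with a minimal sketch: features extracted elsewhere are read from a file and passed to a scikit-learn regressor. This is not QuEst++'s own code or API. The function names `parse_features` and `train_regressor`, the tab-separated one-sentence-per-line layout, and the choice of an SVR model are all assumptions made for illustration.

```python
import io


def parse_features(handle):
    """Read feature vectors: one sentence per line, tab-separated numbers.

    The file layout is an assumption for this sketch, not a documented format.
    """
    rows = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append([float(value) for value in line.split("\t")])
    return rows


def train_regressor(features, scores):
    """Fit a support vector regressor on feature vectors and quality scores.

    scikit-learn is imported lazily so that parse_features can be used
    without it installed. SVR is a hypothetical model choice here.
    """
    from sklearn.svm import SVR

    model = SVR(kernel="rbf")
    model.fit(features, scores)
    return model


# Hypothetical feature file: two sentences, three features each.
sample = io.StringIO("12.0\t0.5\t3.0\n7.0\t0.25\t1.0\n")
rows = parse_features(sample)
```

With real data, `rows` would be paired with one quality label per sentence (for example, a post-editing effort score) and passed to `train_regressor(rows, labels)`.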

See Hands-on material for details.

Hands-on resource for including a new feature: List of simple words.

Pre-tutorial Instructions

Versions of QuEst++

  • Get QuEst++ from our GitHub repository (source code and basic tools - recommended for developers).
  • Get the vanilla version of QuEst++ (a complete JAR file of the stable version of the code, with all libraries included - recommended for users).

  • License: our Java code is released under the BSD license and our Python code under the Apache License 2.0. For pre-existing code and resources (e.g., scikit-learn, SRILM, GIZA++, and the Stanford and Berkeley parsers), please check their respective websites.

    Check the current baseline, black-box, and glass-box lists of features QuEst++ can extract at sentence level.

    Citing QuEst++

    Lucia Specia, Gustavo Henrique Paetzold and Carolina Scarton (2015): Multi-level Translation Quality Prediction with QuEst++. In the Proceedings of ACL-IJCNLP 2015 System Demonstrations, Beijing, China, pp. 115-120. [PDF][BIBTEX]