We're investing in data quality to strengthen our rankings

The use of generative AI and a new engine drawing on data from government agencies are among the ways in which we're doubling down on quality, says David Watkins
February 24, 2025

At Times Higher Education, we have a duty to higher education worldwide to uphold rigorous standards in the creation and production of university rankings. We know that students, parents, governments, industry, staff and university leaders look to these rankings to understand the performance of institutions, areas of excellence and opportunities for improvement.

Last year Phil Baty, THE's chief global affairs officer, wrote an article, one of whose points was: "Invest properly in data collection, validation and quality assurance." We at THE take data quality very seriously and have invested significantly in people, processes and technology to continuously improve the quality of all our data sources.

Over the next few weeks, I will write a regular blog examining each of our key data sources and discussing how we ensure the quality of the data. First, a brief overview.

University data

Universities across the world kindly provide us with their institutional data on an annual basis, and every year about 15 per cent more universities submit data to us. We greatly appreciate this time and effort, and we have a team dedicated to answering questions from institutions. In 2024, we invested in building a new data quality engine that allows us to detect potential issues in those submissions more accurately and to fine-tune our queries back to universities. This engine uses data from more than 70 major government and education agencies around the world for verification, as well as deep statistical analysis to detect anomalies.
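
We haven't published the internals of this engine, so the sketch below is purely illustrative: a minimal example of the general pattern of cross-checking a submitted figure against an external reference value and against an institution's own history. The field names, data structures and thresholds are all hypothetical.

```python
import statistics
from dataclasses import dataclass

@dataclass
class Submission:
    institution: str
    field: str      # e.g. "student_total" -- field names here are hypothetical
    value: float

def flag_issues(submissions, reference, history, tolerance=0.10, z_cutoff=3.0):
    """Return a list of queries to send back to institutions.

    reference: {(institution, field): value} drawn from external agency data
    history:   {(institution, field): [values from previous years]}
    """
    queries = []
    for s in submissions:
        key = (s.institution, s.field)

        # 1. Cross-check against an external reference figure, if one exists.
        ref = reference.get(key)
        if ref and abs(s.value - ref) / ref > tolerance:
            queries.append(
                f"{s.institution} / {s.field}: submitted {s.value} differs from "
                f"the agency figure {ref} by more than {tolerance:.0%}"
            )

        # 2. Statistical check against the institution's own submission history.
        past = history.get(key, [])
        if len(past) >= 3:
            mu, sigma = statistics.mean(past), statistics.stdev(past)
            if sigma > 0 and abs(s.value - mu) / sigma > z_cutoff:
                queries.append(
                    f"{s.institution} / {s.field}: submitted {s.value} is more than "
                    f"{z_cutoff} standard deviations from previous years"
                )
    return queries
```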

Evidence data

To compile our Impact Rankings, which measure universities' contributions to the United Nations' Sustainable Development Goals, we collect and analyse more than 250,000 evidence documents annually. Since the creation of those rankings in 2019, we have analysed those documents manually, but last year, to increase efficiency, scalability, accuracy and consistency, we started using generative AI to help assess them. Currently, one-third of documents are analysed this way, and we are working towards sending far more documents through this automated process.
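
This post does not detail how the generative AI assessment is implemented; purely as an illustration of the general pattern, the sketch below prompts a model to score a document against an SDG rubric and returns structured output. The call_model function, rubric fields and review threshold are placeholders, not our actual pipeline.

```python
import json

RUBRIC = (
    "You are assessing a university evidence document against the UN "
    "Sustainable Development Goals. Reply with JSON containing: "
    '"sdg" (most relevant goal, 1-17), "relevant" (true/false), '
    '"confidence" (0.0-1.0) and "rationale" (one sentence).'
)

def call_model(prompt: str) -> str:
    """Placeholder for whichever LLM API is in use; returns the raw completion."""
    raise NotImplementedError

def assess_document(text: str) -> dict:
    # Keep the prompt bounded; very long documents would need chunking or summarising.
    prompt = f"{RUBRIC}\n\nDocument:\n{text[:8000]}"
    result = json.loads(call_model(prompt))

    # Low-confidence judgements are routed to manual review rather than
    # accepted automatically.
    result["needs_human_review"] = result.get("confidence", 0.0) < 0.7
    return result
```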

Reputation data

Last year, more than 55,000 cited academics from around the world participated in our global Academic Reputation Survey, a fivefold increase in participation from our 2021 survey, which has improved the signal-to-noise ratio. During that period, we have restricted self-votes and ensured that voting patterns are sufficiently diverse. We are strengthening our data quality checks this year to ensure ever more rigour.
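
Again, the implementation details are not spelled out here; as a rough sketch of what self-vote restriction and a diversity check might look like in practice, the example below drops votes for a respondent's own institution and holds back responses whose nominations are heavily concentrated on a single university. The threshold is purely illustrative.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Response:
    respondent_institution: str
    nominations: list[str]   # institutions the respondent voted for

def clean_responses(responses, max_share=0.5):
    """Drop self-votes and flag insufficiently diverse voting patterns.

    max_share is an illustrative threshold: if more than half of a
    respondent's remaining nominations point at one institution, the
    whole response is held back for review.
    """
    kept, flagged = [], []
    for r in responses:
        # Remove votes for the respondent's own institution.
        votes = [n for n in r.nominations if n != r.respondent_institution]
        if not votes:
            flagged.append(r)
            continue

        # Check how concentrated the remaining votes are.
        top_share = Counter(votes).most_common(1)[0][1] / len(votes)
        (flagged if top_share > max_share else kept).append(
            Response(r.respondent_institution, votes)
        )
    return kept, flagged
```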

Citations data

We use bibliometric data from our partners at Elsevier to understand research quality at universities. Two years ago, we introduced a new suite of research quality metrics to help address anomalies, and universities have praised one metric in particular: research influence. This metric significantly strengthens our rankings by determining the relevance of citations. We are looking at extending the use of the research influence metric further in our rankings and external analyses.
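
The calculation behind research influence is not described in this post. As one illustration of how a citation's weight can depend on the standing of the citing work, the sketch below computes PageRank-style influence scores over a citation graph; it is not our methodology, and the parameters are arbitrary.

```python
import numpy as np

def citation_influence(adjacency: np.ndarray, damping: float = 0.85, iters: int = 100):
    """PageRank-style influence scores over a citation graph.

    adjacency[i, j] = 1 if paper i cites paper j, so a citation from a
    highly influential paper ends up counting for more than one from an
    obscure paper. Damping and iteration count are arbitrary choices.
    """
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1, keepdims=True)

    # Each paper spreads its weight evenly over the papers it cites;
    # papers that cite nothing spread weight uniformly over everything.
    transition = np.where(
        out_degree > 0,
        adjacency / np.where(out_degree == 0, 1, out_degree),
        1.0 / n,
    )

    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        scores = (1 - damping) / n + damping * transition.T @ scores
    return scores / scores.sum()
```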

David Watkins is managing director of data at Times Higher Education.
