The best way to understand how data have been collected, what questions were used and how to interpret the information is to examine the metadata. This module will cover what metadata is and why it’s important, how metadata standards help us understand the main features of a dataset, and how to look up questions, variables, questionnaires and datasets to find the data you need.
Suggested citation: Kaye, N., Mills, H. & Johnson, J. (2020). Understanding metadata. CLOSER Learning Hub, London, UK: CLOSER
Metadata – data about data – is crucially important when trying to make sense of datasets. This section outlines how metadata provides the essential context for interpreting data, what it looks like and how it helps when using datasets for research.
Structured metadata defines the relationship between data items to enable computer systems to understand the contextual meaning of the data – to display the relevant information on a website, for instance.
Structured metadata tells a computer what something is, how it relates to other objects and what to do with it. By standardising the content and structure, it makes it easier for computers to automatically extract information from the metadata.
This information can then be provided to researchers to help them discover and access data from many different sources. It facilitates data sharing and allows data collected in one study to be re-used in the future by other researchers.
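As a rough illustration of what "structured" means here, the sketch below describes two variables and the questions they come from as linked records. All names, fields and values are hypothetical, invented for this example; they do not follow any particular metadata standard or come from a real study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """What was asked, and in which questionnaire (hypothetical fields)."""
    question_id: str
    text: str
    questionnaire: str


@dataclass(frozen=True)
class Variable:
    """A column in a dataset, linked back to the question that produced it."""
    name: str
    label: str
    dataset: str
    question_id: str


# Illustrative records only; not taken from any real study.
QUESTIONS = {
    "q_smoke": Question("q_smoke", "Do you currently smoke cigarettes?", "Wave 1 questionnaire"),
    "q_height": Question("q_height", "How tall are you?", "Wave 1 questionnaire"),
}

VARIABLES = [
    Variable("smk01", "Currently smokes", "wave1_data", "q_smoke"),
    Variable("ht01", "Height in cm", "wave1_data", "q_height"),
]


def find_variables(keyword, variables):
    """Return variables whose label contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [v for v in variables if keyword in v.label.lower()]


def source_question(variable, questions):
    """Follow the link from a variable to the question it was derived from."""
    return questions.get(variable.question_id)


matches = find_variables("smokes", VARIABLES)
for variable in matches:
    question = source_question(variable, QUESTIONS)
    print(variable.name, "<-", question.text, "(" + question.questionnaire + ")")
```

Because every record has the same fields and the links between records are explicit, a program can search the labels and trace a variable back to its question without a person reading the documentation by hand.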
The documentation that accompanies a dataset provides important information describing how the study was done, how data were collected and how to interpret its metadata. Additionally, for longitudinal studies, the documentation provides information about the different waves of data collection and whether certain questions were included in some waves but not others.
Researchers need better access to information about data so as to be able to discover variables relevant to their research and to find variables that, although collected through different sources, are nonetheless comparable.
Whilst the standards used and the information available have improved significantly over the years, the information provided to data repositories and archives has, in general, not. It continues to require large amounts of effort on the part of researchers to turn it into high-quality research resources. One reason is that information about the data collection is very often detached from the datasets made available for research, and high-quality data requires them to be better connected.
How much have you learned about metadata? When you have completed all the sections in this module, try to complete the crossword puzzle to test how much you know.
CLOSER’s Training Hub provides more in-depth learning on metadata, in our data management section, which details data documentation levels, metadata standards and more.