Tuesday 7 July 2015

Accessing micro-data (part I)

data.path / FlickR user r2hox (CC BY 2.0)
Obtaining micro-data can be tricky and time-consuming, and is sometimes virtually impossible: the data may simply not be available in the way one expects, either because it is not online or because it has never been collated or presented in the form needed. Hence, if you are thinking of embarking on a research project whose approach relies exclusively or heavily on access to such data, please do not commit to that research unless you already have the data. (That is generally a good approach for any data-led or data-centred research, but with micro-data it is absolutely crucial.)

The reason organisations do not simply hand over such detailed data lies in the data itself: a discrete group of people might be identifiable by postcode or other means, and protecting the data helps to ensure that they are not disadvantaged or singled out. Most organisations have a two-part approach to granting or denying access to such data:
  1. Can you prove that you are a bona fide researcher? This might seem obvious, but depending on the stage of your academic career it can be tricky. It is easy to see this step not only as a stumbling block, but also as a snobbish or even elitist stance on the part of the organisations holding such data. But look at it from the perspective of the guardians of the data: they want to know whether you can make use of it (i.e. whether you have the experience and skills to analyse, present and discuss the data in a way that will provide new insights for scholarship, or even for policy makers). Depending on how big or detailed the data is, the organisation asking for your credentials might demand quite a high level of proof that you can handle such a project in a scholarly manner.
  2. The second part of an application for micro-data often raises more technical and legal questions. Can the researcher guarantee that the data will not be used by anyone else (unless they have individually applied for access too)? Often the data may not be stored on open networks, and sometimes not even on local servers or computers. It may be held only for the minimum time required to consult and analyse the raw data, and it usually needs to be destroyed once the project is finished. Again, taking the perspective of those who are responsible for keeping such data safe from abuse, one can understand their position.

So, when you have found a micro-data source, do:
  • allow enough time to investigate how to apply for access.
  • work on the assumption that you need to apply for access and that the application might cover the two aspects outlined above.
  • appreciate the position of those responsible for guarding this data.
  • accept that such data is usually only disclosed to an individual, not a library, a faculty or a team of researchers.
and do not:
  • waste time trying to find alternative sources of data, or rely on someone who has access to the data to let you (“quietly”) have access to it too. If this micro-data is worth consulting, then it is probably only available from one source.
  • take the procedure personally; as some of the data relates to real people and their lives, it is deemed worth protecting.
  • plan on having access to such data without having applied for it (assuming that you will get it quickly is not realistic).

Screenshot of http://bit.ly/1JLbHUh
Just to give one recent example where I was asked for my opinion: the ONS’ Wealth and Assets Survey is the kind of data where one must apply for Approved Researcher accreditation.
Admittedly, all of this is tricky and definitely time-consuming, but the earlier in your research you appreciate the above, the easier it will be to decide whether you can manage to conduct the research you are hoping to do. There are one or two notable exceptions to the above, of which more in another post.

CG