Sensitive Data

Sensitive data is often confidential because it can be used to identify an individual, species, object, process, or location in a way that introduces a risk of discrimination, harm, or unwanted attention. It can include personal and health or medical data, Indigenous data, ecological data, and commercial-in-confidence data.

Sensitive data is commonly subject to legal and ethical obligations that impose restrictions on how it is accessed, used, and handled. The data often can’t simply be published and made openly accessible.

CARE data principles for Indigenous data

The emphasis on the FAIR (Findable, Accessible, Interoperable, Reusable) data principles creates concern for Indigenous Peoples who are asserting greater control over the use of Indigenous data and Indigenous knowledge for collective benefit. The CARE Principles for Indigenous Data Governance are people- and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination. These principles complement the existing FAIR principles, encouraging open and other data movements to consider both people and purpose in their advocacy and pursuits.

CARE data elements

Collective Benefit

Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.

Authority to Control

Indigenous Peoples’ rights and interests in Indigenous data must be recognised and their authority to control such data be empowered. Indigenous data governance enables Indigenous Peoples and governing bodies to determine how Indigenous Peoples, as well as Indigenous lands, territories, resources, knowledges and geographical indicators, are represented and identified within data.

Responsibility

Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit. Accountability requires meaningful and openly available evidence of these efforts and the benefits accruing to Indigenous Peoples.

Ethics

Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem.

De-identifying sensitive data

De-identified data is information from which the identifiers about a person have been permanently removed, or in which identifiers were never included.

De-identification is important because it can make research data sources available to future researchers whilst preserving an individual's privacy.

Here are some tips to start your de-identification:

Plan de-identification early in the research as part of your data management plan.
Make sure the consent process specifies the level of anonymity required and clearly states what may and may not be recorded, transcribed, or shared.
Retain original unedited versions of data for use within the research team and for preservation.
Create a de-identification log of all replacements, aggregations or removals made.
Store the log separately from the de-identified data files.
Identify replacements in text in a meaningful way, e.g. in transcribed interviews indicate replaced text with [brackets] or use XML markup tags, such as …….
For qualitative data (such as transcribed interviews or survey textual answers), use pseudonyms or generic descriptors rather than blanking out information.
Digitally manipulate audio and image files to remove identifying information.
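
The tips on pseudonyms, bracketed replacements, and keeping a log can be sketched in Python. This is a minimal illustration rather than a complete de-identification tool: the function name `pseudonymise`, the log fields, and the sample names are all hypothetical. Simple string matching will miss misspellings, nicknames, and indirect identifiers, so manual review is still needed.

```python
import re


def pseudonymise(text, replacements):
    """Replace each identifying term with a bracketed pseudonym.

    `replacements` maps an original term to its pseudonym or generic
    descriptor. Returns the edited text and a log of every change made.
    """
    log = []
    for original, pseudonym in replacements.items():
        # Match whole words only, so "Ann" does not match "Annual".
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        # A function replacement avoids escape handling in the pseudonym.
        text, count = pattern.subn(lambda m: "[" + pseudonym + "]", text)
        if count:
            log.append({
                "original": original,
                "replacement": pseudonym,
                "occurrences": count,
            })
    return text, log


# Hypothetical interview excerpt and replacement table.
transcript = "Ann said she met Dr Lee at Riverside Hospital. Ann was nervous."
replacements = {
    "Ann": "Participant 1",
    "Dr Lee": "a specialist",
    "Riverside Hospital": "a public hospital",
}

deidentified, log = pseudonymise(transcript, replacements)
print(deidentified)
```

Because the returned log contains the original identifiers, it would be written to a location separate from the de-identified files, with access limited to the research team.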

Publishing sensitive data

The advantages of publishing your sensitive data will probably outweigh any potential disadvantages when simple and appropriate steps are taken. Sensitive data can often be anonymised and de-identified so that the value of the data collected can be realised without compromising the privacy of the research participants.

Sensitive data that has been de-identified can be shared, but researchers can also place conditions around access to the published data. The Library can publish a description of your data, which means others can discover it and cite it without the data itself being made openly accessible. This is a metadata-only record that can be issued with a DOI (a permanent identifier) if appropriate and made discoverable through the Research Data Australia portal.
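
As a rough illustration of what a metadata-only record holds, the Python sketch below describes a dataset without attaching any of the data. The field names and values are hypothetical and do not reflect the Library's or Research Data Australia's actual schema.

```python
# Hypothetical metadata-only record: it describes the dataset and how to
# request access, but contains no data files.
record = {
    "title": "Interview transcripts on rural health access (de-identified)",
    "description": "Transcripts of semi-structured interviews.",
    "creators": ["Example Researcher"],
    "access": "mediated",  # requests are assessed against stated conditions
    "access_conditions": "Available to approved researchers on request.",
    "doi": None,           # filled in only if a DOI is issued
    "files": [],           # no data is attached to a metadata-only record
}


def is_metadata_only(rec):
    """Return True when the record describes data but attaches none."""
    return not rec["files"]


print(is_metadata_only(record))
```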

Publishing a metadata-only record increases the visibility of your research, which can lead to new collaborations, increased citations, and greater research impact.

The Australian Research Data Commons provides guidance on when and how to publish sensitive data as openly and ethically as possible.

If you need further assistance, the Support for research data management forms are available in WesternNow for both staff researchers and HDR students.