Data Provenance

Big data and analytics are changing the way healthcare professionals make decisions about their patients and populations. But how can the data be trusted to support such critical decisions?

What problems do healthcare professionals have with analytics?

Complex processes

There can be many steps between raw data and a final report. This can make it hard to be sure that the outputs are accurate and without bias.

Uncleansed data

One of the chief concerns is erroneous data: errors that are not cleansed from the inputs carry through to the results.

Many data sources

Matching data across many sources can raise doubts about its integrity and accuracy.

Failure to reproduce results

74% of researchers have tried and failed to reproduce another scientist's experiments.

Want to learn more?

In early 2017 the USACM issued a statement on Algorithmic Transparency and Accountability.

What's being done to build trust in analytics?

Data Provenance

One important development in research is the use of data provenance. In a nutshell, data provenance is a record that describes:

  • The history of a piece of data
  • Where it came from
  • How it came to be in its present state
  • Who or what acted upon it
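The four points above can be pictured as a small data structure. The following is a minimal sketch only, not how any particular analytics product stores provenance; the names `ProvenanceRecord`, `ProvenanceEvent`, `record` and `history`, and the sample values, are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceEvent:
    """One action taken on a piece of data (hypothetical structure)."""
    action: str          # how the data came to be in its present state
    agent: str           # who or what acted upon it
    timestamp: datetime  # when the action happened


@dataclass
class ProvenanceRecord:
    """The recorded history of a single piece of data (hypothetical structure)."""
    data_id: str
    source: str  # where the data came from
    events: list[ProvenanceEvent] = field(default_factory=list)

    def record(self, action: str, agent: str) -> None:
        """Append a new event; earlier events are left untouched."""
        self.events.append(
            ProvenanceEvent(action, agent, datetime.now(timezone.utc))
        )

    def history(self) -> list[str]:
        """Return the history as readable lines, oldest first."""
        lines = [f"{self.data_id} originated from {self.source}"]
        for event in self.events:
            lines.append(
                f"{event.timestamp.isoformat()} {event.agent}: {event.action}"
            )
        return lines


# Illustrative values only; they do not come from a real dataset.
record = ProvenanceRecord(
    data_id="admissions-2017",
    source="hospital admissions extract",
)
record.record("removed duplicate rows", agent="cleansing job")
record.record("corrected a mistyped date of birth", agent="analyst A")

for line in record.history():
    print(line)
```

Because events are only ever appended, the record keeps the full story of the data rather than just its latest state.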

How does provenance help build trust?

When healthcare analytics software shows the provenance of the data it uses, it can help to:

  • Link data together
  • Produce repeatable results
  • Improve auditing
  • Tell the data's full story

Provenance maintains the integrity of data so healthcare professionals can make informed decisions using trusted, accurate data.
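One way software might support repeatable results and auditing is to store a fingerprint of the input data alongside each report, so a later re-run can be checked against the same inputs. The sketch below assumes this approach purely for illustration; `fingerprint` and `run_report` are hypothetical helpers and the rows are made-up sample values.

```python
import hashlib
import json


def fingerprint(rows: list[dict]) -> str:
    """Return a SHA-256 digest of the rows, independent of key order within a row."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_report(rows: list[dict]) -> dict:
    """Hypothetical report: count the rows and attach the input fingerprint."""
    return {"row_count": len(rows), "input_fingerprint": fingerprint(rows)}


# Illustrative sample data only.
rows = [{"patient": "p1", "ward": "A"}, {"patient": "p2", "ward": "B"}]

first = run_report(rows)
# Same data, keys written in a different order: the fingerprint is unchanged.
second = run_report([{"ward": "A", "patient": "p1"}, {"ward": "B", "patient": "p2"}])
# One extra row: the fingerprint changes, flagging that the inputs differ.
changed = run_report(rows + [{"patient": "p3", "ward": "A"}])

print(first["input_fingerprint"] == second["input_fingerprint"])   # True
print(first["input_fingerprint"] == changed["input_fingerprint"])  # False
```

If two reports carry the same fingerprint, they were produced from the same data; if the fingerprints differ, an auditor knows to look for what changed in the inputs.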

What should I look for in my analytics software?

Ask these questions when evaluating the software.

Can I easily...

  • See where and when data has been changed?
  • See who has made changes to the data?
  • Record my own decision making?
  • Explore the thinking behind a report?
  • Visualise the audit history?
  • See the data behind the report at any time?
  • Share and discuss the findings with my peers?
  • Re-run a report but with different data?