Table 1 Data-driven analysis and related terminology

From: Data-driving methods: More than merely trendy buzzwords?

| Term | Definition |
| --- | --- |
| Big data | Data sets whose size or complexity exceeds the capacity of commonly used methods to capture, manage and process data. Big data are often characterized by their high volume, wide variety and the high velocity at which they must be processed (the "3V" definition). |
| Closed-loop system | A system in which some or all of the outputs are fed back as inputs. In health care, such a feedback loop enables real-time analysis of patient databases and could help optimize clinical care, leading to more efficient targeting of tests and treatments and closer vigilance for adverse effects (i.e. dynamic clinical data mining). |
| Cross-validation | A statistical technique for assessing how the results of an analysis will generalize to an independent data set. For example, it can be used to estimate how accurately a predictive model will perform in practice. |
| Crowdsourcing | The practice of obtaining a needed solution by soliciting contributions from a large group of people, especially from online communities. |
| Data mining | The process of collecting, searching through and analysing a large amount of data in a database in order to discover patterns or relationships. This approach does not look for causality; it simply aims to detect significant data configurations. |
| Machine learning | Methods derived from artificial intelligence that give computers the ability to learn without being explicitly programmed. Machine learning uses data to detect patterns and adjust programme actions accordingly. |
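
As an illustration of the cross-validation entry, the following is a minimal sketch of k-fold cross-validation in plain Python. It is not taken from the article. The helper names (`k_fold_splits`, `fit_line`, `cross_validated_mse`), the choice of a simple least-squares line as the predictive model, and the toy data are all assumptions made for this example.

```python
from statistics import mean


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, y_bar - slope * x_bar


def cross_validated_mse(xs, ys, k):
    """Return the mean squared error measured on each held-out fold."""
    errors = []
    for train, test in k_fold_splits(len(xs), k):
        # Fit only on the training part, then score on unseen points.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test]
        errors.append(mean(squared))
    return errors


# Hypothetical toy data, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]
fold_errors = cross_validated_mse(xs, ys, k=5)
print("per-fold MSE:", fold_errors)
print("mean MSE:", mean(fold_errors))
```

Each observation is held out exactly once, so the averaged error estimates how the model might perform on data it has not seen. The folds here are contiguous to keep the sketch short. With real data, the observations would usually be shuffled, or stratified, before splitting.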