What is Big Data? – A definition with five Vs

To define where Big Data begins and at which point the targeted use of data becomes a Big Data project, you need to take a look at the details and key features of Big Data. The most common definition is based on the 3-V model from the analysts at Gartner and, while this model is certainly important and correct, it is now time to add two further crucial factors.

Big Data definition – the three fundamental Vs:

  • Volume refers to the huge amount of data that companies, for example, produce every day. These data volumes are now so large and complex that they can no longer be stored or analyzed using conventional data-processing methods.
  • Variety refers to the diversity of data types and data sources. An estimated 80 percent of the data in the world today is unstructured and at first glance shows no obvious relationships. Thanks to Big Data analytics and its algorithms, such data can be sorted into a structured form and examined for relationships (see the small sketch after this list). The data does not consist only of conventional datasets, but also of images, videos and speech recordings.
  • Velocity refers to the speed with which data is generated, analyzed and reprocessed. Today this is often possible within a fraction of a second, in what is known as real time.

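To make the three fundamental Vs a little more concrete, here is a minimal, purely illustrative Python sketch. It is not taken from any particular Big Data platform, and the record layout and field names ("type", "payload_size", "duration") are invented for this example. It consumes one stream in which very differently shaped records arrive (Variety), handles them one at a time rather than loading everything into memory at once (Volume), and measures how many records per second it gets through (Velocity):

    # Purely illustrative sketch: one stream mixes records of very different
    # shapes (Variety); they are handled one at a time instead of being loaded
    # into memory all at once (Volume), and throughput is measured (Velocity).
    # All field names used below are invented for this example.
    import json
    import time

    def structure_record(raw: bytes) -> dict:
        """Turn one raw record into a small structured summary, by type."""
        record = json.loads(raw)
        if record.get("type") == "image":
            return {"kind": "image", "size_bytes": record.get("payload_size", 0)}
        if record.get("type") == "speech":
            return {"kind": "speech", "duration_s": record.get("duration", 0.0)}
        return {"kind": "event", "fields": len(record)}  # conventional dataset row

    def process_stream(records) -> float:
        """Process records one by one and return throughput in records/second."""
        start, count = time.time(), 0
        for raw in records:
            structure_record(raw)
            count += 1
        elapsed = time.time() - start
        return count / elapsed if elapsed > 0 else float("inf")

    # Example: three very different records arriving on the same stream.
    sample = [
        b'{"type": "image", "payload_size": 1048576}',
        b'{"type": "speech", "duration": 12.5}',
        b'{"user_id": 42, "action": "login"}',
    ]
    print(f"{process_stream(sample):.0f} records/second")
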
Big Data definition – two crucial, additional Vs:

  • Validity is the guarantee of data quality; the closely related Veracity denotes the authenticity and credibility of the data. Big Data means working with data of every degree of quality, since the sheer Volume usually comes at the expense of quality.
  • Value denotes the added value for companies. Many companies have recently established their own data platforms, filled their data pools and invested a lot of money in infrastructure. It is now a question of generating business value from their investments.

As we wrote in our previous blog post, defining Big Data is not easy, since the term touches many aspects and disciplines. For many people, what matters most is business success (Value). The key to it is gaining new information from huge amounts of data (Volume) drawn from highly diverse sources (Variety) and of differing quality (Validity), and making that information available to many users very quickly (Velocity), so that important decisions can be made fast enough to gain or maintain a competitive advantage.

In his book “Big Data: Using SMART Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance”, Bernard Marr writes that if Big Data does not ultimately lead to an advantage, it is useless. We could not agree more.