Last week, we introduced our latest blog theme: data optimization. This topic is front and center right now, and everyone's talking about it. However, that doesn't mean people understand how big data is actually defined.
With countless definitions floating around, many of them complex and confusing, it's hard for companies to know exactly what big data offers and how to use it. The first step towards data optimization is getting comfortable with the topic.
In late 2013, a research team from St. Andrews University conducted a survey of big data definitions. Their goal was to figure out what concepts have gained traction and combine them in a concise, consistent summary that takes the ambiguity out of the term.
The researchers found three recurring themes across the common conceptions of big data. Each definition made at least one of the following claims, and most made two.
Size: the volume of the datasets is a critical factor.
Complexity: the structure, behaviour and permutations of the datasets are a critical factor.
Technologies: the tools and techniques used to process a sizable or complex dataset are a critical factor.
Based on their survey, the St. Andrews researchers developed a concise definition of big data that's pretty useful.
Big data is a term describing the storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.
Applying it is another matter, but we’ll get to that next week.
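In the meantime, if you want a feel for one of the techniques that definition names, here's a minimal, purely illustrative sketch of the MapReduce pattern: a toy word count over a few text fragments, written in Python. The function names and sample data are made up for this example and aren't from the St. Andrews paper; real MapReduce systems distribute these steps across many machines.

```python
# Toy illustration of the MapReduce pattern: count words across text fragments.
# Hypothetical example code; real frameworks (e.g. Hadoop) run the map and
# reduce steps in parallel over a cluster.
from collections import defaultdict

def map_phase(fragment):
    # Emit (word, 1) pairs for each word in one fragment of the dataset.
    return [(word.lower(), 1) for word in fragment.split()]

def shuffle(mapped_pairs):
    # Group the emitted counts by word, as a framework would between phases.
    grouped = defaultdict(list)
    for word, count in mapped_pairs:
        grouped[word].append(count)
    return grouped

def reduce_phase(grouped):
    # Sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

fragments = ["big data is big", "data about data"]
mapped = [pair for fragment in fragments for pair in map_phase(fragment)]
print(reduce_phase(shuffle(mapped)))
# {'big': 2, 'data': 3, 'is': 1, 'about': 1}
```

The point isn't the word count itself; it's the shape of the computation: break the data into pieces, process each piece independently, then combine the results. That shape is what lets these techniques scale to the large and complex datasets the definition describes.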
Image via (cc) Bilal Kimoon