Big Data Dimension – Attributes of Big Data | 4 Vs of Big Data

Big data is important because it enables companies to gather, store, manage, and manipulate vast amounts of data at the right speed and at the right time to gain the right insights. To achieve this, the big data dimensions must be considered in any big data solution.

A big data solution must address the four Vs below, also known as the attributes of big data.

Volume refers to the quantity of data. Velocity refers to the speed at which different types of data are generated every second.

Variety refers to the structured, unstructured, and semi-structured nature of data such as web logs, sensor data, radio-frequency identification (RFID), meter data, stock ticker data, tweets, images, and video files on the Internet.

Veracity addresses whether data can be trusted when decisions need to be made.

Big Data Dimension

Image: Attributes of big data (http://rcvacademy.com/wp-content/uploads/2016/11/attributes-of-big-data.png)

  • Variety

    The variety of data is the first big data dimension.

    Variety refers to collecting data from various sources, both human and machine, such as social media, credit card usage, website visits, retail shops, hospitals, mobile devices, sensors, log files, and security cameras.

    Because data is captured from many sources, internal and external, and in multiple forms (structured, semi-structured, and unstructured), integrating these data types becomes very important.

  • Volume

    Volume is the second dimension of big data. It refers to the quantity of data.

    In the internet era, data is generated by machines and by human interaction on social sites and other platforms, so the volume of data generated every day is enormous.

    IBM estimates that 2.5 quintillion bytes (2.5 exabytes) of data are created each day.

  • Velocity

    The third big data dimension deals with the speed at which data flows from sources such as social media and internal business processes.

    In the internet era, the flow of data from social media is massive and continuous. Handling data at this velocity and extracting meaningful information from it helps an organization make key business decisions.

  • Veracity

    Veracity is the fourth attribute. It refers to abnormalities in data, that is, how much of the data can be trusted as it is when decisions have to be made.

    This dimension focuses on how to integrate data from different sources into consistently high-quality data that supports meaningful business decisions.
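The veracity dimension above can be illustrated with a minimal sketch. It assumes records from several sources have already been integrated into a common shape, and it applies two simple trust rules before the data is used for decisions. The `Record` fields, the source names, and the rules themselves are hypothetical and chosen only for illustration; a real solution would define its own quality checks.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """One integrated record; all field names here are hypothetical."""
    source: str                  # e.g. "credit_card" or "sensor"
    customer_id: Optional[str]   # None when the source did not supply an ID
    amount: Optional[float]      # None when the value is missing


def is_trustworthy(record: Record) -> bool:
    """Illustrative rule: require an ID and a non-negative amount."""
    return (
        bool(record.customer_id)
        and record.amount is not None
        and record.amount >= 0
    )


def split_by_veracity(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate records that pass the checks from those needing review."""
    trusted = [r for r in records if is_trustworthy(r)]
    suspect = [r for r in records if not is_trustworthy(r)]
    return trusted, suspect


records = [
    Record("credit_card", "C001", 42.50),
    Record("sensor", None, 17.0),        # missing ID
    Record("web_log", "C002", -5.0),     # negative amount looks abnormal
    Record("retail_shop", "C003", 12.0),
]
trusted, suspect = split_by_veracity(records)
print(len(trusted), "trusted;", len(suspect), "flagged for review")
```

Keeping the suspect records in a separate list, rather than discarding them, lets them be reviewed or corrected later.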

Big Data Definition | Introduction – Why Big Data is Needed

Before we define and introduce big data, we need to understand why big data technology is needed when we already have high-performance, reliable relational database management systems (RDBMS).

Why Big Data

The reason is that relational databases store data in a structured format, using data modeling techniques such as entity-relationship modeling, star schema modeling, or snowflake schema modeling.

Initially, this was just transactional data. As the data grew over time, organizations started analyzing it using data marts and data warehouses.

Business intelligence built on top of data marts and data warehouses is a key driver for CxOs to make forecasts, define budgets, and identify new drivers of market growth.

Image: Big Data Definition (source: Wikimedia Commons)

Before the internet era, business intelligence analysis was done on enterprise data. In the internet era, however, data existing outside the enterprise became a key input for strategic decisions.

Things started getting more complex in terms of the variety, velocity and volume of data with the advent of social networking sites and search engines such as Google, Yahoo, and Bing.

Businesses need to find a pragmatic approach to capturing this information in order to survive or gain a competitive advantage over other vendors.

Organizations need to collect data generated from a variety of sources, such as images, streaming videos, social media feeds, text files, documents, and sensor data, so that they can respond to customer needs and innovate quickly.

The solution to this problem is big data. However, the unstructured or semi-structured nature of the data, combined with the velocity at which it is created, is the real challenge for big data.

Big Data Definition

Let us go through the definition of big data below.

Big data is a term that describes the large volume of data, both structured and unstructured, that a business generates on a day-to-day basis.

However, it’s not the amount of data that’s important. The idea of big data is essentially how to earn more from each customer: maximizing sales and minimizing cost in order to increase the profit margin.

Organizations are discovering that important predictions can be made by sorting through and analyzing Big Data.

Data is the new oil. Thousands of companies work solely on collecting data. They have no manufacturing plants and no supply chain strategies; they just collect data.

Big Data in Action – Examples of Big Data Analytics

The American retail company Walmart collects 2.5 petabytes of unstructured data from 1 million customers every hour, which is equivalent to 167 times the books in America’s Library of Congress.

With tons of unstructured data being generated every hour, Walmart is improving its operational efficiency by leveraging big data analytics.

One of Walmart’s finest applications is Savings Catcher, which alerts a customer whenever a neighboring competitor lowers the price of an item the customer has already bought.

The application then sends the customer a gift voucher to compensate for the price difference. It runs on top of the huge amount of data that Walmart collects every hour.
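The price-comparison idea described above can be sketched in a few lines. This is only an illustration of the logic as the text describes it, not Walmart’s actual implementation: the function name `voucher_amount` and its parameters are hypothetical, and it assumes the voucher equals the gap between the price paid and the lowest competitor price.

```python
from typing import Sequence


def voucher_amount(price_paid: float, competitor_prices: Sequence[float]) -> float:
    """Return the price paid minus the lowest competitor price, never negative."""
    if not competitor_prices:
        return 0.0  # nothing to compare against, so no voucher
    lowest = min(competitor_prices)
    # Round to cents; max() ensures no voucher when the customer already paid less.
    return round(max(0.0, price_paid - lowest), 2)


# A competitor sells the item for less, so the difference is compensated.
print(voucher_amount(10.00, [9.50, 11.25]))
# The customer already paid the lowest price, so the voucher is zero.
print(voucher_amount(4.00, [4.75]))
```

At scale, the hard part is not this comparison but gathering and matching competitor prices for every purchased item, which is where the volume and velocity of the collected data come in.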

The universe of big data includes reviews and feedback from customers who talk about particular products through communication channels such as Facebook, Twitter, and product review forums.

It is important for organizations to understand and analyze what customers say about their goods and/or services to ensure customer satisfaction.

By sorting through and analyzing big data, organizations can make important predictions, for example by analyzing customer sentiment, which gives them a clear picture of what they need to do to outperform their competitors.

Therefore, big data can be analyzed for insights that lead to better decisions and strategic business moves.