Saturday, 17 December 2016

Big Data and Hadoop

What is Big Data?
In general, Word or Excel documents hold data in the range of megabytes (MB).
Movies and application code can hold data in the range of gigabytes (GB).
Data on the scale of petabytes (PB, 10^15 bytes) is called Big Data.
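To put these units in perspective, here is a minimal sketch of the arithmetic. It assumes decimal (SI) units, where 1 PB = 10^15 bytes, and a 2 GB movie; the helper names are hypothetical and only for illustration.

```python
# Decimal (SI) byte units, assumed here: 1 PB = 10**15 bytes.
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}


def to_bytes(value, unit):
    """Convert a size expressed in the given unit to bytes (hypothetical helper)."""
    return value * UNITS[unit]


def how_many_fit(container_value, container_unit, item_value, item_unit):
    """Count how many whole items of one size fit into a container of another size."""
    return to_bytes(container_value, container_unit) // to_bytes(item_value, item_unit)


# Number of 2 GB movies that fit in 1 PB.
print(how_many_fit(1, "PB", 2, "GB"))  # prints 500000
```

So a single petabyte corresponds to about half a million movies of that size.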

It is estimated that almost 90% of the world's data has been generated in the last 3-4 years.

Sources of Big Data -
1. Social networking sites such as Facebook, Google, Twitter, etc.
2. E-commerce sites such as Amazon, Flipkart, Alibaba, etc.
3. Weather stations
4. Telecom companies
5. Share market

3 V's of Big Data -
1. Velocity - The rate at which data is generated and grows; the volume roughly doubles every 2 years.
2. Variety - The data can be structured, such as tables in an RDBMS, semi-structured, such as documents in a NoSQL store like MongoDB, or unstructured.
A bank transaction is structured data stored in tables.
A log file or CCTV footage is unstructured data.
3. Volume - The size of the data.
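The difference between structured and unstructured data can be sketched in a few lines. The record fields and the log line format below are made up for illustration; real logs vary widely, so the pattern is an assumption, not a general parser.

```python
import re

# Structured: fixed, named fields, like one row of a bank-transactions table.
transaction = {"account": "ACC-1", "amount": 250.0, "type": "debit"}
print(transaction["amount"])  # a field can be read directly by name

# Unstructured: free text. Fields must be extracted before any analysis.
# Hypothetical log format: "<date> <time> <level> <message>".
log_line = "2016-12-17 10:15:02 ERROR payment failed for user=42"
pattern = re.compile(r"^(\S+) (\S+) (\w+) (.*)$")

match = pattern.match(log_line)
day, time_of_day, level, message = match.groups()
print(level, "-", message)
```

The structured record can be queried as it is, while the log line only becomes usable once a parsing step has imposed some structure on it.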

Real-world application of Big Data -
An e-commerce company wants to reward its top 10 customers, those who have spent the most in the last year, with a $200 gift voucher. Moreover, the company wants to find the buying trends of these customers so that it can suggest more related items to them.
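On a small data set, the "top customers by spend" part of this use case can be sketched as below. The order tuples, customer names, and the function name are hypothetical, and the sketch assumes all orders fit in the memory of one machine, which is exactly what stops being true at Big Data scale.

```python
import heapq
from collections import defaultdict
from datetime import date, timedelta


def top_customers(orders, today, n=10, window_days=365):
    """Return the n customers with the highest total spend in the last window_days.

    orders is an iterable of (customer_id, order_date, amount) tuples (assumed layout).
    """
    cutoff = today - timedelta(days=window_days)
    totals = defaultdict(float)
    for customer_id, order_date, amount in orders:
        # Only count orders placed inside the time window.
        if cutoff < order_date <= today:
            totals[customer_id] += amount
    # nlargest avoids sorting every customer when only the top n are needed.
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


# Made-up sample data for illustration.
orders = [
    ("alice", date(2016, 11, 2), 120.0),
    ("bob", date(2016, 6, 15), 80.0),
    ("alice", date(2016, 3, 9), 45.5),
    ("carol", date(2015, 1, 20), 999.0),  # older than one year, so ignored
    ("bob", date(2016, 12, 1), 30.0),
]

print(top_customers(orders, today=date(2016, 12, 17), n=2))
# prints [('alice', 165.5), ('bob', 110.0)]
```

The same aggregate-then-rank logic still applies to billions of orders, but the totals then have to be computed across many machines instead of in one loop.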

Problem -
A huge amount of data needs to be stored, processed, and analyzed.



