
Big Data Case Study
How big MNCs like Google, Facebook and Instagram store, manage and manipulate thousands of terabytes of data with high speed and high efficiency.
Data collection is a need of almost every organization, because data is what runs their business and makes their products or services more efficient.
Big Data is a problem
Big data is not a technology but a problem of the data world: the physical storage components and the velocity at which such huge data can be processed are limited at present. Approximately 25,000,000 terabytes of data are created every single day, and by some estimates about 1.7 megabytes of data are created every second per person.
Data keeps on increasing day by day, and storing such a large amount of data securely is a major concern. If a company stores all its data on a single hard disk, another problem occurs, known as the Input/Output problem (input to read data and output to save data), because Input/Output processing on a hard disk is very slow. So Big Data is a big, challenging problem with two major factors: Volume, the large size of the data, and Velocity, the speed at which that large amount of data must be stored and processed in a secure way.
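The single-disk I/O bottleneck described above can be seen with some back-of-the-envelope arithmetic. The numbers below are illustrative assumptions (a commodity disk reading at roughly 100 MB/s, a hypothetical 10 TB dataset), not measurements:

```python
# Back-of-the-envelope sketch of the single-disk I/O bottleneck.
# Assumed figures: ~100 MB/s sequential read speed, a 10 TB dataset.

DISK_THROUGHPUT_MB_S = 100            # assumed disk read speed
DATA_SIZE_TB = 10                     # hypothetical dataset size
data_size_mb = DATA_SIZE_TB * 1024 * 1024

# Time to read everything from one disk:
seconds_one_disk = data_size_mb / DISK_THROUGHPUT_MB_S
print(f"One disk:  {seconds_one_disk / 3600:.1f} hours")    # ~29.1 hours

# Split the same data across 100 disks read in parallel:
NUM_DISKS = 100
seconds_parallel = seconds_one_disk / NUM_DISKS
print(f"100 disks: {seconds_parallel / 60:.1f} minutes")    # ~17.5 minutes
```

Under these assumed numbers, reading the whole dataset from one disk takes over a day, while spreading the same read across 100 disks brings it down to minutes, which is exactly the motivation for distributing the data.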
Beyond the large size of the data and the speed needed to manage it, cost and many other factors are big problems as well.
Distributed Storage is a solution

To solve this problem, many software tools are available in the market, such as Apache Hadoop.
Whenever big data of any kind comes and hits the system, we can split the data into many small data sets and then pass those data sets to independent working servers/resources.
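The splitting idea can be sketched in a few lines. This is a minimal illustration, not real Hadoop code: the block size and server names below are made-up assumptions (for comparison, HDFS splits files into 128 MB blocks by default):

```python
# Minimal sketch: cut incoming data into fixed-size blocks and
# distribute the blocks round-robin across worker servers.

def split_into_blocks(data: bytes, block_size: int):
    """Cut the raw data into fixed-size blocks (last block may be short)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def distribute(blocks, servers):
    """Assign each block to a server round-robin; return a placement map."""
    placement = {server: [] for server in servers}
    for i, block in enumerate(blocks):
        placement[servers[i % len(servers)]].append(block)
    return placement

data = b"x" * 1000                        # stand-in for a large incoming file
blocks = split_into_blocks(data, 256)     # 4 blocks: 256 + 256 + 256 + 232
placement = distribute(blocks, ["server-1", "server-2", "server-3"])
for server, assigned in placement.items():
    print(server, [len(b) for b in assigned])
```

Each server now holds only its own share of the blocks, so reads and writes happen on many machines at once instead of hammering a single disk.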

In this kind of setup we have a master node and slave nodes providing their resources to the master. There can be many slaves under a single master to manage a big environment (in Hadoop these roles are called the NameNode and DataNodes). The data is split and goes to the master node, which then passes the blocks on to the slave nodes.
With such a setup we can scale the resources up according to our need, and since the data is split and processed by different machines in parallel, we can perform operations over the data much faster.
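The split-and-process idea above can be sketched as a tiny map-and-combine pattern: each "slave" works on its own slice of the data independently, and the partial results are combined at the end. The sketch below uses Python threads purely to illustrate the pattern on one machine; in a real cluster each task runs on a separate node:

```python
# Minimal sketch of parallel processing over split data: each worker
# counts words in its own slice, and the partial counts are summed.

from concurrent.futures import ThreadPoolExecutor

def process_slice(slice_of_data: str) -> int:
    """Work done independently by one node: count words in its slice."""
    return len(slice_of_data.split())

text = "big data is split and processed by many machines in parallel " * 100
n_workers = 4

# Split the input into one slice per worker.
words = text.split()
slices = [" ".join(words[i::n_workers]) for i in range(n_workers)]

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    partial_counts = list(pool.map(process_slice, slices))

total = sum(partial_counts)   # combine step: merge the partial results
print(total)
```

Scaling up is then just a matter of raising the number of workers (machines), since each slice is processed independently of the others.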