The Big Data
Tech Talk #36
The Data
Every "online" action produces data, and this mass of data keeps growing.
In 2018, roughly 50,000 gigabytes of data were created every second.
In 2019, it was predicted that global data volume would grow 45-fold between 2020 and 2035.
In 2020, global storage capacity (installed base) reached 6.7 zettabytes, and it is forecast to grow on average by almost 20% per year over the period 2020-2025.
In which sector is BIG DATA useful?
Big Data also refers to the technologies used to manage large volumes of data, because neither the human brain nor classical analysis tools are sufficient to analyze it all.
Big Data is useful in many areas, particularly in the following sectors:
E-commerce, with recommendation systems built from the analysis of millions of customer records.
Healthcare, where health data is used to better prevent and treat diseases.
Insurance, where big data makes it possible to better understand policyholders' behavior and offer them a personalized product.
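To make the e-commerce case concrete, here is a minimal sketch of a recommendation system based on purchase co-occurrence. The customer names and items are hypothetical illustration data; real systems work on millions of records with far more sophisticated models.

```python
from collections import Counter

# Toy purchase history (hypothetical data, for illustration only).
purchases = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse", "headset"},
    "carol": {"laptop", "monitor"},
}

def recommend(user, purchases, top_n=2):
    """Suggest items bought by customers whose carts overlap with this user's."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)  # shared items measure similarity
        if overlap:
            for item in items - owned:  # only suggest items the user lacks
                scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("carol", purchases))  # both other customers bought a mouse
```

The same idea, scaled to millions of customers, is what powers "customers who bought this also bought" features.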
What language for BIG DATA?
These analyses rely on specific tools and on data scientists who build mathematical models and algorithms to give meaning to the data.
The choice of language is therefore very wide: functions performing complex tasks are already available in the most common languages.
We find:
Python, which has a very strong community
R, which is also largely intended for statistics
Matlab, used for numerical computation
Go, originally developed by Google
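As an illustration of "functions performing complex tasks are already available", here is a short Python sketch that computes summary statistics and flags outliers using only the standard library. The transaction amounts are hypothetical sample data.

```python
import statistics

# Hypothetical daily transaction amounts (illustrative data only).
amounts = [120.0, 95.5, 130.25, 87.0, 410.0, 99.9, 101.1]

mean = statistics.mean(amounts)
median = statistics.median(amounts)
stdev = statistics.stdev(amounts)

# A simple rule: flag values more than 2 standard deviations from the mean.
outliers = [x for x in amounts if abs(x - mean) > 2 * stdev]

print(f"mean={mean:.2f} median={median:.2f} outliers={outliers}")
```

R, Matlab, and Go offer equivalent building blocks; the choice often comes down to ecosystem and team familiarity.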
Usefulness of BIG DATA
To analyze this big data, we can use a data lake, a technology that can store any type of data so that it can be processed and analyzed very quickly, even almost instantaneously depending on the data.
Since the structure of the data does not matter in a data lake, any type of data can be deposited there. The data then passes through a pipeline that makes it possible to process it
and carry out any kind of analysis:
Study past events
Anticipate future events
Above all, predict the most appropriate actions for a given situation.
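The three kinds of analysis above can be sketched in a few lines. This is a minimal, hypothetical pipeline: heterogeneous raw records (as they might land in a data lake) are normalized by a parsing step, then used for descriptive, predictive, and prescriptive steps. The record formats and the naive trend forecast are illustrative assumptions, not a real implementation.

```python
import json
from statistics import mean

# Heterogeneous raw records, as they might land in a data lake
# (hypothetical data; structure is only imposed at processing time).
raw = [
    '{"day": 1, "sales": 100}',   # JSON record
    "day=2;sales=140",            # key-value text record
    '{"day": 3, "sales": 180}',
]

def parse(record):
    """Normalize each raw record into a dict (the pipeline step)."""
    if record.startswith("{"):
        return json.loads(record)
    pairs = dict(p.split("=") for p in record.split(";"))
    return {"day": int(pairs["day"]), "sales": int(pairs["sales"])}

events = [parse(r) for r in raw]

# Descriptive: study past events.
avg = mean(e["sales"] for e in events)

# Predictive: naive linear extrapolation for the next day.
trend = events[-1]["sales"] - events[-2]["sales"]
forecast = events[-1]["sales"] + trend

# Prescriptive: recommend an action based on the forecast.
action = "increase stock" if forecast > avg else "hold stock"

print(avg, forecast, action)
```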
Final note
Big Data is characterized by its Volume and its Variety, but also by the Velocity at which it can be processed.
It allows you to perform near real-time analyses, predict events, and even prescribe solutions.
The algorithms put in place, the automated processing, and the capacity of machines to remember, learn, and draw conclusions from that processing open the way to Machine Learning, which is another technological branch.
For a data warehouse offering performance up to 10 times better than traditional warehouses, consider Redshift, an AWS service based on the OLAP (Online Analytical Processing) model, which makes it possible to run complex multi-dimensional queries. Compared with traditional warehouses, it is:
3 times faster.
75% less expensive for the same amount of data.
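To show what a multi-dimensional OLAP-style query looks like, here is a sketch that aggregates sales across two dimensions (region and product). Redshift is queried with SQL in the same spirit; Python's built-in sqlite3 module is used here only so the example is self-contained and runnable, and the regions and figures are hypothetical.

```python
import sqlite3

# In-memory table standing in for a warehouse fact table
# (illustrative data only; Redshift would hold billions of rows).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Dakar", "laptop", 900.0), ("Dakar", "phone", 300.0),
     ("Thies", "laptop", 450.0), ("Thies", "phone", 600.0)],
)

# Multi-dimensional query: total sales broken down by region and product.
rows = con.execute(
    "SELECT region, product, SUM(amount) FROM sales "
    "GROUP BY region, product ORDER BY region, product"
).fetchall()

for region, product, total in rows:
    print(region, product, total)
```

An OLAP warehouse is optimized so that this kind of grouped aggregation stays fast even over very large datasets.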
Do not hesitate to contact us, SYLVERSYS Senegal, for more advice.
We are a solutions integration company specializing in providing secure IT and network products and solutions for businesses and organizations in the private and public sectors.