This article was updated in November 2020 to reflect recent information regarding the global pandemic.
Since the beginning of this century, humanity has created almost twenty zettabytes of data. To store it, we would need 625 billion 32GB iPads. By the end of 2020, we will reach 40 zettabytes, which is twice as much as humanity has produced until now. With the growth of the wearables market, including fitness trackers and medical devices that create around 50 thousand gigabytes of data every second, a new golden era has begun.
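The iPad comparison is simple arithmetic and can be checked in a few lines. This is a minimal sketch that assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes); the function name `devices_needed` is made up for illustration.

```python
# Assumption: decimal (SI) units. The article does not say which convention it uses.
ZETTABYTE = 10**21
GIGABYTE = 10**9


def devices_needed(total_bytes: int, device_capacity_bytes: int) -> int:
    """Return how many devices of the given capacity are needed to hold total_bytes."""
    # Ceiling division with integers, so a partly filled device still counts.
    return -(-total_bytes // device_capacity_bytes)


ipads = devices_needed(20 * ZETTABYTE, 32 * GIGABYTE)
print(f"{ipads:,} iPads")  # 625,000,000,000 iPads
```

Under those assumptions the result matches the 625 billion figure quoted above.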
What Does Data Mean?
Data is the number of people living in Australia, the music we stream, our current body temperature, or notes from physics class. Data surrounds us, traveling through the air, the earth, and everything around the globe. The everyday flow is enormous, which has left us with no choice but to create new means of transporting it. That is how communication standards appeared, and also why new versions of Bluetooth, Wi-Fi, and next-generation mobile communication are in high demand. We store, analyze, and observe data, and sometimes we can even perceive it directly. Can we divide data by kind? We can: data is quantitative when we count the number of cars, and qualitative when we describe predictions about Earth’s resources.
Is collecting data useful? The fact that data is everywhere is obvious to everyone. However, stringing data together into something meaningful is not. Often the real value can be understood only in a particular form and under the right circumstances. That is why there are so many job offers for data analysts at the moment.
The Power That Data Possesses
It is great that people are so concerned about data. Yet it’s not the data itself that is our main interest, but the information created by analysis. Where does the success of Google lie? In the storage of millions of links? Of course not. It’s a matter of valuing content and finding relations between websites. In reality, data reveals its strength through the conclusions we can draw from it and share. Without this knowledge, it is nothing more than random numbers or characters. As Dr. Michael Wu, Principal Scientist of Analytics at Lithium, says:
The value of any data is easy to define by the information and insights we can extract from it.
The usefulness of data is hard to measure. The data may have been collected without the necessary context (is that ECG data coming from a healthy or an unhealthy person?), or we may have a hard time finding a proper value function (like the famous PageRank algorithm for evaluating a website’s strength). This is why usefulness depends on the institution that possesses the data and knows what to do with it.
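To make the idea of a value function more concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is a simplified illustration, not Google's production algorithm; the toy link graph, the damping factor of 0.85, and the iteration count are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure.

    links maps each page to the list of pages it links to.
    Every linked page must also appear as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform score

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its damped score evenly among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked from three other pages.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

The raw data here is only a list of links; the scores are the information extracted from it, which is the distinction this section draws.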
A short tour of the medical market helps explain this problem. The number of sensors in medical equipment is growing, and the importance of data in healthcare is unquestionable. Analytics helps achieve the goals of diagnosing, treating, helping, and healing patients. The end goal in this domain is improving Health Care Output (HCO), or the quality of care provided to hospitalized people. Monitoring equipment provides an additional chance to recognize alarming symptoms sooner than the body itself might signal them through feeling unwell or a high temperature.
Nonetheless, data analysis is also a great challenge for companies, even those with skilled teams. Those responsible for seeking information should be able to find the right algorithms to get value out of the database. The power of the information obtained from the analysis is quite often determined by the person trying to extract it. Lack of knowledge or experience can be a roadblock for data analysts trying to learn something new from a database.
However, before thinking about the value of data, we should once again consider security threats. Credit cards are a good example: they have accustomed us to easy payments, but they have also left us less secure. The way data is stored is as important as the data itself, so security and countermeasures are a critical part of setting up a new server. After all, we would not want to share our bank account password with anyone. It is the same with health. Medical records are among the most sensitive, valuable, and treasured documents there are.
The power of data relations
When all the needed data is stored, it’s time to get everything related. Many say that seeing is believing, so analyzing a simple example should be the shortest way to understanding.
There were 681,990 births in Germany in 2014. The number of births alone gives the impression that many new people are being born. Only after adding another piece of data, the size of the population, can we calculate a birth rate and thereby create new data. As there were 80,996,685 people in Germany, the birth rate was 8.42 births per 1,000 people. This seems to be great news for the German government, as new people are appearing all around the country. Nearly 17 babies for every 2,000 people seems like a good figure, especially considering that these 2,000 people include babies, children, and old people. But there is more data, and it changes everything. There were 914,450 deaths in Germany in 2014, or 11.29 deaths per 1,000 people. With this number, the whole picture appears. The number of people is not growing, as more people died than were born, and the gap between newborns and deaths exceeds 200,000 people. This shows how connecting data works, and how Germany and other countries are threatened by a decreasing number of citizens. That example was just for starters; there is a much more exciting subject: sports and fitness. Take a look at this map collection:
This shows a decrease in the number of people who don’t exercise around the country. The change between 2011 and 2014 is striking, and right now even more people are working on their sports habits than the last chart shows. However, that is only part of the truth, as there is some more data to share:
This is the obesity report. Those numbers are very disturbing. How is it possible that so many people out there are obese? And what about those who are overweight, who are not included here? Shouldn’t the trend be shifting the other way? The number of people who train is growing, and it is common knowledge that exercise burns fat, so a society of active people should be fit and healthy. Yet the latest research shows that the two are not connected at all. That is why people who want to lose weight should probably start eating less instead of signing up for a gym (exercise is good for many other things, though).
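Returning to the German figures from earlier in this section, the way each added number changes the picture can be reproduced in a few lines. This is a minimal sketch that uses only the numbers quoted in the text; the function and variable names are made up for illustration.

```python
def rate_per_thousand(events: int, population: int) -> float:
    """Return the number of events per 1,000 people."""
    return events / population * 1000


# Figures for Germany in 2014, as quoted in the text.
POPULATION = 80_996_685
BIRTHS = 681_990
DEATHS = 914_450

birth_rate = rate_per_thousand(BIRTHS, POPULATION)
death_rate = rate_per_thousand(DEATHS, POPULATION)
natural_change = BIRTHS - DEATHS  # negative means more deaths than births

print(f"Birth rate: {birth_rate:.2f} per 1,000")    # Birth rate: 8.42 per 1,000
print(f"Death rate: {death_rate:.2f} per 1,000")    # Death rate: 11.29 per 1,000
print(f"Births minus deaths: {natural_change:,}")   # Births minus deaths: -232,460
```

Each value on its own says little; only once births, deaths, and population are related to one another does the shrinking trend become visible.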
Data relations in healthcare
Data analysis seems to be essential to our civilization, so it has reached healthcare, one of the biggest and most complicated fields of research. Human bodies are still unpredictable, as each of us is a little different from everyone else. This is why data collected from many people is needed to find patterns that are impossible to notice by analyzing single cases.
Nowadays, data scientists are trying to do a similar thing in healthcare. With proper connections between different vital signals and a sufficient amount of data, we will discover more and more information about our health. Unfortunately, this is still just a theory supported by small portions of data and estimations, as there is no clear evidence yet. Haven’t we all heard stories about elders who lived past one hundred despite poor diets and a lack of activity? The data might also be difficult to analyze due to differences in geographical regions, lifestyle, culture, and other factors. Different climates and food, skin color, and the sicknesses we have been through differentiate us from one another. But with proper data analysis, and with databases that collect those events and put them together like pieces of a puzzle, we can find a way not only to heal people but even to save them from dying.
To fulfill those plans, we have to choose the data that will give us the most information and then start adding other records to it. The ECG signal is a very good one to store first, because it holds a lot of useful knowledge about a person and can be used to draw conclusions later.
That’s why storing more data might be very useful here. Sometimes we might be missing an obvious cause. If we learned, for example, that morning jogging helps prevent heart disease, that knowledge could save millions. Such a discovery becomes possible when we look at data not from a single person, but from a whole population.
This raises the question of how medical data can be stored. The problem is solved in several very different ways. One could use traditional meters, but they do not support big data collection, so a device with a connection to the Internet is needed. That need is largely met by the Bluetooth accessories that can extend our smartphones. One of those is a device we are working on, the Aidlab, which stores users’ data for further analysis, looking for connections within the data and creating groups of similar health statuses all around the world.
Big data
Data analysis is closely connected to the term big data, but do we know the real meaning of this overused term? Simple data storage is not enough for the analysis of information collected around the entire globe.
What do we call big data? The term is vague, and its definition is not universally agreed upon. According to a rough description and estimations, it would be any kind of data that is around a petabyte or more in size. In Health Informatics research, though, big data of this size is quite rare. Mostly it is just a large amount of data collected from various sources all around the globe, giving us vital information on some particular problem. Think of a database of patients and their records, for example, ECG recordings. The data from one hospital is just data. But when you store together, or somehow connect, data from several hospitals in several cities in different countries, it becomes vital big data.
So bringing our results to a global database is an excellent contribution to humanity, as long as this data is used for a good purpose. The slogan ‘save millions’ appears in many advertising campaigns that try to convince us to use big data storage. But what do they mean by that? How could one save millions, literally?
How One Could Save Millions
An earthquake prediction model has been created thanks to data taken from seismic records. You, as a person with a smartphone, could act as a very simple seismic station:
“Theoretically, any device connected to the Internet with an internal MEMS accelerometer, such as a computer or mobile phone, can become a strong-motion seismic station,” stated Antonino D’Alessandro, co-author of the study published in the Bulletin of the Seismological Society of America.
By connecting these seismographic readings, we can build a seismographic model. We do things like this with predictive learning models. With these, even the smallest signals can be analyzed and linked to an incoming threat. Suitable algorithms handle the machine learning part for us, and after a short period we receive the data produced by the analysis. Later on, AI analyzes the outcome of the algorithm and informs us about the earthquake. Any increase in prediction accuracy will save lives by making evacuation and the allocation of resources more effective. It takes just one person with a smartphone (all of them have an accelerometer) to take part in data collection and support the rise of a system that saves lives on Earth. If a seismic threat is detected by an individual’s device, the data can be sent for analysis, and the application could simply inform everyone in the area about the incoming disaster.
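The article does not say how a phone would decide that a signal is worth sending for analysis. One classic trigger from seismology is the STA/LTA ratio, which compares a short-term average of the signal amplitude with a long-term average. The sketch below is an assumption about how such a trigger could look, not the method used in the study quoted above; the window sizes, the threshold, and the synthetic accelerometer samples are hypothetical.

```python
def sta_lta_trigger(samples, short_window=5, long_window=50, threshold=4.0):
    """Return the index of the first sample at which the short-term average
    of |acceleration| reaches threshold times the long-term average,
    or None if that never happens."""
    magnitudes = [abs(sample) for sample in samples]
    for end in range(long_window, len(magnitudes) + 1):
        # Both windows end at the same, most recent sample.
        lta = sum(magnitudes[end - long_window:end]) / long_window
        sta = sum(magnitudes[end - short_window:end]) / short_window
        if lta > 0 and sta / lta >= threshold:
            return end - 1
    return None


# Synthetic data (hypothetical): 100 quiet samples followed by a strong burst.
quiet = [0.01 * (-1) ** k for k in range(100)]
burst = [0.5 * (-1) ** k for k in range(20)]

print(sta_lta_trigger(quiet + burst))  # 100
print(sta_lta_trigger(quiet))          # None
```

A real application would read samples from the device's accelerometer and would need tuning to ignore everyday movement such as walking; only the triggered segments would then be sent on for the kind of analysis described above.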