Big data architecture, use cases and benefits in IoT

1. Introduction

In recent years, the "Internet of Things" (IoT) and "big data" have been two of the most popular topics in technology. In the IoT vision, any device that can be switched on and off can be connected to the network, and through it to every other connected device. Many items fall into this category: mobile phones, coffee machines, washing machines, headphones, desk lamps, and wearable devices (Figure 1). It also applies to machine components, such as the jet engines of airplanes or the drill bits of oil rigs. Whether we realize it or not, our lives are surrounded by these connected things, which rely on big data and which also make life better.

Figure 1 Application of IoT in connected devices (Source: the IPSO Alliance)

The Internet of Things (IoT) is the biggest trend in the big data market. Over the next decade, an estimated 25 billion devices will be connected to the network, more than the number of personal computers, mobile phones, and tablets combined; some estimates put the figure much higher, at over 100 billion. The IoT is a huge network that connects "things", encompassing relationships between people, between people and things, and between things and things. One of the decisive factors for the IoT is therefore data: its volume, and how it is managed and used, as shown in Figure 2.

Figure 2 Number of IoT connection data

2. Big Data

Big data refers to large volumes of unstructured, unorganized, and ever-growing data that require dedicated technology to collect, store, manage, and analyze. It is a complex, multifaceted phenomenon that affects people, processes, and technology. From a technical perspective, big data integrates the organization, management, analysis, and presentation of data, characteristics commonly summarized as the "seven V's".

Figure 3 The seven V's of big data evolving into the value of data

1. Data volume

Volume refers to the sheer amount of data collected from many sources: text, audio, video, social networks, surveys, medical records, spatial imagery, crime reports, weather forecasts, natural disasters, and so on. When dealing with big data problems, volume is a key factor.

2. Data input and output speed

Velocity concerns both the rate at which data arrives and the rate at which it can be processed, for example when data is time-critical and must be processed immediately and stored quickly.

3. Data type, diversity

Variety refers to the differing sources and formats of data, which often cannot be stored in a structured relational database system. Variety directly affects data integrity: the more heterogeneous the data, the more likely errors become.
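As a minimal illustration of the variety problem, the sketch below normalizes readings that arrive as JSON from one device and CSV from another into a single schema. All device names, field names, and formats here are hypothetical, chosen only for the example.

```python
import csv
import io
import json

# Readings for the same quantity arrive in two different formats from
# two different devices and must share one schema before analysis.
json_payload = '{"device": "thermo-1", "temp_c": 21.5}'
csv_payload = "device,temp_c\nthermo-2,19.0\n"

def normalize(payload, fmt):
    """Return a list of {device, temp_c} dicts regardless of input format."""
    if fmt == "json":
        rec = json.loads(payload)
        return [{"device": rec["device"], "temp_c": float(rec["temp_c"])}]
    if fmt == "csv":
        rows = csv.DictReader(io.StringIO(payload))
        return [{"device": r["device"], "temp_c": float(r["temp_c"])} for r in rows]
    raise ValueError(f"unsupported format: {fmt}")

records = normalize(json_payload, "json") + normalize(csv_payload, "csv")
print(records)  # two uniform records from two different source formats
```

Once records share a schema, the downstream storage and analysis layers no longer need to know which device produced them.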

4. Authenticity

Because big data is largely unstructured, its processing must account for accuracy, that is, the veracity of the data. Veracity is often the factor of greatest concern, since it determines how much trust can be placed in big data processing, the related analyses, and their results.

5. Validity

Data validity may sound similar to data veracity, but the concepts differ: validity refers to whether the data are correct and accurate for their intended use.

6. Visibility

Visibility refers to data being able to be seen and acted upon. Data from different sources must be merged and made visible through the technology layers of the big data stack.

7. Value

This is the most important V: "value" is the useful outcome sought once the big data has been processed. In practice, the value extracted from the data must exceed the cost of obtaining it.

Big data technology refers to the new technologies and architectures used to rapidly acquire, discover, and analyze data in order to extract value from very large volumes of highly varied data. The stack includes:

  • Infrastructure, such as storage systems, servers, and data center network infrastructure
  • Data organization and management software
  • Analysis and search software
  • Decision support and automation software
  • Services, including business consulting, business process outsourcing, IT outsourcing, IT project-based services, and IT assistance and training on how to use big data.

Figure 4 The role of big data in the Internet of Things

Without proper data collection in place, enterprises cannot sort through the information flowing from embedded sensors (Figure 4); without big data, the Internet of Things offers enterprises little. The key to unlocking advanced IoT use cases is to make data analytics commonplace, to move from imagination to actual implementation, and to make the underlying data infrastructure affordable and easy to maintain.

3. Big Data Architecture

New big data architectures make up for the deficiencies of traditional systems, but they also increase overall complexity. They enable companies to distribute data storage and analysis, measure effectiveness, and identify patterns, trends, and more. Companies can go beyond historical analysis and backward-looking reports to forward-looking, predictive business insights that actively support future decisions. Most production systems must handle differing requirements and methods. This is especially true for IoT, M2M, and sensor data: real-time processing and analysis are necessary, traditional systems cannot provide them, and so in-memory and streaming databases become indispensable.

The technical architecture of big data can be divided into six focus areas, each with its own specialized technologies: data storage, data access, data integration, analytical processing, visualization, and data management.

Figure 5 Simple big data architecture

Cloud Computing

The real innovation of IoT comes from its combination with cloud computing. When connected devices interact with each other, they generate a large amount of data. That data can easily be captured and stored, but it must be transformed into valuable knowledge and actionable intelligence, and this is where the real power of the cloud lies. Formally, cloud computing is a model for convenient, on-demand network access to a shared pool of configurable computing resources (such as networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service-provider interaction. There are three cloud service models (Figure 6):

(1) Cloud Software as a Service (SaaS)

Most SaaS applications are designed to cover the needs of enterprise users in all situations.

(2) Cloud Platform Services (PaaS)

PaaS provides developer tools and libraries to build, test, deploy, and run applications on cloud infrastructure. PaaS reduces the management workload, for example by eliminating the need to configure and scale Hadoop components, and serves as a development platform for advanced analytical applications.

(3) Cloud Infrastructure Services (IaaS)

IaaS provides on-demand access to shared, usually virtualized, server resources to handle the computing and storage needs of big data analytics.

Figure 6 Service model

Three main cloud architecture models have evolved over time: private, public, and hybrid clouds (Figure 7). All of them share the idea of commoditizing resources, and often virtualize the compute and storage layers for this purpose.

(4) Private Cloud

A private cloud is dedicated to a single organization and does not share physical resources; the resources can be provided in-house or by an external party. A typical motivation for a private cloud is security: strictly separating the company's data storage and processing from accidental or malicious access via shared resources.

(5) Public Cloud

Public clouds share physical resources for data transmission, storage, and processing among customers. Each customer nonetheless gets a private, isolated computing environment and independent storage. The security concerns that push some customers toward private clouds or custom configurations are not relevant for the vast majority of customers and projects.

Figure 7 The difference between private and public cloud computing

(6) Hybrid Cloud

Hybrid cloud architectures combine private and public cloud configurations, often to enforce security and resilience or to pair cheaper baseload capacity with burst capacity.

The cloud computing model increases IT agility and can save substantial cost. In addition, cloud computing democratizes big data: any enterprise can now work with unstructured data at huge scale. The rise of cloud computing and cloud data storage has been a precursor to and promoter of big data. Cloud computing commoditizes computing time and data storage through standardized technologies and has significant advantages over traditional physical deployments. However, cloud platforms come in several forms and sometimes must be integrated with traditional architectures. Cloud providers use virtualization to run many standardized virtual servers on the same physical machine, and the resulting economies of scale allow low prices and billing in short time intervals. This standardization makes computing capacity elastic and highly available.

Horizontal scaling is achieved by adding additional instances and serving each of them a portion of the demand. Software like Hadoop is specifically designed for distributed systems and takes advantage of horizontal scaling by processing small independent tasks at massive parallel scale. Distributed systems can also serve as data storage, for example NoSQL databases or file systems like Hadoop's HDFS. Storm can provide coordinated, near-real-time stream processing across clusters of machines with complex workflows. The data processing in typical cloud big data projects centers on scaling out with, or adopting, Hadoop. Tools like Hive and Pig, which sit on top of Hadoop, make it feasible to process huge datasets with ease.
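The way Hadoop-style tools split work into small independent tasks can be sketched with a minimal in-memory MapReduce over toy IoT sensor logs. This is plain Python with no Hadoop involved; the function names and the log format are our own illustrative choices.

```python
from collections import defaultdict

# Minimal in-memory illustration of the MapReduce pattern that Hadoop
# distributes across a cluster: map emits key/value pairs, the framework
# groups them by key (shuffle), and reduce aggregates each group.

def map_reading(line):
    """Map phase: emit (sensor_id, temperature) from one raw log line."""
    sensor_id, temp = line.split(",")
    return sensor_id, float(temp)

def reduce_readings(sensor_id, temps):
    """Reduce phase: aggregate all values for one key (here: the average)."""
    return sensor_id, sum(temps) / len(temps)

def run_mapreduce(lines):
    grouped = defaultdict(list)              # stands in for the shuffle/sort stage
    for key, value in map(map_reading, lines):
        grouped[key].append(value)
    return dict(reduce_readings(k, v) for k, v in grouped.items())

log = ["s1,20.0", "s2,30.0", "s1,22.0"]
print(run_mapreduce(log))  # {'s1': 21.0, 's2': 30.0}
```

Because each map call and each reduce group is independent, a framework like Hadoop can run them on many machines in parallel; this sketch only shows the data flow, not the distribution.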

Hadoop

Hadoop is a powerful open-source framework made up of many different technologies across the big data stack, and many organizations use it to collect, process, and analyze data from the Internet of Things (IoT). Collecting unstructured data is only half the battle for IoT; the other half is batch-processing and analyzing that data with Hadoop. The success or failure of IoT depends on big data, and as businesses step into the world of IoT, the symbiotic relationship between the two is widely leveraged for profitable business decisions. IoT is mostly about the data, not the devices. Big data and IoT act as strategic partners: collecting data from IoT devices is not enough; it must also be processed and analyzed in order to improve business operations. The nature of their data makes IoT devices well suited to big data analytics. The IoT infrastructure has reached maturity in two respects:

  • Ubiquitous - Sensors are now cheap and can be included in any system.
  • Scale-out centralized computing - Hadoop can be used to analyze, process, and store all the data generated by IoT with a cost-effective, scalable distributed computing system.

Hadoop provides an enterprise-class storage and processing layer that can hold files approaching a terabyte in size. Hadoop and the Internet of Things can correlate many different types of unstructured data, bringing another level of competitive advantage to enterprises. Figure 8 shows how data storage and the big data analysis engines interact.

Figure 8 Interactions between the three elements of the Internet of Things

4. Applications of IoT and Big Data with Hadoop

1. The popular MagicBand exclusive to Disney World

One of the best examples of how IoT leverages big data is the MagicBand, unique to Disney World. The MagicBand is a wristband that visitors wear for everything from checking into their room to purchasing food and passing through the theme park's turnstiles. Disney collects unstructured data about visitors' movements within the park and analyzes it to staff attractions appropriately, gauge ride effectiveness, adjust restaurant inventory during peak hours, and seat more guests in its restaurants.

2. Alex and Ani, a popular jewelry store

Alex and Ani, a popular jewelry retailer, uses beacon technology in its stores to track the number of visitors and to send specific discount coupons to customers as soon as they enter.

3. Beacon technology in McDonald’s food supply chain

McDonald's uses beacon technology to detect when customers are near one of its restaurants and to deliver coupons through its mobile app. Customers receive personalized messages and use the app to find the deals most relevant to them.

4. UPS (United Parcel Service)

UPS is the largest transportation company in the United States. It uses sensor data and big data analysis to improve efficiency, save money, and reduce environmental impact. UPS installs sensors on delivery vehicles to track fuel consumption and mileage as well as engine conditions and driving events such as stops and acceleration. These IoT sensors collect nearly 200 data points from each vehicle in the fleet, nearly 80,000 vehicles per day. As a result, UPS has reduced fuel consumption and harmful emissions and cut vehicle idle time.
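A toy sketch of this kind of fleet telematics aggregation is shown below; the record fields and the idle threshold are our illustrative assumptions, not UPS's actual schema or system.

```python
# Toy sketch of fleet telematics aggregation in the spirit of the UPS
# example: sum up idle time per vehicle and flag the worst offenders.
# Field names and the ten-minute threshold are illustrative assumptions.

readings = [
    {"vehicle": "truck-1", "engine_on": True, "speed_kmh": 0,  "seconds": 300},
    {"vehicle": "truck-1", "engine_on": True, "speed_kmh": 45, "seconds": 1200},
    {"vehicle": "truck-2", "engine_on": True, "speed_kmh": 0,  "seconds": 900},
]

def idle_seconds_by_vehicle(rows):
    """Sum the time each vehicle spent with the engine on but not moving."""
    totals = {}
    for r in rows:
        if r["engine_on"] and r["speed_kmh"] == 0:
            totals[r["vehicle"]] = totals.get(r["vehicle"], 0) + r["seconds"]
    return totals

# Flag vehicles idling for more than 10 minutes for follow-up.
flagged = {v: s for v, s in idle_seconds_by_vehicle(readings).items() if s > 600}
print(flagged)  # {'truck-2': 900}
```

In a real deployment this aggregation would run over millions of readings in a distributed engine rather than in a Python dict, but the logic is the same.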

5. Cases of using big data

Big data technology now offers a variety of capabilities. It has been used to create new products, predict behaviors and trends, and optimize sales campaigns. Big data is changing the way a variety of industries do business, providing tailored healthcare, and making our cities smarter and safer. The rest of this section will discuss some specific situations where big data is being used.

1. Using big data to predict crime locations

Predicting future crimes is already part of the present reality. One example is the Los Angeles Police Department (LAPD), which used big data to predict crime locations and thereby reduce crime across the metropolitan area, contributing to a 33% reduction in house burglary, 21% in violent crime, and 12% in property crime in the areas where the prediction software was used. The underlying idea comes from seismology: when an earthquake occurs, there is a high probability of aftershocks nearby. A mathematical model of this behavior, developed by assistant professor George Mohler, can be used to define and predict analogous patterns.

Crime data show similar patterns (see Figure 9), and they help the Los Angeles Police Department understand the nature of crime: when a crime occurs in one place, more crimes tend to occur nearby, much as aftershocks follow an earthquake. Plugging previous crimes into the equations yields predictions of where crime is likely to occur next, so the department can now analyze and identify crime patterns through algorithms. This systematic analysis has contributed to a continuing decrease in violent crime in Los Angeles.

Figure 9 Crime "aftershocks": like aftershocks following an earthquake, nearby thefts follow an initial crime in rapid succession (Los Angeles data, 2004-05)
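The aftershock analogy corresponds to a self-exciting (Hawkes-type) point process, in which each past event temporarily raises the rate of new events nearby. Below is a minimal one-dimensional sketch of such an intensity function; the parameter values are illustrative and are not taken from the LAPD deployment.

```python
import math

def intensity(t, past_events, mu=0.5, k=0.3, omega=0.1):
    """Conditional intensity of a self-exciting (Hawkes-type) process:
    a constant background rate mu plus an exponentially decaying boost
    k * exp(-omega * (t - t_i)) from every past event t_i < t."""
    return mu + sum(k * math.exp(-omega * (t - ti)) for ti in past_events if ti < t)

events = [10.0, 12.0, 13.0]  # times of past burglaries (arbitrary units)
print(round(intensity(14.0, events), 3))  # elevated rate just after a cluster
print(round(intensity(60.0, events), 3))  # decays back toward the background rate
```

Ranking locations (or, here, time windows) by this intensity is the essence of aftershock-style crime forecasting: recent clusters of events push the predicted rate up, and the effect fades as time passes.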

2. Big data as a source of innovation in healthcare

The release of big data is likely to inspire many companies to develop healthcare applications, or similar innovations. Here are some examples of healthcare innovations created by the big data revolution:

(1) MHealthCoach supports patients in chronic care, providing education and treatment through an interactive system. The application uses data on healthcare costs and programs sponsored by the Agency for Healthcare Research and Quality, as well as results and alerts from clinical trials. MHealthCoach can also be used by providers to identify high-risk patients and provide important messages and reminders to them (Figure 10).

Figure 10 MHealthCoach

Asthmapolis has created a GPS-enabled sensor that monitors inhaler use by people with asthma, together with an app called Propeller Health, available on the Google Play Store and App Store (see Figure 11). The Propeller sensor continuously tracks the patient's medication use, recording it over time and by location, and it supports symptom tracking for both rescue and controller medications. This information flows into a central database and is used to identify individual, group, and population-level trends; combined with CDC information on asthma triggers (for example, pollen counts in the Northeast and the impact of volcanic fog in Hawaii), it helps doctors develop personalized treatment plans and prevent flare-ups. The Propeller wireless sensor syncs with the patient's smartphone over built-in Bluetooth. The Propeller mobile app for iOS (iPhone and iPod Touch) and Android devices lets patients view their data and gives them personalized feedback and education on ways to improve their asthma or COPD control.

Figure 11 Propeller application
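The kind of location-based trend spotting described above can be sketched as a simple aggregation. The event records and field names below are illustrative, not Propeller's real schema.

```python
from collections import Counter

# Toy sketch of aggregating inhaler events by location to surface
# geographic hot spots; records and field names are illustrative.

events = [
    {"patient": "p1", "location": "downtown", "type": "rescue"},
    {"patient": "p2", "location": "downtown", "type": "rescue"},
    {"patient": "p1", "location": "suburb",   "type": "controller"},
    {"patient": "p3", "location": "downtown", "type": "rescue"},
]

# Rescue-inhaler use signals acute symptoms, so count only those events.
hot_spots = Counter(e["location"] for e in events if e["type"] == "rescue")
print(hot_spots.most_common(1))  # [('downtown', 3)]
```

A population-level system would join such counts with external trigger data (pollen counts, air quality) rather than stop at raw frequencies, but the grouping step is the same.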

(3) Ginger.io offers a mobile app through which patients (such as those with diabetes) agree, in partnership with their providers, to be tracked via their phones; the app records calls, text messages, location, and even movement information (Figure 12). Patients also respond to smartphone surveys. The Ginger.io app integrates public research and other health data from the National Institutes of Health to produce revealing insights: for example, a lack of exercise or other activity may indicate that a patient is feeling unwell, and irregular sleep patterns may indicate that an anxiety attack is imminent.

Figure 12 Ginger.io application

6. Benefits of Big Data for Businesses and Consumers

Big data creates value for businesses and customers alike, and the benefits can be felt across a wide range of sectors and in companies large and small. Among large companies, several drivers motivate investment in big data technologies: analyzing business and transaction data, gathering insights into customer behavior on the web, and using advanced analytics to discover where build-to-order (BTO) manufacturers can schedule machines, staff, and sales with minimal impact on existing production plans. Manufacturers use big data to improve warranty management and equipment monitoring, as well as to optimize the logistics of getting their products to market. Retailers leverage a wide variety of customer interactions, both online and offline, to provide more targeted recommendations and optimal pricing. Technology companies analyze millions of data points to build more reliable and accurate voice interfaces. Banks use big data technologies to improve fraud detection.
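The fraud-detection idea mentioned above can be reduced to a minimal anomaly check: transactions far outside a customer's usual spending pattern are flagged for review. The data, threshold, and function below are our illustrative assumptions, not any bank's actual system.

```python
import statistics

# Minimal sketch of anomaly flagging for fraud detection: compare a new
# transaction amount against the mean and standard deviation of past
# amounts, and flag it when the z-score exceeds a threshold.

history = [12.0, 9.5, 14.0, 11.0, 10.5, 13.0, 12.5, 9.0]  # past amounts

def is_anomalous(amount, past, z_threshold=3.0):
    mean = statistics.mean(past)
    stdev = statistics.stdev(past)
    return abs(amount - mean) / stdev > z_threshold

print(is_anomalous(11.5, history))   # in line with past spending
print(is_anomalous(950.0, history))  # far outside the usual range
```

Production systems use far richer features (merchant, location, timing) and learned models rather than a single z-score, but the principle of scoring deviation from an established baseline is the same.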

For consumers, big data creates products and services that affect their daily lives. It enables cybersecurity experts to protect credit card systems by leveraging large amounts of network and application data to identify anomalies and threats. It also allows the nearly 29% of Americans who are "unbanked" or "underbanked" to qualify for lines of credit based on a wider range of payment histories, such as rent, utility bills, mobile phone subscriptions, insurance, child care, and tuition.

When companies adopt big data as part of their business strategy, the first question is usually: what kind of value will big data deliver? Will it help the top line or the bottom line, or is there a non-financial driver? From a value point of view, big data analytics applications fall into one of three dimensions (see Figure 13).

The first and most obvious dimension is operational efficiency: data is used to make better decisions, optimize resource consumption, and improve the quality and performance of processes. This is what automated data processing has always provided, but with an enhanced feature set. The second dimension is customer experience: typical goals are improving customer loyalty, performing precise customer segmentation, and optimizing customer service; drawing on the vast data resources of the public Internet, big data is driving the next stage in the evolution of CRM. The third dimension is new business models: big data enables supplementary revenue streams from existing products and the creation of additional revenue from entirely new (data) products.

Figure 13 Big data use cases in the value dimension
