What is Predictive Analytics?
Introduction to Predictive Analytics
The objective of predictive analytics is to make predictions about future events, which can have immense benefits. For example, the manufacturer of an aircraft will like to predict an engine failure. On the other hand, a service like Netflix will like to predict when a subscriber will cancel their subscription. If you manage a hedge fund or large stock brokerage, you will be interested in learning about the different signals that take place in market data just before a bear market crash. Medical practitioners studying cardiovascular disease will be interested in the changes that occur in certain blood markers when a heart attack is imminent. Credit card or loan companies are interested in behavior patterns that signify a person will default on a loan. In all the scenarios, the problems seem different at first look, but a common thread runs through all of them—looking for certain patterns in the data that will be a precursor to the event they wish to analyze. For example, on Netflix, you might observe certain behavioral changes in watching patterns; maybe a decline in the frequency with which the app is opened below a certain threshold. Or users may still open the app at a normal rate, but now spend more time going through movie catalogs rather than selecting movies to watch.
Predictive analytics is the technique that enables you to find and uncover the factors that indicate the event illustrated in each case above is about to occur. Before doing any analysis, it is difficult to know what those factors are. In the case of Netflix, all we had were mere conjectures. This kind of guesswork is not helpful to contemporary organizations, so they need to use predictive analytics to get the answers. To determine what factors need to come together for the event we want to predict, an enormous quantity of data will be collected and analyzed.
Why Use Predictive Analytics?
The purpose of using predictive analytics is to ensure companies can adopt a proactive stance. When a company can predict a situation, then it can intervene before the failure or service cancellation occurs. In the case of the airplane engine, the plane can be grounded and sent for inspection. The patterns that might emerge when studying past engine failures might provide the information necessary to enact appropriate repairs. An engine failure won’t happen unexpectedly, before doing your analysis you do not know what systems within the engine tend to fail. This information may be learned using the tools of predictive analytics. With this information in hand, the company can enact repairs on those systems to significantly lower the probability of engine failure. Netflix’s example is very common today. Netflix will be very interested in determining if someone is likely to consider canceling their subscription. If they can get this information, then they can take action to try to keep the subscriber. They can start presenting the customer with renewal offers that propose a discount if they renew their subscription within a limited time period. This proactive action would help the company avoid losing large numbers of users.
The Process of Predictive Analytics
The predictive analytics process will move through many stages. Like any activity within a large organization, it will begin by defining the problem. The problem must be defined precisely so that models and computer systems have something specific to tackle, and all members of the organization understand the goals. The scope of the effort should be more limited to get more reliable results.
Once the problem has been identified, big data comes in. This is where the role of the data scientist becomes more important. Data scientists in the organization must determine the best data sources to tackle and solve the problem. Data has often already been collected, but, if necessary, there can be a data collection stage. In some cases, the need for certain data may be identified in the planning stage. While this will cause some delays in moving forward, if the data is required for a solution to the problem, this step will be essential. Once the data sets are selected or become available, the data science team can begin a preliminary analysis. This will include the framing of hypotheses and preliminary modeling. The data science team would analyze the data to determine which algorithms are most suitable to get the results they are seeking.
Next, the test data will be selected, and the type selected will depend on many factors. For example, you will have some known data in some cases. This could be the case for a subscription service like Netflix. Certainly, the company would have information that would include subscribers who have canceled their service. The company may even have survey data from a subset of users. Some companies may attempt to ask users some simple questions to get at the reasons for cancellation. A preliminary examination of the data may also reveal some inputs that correlate to the output at hand, which is canceling the service. However, that data may not be available, but it does not matter because machine learning can proceed either way. The next step is to train the system on subsets of data. This is the learning phase of machine learning. The system will be presented with data, and it will seek out patterns, correlations, and trends that can be used for future estimates.
At each step, the data science team will analyze the results, and the learning process can continue until the team is satisfied with the current state of performance. At this point, the models can be deployed in real-time. Netflix models can be used to read data from customers to uncover patterns that predict a subscriber is about to cancel. As mentioned earlier, the company will probably aim to predict cancellation behavior several weeks ahead of time to give them time to adjust for the consumer and keep them from going ahead with the cancellation.
Applications of Predictive Analytics
As we have shown, business enterprises and other large organizations can use predictive analytics in many ways. Let us familiarize ourselves with some general applications of predictive analytics.
Healthcare
The function of predictive analytics in healthcare are quite numerous. One common problem that occurs in hospitals is secondary infections. Patients might come in with a viral or other illness and then contract bacterial pneumonia while in the hospital. This problem is ripe for machine learning, provided that the data is available. Big data could be used in conjunction with machine learning to develop models that would estimate a patient’s risk for developing secondary infections. The hospital could take steps to reduce the risk, such as keeping patient exposure to others at a minimum during their hospital stay. Nursing and other healthcare staff could also implement extra protection measures.
Another application in the healthcare arena is determining who is at risk for different diseases before symptoms occur. For example, patients could be analyzed to determine their risk of developing diabetes or heart disease. Big data analytics could be used to find patterns in the data for breast and prostate cancer, identifying patients that are at high risk of developing these diseases using previously unknown hidden patterns so that intervention or additional screening could be applied.
Both big data and predictive analytics can be used to improve staffing in hospitals. For example, we could determine the best times or days to increase staffing using predictive analytics and determine where to allocate that staffing.
Law Enforcement And Rescue Missions
Predicting where and when future crimes will occur is where predictive analytics is married with big data to help law enforcement in a more efficient way. This can occur on several levels. For example, past data will indicate where and when most crimes occur. The variation in crime patterns and types of crime might change with time of year, or based on a myriad of other factors we may not be able to determine ahead of time. By predicting when and where various crimes are more likely to occur, the deployment of police in the field can be made more efficient. This can even be taken down to the level of specific crimes. For example, it might be known that on certain days of the week or months of the year, there are more auto thefts in one part of a city, while there are more rapes and assaults in another. This can help law enforcement not only allocate resources to various locations, but it can also help them determine what types of resources to move to what locations and when they should do it.
Predictive analytics can also be used in larger applications such as drug trafficking. Big data collected on previous movements of drug traffickers, including arrests, can be used to better allocate different types of resources. In some cases, more effort might be required at certain airports, while other resources may be applied to different border areas. The data can be updated in real-time to continually adjust the models, as drug traffickers adjust their strategies in response to changing enforcement
These same methods can be used for the deployment of fire and rescue, and even for deployment locations and levels of Coast Guard resources to get the most efficiency possible. In these cases, predictive analytics can be developed by looking at past data that could indicate where and when most auto accidents occur. Other information can be gleaned, such as the frequency of heart attacks in different areas and at different times. This can help officials put forward more efficient staffing, locate fire stations in the best places, and deploy and adjust resources as necessary
Credit Approval
One of the most popular ways predictive analytics has been used in recent years is in credit approval for loan applications. Big data has made much information available about people applying for credit over many years. This data has revealed hidden patterns and correlations used in predictive analytics. The patterns in the data have revealed previously unknown correlations that go beyond the simple calculation of a credit score to completely analyze an applicant to determine their risk of default on a loan. The data can also determine other risk factors that could prevent the loan from being paid off, such as the risk of bankruptcy, future unemployment, and even major illnesses that could result in the customer being unable to pay their bills. What is stunning about this trend is that it enables companies to determine the real risk of a given customer defaulting on their loan within seconds. There are downsides to this. There may be cases when a person-to-person interaction might help someone who has risk factors, as their circumstances may have changed. The automatic systems only go with probabilities and what the data says that instant, so they will probably err at times. Overall, the predictive analytics used in this type of application has proven to be extremely accurate.
Importance of Predictive Analytics
Predictive analytics will not create completely flawless systems. However, before predictive analytics and big data, companies had to act based on very limited information and relied on educated guesses and hunches. As a result, the efficiency in operations, reliability in products, and quality of services were not nearly as good as it could be. Although predictive analytics cannot provide you with a hundred percent reliability (nothing can do that anyway), it takes guessing out of the process. Over time, predictive analytics can also be refined. If a company can predict failure to a ninety percent probability, in a few years, they may be able to push that to ninety-five percent or higher. The more data the company can collect, the more the predictive analytical tools can be refined and improved. The result is that there will be many benefits both to the company and to the people they serve. First, companies that manufacture products can build products that are safer, more reliable, and offer better performance, overall. Those that offer services can improve them to better meet the needs of their customer base.