Artificial intelligence (AI) and machine learning (ML) are rapidly growing disciplines within the life sciences industry. Applications of AI in healthcare alone are expected to grow to more than $8 billion USD by 2022 (globally). Almost half of global life sciences professionals are either using, or are interested in using, AI in some area of their work.
The healthcare industry is competitive and dynamic in nature. AI tools are highly attractive to this industry since they have been successful in clinical research, trial management, regulatory and market access, as well as commercial effectiveness applications. AI/ML has been adopted by those needing access to deeper market insights to craft real world data-driven strategies with speed and precision. AI/ML, integrated into a healthcare company’s analytics strategy, replaces gut instinct and rule-based decision making. It provides evidence-based insights that can reveal complex patterns like those found in patient behaviors, health outcomes, HCP prescribing, and sales, that were previously undetected.
Advances in AI/ML, combined with the increasing availability of healthcare data (e.g. from pharmacies, insurers, healthcare professionals (HCPs), labs, electronic medical records (EMR), marketing campaigns, and social media), offer the life sciences industry a wealth of insights and the promise of a competitive advantage with the ability to drive healthcare forward.
Machine learning first appeared in computer science research in the 1950s. So why, after all these decades, is the life sciences industry finally interested in this family of analytics? The simple answer has to do with data storage and data processing capacities. Both have grown tremendously since that time, to the point where it is now affordable for businesses to use machine learning. Consider that, for a fraction of the cost, a smartphone now has more storage and computing power than a mainframe in the 80s.
Machine learning draws from numerous fields of study: artificial intelligence, data mining, statistics, and optimization. Data (text) mining uses data storage and data manipulation technologies to prepare the data for analysis. Then, as part of the data mining task, statistical or machine learning algorithms can detect patterns in the data and make predictions about new data.
When comparing machine learning to classical statistics, we often look to the assumptions about the data that each analysis needs in order to function reliably. Classical statistical methods typically require the data to have certain characteristics and often use only a few features (called covariates or predictors) to produce results. Machine learning models, by contrast, might use hundreds or even thousands of parameters to find similarities and patterns in the data.
The similarities and differences between classical statistics and machine learning have generated numerous discussion papers and blogs. Here are some key points worth mentioning:
Classical statistics, a subfield of mathematics, almost always starts with a hypothesis, and generally assumes that some structural relationship exists in the data. It uses probability theory and underlying distributions, and is usually applied:
In the life sciences industry, the use of classical statistical methods is the foundation for R&D activities and peer-reviewed, real world publications. Statistical analysis plans in this discipline must adhere to pre-defined industry standards. Such cases include randomized clinical trial analyses and patient analytics, such as survival analysis to compare persistence metrics across multiple groups.
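To make the survival analysis example more concrete, here is a minimal Python sketch of a persistence curve for two patient groups. It assumes the Kaplan-Meier estimator, which is one common choice but is not named in this blog. The function name `kaplan_meier` and the patient data are hypothetical and for illustration only.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time where an event occurred.

    durations: time on therapy for each patient (e.g. in months)
    observed:  1 if the patient discontinued at that time (event),
               0 if follow-up ended while still on therapy (censored)
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the share of at-risk patients who did not discontinue at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove events and censored patients after time t
    return curve

# Hypothetical persistence data (months on therapy) for two patient groups.
group_a = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
group_b = kaplan_meier([2, 4, 4, 6], [1, 0, 1, 0])

for name, curve in (("A", group_a), ("B", group_b)):
    for t, s in curve:
        print(f"group {name}: month {t}, share still on therapy = {s:.2f}")
```

A formal comparison between the groups would typically add a significance test, such as the log-rank test, which is omitted here.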
Machine learning is more exploratory and less dependent on a priori hypotheses or assumptions. Algorithms are typically far more complex than their statistical counterparts and often require design decisions to be made before an iterative training process begins. This is due to the difficulty of feature engineering caused by the large number of inputs (high dimensional data sets) and the inclusion of unstructured data (e.g. text data).
Despite these differences, there are many instances where classical statistics and ML use similar approaches and, therefore, overlap with each other. For example, logistic regression is one technique ML borrowed from the field of statistics. It is widely used for classification problems such as segmentation and prediction of group assignment.
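As a rough illustration of this overlap, the sketch below fits a one-variable logistic regression in plain Python and then reads the same fitted model in two ways: as a statistician (interpreting the coefficient as an odds ratio) and as an ML practitioner (classifying new cases). The data set, the variable names, and the use of simple gradient descent are assumptions made for this example; in practice you would use an established statistics or ML library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1*x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y   # gradient of the log-loss w.r.t. the linear term
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Hypothetical toy data: number of sales calls to an HCP, and whether the HCP prescribed.
calls = list(range(10))
prescribed = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(calls, prescribed)

# Statistical reading: interpret the coefficient (odds multiplier per extra call).
odds_ratio = math.exp(b1)

# Machine learning reading: use the fitted model to assign new cases to a group.
def predict(x, threshold=0.5):
    return 1 if sigmoid(b0 + b1 * x) >= threshold else 0

print(f"odds ratio per additional call: {odds_ratio:.2f}")
print("predicted class for 0 calls:", predict(0))
print("predicted class for 9 calls:", predict(9))
```

The model is identical in both readings. What differs is the question asked of it: "how strong is the relationship?" versus "which group does this new case belong to?"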
Here’s a quick summary of the differences between classical statistics and machine learning:
| | Classical Statistics | Machine Learning |
|---|---|---|
| Approach | Data Generating (stochastic) Process | Algorithmic Model |
| Driver | Math, Probability Theory | Fitting Data |
| Focus | Hypothesis Testing, Interpretability | Predictive Accuracy (Precision and Recall) |
| Data size | Low-Med | Big Data |
| Dimensions | Mostly for Low Dimensions | High Dimensional Data |
| Inference | Parameter Estimation, Predictions, Estimating errors | Pattern Recognition |
| Model Choice | Parameter Significance (p-values), Goodness of Fit | Cross-validation of Predictive Accuracy on Partitions of Data |
| Popular Tools | R, SAS | Python |
| Interpretability | High | Med |
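
The "Model Choice" entry for machine learning, cross-validation of predictive accuracy on partitions of data, can be sketched in a few lines of Python. The toy threshold classifier, the synthetic data, and the helper names (`k_fold_indices`, `fit_threshold`, `cross_validate`) are all hypothetical and are only meant to show the mechanics: the model is fitted on part of the data and scored on the part it has not seen.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint folds (round-robin; real use would usually shuffle)."""
    return [list(range(i, n, k)) for i in range(k)]

def fit_threshold(xs, ys):
    """Toy classifier: a cut-off halfway between the two class means."""
    class0 = [x for x, y in zip(xs, ys) if y == 0]
    class1 = [x for x, y in zip(xs, ys) if y == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2

def accuracy(threshold, xs, ys):
    """Share of cases where 'x above the cut-off' matches the true label."""
    correct = sum((1 if x > threshold else 0) == y for x, y in zip(xs, ys))
    return correct / len(ys)

def cross_validate(xs, ys, k=5):
    """Fit on k-1 folds, score on the held-out fold, and repeat for every fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        scores.append(accuracy(threshold,
                               [xs[i] for i in held_out],
                               [ys[i] for i in held_out]))
    return scores

# Synthetic, overlapping one-dimensional data: class 0 spans 0.0-3.8, class 1 spans 2.0-5.8.
xs = [0.2 * i for i in range(20)] + [2.0 + 0.2 * i for i in range(20)]
ys = [0] * 20 + [1] * 20

scores = cross_validate(xs, ys, k=5)
print("accuracy per fold:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Where a classical analysis would judge the model by the significance of its parameters and its goodness of fit on the full data set, this approach judges it only by how well it predicts held-out cases.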
For life sciences companies, understanding the pros and cons of both classical statistics and AI/ML is important when investing in your business. Several key industry-specific conditions can lead decision makers to adopt machine learning solutions. For example:
Effectively deploying AI/ML technologies can transform a commercial strategy, giving decision makers an edge in the marketplace. However, it only works when organizations have a machine learning strategy with all the necessary elements:
AI and machine learning can deliver previously inaccessible insights that can positively impact commercial activities and support various functions within healthcare organizations. AI/ML methods have been shown to consistently deliver more accurate outcomes in less time than conventional assessments. Deriving the greatest benefit from the investment entails adopting a long-term strategy and new ways of performing analytics, rather than looking for short-term gains.
Strategies include:
Classical statistics and machine learning need to co-exist; the use of one versus the other should be based on the analytical problem at hand. In some scenarios, they serve very different purposes. In others, they may overlap. The question is not whether one approach should be adopted at the expense of the other, but rather to determine which is the most appropriate for any given business situation.
Machine learning is moving into the mainstream. Effective use of machine learning in business entails developing an understanding of ML within the broader analytics environment, becoming familiar with proven applications, anticipating the challenges you may face using it in your organization, and learning from leaders in the field. Consider a holistic view of machine learning inside your organization. The volume and variety of data, combined with significant regulatory requirements in the healthcare industry, present a challenge. However, if healthcare companies can successfully navigate this challenge, they face an unprecedented opportunity to answer complex questions about how to best demonstrate the value of their products, craft messaging, and execute sales strategies that deliver commercial success.
In the next few blogs of this AI/ML series, we will demonstrate success stories where AI/ML has been applied to bring competitive advantage to clinical and commercial teams.
If you have questions or comments about this blog or would like to discuss how your business can transform from using traditional statistics to machine learning, contact Pierre.St-Martin or Canadainfo@iqvia.com.