IEEE Published Work: Medicine Expenditure Prediction

Machine learning (ML) offers a wide range of techniques to predict medicine expenditures using historical expenditure data as well as other healthcare variables. For example, researchers have developed multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN) models for predicting healthcare outcomes. However, recently proposed generative approaches (e.g., generative adversarial networks; GANs) are yet to be explored for time-series prediction of medicine-related expenditures. The primary objective of this research was to develop and test a generative adversarial network model (called the “variance-based GAN,” or V-GAN) that specifically minimizes the difference in variance between model and actual data during model training. For our model development, we used patient expenditure data for a popular pain medication in the US. In the V-GAN model, we used an LSTM model as the generator network and a CNN model or an MLP model as the discriminator network. The V-GAN model’s performance was compared with other GAN variants and with ML models proposed in prior research, such as linear regression (LR), gradient boosting regression (GBR), MLP, and LSTM. Results revealed that the V-GAN model using an LSTM generator and a CNN discriminator outperformed the other GAN-based prediction models, as well as the LR, GBR, MLP, and LSTM models, in correctly predicting patients’ medicine expenditures. Through this research, we highlight the utility of developing GAN-based architectures involving variance minimization for predicting patient-related expenditures in the healthcare domain.
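The abstract describes a loss that penalizes the difference in variance between generated and actual expenditure series. As a minimal sketch of that idea, the snippet below uses a squared difference of sample variances as the penalty term; the paper’s exact formulation, weighting, and training loop may differ, and the data here is synthetic toy data, not the actual patient expenditure dataset.

```python
import numpy as np

def variance_penalty(real, generated):
    # Squared difference between the sample variances of the actual and
    # generated series. In a V-GAN-style setup, a term like this would be
    # added to the generator's adversarial loss (assumed formulation).
    return (np.var(real) - np.var(generated)) ** 2

rng = np.random.default_rng(0)
real = rng.normal(100.0, 15.0, size=500)           # toy "actual expenditures"
gen_collapsed = rng.normal(100.0, 2.0, size=500)   # low-variance (mode-collapsed) generator output
gen_matched = rng.normal(100.0, 15.0, size=500)    # variance-matched generator output

# The collapsed output should incur a much larger penalty than the matched one.
print(variance_penalty(real, gen_collapsed) > variance_penalty(real, gen_matched))
```

A penalty like this pushes the generator away from the common GAN failure mode of producing low-variance, overly smooth series, which matters for expenditure data with large patient-to-patient spread.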

Machine Learning with CMS Public Healthcare Dataset, Part II

Introduction to CMS OpenPayments Data Analysis using Machine Learning

In the previous blog we discussed the fundamental concept of what Machine Learning is and how it can be applied in the modern world of Pharmaceuticals and Healthcare. We also explored CMS Open Payments, the federal program that collects information about the payments drug and device companies make to their potential clients.


Machine Learning with CMS Public Healthcare Dataset, Part I


From the outset, the term “Machine Learning” can seem very daunting to those unfamiliar with the technicalities of what it actually means — or so it seemed to me when I was initially assigned to develop a use case for one of these algorithms during my time here at RxDataScience. From a quick study into the topic: Machine Learning (ML), put simply, is a branch of Artificial Intelligence (AI) that allows a system to automatically learn and improve without being explicitly instructed, by using past and present data to predict certain outcomes [1]. The following video provides a gentle introduction to what ML is all about:


Human Activity Recognition Using Pharma ML and IoT Devices, Part II

Decision Trees

As mentioned in my previous excerpt, I will be delving further into the complexity of the algorithm I used in my study. Following some research into decision trees and the impact they have had on healthcare and pharma, I found that they have been assisting across the field since the early 1990s in the form of Evidence-Based Medicine (EBM). The stages detailed in this process were summarised as:


Leveraging the Power of Real-time ETL for Better Pharmaceutical Insights

The ETL (extract, transform and load) process by which organizations prepare data for storage is an essential part of modern database systems, particularly used for business intelligence applications. The problem is that it can be inefficient and slow — too slow for companies to do real-time and streaming analytics.


Machine Learning for Pharma using Random Forest, Part II

Introduction to Machine Learning with Random Forest (Pharma/Genetics)

Picking up where we left off in the last blog post, where we discussed the potential of predictive analytics in the genetics of cancer, I aim to explore that potential using the aforementioned Random Forest classification algorithm.
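Before applying Random Forest to a real genetics dataset, it helps to see the algorithm on a toy problem. The sketch below uses scikit-learn’s `RandomForestClassifier` on made-up binary features; the feature names, data, and parameters are purely illustrative stand-ins, not the dataset discussed in this series.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in data: the class label depends only on the first feature,
# so an ensemble of decision trees should learn the rule easily.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 10)
y = np.array([0, 0, 1, 1] * 10)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)
print(clf.predict([[0, 0], [1, 1]]))  # → [0 1]
```

Each tree in the forest is trained on a bootstrap sample of the data with a random subset of features considered at each split, and the final prediction is a majority vote across trees — the same mechanics apply when the features are genetic markers instead of toy binaries.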


Big Data Application in Healthcare For Effective Patient Treatment

Within the first decade of the 21st century, the use of big data became very popular in many large industries. The methods for capturing big data have since evolved from traditional data lake systems to more integrated technologies that combine big data with all other systems within a company [1].


Using Analytics to Address the Most Common Patient Journey Challenges

A patient’s journey as they navigate and choose from the many treatments and options available to them is highly unique. The sheer number of variables can be overwhelming. Aggregate patient journey information is locked in APLD (anonymous patient-level data) sources such as claims, prescription, and EMR data sets. Systematically sifting through multiple multi-billion-row data sets for actionable insights is essential for today’s pharmaceutical manufacturers, and it requires the most advanced data infrastructures available to handle at scale.


Big Pharma; Big Data: how a third party company can revolutionize your data for a better tomorrow

As every industry becomes more technologically savvy, the pharmaceutical industry has the chance to step fully into the 21st century and reform how it treats patients while tackling the medical epidemics that affect the world today.


PM360: Four Key Questions About Data Analytics

PM360 asked experts in data analytics about how the process behind collecting and analyzing data is changing, including these four key questions:

  • How is the advancement of artificial intelligence, natural language processing, machine learning, and deep learning impacting the data analytics landscape?
  • How has the rise of concern for data privacy and the implementation of the EU’s General Data Protection Regulation (GDPR) affected pharma’s ability to collect or use data?
  • What can organizations do to help make their employees more data literate? Ultimately, how do you make everyone comfortable using data analytics?
  • What is the future of data analytics? How else are data collection, analysis, visualization, implementation, etc., likely to change in 2019 and beyond?