Using a Data Warehouse in Healthcare: Architecture, Benefits, and Use Cases

Published on January 20, 2022

What image comes to mind when you think of a warehouse? Endless rows of goods, neatly stacked on racks, packed and ready for shipping? A data warehouse for healthcare is based on a similar principle.

A healthcare data warehouse (healthcare DWH) is a digital repository of data that has been gathered from multiple sources and prepared for analysis. It may contain entries from medical records, insurance claims, lab tests, pharmacy prescriptions, or even population-wide research. A DWH is usually an integral part of well-designed healthcare analytics software.

If you want to enhance business intelligence and make the best use of available data, a DWH could be the right tool to add to your arsenal. Having a centralized data storage point enables your organization to extract insights, improve decision-making, optimize resources, and provide better patient care.

Demigos has worked with a number of established healthcare organizations and startups, building healthtech solutions of various kinds. In this article, we'll use our expertise in data science and knowledge of the industry to showcase the importance of DWH in healthcare. We'll also explain how the technology works and look at several use cases.

Let's start with the advantages of having a data warehouse.

Benefits of a data warehouse to healthcare organizations


The healthcare industry's focus on data analytics grows every year. Research on the healthcare big data market estimated its worth at $11.5 billion in 2016 and predicted it would reach $70 billion by 2025. That's roughly a sixfold increase in nine years.

Modern technologies like cloud computing and machine learning are helping medical organizations cope with the growing volumes of data. But in order to get quality insights, you first need to aggregate the data from disparate sources and give it structure. Using a healthcare data warehouse can help achieve that intermediate objective.

We've compiled a list of the typical benefits of DWH in healthcare applications. Although the specific goals, processes, and pain points of medical businesses vary greatly, you'll likely find that most of these resonate with you. 

Efficient reporting

With a centralized data repository that also provides on-the-fly analytics, your healthcare facility can generate timely and precise reports. For example, you can effectively monitor patient conditions, personnel performance, or pharmacy sales. Combined with advanced analytics and data visualization tools, these reports let you present data to key stakeholders, identify problem areas, or support clinical research.

Better clinical decisions

Harnessing the power of data is a complex endeavor, but it bears fruit — especially when the right insights come at the right time.

Instead of trying to quickly process unorganized input from siloed databases, you can work with data that has been structured and pre-processed. What's more, with a DWH, you'll have unified storage as well as analytics tools at your disposal. Take high-quality data, quick processing, and easy access, and add integration with a properly designed clinical decision support system. What you'll get is a highly efficient and agile framework that can output clinical decisions just when you need them.

Optimized insurance claims and payments

Processing large amounts of claim-related data quickly gives you a bird’s-eye view of the situation. It's a great opportunity for a hospital or any other healthcare facility to review its insurance compensation procedures, identify possible issues, and prevent fraud.
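As a minimal sketch of what "identifying possible issues" can look like, the Python snippet below flags claims that repeat the same patient, procedure code, and service date. The field names and sample claims are hypothetical, and exact-duplicate matching is only one simple rule, not a complete fraud detection method.

```python
from collections import defaultdict

# Hypothetical claim records; field names are assumptions for illustration.
claims = [
    {"claim_id": "C1", "patient_id": "P1", "procedure": "X100", "service_date": "2021-03-01"},
    {"claim_id": "C2", "patient_id": "P1", "procedure": "X100", "service_date": "2021-03-01"},
    {"claim_id": "C3", "patient_id": "P2", "procedure": "X200", "service_date": "2021-03-02"},
]

def find_duplicate_claims(records):
    """Group claims by (patient, procedure, date) and keep groups with more than one claim."""
    groups = defaultdict(list)
    for claim in records:
        key = (claim["patient_id"], claim["procedure"], claim["service_date"])
        groups[key].append(claim["claim_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

duplicates = find_duplicate_claims(claims)
for key, ids in duplicates.items():
    print("Review needed:", key, ids)
```

A flagged group is a prompt for manual review rather than proof of fraud, since legitimate repeat procedures can produce the same pattern.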

Enhanced strategic planning

The use of a data warehouse in healthcare enables a comprehensive approach to resource planning and promotes communication between departments. For instance, you can leverage the analytical capabilities of a DWH in conjunction with your inventory management software to optimize stock and plan procurement. 

By employing descriptive analysis, you can keep track of the current operations across departments and quickly identify inefficiencies. The next step — predictive analysis — will help you plan for the future and avoid known pitfalls.
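To make the two steps concrete, here is a small Python sketch. The monthly usage figures are made up, and the straight-line trend is just one simple stand-in for predictive analysis; a real forecast would likely account for seasonality and other factors.

```python
from statistics import mean

# Hypothetical monthly usage of one stock item (units consumed per month).
monthly_usage = [120, 130, 128, 140, 150, 155]

def describe(values):
    """Descriptive step: summarize what has already happened."""
    return {"average": mean(values), "minimum": min(values), "maximum": max(values)}

def forecast_next(values):
    """Predictive step: fit a least-squares line and extend it by one period."""
    n = len(values)
    xs = list(range(n))
    x_mean = mean(xs)
    y_mean = mean(values)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return intercept + slope * n

summary = describe(monthly_usage)
next_month = forecast_next(monthly_usage)
print(summary)
print(round(next_month, 1))
```

The descriptive summary tells you how much stock was used so far, while the forecast gives a rough figure to plan the next procurement around.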

With data gathered from disparate sources and processed in a timely manner, you'll always have the most comprehensive and current insights to act on. 

Improved patient experience and outcomes 

Why is having a data warehouse important in healthcare from a patient perspective? When you combine EHR/EMR information with diagnostic data, follow-ups, and long-term outcomes, you'll have the complete patient journey in front of you. Based on this data, you can bridge gaps in services, improve patient satisfaction and loyalty, and, ultimately, achieve a higher standard of care. 

Personalized value-based care

Using a data warehouse in healthcare analytics can propel providers on their way to offering value-based care. With the help of machine learning and other advanced analytics techniques, hospitals can gain deeper insights into the efficiency of certain treatment plans. This way, patients can receive treatment that is tuned precisely to their needs, which helps avoid unnecessary expenses.

Hopefully, this shows why data warehouses matter in healthcare. Implementing a warehouse can be very beneficial for a medical practice of any size. Let's now take a look inside a healthcare DWH and see what it's made of.

Healthcare data warehouse architecture


The choice of a healthcare data warehouse design depends on a number of factors. These include the scale of your organization, its specialization, and the specific business goals you plan to achieve by implementing a DWH. We recommend employing the help of a software vendor to assess your particular needs and pick the most suitable architecture that will meet them. 

Now, let's talk about the two most popular architecture models — individual data marts, and the enterprise DWH. 

Individual data mart model

This approach to DWH architecture involves working with specialized subsets of data from different domains. That's what a data mart is — an isolated repository of data dedicated to one subject. The biggest upside of this method is the ability to start small and scale up as necessary. 

For instance, you can ingest and analyze specific data that pertains to chronic illness or insurance claims, targeting the most critical areas. You can later build more data marts, expanding the scope of your analytics.
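The "start small, scale up" idea can be sketched in Python with SQLite standing in for the storage engine. The record layout, subject names, and the `build_mart` helper are all hypothetical; the point is only that each mart is a standalone store holding one subject area.

```python
import sqlite3

# Hypothetical source records tagged by subject area.
source_records = [
    ("P1", "claims", "claim submitted"),
    ("P1", "chronic_care", "HbA1c test ordered"),
    ("P2", "claims", "claim denied"),
]

def build_mart(records, subject):
    """Create a standalone in-memory database holding one subject area only."""
    mart = sqlite3.connect(":memory:")
    mart.execute("CREATE TABLE facts (patient_id TEXT, detail TEXT)")
    mart.executemany(
        "INSERT INTO facts VALUES (?, ?)",
        [(pid, detail) for pid, domain, detail in records if domain == subject],
    )
    mart.commit()
    return mart

# Start with one mart, then add another when the scope grows.
claims_mart = build_mart(source_records, "claims")
chronic_mart = build_mart(source_records, "chronic_care")

claims_count = claims_mart.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
chronic_count = chronic_mart.execute("SELECT COUNT(*) FROM facts").fetchone()[0]
print(claims_count, chronic_count)
```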

The data mart model is suitable for smaller healthcare providers that want to start improving specific operations as soon as possible. Individual data marts can exist independently, but they are also utilized in the enterprise model. Let's see what we mean by that.

Enterprise-wide data model

The enterprise data warehouse model is the most comprehensive option and includes most of the typical components and functionality. This approach to DWH architecture is indispensable when a healthcare provider wants to analyze multiple types of data within the organization. It’s also the go-to option for larger health center networks. 

The diagram below shows the typical internal setup of an enterprise data warehouse software for healthcare.


Data source layer

Multiple streams of healthcare data originate from different sources, such as EHRs/EMRs, financial reports, laboratory and pharmacy systems, clinical trials, claim management, the HR department, and so on. This data varies in format and isn't ready for analysis yet.

Staging zone

This layer contains temporary storage where processes like data aggregation and normalization take place. The end goal is to produce high-quality, well-structured data that is free from duplicates, inaccuracies, and inconsistencies.
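A rough idea of what happens in the staging zone, sketched in Python: rows from two source systems are normalized to one format and then deduplicated. The field names, the two date formats, and the matching rule (patient ID plus date of birth) are assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical raw rows from two source systems with inconsistent formatting.
raw_rows = [
    {"patient_id": " p001 ", "name": "jane doe", "dob": "1980-01-05"},
    {"patient_id": "P001", "name": "Jane Doe", "dob": "01/05/1980"},
    {"patient_id": "P002", "name": "JOHN ROE", "dob": "1975-12-31"},
]

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")  # assumed source formats

def normalize_date(text):
    """Try each known format and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")

def normalize(row):
    """Bring IDs, names, and dates to one consistent representation."""
    return {
        "patient_id": row["patient_id"].strip().upper(),
        "name": row["name"].strip().title(),
        "dob": normalize_date(row["dob"].strip()),
    }

def deduplicate(rows):
    """Keep the first row seen for each normalized (patient_id, dob) pair."""
    seen = {}
    for row in rows:
        seen.setdefault((row["patient_id"], row["dob"]), row)
    return list(seen.values())

clean_rows = deduplicate([normalize(r) for r in raw_rows])
print(clean_rows)
```

Rows with an unrecognized date raise an error instead of being loaded silently, which keeps inaccuracies from slipping into the storage layer.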

Data is organized and formatted for analytics by one of two methods. With ETL (extract, transform, load), data is transformed first and then loaded into the warehouse. With ELT (extract, load, transform), data is loaded into the warehouse as is and transformed on demand.
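The difference between the two methods is mostly about where the transformation runs. The Python sketch below uses an in-memory SQLite database as a stand-in for the warehouse; the table names and sample lab values are hypothetical.

```python
import sqlite3

# Hypothetical lab results as they arrive from a source system.
source_rows = [("P1", " 5.6 "), ("P2", "7.1"), ("P3", "n/a")]

def to_float(text):
    """Return a float, or None when the value cannot be parsed."""
    try:
        return float(text.strip())
    except ValueError:
        return None

def run_etl(rows):
    """ETL: transform in application code first, then load only clean rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE lab_results (patient_id TEXT, value REAL)")
    cleaned = [(pid, to_float(raw)) for pid, raw in rows]
    db.executemany(
        "INSERT INTO lab_results VALUES (?, ?)",
        [(pid, value) for pid, value in cleaned if value is not None],
    )
    query = "SELECT patient_id, value FROM lab_results ORDER BY patient_id"
    return db.execute(query).fetchall()

def run_elt(rows):
    """ELT: load raw text as is, then transform inside the warehouse with SQL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_lab_results (patient_id TEXT, value TEXT)")
    db.executemany("INSERT INTO raw_lab_results VALUES (?, ?)", rows)
    # The transformation lives in a view, so it is applied when data is queried.
    db.execute(
        "CREATE VIEW lab_results AS "
        "SELECT patient_id, CAST(TRIM(value) AS REAL) AS value "
        "FROM raw_lab_results WHERE TRIM(value) GLOB '[0-9]*'"
    )
    query = "SELECT patient_id, value FROM lab_results ORDER BY patient_id"
    return db.execute(query).fetchall()

etl_result = run_etl(source_rows)
elt_result = run_elt(source_rows)
print(etl_result)
print(elt_result)
```

Both paths return the same clean rows here. The ELT variant also keeps the unparseable "n/a" row in the raw table, so it can be reprocessed later if the transformation rules change.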

Data storage layer

This is the actual data warehouse, the heart of the system. Data exists here in a highly structured form, usually in an SQL database. It has been formatted, transformed, and standardized (the extent of this depends on the choice of the ETL or ELT process), and is ready to be analyzed. 

If necessary, you can organize data into individual data marts by specific business areas for use by separate departments. PHI (protected health information) is also stored here, and it's often anonymized during the transformation step to protect the patients' identity.
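One common building block for protecting patient identity during transformation is replacing identifiers with keyed hashes. The Python sketch below shows that technique (pseudonymization, which is narrower than full anonymization). The key, the field names, and the list of direct identifiers are assumptions, and this step alone should not be read as meeting any specific regulatory de-identification standard.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a managed secret store.
PSEUDONYM_KEY = b"replace-with-a-managed-secret"

def pseudonymize(patient_id: str) -> str:
    """Return a stable keyed hash (HMAC-SHA256) of a patient identifier."""
    return hmac.new(PSEUDONYM_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()

def strip_identifiers(record: dict) -> dict:
    """Replace the patient ID with a pseudonym and drop direct identifiers."""
    direct_identifiers = {"name", "phone"}  # assumed field names
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["patient_id"] = pseudonymize(record["patient_id"])
    return cleaned

record = {"patient_id": "P001", "name": "Jane Doe", "phone": "555-0100", "hba1c": 6.9}
safe_record = strip_identifiers(record)
print(safe_record)
```

Because the same input always maps to the same pseudonym, records for one patient can still be joined across tables without exposing the original identifier.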

Analytics and business intelligence

This is where you can use your internal analytics tool kit, as well as connect third-party apps. Things like data mining, statistical analysis, reporting, and data visualization are done here. 

This final layer of the DWH can benefit from integrations with ML (machine learning) software. ML models can be trained with large amounts of data to help drive decision-making later.

Before we move on to the next topic, let's mention the deployment/hosting options for your future DWH.

Data warehouse deployment strategies

Which strategy should you choose when deploying your data warehouse for healthcare applications?

In short, three options are available: using in-house infrastructure, moving the warehouse software and data to the cloud, or going hybrid. Below is a quick rundown of the pros and cons of each approach.

Using on-premise servers

With this method, both the software and the data are deployed on local servers.

The pros:

  • Data governance and regulatory compliance. With on-premise servers, you are in charge of every process and standard.

  • Total control of the tech stack. You have the keys to every door. Your team has direct access to the hardware, so if anything malfunctions, it can be quickly fixed. The same applies to the software: if you need to add features, your local or outsourced engineers can work with the code.

  • Fewer delays due to bandwidth. If the on-premise servers are connected to your local network, you can avoid the latency issues that are sometimes present with cloud solutions.

The cons:

  • An up-front investment into building the infrastructure. You'll need to purchase the servers and all the network equipment, pay for the installation, etc.

  • Costs of maintenance. Any CPU, disk, or memory upgrades, as well as software and firmware updates, will come out of your pocket.

  • Scaling up takes time. And again, money. It's not like you have unlimited hardware capabilities. Setting up additional servers will require extra effort.

Does the next option solve these problems? Let's find out.

Deploying to the cloud

You can use services like GCP (Google Cloud Platform), AWS (Amazon Web Services), or Microsoft Azure to host your data warehouse. By taking advantage of the microservices architecture (small modular apps that perform certain functions) and using APIs (application programming interfaces), you can connect third-party services to your software. The vendor will take care of your hardware needs.

What's more, there are turnkey solutions to choose from, such as Amazon Redshift, Oracle Autonomous Database, and Azure Synapse Analytics. These platforms are equipped with essential tools but might lack certain functionality and still need to be configured to meet your needs.

The pros:

  • Pay only for the resources used. The usage-based pricing that cloud providers offer means you're not spending money on hardware or bandwidth you don't need.

  • No hardware purchasing or maintenance headaches. You're completely relying on the vendor's infrastructure, so the initial investment can be less significant. And if anything breaks down — be it hardware or software — it's the provider's responsibility to fix it.

  • Scaling up is easy. Cloud solution vendors can dynamically add server nodes when your workload requires it.

The cons:

  • Performance issues due to bandwidth. During peak loads, cloud-based solutions can deliver mediocre performance when there's a limit on transfer speeds.

  • Reliance on the vendor for upgrades and maintenance. You have no control over these items, so you'll have to count on the vendor's tech support.

The third option is a combination of the first two.

The hybrid approach

You can combine the benefits of the on-premise and cloud approaches if your business model requires it. For instance, you can run your core analytics on your premises and test out new use cases with the help of a cloud solution. Or choose to store and process your PHI data on the local servers for security reasons. Another scenario is using cloud servers for backup.

Data warehouse challenges in the healthcare industry


Now that you have an idea of the inner workings of a healthcare data warehouse, it's a good time to look at the challenges. Here are some concerns to consider before you commit to building one.

Data integration complexity

Data interoperability is a major pain point for many medical organizations, and it will surely surface when building a clinical data warehouse in healthcare. When you add financial and administrative data into the mix, the task of transforming data from those sources to meet a common standard becomes incredibly taxing. 

Building an efficient set of tools, a reliable ETL or ELT pipeline, is essential for successful data integration. However, there may be obstacles on the way. For instance, not every vendor supports HL7 (a set of healthcare data standards) when exporting data.
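To give a sense of what an integration pipeline has to handle, here is a minimal Python sketch that reads patient fields from an HL7 v2 message, assuming the pipe-delimited v2 encoding. The sample message contains made-up data, and the `parse_pid` helper is hypothetical. It ignores escape sequences, repeating fields, and custom delimiters, so a production pipeline would normally rely on a dedicated HL7 library instead.

```python
# Sample HL7 v2 message with made-up patient data. Segments are separated by
# carriage returns, fields by "|", and components by "^".
message = "\r".join([
    "MSH|^~\\&|LAB|HOSP|DWH|HOSP|202201200830||ORU^R01|MSG0001|P|2.5",
    "PID|1||12345^^^HOSP^MR||DOE^JANE||19800105|F",
])

def parse_pid(hl7_message: str) -> dict:
    """Extract a few patient fields from the PID segment of an HL7 v2 message."""
    for segment in hl7_message.split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID":
            family, given = fields[5].split("^")[:2]
            return {
                "patient_id": fields[3].split("^")[0],  # PID-3, first component
                "family_name": family,                  # PID-5, first component
                "given_name": given,                    # PID-5, second component
                "birth_date": fields[7],                # PID-7 (YYYYMMDD)
                "sex": fields[8],                       # PID-8
            }
    raise ValueError("No PID segment found")

patient = parse_pid(message)
print(patient)
```

When a source system cannot export HL7 at all, the pipeline needs a separate adapter for that system's format, which is where much of the integration effort tends to go.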

To overcome these challenges, you'll need a team of experts with a background in data science. Which takes us to the next point.

Lack of in-house technical expertise

Developing a DWH requires careful planning and considerable effort, so involving healthcare data warehouse experts is a must. It's especially true in the case of enterprise-scale medical companies, where the volume of data is often immense, and its types are innumerable. 

It's important to realistically assess the capacity and experience of your internal IT team. Even though they might have done a good job of building or supporting your current software, designing a full-fledged DWH is a different animal. Find a software development company with the relevant background and a solid track record to share this journey with.

Data security and privacy concerns

Medical systems are historically rich in PHI. In the US, such data must be stored and managed according to the Health Insurance Portability and Accountability Act, or HIPAA. Keeping your data warehouse HIPAA-compliant is absolutely crucial for a successful implementation. If your business model necessitates sharing private patient information with partners, all parties concerned must enter into a Business Associate Agreement.

As to data security, while HIPAA’s formulation “to protect against reasonably anticipated threats” may be a bit vague, the penalties for allowing data breaches are quite real. Not to mention the reputational losses. Regardless of the deployment strategy — whether on-premise or in the cloud — security assessments should be done regularly. 

This is by no means an exhaustive list of the challenges of implementing a DWH. You may also run into problems when allocating funds and hardware resources or even get stuck at the stage of securing stakeholder buy-in. You may face purely technical difficulties that will result in latency issues or downtime. 

However, with a reliable IT company that uses the right tech stack and follows best practices, you stand a real chance of receiving a five-star product.

In the following section, we’ll look at several data warehouse healthcare examples.

3 Use cases of healthcare data warehouses


So, how do the benefits of having a DWH apply to real-life scenarios? We've selected three use cases as illustrations.

Improving outcomes for patients with diabetes

According to a CDC report, more than 10% of Americans were living with diabetes in 2020, and about 34% of US adults had prediabetes. Two facts make diabetes the perfect subject for a data analytics effort:

  • the demographic coverage of the disease is extensive

  • most of its long-term complications are preventable with appropriate care

This means that a lot of data is available for analysis and that the insights produced in the process can directly influence patient care. Adjustments to the treatment/exam procedures can improve outcomes for patients, reduce costs for hospitals, and ease the burden for insurers.

For these plans to become reality, healthcare providers must implement data warehouses that enable quick and accurate reporting. By leveraging the benefits of a DWH approach, clinicians will be able to proactively monitor and treat diabetic patients.

Managing childhood immunization

To vaccinate children according to individual schedules, medical facilities need to complete a series of steps:

  • Obtain information on vaccination history

  • Schedule planned vaccinations according to CDC requirements

  • Accurately map the patient to an available healthcare provider

  • Inform the patient

  • Inform the clinic

  • Set up an appointment

A healthcare data warehouse solution, coupled with a reporting system, would be very helpful in solving this problem. A DWH could take care of gathering and merging data from multiple sources, tracking the patients and their schedules. The system's internal logic would then match patients with available doctors, while also following CDC immunization guidelines.  

This would eliminate manual labor and streamline the entire process of childhood immunization.
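The matching logic described above could look roughly like the following Python sketch. Everything in it is hypothetical: the schedule values are placeholders rather than the actual CDC schedule, and the "first provider with a free slot" rule is the simplest possible assignment strategy.

```python
from datetime import date

# Hypothetical schedule: vaccine name -> age in months when the dose is due.
# These values are placeholders, not the actual CDC schedule.
SCHEDULE = {"vaccine_a": 2, "vaccine_b": 12}

children = [
    {"id": "C1", "birth_date": date(2021, 1, 10), "received": {"vaccine_a"}},
    {"id": "C2", "birth_date": date(2021, 11, 1), "received": set()},
]
providers = [
    {"name": "Clinic North", "open_slots": 1},
    {"name": "Clinic South", "open_slots": 2},
]

def age_in_months(birth_date, today):
    """Count completed months between the birth date and today."""
    months = (today.year - birth_date.year) * 12 + (today.month - birth_date.month)
    return months - 1 if today.day < birth_date.day else months

def due_vaccines(child, today):
    """Doses whose due age has been reached and that are not yet recorded."""
    age = age_in_months(child["birth_date"], today)
    return sorted(v for v, due in SCHEDULE.items() if age >= due and v not in child["received"])

def plan_appointments(children, providers, today):
    """Assign each child with due doses to the first provider with a free slot."""
    slots = {p["name"]: p["open_slots"] for p in providers}
    plan = []
    for child in children:
        due = due_vaccines(child, today)
        if not due:
            continue
        provider = next((name for name, free in slots.items() if free > 0), None)
        if provider is None:
            break  # no capacity left anywhere
        slots[provider] -= 1
        plan.append({"child": child["id"], "provider": provider, "vaccines": due})
    return plan

appointments = plan_appointments(children, providers, today=date(2022, 1, 20))
for appointment in appointments:
    print(appointment)
```

The resulting plan is what the reporting system would use to notify the patient's family and the clinic.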

Reducing risk/liability

An average medical institution is exposed to many risks and liabilities. A properly implemented healthcare data warehouse can help reduce or even eliminate many of these.

  • Aggregation and analysis of claim management data can help prevent insurance fraud. By processing significant amounts of claim-related data, the analytics tools connected to your warehouse can learn to identify "red flags" and stop fraudsters in their tracks.

  • Patient data can be analyzed for gaps in care to fix issues and prevent lawsuits. With a data warehouse, every patient journey becomes clearly visible. With enough data, you can establish correlations between treatment plans, exams, and outcomes. So if a missed lab test or imaging request once led to a poor outcome, the system can alert clinicians the next time a similar situation arises.

  • Recalled medications and negative drug interactions can be tracked. When relevant data is added and processed regularly and automatically, a healthcare provider can prevent medicine-related incidents. This can be done by setting up controls in the reporting component of a data warehouse that will alert clinicians in potentially dangerous situations.
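The medication controls from the last point can be sketched as a simple rule check in Python. The drug names, the recall list, and the interaction pairs are invented placeholders; a real system would load them from curated reference sources.

```python
# Hypothetical reference data used only for illustration.
recalled_drugs = {"drug_x"}
interacting_pairs = {frozenset({"drug_a", "drug_b"})}

prescriptions = {
    "P1": ["drug_a", "drug_b"],
    "P2": ["drug_x"],
    "P3": ["drug_c"],
}

def medication_alerts(patient_drugs):
    """Return alert messages for recalled drugs and known interacting pairs."""
    alerts = []
    for patient_id, drugs in sorted(patient_drugs.items()):
        for drug in drugs:
            if drug in recalled_drugs:
                alerts.append(f"{patient_id}: {drug} has been recalled")
        # Compare every pair of drugs prescribed to the same patient.
        for i, first in enumerate(drugs):
            for second in drugs[i + 1:]:
                if frozenset({first, second}) in interacting_pairs:
                    alerts.append(f"{patient_id}: {first} interacts with {second}")
    return alerts

alerts = medication_alerts(prescriptions)
for alert in alerts:
    print(alert)
```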

Essentially, a DWH can provide better visibility into a facility's operations, help automate processes, and offer more granular control. 

Wrapping up

A healthcare data warehouse can be an invaluable asset for any medical organization. The value lies in the insights extracted from the company's data, but more importantly, in the quality of those insights and the speed of processing. To achieve that quality and speed, you'll need to spend time and money developing a solution that is technologically advanced and has a sound architectural design. 

But technology alone can't be the silver bullet. Long before building a data warehouse, you'll have to start by assessing your pain points and data analytics needs. That is also the best time to involve a software vendor in the process.

The Demigos team is ready to create the data warehouse your medical business needs. We specialize in building top-of-the-line healthtech and agetech solutions. Take a look at our latest health-related projects like Wendy's Team and GapNurse.

Ivan Dunskiy
Ivan has been working in the tech industry for more than 10 years as a Quality Assurance Engineer, Mobile Software Developer, and Product Manager. Co-founder of 2 startups.