MUMPS: the Arcane Database Language Behind Our Electronic Medical Records
Michael T. Fenn
Technology Strategy & Analysis
May 07, 2019
In our increasingly digital age, databases sit behind the scenes nearly everywhere, retaining and managing our digital data, whether financial records, text messages, a company sales list, or the “trivial” statistics of fantasy football.
Databases frequently become critical—even the critical—evidence during litigation.
For example, databases holding electronic medical records (EMR)* and the software that manages them may be scrutinized. Doctors accused of malpractice may have the authenticity of their electronically logged notes or test orders questioned. Hospitals may be questioned on the accuracy of their billing and reimbursement calculations. Even the EMR software vendors themselves can be the targets of False Claims Act or class action lawsuits that attack the reliability of their software and seek substantial damages.
Accordingly, to conduct efficient and effective discovery, litigators need to understand the underlying databases at issue in their matters—or work with experts who do. This is particularly true when dealing with the peculiar databases found in most EMR systems.
Standard (Relational) Databases
Most databases in use today are “relational” databases. These comprise the well-known constructs of tables with rows and columns. According to a 2018 survey, specific relational database implementations held the top three places, and four of the top five, among the databases most popular with programmers. Programmers typically interact with relational databases using SQL (Structured Query Language), a widely known language designed specifically for relational databases. According to that same survey, SQL is the fourth most popular of all programming languages (not just database languages), with nearly 60% of programmers using it.
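As a point of reference, the relational model can be sketched in a few lines of Python using the standard library's sqlite3 module. This is only an illustration: the table name, column names, and figures are hypothetical and are not drawn from any real EMR system.

```python
import sqlite3

# In-memory database; the table, columns, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (patient_id INTEGER, visit_date TEXT, charge REAL)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2019-01-05", 120.0),
        (1, "2019-02-11", 80.0),
        (2, "2019-01-20", 200.0),
    ],
)

# A single declarative SQL query totals the charges for each patient.
totals = conn.execute(
    "SELECT patient_id, SUM(charge) FROM visits "
    "GROUP BY patient_id ORDER BY patient_id"
).fetchall()
print(totals)
```

Every row has the same columns, and the query names those columns directly. That fixed, declared shape is what lets an examiner read a relational database without first reading the application's source code.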
Non-standard (Non-relational) Databases
A large number of critical database applications, including those managing our electronic medical records, do not use relational databases and SQL. Rather, many are written in a database programming language known to few and understood by fewer: MUMPS.
The Massachusetts General Hospital Utility Multi-Programming System (MUMPS or simply M) dates back to 1966 and since then has been a staple of EMR database systems, including EMR software developed in-house by healthcare organizations, as well as widely used commercial EMR software (such as the software sold by Epic Systems, MEDITECH, and others). Since its invention more than a half century ago, MUMPS has evolved into numerous flavors, such as InterSystems’ Caché®, a commercial MUMPS database management system. Caché is used in Epic Systems’ commercial EMR software, as well as in the EMR systems of Partners HealthCare and the U.S. Department of Veterans Affairs. Caché and MUMPS are also prominent in financial services database software.
To the uninitiated, MUMPS analysis presents a steep (and expensive) learning curve. In a MUMPS database, there are no familiar tables or corresponding SQL queries. Instead, data objects are stored in balanced tree structures, presented to the programmer or examiner as multidimensional sparse arrays. While a table in a relational database is limited to two dimensions—rows and columns—a single MUMPS data object can have an arbitrary number of dimensions. Accordingly, the widely known SQL format for queries simply doesn’t work for this data model, and instead MUMPS provides its own syntax for referencing data.
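The idea of a sparse, multidimensional array can be approximated in Python. The sketch below is a toy model, not MUMPS and not any vendor's API: the class SparseArray, its methods, and the patient data are all hypothetical, and all subscripts are kept as strings for simplicity. In MUMPS itself, a reference is written as a name followed by parenthesized subscripts, along the lines of ^PATIENT(1001,"NAME"), where the name and subscripts here are again only illustrative. The next_subscript method is loosely analogous to the MUMPS $ORDER function, which steps from one subscript to the next at a given level.

```python
class SparseArray:
    """Toy model of a sparse, multidimensional array (not a real MUMPS API)."""

    def __init__(self):
        # Only nodes that were explicitly set take up any space.
        self._nodes = {}

    def set(self, *subscripts, value):
        """Store a value at an arbitrary depth of subscripts."""
        self._nodes[tuple(subscripts)] = value

    def get(self, *subscripts, default=""):
        """Return the stored value, or a default if the node was never set."""
        return self._nodes.get(tuple(subscripts), default)

    def next_subscript(self, *subscripts):
        """Return the next subscript after the last one given, at that level.

        Pass "" as the last subscript to start from the beginning;
        None means there are no more subscripts at that level.
        """
        *prefix, last = subscripts
        prefix = tuple(prefix)
        depth = len(prefix)
        for key in sorted(self._nodes):
            if len(key) > depth and key[:depth] == prefix and key[depth] > last:
                return key[depth]
        return None


chart = SparseArray()
# Nodes may sit at different depths; there are no fixed rows or columns.
chart.set("PATIENT", "1001", "NAME", value="DOE,JANE")
chart.set("PATIENT", "1001", "VISIT", "2019-01-05", "BP", value="120/80")
chart.set("PATIENT", "1002", "NAME", value="ROE,RICHARD")

print(chart.get("PATIENT", "1001", "VISIT", "2019-01-05", "BP"))

# Walk the patient identifiers one sibling at a time.
patient = chart.next_subscript("PATIENT", "")
while patient is not None:
    print(patient, chart.get("PATIENT", patient, "NAME"))
    patient = chart.next_subscript("PATIENT", patient)
```

Note that one patient has a visit record five subscripts deep while the other has none at all. Nothing in the data structure says which subscripts exist or what they mean, so an examiner has to discover the layout by walking the data or reading the code that wrote it.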
Further complicating analysis, MUMPS requires no “schema” formally defining how its data are organized; clues to this organization reside only in the application’s source code. For those without experience, the language itself can pose a challenging read: its general syntax and control flow eschew the conventions of modern languages. Furthermore, because it originated in a bygone era when program memory was a scarce resource, the language provides a myriad of one- or two-character abbreviations for most commands; terse naming conventions in the source code are the norm and make analysis more difficult.
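To give a sense of the abbreviation problem, the Python sketch below expands the first command on a line of MUMPS source into its full name. The lookup table covers only a handful of standard commands, the function expand_first_command is a hypothetical helper, and the sample lines (including the global name ^PT) are invented for illustration. It is a deliberately naive aid for a human reader, not a MUMPS parser: it ignores every command after the first one on a line.

```python
# A few standard MUMPS command abbreviations and their full names.
COMMANDS = {
    "S": "SET",
    "W": "WRITE",
    "K": "KILL",
    "D": "DO",
    "Q": "QUIT",
    "I": "IF",
    "F": "FOR",
    "N": "NEW",
}


def expand_first_command(line):
    """Spell out the abbreviated command that starts a line of source.

    Hypothetical helper: handles only the first command on the line and
    leaves anything it does not recognize unchanged.
    """
    head, space, rest = line.strip().partition(" ")
    # A command may carry a condition after a colon, as in Q:X=1.
    name, colon, condition = head.partition(":")
    full_name = COMMANDS.get(name.upper(), name)
    return full_name + colon + condition + space + rest


for source_line in ['S ^PT(1001,"NAME")="DOE,JANE"', "W X,!", "Q:X=1"]:
    print(expand_first_command(source_line))
```

Even with the commands spelled out, short names such as ^PT and X still give no hint of what the data represent, which is why experience with the particular application matters as much as knowledge of the language.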
If you encounter electronic medical records or other MUMPS databases, it’s worth your time to work with experts who have experience with the MUMPS language and database management systems like Caché. Call us and we will help you understand the best course of action for analyzing this data; that first call is always free.
* Electronic Medical Records (EMR) and Electronic Health Records (EHR) are typically viewed as interchangeable in the medical technology industry.