Friday, April 3, 2020

The IHME model showing mass COVID-19 deaths in Alabama is highly suspect

A model currently being used to make projections about the numbers of cases and fatalities due to COVID-19 has serious flaws, and its results should be viewed with great skepticism.

Risk management and numerical modeling are a major part of what I do in my day job. So I'm not parroting what I've heard from people who just don't like grisly numbers.

The basic problem with the model produced by the Institute for Health Metrics and Evaluation (IHME) is that it uses a horribly biased sample of existing cases drawn from heavily populated areas of the United States. It then applies inferential statistics to those data to reach conclusions about possible outcomes, and so produces expectations that are unrealistic. An analogy from my own work: It's as if I used a sample of homes taken from New York and New Jersey to represent flood-prone properties in places like Elba, Geneva, Demopolis and Saraland in Alabama.

Last month, I wrote a Facebook post about a quirk in the state's COVID-19 statistics, wherein I pointed out that Jefferson County's number of confirmed cases of the virus was out of whack with the county's proportion of the state population. Early on, JeffCo's share of cases was as high as three times the county's share of the population vis-a-vis the rest of the state. The county's population of about 660,000 is about 13.5% of Alabama's estimated 4.88 million souls, but it was accounting for as much as 59% of our COVID-19 cases. Some of that can be explained by a greater number of tests for the virus, but not all of it.

As of this writing, the county has 340 of the state's 1,340 cases, or about 25% of total cases. This has been a fairly stable rate for the last several days. Yet it is still nearly twice the county's representation in the state population.
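For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. The variable names and the "over-representation ratio" label are my own shorthand for illustration, not anything taken from the IHME model.

```python
# Figures quoted in the post (Jefferson County vs. Alabama).
county_population = 660_000
state_population = 4_880_000
county_cases = 340
state_cases = 1_340

# Share of the state's people and share of the state's confirmed cases.
population_share = county_population / state_population
case_share = county_cases / state_cases

# Hypothetical label: how many times larger the case share is than
# the population share. A value of 1.0 would mean cases track population.
over_representation = case_share / population_share

print(f"Population share: {population_share:.1%}")
print(f"Case share:       {case_share:.1%}")
print(f"Ratio:            {over_representation:.2f}x")
```

A ratio near 1 would mean the county's cases simply track its population; anything well above 1 is the imbalance described here.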

This pattern of a state's largest metro area seeing a case rate far out of balance with its representation in the state's population is occurring nearly everywhere else in the country. The most extreme example is New York City. The Big Apple has 43% of the state's cases. That's a stable percentage and about where JeffCo was vis-a-vis Alabama. (New York City accounts for about 13% of all cases nationwide.)

The IHME approach has been to use the data with the most observations, and they have the most observations in New York and New Jersey, the epicenter of the pandemic in the U.S. They really can't be faulted for this--the Institute was tasked with rapidly producing a reasonable approximation of caseload and mortality for the purpose of planning for the numbers of hospital beds, ICU beds and ventilators. They had little choice but to use the large metro data because that's where all the data were. There are ways of testing for and adjusting the sample to remove bias, but the modelers either didn't have time or just didn't test and adjust their sample. This is an ad hoc planning model, after all.

Another problem I have is with how the planning model has been used in the media. News organizations are failing to properly communicate the importance of uncertainty, and how human intervention can drastically change what actually happens next week versus what's being forecast today. One average person making one good decision about how he interacts with others can remove entire branches of unfavorable outcomes in the decision tree. One average Joe making a stupid decision can add them all back again. The truth is that we just don't know, so modelers provide their forecasts with an uncertainty band. It allows the modeler to say something like, "We think the number could be as many as X, it could be as low as Y, but we think the actual number will be close to Z. But remember that this could all change tomorrow." None of that important context was presented in the story linked above or in the video segment that appeared on the station's news broadcast.
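To make the X, Y and Z idea concrete, here is a rough sketch of how an uncertainty band can be read off a set of simulated outcomes. Everything in it is an assumption for illustration: the numbers are synthetic draws from a lognormal distribution, not COVID-19 data, and the function name `uncertainty_band` is hypothetical. It is not how IHME builds its intervals.

```python
import random
import statistics


def uncertainty_band(samples, lower=0.025, upper=0.975):
    """Return (low, central, high) from a collection of simulated outcomes.

    low and high are simple empirical percentiles; central is the median.
    """
    ordered = sorted(samples)
    n = len(ordered)
    low = ordered[int(lower * (n - 1))]
    high = ordered[int(upper * (n - 1))]
    return low, statistics.median(ordered), high


# Synthetic, made-up outcomes standing in for many runs of some model.
rng = random.Random(0)
simulated = [rng.lognormvariate(6.0, 0.5) for _ in range(10_000)]

low, central, high = uncertainty_band(simulated)
print(f"As low as {low:.0f}, as many as {high:.0f}, central estimate near {central:.0f}")
```

The point of reporting all three numbers is that the central estimate alone hides how wide the range of plausible outcomes is.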

The takeaway is this: the IHME model has serious technical issues, and its uncertainty is not being communicated properly by the media, causing undue alarm among a nervous public. It also leads people to be dismissive of modeling, because the chance of the model's projections being realized down to the digit is infinitesimally small.

