An article touted by anti-vaxxers shows why “doing your own” COVID-19 research is not a good idea

Anyone can read a scientific article but only an expert can judge its worth and put it in context

Edmond Alkaslassy
20 min read · Oct 30, 2021

--

Probably most of us have looked up information to fill in gaps in our knowledge about COVID-19. What exactly is a spike protein? How does mRNA work? What are co-morbidities? Why do some communities have higher infection rates? It is healthy to ask questions and learn, and many of us have learned a lot since the pandemic began.

Others look for information not just to answer a few questions but to create from scratch their basic understanding of COVID-19. Why do they do this when they could do what most of us do: listen to the experts working at public health agencies like CDC, NIH and WHO? The reason is simple: for whatever reason(s), they do not trust information from those (and similar) organizations. Now put yourself in their untrusting shoes for a moment. If you didn’t trust those agencies, how would you acquire a basic understanding of COVID-19? You would be on your own and have to figure out for yourself what is “really” going on; you would have to “do your own research.” Unfortunately, as we will see, “doing your own research” almost guarantees a poor understanding of COVID-19, unless you happen to be an expert.

I have a friend who does her own online COVID-19 research. She has little trust in public health agencies (CDC, NIH, WHO). And by any reasonable standard she does not have expertise in science; she is not an expert.

My friend is not alone in getting much of her information online. According to Pew Research, roughly half of U.S. adults are getting some (30%) or a lot (18%) of news and information about COVID-19 vaccines from social media. And 39% of them say that social media is either an important way (33%) or the most important way (6%) of keeping up with news about COVID-19 vaccines.

She recently posted an article that has gone viral in the anti-vaxx community. The article exemplifies the pitfalls of non-experts “doing their own research.” The article was peer reviewed and published in a reputable journal so the non-expert would probably be inclined to accept it at face value. But the truth about scientific articles is more complicated.

Some journals have higher publication standards — greater rigor — than other journals. Different methods of data collection have their strengths and weaknesses. And conclusions too can be debated: data points never speak for themselves but instead must be woven into a narrative, and sometimes there can be more than one reasonable narrative. In addition, one needs to be familiar with many of the thousands of COVID-19 studies in order to contextualize any one study. Experts know all of this and approach new articles with these caveats in mind. But the average person “doing their own research” is not likely to approach any single article the way an expert would, let alone be able to place it within the larger COVID-19 literature.

Pop quiz!

Let’s see the world as an expert might by critiquing the article posted by my friend, published in the European Journal of Epidemiology. (I hasten to note that I am not an expert in disciplines relevant to COVID-19 science — virology, immunology, epidemiology — but during my two decades as a biology professor I critiqued many scientific articles. A true expert would probably have more to say about this article than I do.)

Let’s start with the title:

“Increases in COVID‐19 are unrelated to levels of vaccination across 68 countries and 2947 counties in the United States”

Quick! What was your first reaction to this title?

Was your reaction more like A, B or C?

A. That’s great news. I can see why there is such a big push to get everyone vaccinated.

B. That’s a bummer. I wonder why there is such a big push to get everyone vaccinated.

C. I don’t know what to make of the title.

Whatever your reaction was, set it aside for now; we’ll return to it later.

After reading the title, a good next step would be to read the entire article; please click here even if you do not plan to read the whole thing. And here we already have a potential problem: What proportion of vaccine skeptical folks (or anyone!) encountering this article online would choose to read the entire article vs only the title? I don’t know, but clearly some percentage will be satisfied to read only the title. And if they do, they could easily reach the simplistic conclusion that it doesn’t matter how much a country or county is vaccinated, everyone will be at equal risk for infection. As we will see, the story is not that simple.

But let’s hope for the best, that folks would read the entire article. What would the non-expert make of this article?

A tale of two figures

The article includes three data figures. I cannot include them here for copyright reasons but I encourage you to keep the article handy for reference. We will explore Figures 1 and 3. First we’ll describe both figures, then we’ll critique them.

Figure 1 shows the “(r)elationship between cases per 1 million people (last 7 days) and percentage of population fully vaccinated across 68 countries as of September 3, 2021.”

Please look at Figure 1. You can see that the line has a slightly positive (upward) slope so the authors are on solid ground when they say: “the trend line suggests a marginally positive association such that countries with higher percentage of population fully vaccinated have higher COVID-19 cases per 1 million people.” It appears that a highly vaccinated population does not experience a reduced rate of infection. Indeed, the figure suggests that the more a population is vaccinated the higher their infection rate. This is a very interesting finding, to say the least.

Figure 3 shows the “(p)ercentage of (U.S.) counties that experienced an increase of cases between two consecutive 7-day time periods by percentage of population fully vaccinated across 2947 counties as of September 2, 2021.” Take a moment and look at the graph. The X axis divides counties into groups whose vaccination rates are 5% “wide”: 0–5% of the population is vaccinated, 5–10%, etc. up to the final group of 70% or more. The height of each bar indicates the percentage of counties that experienced an increase in infections between the two 7-day data collection periods. The higher the bar, the higher the percentage of counties whose infection rates increased from the first week to the second. The authors say there is “no discernable (sic) association between COVID-19 cases and levels of fully vaccinated.” If you see the bars as being roughly the same in height (“flat”) across the graph, as apparently the authors did, that would be a fair conclusion. But is that what you see?

Lies, damn lies and statistics…and eyeballs

Every study is unique and few are without flaws. It is a messy world (that’s why research is so challenging and fun) and only rarely do researchers have a handy template to follow to ensure that their research is as close to impeccable as possible. (An exception is research on the safety and efficacy of new drugs, for which there is a gold standard template to follow, described here.)

So I am not going to nitpick this study to death. But I am going to fault the authors (and the journal editor) for not presenting statistical analysis of the data. Statistics have been part of research study design and data analysis since the early 1900s, maybe even earlier. One hundred years ago crunching numbers by hand was a terrible slog, but with today’s computers there is simply no excuse for the absence of statistical treatment of data in this (or any) scientific paper.

Why is statistical analysis so important? Without it, we are left to look at graphs and “eyeball” them for trends that may or may not be truly present. Look, for example, at Figure 1. The authors say, “the trend line suggests…a positive association.” Statistical analysis takes the data out of the realm of suggestion and puts it into the realm of hard probability. Those 68 data points can be entered into a statistical analysis program and in seconds we can see the equation for that line. The next step is to do a statistical analysis to find out whether the slope of that line is significant, i.e., whether it is meaningfully different from zero. A line with a slope of zero is a flat line: no matter how large the X variable is, it has no effect on the Y variable. But if the slope of the line is significantly different from zero and slopes upward (“is positive”) then there is a genuine positive relationship there: as the X variable increases so does the Y variable.

It only takes a few seconds to perform these analyses. But you have to know that they need to be done, and you have to know how to do them, and then you have to do them. Unfortunately the authors did not do them; I don’t know why and I find it frustrating. Compared to having the definitive results of statistical analysis, “eyeballing” the data is an entirely unsatisfactory and anticlimactic way of reaching a conclusion, especially after the authors spent so much effort gathering and organizing the data. It’s like building a big beautiful house but failing to put on a roof: a lot of effort was spent creating something that is not nearly as useful as it could have been.
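For readers curious what “a few seconds” of analysis actually involves, here is a minimal sketch in Python of the slope-significance test described above. It uses only the standard library, and the vaccination and case numbers are made up for illustration; they are not the paper’s data (which are available in its Supplementary Information).

```python
import math

def slope_t_stat(xs, ys):
    """Least-squares slope of y on x and its t-statistic (H0: slope = 0)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line, then the
    # standard error of the slope estimate
    resid_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(resid_ss / (n - 2) / sxx)
    return slope, slope / se

# Hypothetical data: vaccination rate (%) vs. weekly cases per million.
vax = [10, 20, 30, 40, 50, 60, 70]
cases = [900, 1100, 950, 1200, 1000, 1150, 1050]
slope, t = slope_t_stat(vax, cases)
# Compare |t| against Student's t with n-2 degrees of freedom
# (about 2.57 for 5 df at p = 0.05): if |t| is smaller, the slope
# is not significantly different from zero, "positive-looking" or not.
```

With real data one would use a statistics package (and report a proper p-value), but the point stands: the calculation is mechanical and fast, so there is little reason to leave it out of a published figure.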

To return briefly to Figure 1, a fair summary is that there “appears” to be a positive association between the two variables but without statistical analysis (which could easily be done) we do not know for sure.

For Figure 3, there was again no statistical analysis and to be clear, these data could be so analyzed (perhaps they will be). The authors concluded from Figure 3 that there “appears to be no significant signaling of COVID-19 cases decreasing with higher percentages of population fully vaccinated.” (In scientific research, “significant” is a word used to describe the results of a statistical test so it is true that there was no “significant” signaling.) That’s how the data “appeared” to them. But how does it appear to you?

Another look at the data

It is difficult (though not impossible) to challenge the results of statistical tests. But in the absence of statistical conclusions we can only look directly at the data figures to see what story they tell.

Let’s look again at Figure 1. Find the five data points (countries) with the highest vaccination rates. It’s easy to find them because they are at the right edge of the graph and they are all clustered together. Notice anything else? They are all below the line, meaning that their infection rates are all lower than the line would predict (if the line were statistically significant). This suggests that perhaps the countries with the highest vaccination rates have lower (not higher) rates of infection, the opposite of what the authors suggest.

That observation led me to calculate the average infection rate for the five countries with highest vaccination rates (70% and above) and that for the remaining 63 countries. Those five countries have an average infection rate of 728.5 infections per million, a number that is noticeably lower than the average for the remaining countries: 1082.3 infections per million. The difference between the two averages may appear — via the “eyeball method” — to be meaningful but according to the statistical test I used that is not the case. Here’s how the results of this statistical test would typically be reported:

“The two averages were not significantly different (Mann Whitney U-Test, U = 156, z = -0.0235, p = 0.98).”

So the “five country” average rate of infection is lower than that of the remaining countries, but not significantly so. One possible reason for the lack of significance is the small sample size of countries (five) whose vaccination rates are at or above 70% (small sample sizes reduce statistical power). But there are now at least 20 countries with vaccination rates above 70%, plenty of data for a more robust analysis. It should now be possible to determine whether or not those “high vaccination countries” are experiencing higher or lower rates of infection.
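For the curious, here is a minimal sketch of how a Mann-Whitney U-test works, in standard-library Python. The test ranks all the values from both groups together and asks whether one group’s ranks are systematically lower. The infection rates below are made-up stand-ins, not the paper’s actual per-country numbers, and the tie handling is kept deliberately simple.

```python
import math
from itertools import chain

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic and its normal-approximation z-score.

    Ties get average ranks; no tie correction is applied to the
    variance, which is adequate for a rough illustration.
    """
    pooled = sorted(chain(a, b))
    avg_rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        avg_rank[pooled[i]] = (i + 1 + j) / 2  # ranks i+1 .. j, averaged
        i = j
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(avg_rank[v] for v in a)
    u1 = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)          # conventional (smaller) U
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u

# Made-up weekly infection rates (cases per million) for a small
# "high vaccination" group vs. a larger "lower vaccination" group.
high_vax = [600, 650, 700, 800, 900]
others = [400, 700, 950, 1000, 1100, 1200, 1300, 1500]
u, z = mann_whitney_u(high_vax, others)
```

In practice one would use an established implementation (e.g., `scipy.stats.mannwhitneyu`) and an exact p-value for small samples, but the sketch shows why a group of only five countries gives the test little power to detect a real difference.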

And when I looked at Figure 3 I made two observations: 1) the shortest bar out of all 15 bars is the one farthest to the right, and 2) the two shortest bars occur in the three vaccination categories farthest to the right. Let’s consider each observation in turn.

The first observation is that the counties with the highest vaccination rate (70%+) had the lowest percentage of counties that experienced increased rates of infection. How much lower? There are 15 bars on the graph, corresponding to the 15 vaccination categories. The bar at the far right has a value of 40.6% (much of the underlying data are available at the end of the article under “Supplementary Information” so anyone can see it and make their own calculations) and I calculated the average of the other 14 bars to be 63.7%. I know of no standard statistical test for comparing one number to the average of 14 others, but by the “eyeball method” 40.6% and 63.7% seem pretty far apart, especially considering the general lack of variation (the “flatness”) of the data overall. And unless there really IS a relationship between the variables, it is unlikely that the shortest bar out of 15 would just happen to be at the extreme right edge of the graph. My conclusion is that there might be a glimmer of a trend: counties with the very highest vaccination rates may indeed be experiencing reduced infections. But without statistical tests we do not know for certain.

The second observation led me to calculate the average for the three bars farthest to the right (the three highest vaccination categories, 60% and above) vs the average for the remaining 12 bars (less than 60%). The average proportion of counties experiencing an increase in cases for the 12 “lower vaccination rate” bars is 65.0% while the average of the three “highest vaccination rate” bars is 50.7%. I do not know a statistical test that can determine whether these two averages are significantly different (the sample size of three bars is too small). But again, unless there really IS an association between the two variables, it is rather suspicious that the average of the three bars associated with the highest rates of vaccination would be so much lower than the average for the remaining bars, again especially considering the general “flatness” of the graph. So at this point we have two reasons (the location of the single shortest bar and the lower average of the three “highest vaccination category” bars) to wonder whether counties with higher vaccination rates might be experiencing reduced infection rates.
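Both comparisons above amount to simple group averages over the 15 bar heights. As a sketch, with hypothetical bar heights chosen so the group averages land near the values reported in the text (these are NOT the paper’s actual Figure 3 data):

```python
# Hypothetical "percent of counties with increased cases" for the 15
# vaccination categories (0-5% up through 70%+), lowest to highest.
# Illustrative stand-ins only, not the paper's Figure 3 values.
bars = [66, 64, 67, 63, 65, 66, 64, 63, 66, 65, 64, 67, 55, 56, 41]

rightmost = bars[-1]                  # highest-vaccination category
avg_other_14 = sum(bars[:-1]) / 14    # the remaining 14 categories
avg_top3 = sum(bars[-3:]) / 3         # three highest categories (60%+)
avg_low12 = sum(bars[:-3]) / 12       # twelve lowest categories
```

Anyone with the supplementary data and a spreadsheet could reproduce these comparisons in minutes, which is rather the point: the raw material for a closer look is sitting right there.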

To sum up, our observations run counter to the findings of the authors, who write that infection rates are “unrelated” to vaccination rates. In the absence of hard statistics, these new findings are just as likely to be correct as those of the authors.

Timing is everything

When was the study conducted? And why might that matter?

It matters because the data were not collected in a vacuum. As it turns out, the data collection periods for both Figures 1 and 3 were near the peak of a recent wave of infection, a fact that naturally had an effect on the data.

Figure 1 includes data from 68 countries so let’s look at what was happening globally. The data for Figure 1 was collected during the seven days ending on 3 September 2021. At that time there was a major wave of infection globally (see graph below). As of 9 October 2021, the peak (seven day rolling average) of that wave was 30 August 2021. So during the study period many countries were experiencing many new infections, regardless of what percent of the population was vaccinated. (I will return to this point later.)

Global daily COVID cases.

Source: https://www.worldometers.info/coronavirus/worldwide-graphs/#daily-cases

As a thought experiment, consider what Figure 1 would have looked like if the data had been collected during the late-June lull in infections before the most recent wave. Most countries would have had much lower rates of infection since there were so many fewer new infections, and so Figure 1 would probably have looked different.

The same reasoning applies to Figure 3, which includes only U.S. counties. Data for Figure 3 was collected during two 7-day periods: 19–25 August 2021 and 26 August — 1 September 2021. The authors compared the infection rates between the two periods and if the number of infections was greater in the second week then the county was scored as having experienced an increase in cases. What was happening in the U.S. at that time? The peak of a major wave of infection was on 2 September 2021, at the end of the study period (see graph below).

What was happening 14 days before the peak, during the two 7-day study periods? There was a massive increase in infections across the country.

U.S. daily COVID cases.

Source: https://www.worldometers.info/coronavirus/country/us/

Since the entire nation was undergoing a severe wave of infection it should come as no surprise that many counties showed an increase in new cases from one week to the next.

The high levels of infection globally and in the U.S. during the periods of data collection make it almost inevitable that the data would reflect those high rates of infection. The timing of data collection had a strong effect on the outcome.

Perhaps the authors would say that the ideal time to find out whether highly vaccinated populations fare better than poorly vaccinated ones is during a wave of infections; that would be a fair point. But it would only strengthen my view that this article indirectly addresses a concept not mentioned in the paper: herd immunity.

Safety in numbers?

The article seems to be asking: Does the level of vaccination in a population affect the likelihood of future infections? And that is really a question about herd immunity.

As we have seen, the absence of statistical analysis reduces the strength of the article’s conclusions. Nevertheless, the article contains a lot of interesting data and we should learn what we can from it. Perhaps those data can shed some light on the concept of herd immunity.

Earlier estimates were that herd immunity for COVID-19 could be achieved if 60%–70% of the population was vaccinated. But for a variety of reasons (e.g., a more transmissible Delta variant), that is now not likely to be the case. Instead, if herd immunity is possible at all, it is now thought to require a population vaccinated at 90%–95%.

Let’s reconsider Figures 1 and 3 in light of those two herd immunity yardsticks. Figure 1 includes 68 countries. Only five of those 68 countries had vaccination rates above 70%: Iceland (77%), Malta (80%), Portugal (75%), United Arab Emirates (76%) and Uruguay (72%). Put another way, those five countries were between 70% and 80% vaccinated, a range that puts them above the earlier estimates needed for herd immunity but below the current estimates. The fact that all five data points fall below the line suggests that these countries have a rate of infection lower than predicted by the line (assuming the line is statistically significant). We also saw that the average infection rate for these five countries was lower than that of the other countries (although the difference was not significant). So perhaps those five countries with the highest vaccination rates are getting some modest protection from future infections, a.k.a. the beginnings of herd immunity.

Sixty-three of the 68 countries had vaccination rates below 70%. Thus it is no surprise that a study composed primarily of countries with vaccination rates below 70% did not find any discernible reduction in their infection rates.

If the trend line in Figure 1 reflected only the absence of herd immunity, it would be flat (at least up to the highest vaccination rates). But the actual trend line “appears” to have a positive slope so there is something else going on. Like what? We don’t know. Many possible factors can affect infection rate. For example, temperature, population size and mean age have an effect on COVID-19 outbreaks. Male gender, being over 60 years old, living in urban areas, being married, and comorbidities (e.g., obesity) affect the risk of infection. Spending more time indoors is riskier than spending time outdoors. Are countries with higher vaccination rates also associated with some of these risk factors? Maybe, maybe not. The authors have done an important service by assembling the data in Figure 1 but did not expend much ink explaining why the line might have a positive slope. Hopefully that question will be the subject of future research.

Figure 3 considers U.S. counties. “Eyeballing” the data, it seems clear that counties in the first 12 vaccination categories (0–5% up to 55–60%) did not experience any benefit from herd immunity: those bars are pretty “flat.” (For context, the U.S. as a whole was only 61.4% vaccinated by the end of the study period on 3 September 2021.) But, as calculated above, the average “increase in infection” for the three highest vaccination categories (60% or higher) is lower than the average of the other 12 categories (50.7% vs 65.0%). And there is also the fact that the shortest of all 15 bars (the lowest “increase in infection”) is in the highest vaccination category (70%+).

Do these two aspects of Figure 3 suggest that a small degree of herd immunity may be kicking in at the highest vaccination levels in U.S. counties? Perhaps. It would help to know how many counties there are in the “70%+” category and how far above 70% they are. And those facts are knowable.

It’s complicated

Now that we have examined the article, let’s see how you react to its title this time:

“Increases in COVID‐19 are unrelated to levels of vaccination across 68 countries and 2947 counties in the United States”

Do you have the same reaction you had initially? Or is your reaction now more nuanced, or skeptical, maybe even guarded? If so, you have taken a small step toward the world of experts, whose first thought after reading any article’s title is, “Well, let’s see.…”

Closely analyzing Figures 1 and 3 led to new insights. Recall the take home messages that emerged from our re-analysis of the data:

From Figure 1: The average infection rate of the five countries with the highest vaccination rates (70% or higher) was lower than the average for the remaining 63 countries.

From Figure 3: The average proportion of U.S. counties with increased infection rates was lower in counties with the highest vaccination rates (60% or above).

Given these findings, a different title and top-line message would be at least as justified as was the original title:

“Highly vaccinated countries and counties have lower rates of COVID-19 infection.”

I don’t love this title either because these findings are not statistically significant but then neither were the findings that the original title was based on. This alternate title is accurate. More to the point: It conveys a very different message than does the original title.

What message did vaccine skeptics hear?

The anti-vaxx community has embraced that article and sees it as evidence that vaccines do not provide the benefits that public health agencies say they do. The article’s title could easily lead vaccine skeptics to ask: If having more people vaccinated doesn’t reduce infections, why bother? And even if folks read the whole article, how many are able (and inclined) to put it under the microscope? Vaccine skeptics, primed by the title and lacking scientific expertise, appear to have concluded that this article provides solid support for their viewpoint.

But it would be a mistake to conclude that the authors are anti-vaxxers or that they view their article as evidence that vaccinations are not worthwhile. In the article they write: “Other pharmacological and non-pharmacological interventions may need to be put in place alongside increasing vaccination rates.” Clearly, they are in favor of increasing vaccination rates (and of taking additional measures to protect public health).

The authors also write: “…vaccinations offers (sic) protection to individuals against severe hospitalization and death…” This is a critical point. Perhaps no country or U.S. county was sufficiently vaccinated to strongly protect its unvaccinated residents via herd immunity but that is very different from claiming that vaccinations do not protect individuals. There is an abundance of data that definitively show the benefits of vaccination to individuals (for example). Vaccinations offer strong (albeit imperfect) protection to individuals from hospitalization and death due to COVID-19.

The vaccine skeptical appear to have heard a different message than that intended by the authors, and I did too — we have that in common. But I heard a message of uncertainty and multiple interpretations while the vaccine skeptical heard a message that validated their pre-existing beliefs.

Science sherpas

The simple truth is that most people “doing their own COVID-19 research” do not have the tools to dissect, as we just did, every scientific article they read, or to place any one article in the broader context of the overall body of COVID-19 research. We simply cannot do it all ourselves and so we must rely on experts to do it for us. Lucky for us, public health organizations (CDC, NIH, WHO and others) have teams of knowledgeable people who read and evaluate new articles and make sense of them for the public. Acknowledging that we are reliant on experts is an unpopular message in this age of anti-elitism but no less true for that.

Part of the challenge faced by experts is the staggering number of new articles they must read. More than 23,000 academic papers about COVID-19 have been published, a number that doubles every 20 days. Teams of experts wade through new articles and assign more weight to stronger research and less weight to weaker research to build the most reliable narrative they can. None of us “doing our research” at home can possibly replicate their coordinated efforts. On our own, we can only passively bump into (or actively cherry pick) one paper or another and make of it what we will, unable to properly evaluate it or place it within the larger corpus of COVID-19 research.

Consider the paper that we just evaluated. Yes, it was peer-reviewed and published in a reputable journal, both positive signs. But, as is true of most articles, this one was well short of iron-clad (as I’m sure the authors would agree). Only experts are likely to see the paper for what it is: a small and imperfect piece of a massive puzzle. In contrast, non-experts see it as supporting their vaccine skepticism. It is easy to see what we want to see and to downplay inconvenient truths. In this information-saturated age anyone hoping to find support for their point of view will probably find it, somewhere.

Trust is everything, coloring what we read and who we give credence to. When people “do their own research” they are likely to encounter weaker research that (appropriately) has not been assigned much weight by experts and so has not received much attention. But skeptics encountering such research might instead see proof that untrusted experts are hiding something from the public. Why, they might ask, are the experts not talking about this article? Is it because the experts are trying to squelch research that casts doubt on the value of vaccines? No, it is because experts know that, given the totality of information available, the article is not strong enough to affect their conclusions and recommendations. Rather than appreciating the teams of public health experts dedicated to separating the wheat from the chaff, skeptics can see a conspiracy of elites misleading the public for nefarious ends.

Experts in public health are our “science sherpas.” Like Himalayan Sherpas, they have the knowledge, skills and experience to guide the rest of us through confusing terrain as safely as possible. Their job is difficult and dynamic, and the public should be at least as grateful for their efforts as climbers are for the work of their Sherpa guides.

Experts know that a single article is rarely definitive or impeccable. The methods can be flawed. The data can be analyzed in more than one way (as we just saw). Different conclusions can be reached (ditto). Often an article generates (intentionally or not) more questions than answers. Studies can appear to be in conflict. And there are many, many studies to take into account. Without experienced science sherpas to guide us through the massive, complicated and messy world of COVID-19 science we would lose our way.

--

Edmond Alkaslassy

Faculty Emeritus, Assistant Professor of Biology, Pacific University, Oregon. He is writing a book that compares the daily lives of humans and other animals.