[A series of posts explaining a paper on the mathematical modeling of the spread of antiviral resistance. Links to other posts in the series by clicking tags, "Math model series" or "Antiviral model series" under Categories, left sidebar. Preliminary post here. Table of contents at end of this post.]
We continue our examination of the paper, "Antiviral resistance and the control of pandemic influenza," by Lipsitch et al., published in PLoS Medicine. We have gotten to the point where our population is divided into five categories. We follow the course of the epidemic on each of its days by examining how many people are still susceptible (denoted by X), how many are infected with sensitive virus and are being treated with Tamiflu (YST), how many are infected with sensitive virus but are untreated (YSU), how many are infected with resistant virus (YR); and how many have either died or recovered and are immune (Z). These last are called "Removed" because they are no longer part of the process of viral spread. I think most people would agree that if we had an accounting of these five categories, day-by-day, we'd also have a pretty good idea of the progress of the epidemic, including the spread of antiviral resistance. Here's the strategy from now on.
How we run the model:
We agree that if we stipulate how many people are in each of these categories on some particular day, this will determine how many there are on the next day. We can do this with a "rule book" that allows us to look up the in each of the five boxes on a specific day and gives us the numbers in each box for the next day, etc. If we start off with everyone uninfected on day zero and introduce a single, untreated person infected with a virus sensitive to Tamiflu on day one, we can use the rule book, day-by-day, to follow the epidemic. To spell this out, we look up the value for day 2 when day 1 has X-1 susceptibles, YST=1 untreated people infected with a sensitive virus and the other Y boxes and the Removed boxes are initially empty. We do the same for day 3, using the value we got for day 2, etc. That's all there is to running the model. All we need is a rule book. The rule book is the key. It "is" the model.
The Rule Book:
The rule book is based on the Law of Mass Action (see earlier posts). We'll spell this out, as it isn't intuitive. Suppose there is a number, Y, of infected people in town. If your chance of encountering any of them is roughly the same, then the more of them there are, the greater your chance of an encounter. How does that chance increase as the number of infected people increases? Maybe the change is related in some way to the square of the number of infected people plus the cube of the number all of this divided by some other number or even something much more complicated. There are infinite number possibilities for this. The Law of Mass Action picks one, just about the simplest possible. It says that for every new infected person we add, we also add some fixed chance of being infected for each susceptible in the population. Said another way, the per person chance of infection each day is proportional to the number of infected people on that day. Note that if there are no infectious (infected) people (they all got well or died), the risk for the susceptibles becomes zero and no new infected people will be added. Likewise, if there are no susceptible people, left the risk will be zero. In both cases the epidemic is over.
Is picking the simplest such relationship risky? It depends. In this case, no. That's because the amount of added risk for a susceptible person  when there is one more infected person around (called the transmission rate constant and designated by the Greek letter, β) is very small in this model. The transmission constants, β, are all on the order of a ten thousandth millionth [hat tip bar] or less (you can find values in Supplementary Table S1, online; they were not picked out of the air but obtained from a study of social contacts and influenza spread in The Netherlands). Under such circumstances, even very complicated relationships "look" proportional. The surface of the earth, which we know is curved, still looks flat to us. For getting around in ordinary life we don't go very wrong by assuming it is flat. Similarly, here. Tiny β's confine us to a local area that "looks flat" no matter how complicated the big picture is. This is a very common technique in applied mathematics. It works well enough to be able to put a space ship precisely at a certain point over Mars despite a journey of many million miles.
Using the Rule Book:
So the rule is that the average per person risk of infection of a susceptible on a particular day is β times Y (the number of infecteds), and since there are X susceptible people on that day, the total number of new infected people is β times Y, X times (written βXY). That means there will also be βXY fewer susceptibles the next day, since we assume the total number of people in town isn't changing appreciably over the course of the epidemic. The numbers change day to day as the susceptible number decreases and the infected numbers increase by this amount and decrease by the number that get well or die. In essence, people are flowing through the boxes (or compartments) from X to Y to Z. When the flow stops, the spread is over. At that point there will be none or few in the Y boxes or compartments. All the people will be in the X or Z compartments.
Moving on, using the Mass Action principle, the number of people infected on a particular day is the number susceptible on that day (X) times the number of infecteds in each Y category times the transmission rate constant, β. But we wouldn't expect this transmission rate to be the same if the contact was between a susceptible person receiving prophylactic Tamiflu versus one receiving nothing (one of the purposes of prophylaxis is to decrease transmission, after all); or between either kind of a susceptible person and someone infected with the resistant strain (one of the ideas is that resistant strains might be less fit and hence less transmissible); or someone who is infected with a sensitive strain but is not being treated and a susceptible person with or without prophylaxis. I'm sure you are now confused by all the possibilities, but we need to keep track of them, and the figure shows us how.
Deciphering Figure 1:
Since there are three flavors of infected people (YST, YSU, YR), we need three transmission rates, (βST, βSU and βR). So the number of new people infected with a sensitive strain produced by a susceptible person encountering one of the YST or YSU people will be βST X YST + βSU X YSU; and the total number of new people infected with a resistant strain will be βR X YR. These people will leave the X compartment and head to one of the three Y compartments.
You can see this in the figure where two arrows lead out of the X box and are labeled with the terms we just set out (not exactly; the X term is left out to simplify the figure because it is common to all of the arrows out of the X box). One of the arrows comes out of the middle of the right side of the X box, the other from the middle of the bottom of the X box. The sensitive virus encounters have been combined into one arrow because the same thing happens to each of the terms in the next step. Lipsitch et al. could have made the two terms into separate arrows, but at the risk of complicating the figure.

If you followed it so far, you shouldn't have any trouble with the rest of it because we will be doing the same thing. This will also take care of equation (2), which is just a repeat of the figure without the boxes and the arrows. We'll be doing it for each of the arrows to give you lots of practice in seeing what's going on.
On day 1 there aren't any resistant viruses, so we will have to account for how they arise, and we will have to make some assumptions on the proportion of the population that receives prophylactic Tamiflu and the proportion of sick people who get Tamiflu for treatment. We'll also need to make some simple assumptions about how long someone stays sick before moving to the Z box and being "Removed" from the process. This sounds complicated and it is, but only in the bookkeeping sense. Even if you don't follow every single step (because your attention wanders or your patience or interest gives out), I hope you will see that the details aren't that mysterious. We are just following the rule book and counting up people in each compartment from day to day.
We'll pursue the details in the next post.
Table of contents for posts in the series:
The Introduction. What's the paper about?
Sidebar: thinking mathematically
The rule book in equation form
Effects of treatment and prophylaxis on resistance
Effects of Tamiflu use and non drug interventions
Effects of fitness costs of resistance
 
Thanks for all this. Your explanation has turned muddy to clear. btw, you have beta as ''one ten thousandth or less''. Maybe I am missing something here, but on S1 beta seems to max out at somewhere of the order of two / ten millionths (which is, admittedly, less than one / ten thousandth)
Maybe you've missed out on something, but with this formulation, isn't the value beta critically dependent on the actual magnitude of the total population? I would expect that for a given proportion of infected people to the total population, the proportion of new cases would be the same whether total population was, say, ten thousand or ten million. This is not the case with this formulation, if I understand it correctly, unless beta is determined for a specific population size?
Janne: Not exactly sure what you mean here. Think of beta as the additional risk a susceptible person incurs when the infected population increases by one. Thus the risk of the average susceptible person will increase as there are more infected people around. It is a function of how transmissible the virus is on contact.
bar: You are quite correct. These old eyes read 109 as 106 in Table S1. I'll correct in text. Thanks.
Revere: sorry, I was too brief.
Say that you had X=Y=100, and beta=0.0001. That would give you one additional case - 1% of the healthy population would fall ill.
Say you have X=Y=1000 - a population increase of ten times, but with the same proportion of susceptible and diseased individuals. You also keep beta fixed at 0.0001. That would give you 100 new cases, which is 10% of the healthy population, or ten times the infection rate of the smaller population.
I would have expected the model to be largely insensitive to the total population size as long as the numbers are large enough. But with the formulation you give, it becomes critically dependant on the exact population sizes. Which makes me suspect either that I have missed something, or that you've simplified away some aspect that explains this.
Janne: Ah. Now I see what you are getting at. Sorry it has taken so long to reply. The pesky day job, again. I guess I needed to lift the hood a little more on the assumptions. Because we are dealing with a fixed population over a relatively short period of time, what happens when the infectives increase is that their density also increases and the likelihood of a contact is greater. Another way to think of the transmission rate coefficient is that it is the probability of transmission per contact.
Here is the math:
we are assuming dS/dt = -b * I * S, or 1/S * dS/dt = -b * I. This last is the per capita increase in risk when there are I infectious people in the population, and, yes, that does go up as I increases, as you so correctly imply. You can also see that b = -(1/I)*(1/S) * dS/dt, which is the other formulation, transmission probability per contact.
I queried Prof. Lipsitch about this and his response was that there is little empirical data on how contact frequency changes with number of infectives in a situation like this. He points out this would not be a good assumption for a sexually transmitted disease, and in that kind of modeling there is a more or less fixed contact rate used. He thinks the actual answer is somewhere between the two extremes. He agrees that the assumption makes comparison across populations of very different sizes problematic, but that's not what is being done here. The betas were derived by calibrating to the populations in Table S1 and an R0 of 2.0.
So the bottom line here is that you are right (you haven't missed or misunderstood something) but that he (and I) feel that the more or less constant population and area assumptions are appropriate in this kind of modeling and to try to do add on complications wouldn't change much (that's my view; he didn't express an opinion because I didn't ask him). It's a good dissertation for someone. Modeling of this kind is doing science. Each experiment produces more questions.
The model structure is not tied to population size (the assumptions make the underlying total population cancel out) but the transmission rate coefficients are, that is, the parameters will be different.
You clearly are following this. Good question.
I guess I'll need to wait for the next installment, but the fact that there are seemly duplicate paths from some of the nodes is a little confusing.
For example the path from X to Yr compared with the path from X to the point just right of X and then to Yr. Apparently at the points from which these arcs emanate we have 'mini-nodes' in which sub-populations exist or maybe I'm just interpreting the diagram incorrectly.
Very interesting so far thanks.
rickr: I'll (try to) explain this in the next post, but, yes, there can be more than one path to YR. In fact there are three: those infected with resistant virus, those prophylaxed whose sensitive virus becomes resistant, and those treated whose sensitive virus become resistant. Next time for more details.
Thanks.
I am doing modeling myself, but of a rather different kind. It would have been presumptive (not to say dangerourly liekly to put my foot in my mouth) to bring my ideas about how do construct a model into a completely different field, and one with a pretty successful track record at that.
My first instinct would have been to basically postulate a total population of 1.0 and set the different subpopulations as fractions: sum(x, y..., z) = 1. That would, I guess, mean that you'd need to express beta explicitly as a function of the proportion of uninfected and infected individuals: beta = f(x/(sum(y)) or something like that. Messier, but you'd be able to explicitly describe the transimssion properties.
Anyway, a big thanks for this exposition. Learning how to interpret a model like this is a great learning experience!