
When they asked the editors to write something for this blog, I thought, “Wasn’t that a sci-fi movie with Steve McQueen?”
But we are going to look at a different menace called the coronavirus, and in particular an S.I.R. model, which relates three variables that describe the three groups that a population is divided into at the time of a contagious illness:
- S = the number of susceptible people,
- I = the number of infected people, and
- R = the number of removed people (recovered, deceased).
Each of these numbers changes over time t, so we will use function notation for them: S(t), I(t) and R(t). (This analysis assumes that no vaccine exists.)
The S.I.R. model formulates how the rate of change of each variable is related to the other variables, and consequently what the values of the functions over time are.
Here is a little background math you will need for this explanation.
In middle school, you learned about average rate of change. If you drive 200 miles in 4 hours, the average rate of change (in this context, velocity) is 50 miles per hour. If you were using cruise control, so that you were always driving at exactly this rate, then the distance d traveled, in miles, is directly proportional to the time t, in hours, you drive. In plain English, directly proportional means that if you double the input (time), then you double the output (distance); if you triple the input, then you triple the output, and so forth. Precisely, it means that the output equals a constant times the input: in this example, d = 50t (50 mph for t hours yields d miles).
But in reality, when you drive down the highway, you can look at your speedometer and see that your speed is always changing. What you see on the speedometer at any instant is the instantaneous rate of change (of distance traveled). If you were to study calculus, you would learn how to calculate the derivative, or instantaneous rate of change, of many functions, f(t). The notation f'(t) means the derivative of f(t).
For example, if you drop a ball off a roof that is 100 feet high, its height in feet after t seconds is given by h(t) = 100 – ½gt². (Trust me! But you should at least see that if you plug in 0 seconds, you get 100 feet, so that makes sense. And the bigger t gets, the smaller h gets, because the height is decreasing over time.) The letter g is the “constant” acceleration due to gravity. (I’m not being ironic; it’s just not really constant, but close enough.) Then your friend who knows calculus can tell you that the instantaneous velocity h'(t) = -gt (but now I’m your friend, and I’m telling you). Why does this make sense? First, the velocity is negative because the height is decreasing. Second, acceleration (g) is the rate of change of velocity. The acceleration here is constant, so velocity is directly proportional to time (the original example also had a constant rate of change, linking other quantities, so you get the same sort of equation).
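To see the claim h'(t) = -gt numerically, here is a minimal Python sketch that estimates the instantaneous velocity from two heights taken a tiny moment apart. The value g = 32 ft/s² is an assumed round figure (the text leaves g unspecified), and the function names are made up for illustration.

```python
G = 32.0  # assumed acceleration due to gravity, in feet per second squared

def height(t):
    """Height in feet, t seconds after dropping the ball from 100 feet."""
    return 100 - 0.5 * G * t ** 2

def approx_velocity(t, dt=1e-6):
    """Average rate of change of height over a tiny interval around t."""
    return (height(t + dt) - height(t - dt)) / (2 * dt)

for t in (0.5, 1.0, 2.0):
    # The two printed numbers should nearly agree: the estimate and -g*t.
    print(t, approx_velocity(t), -G * t)
```

Shrinking the interval dt is the numerical version of reading the speedometer at one instant rather than averaging over a whole trip.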
When you finish studying calculus for a year, you can study differential equations for a semester. Now in algebra, you studied equations where the variable that you are solving for is an unknown number. But in a diffy q (as they are fondly called), the ‘variable’ you are solving for is an unknown function, and the equation somehow involves derivatives of the unknown function. (I can see you scrunching your face from here.) If I gave you the equation h'(t) = -gt from above as a starting point, and I asked the question “What is the function h(t)?” then this equation is now a simple diffy q. The function h(t) = 100 – ½gt² is a solution to the diffy q (you can replace the 100 by another constant to get another solution).
Explaining the S.I.R. model
The S.I.R. model is a system of three differential equations involving S'(t), I'(t) and R'(t) (the derivatives, or rates of change, of our three functions). It relates how fast the numbers of susceptible, infected, and removed people are changing over time to the present numbers of susceptible and infected people.
Let us examine these differential equations, not to see if they are correct, or “prove” them, but to see if they sound like reasonable assumptions. Seeing if your equations and results make sense is something one should do while working on almost any math problem: try to understand what is really going on behind the mechanics of the calculations. Here is the actual S.I.R. model:
1. S'(t) = -𝛽S(t)I(t)
2. I'(t) = 𝛽S(t)I(t) – 𝛾I(t)
3. R'(t) = 𝛾I(t)
The Greek letters 𝛽 and 𝛾 represent positive constants.
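As a rough sketch, the three equations can be written as one small Python function that returns the three rates of change for given values of S and I. The function name and the sample numbers are made up for illustration; they are not real estimates for any disease.

```python
def sir_rates(s, i, beta, gamma):
    """Return (S', I', R') for the current numbers of susceptible and infected."""
    ds = -beta * s * i               # susceptible people becoming infected
    di = beta * s * i - gamma * i    # newly infected minus newly removed
    dr = gamma * i                   # infected people recovering or dying
    return ds, di, dr

# Made-up sample values, purely for illustration.
ds, di, dr = sir_rates(s=1000, i=10, beta=0.001, gamma=0.5)
print(ds, di, dr)
print(ds + di + dr)  # should be (essentially) zero
```

Note that R itself never appears on the right-hand side: the rates depend only on how many people are currently susceptible and infected.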
Some general comments: every person starts out susceptible. They either become infected or remain susceptible. After being infected, they either recover or pass away. So at any time, some fraction of the infected is becoming part of the removed population. The model assumes that this happens in a uniform manner. In particular, it assumes that the rate at which people join the removed population (R'(t)) is directly proportional to the number of people currently infected (I(t)). This is exactly what (3) says, in plain English.
Now, at what rate is the susceptible population changing (S'(t))? First, this is negative, because the number of susceptible people is always decreasing (some are becoming infected). They can only become infected from currently infected people. If the number of infected people doubles, say, then they will probably infect twice as many susceptible people. Thus, it is reasonable to assume that S'(t) is directly proportional to the number of people currently infected (I(t)). This accounts for the negative sign and the factor I(t) in (1): S'(t) = -𝛽S(t)I(t).
Does the remaining part make sense? If the susceptible population shrank to half its size, say, there would be half as many people who can get infected, so it seems reasonable that half as many people would get infected (leave the susceptible population). This says that S'(t) is directly proportional to S(t). That is the other part of (1).
As with any mathematical model, this model includes some simplifying assumptions, including a constant total population. Equation (2) follows from (1), (3), and this assumption, as follows.
A decrease of S by 10, say, is an increase in I by 10. A decrease in I by 5 is an increase in R by 5. Because the population is assumed constant, all the changes at any time add up to 0. So all the rates of change add up to 0:
S'(t) + I'(t) + R'(t) = 0
→ I'(t) = -S'(t) – R'(t)
Now substitute the values from equations (1) and (3) into the above:
I'(t) = 𝛽S(t)I(t) – 𝛾I(t)
This is equation (2).
I will not attempt to solve the S.I.R. model, or system of differential equations, but will simplify the equations to obtain numerical approximations of the three functions over time. (This is a mathematical exercise just to see what happens, and not based on real parameters or data. I will just make up values of 𝛽 and 𝛾.) Let one week be our unit of time. The Greek letter Δ is used for change and here it will mean the change of some amount over one week. I will use this to approximate the instantaneous rate of change (derivative). Effectively, I am assuming the numbers change once a week, rather than continuously. I will use S_old and I_old to indicate “last week’s” numbers.
Suppose that 𝛾 = ½ and 𝛽 = 0.000001. The approximations to the differential equations are the three changes below. Notice that the three changes sum to 0.
4. ΔS = -0.000001 · S_old · I_old
5. ΔI = 0.000001 · S_old · I_old – ½ · I_old
6. ΔR = ½ · I_old
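The three approximations translate almost word for word into a short Python sketch of a single weekly update. The function name next_week is made up for illustration, and the constants are the made-up values chosen above.

```python
BETA = 0.000001  # made-up infection constant from the text
GAMMA = 0.5      # made-up removal constant from the text

def next_week(s_old, i_old, r_old):
    """Apply approximations (4)-(6) once and return next week's S, I, R."""
    new_infections = BETA * s_old * i_old  # this is -ΔS in (4)
    new_removals = GAMMA * i_old           # this is ΔR in (6)
    s_new = s_old - new_infections
    i_new = i_old + new_infections - new_removals  # ΔI in (5)
    r_new = r_old + new_removals
    return s_new, i_new, r_new

# Start with 2 infected people and 7,000,000 susceptible people.
print(next_week(7_000_000, 2, 0))
```

Because the people leaving S are exactly the people entering I, and the people leaving I are exactly the people entering R, the total S + I + R is unchanged by the update.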
Suppose we start counting when 2 people in Massachusetts are infected and everyone else is susceptible (and treat the state like a closed system). The resulting table below is what happens with no social distancing or precautionary measures, but rather the case where the virus is just allowed to spread.
| Week | S | I | R |
| --- | --- | --- | --- |
| Week 0 | 7,000,000 | 2 | 0 |
The changes from Week 0 to Week 1 according to our approximations (4)–(6) would be:
ΔS = -0.000001(7,000,000)(2) = -14
ΔI = 0.000001(7,000,000)(2) – ½(2) = 13
ΔR = ½(2) = 1
This says that 14 people got infected and then 1 person recovered (net 13 infected). To get Week 1’s numbers, you add these changes to Week 0’s S, I, and R-values, respectively (adding a negative change is subtracting). For example, S = 7,000,000 + (-14) = 6,999,986. Now we have
| Week | S | I | R |
| --- | --- | --- | --- |
| Week 0 | 7,000,000 | 2 | 0 |
| Week 1 | 6,999,986 | 15 | 1 |
The total number in each row is the total population and is constant in this model. Now, you take the numbers from Week 1 and plug them into equations (4)–(6) to get the changes from Week 1 to Week 2. Then add those to Week 1’s numbers to get Week 2’s numbers. Continue to do this for subsequent rows (this would be a good place to use an Excel sheet). These are all numbers of people, so I have slightly rounded in some places.
| Week | S | I | R |
| --- | --- | --- | --- |
| Week 0 | 7,000,000 | 2 | 0 |
| Week 1 | 6,999,986 | 15 | 1 |
| Week 2 | 6,999,881 | 113 | 8 |
| Week 3 | 6,999,090 | 847 | 65 |
| Week 4 | 6,993,162 | 6,352 | 488 |
| Week 5 | 6,948,741 | 47,597 | 3,664 |
| Week 6 | 6,618,002 | 354,538 | 27,462 |
| Week 7 | 4,271,669 | 2,523,602 | 204,731 |
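If you would rather not fill in a spreadsheet by hand, the week-by-week procedure can be sketched as a short Python loop. This is only an illustration with a made-up function name; it keeps fractions of people instead of rounding each week, so its output can drift from the rounded table, more noticeably in later weeks because small differences compound.

```python
def simulate(s, i, r, weeks, beta=0.000001, gamma=0.5):
    """Repeat the weekly update and return one (S, I, R) row per week."""
    rows = [(s, i, r)]
    for _ in range(weeks):
        new_infections = beta * s * i  # people moving from S to I
        new_removals = gamma * i       # people moving from I to R
        s, i, r = (s - new_infections,
                   i + new_infections - new_removals,
                   r + new_removals)
        rows.append((s, i, r))
    return rows

rows = simulate(7_000_000, 2, 0, weeks=7)
for week, (s, i, r) in enumerate(rows):
    print(f"Week {week}: S = {s:,.0f}  I = {i:,.0f}  R = {r:,.0f}")
```

Each row is computed only from the row before it, which is why the table has to be built one week at a time.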
In the following week, the change in the number of susceptible people computed from (4) would be larger than S itself. You cannot have a negative number of people, so for this week we use ΔS = -S_old (everyone susceptible becomes infected) and ΔI = S_old – ½ · I_old.
| Week | S | I | R |
| --- | --- | --- | --- |
| Week 7 | 4,271,669 | 2,523,602 | 204,731 |
| Week 8 | 0 | 5,533,470 | 1,466,532 |
| Week 9 | 0 | 2,766,735 | 4,233,267 |
| Week 10 | 0 | 1,383,368 | 5,616,634 |
| Week 11 | 0 | 691,684 | 6,308,318 |
| Week 12 | 0 | 345,842 | 6,654,160 |
| Week 13 | 0 | 172,921 | 6,827,081 |
| Week 14 | 0 | 86,460 | 6,913,542 |
The number of infected at this point is halving every week and approaching 0, while the number of removed is approaching the entire population.
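The whole 14-week run, including the week where the susceptible population runs out, can be sketched in Python as follows. The min(...) cap is my own way of expressing the "cannot have a negative number of people" rule for every week at once, and the function name is made up. The sketch does not round to whole people, so its figures will not match the tables exactly.

```python
def simulate_capped(s, i, r, weeks, beta=0.000001, gamma=0.5):
    """Weekly S.I.R. update where new infections can never exceed S."""
    rows = [(s, i, r)]
    for _ in range(weeks):
        # Cap: at most everyone still susceptible can become infected.
        new_infections = min(beta * s * i, s)
        new_removals = gamma * i
        s, i, r = (s - new_infections,
                   i + new_infections - new_removals,
                   r + new_removals)
        rows.append((s, i, r))
    return rows

rows = simulate_capped(7_000_000, 2, 0, weeks=14)
for week, (s, i, r) in enumerate(rows):
    print(f"Week {week}: S = {s:,.0f}  I = {i:,.0f}  R = {r:,.0f}")
```

Once S reaches 0 there are no new infections, so each week's update reduces to I_new = ½ · I_old, which is the halving described above.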