Code and notes for my class "Global Change and Emerging Infectious Disease"

View the Project on GitHub eehh-stanford/gceid


Epidemic models, like the SIR model, involve dyads of individuals – one susceptible and one infectious – coming together at a specified rate and generating a new infection with a specified probability. We say that a population is well-mixed if all infectious-susceptible dyads in the population have approximately the same probability of occurring. This is obviously a very strong assumption. For example, as a student at Stanford, you are probably much more likely to encounter another Stanford student infected with the flu than you are an infected student at, say, the University of South Carolina.

We can relax the assumption of a population being well-mixed by adding structure to it. The resulting model will be characterized by structured mixing, meaning that there are potentially quite different probabilities associated with dyads that can be formed from the various elements of structure. This structure can represent geographic location (e.g., Palo Alto, CA vs. Columbia, SC) or it can represent various mechanisms by which social or cultural attributes affect the way people interact, e.g., race/ethnicity, age, gender, occupation, social class, income quartile, etc.

Whatever the nature of the structuring, there are certain generic features of structured epidemic models. First, structuring slows down epidemics. Second, structuring typically leads to smaller epidemics. As I revise these notes in the midst of COVID-19 social-distancing efforts, these points have taken on a new urgency.

We can construct a toy example to show this qualitative behavior. We will compare two SIR models, one well-mixed and the other structured into two asymmetrically-mixing groups. For the well-mixed model, we will assume that the effective contact rate is β = 0.3 and the removal rate of infectious individuals is ν = 0.2. For the structured model, we will assume that the average effective contact rate is the same β̄ = 0.3, but that there is asymmetry in the contact rates such that β11 = 0.5, β12 = 0.15, β21 = 0.05, and β22 = 0.5, where βi**j is the rate of transmission from infectious individuals in class j to susceptible individuals in class i. Assume that the removal rate is the same for both classes, ν = 0.2.

Plot the epidemic (i.e., incidence) curves for the two models.


## SIR model
sir <- function(t,x,parms){
  S <- x[1]
  I <- x[2]
  R <- x[3]
  C <- x[4]
  S.dot <- -beta*S*I/N
  I.dot <- beta*S*I/N - nu*I
  R.dot <- nu*I
  C.dot <- beta*S*I/N              ## cumulative cases
  res <- c(S.dot,I.dot,R.dot,C.dot)
## SIR model with two subpopulations
hsir <- function(t,x,parms){
    S1 <- x[1]
    I1 <- x[2]
    R1 <- x[3]
    S2 <- x[4]
    I2 <- x[5]
    R2 <- x[6]
    C <- x[7]
             S1.dot <- -S1*(beta11*I1 + beta12*I2)/N
             I1.dot <- S1*(beta11*I1 + beta12*I2)/N - nu*I1
             R1.dot <- nu*I1
             S2.dot <- -S2*(beta22*I2 + beta21*I1)/N
             I2.dot <- S2*(beta22*I2 + beta21*I1)/N - nu*I2
             R2.dot <- nu*I2
             C.dot <-  S1*(beta11*I1 + beta12*I2)/N + S2*(beta22*I2 + beta21*I1)/N       ## cumulative cases
             res <- c(S1.dot,I1.dot,R1.dot,S2.dot,I2.dot,R2.dot,C.dot)

times <- seq(0,300,1)
N <- 1000000
x0 <- c((N-1)/2,0.5,0,(N-1)/2,0.5,0,1)
parms <- c(N=N, beta11=0.5, beta12=0.15, beta22=0.5, beta21=0.05, nu=1/5)

## ODE evals
stateMatrix <- ode(y=x0, times, hsir, parms)
colnames(stateMatrix) <- c("Time","S1","I1","R1","S2","I2","R2","C")

### Compare to unstructured model
y0 <- c(N-1,1,0,1)
parms0 <- c(N=N, beta=0.3, nu=1/5)
stateMatrix1 <- ode(y=y0, times, sir, parms0)
colnames(stateMatrix1) <- c("Time","S","I","R","C")

## plot epidemic curves
plot(stateMatrix1[,"Time"], stateMatrix1[,"I"], type="l", lwd=3, col="blue",
     xlab="Time", ylab="Number of Cases")
lines(stateMatrix[,"Time"], (stateMatrix[,"I1"] + stateMatrix[,"I2"]), 
      lwd=3, col="red")
legend("topleft", c("Unstructured","Structured"), col=c("blue","red"), lwd=3)

Note that the structured epidemic is slower to take off and infects a smaller fraction of the population even though it technically has a higher R0. Importantly, from a healthcare-logistics perspective, the peak number of cases is lower. Even if an epidemic cannot be avoided, delaying it buys time to develop vaccines, treatments, etc. and reducing the peak of cases makes it more likely that healthcare systems’ capacity will not be overwhelm.

## R_0 = beta/nu = 1.5
0.3 * 5

## [1] 1.5

## R_0 dominant eigenvalue of G = 2.93
G <- matrix(c(0.5, 0.15, 0.05, 0.5)*5, nr=2, nc=2, byrow=TRUE)

## [1] 2.933013 2.066987

Flattening the Epidemic Curve When Mixing Is Heterogeneous

As I write these notes, the pandemic respiratory infection caused by the SARS-CoV-2 virus, COVID-19, is sweeping the United States. One of the major concerns of COVID-19 is that there is a substantial subpopulation that is particularly vulnerable to serious complications and death from infection. In particular, there is mounting evidence that COVID-19 is especially dangerous for the elderly.

This observation leads to some obvious public-health policy questions. What is the best way to protect a vulnerable subpopulation? In particular, should we isolate members of vulnerable sub-populations through quarantine or social-distancing efforts?

We can use our simple structured-mixing model to investigate what happens when we reduce the contact rates (e.g., through social distancing efforts) of the high-risk population, the lower-risk population, or both.

We will explore four sets of parameters reflecting:

  1. Baseline conditions
  2. Reduction of contact rates with the high-risk subpopulation
  3. Reduction of contact rates with the low-risk subpopulation
  4. Global reduction of contact rates

We assume that the contact rates within the low-risk subpopulation are higher than either the contact rates within the high-risk subpopulation or between subpopulations. This assumption follows from the fact that the high-risk subpopulation is older and less likely to be engaged in the workforce, etc. We retain the assumption from above that there is an asymmetry in transmission between groups. This arises from the fact that caretakers of the elderly are typically younger and therefore in the low-risk subpopulation. A sick elderly person is more likely to transmit to their young caregiver than a young person is to an elderly individual. As with everything else, this is an assumption that can be played around with and perhaps even left as an exercise.

We should think a bit about mortality rates. These are necessarily higher in the high-risk subpopulation (that’s what makes them high-risk). Note, however, that in the equations describing our structured SIR model, we simply have a constant removal rate. The term “removal rate” covers a multitude of sins. One can be removed from the infectious compartment by a number means, e.g., through recovery, death, quarantine, emigration from the affected region, etc. In order not to complicate our simple model too much, we will simply retain the assumption that our νi terms incorporate both recovery and mortality. Since it is quite likely that there could actually be opposite effects on the νi if people in the high-risk population experience greater mortality rates but also slower recovery. For now, we’ll stick with a single removal rate, ν. While the details of recovery and mortality rates would be very important if we were developing a tactical model to help with formulating detailed policy recommendations, our model here is strategic and more concerned with acquiring a general understanding of the consequences of changed contact rates on epidemic dynamics.

We will also modify the starting population since there are fewer people in the high-risk subpopulation.

We will assume that we can cut the contact rates with the high-risk subpopulation in half, but that we can’t reduce the contact rates with the low-risk subpopulation as much. This follows from the fact that members of the high-risk subpopulation are more likely to not be engaged in the workforce and may be more easily epidemiologically isolated.

As I said above, all these assumptions can be modified and you should probably expect future assignments in which you will be asked to investigate departures from the simple model presented here.

Because we are slowing down the epidemic, we will also extend the time over which we integrate the model.

# baseline from before
parms1 <- c(N=N, beta11=0.5, beta12=0.15, beta22=0.5, beta21=0.05, nu=1/5)
# reduce high-risk contact
## beta21 and beta22 define transmission to group 2 and incorporate
## contact rates along with transmissibility
## reduce these betas by half
parms2 <- c(N=N, beta11=0.5, beta12=0.15, beta22=0.25, beta21=0.025, nu=1/5)
# reduce low-risk contact
# assume we can only reduce by a third
parms3 <- c(N=N, beta11=0.335, beta12=0.1, beta22=0.5, beta21=0.05, nu=1/5)
# reduce both
parms4 <- c(N=N, beta11=0.335, beta12=0.1, beta22=0.25, beta21=0.025, nu=1/5)
z0 <- c(2*(N-1)/3,0.5,0,(N-1)/3,0.5,0,1)
times <- seq(0,600,1)

Now run the models.

## baseline
stateMatrix1 <- ode(y=z0, times, hsir, parms1)
colnames(stateMatrix1) <- c("Time","S1","I1","R1","S2","I2","R2","C")
## reduce high-risk contact
stateMatrix2 <- ode(y=z0, times, hsir, parms2)
colnames(stateMatrix2) <- c("Time","S1","I1","R1","S2","I2","R2","C")
## reduce low-risk contact
stateMatrix3 <- ode(y=z0, times, hsir, parms3)
colnames(stateMatrix3) <- c("Time","S1","I1","R1","S2","I2","R2","C")
## reduce both
stateMatrix4 <- ode(y=z0, times, hsir, parms4)
colnames(stateMatrix4) <- c("Time","S1","I1","R1","S2","I2","R2","C")

## plot epidemic curves
# baseline
plot(stateMatrix1[,"Time"], (stateMatrix1[,"I1"] + stateMatrix1[,"I2"]), 
  type="l", lwd=2, col="black",
  xlab="Time", ylab="Number of Cases")
# reduce high-risk
lines(stateMatrix2[,"Time"], (stateMatrix2[,"I1"] + stateMatrix2[,"I2"]), 
      lwd=2, col="blue")
# reduce low-risk
lines(stateMatrix3[,"Time"], (stateMatrix3[,"I1"] + stateMatrix3[,"I2"]), 
      lwd=2, col="cyan")
# reduce both
lines(stateMatrix4[,"Time"], (stateMatrix4[,"I1"] + stateMatrix4[,"I2"]), 
      lwd=2, col="magenta")
legend("topright", c("Baseline","Reduce High-Risk", "Reduce Low-Risk", "Reduce Both"), col=c("black","blue","cyan","magenta"), lwd=2)

We should probably check that these dramatic differences in epidemic curves are not simply driven by changes in the low-risk population, though the dramatic differences in the peak size of the epidemic is reassuring.

## plot epidemic curves
# baseline
plot(stateMatrix1[,"Time"], stateMatrix1[,"I2"], 
  type="l", lwd=2, col="black",
  xlab="Time", ylab="Number of High-Risk Cases")
# reduce high-risk
lines(stateMatrix2[,"Time"], stateMatrix2[,"I2"], lwd=2, col="blue")
# reduce low-risk
lines(stateMatrix3[,"Time"], stateMatrix3[,"I2"], lwd=2, col="cyan")
# reduce both
lines(stateMatrix4[,"Time"], stateMatrix4[,"I2"], lwd=2, col="magenta")
legend("topright", c("Baseline","Reduce High-Risk", "Reduce Low-Risk", "Reduce Both"), col=c("black","blue","cyan","magenta"), lwd=2)

Note the very different range of the y-axis!

This stupidly simple model makes the stark point that if we want to control an epidemic in a heterogeneous population, we should target the segment of the population that generates the most infections. This will often be the same segment of the population that is least at-risk for the most severe consequences of infection. This model provides the logic underlying what may seem like draconian social-distancing interventions (e.g., closing schools, canceling public gatherings).