Rewriting Relational Event Models to Poisson Regressions

Author

Jonathan Koop

Published

March 4, 2025

Relational Event Models (REMs) are a popular and powerful tool for analyzing dynamic social networks. Because they explicitly model the timing of events, they are particularly useful for understanding how interaction patterns unfold over time.

Ever since Butts first described the model in 2008, multiple software packages have been developed to estimate it. Besides the popular relevent package in R, various other packages allow for Bayesian estimation of these models and offer computationally efficient estimation routines, most notably the remverse suite recently published by researchers at Tilburg University.

Compared to generalized linear models, however, these packages are limited in their functionality. For example, for my thesis project we were interested in applying Bayesian regularization by placing shrinkage priors on the parameters, which is not possible in the current implementations of relational event models. For this reason, an alternative approach was needed: rewriting the relational event model as a Poisson regression model.

In this document, I provide a hands-on, step-by-step tutorial on how to rewrite a relational event model as a Poisson regression model. It is mainly based on the brief, rather technical description by Vieira, Leenders and Mulder (2024).

Step 1: Preparing Relational Event History Data

In the first step, we need to prepare the relational event history data. To do so, we will draw upon the remify and remstats packages from the remverse suite. These packages allow us to prepare the data in the correct format and, most importantly, to calculate the endogenous statistics for the model.

# Loading the remverse packages
library(remstats)
library(remify)

Next, we will load the data and prepare it for the relational event model. For this example, we will use data that is included in the remstats package.

# Loading the data
data("randomREHsmall")

Now, we can prepare the data for the relational event model.

# Specifying the model
eff <- ~ inertia(scaling = "std") + reciprocity(scaling = "std") + send("extraversion")

# Preparing the data
reh_tie <- remify::remify(edgelist = history, model = "tie")

# Calculating the endogenous statistics
reh_stats <- remstats(reh = reh_tie, tie_effects = eff, attr_actors = info)
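
Printing the resulting remify object gives a brief overview of the processed event history (such as the number of events, actors, and dyads in the risk set) and is a quick way to check that the data were read in as intended:

# Quick overview of the processed event history
reh_tie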

Step 2: Rewriting the Data

Statistical Underpinnings

As Vieira, Leenders and Mulder (2024) describe, Relational Event Models can be seen as a special case of Poisson regression models.

To provide a bit of context, REMs model the event rate \(\lambda^{\text{REM}}(s,r,t)\) between sender \(s\) and receiver \(r\) at time \(t\) as a function of the endogenous and exogenous statistics \(X_{s,r}(t)\), i.e.:

\[\begin{equation} \log \lambda^{\text{REM}}(s,r,t) = X_{s,r}(t)'\beta, \label{eq:rem_equation} \end{equation}\]

with the assumption that the time intervals between events are exponentially distributed:

\[\begin{equation} \Delta t \sim \text{Exponential}(\lambda_t') \label{eq:rem_exp} \end{equation}\]

where \(\lambda_t' = \sum_{s,r} \lambda^{\text{REM}}(s,r,t)\), thus representing the sum of all event rates at time \(t\).

The REM can be rewritten as a Poisson regression model by using a binary indicator of event occurrence as the dependent variable and including the logarithm of the inter-event time as an offset. The Poisson regression model can be written as:

\[\begin{equation} \log \mathbb{E}[y_{s_m, r_m}] = \log(\Delta t_m) + X_{s_m, r_m}(t_m)' \beta, \end{equation}\]

where \(y_{s_m, r_m}\) is a binary indicator that takes the value 1 if an event occurs between sender \(s_m\) and receiver \(r_m\) at time \(t_m\) and 0 otherwise, and \(\log(\Delta t_m)\) is the offset term.
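
To see why both formulations yield the same estimates, it helps to write down the likelihood contribution of a single event \(m\) (a brief sketch, following the logic of Vieira, Leenders and Mulder, 2024). Under the REM, the observed dyad \((s_m, r_m)\) is the first among all dyads in the risk set \(R\) to produce an event, after a waiting time \(\Delta t_m\):

\[\begin{equation} L_m^{\text{REM}} = \lambda^{\text{REM}}(s_m, r_m, t_m) \exp\Big(-\Delta t_m \sum_{(s,r) \in R} \lambda^{\text{REM}}(s,r,t_m)\Big). \end{equation}\]

In the Poisson formulation, each dyad in the risk set contributes a Poisson likelihood with mean \(\mu_{s,r} = \Delta t_m \, \lambda^{\text{REM}}(s,r,t_m)\) (the offset \(\log \Delta t_m\) added to the linear predictor), so that

\[\begin{equation} L_m^{\text{Poisson}} = \prod_{(s,r) \in R} \mu_{s,r}^{\,y_{s,r}} e^{-\mu_{s,r}} = \Delta t_m \, \lambda^{\text{REM}}(s_m, r_m, t_m) \exp\Big(-\Delta t_m \sum_{(s,r) \in R} \lambda^{\text{REM}}(s,r,t_m)\Big) = \Delta t_m \, L_m^{\text{REM}}. \end{equation}\]

Since the factor \(\Delta t_m\) does not depend on \(\beta\), maximizing the Poisson likelihood yields the same estimates as maximizing the REM likelihood.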

Rewriting the Data

Following the description above, we need to rewrite the data into a data frame that includes:

  • The binary dependent variable \(y_{s_m, r_m}\),
  • The offset term \(\log(\Delta t_m)\),
  • The endogenous and exogenous statistics \(X_{s_m, r_m}(t_m)\).

To do so, we need to transform the calculated statistics from an array with dimensions \(M \times N \times P\) (where \(M\) is the number of events, \(N\) the total number of dyads, and \(P\) the number of statistics) into a data frame.
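
Before writing the conversion, we can verify these dimensions on the statistics object computed above (remstats stores the tie-oriented statistics as such an array, so dim() and dimnames() should apply directly):

# M x N x P: events x dyads x statistics
dim(reh_stats)

# Names of the computed statistics (third dimension)
dimnames(reh_stats)[[3]]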

This transformation is handled by the function poisson_df defined below.

poisson_df <- function(events, tie_stats, tie_reh, t0 = 0) {
  # Build the risk set: all dyads for which statistics were computed
  risk_set <- vector("list", dim(tie_stats)[2])
  for (i in 1:dim(tie_stats)[2]) {
    risk_set[[i]] <- getDyad(tie_reh, i)
  }
  
  # Keep only the actor1 and actor2 columns
  risk_set <- do.call(rbind, risk_set)[, 2:3]
  
  M <- nrow(events) # number of events
  poisson_list <- vector("list", M) # initialize list to store data frames
  
  for (m in 1:M) {
    # Calculate time difference between current and previous event
    t_curr <- events$time[m]
    t_prev <- if (m == 1)
      t0
    else
      events$time[m - 1]
    delta_m <- t_curr - t_prev
    
    # Get statistics for current event m
    stats_m <- as.data.frame(tie_stats[m, , ])
    
    # Combine risk set with covariates
    df_m <- cbind(risk_set, stats_m)
    
    # Create binary outcome y
    df_m$y <- ifelse(df_m$actor1 == events$actor1[m] &
                       df_m$actor2 == events$actor2[m],
                     1,
                     0)
    
    # Add offset
    df_m$logDelta <- log(delta_m)
    
    # Store data frame for event m
    poisson_list[[m]] <- df_m
  }
  
  # Combine all event data frames in list into one data frame
  df_poisson <- do.call(rbind, poisson_list)
  df_poisson <- df_poisson %>%
    select(-actor1, -actor2, -baseline) %>%
    # turn all variables that include "same_" to factors for memory efficiency
    mutate(across(contains("same_"), as.factor)) %>%
    # turn y to integer for memory efficiency
    mutate(y = as.integer(y))
  
  return(df_poisson)
}

Now, we can use the function to create the data frame for the Poisson regression model.

# Load the dplyr package
library(dplyr)

# Applying the function
poisson_data <- poisson_df(events = history, tie_stats = reh_stats, tie_reh = reh_tie)

Since here \(M=115\) and \(N=90\), the resulting data frame has \(115 \times 90 = 10350\) rows.

nrow(poisson_data)
[1] 10350

And 5 columns, since we have 3 statistics, the offset term, and the binary dependent variable.

ncol(poisson_data)
[1] 5
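
As an additional check, every event block should contain exactly one row with \(y = 1\), so the column sum of \(y\) should equal the number of events:

# Should equal the number of events, M = 115
sum(poisson_data$y)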

Step 3: Estimating the Poisson Regression Model

Estimating the Model

Now that we have the data in the correct format, we can estimate the Poisson regression model. For this, we will use the glm function in R.

To do that, we can again create a function that automatically processes the data and estimates the model.

estimate_poisson <- function(df_poisson) {
  # All columns preceding y contain the statistics and serve as predictors
  predictors <- names(df_poisson)[1:(which(colnames(df_poisson) == "y") - 1)]
  
  # Build the model formula, adding the log inter-event time as an offset
  glm_formula <- as.formula(paste("y ~", paste(
    c(predictors, "offset(logDelta)"), collapse = " + "
  )))
  
  # Estimate Poisson regression model
  model <- glm(
    formula = glm_formula,
    data = df_poisson,
    family = poisson(link = "log")
  )
  
  return(model)
}

Now, we can estimate the model and inspect the estimated parameters.

# Estimate the model
poisson_model <- estimate_poisson(poisson_data)

# Inspect the model
summary(poisson_model)

Call:
glm(formula = glm_formula, family = poisson(link = "log"), data = df_poisson)

Coefficients:
                   Estimate Std. Error  z value Pr(>|z|)    
(Intercept)       -10.01654    0.09434 -106.176   <2e-16 ***
inertia            -0.04348    0.09662   -0.450    0.653    
reciprocity        -0.13479    0.10486   -1.285    0.199    
send_extraversion  -0.06682    0.09499   -0.703    0.482    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 1185.9  on 10349  degrees of freedom
Residual deviance: 1183.5  on 10346  degrees of freedom
AIC: 1421.5

Number of Fisher Scoring iterations: 7

Comparing the Results to remstimate

To verify that the results are correct, we can compare them to the estimates obtained with the remstimate function from the remverse.

library(remstimate)

# Estimate the relational event model
re_model <- remstimate(reh = reh_tie,
                       stats = reh_stats,
                       method = "MLE",
                       ncores = 1)

summary(re_model)
Relational Event Model (tie oriented) 

Call:
~inertia(scaling = "std") + reciprocity(scaling = "std") + send("extraversion")


Coefficients (MLE with interval likelihood):

                     Estimate    Std. Err     z value Pr(>|z|) Pr(=0)
baseline           -10.008043    0.094335 -106.090313   0.0000 <2e-16
inertia             -0.043148    0.096207   -0.448489   0.6538 0.9065
reciprocity         -0.133547    0.104344   -1.279865   0.2006 0.8254
send_extraversion   -0.066629    0.095034   -0.701108   0.4832 0.8935
Null deviance: 2529.297 on 115 degrees of freedom
Residual deviance: 2526.876 on 111 degrees of freedom
Chi-square: 2.420673 on 4 degrees of freedom, asymptotic p-value 0.6588946 
AIC: 2534.876 AICC: 2535.24 BIC: 2545.856
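
The two sets of estimates and standard errors are nearly identical: the Poisson intercept corresponds to the REM baseline effect, and the remaining small discrepancies are likely due to differences in the numerical optimization routines. For a quick side-by-side comparison, something along the following lines can be used (a sketch that assumes the remstimate object stores its estimates in a coefficients element; check str(re_model) if it does not):

# Side-by-side comparison of the estimates
# (assumes re_model$coefficients exists; otherwise inspect str(re_model))
cbind(poisson = coef(poisson_model),
      rem = re_model$coefficients)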