# Loading the remverse packages
library(remstats)
library(remify)
Rewriting Relational Event Models to Poisson Regressions
Relational Event Models are a popular and powerful tool for analyzing dynamic social networks. Because of their capability of modeling the timing of events, they are particularly useful for understanding the dynamics of social networks over time.
Ever since the first description of the model by Butts in 2008, multiple software packages have been developed to estimate these models. Besides the popular relevent
package in R, there are various other packages allowing for the Bayesian estimation of these models and including ways to model the parameters in computationally efficient ways, particularly the remverse
recently published by researchers from the University of Tilburg.
Compared to generalized linear models, however, these packages are limited in their functionalities. For example, in the realm of my thesis project, we were interested in applying Bayesian regularization through the inclusion of shrinkage priors on the parameters. This is not possible in the current implementations of relational event models. For this reason an alternative approach was needed: rewriting the relational event model to a Poisson regression model.
In this document, I will provide a step-by-step hands-on tutorial on how to rewrite a relational event model to a Poisson regression model. This is mainly based on the brief, rather technical description by Vieira, Leenders and Mulder (2024).
Step 1: Preparing Relational Event History Data
In the first step, we need to prepare the relational event history data. To do so, we will draw upon the remify
and remstats
packages from the remverse
suite. These packages allow us to prepare the data in the correct format for the relational event model, and most importantly to calculate endogenous statistics for the model.
Next, we will load the data and prepare it for the relational event model. For this example, we will use data that is included in the remstats
package.
# Loading the data
data("randomREHsmall")
Now, we can prepare the data for the relational event model.
# Specifying the model
<- ~ inertia(scaling = "std") + reciprocity(scaling = "std") + send("extraversion")
eff
# Preparing the data
<- remify::remify(edgelist = history, model = "tie")
reh_tie
# Calculating the endogenous statistics
<- remstats(reh = reh_tie, tie_effects = eff, attr_actors = info) reh_stats
Step 2: Rewriting the Data
Statistical Underpinnings
As Vieira, Leenders and Mulder (2024) describe, Relational Event Models can be seen as a special case of Poisson regression models.
To provide a bit of context, REMs model the event rate \(\lambda^{\text{REM}}(s,r,t)\) between sender \(s\) and receiver \(r\) at time \(t\) as a function of the endogenous and exogenous statistics \(X_{s,r}(t)\), i.e.:
\[\begin{equation} \log \lambda^{\text{REM}}(s,r,t) = X_{s,r}(t)'\beta, \label{eq:rem_equation} \end{equation}\]
With the assumption that the time intervals between events are exponentially distributed
\[\begin{equation} \Delta t \sim \text{Exponential}(\lambda_t') \label{eq:rem_exp} \end{equation}\]
where \(\lambda_t' = \sum_{s,r} \lambda^{\text{REM}}(s,r,t)\), thus representing the sum of all event rates at time \(t\).
This can be rewritten as a Poisson regression model by adding the inter-event time as offset in the event rate and using a binary indicator for the event occurrence as the dependent variable. The Poisson regression model can be written as:
\[\begin{equation} \log \, \hat{y}_{s_m, r_m} = \log(\Delta t_m) + X_{s_m, r_m}(t_m)' \beta. \end{equation}\]
where \(\hat{y}_{s_m, r_m}\) takes the value 1 if an event occurs between sender \(s_m\) and receiver \(r_m\) at time \(t_m\), and 0 otherwise and \(\log(\Delta t_m)\) is the offset term.
Rewriting the Data
Following the description above, we need to rewrite the data to a data frame, that includes:
- The binary dependent variable \(\hat{y}_{s_m, r_m}\),
- The offset term \(\log(\Delta t_m)\),
- The endogenous and exogenous statistics \(X_{s_m, r_m}(t_m)\).
To do so, we need to transform the calculated statistics from the array with dimensions \(M \times N \times P\) (\(M\) is number of events, \(N\) total number of dyads, and \(P\) the number of statistics) to a data frame.
To do that, we create the function poisson_df
below.
<- function(events, tie_stats, tie_reh, t0 = 0) {
poisson_df # Get unique actors in the event data
<- sort(unique(c(events$sender, events$receiver)))
actors
# Creating risk set
<- vector("list", dim(tie_stats)[2])
risk_set for (i in 1:dim(tie_stats)[2]) {
<- getDyad(tie_reh, i)
risk_set[[i]]
}
<- do.call(rbind, risk_set)[, 2:3]
risk_set
<- nrow(events) # number of events
M <- vector("list", M) # initialize list to store data frames
poisson_list
for (m in 1:M) {
# Calculate time difference between current and previous event
<- events$time[m]
t_curr <- if (m == 1)
t_prev
t0else
$time[m - 1]
events<- t_curr - t_prev
delta_m
# Get statistics for current event m
<- as.data.frame(tie_stats[m, , ])
stats_m
# Combine risk set with covariates
<- cbind(risk_set, stats_m)
df_m
# Create binary outcome y
$y <- ifelse(df_m$actor1 == events$actor1[m] &
df_m$actor2 == events$actor2[m],
df_m1,
0)
# Add offset
$logDelta <- log(delta_m)
df_m
# Store data frame for event m
<- df_m
poisson_list[[m]]
}
# Combine all event data frames in list into one data frame
<- do.call(rbind, poisson_list)
df_poisson <- df_poisson %>%
df_poisson select(-actor1, -actor2, -baseline) %>%
# turn all variables that include "same_" to factors for memory efficiency
mutate(across(contains("same_"), as.factor)) %>%
# turn y to integer for memory efficiency
mutate(y = as.integer(y))
return(df_poisson)
}
Now, we can use the function to create the data frame for the Poisson regression model.
# Load the dplyr package
library(dplyr)
# Applying the function
<- poisson_df(events = history, tie_stats = reh_stats, tie_reh = reh_tie) poisson_data
Since here \(M=115\) and \(N=90\), our created data frame will have \(115 \times 90 = 10350\) rows.
nrow(poisson_data)
[1] 10350
And 5 columns, since we have 3 statistics, the offset term, and the binary dependent variable.
ncol(poisson_data)
[1] 5
Step 3: Estimating the Poisson Regression Model
Estimating the Model
Now that we have the data in the correct format, we can estimate the Poisson regression model. For this, we will use the glm
function in R.
To do that, we can again create a function that automatically processes the data and estimates the model.
<- function(df_poisson) {
estimate_poisson # get col index for baseline
<- names(df_poisson)[1:(which(colnames(df_poisson) == "y") - 1)]
predictors <- as.formula(paste("y ~", paste(
glm_formula c(predictors, "offset(logDelta)"), collapse = " + "
)))
# Estimate Poisson regression model
<- glm(
model formula = glm_formula,
data = df_poisson,
family = poisson(link = "log")
)
return(model)
}
Now, we can estimate the model and inspect the estimated parameters.
# Estimate the model
<- estimate_poisson(poisson_data)
poisson_model
# Inspect the model
summary(poisson_model)
Call:
glm(formula = glm_formula, family = poisson(link = "log"), data = df_poisson)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.01654 0.09434 -106.176 <2e-16 ***
inertia -0.04348 0.09662 -0.450 0.653
reciprocity -0.13479 0.10486 -1.285 0.199
send_extraversion -0.06682 0.09499 -0.703 0.482
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 1185.9 on 10349 degrees of freedom
Residual deviance: 1183.5 on 10346 degrees of freedom
AIC: 1421.5
Number of Fisher Scoring iterations: 7
Comparing the Results to `remstimate``
To make sure that the results are correct, we can compare the results to the remstimate
function from the remverse
.
library(remstimate)
# Estimate the relational event model
<- remstimate(reh = reh_tie,
re_model stats = reh_stats,
method = "MLE",
ncores = 1)
summary(re_model)
Relational Event Model (tie oriented)
Call:
~inertia(scaling = "std") + reciprocity(scaling = "std") + send("extraversion")
Coefficients (MLE with interval likelihood):
Estimate Std. Err z value Pr(>|z|) Pr(=0)
baseline -10.008043 0.094335 -106.090313 0.0000 <2e-16
inertia -0.043148 0.096207 -0.448489 0.6538 0.9065
reciprocity -0.133547 0.104344 -1.279865 0.2006 0.8254
send_extraversion -0.066629 0.095034 -0.701108 0.4832 0.8935
Null deviance: 2529.297 on 115 degrees of freedom
Residual deviance: 2526.876 on 111 degrees of freedom
Chi-square: 2.420673 on 4 degrees of freedom, asymptotic p-value 0.6588946
AIC: 2534.876 AICC: 2535.24 BIC: 2545.856