Executive Summary

This script generates the plots and performs the data analysis for the following article:
Petrov, A. A. & Du, A. Y. (2024). Are Abstract Relational Roles Encoded Visually? Evidence from Priming Effects. Accepted for publication as a six-page paper in the Proceedings of the Annual Conference of the Cognitive Science Society, 2024. https://alexpetrov.com/pub/cogsci24/

This article reports a behavioral experiment that was part of the PhD Thesis of Yuhui (Ava) Du, where it is referred to as “Experiment 4”. The home folder for this experiment is named “SeeRel1”, or “SR1” for short. The data were collected in the first half of 2023, and Ava defended her dissertation in August 2023.

The script repeats and updates the exploratory data analysis of an earlier script, “./SR1_CogSci2024.Rmd”. The main difference is the transition from raw RTs to residual RTs as the primary dependent variable. It loads a dataframe called SR1_data generated by “./SR1_learning_effect1.Rmd”. Data from the 15 participants who pass the exclusion criteria are concatenated together. Each row in the dataframe represents one experimental trial. The variables (columns) are documented in “./SR1_get_data2.Rmd” and “./SR1_learning_effect1.Rmd”.

The current script is a revised and simplified version of “./SR1_exploratory1.Rmd” and “./SR1_exploratory2.Rmd”. See the former for extensive comments about the experimental design, cross-tabulations of the counterbalancing of the independent variables, etc.

Sources for Statistics Cited in the Article

This script contains almost all inferential statistics that are cited in the published text of the article. (Many descriptive statistics were taken directly from Chapter 5 of Ava Du’s dissertation.) The inferential statistics can be found below in the following sections:


Getting started

## Loading required package: lme4
## Loading required package: Matrix
## 
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
## 
##     lmer
## The following object is masked from 'package:stats':
## 
##     step

To run this script on your computer, you may have to change the “trial_data_path” below. The file.choose() function will be your friend in this regard.
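For example, the adjustment might look like the following sketch (the actual loading code is not echoed in this report, so the exact assignment is an assumption):

## Hypothetical sketch -- adjust the path on your machine, e.g.:
# trial_data_path <- file.choose()         # browse to the SR1 .RData file interactively
# load(trial_data_path, verbose = TRUE)    # loads SR1_data, SR1_sbj, meta_data, ...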

## Loading objects:
##   SR1_data
##   SR1_sbj
##   meta_data
##   model_fits
## descriptor1: Raw data of SeeRel1 experiment. Collected in May-June 2023 at OSU. 
## descriptor2: Imported in R using SR1_get_data1.Rmd on Fri Jul 14 20:50:22 2023. 
## SeeRel1_root: /Users/petrov.11/work/RelExper/SeeRel1 
## data_path: data 
## raw_data.csv: SeeRelation_allData_formatted.csv 
## trial_data.RData: SR1_trial_data.RData 
## analysis_path: analysis 
## descriptor3: Dataframe SR1_trials loaded by SR1_get_data2.Rmd on Mon Jul 24 19:08:12 2023. 
## trial_data_path: /Users/petrov.11/work/RelExper/SeeRel1/data/SR1_main_data.RData 
## dropped_subj: c("16832", "18364", "24965", "29364", "29725", "38106") 
## zRT_threshold: 3 
## main_data.RData: SR1_main_data.RData 
## descriptor4: Dataframe SR1_data[23040x32] loaded by SR1_learning_effect1.Rmd on Thu May  9 14:57:22 2024. 
## RT_limits: c(200, 2000) 
## exp_ref_trials: c(192, 576) 
## main_data2.RData: SR1_main_data2.RData 
## descriptor5: Dataframe SR1_data[23040x43] loaded by SR1_exploratory2.Rmd on Sat May 11 14:49:49 2024. 
## trial_data2_path: /Users/petrov.11/work/RelExper/SeeRel1/data/SR1_main_data2.RData

Experimental design with respect to relational roles

Each trial presented a sequence of two stimuli: a prime and a target. The target was always a static image and the participant’s main task was to press the “A” key if the foreground-colored animal was on the left and press the “L” key if it was on the right. The prime stimuli varied by session – they were videos when session=“video” and sentences when session=“ling”.

The prime events (that is, the events depicted in the prime stimuli) were counterbalanced across four types: “break”, “deform”, “launch”, and “control”. The first three had a well-defined agent – the animal that moved first and made things happen. They also had a well-defined patient – the other animal that received the action initiated by the agent. By contrast, in “control” events the two animals were “dancing” side by side. Everything was symmetric and thus there was no distinction between agent and patient.

The experimental design crossed the “primeEvent” factor orthogonally with the “targetEvent” factor. There are 64 replications per participant per session, which produces a total of 1920 replications across the 15 participants and 2 sessions.

with(SR1_data, table(primeEvent, targetEvent, useNA="ifany"))
##           targetEvent
## primeEvent break deform launch control ambig
##    break    1920   1920      0       0  1920
##    deform   1920   1920      0       0  1920
##    launch   1920   1920      0       0  1920
##    control  1920   1920      0       0  1920
##    ambig       0      0      0       0     0

We are especially interested in relational-role priming effects. Thus, a very important independent variable describes the congruence between the relational roles in the prime event and the relational roles in the target event. The input dataframe contains a variable “repRole” (short for “repeated relational role”) with the following three possible values:

with(SR1_data, table(primeEvent, repRole, useNA="ifany"))
##           repRole
## primeEvent   -1    0    1
##    break   1920 1920 1920
##    deform  1920 1920 1920
##    launch  1920 1920 1920
##    control    0 5760    0
##    ambig      0    0    0

Define Factor “repRoleF”

The numerical values of repRole (-1, 0, +1) are not convenient for plotting. Let’s convert them to a factor with levels:

In addition to fixing the order of the levels along the X axis in plots, the A-R-S order is useful for interpreting the fixed-effect coefficients of the mixed-effects lmer models below. Each of these coefficients represents an increment (in milliseconds) relative to the “baseline” level specified by the leftmost level of the corresponding factor. Thus, the ambiguous repRole condition will serve as baseline (because “A” is the leftmost level of repRoleF). This is more natural and informative than using the “switch” (repRole == -1) level as the baseline.

## Define a helper function that converts a vector of (numeric) repRole's 
## into a vector of strings. Note that it returns a string value, not a factor: 
repRole_string <- function(repRole)
  ifelse (repRole==0,     "A",
    ifelse(repRole==+1,   "R",
      ifelse(repRole==-1, "S",
                          NA         # should never occur
)))

## Insert a new factor into the dataframe:
SR1_data$repRoleF <- with(SR1_data, factor(repRole_string(repRole)))

## Tabulate against repRole:
with(SR1_data, table(repRoleF, repRole, useNA="ifany"))
##         repRole
## repRoleF    -1     0     1
##        A     0 11520     0
##        R     0     0  5760
##        S  5760     0     0
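
As a quick sanity check (not part of the original pipeline), the A-R-S level order can be verified explicitly:

## Sanity check (sketch): the factor levels are ordered A, R, S, so "A" will
## be the baseline level in the lmer models below.
stopifnot(identical(levels(SR1_data$repRoleF), c("A", "R", "S")))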

Define Factor “repRoleAU”

When we fit mixed-effect models later, it will be convenient to have a binary factor “repRoleAU” that contrasts the “ambiguous” level against the mean of the “repeat” and “switch” levels. It corresponds to contrast coefficients c(0.5, -1, 0.5) for the original repRole ordering convention. Equivalently, repRoleAU corresponds to contrast coefficients (-1, 0.5, 0.5) for the repRoleF convention. All in all, repRoleAU has two levels:

Ava Du uses the name “defined_role” for an analogous factor in her scripts (notably, “…/people/yuhui-du/PsychDept/PhD/expmt/Chapter4_Analysis.Rmd”).

## Define a helper function that converts a vector of (numeric) repRole's 
## into a vector of strings. Note that it returns a string value, not a factor: 
repRoleAU_string <- function(repRole)
  ifelse (repRole==0,     "A",
    ifelse(repRole==+1,   "U",
      ifelse(repRole==-1, "U",
                          NA         # should never occur
)))

## Insert a new factor into the dataframe:
SR1_data$repRoleAU <- with(SR1_data, factor(repRoleAU_string(repRole)))

## Tabulate against repRoleF:
with(SR1_data, table(repRoleAU, repRoleF, useNA="ifany"))
##          repRoleF
## repRoleAU     A     R     S
##         A 11520     0     0
##         U     0  5760  5760
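
For reference, the same comparison could in principle be expressed as a custom contrast on repRoleF itself rather than through a separate factor. The sketch below (hypothetical; not used in the analyses that follow) derives the coding matrix whose first coefficient would estimate -mean(A) + 0.5*mean(R) + 0.5*mean(S) in a one-factor model:

## Sketch: planned-contrast coding for repRoleF (hypothetical alternative).
## Rows of H are the desired comparison weights over the levels (A, R, S);
## the coding matrix is the inverse of H without its intercept column.
H <- rbind(intercept = c( 1,   1,   1) / 3,
           AvsU      = c(-1, 0.5, 0.5),     # ambiguous vs mean(repeat, switch)
           RvsS      = c( 0,   1,  -1))     # repeat vs switch
coding <- solve(H)[, -1]
rownames(coding) <- c("A", "R", "S")
# contrasts(SR1_data$repRoleF) <- coding   # NOT run -- would change the models below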

Extract the trials that satisfy the trial-level exclusion criteria:

The substantive changes relative to “./SR1_CogSci2024.Rmd” begin here.

The first substantive change is that we use filter3 rather than filter2 to select the “good” data for analysis. Specifically…

However, as explained in the penultimate section of “./SR1_learning_effect1.Rmd”, filter2 is now deprecated. We use the residual-based filter3 instead:

The response-time (RT) analyses are based on a subset of the trials. The exclusion criteria are implemented (and documented) in “./SR1_get_data2.Rmd” and “./SR1_learning_effect1.Rmd”. The “good” trials are marked by the “filter3” variable. Note that it supersedes the now-deprecated “filter2” (which was used in the earlier version of this script).

After the filtering operation, we simplify the dataframe by removing those variables (columns) that have constant values.

The resulting dataframe D is used in the RT analyses below.
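
As an aside, the constant columns could also be detected programmatically rather than listed by hand; a minimal sketch (not part of the original pipeline):

## Sketch: find the columns that become constant after the filtering step.
## The result is expected to include "timeout", "acc", "filter1", and "filter3".
names(Filter(function(x) length(unique(x)) == 1, subset(SR1_data, filter3)))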

## Apply the trial-level exclusion criteria:
D <- subset(SR1_data, filter3)   # NB: filter3, *not* filter2

## Sanity checks:
stopifnot(D$timeout==0)     # trials with RT>2000 have been excluded
stopifnot(D$acc==1)         # trials with incorrect responses have been excluded
stopifnot(D$filter1)

## Remove columns that have constant values:
D <- D[, setdiff(names(D), c("timeout", "acc", "filter1", "filter3"))]

## Print a message on the console:
cat(sprintf("Extracted a dataframe D with %d rows and %d columns.\n",
            nrow(D), ncol(D)))
## Extracted a dataframe D with 21973 rows and 41 columns.
## Grand descriptive statistics of residRT:
summary(D$residRT)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -440.51  -59.29  -22.13  -14.57   15.79  577.17

Note that mean(D$residRT) is slightly negative (-14.6) because most of the excluded outliers were “long.” This stems from the well-known positive skew of RT distributions.

Plot the individual RT profiles as a function of repRoleF

Collect session-level RT statistics on the “good” trials only.

Major change: We use residual RT instead of raw RT.

## Summarize the RTs conditional on repRole per subject and session:
#  Note that we're using the dataframe D rather than SR1_data. In other words,
#  we include only the trials that satisfy the trial-exclusion criteria.
stats_by_repRole <- ddply(D, .(subjNum, session, repRole), function(D1) {
  data.frame(
    #subjNum  = constant(D1$subjNum,  verifyp=FALSE), 
    #session  = constant(D1$session,  verifyp=FALSE), 
    repRole  = constant(D1$repRole,  verifyp=FALSE), 
    repRoleF = constant(D1$repRoleF, verifyp=FALSE), 
    N_good_trials = nrow(D1),
    RT_mean  = mean(D1$RT),       # for backward compatibility
    RT_std = sd(D1$RT),           # for backward compatibility
    rRT_mean = mean(D1$residRT),  # NEW: Added May 2024
    rRT_std = sd(D1$residRT)      # NEW: Added May 2024
)})

Make the individual plots themselves. To do: @@@ error bars @@@

# Individual unprocessed-RT profiles (in milliseconds):
gp$RT_individ <- ggplot(stats_by_repRole, 
  aes(x=repRoleF, y=RT_mean, group=session, color=session, shape=session)) + 
  geom_point() + geom_line() + 
  facet_wrap(vars(subjNum))
print(gp$RT_individ)

# Individual residual-RT profiles (in milliseconds):
gp$rRT_individ <- ggplot(stats_by_repRole, 
  aes(x=repRoleF, y=rRT_mean, group=session, color=session, shape=session)) + 
  geom_point() + geom_line() + 
  facet_wrap(vars(subjNum))
print(gp$rRT_individ)
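
One possible way to address the error-bar to-do above is sketched below, using +/- 1 SEM per cell computed from rRT_std and N_good_trials (the object name rRT_individ_eb is hypothetical):

## Sketch: individual residual-RT profiles with +/- 1 SEM error bars per cell.
gp$rRT_individ_eb <- ggplot(stats_by_repRole, 
  aes(x=repRoleF, y=rRT_mean, group=session, color=session, shape=session,
      ymin = rRT_mean - rRT_std/sqrt(N_good_trials),
      ymax = rRT_mean + rRT_std/sqrt(N_good_trials))) + 
  geom_point() + geom_line() + geom_errorbar(width=0.1) + 
  facet_wrap(vars(subjNum))
print(gp$rRT_individ_eb)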

Group-level RT profile as a function of repRoleF

@@@ Add text here…

## Means and standard deviations averaged across participants:
stats_aggreg <- ddply(stats_by_repRole, .(session, repRoleF), function(D1) {
  data.frame(
    session  = constant(D1$session,  verifyp=FALSE), 
    repRole  = constant(D1$repRole,  verifyp=FALSE), 
    repRoleF = constant(D1$repRoleF, verifyp=FALSE), 
    sum_N_good_trials = sum(D1$N_good_trials),
    # mean_N_good_trials = mean(D1$N_good_trials),    # mean_N = sum_N / N_subj
    mean_RT_mean  = mean(D1$RT_mean),     # for backward compatibility
    mean_rRT_mean  = mean(D1$rRT_mean)    # NEW: Added May 2024
    # @@@_RT_std 
)})
print(stats_aggreg, digits = 3)
##   session repRole repRoleF sum_N_good_trials mean_RT_mean mean_rRT_mean
## 1   video       0        A              5472          490        -21.56
## 2   video       1        R              2732          502         -9.45
## 3   video      -1        S              2693          501        -11.20
## 4    ling       0        A              5511          387        -13.38
## 5    ling       1        R              2787          387        -13.73
## 6    ling      -1        S              2778          388        -12.55

Average all 15 participants together to produce group-aggregated profiles:

## Plot the aggregate unprocessed-RT profile (in ms)
gp$RT_aggreg1 <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_RT_mean, group=session, color=session, shape=session)) + 
  geom_point() + geom_line()
print(gp$RT_aggreg1)

## Plot the aggregate residual-RT profile (in ms)
gp$rRT_aggreg1 <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_rRT_mean, group=session, color=session, shape=session)) + 
  geom_point() + geom_line()
print(gp$rRT_aggreg1)

The unprocessed-RT profile reproduces Figure 4.6 in Ava Du’s PhD Thesis, except that the error bars are missing here. To add the error bars, we need the residual standard deviation, which will be calculated in the model-fitting sections below.

ANOVA “aov1”

Before bringing out the heavy guns (mixed-effect models), let’s fit a good old-fashioned ANOVA:

# Fit the model:
M$aov1 <- aov(residRT ~ session * repRoleF + subjNum, data = D)

# Print the coefficients (and other info) of the model:
summary(M$aov1)
##                     Df    Sum Sq Mean Sq F value   Pr(>F)    
## session              1     45483   45483   6.858  0.00883 ** 
## repRoleF             2    177714   88857  13.398 1.53e-06 ***
## subjNum             14    919814   65701   9.907  < 2e-16 ***
## session:repRoleF     2    169496   84748  12.779 2.84e-06 ***
## Residuals        21953 145591490    6632                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that session still has a statistically significant effect even though we fitted separate learning curves to the two sessions. Similarly, there is a highly significant effect of subjNum even though we fitted separate learning curves to each individual participant. It’s hard for me to judge the details of this statistical phenomenon. If I had to guess, I would say that it is due to a mismatch between the skewed RT distributions and the assumptions of the least-squares algorithm used to fit the learning curves (in “./SR1_learning_effect1.Rmd”).

On a positive note, it seems we have plenty of statistical power.

Mixed-Effect Model “main1”

In her dissertation, Ava Du uses a rather complex model that includes half a dozen predictors. For purposes of clarity and simplicity, here we use a very simple model with only three predictors:

We use the lmerTest package (Kuznetsova, Brockhoff, & Christensen, 2017) to fit all mixed-effect models.

Let’s start with the model specified above. We will refer to it as M$main1. Here and below unless explicitly stated otherwise, the only difference between the current model and its counterpart in “./SR1_CogSci2024.Rmd” is that here the dependent variable is residRT, whereas it was raw RT in the earlier version.

# Fit the model:
M$main1 <- lmerTest::lmer(residRT ~ session * repRoleF + (1 | subjNum), 
                          data = D, REML = FALSE)

# Print the coefficients (and other info) of the model:
summary(M$main1)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: residRT ~ session * repRoleF + (1 | subjNum)
##    Data: D
## 
##      AIC      BIC   logLik deviance df.resid 
##   255756   255820  -127870   255740    21965 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.1826 -0.5457 -0.1088  0.3563  7.4419 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subjNum  (Intercept)   37.71   6.141  
##  Residual             6630.46  81.428  
## Number of obs: 21973, groups:  subjNum, 15
## 
## Fixed effects:
##                        Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)             -21.598      1.930    26.215 -11.189 1.75e-11 ***
## sessionling               8.277      1.554 21958.375   5.326 1.01e-07 ***
## repRoleFR                12.048      1.908 21958.294   6.316 2.74e-10 ***
## repRoleFS                10.301      1.917 21958.179   5.374 7.77e-08 ***
## sessionling:repRoleFR   -12.406      2.687 21958.198  -4.617 3.92e-06 ***
## sessionling:repRoleFS    -9.363      2.695 21958.336  -3.474 0.000514 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) sssnln rpRlFR rpRlFS ss:RFR
## sessionling -0.404                            
## repRoleFR   -0.329  0.409                     
## repRoleFS   -0.327  0.407  0.331              
## sssnlng:RFR  0.234 -0.578 -0.710 -0.235       
## sssnlng:RFS  0.233 -0.577 -0.236 -0.711  0.333
# ANOVA table:
anova(M$main1)
## Type III Analysis of Variance Table with Satterthwaite's method
##                  Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## session            5145    5145     1 21959  0.7759    0.3784    
## repRoleF         180662   90331     2 21958 13.6237 1.222e-06 ***
## session:repRoleF 169232   84616     2 21958 12.7617 2.890e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We compare these results to their counterparts in “./SR1_CogSci2024.Rmd” where the exact same model was fitted to the raw RT data instead of the residuals. The most important change relative to the earlier model is that the Session factor is no longer significant!
This is evident from the ANOVA table above. By comparison, Session used to be the factor with the highest significance – it had p < 2.2e-16 on the raw data, whereas now we have p = 0.3784.

This is very good news! It indicates that we have succeeded in our goal of statistically equating the baseline RTs across the two sessions. This baseline (which corresponds to the main effect of Session) is confounded with the effect of practice due to an imperfection in our experimental design: the video session was always on Day 1 and the linguistic session on Day 2 for all participants.

We see from the ANOVA table that the substantive conclusions of our earlier analysis apply even stronger to the new version. The repRole factor is highly significant – more so (p = 1.222e-06) than it was in the earlier version (p = 2.200e-06).
The Session-by-Role interaction is also highly significant – again more so (p = 2.890e-06) than before (p = 8.258e-06). These effects will be analyzed further below.

Mixed-Effect Model “main2”

Before getting into this analysis, however, let’s consider a slightly more complex model. It is the same as “main1” except that it also includes a random slope term for session. That is, the formula for the random effects is ~ (1 + session | subjNum) rather than simply ~ (1 | subjNum). Let’s call this model M$main2:

# Fit the model:
M$main2 <- lmerTest::lmer(residRT ~ session * repRoleF + (1 + session | subjNum), 
                          data = D, REML = FALSE)

# ANOVA table:
anova(M$main2)
## Type III Analysis of Variance Table with Satterthwaite's method
##                  Sum Sq Mean Sq NumDF   DenDF F value    Pr(>F)    
## session             545     545     1    15.4  0.0826    0.7777    
## repRoleF         181051   90526     2 21943.5 13.7225 1.107e-06 ***
## session:repRoleF 169459   84730     2 21943.5 12.8438 2.662e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Notice that the F statistic for the main effect of session is driven down almost to zero. This is because of the new random term in the model – the random slope for session. This statistic is reported in the published article (p. 5, right column): “The main effect of Session … [its] associated F statistic was close to zero (F(1, 15) = 0.08, p = 0.78).”

The published article also lists [p. 6, left column] the session:repRoleF interaction: “As predicted, priming was observed only in the video session (S-by-R interaction, F(2, 21943) = 12.8, p < 1E-5).”

Which of the two models (main1 vs main2) is to be preferred? Informal analyses showed that all substantive conclusions are very similar under both models. Let’s use the lmerTest::step() function for model selection.
It must be applied to the more complex model – that is, main2:

M$st_main2 <- lmerTest::step(M$main2)
print(M$st_main2)
## Backward reduced random-effect table:
## 
##                                    Eliminated npar  logLik    AIC   LRT Df
## <none>                                          10 -127829 255678         
## session in (1 + session | subjNum)          0    8 -127870 255756 81.82  2
##                                    Pr(>Chisq)    
## <none>                                           
## session in (1 + session | subjNum)  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Backward reduced fixed-effect table:
## Degrees of freedom method: Satterthwaite 
## 
##                  Eliminated Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## session:repRoleF          0 169459   84730     2 21944  12.844 2.662e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Model found:
## residRT ~ session * repRoleF + (1 + session | subjNum)

The likelihood-ratio test is highly significant (LRT==81.82, p<2.2e-16). (It was LRT==1812.6 in the earlier version.) This indicates (if I interpret it correctly) that the more complex model is to be preferred. Thus, we will use M$main2 as our main model.
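
An equivalent way to reach the same conclusion (not in the original script) is a direct likelihood-ratio comparison of the two fitted models:

## Sketch: compare main1 and main2 head to head; the chi-square statistic
## should mirror the LRT reported by lmerTest::step() above.
anova(M$main1, M$main2)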

Let’s delve into its details:

# Print the coefficients (and other info) of the main2 model:
summary(M$main2)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: residRT ~ session * repRoleF + (1 + session | subjNum)
##    Data: D
## 
##       AIC       BIC    logLik  deviance  df.resid 
##  255678.2  255758.2 -127829.1  255658.2     21963 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.0949 -0.5432 -0.1160  0.3521  7.5618 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  subjNum  (Intercept)   40.36   6.353        
##           sessionling  134.73  11.607   -0.49
##  Residual             6596.90  81.221        
## Number of obs: 21973, groups:  subjNum, 15
## 
## Fixed effects:
##                        Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)             -21.560      1.974    20.934 -10.922 4.20e-10 ***
## sessionling               8.186      3.374    18.732   2.426 0.025541 *  
## repRoleFR                12.062      1.903 21944.075   6.339 2.36e-10 ***
## repRoleFS                10.310      1.912 21943.672   5.393 7.02e-08 ***
## sessionling:repRoleFR   -12.407      2.680 21943.640  -4.629 3.70e-06 ***
## sessionling:repRoleFS    -9.383      2.688 21943.504  -3.490 0.000483 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) sssnln rpRlFR rpRlFS ss:RFR
## sessionling -0.546                            
## repRoleFR   -0.321  0.188                     
## repRoleFS   -0.319  0.187  0.331              
## sssnlng:RFR  0.228 -0.266 -0.710 -0.235       
## sssnlng:RFS  0.227 -0.265 -0.236 -0.711  0.333
# Double-check the interpretation of the fixed coefficients above
print(stats_aggreg)   
##   session repRole repRoleF sum_N_good_trials mean_RT_mean mean_rRT_mean
## 1   video       0        A              5472     489.6894    -21.564231
## 2   video       1        R              2732     502.4767     -9.445816
## 3   video      -1        S              2693     500.5311    -11.196110
## 4    ling       0        A              5511     387.1307    -13.379715
## 5    ling       1        R              2787     386.7756    -13.729580
## 6    ling      -1        S              2778     388.0696    -12.549771

It is instructive to interpret the fixed-effect coefficients. All of them are in milliseconds because that’s the unit of the dependent variable residRT. {The values in curly braces reproduce the corresponding coefficients of the earlier version of model “main2” (as of January 2024).}

Modeling the Linguistic Session Separately

Important: Notice that the interaction effect cancels out the main effect of repRoleF!
Concretely, repRoleFR + sessionling:repRoleFR = 12.1 - 12.4 = -0.3 is close to zero. And repRoleFS + sessionling:repRoleFS = 10.3 - 9.4 = 0.9 is also close to zero. This is how the model reconstructs the near-flat residual-RT profile in the linguistic session: rRT = c(-13.38, -13.73, -12.55).
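
A quick way to double-check this arithmetic against the fitted model is to reconstruct the model-implied linguistic-session profile directly from the fixed effects (a small sketch, not part of the original analysis):

## Sketch: model-implied residual-RT profile for the linguistic session.
## The three values should be nearly equal, i.e., the profile is nearly flat.
with(as.list(fixef(M$main2)), c(
  A = `(Intercept)` + sessionling,
  R = `(Intercept)` + sessionling + repRoleFR + `sessionling:repRoleFR`,
  S = `(Intercept)` + sessionling + repRoleFS + `sessionling:repRoleFS`))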

This suggests that the repRoleF factor has negligible effect in the linguistic session. I don’t know how to formulate this hypothesis in terms of a planned contrast (in the statistical sense) of the mixed-effect model. As a quick-and-dirty substitute, let’s simply fit the model to the data from the linguistic session only:

# Fit the model:
M$main_ling <- lmerTest::lmer(residRT ~ repRoleF + (1 | subjNum), 
                              data = subset(D, session=="ling"), REML = FALSE)

# ANOVA table:
anova(M$main_ling)
## Type III Analysis of Variance Table with Satterthwaite's method
##          Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## repRoleF 2473.1  1236.6     2 11061  0.3564 0.7002

It’s official: repRole has no statistically significant effect for linguistic primes (F = 0.36, p = 0.7) {was F = 0.48, p = 0.6}. This statistic is cited in the CogSci article (p. 6, top left): “The role assignment of linguistic primes had no statistically significant effect… (F(2, 11061) = 0.36, p = 0.6).”

In a section below, we will use the residual standard deviation of this model (sigma(M$main_ling) == 58.9 ms {was sigma(M$main_ling) == 60.1 ms}) to plot the error bars in the linguistic condition.

Modeling the Video Session Separately

The complementary analysis isolates the data with session == "video":

# Fit the model:
M$main_video <- lmerTest::lmer(residRT ~ repRoleF + (1 | subjNum), 
                              data = subset(D, session=="video"), REML = FALSE)

# Print the coefficients (and other info) of the model:
summary(M$main_video)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: residRT ~ repRoleF + (1 | subjNum)
##    Data: subset(D, session == "video")
## 
##      AIC      BIC   logLik deviance df.resid 
## 131071.9 131108.4 -65531.0 131061.9    10892 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2019 -0.5963 -0.1300  0.3886  6.1951 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subjNum  (Intercept)   35.95   5.996  
##  Residual             9775.86  98.873  
## Number of obs: 10897, groups:  subjNum, 15
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)   -21.560      2.045    24.141 -10.540 1.63e-10 ***
## repRoleFR      12.055      2.316 10882.791   5.204 1.98e-07 ***
## repRoleFS      10.305      2.327 10882.495   4.428 9.62e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##           (Intr) rpRlFR
## repRoleFR -0.377       
## repRoleFS -0.375  0.331
# ANOVA table:
anova(M$main_video)
## Type III Analysis of Variance Table with Satterthwaite's method
##          Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## repRoleF 344994  172497     2 10883  17.645 2.235e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Two things are notable here: First, the statistical significance of repRole increased because the pattern is no longer “diluted” by the flat profile of the linguistic-session RTs. Second, the fixed-effect coefficients are identical (within rounding error) to those from the two main models (“main1” and “main2”).

In a section below, we will use the residual standard deviation of this model (sigma(M$main_video) == 98.9 ms {was sigma(M$main_video) == 109.6 ms}) to plot the error bars in the video condition.

Lumping the Two Unambiguous repRoles Together

As a quick-and-dirty method of estimating the significance of the planned contrast of the ambiguous (A) level against the average of the two other levels (R and S), we fit a couple of models involving the binary factor “repRoleAU”:

# Fit the model to all data:
M$main2AU <- lmerTest::lmer(residRT ~ session * repRoleAU + (1 + session | subjNum), 
                          data = D, REML = FALSE)

# ANOVA table:
anova(M$main2AU)
## Type III Analysis of Variance Table with Satterthwaite's method
##                   Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## session             4845    4845     1    15  0.7344     0.405    
## repRoleAU         181025  181025     1 21944 27.4396 1.636e-07 ***
## session:repRoleAU 163201  163201     1 21944 24.7379 6.617e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Fit the model to the video data only:
M$main2AU_video <- lmerTest::lmer(residRT ~ repRoleAU + (1 | subjNum), 
                          data = subset(D, session=="video"), REML = FALSE)

# ANOVA table:
anova(M$main2AU_video)
## Type III Analysis of Variance Table with Satterthwaite's method
##           Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## repRoleAU 340842  340842     1 10883  34.864 3.641e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Print the coefficients of the reduced model:
summary(M$main2AU_video)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: residRT ~ repRoleAU + (1 | subjNum)
##    Data: subset(D, session == "video")
## 
##      AIC      BIC   logLik deviance df.resid 
## 131070.3 131099.5 -65531.2 131062.3    10893 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2108 -0.5974 -0.1307  0.3907  6.1949 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subjNum  (Intercept)   35.94   5.995  
##  Residual             9776.25  98.875  
## Number of obs: 10897, groups:  subjNum, 15
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)   -21.560      2.045    24.143 -10.541 1.63e-10 ***
## repRoleAUU     11.186      1.894 10882.814   5.905 3.64e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr)
## repRoleAUU -0.461

The unambiguous repRole condition (“repeat” and “switch” combined) is 11.2 msec {was 12.0 msec} slower on average than the ambiguous condition in the video session. This difference is highly statistically significant (t = 5.90, p = 3.64e-09 {was t = 5.69, p = 1.28e-08}). This statistic may be cited in the paper (p. 6, top left): “[The RRT] was approx 11 ms slower on unambiguous than ambiguous trials. The difference is highly statistically significant (t(15) = 5.9, p < 1E-8) and indicates a priming effect.”

Note that the F statistic of the ANOVA gives the same significance level, as it should because t = 5.905 = sqrt(34.86) = sqrt(F) {was t = 5.69 = sqrt(32.4) = sqrt(F)}.

Contrasting the Repeat and Switch conditions

The easiest way to obtain some estimate of the statistical significance of the contrast between the “repeat” and “switch” levels is to use the original repRole factor. Its “leftmost” level is -1=="switch" and it will be the “baseline” for the fixed-effect coefficients:

# Fit the model to all data:
M$main2RF <- lmerTest::lmer(residRT ~ session * factor(repRole) +(1+session | subjNum), 
                          data = D, REML = FALSE)

# ANOVA table:
anova(M$main2RF)
## Type III Analysis of Variance Table with Satterthwaite's method
##                         Sum Sq Mean Sq NumDF   DenDF F value    Pr(>F)    
## session                    545     545     1    15.4  0.0826    0.7777    
## factor(repRole)         181051   90526     2 21943.5 13.7225 1.107e-06 ***
## session:factor(repRole) 169459   84730     2 21943.5 12.8438 2.662e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Fit the model to the video data only:
M$main2RF_video <- lmerTest::lmer(residRT ~ factor(repRole) + (1 | subjNum), 
                          data = subset(D, session=="video"), REML = FALSE)

# ANOVA table:
anova(M$main2RF_video)
## Type III Analysis of Variance Table with Satterthwaite's method
##                 Sum Sq Mean Sq NumDF DenDF F value    Pr(>F)    
## factor(repRole) 344994  172497     2 10883  17.645 2.235e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Print the coefficients of the reduced model:
summary(M$main2RF_video)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: residRT ~ factor(repRole) + (1 | subjNum)
##    Data: subset(D, session == "video")
## 
##      AIC      BIC   logLik deviance df.resid 
## 131071.9 131108.4 -65531.0 131061.9    10892 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2019 -0.5963 -0.1300  0.3886  6.1951 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subjNum  (Intercept)   35.95   5.996  
##  Residual             9775.86  98.873  
## Number of obs: 10897, groups:  subjNum, 15
## 
## Fixed effects:
##                   Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)        -11.255      2.455    50.061  -4.584 3.07e-05 ***
## factor(repRole)0   -10.305      2.327 10882.495  -4.428 9.62e-06 ***
## factor(repRole)1     1.750      2.685 10882.303   0.652    0.515    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) fc(R)0
## fctr(rpRl)0 -0.635       
## fctr(rpRl)1 -0.551  0.581

The relevant line is: factor(repRole)1 = 1.750, t = 0.652, p = 0.515 {was factor(repRole)1 = 3.498, t = 1.18, p = 0.24}. The 1.75-ms {was 3.5-ms} difference between these two group means is not statistically significant. This is cited in the paper (p. 6, end of Results section): “There was no evidence of switch costs (t(10882) = 0.65, p = 0.5).”

Calculate Error Bars

Use the residual standard deviations from the reduced models, separately for the two sessions. The RTs on linguistic trials are less variable (and faster) than those on video trials.

Concretely, sigma_video = sigma(M$main_video) and sigma_ling = sigma(M$main_ling). Both models are defined above. sigma() is a generic function that retrieves the standard deviation of the residuals of linear models.
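
For the record, the two residual standard deviations can be inspected directly (a small sketch):

## Residual standard deviations (in ms) of the two session-specific models:
c(video = sigma(M$main_video), ling = sigma(M$main_ling))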

# Define a helper function -- compare with `let` in Lisp:
# (We store the function object in M to avoid cluttering the main workspace.)
M$supply_sigma <- function(session, sigma_video, sigma_ling)
  ifelse(session=="video",  sigma_video,
    ifelse(session=="ling", sigma_ling, 
                            NA            # should never occur
))

# Test:
print(M$supply_sigma(stats_aggreg$session, 1.0, 2.0))
## [1] 1 1 1 2 2 2
# Add a new field to stats_aggreg:
stats_aggreg$sigma_resid <- M$supply_sigma(stats_aggreg$session,
                                           sigma(M$main_video),
                                           sigma(M$main_ling))

# Confidence interval: CI95 = (sigma / sqrt(N)) * qnorm(0.975)
stats_aggreg$CI95 <- with(stats_aggreg,
                          sigma_resid / sqrt(sum_N_good_trials)) *
                     qnorm(0.975)

# Print the augmented data frame:
print(stats_aggreg)
##   session repRole repRoleF sum_N_good_trials mean_RT_mean mean_rRT_mean
## 1   video       0        A              5472     489.6894    -21.564231
## 2   video       1        R              2732     502.4767     -9.445816
## 3   video      -1        S              2693     500.5311    -11.196110
## 4    ling       0        A              5511     387.1307    -13.379715
## 5    ling       1        R              2787     386.7756    -13.729580
## 6    ling      -1        S              2778     388.0696    -12.549771
##   sigma_resid     CI95
## 1    98.87297 2.619706
## 2    98.87297 3.707535
## 3    98.87297 3.734285
## 4    58.90140 1.555101
## 5    58.90140 2.186781
## 6    58.90140 2.190320

Technical note: These are within-subject confidence intervals. They do not include individual differences in overall RT (the random intercept) or individual differences in the main effect of session (cf. Loftus & Masson, 1994, PB&R, 1, 476-490). Technically this simplistic calculation is not quite right, but it’s good enough.
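
For comparison, a subject-centered (Cousineau-style) calculation could look like the following sketch. It is not used for the published figure, and the variable names are hypothetical.

## Sketch: within-subject CIs via subject-centering (no Morey correction).
## Center each participant's cell means on that participant's session mean:
ws_centered <- ddply(stats_by_repRole, .(subjNum, session), function(D1) {
  data.frame(repRoleF = D1$repRoleF,
             rRT_centered = D1$rRT_mean - mean(D1$rRT_mean))
})

## t-based 95% CI across the 15 participants for each session x repRoleF cell:
ws_CI <- ddply(ws_centered, .(session, repRoleF), function(D1) {
  data.frame(N_sbj = nrow(D1),
             CI95_ws = qt(0.975, df = nrow(D1) - 1) *
                       sd(D1$rRT_centered) / sqrt(nrow(D1)))
})
print(ws_CI)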

Plot the RT Profiles with Errorbars

We re-plot the group-level RT figure from above, but this time add errorbars. Section 5.1 of the ggplot2 book illustrates various geoms for depicting uncertainty. Two geoms are most relevant: geom_errorbar()+geom_point() and geom_pointrange():

## Make a ggplot object of the aggregate residual-RT profiles (in ms)
gp$rRT_aggreg2 <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_rRT_mean, 
      ymin = mean_rRT_mean - CI95, ymax = mean_rRT_mean + CI95,   
      group=session, color=session, shape=session)) + 
  geom_line()

## Render with geom_pointrange()
print(gp$rRT_aggreg2 + geom_pointrange())

## Render the same plot with geom_errorbar() + geom_point()
print(gp$rRT_aggreg2 + geom_errorbar(width=0.05) + geom_point())

I [Alex] like the errorbar option (with a small width setting) a little better than the pointrange option. I will use geom_errorbar() for the production-quality plot below.

Make Publication-Quality Plot

We plot the session-specific rRT profiles side by side:

## Make a ggplot object of the aggregate residual-RT profiles (in ms)
gp$rRT_aggreg3 <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_rRT_mean, 
      ymin = mean_rRT_mean - CI95, ymax = mean_rRT_mean + CI95,
      group = session)) + 
  facet_wrap(~session, nrow=1) +   # Now we do NOT need scales="free_y".
  geom_line() + geom_point() + 
  geom_errorbar(width=0.1)

## Render the plot with default ylims
print(gp$rRT_aggreg3 )

The earlier version of the script had a chunk here that used the ggh4x package, which provides tools for setting the scale parameters separately for each individual facet. This complication does not arise here because the residual RTs are comparable in both sessions.

Now let’s work on the annotations, axis titles and X labels:

## Add axis titles and X labels:
gp$rRT_aggreg3 <- gp$rRT_aggreg3 +
   labs(x = "Relational-role assignment",
        y = "Residual response time [ms]") +
   scale_x_discrete(label = c("ambiguous", "repeat", "switch"))

## Render it with the default theme:
print(gp$rRT_aggreg3)

To customize the facet labels, we need an extra argument to facet_wrap. We also need to make a separate object – “session_labels” in this case – so that its “names” attribute can be set. See https://www.datanovia.com/en/blog/how-to-change-ggplot-facet-labels/ for details. The documentation of the labeller() function did not provide a worked-out example!

## Customize the facet labels:
# (We store session_labels in gp to avoid cluttering the main workspace.)
gp$session_labels <- c("Video prime", "Sentence prime")
names(gp$session_labels) <- levels(stats_aggreg$session)   # Sic!

## Make a ggplot object of the aggregate residual-RT profiles (in ms)
gp$rRT_aggreg4 <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_rRT_mean, 
      ymin = mean_rRT_mean - CI95, ymax = mean_rRT_mean + CI95,
      group = session)) + 
  geom_line() + geom_point() + geom_errorbar(width=0.1) +
  # facet_wrap(~session, nrow=1) +   # default facet labels
  facet_wrap(~session, nrow=1,
             labeller = labeller(session = gp$session_labels)) +
  labs(x = "Relational-role assignment",
       y = "Residual response time [ms]") +
  scale_x_discrete(label = c("ambiguous", "repeat", "switch"))

## Render it with the default theme:
print(gp$rRT_aggreg4)

Tilt the X labels because otherwise they get squashed together when the plot is rendered at width=3.3in (see next section). Tilting in ggplot is explained in Section 17.4.2 of the ggplot2 book.

# Tilted X labels
print(gp$rRT_aggreg4 + 
        theme(axis.text.x = element_text(angle = -30, vjust = 1, hjust = 0)))

Alternatively, we can “dodge” the X labels as explained in Section 10.3.2:

# Dodged X labels
print(gp$rRT_aggreg4 + 
        guides(x = guide_axis(n.dodge = 2)))

Neither option is great, but dodging seems less bad to me. So, let’s go with it. The book says that guides() is a shorthand way to set “guide” arguments of one or more scales. The same result can be achieved via scale_x_discrete().

All in all, this is our production-quality ggplot object:

## Customize the facet labels:
# (We store session_labels in gp to avoid cluttering the main workspace.)
gp$session_labels <- c("Video prime", "Sentence prime")
names(gp$session_labels) <- levels(stats_aggreg$session)   # Sic!

## Make a ggplot object of the aggregate residual-RT profiles (in ms)
gp$rRT_FINAL <- ggplot(stats_aggreg, 
  aes(x = repRoleF, y = mean_rRT_mean, 
      ymin = mean_rRT_mean - CI95, ymax = mean_rRT_mean + CI95,
      group = session)) + 
  geom_line() + geom_point() + geom_errorbar(width=0.1) +
  # facet_wrap(~session, nrow=1) +   # default facet labels
  facet_wrap(~session, nrow=1,
             labeller = labeller(session = gp$session_labels)) +
  # facetted_pos_scales(y = list(scale_y_continuous(limits = c(480,510)), 
  #                              scale_y_continuous(limits = c(380,410)))) +
  labs(x = "Relational-role assignment",
       y = "Residual response time [ms]") +
  scale_x_discrete(label = c("ambiguous", "repeat", "switch"),
                   guide = guide_axis(n.dodge = 2))

## Render it with the default theme:
print(gp$rRT_FINAL)

Export the Production-Quality Plot

We save the exact same plot 4 times:

We set the width to 3.3 inches, which is one column width according to the CogSci 2024 document template. The height is set to 3.8 inches {was 4.0}, which works nicely with the 3.3in width.

# Default color scheme, "png" format
ggsave("fig/SR1_CogSci2024_final_plot.png",
       plot = gp$rRT_FINAL, width=3.3, height=3.8, units="in")

# Default color scheme, "pdf" format
ggsave("fig/SR1_CogSci2024_final_plot.pdf", 
       plot = gp$rRT_FINAL, width=3.3, height=3.8, units="in")

# BW color scheme, "png" format
ggsave("fig/SR1_CogSci2024_final_plot_BW.png", 
       plot = gp$rRT_FINAL + theme_bw(),
       width=3.3, height=3.8, units="in")

# BW color scheme, "pdf" format
ggsave("fig/SR1_CogSci2024_final_plot_BW.pdf", 
       plot = gp$rRT_FINAL + theme_bw(),
       width=3.3, height=3.8, units="in")

Prepare to Plot the Group Learning Curves

The revised, final paper will have one more figure than the originally submitted manuscript. We need to plot the learning curves in order to show the reader the practice effects and the main effect of Session. The new dependent variable – residRT – is the residual around these learning curves.

The script “./SR1_learning_effect1.Rmd” plotted several versions of these learning curves. Our goal here is to make a publication-quality version of the last figure there.

We need to aggregate the data across subjects and also across “miniblocks”. The following two code cells borrow heavily from “./SR1_learning_effect1.Rmd”. Note that we use the unfiltered data in SR1_data rather than the filtered data in D. If we were to use the latter, the learning-curve plots would acquire a jagged appearance due to the irregular dropping of trials from different subjects at different times. For purposes of plotting the data (as opposed to analyzing them statistically), it is OK to use the unfiltered data set. Moreover, it is what was actually used to fit the learning curves themselves.

## Average the RTs across individual participants:
SR1_groupRT_data <- ddply(SR1_data, .(session, trialNum), function(D1) {
  data.frame(
    RT_mean  = mean(D1$clippedRT),
    RT_lcurve = mean(D1$RT_lcurve),
    N_sbj = nrow(D1)
)})
stopifnot(is.constant(SR1_groupRT_data$N_sbj))

## Global parameter
#  This was PAR$miniblock_size = 8 in "./SR1_learning_effect1.Rmd".
#  Here we'll try gp$miniblock_size = 6, which should show a little more scatter.
gp$miniblock_size = 6

## Helper function:
convert_trialNum_to_trialMblk <- function(trialNum, miniblock_size = gp$miniblock_size)
  (floor((trialNum-1) / miniblock_size) * miniblock_size + floor(miniblock_size/2))
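
## Quick illustration of the mapping (sketch), with miniblock_size = 6:
## trials 1-6 collapse to 3, trials 7-12 to 9, trials 13-18 to 15, and so on.
# convert_trialNum_to_trialMblk(c(1, 6, 7, 12, 13))   # -> 3 3 9 9 15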

## Define a new variable trialMblk that collapses each miniblock into a single point:
SR1_groupRT_data$trialMblk <- convert_trialNum_to_trialMblk(SR1_groupRT_data$trialNum)

## Average the RTs further across the so-called miniblocks:
SR1_groupRT_Mblk <- ddply(SR1_groupRT_data, .(session, trialMblk), function(D1) {
  data.frame(
    RT_mean  = mean(D1$RT_mean),
    RT_lcurve = mean(D1$RT_lcurve),
    N_obs = nrow(D1) * constant(D1$N_sbj, verifyp=FALSE)   # number of observations
)})
stopifnot(constant(SR1_groupRT_Mblk$N_obs) == 15*gp$miniblock_size)  # N_subj==15
## Scatter-plot the group-averaged RTs:
gp$RT_lcurve1 <- ggplot(SR1_groupRT_Mblk, 
  aes(x = trialMblk, y = RT_mean, shape=session)) + 
  geom_point(color="#AAAAAA") +
  geom_line(aes(y = RT_lcurve, color=session), linewidth=1)
print(gp$RT_lcurve1)

Make Publication-Quality Plot of the Group Learning Curves

We plot the two sessions side-by-side rather than one under the other. Customize the axis titles (see https://ggplot2-book.org/annotations) and Y limits. We customize the facet labels using the same gp$session_labels as we did for the RT profile plots above.

## Plot with facet_wrap(~session):
gp$RT_lcurve2 <- ggplot(SR1_groupRT_Mblk,
  aes(x = trialMblk, y = RT_mean)) +       # shape=session
  geom_point(color="#AAAAAA") +
  geom_line(aes(y = RT_lcurve), color="black", linewidth=1) +
  facet_wrap(~session, nrow=1,
             labeller = labeller(session = gp$session_labels)) +
   labs(x = "Trials since start of session",
        y = "Response time [ms]") +
  ylim(320, 780)

## Render the plot with default theme:
print(gp$RT_lcurve2 )

Export the Production-Quality Plot

We save the exact same plot 4 times:

We use the same sizes (w=3.3, h=3.8) as for the RT profile plot above.

# Default color scheme, "png" format
ggsave("fig/SR1_CogSci2024_final_lcurves.png",
       plot = gp$RT_lcurve2, width=3.3, height=3.8, units="in")

# Default color scheme, "pdf" format
ggsave("fig/SR1_CogSci2024_final_lcurves.pdf", 
       plot = gp$RT_lcurve2, width=3.3, height=3.8, units="in")

# BW color scheme, "png" format
ggsave("fig/SR1_CogSci2024_final_lcurves_BW.png", 
       plot = gp$RT_lcurve2 + theme_bw(),
       width=3.3, height=3.8, units="in")

# BW color scheme, "pdf" format
ggsave("fig/SR1_CogSci2024_final_lcurves_BW.pdf", 
       plot = gp$RT_lcurve2 + theme_bw(),
       width=3.3, height=3.8, units="in")

End of script “SR1_CogSci2024_final.Rmd”.