I'm assuming that the data are assumed to be paired as ordered in the two inputs in the non-formula signature of the t.test function t.test(x, y, paired=T) (x and y in the documentation).
However, how does the function associate observations in the two groups as pairs in the formula signature t.test(values groups, df, paired=T)? by sequence?
I make a dataframe with paired before and after data in the spreadsheet below. Then, I formatted it twice in long form (as would be appropriate for the t.test function): 1) List the "before" group in the order of observation, followed by the "after" group. 2) List every piece of information in random order.
On both sets of data, I conduct a paired t-test. It's rather clear that in two groups, the t.test function is making some kind of assumption.
library(tidyverse)
df = data.frame(
observation = 1:20,
before = rnorm(20, 10, 2),
after = rnorm(20, 10.2, 2.3)
)
print.data.frame(df)
#> observation before after
#> 1 1 10.930157 11.818216
#> 2 2 10.870749 10.699232
#> 3 3 9.603120 14.384484
#> 4 4 9.615291 8.777045
#> 5 5 6.714043 9.506421
#> 6 6 9.063117 5.574887
#> 7 7 8.152260 10.357455
#> 8 8 8.256237 8.660646
#> 9 9 12.641977 7.511760
#> 10 10 11.010290 9.391047
#> 11 11 12.545197 9.072856
#> 12 12 12.606526 9.110687
#> 13 13 8.659088 12.445071
#> 14 14 8.958959 10.783168
#> 15 15 11.635443 6.926802
#> 16 16 6.922437 12.419453
#> 17 17 10.326176 10.416757
#> 18 18 7.680960 9.836573
#> 19 19 9.458365 8.083777
#> 20 20 7.235837 12.094290
df_long =
df %>%
pivot_longer(
cols = c("before", "after"),
names_to = "time",
values_to="fabulousness"
)
print.data.frame(df_long)
#> observation time fabulousness
#> 1 1 before 10.930157
#> 2 1 after 11.818216
#> 3 2 before 10.870749
#> 4 2 after 10.699232
#> 5 3 before 9.603120
#> 6 3 after 14.384484
#> 7 4 before 9.615291
#> 8 4 after 8.777045
#> 9 5 before 6.714043
#> 10 5 after 9.506421
#> 11 6 before 9.063117
#> 12 6 after 5.574887
#> 13 7 before 8.152260
#> 14 7 after 10.357455
#> 15 8 before 8.256237
#> 16 8 after 8.660646
#> 17 9 before 12.641977
#> 18 9 after 7.511760
#> 19 10 before 11.010290
#> 20 10 after 9.391047
#> 21 11 before 12.545197
#> 22 11 after 9.072856
#> 23 12 before 12.606526
#> 24 12 after 9.110687
#> 25 13 before 8.659088
#> 26 13 after 12.445071
#> 27 14 before 8.958959
#> 28 14 after 10.783168
#> 29 15 before 11.635443
#> 30 15 after 6.926802
#> 31 16 before 6.922437
#> 32 16 after 12.419453
#> 33 17 before 10.326176
#> 34 17 after 10.416757
#> 35 18 before 7.680960
#> 36 18 after 9.836573
#> 37 19 before 9.458365
#> 38 19 after 8.083777
#> 39 20 before 7.235837
#> 40 20 after 12.094290
df_long_not_paired =
df_long %>%
arrange(fabulousness)
print.data.frame(df_long_not_paired)
#> observation time fabulousness
#> 1 6 after 5.574887
#> 2 5 before 6.714043
#> 3 16 before 6.922437
#> 4 15 after 6.926802
#> 5 20 before 7.235837
#> 6 9 after 7.511760
#> 7 18 before 7.680960
#> 8 19 after 8.083777
#> 9 7 before 8.152260
#> 10 8 before 8.256237
#> 11 13 before 8.659088
#> 12 8 after 8.660646
#> 13 4 after 8.777045
#> 14 14 before 8.958959
#> 15 6 before 9.063117
#> 16 11 after 9.072856
#> 17 12 after 9.110687
#> 18 10 after 9.391047
#> 19 19 before 9.458365
#> 20 5 after 9.506421
#> 21 3 before 9.603120
#> 22 4 before 9.615291
#> 23 18 after 9.836573
#> 24 17 before 10.326176
#> 25 7 after 10.357455
#> 26 17 after 10.416757
#> 27 2 after 10.699232
#> 28 14 after 10.783168
#> 29 2 before 10.870749
#> 30 1 before 10.930157
#> 31 10 before 11.010290
#> 32 15 before 11.635443
#> 33 1 after 11.818216
#> 34 20 after 12.094290
#> 35 16 after 12.419453
#> 36 13 after 12.445071
#> 37 11 before 12.545197
#> 38 12 before 12.606526
#> 39 9 before 12.641977
#> 40 3 after 14.384484
df_long_paired =
df_long %>%
arrange(desc(time))
print.data.frame(df_long_paired)
#> observation time fabulousness
#> 1 1 before 10.930157
#> 2 2 before 10.870749
#> 3 3 before 9.603120
#> 4 4 before 9.615291
#> 5 5 before 6.714043
#> 6 6 before 9.063117
#> 7 7 before 8.152260
#> 8 8 before 8.256237
#> 9 9 before 12.641977
#> 10 10 before 11.010290
#> 11 11 before 12.545197
#> 12 12 before 12.606526
#> 13 13 before 8.659088
#> 14 14 before 8.958959
#> 15 15 before 11.635443
#> 16 16 before 6.922437
#> 17 17 before 10.326176
#> 18 18 before 7.680960
#> 19 19 before 9.458365
#> 20 20 before 7.235837
#> 21 1 after 11.818216
#> 22 2 after 10.699232
#> 23 3 after 14.384484
#> 24 4 after 8.777045
#> 25 5 after 9.506421
#> 26 6 after 5.574887
#> 27 7 after 10.357455
#> 28 8 after 8.660646
#> 29 9 after 7.511760
#> 30 10 after 9.391047
#> 31 11 after 9.072856
#> 32 12 after 9.110687
#> 33 13 after 12.445071
#> 34 14 after 10.783168
#> 35 15 after 6.926802
#> 36 16 after 12.419453
#> 37 17 after 10.416757
#> 38 18 after 9.836573
#> 39 19 after 8.083777
#> 40 20 after 12.094290
df_long_not_paired %>%
t.test(fabulousness ~ time, ., paired=T)
#>
#> Paired t-test
#>
#> data: fabulousness by time
#> t = 2.0289, df = 19, p-value = 0.05672
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -0.007878376 0.506318062
#> sample estimates:
#> mean of the differences
#> 0.2492198
df_long_paired %>%
t.test(fabulousness ~ time, ., paired=T)
#>
#> Paired t-test
#>
#> data: fabulousness by time
#> t = 0.3422, df = 19, p-value = 0.736
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.27509 1.77353
#> sample estimates:
#> mean of the differences
#> 0.2492198