- I propose that this concept be called "unexpected surprise" rather than "strictly confused":
"Strictly confused" suggests logical incoherence.
"Unexpected surprise" can be motivated the following way: let $$~$ s(d) = \textrm{surprise}(d \mid H) = - \log \Pr (d \mid H) $~$$ be how surprising data $~$d$~$ is on hypothesis $~$H$~$. Then one is "strictly confused" if the observed $~$s$~$ is larger than than one would expect assuming a $~$H$~$ holds.
This terminology is nice because the average of $~$s$~$ under $~$H$~$ is the entropy, or expected surprise, of $~$(d \mid H)$~$. It also connects with Bayes, since $$~$\textrm{log-likelihood} = -\textrm{surprise}$~$$ is the evidential support that $~$d$~$ gives $~$H$~$.
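To make this concrete, here is a minimal sketch (a toy example of my own, not from the page) that computes $~$s(d)$~$ for a binomial hypothesis and compares it to a Monte Carlo estimate of the expected surprise; the model and numbers are my choices:

```python
import numpy as np
from scipy.stats import binom

# Toy hypothesis (my choice): H says the observation is d ~ Binomial(n=100, p=0.5).
n, p = 100, 0.5

def surprise(d):
    """s(d) = -log Pr(d | H), in nats."""
    return -binom.logpmf(d, n, p)

# Expected surprise under H is the entropy of (d | H); estimate it by sampling
# (scipy's binom.entropy(n, p) gives it in closed form as well).
rng = np.random.default_rng(0)
expected_s = surprise(rng.binomial(n, p, size=100_000)).mean()

d_obs = 71
print(f"s(d_obs) = {surprise(d_obs):.2f} nats vs E[s | H] = {expected_s:.2f} nats")
# "Unexpectedly surprised": the observed surprise exceeds what H itself predicts.
print("unexpectedly surprised:", surprise(d_obs) > expected_s)
```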
The section on "Distinction from frequentist p-values" is, I think, both technically incorrect and a bit uncharitable.
It's technically incorrect because the following isn't true:
> The classical frequentist test for rejecting the null hypothesis involves considering the probability assigned to particular 'obvious'-seeming partitions of the data, and asking if we ended up inside a low-probability partition.
Actually, the classical frequentist test involves specifying an obvious-seeming measure of surprise $~$t(d)$~$ and checking whether the observed $~$t$~$ is higher than one would expect under $~$H$~$ (a toy version below). This is even more arbitrary than the partition story quoted above, since $~$t$~$ can be nearly any function of the data.
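Here is a minimal sketch of that recipe in the same toy binomial setting as above (the hypothesis, the numbers, and the statistic $~$t$~$ are all arbitrary choices of mine; the comparison is done in the usual tail-probability form):

```python
import numpy as np
from scipy.stats import binom

# Same toy H as above: d ~ Binomial(n=100, p=0.5); observe d_obs = 71.
n, p, d_obs = 100, 0.5, 71

def t(d):
    """An 'obvious'-seeming but arbitrary statistic: distance from the mean."""
    return np.abs(d - n * p)

outcomes = np.arange(n + 1)
# p-value: the probability, under H, of a t at least as extreme as observed.
p_value = binom.pmf(outcomes, n, p)[t(outcomes) >= t(d_obs)].sum()
print(f"p = {p_value:.3g}")  # reject H at the 5% level iff p < 0.05
```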
On the other hand, it's uncharitable because it's widely acknowledged that one should try to choose $~$t$~$ to be sufficient, which is exactly the condition that the partition induced by $~$t$~$ is "compatible" with $~$\Pr(d \mid H)$~$ for different $~$H$~$, in the sense that $$~$\Pr(H \mid d) = \Pr(H \mid t(d))$~$$ for all the $~$H$~$ under consideration.
Clearly $~$s$~$ is sufficient in this sense. But there might be simpler functions of $~$d$~$ that do the job too ("minimal sufficient statistics").
Note that $~$t$~$ being sufficient doesn't make it non-arbitrary, as it may not be a monotone function of $~$s$~$.
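A numerical check of the sufficiency condition above, for coin flips with the head-count as $~$t(d)$~$ (the grid, prior, and data are hypothetical choices of mine):

```python
import numpy as np
from scipy.stats import binom

# Hypotheses: a grid of coin biases theta with a uniform prior over the grid.
thetas = np.linspace(0.01, 0.99, 99)
prior = np.full_like(thetas, 1 / len(thetas))

d = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # observed flips
t = d.sum()                              # sufficient statistic: number of heads

# Posterior from the full sequence d (product of Bernoulli likelihoods)...
lik_d = thetas**t * (1 - thetas)**(len(d) - t)
post_d = lik_d * prior / (lik_d * prior).sum()

# ...and from t(d) alone (binomial likelihood).
lik_t = binom.pmf(t, len(d), thetas)
post_t = lik_t * prior / (lik_t * prior).sum()

# The binomial coefficient cancels under normalization, so Pr(H | d) = Pr(H | t(d)):
print(np.allclose(post_d, post_t))  # True
```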
Finally, I think that this concept is clearly "extra-Bayesian", in the sense that it's about non-probabilistic ("Knightian") uncertainty over $~$H$~$, and one is considering probabilities attached to unobserved $~$d$~$ (i.e., not conditioning on the observed $~$d$~$).
I don't think being "extra-Bayesian" in this sense is problematic. But I think it should be owned up to.
Actually, "unexpected surprise" reveals a nice connection between Bayesian and sampling-based uncertainty intervals:
- To get an (HPD) credible interval, exclude those $~$H$~$ that are relatively surprised by the observed $~$d$~$ (or which are a priori surprising).
- To get a (nice) confidence interval, exclude those $~$H$~$ that are "unexpectedly surprised" by $~$d$~$.
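To make the parallel concrete, here is a sketch of both constructions for the toy binomial model from above (the grid, the 95%/5% levels, the flat prior, and the observed count are all my choices; the "confidence" set inverts a test that uses $~$s$~$ itself as the statistic):

```python
import numpy as np
from scipy.stats import binom

n, d_obs = 100, 71
thetas = np.linspace(0.001, 0.999, 999)
prior = np.full_like(thetas, 1 / len(thetas))  # flat prior over the grid

# ~95% HPD credible set: keep the highest-posterior thetas until 95% mass.
post = binom.pmf(d_obs, n, thetas) * prior
post /= post.sum()
order = np.argsort(post)[::-1]
hpd = thetas[order[np.cumsum(post[order]) <= 0.95]]

# 95% confidence set: keep theta unless d_obs is "unexpectedly surprising"
# under it, i.e. unless Pr(s(D) >= s(d_obs) | theta) < 0.05.
outcomes = np.arange(n + 1)
conf = []
for th in thetas:
    s = -binom.logpmf(outcomes, n, th)  # surprise of every possible outcome
    p_val = binom.pmf(outcomes, n, th)[s >= s[d_obs]].sum()
    if p_val >= 0.05:
        conf.append(th)
conf = np.array(conf)

print(f"credible:   [{hpd.min():.3f}, {hpd.max():.3f}]")
print(f"confidence: [{conf.min():.3f}, {conf.max():.3f}]")
```

For this model the two sets come out similar; the difference is in which surprise does the excluding: surprise relative to the other hypotheses and the prior (credible) versus surprise relative to each $~$H$~$'s own expectations (confidence).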