JOURNAL OF MATHEMATICAL PSYCHOLOGY

Expressions for Bayesian confidence of drift diffusion observers in fluctuating stimuli tasks
Calder-Travis J, Bogacz R and Yeung N
We introduce a new approach to modelling decision confidence, with the aim of enabling computationally cheap predictions while taking into account, and thereby exploiting, trial-by-trial variability in stochastically fluctuating stimuli. Using the framework of the drift diffusion model of decision making, along with time-dependent thresholds and the idea of a Bayesian confidence readout, we derive expressions for the probability distribution over confidence reports. In line with current models of confidence, the derivations allow for the accumulation of "pipeline" evidence that has been received but not processed by the time of response, the effect of drift rate variability, and metacognitive noise. The expressions are valid for stimuli that change over the course of a trial with normally-distributed fluctuations in the evidence they provide. A number of approximations are made to arrive at the final expressions, and we test all approximations via simulation. The derived expressions contain only a small number of standard functions and need to be evaluated only once per trial, making trial-by-trial modelling of confidence data in stochastically fluctuating stimuli tasks more feasible. We conclude by using the expressions to gain insight into the confidence of optimal observers and into empirically observed patterns.
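For intuition about the Bayesian readout the abstract refers to, the following minimal Python sketch simulates a drift diffusion observer with a fixed, known drift magnitude and unit diffusion noise; all parameter values are illustrative, and the sketch omits the paper's extensions (time-dependent thresholds, pipeline evidence, drift rate variability, metacognitive noise).

```python
import numpy as np

rng = np.random.default_rng(0)
mu, dt, a = 0.5, 1e-3, 1.0     # drift magnitude, time step, decision threshold

def trial():
    """One diffusion trial with drift +mu; returns (correct, confidence)."""
    x = 0.0
    while abs(x) < a:
        x += mu * dt + rng.normal(0.0, np.sqrt(dt))
    # Bayesian readout for drift +/- mu and unit diffusion:
    # P(chosen side correct | x) = 1 / (1 + exp(-2 * mu * |x|))
    conf = 1.0 / (1.0 + np.exp(-2.0 * mu * abs(x)))
    return x > 0, conf

outcomes = [trial() for _ in range(2000)]
accuracy = np.mean([c for c, _ in outcomes])
mean_conf = np.mean([q for _, q in outcomes])
print(f"accuracy {accuracy:.3f}, mean confidence {mean_conf:.3f}")  # roughly equal
```

With a fixed threshold and known drift, the readout at threshold crossing is (up to boundary overshoot) a constant that matches choice accuracy, which is the calibration property that Bayesian confidence models build on.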
Experiment-based calibration in psychology: Optimal design considerations
Bach DR
Psychological theories are often formulated at the level of latent, not directly observable, variables. Empirical measurement of latent variables ought to be valid. Classical psychometric validity indices can be difficult to apply in experimental contexts. A complementary validity index, termed retrodictive validity, is the correlation of theory-derived predicted scores with actually measured scores in specifically designed calibration experiments. In the current note, I analyse how calibration experiments can be designed to maximise the information garnered and, specifically, how to minimise the sample variance of retrodictive validity estimators. First, I harness asymptotic limits to analytically derive different distribution features that impact on estimator variance. Then, I numerically simulate various distributions with combinations of feature values. This allows recommendations to be derived for the distribution of predicted values, and for resource investment, in calibration experiments. Finally, I highlight cases in which a misspecified theory is particularly problematic.
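As a toy illustration of the estimator-variance question the note addresses, the following Python sketch (normally distributed predicted scores and a hypothetical true correlation of 0.7, neither taken from the paper) compares Monte Carlo estimates of the sampling variability of a retrodictive validity estimate against the standard asymptotic approximation.

```python
import numpy as np

rng = np.random.default_rng(1)

def corr_sd(n, rho, reps=5000):
    """Monte Carlo SD of the retrodictive validity estimate for n data points."""
    r = np.empty(reps)
    for i in range(reps):
        pred = rng.normal(size=n)                                     # predicted scores
        meas = rho * pred + np.sqrt(1 - rho**2) * rng.normal(size=n)  # measured scores
        r[i] = np.corrcoef(pred, meas)[0, 1]
    return r.std()

for n in (10, 20, 40, 80):
    # the asymptotic SD of a sample correlation is (1 - rho^2) / sqrt(n)
    print(n, f"MC: {corr_sd(n, 0.7):.3f}",
          f"asymptotic: {(1 - 0.7**2) / np.sqrt(n):.3f}")
```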
How do people build up visual memory representations from sensory evidence? Revisiting two classic models of choice
Robinson MM, DeStefano IC, Vul E and Brady TF
In many decision tasks, we have a set of alternative choices and are faced with the problem of how to use our latent beliefs and preferences about each alternative to make a single choice. Cognitive and decision models typically presume that beliefs and preferences are distilled to a scalar latent strength for each alternative, but it is also critical to model how people use these latent strengths to choose a single alternative. Most models follow one of two traditions to establish this link. Modern psychophysics and memory researchers make use of signal detection theory, assuming that latent strengths are perturbed by noise, and the highest resulting signal is selected. By contrast, many modern decision theoretic modeling and machine learning approaches use the softmax function (which is based on Luce's choice axiom; Luce, 1959) to give some weight to non-maximal-strength alternatives. Despite the prominence of these two theories of choice, current approaches rarely address the connection between them, and the choice of one or the other appears more motivated by the tradition in the relevant literature than by theoretical or empirical reasons to prefer one theory to the other. The goal of the current work is to revisit this topic by elucidating which of these two models provides a better characterization of latent processes in multi-alternative decision tasks, with a particular focus on memory tasks. In a set of visual memory experiments, we show that, within the same experimental design, the softmax parameter varies with the number of alternatives, whereas the corresponding parameter of the signal-detection model is stable. Together, our findings indicate that replacing softmax with signal-detection link models would yield more generalizable predictions across changes in task structure. More ambitiously, the invariance of signal detection model parameters across different tasks suggests that the parametric assumptions of these models may be more than just a mathematical convenience, but reflect something real about human decision-making.
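The two linking rules under comparison can be stated in a few lines. The following Python sketch, with hypothetical latent strengths and parameter values, computes the probability of choosing the target alternative under each rule.

```python
import numpy as np

rng = np.random.default_rng(2)
strengths = np.array([1.0, 0.2, 0.2, 0.2])   # hypothetical target + 3 lures

def p_softmax(s, beta=2.0):
    """Luce/softmax rule: choice weight scales with exp(beta * strength)."""
    w = np.exp(beta * s)
    return w[0] / w.sum()

def p_sdt(s, sigma=1.0, reps=100000):
    """Signal-detection rule: choose the max of noise-perturbed strengths."""
    x = s + sigma * rng.normal(size=(reps, s.size))
    return np.mean(x.argmax(axis=1) == 0)

print(f"softmax P(choose target) = {p_softmax(strengths):.3f}")
print(f"SDT-max P(choose target) = {p_sdt(strengths):.3f}")
```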
A Statistical Foundation for Derived Attention
Paskewitz S and Jones M
According to the theory of derived attention, organisms attend to cues with strong associations. Prior work has shown that, combined with a Rescorla-Wagner style learning mechanism, derived attention explains phenomena such as learned predictiveness, inattention to blocked cues, and value-based salience. We introduce a Bayesian derived attention model that explains a wider array of results than previous models and gives further insight into the principle of derived attention. Our approach combines Bayesian linear regression with the assumption that the associations of any cue with various outcomes share the same prior variance, which can be thought of as the inherent importance of that cue. The new model simultaneously estimates cue-outcome associations and prior variance through approximate Bayesian learning. A significant cue will develop large associations, leading the model to estimate a high prior variance and hence develop larger associations from that cue to novel outcomes. This provides a normative, statistical explanation for derived attention. Through simulation, we show that this Bayesian derived attention model not only explains the same phenomena as previous versions, but also retrospective revaluation. It also makes a novel prediction: inattention after backward blocking. We hope that further development of the Bayesian derived attention model will shed light on the complex relationship between uncertainty and predictiveness effects on attention.
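The following Python sketch is a schematic of the general idea only, not the authors' update equations: associations are learned with a Kalman-style Bayesian regression update, and each cue's prior variance (its "inherent importance") is re-estimated from the magnitude of its current associations, so strong cues come to learn faster about novel outcomes. All settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_cues, n_out, noise2 = 3, 2, 1.0
true_w = np.array([[1.0, 0.0],     # cue 0 predicts outcome 0
                   [0.0, 0.0],     # cues 1 and 2 predict nothing
                   [0.0, 0.0]])
tau2 = np.ones(n_cues)             # per-cue prior variance ("inherent importance")
w = np.zeros((n_cues, n_out))      # estimated cue-outcome associations

for _ in range(500):
    x = rng.integers(0, 2, size=n_cues).astype(float)   # which cues are present
    y = x @ true_w                                      # observed outcomes
    gain = (x * tau2) / (x @ (x * tau2) + noise2)       # Kalman-style learning rates
    w += np.outer(gain, y - x @ w)                      # error-driven update
    tau2 = 0.99 * tau2 + 0.01 * (w**2).mean(axis=1)     # re-estimate importance

print("estimated importance:", np.round(tau2, 3))       # cue 0 ends up largest
```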
Adaptive Design Optimization for a Mnemonic Similarity Task
Villarreal M, Stark CEL and Lee MD
The Mnemonic Similarity Task (MST: Stark et al., 2019) is a modified recognition memory task designed to place strong demand on pattern separation. The sensitivity and reliability of the MST make it an extremely valuable tool in clinical settings, where it has been used to identify hippocampal dysfunction associated with healthy aging, dementia, schizophrenia, depression, and other disorders. As with any test used in a clinical setting, it is especially important for the MST to be administered as efficiently as possible. We apply adaptive design optimization methods (Lesmes et al., 2015; Myung et al., 2013) to optimize the presentation of test stimuli in accordance with previous responses. This optimization is based on a signal-detection model of an individual's memory capabilities and decision-making processes. We demonstrate that the adaptive design optimization approach generally reduces the number of test stimuli needed to provide these measures.
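The logic of adaptive design optimization can be sketched compactly. In the hypothetical Python example below (not the MST implementation), a grid posterior is maintained over a memory-strength parameter of an equal-variance signal-detection model, and the next lure similarity is chosen to maximize the expected information gain over the two possible responses.

```python
import numpy as np
from scipy.stats import norm

d_grid = np.linspace(0.0, 4.0, 81)                 # grid over memory strength
post = np.full(d_grid.size, 1.0 / d_grid.size)     # flat prior
candidates = np.linspace(0.2, 1.0, 5)              # candidate lure similarities

def p_old(d, s):
    """P('old' response) under equal-variance SDT with the criterion at d/2."""
    return norm.sf(d / 2.0 - s * d)

def expected_info_gain(s):
    """Expected KL divergence from the current posterior over both responses."""
    p1 = p_old(d_grid, s)
    gain = 0.0
    for p_resp_given_d in (p1, 1.0 - p1):
        p_resp = np.sum(post * p_resp_given_d)          # marginal response prob.
        new_post = post * p_resp_given_d / p_resp
        gain += p_resp * np.sum(new_post * np.log((new_post + 1e-12) / post))
    return gain

best = max(candidates, key=expected_info_gain)
print("present next: lure similarity", best)
```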
A step-by-step tutorial on active inference and its application to empirical data
Smith R, Friston KJ and Whyte CJ
The active inference framework, and in particular its recent formulation as a partially observable Markov decision process (POMDP), has gained increasing popularity in recent years as a useful approach for modeling neurocognitive processes. This framework is highly general and flexible in its ability to be customized to model any cognitive process, as well as simulate predicted neuronal responses based on its accompanying neural process theory. It also affords both simulation experiments for proof of principle and behavioral modeling for empirical studies. However, there are limited resources that explain how to build and run these models in practice, which limits their widespread use. Most introductions assume a technical background in programming, mathematics, and machine learning. In this paper we offer a step-by-step tutorial on how to build POMDPs, run simulations using standard MATLAB routines, and fit these models to empirical data. We assume a minimal background in programming and mathematics, thoroughly explain all equations, and provide exemplar scripts that can be customized for both theoretical and empirical studies. Our goal is to provide the reader with the requisite background knowledge and practical tools to apply active inference to their own research. We also provide optional technical sections and multiple appendices, which offer the interested reader additional technical details. This tutorial should provide the reader with all the tools necessary to use these models and to follow emerging advances in active inference research.
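As a taste of the POMDP machinery the tutorial builds up (the tutorial itself works with standard MATLAB routines), here is a minimal Python sketch of a single hidden-state inference step, using the standard active inference notation of a likelihood matrix A and a prior over hidden states D; the numbers are illustrative.

```python
import numpy as np

def softmax(v):
    v = np.exp(v - v.max())
    return v / v.sum()

A = np.array([[0.9, 0.2],      # P(observation | hidden state)
              [0.1, 0.8]])
D = np.array([0.5, 0.5])       # prior over hidden states

o = 1                          # index of the observed outcome
# approximate posterior over states: softmax of log-prior plus log-likelihood
s_post = softmax(np.log(D + 1e-16) + np.log(A[o, :] + 1e-16))
print("posterior over states:", np.round(s_post, 3))
```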
Development of a novel computational model for the Balloon Analogue Risk Task: The Exponential-Weight Mean-Variance Model
Park H, Yang J, Vassileva J and Ahn WY
The Balloon Analogue Risk Task (BART) is a popular task used to measure risk-taking behavior. To identify cognitive processes associated with choice behavior on the BART, a few computational models have been proposed. However, the extant models either fail to capture choice patterns on the BART or show poor parameter recovery performance. Here, we propose a novel computational model, the exponential-weight mean-variance (EWMV) model, which addresses the limitations of existing models. By using multiple model comparison methods, including post hoc model-fit criteria and parameter recovery, we showed that the EWMV model outperforms the existing models. In addition, we applied the EWMV model to BART data from healthy controls and substance-using populations (patients with past opiate and stimulant dependence). The results suggest that (1) the EWMV model addresses the limitations of existing models and (2) heroin-dependent individuals show reduced risk preference compared with other groups, which may have significant clinical implications.
A Modified Sequential Probability Ratio Test
Pramanik S, Johnson VE and Bhattacharya A
We describe a modified sequential probability ratio test that can be used to reduce the average sample size required to perform statistical hypothesis tests at specified levels of significance and power. Examples are provided for z tests, t tests, and tests of binomial success probabilities. A description of a software package to implement the test designs is provided. We compare the sample sizes required in fixed design tests conducted at 5% significance levels to the average sample sizes required in sequential tests conducted at 0.5% significance levels, and we find that the two sample sizes are approximately equal.
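For background, the classic Wald sequential probability ratio test that the modified procedure builds on can be written in a few lines; the Python sketch below tests a binomial success probability at the 0.5% significance level mentioned in the abstract (the paper's modification to the stopping rule is not reproduced here, and the alternative hypothesis is a hypothetical choice).

```python
import numpy as np

rng = np.random.default_rng(4)
p0, p1 = 0.5, 0.7                    # H0 and H1 success probabilities
alpha, beta = 0.005, 0.2             # significance level and 1 - power
upper = np.log((1 - beta) / alpha)   # Wald's stopping boundaries
lower = np.log(beta / (1 - alpha))

def sprt(p_true):
    """Sample Bernoulli observations until the log likelihood ratio stops."""
    llr, n = 0.0, 0
    while lower < llr < upper:
        success = rng.random() < p_true
        n += 1
        llr += np.log(p1 / p0) if success else np.log((1 - p1) / (1 - p0))
    return ("reject H0" if llr >= upper else "accept H0"), n

print(sprt(0.7))   # typically rejects H0 after a modest number of observations
```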
A Study of Individual Differences in Categorization with Redundancy
Shamloo F and Hélie S
Humans and other animals are constantly learning new categories and making categorization decisions in their everyday life. However, different individuals may focus on different information when learning categories, which can impact the category representation and the information that is used when making categorization decisions. This article used computational modeling of behavioral data to take a closer look at this possibility in the context of a categorization task with redundancy. Iterative decision bound modeling and drift diffusion models were used to detect individual differences in human categorization performance. The results show that participants differ in terms of what stimulus features they learned and how they use the learned features. For example, while some participants only learn one stimulus dimension (which is sufficient for perfect accuracy), others learn both stimulus dimensions (which is not required for perfect accuracy). Among participants who learned both dimensions, some used both dimensions, while others show error and RT patterns suggesting the use of only one of the dimensions. The diversity of obtained results is problematic for existing categorization models and suggests that each categorization model may be able to account for the performance of some but not all participants.
Active inference on discrete state-spaces: A synthesis
Da Costa L, Parr T, Sajid N, Veselic S, Neacsu V and Friston K
Active inference is a normative principle underwriting perception, action, planning, decision-making and learning in biological or artificial agents. From its inception, its associated process theory has grown to incorporate complex generative models, enabling simulation of a wide range of complex behaviours. Due to successive developments in active inference, it is often difficult to see how its underlying principle relates to process theories and practical implementation. In this paper, we try to bridge this gap by providing a complete mathematical synthesis of active inference on discrete state-space models. This technical summary provides an overview of the theory, derives neuronal dynamics from first principles and relates these dynamics to biological processes. Furthermore, this paper provides a fundamental building block needed to understand active inference for mixed generative models, allowing continuous sensations to inform discrete representations. This paper may be used in several ways: as a guide to outstanding research challenges, as a practical guide on how to implement active inference to simulate experimental behaviour, or as a pointer towards various in-silico neurophysiological responses that may be used to make empirical predictions.
A Note on Decomposition of Sources of Variability in Perceptual Decision-making
Kang I, Ratcliff R and Voskuilen C
Information processing underlying human perceptual decision-making is inherently noisy, and identifying the sources of this noise is important for understanding processing. Ratcliff, Voskuilen, and McKoon (2018) examined results from five experiments using a double-pass procedure in which stimuli were repeated, typically about a hundred trials later. Greater than chance agreement between repeated tests provided evidence for trial-to-trial variability from external sources of noise. They applied the diffusion model to estimate the quality of evidence driving the decision process (drift rate) and the variability (standard deviation) in drift rate across trials. This variability can be decomposed into random (internal) and systematic (external) components by comparing the double-pass accuracy and agreement with the model predictions. In this note, we provide an additional analysis of the double-pass experiments using the linear ballistic accumulator (LBA) model. The LBA model does not have within-trial variability and thus captures all variability in processing with its across-trial variability parameters. The LBA analysis of the double-pass data provides model-based evidence of external variability in a decision process, which is consistent with Ratcliff et al.'s result. This demonstrates that across-trial variability is required to model perceptual decision-making. The LBA model provides measures of systematic and random variability, as the diffusion model does. However, due to the lack of within-trial variability, the LBA model estimated the random component as a larger proportion of across-trial total variability than did the diffusion model.
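The double-pass logic is easy to simulate. In the Python sketch below (all parameters hypothetical), each trial's drift has a systematic component that repeats across the two passes and a random component drawn fresh; agreement between passes in excess of the chance level implied by accuracy is the signature of external (systematic) variability.

```python
import numpy as np

rng = np.random.default_rng(5)
n, mu, sd_sys, sd_rnd = 20000, 0.6, 0.4, 0.4

systematic = rng.normal(0.0, sd_sys, n)    # external: repeats with the stimulus

def pass_choices():
    drift = mu + systematic + rng.normal(0.0, sd_rnd, n)  # internal part is fresh
    return (drift + rng.normal(0.0, 1.0, n)) > 0          # schematic decision rule

a, b = pass_choices(), pass_choices()
acc = a.mean()
chance = acc**2 + (1 - acc)**2             # agreement if passes were independent
print(f"accuracy {acc:.3f}, agreement {(a == b).mean():.3f}, chance {chance:.3f}")
```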
Dissecting EXIT
Paskewitz S and Jones M
Kruschke's EXIT model (Kruschke, 2001b) has been very successful in explaining a variety of learning phenomena by means of selective attention. In particular, EXIT produces learned predictiveness effects (Le Pelley & McLaren, 2003), the inverse base rate effect (Kruschke, 1996; Medin & Edelson, 1988), inattention after blocking (Beesley & Le Pelley, 2011; Kruschke & Blair, 2000), differential cue use across the stimulus space (Aha & Goldstone, 1992) and conditional learned predictiveness effects (Uengoer, Lachnit, Lotz, Koenig, & Pearce, 2013). We dissect EXIT into its component mechanisms (error-driven learning, selective attention, attentional competition, rapid attention shifts and exemplar mediation of attention) and test whether simplified versions of EXIT can explain the same experimental results as the full model. Most phenomena can be explained by either rapid attention shifts or attentional competition, without the need for combining them as in EXIT. There is little evidence for exemplar mediation of attention when people learn linearly separable category structures (e.g. Kruschke & Blair, 2000; Le Pelley & McLaren, 2003); whether or not it is needed for non-linear categories depends on stimulus representation (Aha & Goldstone, 1992; Uengoer et al., 2013). On the whole, we believe that attentional competition, embodied in a model which we dub CompAct, offers the simplest explanation for the experimental results we examine.
Fictional narrative as a variational Bayesian method for estimating social dispositions in large groups
Carney J, Robertson C and Dávid-Barrett T
Modelling intentions in large groups is cognitively costly. Not only must first-order beliefs be tracked ('what does A think about X?'), but also beliefs about beliefs ('what does A think about B's belief concerning X?'). Thus, linear increases in group size impose non-linear increases in cognitive processing resources. At the same time, however, large groups offer coordination advantages relative to smaller groups due to specialisation and increased productive capacity. How might these competing demands be reconciled? We propose that fictional narrative can be understood as a cultural tool for dealing with large groups. Specifically, we argue that prototypical action roles that are removed from real-world interactions function as interpretive priors in a form of variational Bayesian inference, such that they allow estimates to be made of unknown social motives. We offer support for this claim in two ways: firstly, by evaluating the existing literature on narrative cognition and showing where it anticipates a variational model; and secondly, by simulation, where we show that an agent-based model naturally converges on a set of social categories that resemble narrative across a wide range of starting points.
Multinomial Models with Linear Inequality Constraints: Overview and Improvements of Computational Methods for Bayesian Inference
Heck DW and Davis-Stober CP
Many psychological theories can be operationalized as linear inequality constraints on the parameters of multinomial distributions (e.g., discrete choice analysis). These constraints can be described in two equivalent ways: either as the solution set to a system of linear inequalities or as the convex hull of a set of extremal points (vertices). For both representations, we describe a general Gibbs sampler for drawing posterior samples in order to carry out Bayesian analyses. We also summarize alternative sampling methods for estimating Bayes factors for these model representations using the encompassing Bayes factor method. We introduce the R package multinomineq, which provides an easily accessible interface to a computationally efficient implementation of these techniques.
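The encompassing Bayes factor idea can be illustrated with a three-category toy example in Python; the data and ordering constraint below are hypothetical, and the multinomineq package provides efficient versions of this computation for general inequality and vertex representations.

```python
import numpy as np

rng = np.random.default_rng(6)
counts = np.array([35, 20, 10])            # hypothetical multinomial data
# hypothetical theory: category probabilities are ordered p1 >= p2 >= p3

def frac_ordered(samples):
    """Fraction of samples satisfying the inequality constraints."""
    return np.mean((samples[:, 0] >= samples[:, 1]) &
                   (samples[:, 1] >= samples[:, 2]))

prior = rng.dirichlet([1, 1, 1], 100000)        # encompassing (unconstrained) prior
posterior = rng.dirichlet(1 + counts, 100000)   # encompassing posterior
bf = frac_ordered(posterior) / frac_ordered(prior)
print(f"Bayes factor, constrained vs unconstrained: {bf:.2f}")
```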
A tutorial on Dirichlet Process mixture modeling
Li Y, Schofield E and Gönen M
Bayesian nonparametric (BNP) models are becoming increasingly important in psychology, both as theoretical models of cognition and as analytic tools. However, existing tutorials tend to be at a level of abstraction largely impenetrable to non-technicians. This tutorial aims to help beginners understand key concepts by working through important but often omitted derivations carefully and explicitly, with a focus on linking the mathematics with a practical computational solution for a Dirichlet Process Mixture Model (DPMM), one of the most widely used BNP methods. Abstract concepts are made explicit and concrete to non-technical readers by working through the theory that gives rise to them. A publicly accessible computer program written in the statistical language R is explained line-by-line to help readers understand the computation algorithm. The algorithm is also linked to a construction method called the Chinese Restaurant Process in an accessible tutorial in this journal (Gershman & Blei, 2012). The overall goals are to help readers understand more fully the theory and application so that they may apply BNP methods in their own work and leverage the technical details in this tutorial to develop novel methods.
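As a pointer to the construction linked in the tutorial, here is a minimal Python sketch of the Chinese Restaurant Process, the sequential seating scheme that induces the Dirichlet Process clustering prior (alpha is the concentration parameter; the values are illustrative).

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, n = 1.0, 50
tables = [1]                              # occupancy counts; first customer seated

for _ in range(n - 1):
    # join table k with probability proportional to its occupancy,
    # or open a new table with probability proportional to alpha
    probs = np.array(tables + [alpha], dtype=float)
    probs /= probs.sum()
    k = rng.choice(len(probs), p=probs)
    if k == len(tables):
        tables.append(1)                  # new table = new mixture component
    else:
        tables[k] += 1

print("cluster sizes:", sorted(tables, reverse=True))
```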
Audiovisual detection at different intensities and delays
Chandrasekaran C, Blurton SP and Gondan M
In the redundant signals task, two target stimuli are associated with the same response. If both targets are presented together, redundancy gains are observed, as compared with single-target presentation. Different models explain these redundancy gains, including race and coactivation models (e.g., the Wiener diffusion superposition model, Schwarz, 1994, Journal of Mathematical Psychology, and the Ornstein-Uhlenbeck diffusion superposition model, Diederich, 1995, Journal of Mathematical Psychology). In the present study, two monkeys performed a simple detection task with auditory, visual and audiovisual stimuli of different intensities and onset asynchronies. In its basic form, a Wiener diffusion superposition model provided only a poor description of the observed data, especially of the detection rate (i.e., accuracy or hit rate) for low stimulus intensity. We expanded the model in two ways, by (A) adding a temporal deadline, that is, restricting the evidence accumulation process to a stopping time, and (B) adding a second "nogo" barrier representing target absence. We present closed-form solutions for the mean absorption times and absorption probabilities for a Wiener diffusion process with a drift towards a single barrier in the presence of a temporal deadline (A), and numerically improved solutions for the two-barrier model (B). The best description of the data was obtained from the deadline model, which substantially outperformed the two-barrier approach.
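Variant (A) is straightforward to explore by simulation. The Python sketch below (illustrative parameters, not fitted to the monkey data) estimates the detection rate of a single-barrier Wiener process under a temporal deadline, showing how a weak drift, standing in for low stimulus intensity, depresses the hit rate.

```python
import numpy as np

rng = np.random.default_rng(8)
dt, barrier, deadline = 1e-3, 1.0, 2.0
steps = int(deadline / dt)

def detection_rate(drift, reps=5000):
    """P(accumulator reaches the barrier before the deadline elapses)."""
    hits = 0
    for _ in range(reps):
        path = np.cumsum(drift * dt + rng.normal(0.0, np.sqrt(dt), steps))
        hits += (path >= barrier).any()
    return hits / reps

for drift in (0.2, 0.8, 1.6):              # stand-ins for stimulus intensity
    print(f"drift {drift}: detection rate {detection_rate(drift):.3f}")
```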
A General Approach to Prior Transformation
Segert S and Davis-Stober CP
We present a general method for setting prior distributions in Bayesian models where parameters of interest are re-parameterized via a functional relationship. We generalize the results of Heck and Wagenmakers (2016) by considering the case where the dimension of the auxiliary parameter space does not equal that of the primary parameter space. We present numerical methods for carrying out prior specification for statistical models that do not admit closed-form solutions. Taken together, these results provide researchers with a more complete set of tools for setting prior distributions that could be applied to many cognitive and decision making models. We illustrate our approach by reanalyzing data under the Selective Integration model of Tsetsos et al. (2016). We find, via a Bayes factor analysis, that the selective integration model with all four parameters generally outperforms both the three-parameter variant (omitting early cognitive noise) and the variant with the gating parameter fixed at 1 (omitting selective gating), as well as an unconstrained competitor model. By contrast, Tsetsos et al. found the three-parameter variant to be the best performing in a BIC analysis (in the absence of a competitor). Finally, we include a pedagogical treatment of the mathematical tools necessary to formulate our results, including a simple "toy" example that illustrates our more general points.
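The core issue is that a prior placed on auxiliary parameters induces a different, often unintended, prior on the derived parameter. The Python sketch below (a hypothetical reparameterization theta = w1 * w2, not one from the paper) makes the point by Monte Carlo pushforward.

```python
import numpy as np

rng = np.random.default_rng(9)
w1 = rng.uniform(0.0, 1.0, 100000)        # auxiliary parameters, uniform priors
w2 = rng.uniform(0.0, 1.0, 100000)
theta = w1 * w2                           # derived parameter of interest

density, _ = np.histogram(theta, bins=10, range=(0, 1), density=True)
print(np.round(density, 2))   # far from flat: uniform priors on w1 and w2
                              # do NOT induce a uniform prior on theta
```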
A hierarchical Bayesian state trace analysis for assessing monotonicity while factoring out subject, item, and trial level dependencies
Sadil P, Cowell RA and Huber DE
State trace analyses assess the latent dimensionality of a cognitive process by asking whether the means of two dependent variables conform to a monotonic function across a set of conditions. Using an assumption of independence between the measures, recently proposed statistical tests address bivariate measurement error, allowing both frequentist and Bayesian analyses of monotonicity (e.g., Davis-Stober, Morey, Gretton, & Heathcote, 2016; Kalish, Dunn, Burdakov, & Sysoev, 2016). However, inference can be biased by unacknowledged dependencies between measures, particularly when the data are insufficient to overwhelm an incorrect prior assumption of independence. To address this limitation, we developed a hierarchical Bayesian model that explicitly models the separate roles of subject, item, and trial-level dependencies between two measures. Assessment of monotonicity is then performed by fitting separate models that do or do not allow a non-monotonic relation between the condition effects (i.e., same vs. different rank orders). The Widely Applicable Information Criterion (WAIC) and Pseudo Bayesian Model Averaging, both cross-validation measures of model fit, are used for model comparison, providing an inferential conclusion regarding the dimensionality of the latent psychological space. We validated this new state trace analysis technique using model recovery simulation studies, which assumed different ground truths regarding monotonicity and the direction/magnitude of the subject- and trial-level dependence. We also provide an example application of this new technique to a visual object learning study that compared performance on a visual retrieval task (forced choice part recognition) versus a verbal retrieval task (cued recall).
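The monotonicity question at the heart of state trace analysis reduces to whether two sets of condition means share a rank order; the hierarchical model assesses this while accounting for measurement error and the dependencies described above. A bare-bones Python illustration of the rank-order check, with hypothetical condition means:

```python
import numpy as np

# two dependent variables are consistent with a single latent dimension
# when their condition means share one rank order across conditions
cond_means_a = np.array([0.31, 0.48, 0.62, 0.70])   # hypothetical task A means
cond_means_b = np.array([0.22, 0.35, 0.51, 0.58])   # hypothetical task B means

monotone = np.array_equal(np.argsort(cond_means_a), np.argsort(cond_means_b))
print("consistent with one latent dimension:", monotone)
```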
Optimizing sequential decisions in the drift-diffusion model
Nguyen KP, Josić K and Kilpatrick ZP
To make decisions, organisms often accumulate information across multiple timescales. However, most experimental and modeling studies of decision-making focus on sequences of independent trials. On the other hand, natural environments are characterized by long temporal correlations, and evidence used to make a present choice is often relevant to future decisions. To understand decision-making under these conditions, we analyze how a model ideal observer accumulates evidence to freely make choices across a sequence of correlated trials. We use principles of probabilistic inference to show that an ideal observer incorporates information obtained on one trial as an initial bias on the next. This bias decreases the time, but not the accuracy, of the next decision. Furthermore, in finite sequences of trials the rate of reward is maximized when the observer deliberates longer for early decisions, but responds more quickly towards the end of the sequence. Our model also explains experimentally observed patterns in decision times and choices, thus providing a mathematically principled foundation for evidence-accumulation models of sequential decisions.
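The key mechanism is that an observer accumulating log posterior odds and deciding at fixed belief thresholds can carry the prior from a correlated trial forward as a starting-point bias; because the threshold fixes the belief held at decision time, the bias saves time without costing accuracy. The Python sketch below (illustrative parameters; a two-state environment that repeats with probability 0.8) shows this signature.

```python
import numpy as np

rng = np.random.default_rng(10)
theta, mu, dt, h = 1.5, 1.0, 1e-3, 0.8   # belief threshold, drift, step, P(repeat)

def run(use_bias, n_trials=2000):
    correct, total_time, state, y0 = 0, 0.0, 1, 0.0
    for _ in range(n_trials):
        state = state if rng.random() < h else -state        # correlated trials
        y, t = (y0 if use_bias else 0.0), 0.0
        while abs(y) < theta:                                # accumulate log odds
            y += state * 2 * mu**2 * dt + 2 * mu * rng.normal(0.0, np.sqrt(dt))
            t += dt
        choice = 1 if y > 0 else -1
        correct += (choice == state)
        total_time += t
        y0 = choice * np.log(h / (1 - h))                    # prior for next trial
    return correct / n_trials, total_time / n_trials

print("unbiased start:   ", run(False))
print("carried-over bias:", run(True))   # similar accuracy, shorter decisions
```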
Extended Formulations for Order Polytopes through Network Flows
Davis-Stober CP, Doignon JP, Fiorini S, Glineur F and Regenwetter M
Mathematical psychology has a long tradition of modeling probabilistic choice via distribution-free random utility models and associated random preference models. For such models, the predicted choice probabilities often form a bounded and convex polyhedral set, or polytope. Polyhedral combinatorics have thus played a key role in studying the mathematical structure of these models. However, standard methods for characterizing the polytopes of such models are subject to a combinatorial explosion in complexity as the number of choice alternatives increases. Specifically, this is the case for random preference models based on linear, weak, semi- and interval orders. For these, a complete, linear description of the polytope is currently known only for, at most, 5-8 choice alternatives. We leverage the method of extended formulations to break through those boundaries. For each of the four types of preferences, we build an appropriate network, and show that the associated network flow polytope provides an extended formulation of the polytope of the choice model. This extended formulation has a simple linear description that is more parsimonious than descriptions obtained by standard methods for large numbers of choice alternatives. The result is a computationally less demanding way of testing the probabilistic choice model on data. We sketch how the latter interfaces with recent developments in contemporary statistics.
Rotational-symmetry in a 3D scene and its 2D image
Sawada T and Zaidi Q
A 3D shape of an object is -fold rotational-symmetric if the shape is invariant for 360/ degree rotations about an axis. Human observers are sensitive to the 2D rotational-symmetry of a retinal image, but they are less sensitive than they are to 2D mirror-symmetry, which involves invariance to reflection across an axis. Note that perception of the mirror-symmetry of a 2D image and a 3D shape has been well studied, where it has been shown that observers are sensitive to the mirror-symmetry of a 3D shape, and that 3D mirror-symmetry plays a critical role in the veridical perception of a 3D shape from its 2D image. On the other hand, the perception of rotational-symmetry, especially 3D rotational-symmetry, has received very little study. In this paper, we derive the geometrical properties of 2D and 3D rotational-symmetry and compare them to the geometrical properties of mirror-symmetry. Then, we discuss perceptual differences between mirror- and rotational symmetry based on this comparison. We found that rotational-symmetry has many geometrical properties that are similar to the geometrical properties of mirror-symmetry, but note that the 2D projection of a 3D rotational-symmetrical shape is more complex computationally than the 2D projection of a 3D mirror-symmetrical shape. This computational difficulty could make the human visual system less sensitive to the rotational-symmetry of a 3D shape than its mirror-symmetry.