Paper ID: 2201.00889

Biased Hypothesis Formation From Projection Pursuit

John Patterson, Chris Avery, Tyler Grear, Donald J. Jacobs

The effect of bias on hypothesis formation is characterized for an automated data-driven projection pursuit neural network to extract and select features for binary classification of data streams. This intelligent exploratory process partitions a complete vector state space into disjoint subspaces to create working hypotheses quantified by similarities and differences observed between two groups of labeled data streams. Data streams are typically time sequenced, and may exhibit complex spatio-temporal patterns. For example, given atomic trajectories from molecular dynamics simulation, the machine's task is to quantify dynamical mechanisms that promote function by comparing protein mutants, some known to function while others are nonfunctional. Utilizing synthetic two-dimensional molecules that mimic the dynamics of functional and nonfunctional proteins, biases are identified and controlled in both the machine learning model and selected training data under different contexts. The refinement of a working hypothesis converges to a statistically robust multivariate perception of the data based on a context-dependent perspective. Including diverse perspectives during data exploration enhances interpretability of the multivariate characterization of similarities and differences.

Submitted: Jan 3, 2022