Peax is a novel feature-based technique for interactive visual pattern search in sequential data. Visually searching for patterns by similarity is often challenging because of the large search space, the visual complexity of patterns, and the subjectivity of the user's perception of similarity. For example, in genomics, researchers try to link patterns in multivariate sequential data to fundamental cellular or pathogenic processes, but a lack of ground truth and high variance make automatic pattern detection unreliable. We have developed a convolutional autoencoder for unsupervised representation learning of regions in sequential data that captures more visual details of complex patterns than existing similarity measures. Using this learned representation as features of the sequential data, our visual query system enables interactive, feedback-driven adjustments of the pattern search to adapt to the user's perceived similarity. As users label regions as either matching their search target or not, a random forest classifier learns to weigh the importance of different dimensions of the learned representation. We employ an active learning strategy to focus the labeling process on regions that will improve the classifier in subsequent training rounds.
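The following is a minimal sketch of the feedback-driven loop described above: latent features stand in for the output of a pre-trained convolutional autoencoder's encoder, a random forest is fit on the user's labels, and an uncertainty-based heuristic picks the next regions to label. The names (`train_classifier`, `suggest_regions`) and the random latent matrix are illustrative assumptions, not Peax's actual API or sampling strategy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for latent representations of genomic windows that would be
# produced by the convolutional autoencoder's encoder (n_windows x latent_dim).
latent = rng.normal(size=(10_000, 12))

# Window index -> 1 (matches the search target) or 0 (does not),
# seeded here with two hypothetical user labels.
labels = {0: 1, 1: 0}


def train_classifier(latent, labels):
    """Fit a random forest on the user's labels over the latent features."""
    idx = np.fromiter(labels.keys(), dtype=int)
    y = np.fromiter(labels.values(), dtype=int)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(latent[idx], y)
    return clf


def suggest_regions(clf, latent, labels, n=5):
    """Active-learning step: return the unlabeled windows the classifier is
    most uncertain about (predicted probability closest to 0.5)."""
    unlabeled = np.setdiff1d(np.arange(len(latent)), list(labels.keys()))
    proba = clf.predict_proba(latent[unlabeled])[:, 1]
    uncertainty = np.abs(proba - 0.5)
    return unlabeled[np.argsort(uncertainty)[:n]]


clf = train_classifier(latent, labels)
print("Regions to label next:", suggest_regions(clf, latent, labels))
```

In each round, the user labels the suggested regions, the classifier is retrained, and the ranking of candidate regions is updated accordingly.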
The application's source code, the study code, and six pre-trained autoencoders for 3 kb, 12 kb, and 120 kb windows of DNase-seq and histone mark ChIP-seq data are available at: