Interactive visual pattern search in sequential data using unsupervised deep representation learning

Peax is a novel feature-based technique for interactive visual pattern search in sequential data. Visually searching for patterns by similarity is often challenging because of the large search space, the visual complexity of patterns, and the user's perception of similarity. For example, in genomics, researchers try to link patterns in multivariate sequential data to fundamental cellular or pathogenic processes, but a lack of ground truth and high variance make automatic pattern detection unreliable. We have developed a convolutional autoencoder for unsupervised representation learning of regions in sequential data that can capture more visual details of complex patterns than existing similarity measures. Using this learned representation as features of the sequential data, our visual query system enables interactive, feedback-driven adjustments of the pattern search to adapt to the users' perceived similarity. While users label regions as either matching their search target or not, a random forest classifier learns to weigh the importance of different dimensions of the learned representation. We employ an active learning strategy to focus the labeling process on regions that will improve the classifier in subsequent training.
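The feedback loop described above can be illustrated with a minimal sketch. This is not the Peax implementation. It assumes scikit-learn and NumPy are installed, replaces the autoencoder's learned representation with random synthetic vectors, and uses plain uncertainty sampling as a stand-in for the active learning strategy, whose details are not given here. The function name `train_and_suggest`, the array names, and the rule that only the first dimension decides a match are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_and_suggest(features, labeled_idx, labels, n_suggest=5, seed=0):
    """Fit a random forest on the labeled regions and suggest regions to label next."""
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(features[labeled_idx], labels)

    # Predicted probability that each region matches the search target (class 1).
    pos_col = list(clf.classes_).index(1)
    proba = clf.predict_proba(features)[:, pos_col]

    # Uncertainty sampling (an assumption, not necessarily the Peax strategy):
    # suggest the unlabeled regions whose probability is closest to 0.5.
    unlabeled = np.setdiff1d(np.arange(len(features)), labeled_idx)
    order = np.argsort(np.abs(proba[unlabeled] - 0.5))
    suggestions = unlabeled[order[:n_suggest]]
    return clf, proba, suggestions


# Synthetic stand-in for the learned representation: 500 regions, 10 dimensions.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 10))

# Hypothetical "perceived similarity": a region matches if dimension 0 is positive.
truth = (features[:, 0] > 0).astype(int)

# Pretend the user has labeled the first 40 regions.
labeled_idx = np.arange(40)
clf, proba, suggestions = train_and_suggest(features, labeled_idx, truth[labeled_idx])

# Per-dimension weights learned by the forest, and the regions to label next.
print(clf.feature_importances_.round(2))
print(suggestions)
```

In an interactive session, the suggested regions would be shown to the user, their labels appended to the training set, and the function called again. The `feature_importances_` attribute is one way to inspect how the forest weighs the dimensions of the representation.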

Screencast & Presentation

Video Introduction
Slides from BioIT World 2019

Preprint

Peax: Interactive Visual Pattern Search in Sequential Data Using Unsupervised Deep Representation Learning
Fritz Lekschas, Brant Peterson, Daniel Haehn, Eric Ma, Nils Gehlenborg, and Hanspeter Pfister
bioRxiv, 2019. doi: 10.1101/597518

Code & Data

The application's source code, the study code, and six pre-trained autoencoders for 3, 12, and 120 kb windows of DNase-seq and histone mark ChIP-seq data are available at:

Authors

  1. Fritz Lekschas

    Harvard John A. Paulson School of Engineering and Applied Sciences

  2. Brant Peterson

    Novartis Institutes for BioMedical Research

  3. Daniel Haehn

    Harvard John A. Paulson School of Engineering and Applied Sciences

  4. Eric Ma

    Novartis Institutes for BioMedical Research

  5. Nils Gehlenborg

    Harvard Medical School

  6. Hanspeter Pfister

    Harvard John A. Paulson School of Engineering and Applied Sciences