Projects

See my code on GitHub. I code primarily in Python and R, and I am also proficient in C, C++, and Julia.



Biology

I have built or contributed to several projects that have dramatically sped up my computational biology workflow.

FAST-iCLIP

This is a complete pipeline to analyze RNA-protein interactions at nucleotide resolution.

Metagene Maker

This is a tool to visualize genomic information across user-defined genomic regions. It is extremely flexible and can be used to visualize, among other things:
  • ChIP-seq tracks of transcription factors
  • Transcription measured by GRO-seq, RNA-seq, NET-seq, CAGE, etc.
  • CLIP-seq RNA-protein interaction data
  • Replication timing
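
To illustrate the general idea behind a metagene plot, here is a minimal Python sketch. It is not Metagene Maker's actual code: the function name `metagene_profile`, the `(start, end, strand)` region format, and the equal-width binning scheme are all assumptions made for illustration. Each region is rescaled to the same number of bins, minus-strand regions are flipped so every region reads 5' to 3', and the binned signals are averaged.

```python
import numpy as np

def metagene_profile(signal, regions, n_bins=10):
    """Average a per-base signal over regions rescaled to n_bins bins.

    Hypothetical helper for illustration only.
    signal  : 1-D sequence of per-base values for one chromosome
    regions : iterable of (start, end, strand), 0-based, end-exclusive
    """
    rows = []
    for start, end, strand in regions:
        values = np.asarray(signal[start:end], dtype=float)
        if len(values) < n_bins:
            continue  # skip regions too short to fill every bin
        # Split the region into n_bins nearly equal chunks and average each one
        binned = np.array([chunk.mean() for chunk in np.array_split(values, n_bins)])
        if strand == "-":
            binned = binned[::-1]  # orient minus-strand regions 5' to 3'
        rows.append(binned)
    if not rows:
        raise ValueError("no region is long enough to bin")
    # Mean across regions gives the metagene profile
    return np.mean(rows, axis=0)

# Toy example: a ramp signal with one plus-strand and one minus-strand region
signal = np.arange(20.0)
regions = [(0, 10, "+"), (10, 20, "-")]
profile = metagene_profile(signal, regions, n_bins=5)
```

A real tool would read the signal from alignment or coverage files and handle many chromosomes; this sketch only shows the rescale-and-average step.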


Data explorations


SherlockNet

We implemented convolutional neural networks to automatically tag and caption 1 million images scanned from 15th- through 19th-century books in the British Library 1M collection. Our work won the 2016 British Library Labs Competition. Check out our web portal here.

Pairwise epistasis

Epistasis, the way multiple mutations jointly affect protein folding and fitness, is still not well understood. We used simulations, experimental data, and evolutionary findings to study the role of pairwise interactions in epistasis. Our writeup is here.