The bioinformatics chat is a podcast about computational biology, bioinformatics, and next-generation sequencing.
The bioinformatics chat is produced by Roman Cheplyaka.
April 29, 2019
In this episode we hear from Jacob Schreiber about his algorithm, Avocado.
Avocado uses deep tensor factorization to break a three-dimensional tensor of
epigenomic data into three orthogonal dimensions corresponding to cell types,
assay types, and genomic loci. Avocado can extract a low-dimensional,
information-rich latent representation from the wealth of experimental data
from projects like the Roadmap Epigenomics Consortium and ENCODE. This
representation allows you to impute genome-wide epigenomics experiments that
have not yet been performed.
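To make the imputation idea concrete, here is a minimal sketch, not Avocado itself. It assumes a plain multilinear factorization (one small latent vector per cell type, assay, and locus, multiplied together) in place of Avocado's deep model, and it runs on a tiny synthetic tensor. The function names, tensor sizes, and hyperparameters are all made up for illustration.

```python
import random

def fit_factorization(observed, shape, rank=1, lr=0.02, epochs=3000, seed=0):
    """Fit latent vectors for each cell type, assay and locus by SGD.

    observed maps (cell, assay, locus) index triples to measured values;
    triples that are absent are treated as experiments never performed.
    """
    rng = random.Random(seed)
    # factors[axis][index] is a list of `rank` latent values.
    factors = [[[rng.uniform(0.5, 1.0) for _ in range(rank)]
                for _ in range(n)] for n in shape]
    entries = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j, k), y in entries:
            c, a, loc = factors[0][i], factors[1][j], factors[2][k]
            err = sum(c[r] * a[r] * loc[r] for r in range(rank)) - y
            for r in range(rank):
                # Gradients of 0.5 * err**2 with respect to each factor.
                gc = err * a[r] * loc[r]
                ga = err * c[r] * loc[r]
                gl = err * c[r] * a[r]
                c[r] -= lr * gc
                a[r] -= lr * ga
                loc[r] -= lr * gl
    return factors

def predict(factors, i, j, k):
    """Reconstruct the value at (cell i, assay j, locus k)."""
    c, a, loc = factors[0][i], factors[1][j], factors[2][k]
    return sum(x * y * z for x, y, z in zip(c, a, loc))

# Synthetic data: 3 cell types x 2 assays x 4 loci, rank 1 by construction.
cell = [0.5, 1.0, 1.5]
assay = [1.0, 0.8]
locus = [0.4, 0.8, 1.2, 1.6]
full = {(i, j, k): cell[i] * assay[j] * locus[k]
        for i in range(3) for j in range(2) for k in range(4)}

# Pretend one experiment was never performed, then impute it.
held_out = (2, 1, 3)
observed = {key: value for key, value in full.items() if key != held_out}
factors = fit_factorization(observed, shape=(3, 2, 4))
imputed = predict(factors, *held_out)
print("imputed:", round(imputed, 3), "true:", round(full[held_out], 3))
```

The held-out entry is recoverable here only because the synthetic tensor is exactly low-rank; real epigenomic data is noisier and far larger.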
Jacob also talks about a pitfall he discovered when trying to predict gene
expression from a mix of genomic and epigenomic data. As you increase the
complexity of a machine learning model, its performance may improve for
the wrong reason: instead of learning something biologically interesting, your
model may simply be using the nucleotide sequence to memorize each gene's
average expression across your training cell types.
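One way to check for this pitfall is to compare any model against a baseline that predicts each gene's average expression across the training cell types. The sketch below shows that comparison on made-up numbers. The cell type names, expression values, and function names are hypothetical, and this is not Jacob's actual evaluation code.

```python
def per_gene_mean(train):
    """Average each gene's expression across the training cell types."""
    profiles = list(train.values())
    n_genes = len(profiles[0])
    return [sum(p[g] for p in profiles) / len(profiles) for g in range(n_genes)]

def mse(pred, truth):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)

# Hypothetical expression values for three genes in three training cell types.
train = {
    "cellA": [1.0, 5.0, 9.0],
    "cellB": [1.2, 4.8, 9.4],
    "cellC": [0.8, 5.2, 8.6],
}
# Expression of the same three genes in a held-out cell type.
test_profile = [1.1, 5.1, 9.0]

# Baseline 1: each gene's average across the training cell types.
gene_baseline = per_gene_mean(train)
baseline_mse = mse(gene_baseline, test_profile)

# Baseline 2: a single global average that ignores gene identity.
all_values = [v for profile in train.values() for v in profile]
global_mean = sum(all_values) / len(all_values)
global_mse = mse([global_mean] * len(test_profile), test_profile)

print("per-gene average MSE:", baseline_mse)
print("global average MSE:  ", global_mse)
```

If a complex model only matches the per-gene average baseline on held-out cell types, it has probably learned which gene it is looking at rather than anything specific to the cell type.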
March 24, 2019
The third Bioinformatics Contest took place in
Alexey Sergushichev, one of the organizers of the contest,
and Gennady Korotkevich, the 1st prize winner,
join me to discuss this year’s problems.
February 27, 2019
Hi-C is a sequencing-based assay that provides information about the 3-dimensional organization of the genome.
In this episode Simeon Carstens explains how he
applied the Inferential Structure Determination (ISD) framework to build a 3D
model of chromatin and fit that model to Hi-C data using Hamiltonian Monte
Carlo and Gibbs sampling.
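As a rough illustration of the Hamiltonian Monte Carlo part only, here is a minimal sampler for a one-dimensional toy posterior. It is not the ISD framework or Simeon's chromatin model. The target is a hypothetical Gaussian posterior over a single distance, and the step size, trajectory length, and function names are assumptions chosen for the example.

```python
import math
import random

def hmc_sample(neg_log_prob, grad, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from exp(-neg_log_prob(x)) with basic HMC (unit mass)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)           # resample the momentum
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, alternating full
        # position and momentum steps, then a final half momentum step.
        p_new -= 0.5 * step_size * grad(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new -= step_size * grad(x_new)
        p_new -= 0.5 * step_size * grad(x_new)
        # Metropolis correction for the integration error.
        h_old = neg_log_prob(x) + 0.5 * p * p
        h_new = neg_log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples

# Toy posterior over a distance d: Gaussian with mean 2.0 and sd 1.0.
mu, sigma = 2.0, 1.0

def neg_log_prob(d):
    return 0.5 * ((d - mu) / sigma) ** 2

def grad(d):
    return (d - mu) / sigma ** 2

samples = hmc_sample(neg_log_prob, grad, x0=0.0, n_samples=3000)
kept = samples[500:]                       # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print("posterior mean ~", round(mean, 2), "variance ~", round(var, 2))
```

In a real structure-determination setting the state would be the coordinates of every bead in the chromatin model rather than a single number, and Gibbs sampling would alternate between those coordinates and the other model parameters.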
#29 Haplotype-aware genotyping from long reads with Trevor Pesout
#28 Space-efficient variable-order Markov models with Fabio Cunial
#27 Classification of CRISPR-induced mutations and CRISPRpic with HoJoon Lee and Seung Woo Cho
#26 Feature selection, Relief and STIR with Trang Lê
#25 Transposons and repeats with Kaushik Panda and Keith Slotkin
#24 Read correction and Bcool with Antoine Limasset
#23 RNA design, EteRNA and NEMO with Fernando Portela
#22 smCounter2: somatic variant calling and UMIs with Chang Xu
#21 Linear mixed models, GWAS, and lme4qtl with Andrey Ziyatdinov
#20 B cell receptor substitution profile prediction and SPURF with Kristian Davidsen and Amrit Dhar
#19 Genome fingerprints with Gustavo Glusman
#18 Bioinformatics Contest 2018 with Alexey Sergushichev and Ekaterina Vyahhi
#17 Rarefaction, alpha diversity, and statistics with Amy Willis
#16 Javier Quilez on what makes large sequencing projects successful
#15 Optimal transport for single-cell expression data with Geoffrey Schiebinger
#14 Generating functions for read mapping with Guillaume Filion
#13 Bracken with Jennifer Lu
#12 Modelling the immune system and C-ImmSim with Filippo Castiglione
#11 Collective cell migration with Linus Schumacher
#10 Spatially variable genes and SpatialDE with Valentine Svensson
#9 Michael Tessler and Christopher Mason on 16S amplicon vs shotgun sequencing
#8 Perfect k-mer hashing in Sailfish
#7 Metagenomics and Kraken
#6 Allele-specific expression
#5 Relative data analysis and propr with Thom Quinn
#4 ChIP-seq and GenoGAM with Georg Stricker and Julien Gagneur
#3 miRNA target site prediction and seedVicious with Antonio Marco
#2 Single-cell RNA sequencing with Aleksandra Kolodziejczyk
#1 Transcriptome assembly and Scallop with Mingfu Shao