Latest posts covering bioinformatics methods, computational biology insights, and technology updates.
I found myself to force to use cellranger. Meanwhile it helps a lot to run from bcl files to single cell counts matrixes, I discovered that is quite difficult to control many options related to optimization.
We use bcbio-nextgen for the analysis of sequencing data, mainly, (sc)RNAseq, smallRNAseq, DNASeq and ChIPSeq. It is not rare that we get collaborators who wants to re-analyze public data-set.
Inside bcbio, we have bcbio_prepare_samples.py
to help to merge multiple
files that belong to the same sample into one file to make easier the configuration
of bcbio. We extended this script to pull down data from GEO and
SRA repository.
This is a funny story, and I will try to tell you how I realized I don’t know anything about R in 400 words.
I work at the Bioinformatic Core at Harvard TH School. People who know us, or collaborate with us, knows that we mainly use bcbio to analyze sequencing data (check it out, super cool tool).
Differentially gene expression analysis with RNA-seq data is quite common nowadays, and there are pretty good Bioconductor packages for that: limma::voom, DESeq2 …
The code for that part is quite simple, being super quick to get a list of de-regulated genes. However, downstream analyses vary a lot depending on the project itself. But I found myself doing the same plots and analyses many times for different project, so I put together a bunch of plots and analyses using code from my colleagues at work (@HSPH bioinformatics core) and myself.