Latest posts covering bioinformatics methods, computational biology insights, and technology updates.
I found myself forced to use cellranger. While it helps a lot to go from BCL files to single-cell count matrices, I discovered that it is quite difficult to control many of the options related to optimization.
We use bcbio-nextgen for the analysis of sequencing data, mainly (sc)RNAseq, smallRNAseq, DNASeq and ChIPSeq. It is not rare that we get collaborators who want to re-analyze public datasets.
Inside bcbio, we have bcbio_prepare_samples.py to help merge multiple files that belong to the same sample into one file, which makes configuring bcbio easier. We extended this script to pull down data from the GEO and SRA repositories.
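To give an idea of what "merging files that belong to the same sample" means in practice, here is a minimal sketch. It is not the code inside bcbio_prepare_samples.py: the function name `merge_by_sample`, the file names, and the reads are made up for illustration. It only assumes that the inputs are gzipped FASTQ files, which can be joined by plain byte concatenation because a gzip file may contain several members.

```python
import gzip
import shutil
import tempfile
from collections import defaultdict
from pathlib import Path


def merge_by_sample(file_to_sample, out_dir):
    """Concatenate every file assigned to the same sample into one .fastq.gz.

    file_to_sample: iterable of (path, sample_name) pairs (hypothetical input format).
    Returns a dict mapping each sample name to its merged file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the input files by sample, keeping the order they were given in.
    groups = defaultdict(list)
    for path, sample in file_to_sample:
        groups[sample].append(Path(path))

    merged = {}
    for sample, paths in groups.items():
        target = out_dir / f"{sample}.fastq.gz"
        with open(target, "wb") as out:
            for path in paths:
                # Raw byte copy: concatenated gzip members form a valid gzip file.
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[sample] = target
    return merged


# Tiny demo with made-up lane files for a single sample.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    lanes = {
        "lane1.fastq.gz": "@r1\nACGT\n+\nIIII\n",
        "lane2.fastq.gz": "@r2\nTTGA\n+\nIIII\n",
    }
    for name, text in lanes.items():
        with gzip.open(tmp / name, "wt") as handle:
            handle.write(text)

    pairs = [(tmp / name, "sampleA") for name in lanes]
    merged = merge_by_sample(pairs, tmp / "merged")
    merged_name = merged["sampleA"].name

    # Read the merged file back to check that both lanes are in it.
    with gzip.open(merged["sampleA"], "rt") as handle:
        merged_text = handle.read()
```

The byte-level copy avoids decompressing and recompressing the reads, which matters when the inputs are large.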
This is a funny story, and I will try to tell you in 400 words how I realized I don’t know anything about R.
I work at the Bioinformatics Core at the Harvard T.H. Chan School of Public Health. People who know us, or collaborate with us, know that we mainly use bcbio to analyze sequencing data (check it out, super cool tool).
Differential gene expression analysis with RNA-seq data is quite common nowadays, and there are pretty good Bioconductor packages for it: limma::voom, DESeq2 …
The code for that part is quite simple, and it is super quick to get a list of de-regulated genes. However, downstream analyses vary a lot depending on the project. Still, I found myself doing the same plots and analyses many times for different projects, so I put together a bunch of plots and analyses using code from my colleagues at work (@HSPH bioinformatics core) and myself.
In summary: I will show which miRNA mapping tool is the best. I used several options for this benchmark:
I think these are the most used tools, plus a few that are less used but worth trying. They were clearly developed for other purposes, but they also generate the input for many miRNA pipelines. I just wanted to know how well my tool was doing. The first aim in developing miraligner was to annotate nucleotide additions at the end of miRNA sequences, something that is very common in miRNA biogenesis (isomiRs) and that short-read and fast mappers often miss. I have a repository for this kind of thing, so anybody can reproduce my results, check whether I did something wrong, or comment on it. In this post I just want to know which tool detects more miRNAs; for that I did two main steps: