
Blog Posts

Latest posts covering bioinformatics methods, computational biology insights, and technology updates.

How to set up cellranger to make your HPC admin happy

I found myself forced to use cellranger. While it helps a lot to go from BCL files to single-cell count matrices, I discovered that it is quite difficult to control many of the options related to optimization.

How to set up public dataset analysis with bcbio-nextgen

We use bcbio-nextgen for the analysis of sequencing data, mainly (sc)RNAseq, smallRNAseq, DNASeq and ChIPSeq. It is not rare that we get collaborators who want to re-analyze public datasets.

Inside bcbio, we have bcbio_prepare_samples.py, which merges multiple files that belong to the same sample into one file to make the configuration of bcbio easier. We extended this script to pull down data from the GEO and SRA repositories.
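To give an idea of what merging files per sample means, here is a minimal Python sketch. It is not the code of bcbio_prepare_samples.py: the function names `group_by_sample` and `merge_samples`, the `suffix` parameter, and the (file, sample) input format are all assumptions made up for illustration, and it only handles plain concatenation of local files.

```python
import shutil
from collections import defaultdict
from pathlib import Path


def group_by_sample(rows):
    """Group (file_path, sample_name) pairs by sample, keeping input order.

    Hypothetical helper for illustration; not part of bcbio.
    """
    groups = defaultdict(list)
    for file_path, sample in rows:
        groups[sample].append(Path(file_path))
    return dict(groups)


def merge_samples(rows, out_dir, suffix=".fastq"):
    """Concatenate all files of each sample into <out_dir>/<sample><suffix>.

    Assumes every input file uses the same format (all plain text or all
    gzip), so that byte-wise concatenation gives a valid merged file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for sample, files in group_by_sample(rows).items():
        target = out_dir / f"{sample}{suffix}"
        with open(target, "wb") as out_handle:
            for path in files:
                with open(path, "rb") as in_handle:
                    # Stream the bytes so large files are never fully in memory.
                    shutil.copyfileobj(in_handle, out_handle)
        merged[sample] = target
    return merged
```

Calling `merge_samples([("a_L1.fastq", "sampleA"), ("a_L2.fastq", "sampleA")], "merged")` would write a single `merged/sampleA.fastq`, which is then the only file that needs to appear in the configuration for that sample.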

DEGreport to plot nice RNA-seq figures

Differential gene expression analysis with RNA-seq data is quite common nowadays, and there are pretty good Bioconductor packages for that: limma::voom, DESeq2.

The code for that part is quite simple, and it is very quick to get a list of de-regulated genes. However, downstream analyses vary a lot depending on the project itself. Still, I found myself doing the same plots and analyses many times for different projects, so I put together a bunch of plots and analyses using code from my colleagues at work (@HSPH bioinformatics core) and myself.

miRNA Annotation Tools Comparison

In summary: I will show which is the best miRNA mapping tool. I used several options for this benchmarking:

I think these are the most widely used tools, plus a few that are less used but worth trying. They were clearly developed for other purposes, but they also generate the input for many miRNA pipelines. I just wanted to know how well my tool was doing. The first aim in developing miraligner was to annotate additions of nucleotides at the end of miRNA sequences, something that is very common in miRNA biogenesis (isomiRs) and is often missed by fast short-read mappers. I have a repository for this kind of thing, so anybody can reproduce my results, check if I did something wrong, or comment on it. In this post I just want to know which tool detects more miRNAs. For that I did two main steps: