Differential gene expression analysis with RNA-seq data is quite common nowadays, and there are pretty good Bioconductor packages for that: limma::voom, DESeq2 …

miRNA Annotation Tools Comparison

3 minute read

Published: March 24, 2016

In summary: I will show which miRNA mapping tool performs best. I used several options for this benchmarking:

small RNA abundance [miss-] viz

1 minute read

Published: October 29, 2013

My PhD focused on small RNA sequencing data. From the beginning, I had a problem when I wanted to visualize the abundance of small RNAs. Here is the problem: assume that you have a certain distribution of small RNA sequence abundance:

Lying factor in figures

1 minute read

Published: August 07, 2013

What is the lying factor in figures?

miRNA annotation: complex scenarios

4 minute read

Published: May 27, 2013

Everybody who works with microRNAs knows about miRBase; it was the first miRNA catalogue. Everybody uses it to annotate small RNA sequences as miRNA or not. And it is great, and very helpful… but there are some cases where we should investigate our results.

visualizing small RNA mapping complexity

1 minute read

Published: September 18, 2012

I spent my whole PhD working with small RNA sequencing data. The main problem was always the sequences that map to multiple locations, also called ambiguous sequences. From the very beginning, pipelines have removed this kind of sequence from the analysis, because you cannot assign them a unique location in the genome. But these sequences are interesting to study, since many of them change in size, for instance. This complexity is due to repeats in the genome, and the scenario I am talking about is shown in the following figure:

Visualizing science is also cool @polychart

1 minute read

Published: August 14, 2012

One of my interests in science is finding new ways to visualize Big Data. Scientists are used to working with static visualizations, which of course are wonderful in the majority of cases. But wouldn’t dynamic visualization be better for exploration? Play with your data, explore, and finally, when you get that great figure that tells you everything you were looking for, you can export it to a portable image format.

Bioinformatics evolution 2005-2012

less than 1 minute read

Published: April 13, 2012

How much have keywords in bioinformatics changed in the past 7 years?
I did a small and quick experiment. I took the abstracts of the papers published in Bioinformatics during Jan-2005 and during Jan-2012. Then I represented the words according to their frequency with the Wordle tool. And the result was:

publications

Genome assembly comparison identifies structural variants in the human genome.

Published in Nature Genetics, 2006

R Khaja,J Zhang,J MacDonald,Y He,A Joseph-George,J Wei,M Rafiq,C Qian,M Shago,L Pantano,H Aburatani,K Jones,R Redon,M Hurles,L Armengol,X Estivill,R Mural,C Lee,S Scherer,L Feuk

Abstract

Numerous types of DNA variation exist, ranging from SNPs to larger structural alterations such as copy number variants (CNVs) and inversions. Alignment of DNA sequence from different sources has been used to identify SNPs and intermediate-sized variants (ISVs). However, only a small proportion of total heterogeneity is characterized, and little is known of the characteristics of most smaller-sized (1.5 million SNPs. Some differences were simple insertions and deletions, but in regions containing CNVs, segmental duplication and repetitive DNA, they were more complex. Our results uncover substantial undescribed variation in humans, highlighting the need for comprehensive annotation strategies to fully interpret genome scanning and personalized sequencing projects.

Recommended citation: R Khaja,J Zhang,J MacDonald,Y He,A Joseph-George,J Wei,M Rafiq,C Qian,M Shago,**L Pantano**,H Aburatani,K Jones,R Redon,M Hurles,L Armengol,X Estivill,R Mural,C Lee,S Scherer,L Feuk (2006) Genome assembly comparison identifies structural variants in the human genome. Nature Genetics www.ncbi.nlm.nih.gov/pubmed/?term=17115057

Highlights from the Third International Society for Computational Biology Student Council Symposium at the Fifteenth Annual International Conference on Intelligent Systems for Molecular Biology

Published in BMC Bioinformatics, 2007

P Grynberg,T Abeel,P Lopes,G Macintyre,L Pantano

Abstract

In this meeting report we give an overview of the 3rd International Society for Computational Biology Student Council Symposium. Furthermore, we explain the role of the Student Council and the symposium series in the context of large, international conferences.

Recommended citation: P Grynberg,T Abeel,P Lopes,G Macintyre,**L Pantano** (2007) Highlights from the Third International Society for Computational Biology Student Council Symposium at the Fifteenth Annual International Conference on Intelligent Systems for Molecular Biology BMC Bioinformatics Coming soon

ProSeeK A web server for MLPA probe design

Published in BMC Genomics, 2008

L Pantano,L Armengol,S Villatoro,X Estivill

Abstract

Background: The technological evolution of platforms for detecting genome-wide copy number imbalances has allowed the discovery of an unexpected amount of human sequence that is variable in copy number among individuals. This type of human variation can make an important contribution to human diversity and disease susceptibility. Multiplex Ligation-dependent Probe Amplification (MLPA) is a targeted method to assess copy number differences for up to 40 genomic loci in one single experiment. Although specific MLPA assays can be ordered from MRC-Holland (the proprietary company of the MLPA technology), custom designs are also developed in many laboratories worldwide. After our own experience, an important drawback of custom MLPA assays is the time spent during the design of the specific oligonucleotides that are used as probes. Due to the large number of probes included in a single assay, a number of restrictions need to be met in order to maximize specificity and to increase success likelihood. Results: We have developed a web tool for facilitating and optimising custom probe design for MLPA experiments. The algorithm only requires the target sequence in FASTA format and a set of parameters, that are provided by the user according to each specific MLPA assay, to identify the best probes inside the given region. Conclusion: To our knowledge, this is the first available tool for optimizing custom probe design of MLPA assays. The ease-of-use and speed of the algorithm dramatically reduces the turn around time of probe design. ProSeeK will become a useful tool for all laboratories that are currently using MLPA in their research projects for CNV studies.

Recommended citation: **L Pantano**,L Armengol,S Villatoro,X Estivill (2008) ProSeeK A web server for MLPA probe design BMC Genomics www.ncbi.nlm.nih.gov/pubmed/?term=19040730

Identification of Copy Number Variants Defining Genomic Differences among Major Human Groups

Published in PLoS ONE, 2009

L Armengol,S Villatoro,J Gonzalez,L Pantano,M Garcia-Aragones,R Rabionet,M Caceres,X Estivill

Abstract

Background: Understanding the genetic contribution to phenotype variation of human groups is necessary to elucidate differences in disease predisposition and response to pharmaceutical treatments in different human populations. Methodology/Principal Findings: We have investigated the genome-wide profile of structural variation on pooled samples from the three populations studied in the HapMap project by comparative genome hybridization (CGH) in different array platforms. We have identified and experimentally validated 33 genomic loci that show significant copy number differences from one population to the other. Interestingly, we found an enrichment of genes related to environment adaptation (immune response, lipid metabolism and extracellular space) within these regions and the study of expression data revealed that more than half of the copy number variants (CNVs) translate into gene-expression differences among populations, suggesting that they could have functional consequences. In addition, the identification of single nucleotide polymorphisms (SNPs) that are in linkage disequilibrium with the copy number alleles allowed us to detect evidences of population differentiation and recent selection at the nucleotide variation level. Conclusions: Overall, our results provide a comprehensive view of relevant copy number changes that might play a role in phenotypic differences among major human populations, and generate a list of interesting candidates for future studies.

Recommended citation: L Armengol,S Villatoro,J Gonzalez,**L Pantano**,M Garcia-Aragones,R Rabionet,M Caceres,X Estivill (2009) Identification of Copy Number Variants Defining Genomic Differences among Major Human Groups PLoS ONE www.ncbi.nlm.nih.gov/pubmed/?term=19789632

Fibroblast-derived induced pluripotent stem cells show no common retroviral vector insertions.

Published in Stem cells (Dayton Ohio), 2009

F Varas,M Stadtfeld,A De,N Maherali,T di,L Pantano,C Notredame,K Hochedlinger,T Graf

Abstract

Several laboratories have reported the reprogramming of mouse and human fibroblasts into pluripotent cells, using retroviruses carrying the Oct4, Sox2, Klf4, and c-Myc transcription factor genes. In these experiments the frequency of reprogramming was lower than 0.1% of the infected cells, raising the possibility that additional events are required to induce reprogramming, such as activation of genes triggered by retroviral insertions. We have therefore determined by ligation-mediated polymerase chain reaction (LM-PCR) the retroviral insertion sites in six induced pluripotent stem (iPS) cell clones derived from mouse fibroblasts. Seventy-nine insertion sites were assigned to a single mouse genome location. Thirty-five of these mapped to gene transcription units, whereas 29 insertions landed within 10 kilobases of transcription start sites. No common insertion site was detected among the iPS clones studied. Moreover, bioinformatics analyses revealed no enrichment of a specific gene function, network, or pathway among genes targeted by retroviral insertions. We conclude that Oct4, Sox2, Klf4, and c-Myc are sufficient to promote fibroblast-to-iPS cell reprogramming and propose that the observed low reprogramming frequencies may have alternative explanations.

Recommended citation: F Varas,M Stadtfeld,A De,N Maherali,T di,**L Pantano**,C Notredame,K Hochedlinger,T Graf (2009) Fibroblast-derived induced pluripotent stem cells show no common retroviral vector insertions. Stem cells (Dayton Ohio) www.ncbi.nlm.nih.gov/pubmed/?term=19008347

A myriad of miRNA variants in control and Huntington's disease brain regions detected by massively parallel sequencing

Published in Nucleic Acids Research, 2010

E Marti,L Pantano,M Banez-Coronel,F Llorens,E Minones-Moyano,S Porta,L Sumoy,I Ferrer,X Estivill

Abstract

Huntington disease (HD) is a neurodegenerative disorder that predominantly affects neurons of the forebrain. We have applied the Illumina massively parallel sequencing to deeply analyze the small RNA populations of two different forebrain areas, the frontal cortex (FC) and the striatum (ST) of healthy individuals and individuals with HD. More than 80% of the small-RNAs were annotated as microRNAs (miRNAs) in all samples. Deep sequencing revealed length and sequence heterogeneity (IsomiRs) for the vast majority of miRNAs. Around 80-90% of the miRNAs presented modifications in the 3'-terminus mainly in the form of trimming and/or as nucleotide addition variants, while the 5'-terminus of the miRNAs was specially protected from changes. Expression profiling showed strong miRNA and isomiR expression deregulation in HD, most being common to both FC and ST. The analysis of the upstream regulatory regions in co-regulated miRNAs suggests a role for RE1-Silencing Transcription Factor (REST) and P53 in miRNAs downregulation in HD. The putative targets of deregulated miRNAs and seed-region IsomiRs strongly suggest that their altered expression contributes to the aberrant gene expression in HD. Our results show that miRNA variability is a ubiquitous phenomenon in the adult human brain, which may influence gene expression in physiological and pathological conditions.

Recommended citation: E Marti,**L Pantano**,M Banez-Coronel,F Llorens,E Minones-Moyano,S Porta,L Sumoy,I Ferrer,X Estivill (2010) A myriad of miRNA variants in control and Huntington's disease brain regions detected by massively parallel sequencing Nucleic Acids Research www.ncbi.nlm.nih.gov/pubmed/?term=20591823

SeqBuster a bioinformatic tool for the processing and analysis of small RNAs datasets reveals ubiquitous miRNA modifications in human embryonic cells

Published in Nucleic Acids Research, 2010

L Pantano,X Estivill,E Marti

Abstract

High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The high flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits the annotation against any custom database. SeqBuster integrates multiple analyses modules in a unique platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small-RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated to stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA-variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivilllab.crg.es/seqbuster.

Recommended citation: **L Pantano**,X Estivill,E Marti (2010) SeqBuster a bioinformatic tool for the processing and analysis of small RNAs datasets reveals ubiquitous miRNA modifications in human embryonic cells Nucleic Acids Research Coming soon

A non-biased framework for the annotation and classification of the non-miRNA small RNA transcriptome.

Published in Bioinformatics (Oxford England), 2011

L Pantano,X Estivill,E Marti

Abstract

MOTIVATION: Recent progress in high-throughput sequencing technologies has largely contributed to reveal a highly complex landscape of small non-coding RNAs (sRNAs), including novel non-canonical sRNAs derived from long non-coding RNA, repeated elements, transcription start sites and splicing site regions among others. The published frameworks for sRNA data analysis are focused on miRNA detection and prediction, ignoring further information in the dataset. As a consequence, tools for the identification and classification of the sRNAs not belonging to miRNA family are currently lacking. RESULTS: Here, we present, SeqCluster, an extension of the currently available SeqBuster tool to identify and analyze at different levels the sRNAs not annotated or predicted as miRNAs. This new module deals with sequences mapping onto multiple locations and permits a highly versatile and user-friendly interaction with the data in order to easily classify sRNA sequences with a putative functional importance. We were able to detect all known classes of sRNAs described to date using SeqCluster with different sRNA datasets.

Recommended citation: **L Pantano**,X Estivill,E Marti (2011) A non-biased framework for the annotation and classification of the non-miRNA small RNA transcriptome. Bioinformatics (Oxford England) www.ncbi.nlm.nih.gov/pubmed/?term=21976421

A Pathogenic Mechanism in Huntington's Disease Involves Small CAG-Repeated RNAs with Neurotoxic Activity.

Published in PLoS genetics, 2012

M Banez-Coronel,S Porta,B Kagerbauer,E Mateu-Huertas,L Pantano,I Ferrer,M Guzman,X Estivill,E Marti

Abstract

Huntington’s disease (HD) is an autosomal dominantly inherited disorder caused by the expansion of CAG repeats in the Huntingtin (HTT) gene. The abnormally extended polyglutamine in the HTT protein encoded by the CAG repeats has toxic effects. Here, we provide evidence to support that the mutant HTT CAG repeats interfere with cell viability at the RNA level. In human neuronal cells, expanded HTT exon-1 mRNA with CAG repeat lengths above the threshold for complete penetrance (40 or greater) induced cell death and increased levels of small CAG-repeated RNAs (sCAGs), of ≈21 nucleotides in a Dicer-dependent manner. The severity of the toxic effect of HTT mRNA and sCAG generation correlated with CAG expansion length. Small RNAs obtained from cells expressing mutant HTT and from HD human brains significantly decreased neuronal viability, in an Ago2-dependent mechanism. In both cases, the use of anti-miRs specific for sCAGs efficiently blocked the toxic effect, supporting a key role of sCAGs in HTT-mediated toxicity. Luciferase-reporter assays showed that expanded HTT silences the expression of CTG-containing genes that are down-regulated in HD. These results suggest a possible link between HD and sCAG expression with an aberrant activation of the siRNA/miRNA gene silencing machinery, which may trigger a detrimental response. The identification of the specific cellular processes affected by sCAGs may provide insights into the pathogenic mechanisms underlying HD, offering opportunities to develop new therapeutic approaches.

Recommended citation: M Banez-Coronel,S Porta,B Kagerbauer,E Mateu-Huertas,**L Pantano**,I Ferrer,M Guzman,X Estivill,E Marti (2012) A Pathogenic Mechanism in Huntington's Disease Involves Small CAG-Repeated RNAs with Neurotoxic Activity. PLoS genetics www.ncbi.nlm.nih.gov/pubmed/?term=22383888

A highly expressed miR-101 isomiR is a functional silencing small RNA.

Published in BMC genomics, 2013

F Llorens,M Banez-Coronel,L Pantano,R del,I Ferrer,X Estivill,E Marti

Abstract

BACKGROUND: MicroRNAs (miRNAs) are short non-coding regulatory RNAs that control gene expression usually producing translational repression and gene silencing. High-throughput sequencing technologies have revealed heterogeneity at length and sequence level for the majority of mature miRNAs (IsomiRs). Most isomiRs can be explained by variability in either Dicer1 or Drosha cleavage during miRNA biogenesis at 5’ or 3’ of the miRNA (trimming variants). Although isomiRs have been described in different tissues and organisms, their functional validation as modulators of gene expression remains elusive. Here we have characterized the expression and function of a highly abundant miR-101 5’-trimming variant (5’-isomiR-101).

Recommended citation: F Llorens,M Banez-Coronel,**L Pantano**,R del,I Ferrer,X Estivill,E Marti (2013) A highly expressed miR-101 isomiR is a functional silencing small RNA. BMC genomics www.ncbi.nlm.nih.gov/pubmed/?term=23414127

Microarray and deep sequencing cross-platform analysis of the mirRNome and isomiR variation in response to epidermal growth factor.

Published in BMC genomics, 2013

F Llorens,M Hummel,L Pantano,X Pastor,A Vivancos,E Castillo,H Mattlin,A Ferrer,M Ingham,M Noguera,R Kofler,J Dohm,R Pluvinet,M Bayes,H Himmelbauer,R del,E Marti,L Sumoy

Abstract

BACKGROUND: Epidermal Growth Factor (EGF) plays an important function in the regulation of cell growth, proliferation, and differentiation by binding to its receptor (EGFR) and providing cancer cells with increased survival responsiveness. Signal transduction carried out by EGF has been extensively studied at both transcriptional and post-transcriptional levels. Little is known about the involvement of microRNAs (miRNAs) in the EGF signaling pathway. miRNAs have emerged as major players in the complex networks of gene regulation, and cancer miRNA expression studies have evidenced a direct involvement of miRNAs in cancer progression.

Recommended citation: F Llorens,M Hummel,**L Pantano**,X Pastor,A Vivancos,E Castillo,H Mattlin,A Ferrer,M Ingham,M Noguera,R Kofler,J Dohm,R Pluvinet,M Bayes,H Himmelbauer,R del,E Marti,L Sumoy (2013) Microarray and deep sequencing cross-platform analysis of the mirRNome and isomiR variation in response to epidermal growth factor. BMC genomics www.ncbi.nlm.nih.gov/pubmed/?term=23724959

Regulation of miRNA strand selection follow the leader?

Published in Biochemical Society transactions, 2014

H Meijer,E Smith,M Bushell,A Kozomara,S Griffiths-Jones,V Kim,J Han,M Siomi,Y Wang,S Juranek,H Li,G Sheng,G Wardle,T Tuschl,D Patel,S Chi,J Zang,A Mele,R Darnell,M Hafner,M Landthaler,L Burger,M Khorshid,J Hausser,P Berninger,A Rothballer,M Ascano,A Jungkamp,M Munschauer,S Djuranovic,A Nahvi,R Green,E Huntzinger,D Kuzuoglu-Ozturk,J Braun,A Eulalio,L Wohbold,E Izaurralde,Q Liu,P Halvey,Y Shyr,R Slebos,D Liebler,B Zhang,H Meijer,Y Kong,W Lu,A Wilczynska,R Spriggs,S Robinson,J Godfrey,A Willis,M Bushell,J Krol,I Loedige,W Filipowicz,M Biasiolo,G Sales,M Lionetti,L Agnelli,K Todoerti,A Bisognin,A Coppe,C Romualdi,A Neri,S Bortoluzzi,M He,Y Liu,X Wang,M Zhang,G Hannon,Z Huang,A Mathelier,A Carbone,F Rivas,N Tolia,J Song,J Aragon,J Liu,G Hannon,L Joshua-Tor,N Tahbaz,F Kolb,H Zhang,K Jaronczyk,W Filipowicz,T Hobman,J Pare,N Tahbaz,J Lopez-Orozco,P LaPointe,P Lasko,T Hobman,C Noland,J Doudna,S Gu,L Jin,F Zhang,Y Huang,D Grimm,J Rossi,M Kay,S Iwasaki,M Kobayashi,M Yoda,Y Sakaguchi,S Katsuma,T Suzuki,Y Tomari,E Elkayam,C Kuhn,A Tocilj,A Haase,E Greene,G Hannon,L Joshua-Tor,M Yoda,T Kawamata,Z Paroo,X Ye,S Iwasaki,Q Liu,Y Tomari,P Kwak,Y Tomari,C Matranga,Y Tomari,C Shin,D Bartel,P Zamore,D Schwarz,G Hutvagner,T Du,Z Xu,N Aronin,P Zamore,J Krol,K Sobczak,U Wilczynska,M Drath,A Jasinska,D Kaczynska,W Krzyzosiak,A OToole,S Miller,N Haines,M Zink,M Serra,S Miller,L Jones,K Giovannitti,D Piper,M Serra,H Hu,Y Zheng,Y Xu,H Hu,C Menzel,Y Zhou,W Chen,P Khaitovich,M Ohanian,D Humphreys,E Anderson,T Preiss,D Fatkin,E Marti,L Pantano,M Banez-Coronel,F Llorens,E Minones-Moyano,S Porta,L Sumoy,I Ferrer,X Estivill,C Neilsen,G Goodall,C Bracken,D Humphreys,C Hynes,H Patel,G Wei,L Cannon,D Fatkin,C Suter,J Clancy,T Preiss,H Zhou,M Arcila,Z Li,E Lee,C Henzler,J Liu,T Rana,K Kosik,M Xie,M Li,A Vilborg,N Lee,M Shu,V Yartseva,N Sestan,J Steitz,H Lee,K Zhou,A Smith,C Noland,J Doudna,J Winter,S Diederichs,J Yang,M Philips,D Betel,P Mu,A Ventura,A Siepel,K Chen,E Lai,L Guo,Z Lu,S Ro,C Park,D 
Young,K Sanders,W Yan,H Chiang,L Schoenfeld,J Ruby,V Auyeung,N Spies,D Baek,W Johnston,C Russ,S Luo,J Babiarz,S Bortoluzzi,A Bisognin,M Biasiolo,P Guglielmelli,F Biamonte,R Norfo,R Manfredini,A Vannucchi,S Griffiths-Jones,J Hui,A Marco,M Ronshaugen,A Packer,Y Xing,S Harper,L Jones,B Davidson,L Jiang,C Lin,L Song,J Wu,B Chen,Z Ying,L Fang,X Yan,M He,J Li,M Li,M Rubio,R Montanez,L Perez,M Milan,X Belles,S Shan,L Fang,T Shatseva,Z Rutnam,X Yang,W Du,W Lu,J Xuan,Z Deng,B Yang,X Yang,W Du,H Li,F Liu,A Khorshidi,Z Rutnam,B Yang,H Zhou,X Huang,H Cui,X Luo,Y Tang,S Chen,L Wu,N Shen,L Tarassishin,O Loudig,A Bauman,B Shafit-Zagardo,H Suh,S Lee,C Giles,R Girija-Devi,M Dozmorov,J Wren,S Chatterjee,M Fasier,I Bussing,H Gro$\beta$hans,M Gantier,C McCoy,I Rusinova,D Saulep,D Wang,D Xu,A Irving,M Behlke,P Hertzog,F Mackay,B Williams,J Krol,V Busskamp,I Markiewicz,M Stadler,S Ribi,J Richter,J Duebel,S Bicker,H Fehling,D Schrubeler,T Miki,H Grosshans,S Balaraman,E Lunde,O Sawant,T Cudd,S Washburn,R Miranda

Abstract

miRNA strand selection is the process that determines which of the two strands in a miRNA duplex becomes the active strand that is incorporated into the RISC (RNA-induced silencing complex) (named the guide strand, leading strand or miR) and which one gets degraded (the passenger strand or miR*). Thermodynamic features of the duplex appear to play an important role in this decision; the strand with the weakest binding at its 5’-end is more likely to become the guide strand. Other key characteristics of human miRNA guide strands are a U-bias at the 5’-end and an excess of purines, whereas the passenger strands have a C-bias at the 5’-end and an excess of pyrimidines. Several proteins are known to play a role in strand selection [Ago (Argonaute), DICER, TRBP (trans-activation response RNA-binding protein), PACT (protein activator of dsRNA-dependent protein kinase) and Xrn-1/2]; however, the mechanisms by which these proteins act are largely unknown. For several miRNAs the miR/miR* ratio varies dependent on cell type, developmental stage and in different disease states, suggesting that strand selection is a tightly controlled process. The present review discusses our current knowledge regarding the factors and processes involved in strand selection and the many questions that still remain.

Recommended citation: H Meijer,E Smith,M Bushell,A Kozomara,S Griffiths-Jones,V Kim,J Han,M Siomi,Y Wang,S Juranek,H Li,G Sheng,G Wardle,T Tuschl,D Patel,S Chi,J Zang,A Mele,R Darnell,M Hafner,M Landthaler,L Burger,M Khorshid,J Hausser,P Berninger,A Rothballer,M Ascano,A Jungkamp,M Munschauer,S Djuranovic,A Nahvi,R Green,E Huntzinger,D Kuzuoglu-Ozturk,J Braun,A Eulalio,L Wohbold,E Izaurralde,Q Liu,P Halvey,Y Shyr,R Slebos,D Liebler,B Zhang,H Meijer,Y Kong,W Lu,A Wilczynska,R Spriggs,S Robinson,J Godfrey,A Willis,M Bushell,J Krol,I Loedige,W Filipowicz,M Biasiolo,G Sales,M Lionetti,L Agnelli,K Todoerti,A Bisognin,A Coppe,C Romualdi,A Neri,S Bortoluzzi,M He,Y Liu,X Wang,M Zhang,G Hannon,Z Huang,A Mathelier,A Carbone,F Rivas,N Tolia,J Song,J Aragon,J Liu,G Hannon,L Joshua-Tor,N Tahbaz,F Kolb,H Zhang,K Jaronczyk,W Filipowicz,T Hobman,J Pare,N Tahbaz,J Lopez-Orozco,P LaPointe,P Lasko,T Hobman,C Noland,J Doudna,S Gu,L Jin,F Zhang,Y Huang,D Grimm,J Rossi,M Kay,S Iwasaki,M Kobayashi,M Yoda,Y Sakaguchi,S Katsuma,T Suzuki,Y Tomari,E Elkayam,C Kuhn,A Tocilj,A Haase,E Greene,G Hannon,L Joshua-Tor,M Yoda,T Kawamata,Z Paroo,X Ye,S Iwasaki,Q Liu,Y Tomari,P Kwak,Y Tomari,C Matranga,Y Tomari,C Shin,D Bartel,P Zamore,D Schwarz,G Hutvagner,T Du,Z Xu,N Aronin,P Zamore,J Krol,K Sobczak,U Wilczynska,M Drath,A Jasinska,D Kaczynska,W Krzyzosiak,A OToole,S Miller,N Haines,M Zink,M Serra,S Miller,L Jones,K Giovannitti,D Piper,M Serra,H Hu,Y Zheng,Y Xu,H Hu,C Menzel,Y Zhou,W Chen,P Khaitovich,M Ohanian,D Humphreys,E Anderson,T Preiss,D Fatkin,E Marti,**L Pantano**,M Banez-Coronel,F Llorens,E Minones-Moyano,S Porta,L Sumoy,I Ferrer,X Estivill,C Neilsen,G Goodall,C Bracken,D Humphreys,C Hynes,H Patel,G Wei,L Cannon,D Fatkin,C Suter,J Clancy,T Preiss,H Zhou,M Arcila,Z Li,E Lee,C Henzler,J Liu,T Rana,K Kosik,M Xie,M Li,A Vilborg,N Lee,M Shu,V Yartseva,N Sestan,J Steitz,H Lee,K Zhou,A Smith,C Noland,J Doudna,J Winter,S Diederichs,J Yang,M Philips,D Betel,P Mu,A Ventura,A Siepel,K Chen,E Lai,L 
Guo,Z Lu,S Ro,C Park,D Young,K Sanders,W Yan,H Chiang,L Schoenfeld,J Ruby,V Auyeung,N Spies,D Baek,W Johnston,C Russ,S Luo,J Babiarz,S Bortoluzzi,A Bisognin,M Biasiolo,P Guglielmelli,F Biamonte,R Norfo,R Manfredini,A Vannucchi,S Griffiths-Jones,J Hui,A Marco,M Ronshaugen,A Packer,Y Xing,S Harper,L Jones,B Davidson,L Jiang,C Lin,L Song,J Wu,B Chen,Z Ying,L Fang,X Yan,M He,J Li,M Li,M Rubio,R Montanez,L Perez,M Milan,X Belles,S Shan,L Fang,T Shatseva,Z Rutnam,X Yang,W Du,W Lu,J Xuan,Z Deng,B Yang,X Yang,W Du,H Li,F Liu,A Khorshidi,Z Rutnam,B Yang,H Zhou,X Huang,H Cui,X Luo,Y Tang,S Chen,L Wu,N Shen,L Tarassishin,O Loudig,A Bauman,B Shafit-Zagardo,H Suh,S Lee,C Giles,R Girija-Devi,M Dozmorov,J Wren,S Chatterjee,M Fasier,I Bussing,H Gro$\beta$hans,M Gantier,C McCoy,I Rusinova,D Saulep,D Wang,D Xu,A Irving,M Behlke,P Hertzog,F Mackay,B Williams,J Krol,V Busskamp,I Markiewicz,M Stadler,S Ribi,J Richter,J Duebel,S Bicker,H Fehling,D Schrubeler,T Miki,H Grosshans,S Balaraman,E Lunde,O Sawant,T Cudd,S Washburn,R Miranda (2014) Regulation of miRNA strand selection follow the leader? Biochemical Society transactions www.ncbi.nlm.nih.gov/pubmed/?term=25110015

Paternal Diet Defines Offspring Chromatin State and Intergenerational Obesity

Published in Cell, 2014

A Ost,A Lempradl,E Casas,M Weigert,T Tiko,M Deniz,L Pantano,U Boenisch,P Itskov,M Stoeckius,M Ruf,N Rajewsky,G Reuter,N Iovino,C Ribeiro,M Alenius,S Heyne,T Vavouri,J Pospisilik

Abstract

The global rise in obesity has revitalized a search for genetic and epigenetic factors underlying the disease. We present a Drosophila model of paternal-diet-induced intergenerational metabolic reprogramming (IGMR) and identify genes required for its encoding in offspring. Intriguingly, we find that as little as 2 days of dietary intervention in fathers elicits obesity in offspring. Paternal sugar acts as a physiological suppressor of variegation, desilencing chromatin-state-defined domains in both mature sperm and in offspring embryos. We identify requirements for H3K9/K27me3-dependent reprogramming of metabolic genes in two distinct germline and zygotic windows. Critically, we find evidence that a similar system may regulate obesity susceptibility and phenotype variation in mice and humans. The findings provide insight into the mechanisms underlying intergenerational metabolic reprogramming and carry profound implications for our understanding of phenotypic variation and evolution.

Recommended citation: A Ost,A Lempradl,E Casas,M Weigert,T Tiko,M Deniz,**L Pantano**,U Boenisch,P Itskov,M Stoeckius,M Ruf,N Rajewsky,G Reuter,N Iovino,C Ribeiro,M Alenius,S Heyne,T Vavouri,J Pospisilik (2014) Paternal Diet Defines Offspring Chromatin State and Intergenerational Obesity Cell www.ncbi.nlm.nih.gov/pubmed/?term=25480298

InvFEST a database integrating information of polymorphic inversions in the human genome

Published in Nucleic Acids Research, 2014

A Martinez-Fundichely,S Casillas,R Egea,M Ramia,A Barbadilla,L Pantano,M Puig,M Caceres

Abstract

The newest genomic advances have uncovered an unprecedented degree of structural variation throughout genomes, with great amounts of data accumulating rapidly. Here we introduce InvFEST (http://invfestdb.uab.cat), a database combining multiple sources of information to generate a complete catalogue of non-redundant human polymorphic inversions. Due to the complexity of this type of changes and the underlying high false-positive discovery rate, it is necessary to integrate all the available data to get a reliable estimate of the real number of inversions. InvFEST automatically merges predictions into different inversions, refines the breakpoint locations, and finds associations with genes and segmental duplications. In addition, it includes data on experimental validation, population frequency, functional effects and evolutionary history. All this information is readily accessible through a complete and user-friendly web report for each inversion. In its current version, InvFEST combines information from 34 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Therefore, InvFEST aims to represent the most reliable set of human inversions and become a central repository to share information, guide future studies and contribute to the analysis of the functional and evolutionary impact of inversions on the human genome.

Recommended citation: A Martinez-Fundichely,S Casillas,R Egea,M Ramia,A Barbadilla,**L Pantano**,M Puig,M Caceres (2014) InvFEST a database integrating information of polymorphic inversions in the human genome Nucleic Acids Research www.ncbi.nlm.nih.gov/pubmed/?term=24253300

Specific small-RNA signatures in the amygdala at premotor and motor stages of Parkinsons disease revealed by deep sequencing analysis.

Published in Bioinformatics (Oxford England), 2015

L Pantano,M Friedlander,G Escaramis,E Lizano,J Pallares-Albanell,I Ferrer,X Estivill,E Marti

Abstract

MOTIVATION: Most computational tools for small non-coding RNAs (sRNA) sequencing data analysis focus on microRNAs (miRNAs), overlooking other types of sRNAs that show multi-mapping hits. Here, we have developed a pipeline to non-redundantly quantify all types of sRNAs, and extract patterns of expression in biologically defined groups. We have used our tool to characterize and profile sRNAs in post-mortem brain samples of control individuals and Parkinson’s disease (PD) cases at early-premotor and late-symptomatic stages. RESULTS: Clusters of co-expressed sRNAs mapping onto tRNAs significantly separated premotor and motor cases from controls. A similar result was obtained using a matrix of miRNAs slightly varying in sequence (isomiRs). The present framework revealed sRNA alterations at premotor stages of PD, which might reflect initial pathogenic perturbations. This tool may be useful to discover sRNA expression patterns linked to different biological conditions. AVAILABILITY AND IMPLEMENTATION: The full code is available at http://github.com/lpantano/seqbuster. CONTACT: lpantano@hsph.harvard.edu or eulalia.marti@crg.eu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Recommended citation: **L Pantano**,M Friedlander,G Escaramis,E Lizano,J Pallares-Albanell,I Ferrer,X Estivill,E Marti (2015) Specific small-RNA signatures in the amygdala at premotor and motor stages of Parkinsons disease revealed by deep sequencing analysis. Bioinformatics (Oxford England) www.ncbi.nlm.nih.gov/pubmed/?term=26530722

Functional Impact and Evolution of a Novel Human Polymorphic Inversion That Disrupts a Gene and Creates a Fusion Transcript

Published in PLoS Genetics, 2015

M Puig,D Castellano,L Pantano,C Giner-Delgado,D Izquierdo,M Gaya-Vidal,J Lucas-Lledo,T Esko,C Terao,F Matsuda,M Caceres

Abstract

Despite many years of study into inversions, very little is known about their functional consequences, especially in humans. A common hypothesis is that the selective value of inversions stems in part from their effects on nearby genes, although evidence of this in natural populations is almost nonexistent. Here we present a global analysis of a new 415-kb polymorphic inversion that is among the longest ones found in humans and is the first with clear position effects. This inversion is located in chromosome 19 and has been generated by non-homologous end joining between blocks of transposable elements with low identity. PCR genotyping in 541 individuals from eight different human populations allowed the detection of tag SNPs and inversion genotyping in multiple populations worldwide, showing that the inverted allele is mainly found in East Asia with an average frequency of 4.7%. Interestingly, one of the breakpoints disrupts the transcription factor gene ZNF257, causing a significant reduction in the total expression level of this gene in lymphoblastoid cell lines. RNA-Seq analysis of the effects of this expression change in standard homozygotes and inversion heterozygotes revealed distinct expression patterns that were validated by quantitative RT-PCR. Moreover, we have found a new fusion transcript that is generated exclusively from inverted chromosomes around one of the breakpoints. Finally, by the analysis of the associated nucleotide variation, we have estimated that the inversion was generated 40,000-50,000 years ago and, while a neutral evolution cannot be ruled out, its current frequencies are more consistent with those expected for a deleterious variant, although no significant association with phenotypic traits has been found so far.

Recommended citation: M Puig,D Castellano,**L Pantano**,C Giner-Delgado,D Izquierdo,M Gaya-Vidal,J Lucas-Lledo,T Esko,C Terao,F Matsuda,M Caceres (2015) Functional Impact and Evolution of a Novel Human Polymorphic Inversion That Disrupts a Gene and Creates a Fusion Transcript PLoS Genetics www.ncbi.nlm.nih.gov/pubmed/?term=26427027

The small RNA content of human sperm reveals pseudogene-derived piRNAs complementary to protein-coding genes.

Published in RNA (New York N.Y.), 2015

L Pantano,M Jodar,M Bak,J Ballesca,N Tommerup,R Oliva,T Vavouri

Abstract

At the end of mammalian sperm development, sperm cells expel most of their cytoplasm and dispose of the majority of their RNA. Yet, hundreds of RNA molecules remain in mature sperm. The biological significance of the vast majority of these molecules is unclear. To better understand the processes that generate sperm small RNAs and what roles they may have, we sequenced and characterized the small RNA content of sperm samples from two human fertile individuals. We detected 182 microRNAs, some of which are highly abundant. The most abundant microRNA in sperm is miR-1246 with predicted targets among sperm-specific genes. The most abundant class of small noncoding RNAs in sperm are PIWI-interacting RNAs (piRNAs). Surprisingly, we found that human sperm cells contain piRNAs processed from pseudogenes. Clusters of piRNAs from human testes contain pseudogenes transcribed in the antisense strand and processed into small RNAs. Several human protein-coding genes contain antisense predicted targets of pseudogene-derived piRNAs in the male germline and these piRNAs are still found in mature sperm. Our study provides the most extensive data set and annotation of human sperm small RNAs to date and is a resource for further functional studies on the roles of sperm small RNAs. In addition, we propose that some of the pseudogene-derived human piRNAs may regulate expression of their parent gene in the male germline.

Recommended citation: **L Pantano**,M Jodar,M Bak,J Ballesca,N Tommerup,R Oliva,T Vavouri (2015) The small RNA content of human sperm reveals pseudogene-derived piRNAs complementary to protein-coding genes. RNA (New York N.Y.) www.ncbi.nlm.nih.gov/pubmed/?term=25904136

Genomic analyses identify molecular subtypes of pancreatic cancer

Published in Nature, 2016

P Bailey,D Chang,K Nones,A Johns,A Patch,M Gingras,D Miller,A Christ,T Bruxner,M Quinn,C Nourse,L Murtaugh,I Harliwong,S Idrisoglu,S Manning,E Nourbakhsh,S Wani,L Fink,O Holmes,V Chin,M Anderson,S Kazakoff,C Leonard,F Newell,N Waddell,S Wood,Q Xu,P Wilson,N Cloonan,K Kassahn,D Taylor,K Quek,A Robertson,L Pantano,L Mincarelli,L Sanchez,L Evers,J Wu,M Pinese,M Cowley,M Jones,E Colvin,A Nagrial,E Humphrey,L Chantrill,A Mawson,J Humphris,A Chou,M Pajic,C Scarlett,A Pinho,M Giry-Laterriere,I Rooman,J Samra,J Kench,J Lovell,N Merrett,C Toon,K Epari,N Nguyen,A Barbour,N Zeps,K Moran-Jones,N Jamieson,J Graham,F Duthie,K Oien,J Hair,R Grutzmann,A Maitra,C Iacobuzio-Donahue,C Wolfgang,R Morgan,R Lawlor,V Corbo,C Bassi,B Rusev,P Capelli,R Salvia,G Tortora,D Mukhopadhyay,G Petersen,P Australian,D Munzy,W Fisher,S Karim,J Eshleman,R Hruban,C Pilarsky,J Morton,O Sansom,A Scarpa,E Musgrove,U Bailey,O Hofmann,R Sutherland,D Wheeler,A Gill,R Gibbs,J Pearson,N Waddell,A Biankin,S Grimmond

Abstract

Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-beta, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.

Recommended citation: P Bailey,D Chang,K Nones,A Johns,A Patch,M Gingras,D Miller,A Christ,T Bruxner,M Quinn,C Nourse,L Murtaugh,I Harliwong,S Idrisoglu,S Manning,E Nourbakhsh,S Wani,L Fink,O Holmes,V Chin,M Anderson,S Kazakoff,C Leonard,F Newell,N Waddell,S Wood,Q Xu,P Wilson,N Cloonan,K Kassahn,D Taylor,K Quek,A Robertson,**L Pantano**,L Mincarelli,L Sanchez,L Evers,J Wu,M Pinese,M Cowley,M Jones,E Colvin,A Nagrial,E Humphrey,L Chantrill,A Mawson,J Humphris,A Chou,M Pajic,C Scarlett,A Pinho,M Giry-Laterriere,I Rooman,J Samra,J Kench,J Lovell,N Merrett,C Toon,K Epari,N Nguyen,A Barbour,N Zeps,K Moran-Jones,N Jamieson,J Graham,F Duthie,K Oien,J Hair,R Grutzmann,A Maitra,C Iacobuzio-Donahue,C Wolfgang,R Morgan,R Lawlor,V Corbo,C Bassi,B Rusev,P Capelli,R Salvia,G Tortora,D Mukhopadhyay,G Petersen,P Australian,D Munzy,W Fisher,S Karim,J Eshleman,R Hruban,C Pilarsky,J Morton,O Sansom,A Scarpa,E Musgrove,U Bailey,O Hofmann,R Sutherland,D Wheeler,A Gill,R Gibbs,J Pearson,N Waddell,A Biankin,S Grimmond (2016) Genomic analyses identify molecular subtypes of pancreatic cancer Nature www.ncbi.nlm.nih.gov/pubmed/?term=26909576

bcbioRNASeq R package for bcbio RNA-seq analysis

Published in F1000Research, 2017

M Steinbaugh,L Pantano,R Kirchner,V Barrera,B Chapman,M Piper,M Mistry,R Khetani,K Rutherford,O Hofmann,J Hutchinson,S Ho

Abstract

RNA-seq analysis involves multiple steps from processing raw sequencing data to identifying, organizing, annotating, and reporting differentially expressed genes. bcbio is an open source, community-maintained framework providing automated and scalable RNA-seq methods for identifying gene abundance counts. We have developed bcbioRNASeq, a Bioconductor package that provides ready-to-render templates and wrapper functions to post-process bcbio output data. bcbioRNASeq automates the generation of high-level RNA-seq reports, including identification of differentially expressed genes, functional enrichment analysis and quality control analysis.

Recommended citation: M Steinbaugh,**L Pantano**,R Kirchner,V Barrera,B Chapman,M Piper,M Mistry,R Khetani,K Rutherford,O Hofmann,J Hutchinson,S Ho (2017) bcbioRNASeq R package for bcbio RNA-seq analysis F1000Research Coming soon

Viewing RNA-seq data on the entire human genome

Published in F1000Research, 2017

B Busby,E Weitz,L Pantano,J Zhu,B Upton

Abstract

RNA-Seq Viewer is a web application that enables users to visualize genome-wide expression data from NCBI’s Sequence Read Archive (SRA) and Gene Expression Omnibus (GEO) databases. The application prototype was created by a small team during a three-day hackathon facilitated by NCBI at Brandeis University. The backend data pipeline was developed and deployed on a shared AWS EC2 instance. Source code is available at https://github.com/NCBI-Hackathons/rnaseqview.

Recommended citation: B Busby,E Weitz,**L Pantano**,J Zhu,B Upton (2017) Viewing RNA-seq data on the entire human genome F1000Research Coming soon

Comparative analysis of LIN28-RNA binding sites identified at single nucleotide resolution

Published in RNA Biology, 2017

E Ransey,A Bjorkbom,V Lelyveld,P Biecek,L Pantano,J Szostak,P Sliz

Abstract

It remains a formidable challenge to characterize the diverse complexes of RNA binding proteins and their targets. While crosslink and immunoprecipitation (CLIP) methods are powerful techniques that identify RNA targets on a global scale, the resolution and consistency of these methods is a matter of debate. Here we present a comparative analysis of LIN28-pre-let-7 UV-induced crosslinking using a tandem mass spectrometry (MS/MS) and deep sequencing interrogation of in vitro crosslinked complexes. Interestingly, analyses by the two methods diverge in their identification of crosslinked nucleotide identity - whereas bioinformatics and sequencing analyses suggest guanine in mammalian cells, MS/MS identifies uridine. This work suggests the need for comprehensive analysis and validation of crosslinking methodologies.

Recommended citation: E Ransey,A Bjorkbom,V Lelyveld,P Biecek,**L Pantano**,J Szostak,P Sliz (2017) Comparative analysis of LIN28-RNA binding sites identified at single nucleotide resolution RNA Biology Coming soon

Maintenance of macrophage transcriptional programs and intestinal homeostasis by epigenetic reader SP140

Published in Science Immunology, 2017

S Mehta,D Cronkite,M Basavappa,T Saunders,F Adiliaghdam,H Amatullah,S Morrison,J Pagan,R Anthony,P Tonnerre,G Lauer,J Lee,S Digumarthi,L Pantano,S Ho,F Ji,R Sadreyev,C Zhou,A Mullen,V Kumar,Y Li,C Wijmenga,R Xavier,T Means,K Jeffrey

Abstract

Epigenetic “readers” that recognize defined posttranslational modifications on histones have become desirable therapeutic targets for cancer and inflammation. SP140 is one such bromodomain- and plant homeodomain (PHD)-containing reader with immune-restricted expression, and single-nucleotide polymorphisms (SNPs) within SP140 associate with Crohn’s disease (CD). However, the function of SP140 and the consequences of disease-associated SP140 SNPs have remained unclear. We show that SP140 is critical for transcriptional programs that uphold the macrophage state. SP140 preferentially occupies promoters of silenced, lineage-inappropriate genes bearing the histone modification H3K27me3, such as the HOXA cluster in human macrophages, and ensures their repression. Depletion of SP140 in mouse or human macrophages resulted in severely compromised microbe-induced activation. We reveal that peripheral blood mononuclear cells (PBMCs) or B cells from individuals carrying CD-associated SNPs within SP140 have defective SP140 messenger RNA splicing and diminished SP140 protein levels. Moreover, CD patients carrying SP140 SNPs displayed suppressed innate immune gene signatures in a mixed population of PBMCs that stratified them from other CD patients. Hematopoietic-specific knockdown of Sp140 in mice resulted in exacerbated dextran sulfate sodium (DSS)-induced colitis, and low SP140 levels in human CD intestinal biopsies correlated with relatively lower intestinal innate cytokine levels and improved response to anti-tumor necrosis factor (TNF) therapy. Thus, the epigenetic reader SP140 is a key regulator of macrophage transcriptional programs for cellular state, and a loss of SP140 due to genetic variation contributes to a molecularly defined subset of CD characterized by ineffective innate immunity, normally critical for intestinal homeostasis.

Recommended citation: S Mehta,D Cronkite,M Basavappa,T Saunders,F Adiliaghdam,H Amatullah,S Morrison,J Pagan,R Anthony,P Tonnerre,G Lauer,J Lee,S Digumarthi,**L Pantano**,S Ho,F Ji,R Sadreyev,C Zhou,A Mullen,V Kumar,Y Li,C Wijmenga,R Xavier,T Means,K Jeffrey (2017) Maintenance of macrophage transcriptional programs and intestinal homeostasis by epigenetic reader SP140 Science Immunology www.ncbi.nlm.nih.gov/pubmed/?term=28783698

Molecular phenotypic and sample-associated data to describe pluripotent stem cell lines and derivatives

Published in Scientific Data, 2017

K Daily,S Ho,L Schriml,P Dexheimer,N Salomonis,R Schroll,S Bush,M Keddache,C Mayhew,S Lotia,T Perumal,K Dang,L Pantano,A Pico,E Grassman,D Nordling,W Hide,A Hatzopoulos,P Malik,J Cancelas,C Lutzko,B Aronow,L Omberg

Abstract

The use of induced pluripotent stem cells (iPSC) derived from independent patients and sources holds considerable promise to improve the understanding of development and disease. However, optimized use of iPSC depends on our ability to develop methods to efficiently qualify cell lines and protocols, monitor genetic stability, and evaluate self-renewal and differentiation potential. To accomplish these goals, 57 stem cell lines from 10 laboratories were differentiated to 7 different states, resulting in 248 analyzed samples. Cell lines were differentiated and characterized at a central laboratory using standardized cell culture methodologies, protocols, and metadata descriptors. Stem cell and derived differentiated lines were characterized using RNA-seq, miRNA-seq, copy number arrays, DNA methylation arrays, flow cytometry, and molecular histology. All materials, including raw data, metadata, analysis and processing code, and methodological and provenance documentation are publicly available for re-use and interactive exploration at https://www.synapse.org/pcbc. The goal is to provide data that can improve our ability to robustly and reproducibly use human pluripotent stem cells to understand development and disease.

Recommended citation: K Daily,S Ho,L Schriml,P Dexheimer,N Salomonis,R Schroll,S Bush,M Keddache,C Mayhew,S Lotia,T Perumal,K Dang,**L Pantano**,A Pico,E Grassman,D Nordling,W Hide,A Hatzopoulos,P Malik,J Cancelas,C Lutzko,B Aronow,L Omberg (2017) Molecular phenotypic and sample-associated data to describe pluripotent stem cell lines and derivatives Scientific Data www.ncbi.nlm.nih.gov/pubmed/?term=28350385

Empirical comparison of reduced representation bisulfite sequencing and Infinium BeadChip reproducibility and coverage of DNA methylation in humans

Published in npj Genomic Medicine, 2017

J Carmona,W Accomando,A Binder,J Hutchinson,L Pantano,B Izzi,A Just,X Lin,J Schwartz,P Vokonas,S Amr,A Baccarelli,K Michels

Abstract

We empirically examined the strengths and weaknesses of two human genome-wide DNA methylation platforms: rapid multiplexed reduced representation bisulfite sequencing and Illumina’s Infinium BeadChip. Rapid multiplexed reduced representation bisulfite sequencing required less input DNA, offered more flexibility in coverage, and interrogated more CpG loci at a higher regional density. The Infinium covered slightly more protein coding, cancer-associated and mitochondrial-related genes, both platforms covered all known imprinting clusters, and rapid multiplexed reduced representation bisulfite sequencing covered more microRNA genes than the HumanMethylation450, but fewer than the MethylationEPIC. Rapid multiplexed reduced representation bisulfite sequencing did not always interrogate exactly the same CpG loci, but genomic tiling improved overlap between different libraries. Reproducibility of rapid multiplexed reduced representation bisulfite sequencing and concordance between the platforms increased with CpG density. Only rapid multiplexed reduced representation bisulfite sequencing could genotype samples and measure allele-specific methylation, and we confirmed that Infinium measurements are influenced by nearby single-nucleotide polymorphisms. The respective strengths and weaknesses of these two genome-wide DNA methylation platforms need to be considered when conducting human epigenetic studies.

Recommended citation: J Carmona,W Accomando,A Binder,J Hutchinson,**L Pantano**,B Izzi,A Just,X Lin,J Schwartz,P Vokonas,S Amr,A Baccarelli,K Michels (2017) Empirical comparison of reduced representation bisulfite sequencing and Infinium BeadChip reproducibility and coverage of DNA methylation in humans npj Genomic Medicine www.ncbi.nlm.nih.gov/pubmed/?term=29263828

Circulating miRNAs isomiRs and small RNA clusters in human plasma and breast milk

Published in PLoS ONE, 2018

M Rubio,M Bustamante,C Hernandez-Ferrer,D Fernandez-Orth,L Pantano,Y Sarria,M Pique-Borras,K Vellve,S Agramunt,R Carreras,X Estivill,J Gonzalez,A Mayor

Abstract

Circulating small RNAs, including miRNAs but also isomiRs and other RNA species, have the potential to be used as non-invasive biomarkers for communicable and non-communicable diseases. This study aims to characterize and compare small RNA profiles in human biofluids. For this purpose, RNA was extracted from plasma and breast milk samples from 15 healthy postpartum mothers. Small RNA libraries were prepared with the NEBNext® small RNA library preparation kit and sequenced in an Illumina HiSeq2000 platform. miRNAs, isomiRs and clusters of small RNAs were annotated using seqBuster/seqCluster framework in 5 plasma and 10 milk samples that passed the initial quality control. The RNA yield was 81 ng/mL [standard deviation (SD): 41] and 3985 ng/mL (SD: 3767) for plasma and breast milk, respectively. Mean number of good quality reads was 4.04 million (M) (40.01% of the reads) in plasma and 12.5M (89.6%) in breast milk. One thousand one hundred eighty two miRNAs, 12,084 isomiRs and 1,053 small RNA clusters that included piwi-interfering RNAs (piRNAs), tRNAs, small nucleolar RNAs (snoRNA) and small nuclear RNAs (snRNAs) were detected. Samples grouped by biofluid, with 308 miRNAs, 1,790 isomiRs and 778 small RNA clusters differentially detected. In summary, plasma and milk showed a different small RNA profile. In both, miRNAs, piRNAs, tRNAs, snRNAs, and snoRNAs were identified, confirming the presence of non-miRNA species in plasma, and describing them for the first time in milk.

Recommended citation: M Rubio,M Bustamante,C Hernandez-Ferrer,D Fernandez-Orth,**L Pantano**,Y Sarria,M Pique-Borras,K Vellve,S Agramunt,R Carreras,X Estivill,J Gonzalez,A Mayor (2018) Circulating miRNAs isomiRs and small RNA clusters in human plasma and breast milk PLoS ONE www.ncbi.nlm.nih.gov/pubmed/?term=29505615

talks

small RNAseq data in bcbio-nextgen

Published in MIT, 2015

Integration of small RNAseq data into bcbio-nextgen.

Characterization of the small RNA transcriptome using the bcbio-nextgen python framework

Published in Walt Disney World Yacht, 2016

The study of small RNA helps us understand some of the complexity of gene regulation of a cell. Of the different types of small RNAs, the most important in mammals are miRNA, tRNA fragments and piRNAs. The advantage of small RNA-seq analysis is that we can study all small RNA types simultaneously, with the potential to detect novel small RNAs. bcbio-nextgen is a community-developed Python framework that implements best practices for next-generation sequence data analysis and uses gold standard data for validation. We have extended bcbio to include a small RNA-seq analysis pipeline that performs quality control, removal of adapter contamination, annotation of miRNA, isomiRs and tRNAs, novel miRNA discovery, and genome-wide characterization of other types of small RNAs. The pipeline integrates tools such as miRDeep2, seqbuster, seqcluster and tdrMapper to facilitate annotation to small RNA categories. It produces an R Markdown template that helps with downstream statistical analyses in R, including quality control metrics and best practices for differential expression and clustering analyses. Finally, the pipeline generates an interactive HTML-based browser for visualization purposes. This is useful for characterizing novel small RNA types, working with non-model organisms, or providing a general profiling description. This browser shows the small RNA regions along with their genomic annotation, expression profile over the precursor, secondary structure, and the top expressed sequences. Here, we show the capabilities of the pipeline and validation using data from the miRQC project. We show that the quantification accuracy is around 95% for miRNAs. We obtained similar results for other types of small RNA molecules, demonstrating that we can reliably detect small RNAs without a dependency on specific databases.

miRNA and isomiR annotation

Published in MIT, 2016

Preliminary results on isomiR naming, and the advantage of creating systematic rules and a specific mapping score for read alignments to miRBase precursors.

miRTop: An open source community project for the development of a unified format file for miRNA data

Published in Reed College, 2018

slides F1000

teaching

Small RNAseq analysis

Published in Harvard Chan School, 2016

Small RNA-seq analysis: possibilities and expectations. Standard tools to use, quality metrics to look at, etc.

Differential Expression analysis

Published in Harvard Chan School, 2017

Assistant teacher: supporting the lectures by answering questions and commenting on advanced analyses. material

L Pantano

