Transcriptomic resources for the medicinal legume Mucuna pruriens: de novo transcriptome assembly, annotation, identification and validation of EST-SSR markers [Next Gen Genomics Facility]

Title: Transcriptomic resources for the medicinal legume Mucuna pruriens: de novo transcriptome assembly, annotation, identification and validation of EST-SSR markers
Publication Type: Journal Article
Year of Publication: 2017
Authors: Sathyanarayana N, Pittala RKumar, Tripathi PKumar, Chopra R, Singh HRussiachan, Belamkar V, Bhardwaj PKumar, Doyle JJ, Egan AN
Journal: BMC Genomics
Volume: 18
Issue: 1
Pagination: 409
Date Published: 2017 May 25
ISSN: 1471-2164
Keywords: Data Mining; Expressed Sequence Tags; Gene Expression Profiling; Microsatellite Repeats; Molecular Sequence Annotation; Mucuna; Polymorphism, Genetic; RNA, Messenger; Transcription Factors
Abstract

BACKGROUND: The medicinal legume Mucuna pruriens (L.) DC. has attracted attention worldwide as a source of the anti-Parkinson's drug L-Dopa. It is also a popular green manure cover crop that offers many agronomic benefits including high protein content, nitrogen fixation and soil nutrients. The plant currently lacks genomic resources and there is limited knowledge on gene expression, metabolic pathways, and genetics of secondary metabolite production. Here, we present transcriptomic resources for M. pruriens, including a de novo transcriptome assembly and annotation, as well as differential transcript expression analyses between root, leaf, and pod tissues. We also develop microsatellite markers and analyze genetic diversity and population structure within a set of Indian germplasm accessions.

RESULTS: A total of 191,233,242 bp of cleaned reads were assembled into 67,561 transcripts with a mean length of 626 bp and an N50 of 987 bp. Assembled sequences were annotated using BLASTX against public databases, with over 80% of transcripts annotated. We identified 7,493 simple sequence repeat (SSR) motifs, including 787 repeats polymorphic between the parents of a mapping population. A set of 134 SSRs from expressed sequence tags (ESTs) was screened against 23 M. pruriens accessions from India, with 52 EST-SSRs retained after quality control. Population structure analysis using a Bayesian framework implemented in fastSTRUCTURE produced groupings closely matching those from distance-based (neighbor-joining) and principal component analyses, with most accessions clustering by geographical origin. Pair-wise comparison of transcript expression in leaves, roots and pods identified 4,387 differentially expressed transcripts, with the highest number occurring between roots and leaves. Differentially expressed transcripts were enriched with transcription factors and transcripts annotated as belonging to secondary metabolite pathways.

CONCLUSIONS: The M. pruriens transcriptomic resources generated in this study provide foundational resources for gene discovery and development of molecular markers. Polymorphic SSRs identified can be used for genetic diversity, marker-trait analyses, and development of functional markers for crop improvement. The results of differential expression studies can be used to investigate genes involved in L-Dopa synthesis and other key metabolic pathways in M. pruriens.
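
For readers unfamiliar with two quantities reported in the Results, the N50 of an assembly and the count of SSR motifs, the following is a minimal illustrative sketch and not the pipeline used in the study. The function names `n50` and `find_ssrs` are hypothetical, and the minimum repeat counts (six for dinucleotide motifs, five for trinucleotide motifs) are assumptions chosen for illustration; the abstract does not state the thresholds or the software used for SSR detection.

```python
import re


def n50(lengths):
    """Return the N50 of a set of transcript lengths.

    N50 is the length L such that transcripts of length >= L
    together contain at least half of all assembled bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def find_ssrs(seq, min_repeats=None):
    """Find perfect di- and trinucleotide repeats in a sequence.

    min_repeats maps motif size to the minimum number of tandem
    copies. The defaults are illustrative assumptions, not values
    taken from the paper.
    Returns a sorted list of (start, motif, copies) tuples.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit, min_rep in min_repeats.items():
        # One motif of `unit` bases followed by at least
        # (min_rep - 1) further copies of the same motif.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                # Skip homopolymer runs such as AAAAAA.
                continue
            copies = len(match.group(0)) // unit
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    print(n50([100, 200, 300, 400]))
    print(find_ssrs("TTT" + "AG" * 7 + "CCC" + "GAT" * 5 + "TT"))
```

Applied to the lengths of all assembled transcripts, `n50` would yield a figure comparable to the 987 bp reported above; `find_ssrs` shows only the basic idea of motif scanning, and dedicated SSR tools also handle longer motifs and compound repeats.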

DOI: 10.1186/s12864-017-3780-9
Alternate Journal: BMC Genomics
PubMed ID: 28545396
PubMed Central ID: PMC5445377