PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

Sarika, Sarika; Arora, Vasu; Iquebal, M. A.; Rai, Anil; Kumar, Dinesh

doi:10.1093/database/bas054

Abstract

Molecular markers play a significant role for crop improvement in desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in long term. Pigeonpea (Cajanus cajan L.) is the recently sequenced legume by global consortium led by ICRISAT (Hyderabad, India) and been analysed for gene prediction, synteny maps, markers, etc. We present PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for pigeonpea genome, based on chromosome wise as well as location wise search of primers. Total of 123 387 Short Tandem Repeats (STRs) were extracted from pigeonpea genome, available in public domain using MIcroSAtellite tool (MISA). The database is an online relational database based on ‘three-tier architecture’ that catalogues information of microsatellites in MySQL and user-friendly interface is developed using PHP. Search for STRs may be customized by limiting their location on chromosome as well as number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of selected markers with left and right flankings of size up to 500 bp. This will enable researchers to select markers of choice at desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, markers’ search based on characteristics and location of STRs is expected to be beneficial for researchers.

Database URL:http://cabindb.iasri.res.in/pigeonpea/

Introduction

Pigeonpea or ‘tur’ is one of the most important pulse crops belonging to family Fabaceae and scientifically known as Cajanus cajan. It is known to be cultivated in more than 25 countries of the world such as Indian subcontinent, Africa and Central America in purview of the favourable climatic conditions. It is grown on 4.7 million hectares and ranks sixth among grain legumes in production. The world production of this crop figures around 3.69 million tons annually (1). Economic loss due to biotic stress factors has been estimated to be $US 8.48 billion earlier, while abiotic stress may reduce pigeonpea production by 50% (IIPR vision). Pigeonpea is a versatile crop, consumed as a staple food, green vegetable and as a fodder crop besides being a boon for soil. It improves soil fertility through nitrogen fixation as well as from the leaf fall and recycling of the nutrients (2, 3).

Indian economy depends significantly on pulses and India is the largest producer, consumer and importer of pulses. India’s pigeonpea production stands at around 2.86 million tons, which is four-fifth share in the world total pigeonpea produced. About 90% of the global pigeonpea area falls in India (1). Pigeonpea is the second largest pulse crop of India accounting for 15.8% of total pulse production (18.09 million tons) during 2010–11 (Source: Ministry of Agriculture, Government of India). The domestic consumption of tur in India is estimated at around 3.4 million tons (4). The production of pigeonpea is not sufficient to satisfy the domestic demand, hence it has to rely on imports of the crop especially from Myanmar and Tanzania. Still the demand per capita is not fulfilled, leading to malnutrition worldwide. To combat this gap between production and consumption, genomics may play a significant role. Researchers in today’s world are looking to produce crops that will possess desirable characteristics, such as high yields, resistance to disease and many other characteristics that will benefit the crop in the long term. Genome sequencing has significantly advanced in recent years. Molecular markers can play a significant role in crop improvement. The knowledge of genetic structure based on molecular markers such as microsatellites (5) plays important role in population genetic studies, gene regulation and genome evolution (6–8). These have proved useful in marker-assisted selection of desirable traits to which they are linked, hence are the markers of choice for genome mapping studies. Genome sequencing has contributed greatly to biological science. Till date, many crop genomes have been sequenced, pigeonpea being the recently sequenced genome. There is need to develop platform for mining genic microsatellites to ensure their better utilization as molecular markers, their abundance, distribution, evolution and putative function, if any.

Information available as Supplementary File (URL: http://www.icrisat.org/gt-bt/iipg/sup_files/Table15.html) is not chromosome wise but it is based on scaffold (9). Unless it is chromosome wise, it cannot be directly used in gene or QTL mapping. Here, we present mining STRs chromosome wise along with user-defined primer designing. Though high numbers of STR markers are available in pigeonpea, their effective use is very limited because the users cannot select primers at equal distance over particular chromosome especially in QTL mapping or fine mapping of the genes. Moreover, since there is no publicly available STR marker database with more user-friendly parameters, the present database with add-on tool having user need-based primer designing options is being made available for global community. Our database is unique and user friendly as it designs primer on loci of choice with amplicon range up to 500 bp, further facilitating the low-cost rapid revalidation in simple agrose gel.

Database development

Database flow

The chromosome-wise pigeonpea genome, available in public domain (http://www.icrisat.org/gt-bt/iipg/genomedata.zip), was downloaded in FASTA format. The tool PIgeonPEa Microsatellite DataBase (PIPEMicroDB) stores catalogue of microsatellites fetched from pigeonpea genome. The chromosome-wise sequences were used to extract microsatellite markers using MIcroSAtellite tool (MISA) (10). The output of MISA was processed using PERL scripts. The data were assembled in proper format in order to create the data file which was further imported to MySQL database. The query for STRs may be made chromosome wise along with the microsatellite characteristics such as motif type, repeat motif and repeat kind. Furthermore, the advance search may be made with the range of chromosomal location, GC content, number of base pairs and copy number. For the graphical user interface, PHP was used. The primers are generated using primer3 standalone tool. For primer generation, only STRs have been considered and not scaffolds. The data flow of tool is illustrated in Figure 1.

Figure 1

Open in new tab Download slide

Data exchange flow diagram for PIPEMicroDB.

Database architecture

The PIPEMicroDB is an online relational database that catalogues information about the microsatellite repeats of the recently sequenced pigeonpea. All the microsatellite markers extracted from pigeonpea genome have been generated using MISA. The database architecture is a ‘three-tier architecture’ (Figure 2) with a client tier, middle tier and database tier. This user-friendly interface for the database has been developed using PHP (Hypertext Preprocessor), which is an open-source server-side scripting language. The microsatellite marker information is accessed from MySQL database. In first tier of the architecture, the in silico mined STRs through MISA were stored in MySQL database. In the middleware, user need-based customized query provisions have been made. For primer designing, Primer3 standalone code computes primers on user request. The information generated at the client end, i.e. third tier of the architecture are list of multiple primers along with their respective melting temperature, GC content, start position and product size (amplicon size).

Figure 2

Open in new tab Download slide

Three-tier architecture of PIPEMicroDB.

PIPEMicroDB has eight tabs (Home, About, Database, Analysis, Tutorial, Links, Contact and Team). General information of the developed microsatellite database, information about pigeonpea, microsatellite markers and analysis of the pigeonpea genome has been discussed. The tutorial of this database contains the guidelines for users and terminologies used in the database contents. PIPEMicroDB is appended with other useful links, the team and contact persons.

Accessing database

PIPEMicroDB can be accessed to extract microsatellites based on motif type (mono, di, tri, tetra, penta and hexa), repeat motif and repeat kind (perfect and compound). These microsatellites can be mined based on the choice of chromosome where more than one chromosome may also be selected. Also, the flexibility has been given for limiting the search based on chromosomal location as well as the number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database, which may be useful for the breeders and biotechnologists. The flexibilities provided will enable researcher to select markers of choice at desired interval over the chromosome which may be coding or non-coding. However, if user wishes to know whether they are from coding or non-coding region, it can be traced/verified by our numeric counter showing start and end position of each amplicons over a particular chromosome. Furthermore, one can use each individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked QTL. PIPEMicroDB also fulfils this customized search according to the requirement of researcher, based on ranges of GC content (%), base pair and copy number.

PIPEMicroDB is further appended with Primer3. After selecting the STR based on customized search with the help of radiobutton, the marker of interest may be further carried for primer designing. The user may go for primer designing as a default setting or may go for modified degenerate bases in the sequence, if present. The selected markers may be designed with left and right flankings of size up to 500 bp. The output generated gives five primers along with the melting temperature, product size and GC content. The novel add-on for degenerate bases incorporated in this database search gives the users flexibility to replace degenerate bases with any of the alternative bases (A, T, G, C). This feature has been added to resolve the issue of some of the degenerate bases present in current pigeonpea genome assembly making the primer designing very difficult otherwise. The gap present in the region over chromosome has been deliberately avoided while designing the primer in flanking region of STR so that there is no inaccuracy in the predicted amplicon size for user’s ease in genotyping.

Discussion and conclusion

Utility of the database

PIPEMicroDB is of great use to pigeonpea breeders. The customization of this tool for search based on chromosome may be used for mapping of gene and QTL markers for crop improvement purpose. It is likely to be accessed by biologists engaged in research with diverse objectives in the crop primarily to develop molecular markers and also to understand the functional significance of microsatellites in regulating gene expression and genome evolution. The comprehensive options to search for simple and compound microsatellites repeats in the genic regions allow users to explore new avenues of investigations on these repeats. The primer designing for PCR amplification of desired motifs will facilitate studies on mutability, microsatellite abundance, etc. Association of microsatellites with a particular disease or phenotype may also be explored. Microsatellite data can also be used to investigate various anomalies using candidate gene approach. The database may be used for various research programmes, especially for genome mapping and gene tagging. This microsatellite database will serve as an important application for extracting information in order to design experiments in new directions elucidating novel roles and functions of microsatellites.

Analysis of pigeonpea genome

The whole genome was analysed for getting an overview of the pigeonpea genome. It was observed that 87% and 13% of the STR markers were of simple and compound (includes both simple and interrupted) type, respectively (Figure 3). Also, the ‘single’ repeat type was found to occur dominantly followed by ‘di’ type (Figure 4). Figure 5 shows the distribution of mono, di, tri, tetra, penta and hexa STRs in the whole genome of pigeonpea. The GC content between the range 0% and 25% occurs maximum as shown in Figure 6. Table 1 shows the distribution of simple and compound STRs along with repeat types (mono, di, tri, tetra, penta and hexa) on each chromosome.

Figure 3

Open in new tab Download slide

Distribution of STR marker types.

Figure 4

Open in new tab Download slide

Distribution of repeat types.

Figure 5

Open in new tab Download slide

Distribution of repeat kinds.

Figure 6

Open in new tab Download slide

Distribution based on GC content.

Table 1

Chromosome wise distribution of STRs

Chromosomes	Simple						Compound
	Mono	Di	Tri	Tetra	Penta	Hexa
Chromosome 1	4771	2255	628	91	21	21	1142
Chromosome 2	9827	4918	1339	173	39	55	2492
Chromosome 3	8183	3841	1019	124	29	37	1941
Chromosome 4	3581	1620	440	69	11	19	808
Chromosome 5	1429	668	173	30	4	9	337
Chromosome 6	6757	3231	834	115	29	32	1569
Chromosome 7	5145	2568	712	113	14	15	1261
Chromosome 8	5421	2644	720	103	19	25	1286
Chromosome 9	2788	1318	330	43	7	10	630
Chromosome 10	5469	2685	724	97	15	20	1340
Chromosome 11	12266	6018	1581	218	50	62	2959

Chromosomes	Simple						Compound
	Mono	Di	Tri	Tetra	Penta	Hexa
Chromosome 1	4771	2255	628	91	21	21	1142
Chromosome 2	9827	4918	1339	173	39	55	2492
Chromosome 3	8183	3841	1019	124	29	37	1941
Chromosome 4	3581	1620	440	69	11	19	808
Chromosome 5	1429	668	173	30	4	9	337
Chromosome 6	6757	3231	834	115	29	32	1569
Chromosome 7	5145	2568	712	113	14	15	1261
Chromosome 8	5421	2644	720	103	19	25	1286
Chromosome 9	2788	1318	330	43	7	10	630
Chromosome 10	5469	2685	724	97	15	20	1340
Chromosome 11	12266	6018	1581	218	50	62	2959

Open in new tab

Table 1

Chromosome wise distribution of STRs

Chromosomes	Simple						Compound
	Mono	Di	Tri	Tetra	Penta	Hexa
Chromosome 1	4771	2255	628	91	21	21	1142
Chromosome 2	9827	4918	1339	173	39	55	2492
Chromosome 3	8183	3841	1019	124	29	37	1941
Chromosome 4	3581	1620	440	69	11	19	808
Chromosome 5	1429	668	173	30	4	9	337
Chromosome 6	6757	3231	834	115	29	32	1569
Chromosome 7	5145	2568	712	113	14	15	1261
Chromosome 8	5421	2644	720	103	19	25	1286
Chromosome 9	2788	1318	330	43	7	10	630
Chromosome 10	5469	2685	724	97	15	20	1340
Chromosome 11	12266	6018	1581	218	50	62	2959

Chromosomes	Simple						Compound
	Mono	Di	Tri	Tetra	Penta	Hexa
Chromosome 1	4771	2255	628	91	21	21	1142
Chromosome 2	9827	4918	1339	173	39	55	2492
Chromosome 3	8183	3841	1019	124	29	37	1941
Chromosome 4	3581	1620	440	69	11	19	808
Chromosome 5	1429	668	173	30	4	9	337
Chromosome 6	6757	3231	834	115	29	32	1569
Chromosome 7	5145	2568	712	113	14	15	1261
Chromosome 8	5421	2644	720	103	19	25	1286
Chromosome 9	2788	1318	330	43	7	10	630
Chromosome 10	5469	2685	724	97	15	20	1340
Chromosome 11	12266	6018	1581	218	50	62	2959

Open in new tab

To make best use of this web-based tool, the wet lab validation of these STRs warranted in order to reap the benefit of genomics in endeavour of improvement in pigeonpea productivity.

STR validations

In silico validation of in vitro validated primers was done. The set of STR markers (11) were evaluated in the database using PERL script and the results are presented in Table 2.

Table 2

STR validation result of primers from pigeonpea

	Kumawat et al. (11)
Total no. of primers reported	28
No. of positive primers (forward)	11
No. of positive primers (reverse)	12
No. of positive primers (common to both forward and reverse)	11

	Kumawat et al. (11)
Total no. of primers reported	28
No. of positive primers (forward)	11
No. of positive primers (reverse)	12
No. of positive primers (common to both forward and reverse)	11

Open in new tab

Table 2

STR validation result of primers from pigeonpea

	Kumawat et al. (11)
Total no. of primers reported	28
No. of positive primers (forward)	11
No. of positive primers (reverse)	12
No. of positive primers (common to both forward and reverse)	11

	Kumawat et al. (11)
Total no. of primers reported	28
No. of positive primers (forward)	11
No. of positive primers (reverse)	12
No. of positive primers (common to both forward and reverse)	11

Open in new tab

Limitations

The total numbers of STR markers reported are 123 387, which needs wet lab validation. Furthermore, the primers designed here can only be used in simple singleplex PCR, not for multiplex PCR used for throughput analysis. However, the loci searched can be of immense use as it has location and chromosome, which is required input for multiplex designing. Relatively higher abundance of mono- over di-nucleotide repeat type might be due to inherent limitation of the 454 pyrosequencing technology which adds more mono-nucleotide causing sequencing error (12).

Availability

PIPEMicroDB can be accessed freely at http://cabindb.iasri.res.in/pigeonpea/.

Funding

National Agricultural Innovation Project, Indian Council of Agricultural Research, New Delhi, Ministry of Agriculture, Govt. of India.

Conflict of interest. None declared.

Acknowledgements

The technical assistance of Jai Bhagwan in maintaining the web server and A.R. Paul in designing the logo of PIPEMicroDB is thankfully acknowledged. The authors acknowledge the critical input of all the four anonymous reviewers and associate editor in improvement of the manuscript.

References

1

http://faostat.fao.org (15 June 2012, date last accessed)

2

Snapp

SS

,

Rohrbach

DD

,

Simtowe

F

, et al.

Sustainable soilmanagement options for Malawi: can smallholder farmers grow more legumes? Agri

,

Ecosys. Environ.

,

2002

, vol.

91

(pg.

159

-

174

)

Google Scholar

Crossref

WorldCat

3

Mapfumes

P

,

Mpepereki

S

,

Mafongoya

P

.

Waddington

SR

,

Murwira

HK

,

Kumwenda

JDT

, et al.

Pigeonpea in Zimbabwe: A new crop with potential

,

Proceedings of the Soil Fertility Network Results and Planning Workshop

,

1998

Soil Fert Net/CIMMYT, ISBN 970-648-006-4, pp. 93–98

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

4

http://www.crnindia.com/commodity/tur.html (15 June 2012, date last accessed)

5

Gonçalves

EC

,

Silva

A

,

Barbosa

MSR

, et al.

Isolation and characterization of microsatellite loci in Amazonian red-handed howlers Alouatta belzebul (Primates, Plathyrrini)

,

Mol. Ecol.

,

2004

, vol.

4

(pg.

406

-

408

)

Google Scholar

Crossref

WorldCat

6

Kashi

Y

,

King

DG

.

Simple sequence repeats as advantageous mutators in evolution

,

Trends Genet.

,

2006

, vol.

22

(pg.

253

-

259

)

7

Li

YC

,

Korol

AB

,

Fahima

T

, et al.

Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review

,

Mol. Ecol.

,

2002

, vol.

11

(pg.

2453

-

2465

)

8

Li

YC

,

Korol

AB

,

Fahima

T

, et al.

Microsatellites within genes: structure, function and evolution

,

Mol. Biol. Evol.

,

2004

, vol.

21

(pg.

991

-

1007

)

9

Varshney

RK

,

Chen

W

,

Li

Y

, et al.

Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers

,

Nat. Biotechnol.

,

2011

, vol.

30

(pg.

83

-

89

)

10

http://pgrc.ipk-gatersleben.de/misa/ (15 June 2012, date last accessed)

11

Kumawat

G

,

Raje

RS

,

Bhutani

S

, et al.

Molecular mapping of QTLs for plant type and earliness traits in pigeonpea (Cajanus cajan L. Millsp.)

,

BMC Genet.

,

2012

, vol.

13

(pg.

84

-

100

)

12

Haseneyer

G

,

Schmutzer

T

,

Seidel

M

, et al.

From RNA-seq to large-scale genotyping-genomics resources for rye (Secale cereal L.)

,

BMC Plant Biol.

,

2011

, vol.

11

(pg.

131

-

143

)

Author notes

Citation details: Sarika, Vasu Arora, M. A. Iquebal, Anil Rai, and Dinesh Kumar. PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome. Database (2012) Vol. 2012: article ID bas054; doi:10.1093/database/bas054.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Download all slides

Month:	Total Views:
October 2016	1
November 2016	5
December 2016	6
January 2017	5
February 2017	18
March 2017	6
May 2017	8
June 2017	7
July 2017	6
August 2017	5
November 2017	3
December 2017	24
January 2018	17
February 2018	14
March 2018	25
April 2018	11
May 2018	16
June 2018	14
July 2018	24
August 2018	24
September 2018	15
October 2018	8
November 2018	12
December 2018	13
January 2019	14
February 2019	12
March 2019	9
April 2019	16
May 2019	16
June 2019	17
July 2019	17
August 2019	23
September 2019	13
October 2019	19
November 2019	16
December 2019	12
January 2020	17
February 2020	10
March 2020	7
April 2020	6
May 2020	14
June 2020	13
July 2020	13
August 2020	19
September 2020	17
October 2020	9
November 2020	10
December 2020	11
January 2021	15
February 2021	13
March 2021	16
April 2021	9
May 2021	40
June 2021	15
July 2021	11
August 2021	24
September 2021	12
October 2021	11
November 2021	21
December 2021	6
January 2022	9
February 2022	23
March 2022	13
April 2022	23
May 2022	13
June 2022	20
July 2022	7
August 2022	17
September 2022	107
October 2022	121
November 2022	4
December 2022	14
January 2023	4
February 2023	8
March 2023	16
April 2023	25
May 2023	9
June 2023	7
July 2023	6
August 2023	12
September 2023	1
October 2023	13
November 2023	10
December 2023	26
January 2024	17
February 2024	35
March 2024	13
April 2024	12

Article Contents

PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

Abstract

Introduction

Database development

Database flow

Database architecture

Accessing database

Discussion and conclusion

Utility of the database

Analysis of pigeonpea genome

STR validations

Limitations

Availability

Funding

Acknowledgements

References

Author notes

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Article Contents

PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

Abstract

Introduction

Database development

Database flow

Database architecture

Accessing database

Discussion and conclusion

Utility of the database

Analysis of pigeonpea genome

STR validations

Limitations

Availability

Funding

Acknowledgements

References

Author notes

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only