Download the BIAS-PROFS GPCR dataset (often referred to as the "GDS" in publications) using the link below:
Download (1161kb, zip file)
Please cite this dataset using the following reference in any derivative publications:
On the hierarchical classification of G Protein-Coupled Receptors.
Matthew N. Davies, Andrew Secker, Alex A. Freitas, Miguel Mendao, Jon Timmis and Darren R. Flower.
Bioinformatics 23(23), pp. 3113-3118, December 2007. Oxford Journals.
Each class in the dataset is stored as a single text file. Within each file, every sequence consists of a header line followed by the sequence itself on the next line, and each sequence occupies exactly one line. This simplifies automated processing of the dataset, but when viewing the files in a text viewer, word wrap (line wrap) must be turned on to see a whole sequence.
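The two-line layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset distribution: the function names `parse_class_lines` and `read_class_file` are hypothetical, the header shown in the usage note is made up, and skipping blank lines is a defensive assumption because the format description does not mention them.

```python
from pathlib import Path
from typing import List, Tuple


def parse_class_lines(lines: List[str]) -> List[Tuple[str, str]]:
    """Pair each header line with the single sequence line that follows it."""
    # Strip newlines and drop blank lines (assumption: blank lines carry no data).
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % 2 != 0:
        raise ValueError("expected header/sequence pairs, got an odd number of lines")
    # Even positions hold headers, odd positions hold the matching sequences.
    return [(cleaned[i], cleaned[i + 1]) for i in range(0, len(cleaned), 2)]


def read_class_file(path: str) -> List[Tuple[str, str]]:
    """Read one class file and return its (header, sequence) records."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return parse_class_lines(text.splitlines())
```

For example, `parse_class_lines([">example header", "MNGTEGPNF"])` returns one `(header, sequence)` pair. An odd number of non-blank lines raises an error rather than silently misaligning headers and sequences.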
The dataset contains 8354 sequences in 108 classes. Note that some publications remove classes with fewer than 10 examples, in which case the dataset contains 87 classes and 8222 sequences.
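The reduced version of the dataset can be derived from the full one by discarding small classes after loading. The following is a sketch under stated assumptions: the dataset is held as a dictionary mapping class name to a list of sequences, and the helper name `drop_small_classes` and the placeholder sequences are hypothetical. The threshold of 10 is the one quoted above.

```python
from typing import Dict, List


def drop_small_classes(
    classes: Dict[str, List[str]], min_examples: int = 10
) -> Dict[str, List[str]]:
    """Keep only the classes that have at least `min_examples` sequences."""
    return {
        name: sequences
        for name, sequences in classes.items()
        if len(sequences) >= min_examples
    }


# Placeholder data: sequence contents are dummies, only the list lengths matter.
example = {
    "BLT1": ["SEQ"] * 5,   # fewer than 10 examples, removed
    "BLT2": ["SEQ"] * 11,  # kept
    "Lutropin": ["SEQ"] * 9,  # fewer than 10 examples, removed
}
kept = drop_small_classes(example)
print(sorted(kept))
```

A class with exactly 10 examples is kept, since only classes with fewer than 10 are removed.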
The table below shows the structure of the dataset and the number of example sequences at each level of the hierarchy. Each row corresponds to one of the 108 classes. Classes with fewer than 10 examples (shown in grey in the original table) are marked with an asterisk.

| Level 1 | Sequences | Level 2 | Sequences | Level 3 | Sequences |
| --- | --- | --- | --- | --- | --- |
| ClassA | 5526 | Adrenergic | 95 | Adrenergic | 95 |
| | | Amine | 1489 | Adrenoreceptor | 174 |
| | | | | Dopamine | 228 |
| | | | | Histamine | 53 |
| | | | | MuscAcetyl | 159 |
| | | | | Muscarinicacetylchol | 186 |
| | | | | Octopamine | 37 |
| | | | | Serotonin | 503 |
| | | | | Traceamine | 149 |
| | | Anaphylatoxin | 30 | Anaphylatoxin | 30 |
| | | Cannabinoid | 151 | Cannabinoid | 151 |
| | | Fmetleuphe | 7 | Fmetleuphe | 7* |
| | | GRHR | 105 | AKH | 8* |
| | | | | Cora | 6* |
| | | | | GRHR | 91 |
| | | Hormone | 205 | FollicleStim | 66 |
| | | | | Gonadotrophin | 93 |
| | | | | Lutropin | 9* |
| | | | | Thyrotropin | 37 |
| | | Interleukin8 | 67 | Interleukin8 | 67 |
| | | Leuko | 16 | BLT1 | 5* |
| | | | | BLT2 | 11 |
| | | Lyso | 65 | LysoEdg2 | 14 |
| | | | | LysoEdg4 | 9* |
| | | | | LysoEdg7 | 7* |
| | | | | SphingoEdg1 | 7* |
| | | | | SphingoEdg3 | 5* |
| | | | | SphingoEdg5 | 8* |
| | | | | SphingoEdg6 | 6* |
| | | | | SphingoEdg8 | 9* |
| | | Melaton | 90 | Melaton | 90 |
| | | Nucleotide | 266 | Adenosine | 93 |
| | | | | Purinergic | 173 |
| | | Olfactory | 90 | Olfactory | 90 |
| | | Peptide | 2713 | Adrenocorticotropic | 18 |
| | | | | Adrenomedullin | 15 |
| | | | | Allatostatin | 15 |
| | | | | Angiotensin | 128 |
| | | | | Bombesin | 50 |
| | | | | Bradykinin | 77 |
| | | | | C5A | 33 |
| | | | | Chemokine | 575 |
| | | | | Cholecystokinin | 33 |
| | | | | Conopressin | 2* |
| | | | | DrostatinC | 6* |
| | | | | Duffy | 65 |
| | | | | Endothelin | 81 |
| | | | | fMetLeuPhe | 7* |
| | | | | Galanin | 66 |
| | | | | Interleukin8 | 67 |
| | | | | Kiss1 | 12 |
| | | | | MelaninConc | 32 |
| | | | | Melanocortin | 314 |
| | | | | Melanocyte | 90 |
| | | | | Mesotocin | 5* |
| | | | | Neuromedin | 60 |
| | | | | NeuromedinB-U | 47 |
| | | | | Neuropeptide | 118 |
| | | | | NeuropeptideFF | 17 |
| | | | | Neurotensin | 35 |
| | | | | Opoid | 170 |
| | | | | Orexigenic | 7* |
| | | | | Orexin | 40 |
| | | | | Oxytocin | 41 |
| | | | | Prokineticin | 22 |
| | | | | Prolactin | 19 |
| | | | | Proteinase | 34 |
| | | | | Somatostatin | 157 |
| | | | | SubstanceK | 12 |
| | | | | SubstanceP | 19 |
| | | | | Sulfakinin | 4* |
| | | | | Tachykinin | 69 |
| | | | | Thrombin | 34 |
| | | | | UrotensinII | 13 |
| | | | | Vasopressin | 93 |
| | | | | Vasotocin | 11 |
| | | Platelet | 21 | Platelet | 21 |
| | | Prostanoid | 45 | Prostacyclin | 12 |
| | | | | Prostaglandin | 27 |
| | | | | Thromboxane | 6* |
| | | Thyro | 71 | ETHR | 2* |
| | | | | Growth | 34 |
| | | | | Thyro | 35 |
| ClassB | 625 | BrainSpec | 26 | BrainSpec | 26 |
| | | Cadherin | 65 | Cadherin | 65 |
| | | Calcitonin | 55 | Calcitonin | 55 |
| | | Corticotropin | 53 | Corticotropin | 53 |
| | | Diuretic | 7 | Diuretic | 7* |
| | | EMR1 | 14 | EMR1 | 14 |
| | | Gastric | 13 | Gastric | 13 |
| | | Glucagon | 24 | Glucagon | 24 |
| | | GrowthHorm | 50 | GrowthHorm | 50 |
| | | Latrophilin | 111 | Latrophilin | 111 |
| | | Methuselah | 25 | Methuselah | 25 |
| | | PACAP | 63 | PACAP | 63 |
| | | Parathyroid | 40 | Parathyroid | 40 |
| | | Secretin | 18 | Secretin | 18 |
| | | Vasoactive | 61 | Vasoactive | 61 |
| ClassC | 2172 | BOSS | 55 | BOSS | 55 |
| | | CalcSense | 588 | CalcLike | 30 |
| | | | | ExtraCalc | 27 |
| | | | | Pheromone | 531 |
| | | GABA | 86 | GABA | 86 |
| | | GlutaMeta | 178 | GlutaMeta | 178 |
| | | PutPher | 311 | PutPher | 311 |
| | | Taste | 954 | Taste | 954 |
| ClassD | 13 | Pheromone | 13 | AlphaFac | 13 |
| ClassE | 18 | cAMP | 18 | cAMP | 18 |
Total number of sequences - 8354
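
The figures quoted on this page can be cross-checked with simple arithmetic. The sketch below is only a sanity check, not part of the dataset: it uses the five top-level totals from the table and the sizes of the 21 classes in the table that have fewer than 10 examples. The variable names are illustrative.

```python
# Top-level totals as listed in the table.
level1_totals = {"ClassA": 5526, "ClassB": 625, "ClassC": 2172, "ClassD": 13, "ClassE": 18}

# Sizes of the 21 classes with fewer than 10 examples, in table order
# (Fmetleuphe, AKH, Cora, Lutropin, BLT1, LysoEdg4, LysoEdg7, SphingoEdg1,
#  SphingoEdg3, SphingoEdg5, SphingoEdg6, SphingoEdg8, Conopressin, DrostatinC,
#  fMetLeuPhe, Mesotocin, Orexigenic, Sulfakinin, Thromboxane, ETHR, Diuretic).
small_class_sizes = [7, 8, 6, 9, 5, 9, 7, 7, 5, 8, 6, 9, 2, 6, 7, 5, 7, 4, 6, 2, 7]

total_sequences = sum(level1_totals.values())
remaining_classes = 108 - len(small_class_sizes)
remaining_sequences = total_sequences - sum(small_class_sizes)

print(total_sequences, remaining_classes, remaining_sequences)  # 8354 87 8222
```

The results match the totals stated above: 8354 sequences overall, and 87 classes with 8222 sequences once the small classes are removed.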