Version 3
This mirror was last updated on 03/02/2011
What's in a GeneCard?
This page provides information about the various GeneCards sections and tables.
- The sections that follow are linked to by the GeneCards links labeled About, About this table, About this scheme, and About these images found in the corresponding GeneCards section.
- Superscripts in the data refer to the sources (shown on the left column of the card) from where the data was extracted.
- Tooltips offering explanatory information about images can be viewed by placing your mouse over the images (see expression).
- Text background color changes for easy identification of mouseovers.
- To keep the GeneCards page more compact, many of the tables/columns initially offer only partial, high-scoring results (e.g., the top 10 SNPs sorted by type, with coding, nonsynonymous SNPs shown first; the chemical compounds that match the gene most strongly; etc.). In these cases, a hyperlink is always provided for viewing all of the information available for that item.
- Categories are based on Entrez Gene type and status, as well as several other factors.
- The former categories 'predicted' and 'predicted with support' are now expressed by the attribute 'predicted', which appears in the upper left box alongside the category and GCid. This attribute means that the Entrez Gene status is 'PREDICTED', 'INFERRED', or 'MODEL', or that the symbol source is Ensembl.
- The former category "reserved symbol" no longer exists because it is no longer used by HUGO.
| Category | Description |
|---|---|
| protein-coding | Entrez Gene type is 'protein coding', or data source is Ensembl and an Ensembl protein exists |
| pseudogene | Entrez Gene type is 'pseudo' or symbol contains 'pseudo' |
| RNA gene | Entrez Gene type ends with 'RNA' |
| gene cluster | symbol ends with '@' |
| genetic locus | none of the above, but there is disease information, or 'QTL' in the symbol |
| uncategorized | none of the above |
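The category rules in the table above can be sketched as a small function. This is only an illustration: the function name, the argument names, and the choice to test the rules strictly from top to bottom (so that the first matching row wins) are assumptions, not the actual GeneCards implementation.

```python
def categorize_gene(symbol, entrez_type=None, source="Entrez Gene",
                    has_ensembl_protein=False, has_disease_info=False):
    """Return a gene category by applying the table's rules in order.

    All parameter names are hypothetical; the first matching rule wins.
    """
    entrez_type = entrez_type or ""
    if entrez_type == "protein coding" or (source == "Ensembl" and has_ensembl_protein):
        return "protein-coding"
    if entrez_type == "pseudo" or "pseudo" in symbol:
        return "pseudogene"
    if entrez_type.endswith("RNA"):
        return "RNA gene"
    if symbol.endswith("@"):
        return "gene cluster"
    if has_disease_info or "QTL" in symbol:
        return "genetic locus"
    return "uncategorized"
```

For example, `categorize_gene("HOXA@", "other")` falls through the first three rules and is reported as a gene cluster.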
This section provides the gene's symbol, category, GIFtS score (see below), and GCid
in the box on the left hand side.
Each gene category has its distinct color: protein-coding,
pseudogene,
RNA gene,
gene cluster,
genetic locus, and
uncategorized.
The gene's symbol and GCid are the color of the gene's category.
The background color of the box that contains the gene's symbol and GCid
is indicative of which database the symbol is from: HGNC Approved Genes,
EntrezGene Database or
Ensembl Gene Database.
The header also contains a short description of the gene, and whether or not the gene symbol is
HUGO Gene Nomenclature
Committee (HGNC) database
approved.
GeneCards Inferred Functionality Scores (GIFtS)
The GIFtS algorithm uses the wealth of GeneCards annotations to produce scores aimed at predicting the degree of a gene's functionality. Since the degree of known functionality is correlated with the amount of research done on a particular gene or its product, we use these annotations in a scoring system aimed at inferring functionality. Note that while the accumulation of data for a specific gene in certain databases is merely correlated with functionality, many GeneCards sources, like the Gene Ontology (GO) Consortium and Genatlas, provide definitive information about functionality.
Our goal is to use these two types of annotations to measure the functionality of GeneCards genes. Our first step was to produce, for each gene, a binary vector of 67 elements, indicating the presence or absence of data in each relevant source. The GIFtS score of a particular gene is a percentage derived from the sum of these binary values divided by the number of sources (the vector length).
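As a minimal sketch of the basic score described above (the function name and the list-of-flags input are assumptions made for illustration), the percentage is the count of sources with data divided by the vector length:

```python
def gifts_score(presence):
    """Basic GIFtS as a percentage.

    presence: sequence of 0/1 (or False/True) flags, one per source,
    marking whether that source holds data for the gene.
    """
    if not presence:
        raise ValueError("presence vector must not be empty")
    present = sum(1 for flag in presence if flag)
    return 100.0 * present / len(presence)
```

With the 67-element vector mentioned in the text, a gene annotated in every source would score 100 and a gene annotated in none would score 0.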
Improved GIFtS experiments with increased resolution by sub-sectioning data sources and adjusting scores based on the presence or absence of detailed annotations within a source (currently SwissProt). In addition, we have introduced weights related to the quantitative aspects of annotation items, enabling better evaluation of the data relevant to annotation levels (currently orthologs and publications).
To enrich GIFtS with respect to protein data, we selected the pivotal bioinformatics source for such data, namely SwissProt, and dissected it into 6 sub-sources: protein subunit, subcellular location, post-translational modification, function, catalytic activity, and other. Each of these subfields received a binary score as described above, thereby increasing the GIFtS vector size by 5. To weight proteins effectively in the new vectors, the sum of the binary data was still divided by the original number of sources (with SwissProt treated as 1 source for this denominator, in spite of its sub-sources' contributions to the numerator).
To enrich GIFtS with orthologs or publications data, we define a new score for each of those components, which is then added to the default GIFtS. Specifically, the orthologs and publications scores for each gene are calculated as round(log_x(sum(i))), that is, the logarithm to base x of sum(i), rounded, where x equals 3 for orthologs and 5 for publications, and sum(i) is the number of relevant orthologs or publications. Genes with no orthologs or publications receive a score of zero for the relevant component(s); scores rounded down to 0 (for low counts) are normalized to 1.
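The ortholog and publication components can be sketched as follows. The function name is hypothetical, and the sketch covers only the per-component score; how that score is combined with the default GIFtS is not spelled out here and is left out. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding used in the original implementation.

```python
import math

ORTHOLOG_BASE = 3      # x for orthologs, as stated in the text
PUBLICATION_BASE = 5   # x for publications, as stated in the text

def count_component(count, base):
    """Score for a count-based component: round(log_base(count)).

    A count of zero scores 0; a positive count whose rounded logarithm
    is 0 is normalized to 1.
    """
    if count <= 0:
        return 0
    score = round(math.log(count, base))
    return max(score, 1)
```

For instance, 9 orthologs give a component score of 2, while a single publication gives 1 rather than 0.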
GeneCards Sections
Aliases & Descriptions
This section displays synonyms and aliases for the relevant GeneCards gene,
as extracted from
OMIM,
HGNC,
Entrez Gene,
UniProtKB
(Swiss-Prot/TrEMBL),
GeneLoc, and
Ensembl.
Also shown are accessions from HGNC, EntrezGene, UniProtKB, and/or Ensembl, and
previous GC identifiers where relevant
(for cases that GeneLoc deems it necessary to assign a new identifier to a gene based on updated
information about its chromosomal location).
Such GC ids will always remain with their original genes and will not be reused with other symbols.
This section displays descriptions of a gene's function, cellular localization and a gene's effect on phenotype for the relevant GeneCards gene,
as extracted from
Entrez Gene,
UniProtKB
(UniProtKB/Swiss-Prot and UniProtKB/TrEMBL),
Tocris Bioscience, and
Gene Wiki.
This section displays the chromosome, cytogenetic band and map location of the GeneCards gene
as extracted from
GeneLoc,
HGNC,
Entrez Gene, Nature (405, 311-319) and
miRBase,
as well as genomic views from UCSC
and Ensembl,
and links to transcription factor binding sites and Pyrosequencing assays at
Qiagen and/or
SABiosciences.
The GeneLoc integrated location is shown in red on the image. If this
differs from the location provided by Entrez Gene and/or Ensembl, their locations are shown on the
image in green and/or blue respectively.
Also provided are links to the
GeneLoc gene density information for this gene's chromosome, which shows the number of genes in
each 1 Mb interval along the chromosome, and to detailed exon information as provided by
GeneLoc.
This section provides annotated information of the proteins encoded by GeneCards genes
according to
UniProtKB,
neXtProt, and/or
Ensembl,
the capability to view phosphorylation sites using
Phosphosite,
reference sequences (RefSeq) according to
NCBI,
cellular component ontologies visualized by the
Gene Ontology Consortium
(more information),
links for ordering antibodies from
Millipore,
Cell Signaling Technology,
OriGene,
GenScript,
Novus Biologicals,
Sigma-Aldrich,
R&D Systems,
and/or Epitomics,
recombinant proteins from
Millipore,
Sigma-Aldrich,
R&D Systems,
Enzo Life Sciences,
Novus Biologicals,
OriGene,
GenScript,
Sino Biological, and/or
ProSpec,
and assays from
Millipore,
Cell Signaling Technology,
R&D Systems,
OriGene,
GenScript,
Enzo Life Sciences,
Sigma-Aldrich, and/or
Uscn.
Direct links are given to three-dimensional visualizations of PDB structures provided by the
OCA browser
and Proteopedia. Visualizations are also available via the (3D) hyperlink
for the OCA browser or the Proteopedia symbol hyperlink shown next to each PDB identifier.
Genes with similar ontologies can be seen using GeneDecks Partner Hunter (more
information)
This section provides annotated information about protein domains and families according to
InterPro,
ProtoNet,
UniProtKB
and Blocks.
Genes with similar domains can be seen using GeneDecks Partner Hunter (more
information)
This section provides annotated information about gene function
according to MGI,
UniProtKB,
IUBMB, and
Genatlas,
including: shRNA from
OriGene and
Sigma-Aldrich,
siRNAs from
Sigma-Aldrich,
OriGene,
and Qiagen,
RNAi products from
Millipore,
microRNA from
Sigma-Aldrich,
Qiagen, and
SABiosciences,
Clones from
Sigma-Aldrich,
OriGene,
GenScript, and
Sino Biological,
Cell Lines from GenScript,
as well as molecular function ontologies visualized by the
Gene Ontology
Consortium (more information).
Genes with similar ontologies can be seen using GeneDecks Partner Hunter (more
information)
Information from MGI includes
phenotypes for mouse orthologs and a popup table with information on
phenotypic alleles of the orthologs. This table presents the following
columns:
- Allele Name - Official symbol for the allele with link to MGD record
- MGI id - MGI identifier of the allele (linked to in previous column)
- Category - Type of allele by mode of origin
- Observed Phenotypes in Mouse - Phenotypic details for all genotypes that include at least one of the alleles
Genes with similar phenotypes can be seen using GeneDecks Partner Hunter (more
information)
This section provides links to pathways, interactions, and PCR arrays according to
information extracted
from the Kyoto
Encyclopedia of Genes and Genomes (KEGG),
Cell Signaling Technology,
Millipore,
Sigma-Aldrich,
SABiosciences,
UniProtKB,
String
and MINT,
as well as biological process ontologies visualized by the
Gene Ontology Consortium
(more information).
Genes with similar ontologies and those in the same pathways can be seen using GeneDecks Partner Hunter (more
information)
Links to the
SABiosciences
Gene Network Central interacting genes and proteins network and the
Sigma-Aldrich
"Your Favorite Gene" Molecular Interaction Network for the relevant gene are also provided.
Interacting proteins
Each line in this table represents one interacting protein, according to EBI-IntAct, MINT and/or String.
The following columns are presented:
- Interactant - Links to the GeneCards page (first sub-column) and the UniProtKB page (second sub-column) for the interacting protein. Superscript links: 1 - the comments section in the UniProtKB page for the interactant; 2 - the page of all interactions between the two proteins, or all experiments supporting them, in the MINT database.
- Interaction Details - Links to the interaction page in the database from which it was retrieved. In the case of IntAct, this page may include several different experiments supporting the same interaction. In the MINT database, each distinct interaction definition or supporting experiment is assigned a different MINT id, and all of them are presented. In the String database, each interaction is given an experimental score (based on experimental datasets from other protein-protein interaction databases) and a database score (based on information from curated databases). These scores indicate the confidence that the predicted interaction exists. Only interactions with at least one score over 0.7 (high confidence) are presented.
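The String filtering rule described above (keep an interaction when at least one of its two scores exceeds 0.7) can be sketched as below. The record layout and the key names `experimental_score` and `database_score` are assumptions made for this example only.

```python
HIGH_CONFIDENCE = 0.7  # threshold stated in the text

def high_confidence_interactions(interactions, threshold=HIGH_CONFIDENCE):
    """Keep interactions where at least one score is strictly above the threshold.

    interactions: list of dicts with hypothetical keys
    'partner', 'experimental_score' and 'database_score'.
    """
    return [
        item for item in interactions
        if max(item["experimental_score"], item["database_score"]) > threshold
    ]
```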
This section provides relationships between GeneCards genes and both chemical compounds and
drugs, as well as links to drugs and compounds for ordering at
Sigma-Aldrich,
Enzo Life Sciences and Tocris Bioscience.
Chemical compound relationships are from Novoseek.
Drug compound relationships are from PharmGKB. Pharmaceutical uses are provided by UniProtKB.
Tocris compounds and pharmacological data.
This table presents the following columns:
- Compound - The name of the Tocris product related to this GeneCards gene.
- Action - Action (e.g., agonist, ligand) of the Tocris product related to this GeneCards gene.
- CAS Number - Chemical Abstracts Service (CAS) Registry Number.
Novoseek chemical compound relationships.
This table presents the following columns:
- Compound - The name of the chemical compound related to this GeneCards gene.
- Score - The Novoseek score of the relevance of the chemical compound to this gene, based on their literature text-mining algorithms.
- Articles - The number of articles in which both the gene's symbol and the compound appear.
- PubMed IDs for Articles with Shared Sentences (# sentences) - PubMed IDs of articles in which both the gene symbol and the compound appear in the same sentence, sorted by the number of sentences (shown in parentheses in the column) in which they both appear.
PharmGKB drug compound relationships.
This table presents the following columns:
- Drug Compound - The name of the drug compound related to this GeneCards gene.
- PharmGKB Relations - Description of the relationship between the gene and the drug:
  - CO - Clinical Outcome
  - PD - Pharmacodynamics and Drug Response
  - PK - Pharmacokinetics
  - FA - Molecular and Cellular Functional Assays
  - GN - Genotype
- PubMed IDs for articles supporting these relationships - PubMed IDs of articles in which both the gene symbol and the drug are discussed.
Genes with similar drug and compound relationships can be seen using GeneDecks Partner Hunter (more
information)
This section contains associated
Unigene clusters and
representative sequences,
RefSeq mRNAs,
non-coding RNAs from
RNAdb,
siRNAs from
Sigma-Aldrich,
OriGene, and
Qiagen,
RNAi products from
Millipore,
shRNA from
OriGene and
Sigma-Aldrich,
microRNA from
Sigma-Aldrich,
Qiagen, and
SABiosciences,
Clones from
Sigma-Aldrich,
OriGene,
GenScript, and
Sino Biological,
Primers from
OriGene and/or
SABiosciences,
Assemblies
(sorted by a scoring scheme that gives preferences to mRNAs over EST associations),
GeneTide.
Highest scoring ESTs, transcript and alignment information from
AceView.
Additional gene/cDNA sequences from
GenBank,
exon structure information from
GeneLoc,
alternative splicing information, and transcript links to
Ensembl.
Alternative Splicing
This subsection contains alternative splicing information according to
ASD followed by
alternative splicing isoforms from ECgene. Exons with
alternative splice sites in different isoforms were broken into Exonic Units (ExUns). The letters
indicate the order of the ExUns in the exon. The symbol ' ^ ' between ExUns indicates an intron,
while ' ' indicates the junction of two ExUns. Mouseovers on the dark blue squares show the
ExUn's genomic coordinates, while mouseovers on the light blue squares show its transcript coordinates.
When showing ASD's splice variants, GeneCards subtracts the 3000 bp flank that ASD adds to the transcript
coordinates.
This section contains links to
experimental results from GeneNote,
probeset-to-gene annotations from GeneAnnot and
GeneTide,
electronic Northern data images and clone count from
UniGene,
SAGE
expression data images and tag counts based on data extracted from
CGAP and the Genomics Institute of the
Novartis Research Foundation (GNF) BioGPS, followed by links to
SOURCE,
and/or EXPOLDB,
Primers from
OriGene and/or
SABiosciences,
and/or tissue specificity data from
UniProtKB.
Expression for PCR Arrays from SABiosciences.
Genes with similar binary patterns can be seen using GeneDecks Partner Hunter (more
information)
An association of GeneCards genes to Affymetrix probe-sets, through GeneAnnot and GeneTide is presented in a table.
Other columns include data from GeneAnnot and GeneTide, where an asterisk next to the probe set name indicates lower quality annotation, as follows:
- Array - The Affymetrix GeneChip® expression array. Note that U95-A refers to Affymetrix array AV2 (version 2).
- # genes - The number of genes related to this probe set.
- Sensitivity - The fraction of probes that hit transcripts related to that gene (range: 0-1).
- Specificity - The degree to which the individual probes of a given probe set match a certain gene and only that gene (range: 0-1, where 1 is most specific).
- Correlation - see description below in "GeneNote - individual probe-sets variation" (range: 0-1).
- Length - see description below in "GeneNote - individual probe-sets variation" (range: 0-4).
- Gb_Accession - The mRNA's GenBank accession number.
- Consensus - The fraction of annotating resources that agree that the cDNA belongs to the gene (range: 0-1).
- Uniqueness - A confidence score that says how convinced each resource is that this is the only possible gene associated with the sequence (range: 0-1).
- Score - The 'Consensus' and 'Uniqueness' parameters collapsed into one score (range: 0-1).
- Rank - The position of the specific gene among all other genes associated with this transcript.
After the table, 3 images, for GeneNote/GNF, electronic Northern, and SAGE tissue expression data respectively, are presented:
GeneNote / GNF Normal / GNF Cancer Expression array images
Experimental tissue vectors: Duplicate measurements were obtained for twelve normal human tissues (out of 28 tissues shown) hybridized against Affymetrix GeneChips HG-U95A-E (GeneNote data) and
for 22 normal human tissues hybridized against HG-U133A (GNF data **). The intensity values (shown on the y-axis) were first averaged between duplicates, then probeset values were averaged per gene,
global median-normalized and scaled to have the same median of about 70
(half-way between GeneNote and GNF medians).
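The processing order described above (average the duplicates, average the probesets of each gene, then rescale to a common median) can be sketched for a single tissue as follows. The input layout and function name are assumptions, and the real pipeline presumably computes the median over a whole dataset rather than over the handful of genes used here.

```python
from statistics import mean, median

TARGET_MEDIAN = 70.0  # "about 70" according to the text

def gene_intensities(probesets_by_gene, target_median=TARGET_MEDIAN):
    """Collapse duplicate measurements and probesets, then median-scale.

    probesets_by_gene: {gene: [[dup1, dup2], ...]} with one inner list of
    duplicate intensities per probeset (hypothetical layout, one tissue).
    """
    # Step 1 and 2: average duplicates per probeset, then probesets per gene.
    per_gene = {
        gene: mean(mean(duplicates) for duplicates in probesets)
        for gene, probesets in probesets_by_gene.items()
    }
    # Step 3: rescale so that the median over all genes equals the target.
    factor = target_median / median(per_gene.values())
    return {gene: value * factor for gene, value in per_gene.items()}
```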
Available at GNF BioGPS, HG-U133A expression data for 18 NCI60 cancer cell lines was processed
and added to the display (a single measurement taken; normalized
according to the GNF normal data). The correspondence between
cell lines and tissues is given in the table below:
| TISSUE | NCI60 | COMMENT |
|---|---|---|
| Kidney | 786-0 | kidney |
| Heart | A204 | rhabdomyosarcoma |
| Lung | A549 | lung |
| SalivaryGland | ACC3 | salivary_gland |
| Prostate | ALVA31 | prostate |
| Thymus | CCRT_CEM | "blood, T cell leukemia" |
| WholeBlood | DAUDI | "blood, lymphoma" |
| Colon | HCT116 | colon |
| Cervix | HELA | cervix |
| Liver | HEPG2 | liver |
| Spleen | HL60 | "blood, B cell leukemia" |
| Breast | MDA_MB231 | breast |
| Ovary | OVCAR3 | ovary |
| Pancreas | PANC1 | pancreas |
| BoneMarrow | SAOS2 | osteosarcoma |
| Brain | SF268 | glioblastoma |
| Skin | SKMEL28 | melanoma |
| Bladder | T24 | bladder |
Note that filled diamonds along the x-axis of each graph indicate that tissue (cell line) expression values are
available for a given gene, while empty diamonds indicate either that there is
no such tissue for a specific microarray platform (SAGE/e-Northern), or that the current gene has no matching probesets on the
microarray (or tags/ESTs for SAGE/e-Northern). A filled diamond along the x-axis with no data shown in the graph
indicates that, after thresholding and normalization, there is no meaningful expression data for that tissue.
Normalized intensities are drawn on a root scale, which is an intermediate between log and linear scales.
The Affymetrix MAS5 algorithm was used for array processing.
** Reference: Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M,
Kreiman G, Cooke MP, Walker JR, Hogenesch JB (2004) A gene atlas of the mouse and human protein-encoding
transcriptomes. Proc Natl Acad Sci U S A. 2004 Apr 20;101(16):6062-7
UniGene - electronic Northern Normal / eNorthern Cancer
Electronic Northern: For the shown set of
non-fetal normal and cancer human tissues, NCBI's Unigene dataset (Hs.data) is mined for information about the
number of unique clones per gene per tissue. Clones are assigned to particular tissues by applying
data-mining heuristics to Unigene's library information file (Hs.lib.info). Electronic expression
results were calculated by dividing the number of clones per gene by the number of clones per tissue.
They were then normalized by multiplying by 1M, and the obtained normalized counts are presented on
the same root scale as the experimental tissue vectors.
CGAP:SAGE Normal / SAGE Cancer
Serial Analysis of Gene Expression: For the same set of normal and cancer human tissues, CGAP
datasets Hs.frequencies and Hs.libraries are mined for information about the number of
SAGE tags per tissue. Tags are reassigned to a Unigene cluster and after that to a particular gene by
mining Hs.best_gene, Hs.best_tag and Hs_GeneData. The expression level of a
particular gene in a particular tissue was calculated as the number of appearances of the
corresponding tag divided by the total number of tags in libraries derived from that tissue. These
fractions were then normalized by multiplying by 1.2M and the obtained normalized counts are presented
on the same root scale as that used for the electronic Northern pictures. Please note: Currently,
only associations with minimal ambiguity participate in the analysis.
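Both the electronic Northern and the SAGE levels described above are fractions rescaled by a constant (1M and 1.2M respectively). A minimal sketch, with a hypothetical function name and without the clone/tag-to-gene assignment steps, is:

```python
ENORTHERN_SCALE = 1_000_000   # multiplier stated for electronic Northern
SAGE_SCALE = 1_200_000        # multiplier stated for SAGE

def normalized_count(gene_count, tissue_total, scale):
    """Clones (or tags) for a gene in a tissue, per `scale` clones (or tags).

    gene_count: clones or tags assigned to the gene in the tissue.
    tissue_total: all clones or tags in libraries derived from that tissue.
    """
    if tissue_total <= 0:
        raise ValueError("tissue_total must be positive")
    return gene_count * scale / tissue_total
```

For example, 5 clones out of 200,000 for a tissue correspond to an electronic Northern value of 25, and 5 tags out of 200,000 correspond to a SAGE value of 30.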
This section contains Orthologs from
HomoloGene, euGenes, SGD, and MGD, with possible further links to
Flybase and
WormBase.
The table presents the following columns:
- Organism - The names of the homologous species, using both scientific and popular terminology.
- Gene - The symbol for the gene in the homologous species.
- Locus - The position of the gene in the homologous species.
- Description - Its description.
- Human Similarity - The percent similarity to the human gene, followed either by (n) where the comparison was based on nucleic acids or (a) for amino acid based comparisons.
- NCBI accessions - links to the sequences for the gene in NCBI databases including GenBank and Entrez Gene.
Upon clicking the "Species with no ortholog" link, a pop-up window appears. It lists the species that do NOT have an ortholog to the relevant gene.
Superscripts represent the source from which this data was extracted. Data from HomoloGene can have one of two superscripts. If the second one is cited, it means that data for this species exists only in the older version of HomoloGene, which used unfinished genomes and where the homologs found might not be true orthologs.
Following the table is a link to Ensembl gene trees.
Paralogs
This section contains Paralogs from
HomoloGene and Ensembl,
and Pseudogenes from Pseudogene.org.
Genes with similar paralogs can be seen using GeneDecks Partner Hunter (more
information)
This section contains SNPs/Variants from the
NCBI SNP Database,
Ensembl, and
PupaSUITE/
PupaSNP, with
descriptions from UniProtKB, Linkage Disequilibrium images from
HapMap,
Structural Variations (CNVs/InDels/Inversions) from the
Database of Genomic Variants,
and PCR Resequencing Primers from Qiagen.
NCBI SNPs
SNP information is currently extracted from
dbSNP XML files. Filtering is done to include only those that are not artifacts, not connected to
gene duplication, not withdrawn by NCBI, fully specified, without ambiguous locations or low map quality,
and having single Entrez Gene and contig ids.
The order of a gene's displayed SNPs can be determined by the user. By default, SNPs are
sorted first (shown in the select box as 1st) by validation status (validated before
non-validated), then, within these groups, by ordered location type
(first
coding non-synonymous, then coding synonymous, followed by coding,
splice site, mRNA-UTR, intron, locus, reference, and/or
exception),
as the secondary (2nd) nested criterion,
and finally, by the number of validations (up to 4).
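The default three-level ordering can be sketched with a composite sort key. The record layout is an assumption: each SNP is a dict whose `valid` field holds the validation codes from the Valid column and whose `type` field holds the abbreviation from the Type column. Placing SNPs with more validations first is also an assumption, since the text does not state the direction.

```python
# Location types in the default order given above, using the
# abbreviations from the Type column.
LOCATION_ORDER = ["nonsynon", "synon", "cds", "spl", "utr", "int", "loc", "ref", "exc"]
_RANK = {name: position for position, name in enumerate(LOCATION_ORDER)}

def default_snp_order(snps):
    """Sort by validation status, then location type, then validation count."""
    def key(snp):
        n_validations = min(len(snp["valid"]), 4)  # the count is capped at 4
        return (n_validations == 0, _RANK[snp["type"]], -n_validations)
    return sorted(snps, key=key)
```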
The user can change this default sort order and define up to three hierarchical sorting priorities
from fields available as select
boxes above the relevant columns on the section's
button line as follows: rs-numbers (sorted in ascending order),
validation status, position on the chromosome (ascending order), location type, allele frequencies
(existing info before non-existing), population types (alphabetical order), and total sample size
(largest to smallest).
Each displayed line includes genomic, expression, and allele frequency data sections. Only the summary
is shown for the expression and allele frequency sections, with a link to the detailed information
(via the
magnifying glass icon).
This table presents the following columns:
- SNP ID - The NCBI rs number for this SNP
- Valid - The validation method(s) associated with this SNP:
  - C - by-cluster: has 2+ submissions, with 1+ submissions assayed with a non-computational method
  - A - by-2hit-2allele: all alleles have been observed in 2+ chromosomes
  - F - by-frequency: subsnp has frequency data submitted
  - H - by-hapmap: validated by HapMap project
  - O - by-other-pop: validated by submitter
- Chr pos - Chromosomal position: position of variation (strand).
- Sequence - The sequence flanking the base pair variation (highlighted in blue/orange/green/pink). Lower case letters indicate repetitive or low-complexity sequence.
- Recs - number of records for expression/allele frequency data
- AAChg - The change in amino acid resulting from this SNP
- Type - The SNP type:
  - nonsynon - coding, non-synonymous; includes missense (mis), frameshift (fra), nonsense (non): change in peptide with respect to contig sequence
  - synon - coding, synonymous: no change in peptide for allele with respect to contig sequence
  - cds - coding: variation in coding region of gene, assigned if allele-specific class unknown
  - spl - splice-site; includes splice 3 (sp3), splice 5 (sp5): variation in first 2 or last 2 bases of intron
  - utr - mRNA untranslated region; includes utr 3 (ut3), utr 5 (ut5): variation in transcript, but not in coding region interval
  - int - intron: variation in intron, but not in first 2 or last 2 bases of intron
  - exc - exception: variation in coding region with exception raised on alignment. This occurs when a protein with a gap in its sequence is aligned back to the contig sequence. Variations 3' of the gap have undefined functional inference.
  - ref - reference: allele observed in reference contig sequence
  - loc - locus-region; includes near gene 3 (ng3), near gene 5 (ng5): variation in region of gene, but not in transcript
  - PupaSnp Designations:
    - ese - exonic splicing enhancer
    - spl - splice-site
    - trp - triplex forming sequences
    - tfbs - transcription factor binding sites
- More - View individual records
- Allele freq - Average frequency of the alleles for all populations, displayed as a pie chart (only if there are 2 alleles). Alleles are in the same orientation and color as the displayed SNP sequence. Numeric information about the frequencies is available using the mouseover.
- Pop - population type
- Total sample - total data sample size (number of chromosomes)
Additional columns in Expression data popup:
- mRNA Accession - The mRNA sequence at NCBI
- Protein Accession - The protein sequence at NCBI
- Phase - Codon position (1, 2, or 3)
- Protein Position - Position number of the amino acid in the protein.
Additional columns in Allele Frequency data popup:
- Het - estimated heterozygosity of population
- Sample Size - population data sample size (number of chromosomes)
This section also provides Linkage Disequilibrium (LD) information from HapMap.
Disorders & Mutations
This section contains Disorders & Mutations in which GeneCards genes are involved,
according to OMIM,
UniProtKB,
Novoseek,
PharmGKB,
Genatlas,
BGMUT,
GeneTests,
the Human Genome Variation Society's
Locus
Specific Mutation Databases (LSDB),
HGMD,
GAD,
HuGENet,
BCGD,
and/or TGDB.
Novoseek disease relationships
This table presents the following columns:
- Disease - The name of the disease related to this GeneCards gene.
- Score - The Novoseek score of the relevance of the disease to this gene, based on their literature text-mining algorithms.
- Articles - The number of articles in which both the gene's symbol or description and the disease appear.
- PubMed IDs for Articles with Shared Sentences (# sentences) - PubMed IDs of articles in which both the gene symbol and the disease appear in the same sentence, sorted by the number of sentences (shown in parentheses in the column) in which they both appear.
PharmGKB disease relationships
This table presents the following columns:
- Disease - The name of the disease related to this GeneCards gene.
- PharmGKB Relations - Description of the relationship between the gene and the disease:
  - CO - Clinical Outcome
  - PD - Pharmacodynamics and Drug Response
  - PK - Pharmacokinetics
  - FA - Molecular and Cellular Functional Assays
  - GN - Genotype
- PubMed IDs for articles supporting these relationships - PubMed IDs of articles in which both the gene symbol and the disease are discussed.
Genes with similar disease relationships can be seen using GeneDecks Partner Hunter (more
information)
This section provides links to possibly related articles in
Doctor's Guide.
This section provides titles of and links to research articles in PubMed, as associated via Novoseek, HGNC, Entrez Gene, UniProtKB, PharmGKB, and/or GAD.
The articles are ranked,
first according to the number of GeneCards sources that associate the article with this gene, then by
date of publication,
and then according to the
Novoseek score for this
article/gene relationship. The year of publication appears in parentheses after the title of each
article.
Lower ranked articles may also appear in partial results if their titles or authors contain your
search term.
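The three-level article ranking described above can be sketched with a composite sort key. The record fields (`sources`, `published`, `novoseek_score`) are hypothetical, and ordering newer articles and higher Novoseek scores first is an assumption, since the text does not state the direction of either criterion.

```python
from datetime import date

def rank_articles(articles):
    """Order by number of associating sources, then date, then Novoseek score."""
    return sorted(
        articles,
        key=lambda article: (
            -len(article["sources"]),           # more sources first
            -article["published"].toordinal(),  # newer first (assumed)
            -article["novoseek_score"],         # higher score first (assumed)
        ),
    )
```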
This section allows the user to search
PubMed,
OMIM,
or NCBI Bookshelf. The current gene's aliases and disorders are provided,
as well as the search string that led to the gene, to be used as search fodder. The user can also
add new search terms.
How To Search:
The search box allows the user to search for aliases and/or free text in PubMed,
OMIM, or NCBI Bookshelf. To search for several aliases, select
each alias while holding down the control key. This type of search matches any
of the selected aliases. To require all of the selected aliases, go to the free text
box (next to the search button) and manually change all of the ORs to ANDs. You may
also enter free text and combine it with the selected aliases using AND or OR (use the radio buttons to the left
of the box to choose). Once again, to find only documents
that contain all of the selected aliases, change the ORs to ANDs in the Query String
box.
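The way a query string is assembled from aliases and free text can be illustrated as follows. This is only a sketch of the OR/AND logic described above: the function name, the parenthesized output format, and the parameter names are assumptions and do not reproduce the actual string generated by GeneCards.

```python
def build_query(aliases, free_text="", alias_operator="OR", combine_operator="AND"):
    """Join aliases with one operator and attach free text with another."""
    for operator in (alias_operator, combine_operator):
        if operator not in ("AND", "OR"):
            raise ValueError("operators must be 'AND' or 'OR'")
    parts = []
    if aliases:
        parts.append("(" + f" {alias_operator} ".join(aliases) + ")")
    if free_text.strip():
        parts.append("(" + free_text.strip() + ")")
    return f" {combine_operator} ".join(parts)
```

By default the aliases are OR-ed, which matches documents mentioning any of them; passing `alias_operator="AND"` corresponds to manually changing the ORs to ANDs.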
These sections provide links to the GeneCards genes in other databases:
- Genome Databases - According to H-InvDB, Entrez Gene, HGNC, AceView, euGenes, Ensembl, ECgene, KEGG, and/or miRBase.
- Other Databases - According to HUGE.
- Specialized Databases - According to ATLAS, HORDE, IMGT, MTDB, LEIDEN, and/or UniProtKB.
This section features Patent information from
GeneIP and
technologies that are available for licensing. Institutions
currently featured include
the Weizmann Institute of Science,
the Salk Institute for Biological Studies, and
Tufts University. Also included in this
section is IP news from XenneX, Inc.
This section provides links to reagents available from
Millipore,
and/or R&D Systems,
proteins, lysates, and/or antibodies available from
Cell Signaling Technology,
Millipore,
R&D Systems,
Sigma-Aldrich,
OriGene,
GenScript,
Novus Biologicals,
Epitomics,
and/or ProSpec,
drugs and compounds available from
Tocris Bioscience,
Enzo Life Sciences, and/or
Sigma-Aldrich,
clones and/or primers available from
Sigma-Aldrich,
OriGene,
Qiagen,
GenScript,
SABiosciences, and/or
Sino Biological,
and GPCR/Kinase Profiling, Assay development, GPCR & ELISA assays available from
GenScript,
R&D Systems,
Sigma-Aldrich, and/or
Uscn.
Gene Ontology (GO) Tables
The Gene Ontology sections in Proteins, Gene Function, and Pathways & Interactions display a table with the following columns:
- GO ID - The identifier used by GO and linked to the GO entry
- Qualified GO term - The description of this entry, possibly qualified with "NOT", "colocalizes with", or "contributes to"
- Evidence - A 2 or 3 letter code
  - Curator-assigned Evidence Codes
    - Experimental Evidence Codes:
      - IDA: Inferred from Direct Assay
      - IPI: Inferred from Physical Interaction
      - IMP: Inferred from Mutant Phenotype
      - IGI: Inferred from Genetic Interaction
      - IEP: Inferred from Expression Pattern
    - Computational Analysis Evidence Codes:
      - ISS: Inferred from Sequence or Structural Similarity
      - IGC: Inferred from Genetic Context
      - RCA: Inferred from Reviewed Computational Analysis
    - Author Statement Evidence Codes:
      - TAS: Traceable Author Statement
      - NAS: Non-traceable Author Statement
    - Curator Statement Evidence Codes:
      - IC: Inferred by Curator
      - ND: No biological Data available
  - Automatically-assigned Evidence Codes
    - IEA: Inferred from Electronic Annotation
  - Obsolete Evidence Codes
    - NR: Not Recorded
- PubMed ids - References in the literature, if relevant, obtained from EntrezGene
GeneDecks Partner Hunter
GeneDecks Partner Hunter is available for ontologies,
phenotypes, drugs and compounds,
sequence-based paralogs, disorders,
pathways, binary patterns, and
domains. By clicking on the GeneDecks Partner Hunter button for a
particular section, one arrives at the GeneDecks home page, where the gene name has been
entered and the appropriate fields selected from the attribute list. From this page,
changes can be made to the data requested. Submitting this form brings up a result page
containing a list of genes similar to the chosen gene and their descriptions.
Selected Algorithms
Novoseek Scoring Algorithm
The relevance scores of elements related to genes (chemical
substances and diseases) are based on the analysis of co-occurrences of two
elements in Medline documents. The observed number of documents where both
elements appear together and the number of documents where both appear
independently are compared to an expected value based on a hypergeometric
distribution. The more co-occurrences are observed relative to the number
expected, the less likely it is that they arose by chance, and the higher
the score. The absolute values are not meaningful in themselves and
only give an order of importance: in the list of chemicals related to
a gene, the order is meaningful and the first chemicals in the list are
statistically more strongly related to the gene than the following ones, but the
absolute values of the scores may change from one release to another.
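To illustrate the kind of comparison described above, the following sketch computes the expected number of co-occurrences and a hypergeometric upper-tail probability for an observed count. It is not the Novoseek scoring algorithm, whose exact formula is not given here; the function names and the use of a tail probability as the measure of surprise are assumptions.

```python
from math import comb

def expected_cooccurrences(n_total, n_gene, n_term):
    """Expected number of shared documents if gene and term were unrelated."""
    return n_gene * n_term / n_total

def cooccurrence_pvalue(n_total, n_gene, n_term, n_both):
    """P(X >= n_both) for X ~ Hypergeometric(n_total, n_gene, n_term).

    n_total: documents in the corpus; n_gene: documents mentioning the gene;
    n_term: documents mentioning the compound or disease;
    n_both: documents mentioning both.
    """
    upper = min(n_gene, n_term)
    tail = sum(
        comb(n_gene, k) * comb(n_total - n_gene, n_term - k)
        for k in range(n_both, upper + 1)
    )
    return tail / comb(n_total, n_term)
```

A smaller tail probability means the observed co-occurrence count is further above expectation, so sorting related terms by ascending probability would give the kind of ordering-only ranking the text describes.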