Quick reference - characters used in the description of sequence changes
Last modified July 9, 2007
Since references to WWW sites are not yet acknowledged as citations, please
cite den Dunnen JT and Antonarakis SE (2000), Hum. Mutat. 15:7-12, when
referring to these pages.
Symbols used
Below is an overview of all signs and characters used in the
description of sequence variants, with their meanings.
- numbering
- -1 = first nucleotide 5' of the ATG translation initiation codon
- *1 = the first nucleotide 3' of the translation stop codon
- N+1 = nucleotide in an intron; the first nucleotide in the intron
3' of the splice donor site at position N of the coding DNA
reference sequence
- N-1 = nucleotide in an intron; the first nucleotide in the intron
5' of the splice acceptor site at position N of the coding DNA
reference sequence
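The numbering rules above can be illustrated with a minimal Python sketch that splits a coding DNA position into its parts. The helper name `parse_position`, the regular expression, the returned field names, and the sample positions `88+1` and `89-1` are assumptions made for illustration; they are not part of the recommendations.

```python
import re

# Hypothetical pattern: optional "-" or "*" prefix, a number, optional intron offset.
POSITION = re.compile(r"^(?P<prefix>[-*]?)(?P<base>\d+)(?P<offset>[+-]\d+)?$")

def parse_position(text):
    """Split a coding DNA position into region, anchoring number and intron offset."""
    match = POSITION.match(text)
    if match is None:
        raise ValueError(f"not a recognised position: {text!r}")
    # "-" counts 5' of the ATG, "*" counts 3' of the stop codon.
    region = {"": "coding", "-": "5' of ATG", "*": "3' of stop"}[match.group("prefix")]
    # A missing offset means the position is not intronic.
    offset = int(match.group("offset") or 0)
    return {"region": region, "base": int(match.group("base")), "offset": offset}
```

Under these assumptions, `parse_position("88+1")` describes the first intronic nucleotide 3' of coding position 88, and `parse_position("-1")` the first nucleotide 5' of the ATG.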
- reference sequences
- c. = coding DNA reference sequence
- g. = genomic reference sequence
- m. = mitochondrial reference sequence
- r. = RNA reference sequence
- p. = protein reference sequence
- specific characters
- + (plus) = see nucleotide numbering
- - (minus) = see nucleotide numbering
- * (asterisk) = see nucleotide numbering
NOTE: because of existing IUB/IUPAC conventions, this character can be
expected to be used in future to indicate a stop codon (instead of
the currently recommended X)
- _ (underscore) = see nucleotide numbering, used to indicate a range (e.g. in combination
with a deletion, insertion or variable sequence)
- > (greater than) = changes to
- c.5T>G substitution at DNA level
- p.Leu30_Cys42>SerfsX3 insertion/deletion at protein level
- : (colon) = separates the indicator (e.g. the reference sequence used)
from the actual description of a variant, e.g. M13855.3:c.1A>G
- ; (semi-colon) = separator between different changes in one allele
- , (comma) = separator between different nucleotides (mosaic cases),
transcripts or proteins generated from one allele (chromosome)
- () = indicates the range of uncertainty in the description of a
change
- [] = encloses several changes, transcripts or proteins from one
allele
- c.[76A>C]+[83G>C] changes at the two alleles
- c.[76A>C; 83G>C] two changes in one allele
- r.[76a>c, 73_88del] two transcripts from one allele
- {} = encloses a database accession.version number when two indicators
are used, e.g. DMD{NM_004006.1}:c.3G>T
- ? (question mark) = unknown
- = (equals) = indicates 'identical to reference sequence' (no change,
wild type sequence)
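How the colon, curly braces, square brackets and semi-colon combine can be sketched in Python as follows. The function names `split_description` and `split_alleles` and their return shapes are hypothetical. The sketch handles only the separators listed above and does not interpret the comma, parentheses or question mark.

```python
import re

# Hypothetical pattern: either GENE{accession.version} or a plain accession.version.
INDICATOR = re.compile(r"^(?:(?P<gene>[^{}]+)\{(?P<accession>[^{}]+)\}|(?P<plain>[^{}]+))$")

def split_description(text):
    """Split 'indicator:type.change' into its parts; the indicator is optional."""
    indicator, colon, rest = text.rpartition(":")
    gene = accession = None
    if colon:
        match = INDICATOR.match(indicator)
        if match is None:
            raise ValueError(f"unrecognised indicator: {indicator!r}")
        gene = match.group("gene")
        accession = match.group("accession") or match.group("plain")
    # The letter before the first dot names the reference sequence type.
    kind, dot, change = rest.partition(".")
    if not dot or kind not in {"c", "g", "m", "r", "p"}:
        raise ValueError(f"missing reference sequence type in {rest!r}")
    return {"gene": gene, "accession": accession, "type": kind, "change": change}

def split_alleles(change):
    """Return one list of changes per bracketed allele; unbracketed text is one change."""
    groups = re.findall(r"\[([^\]]*)\]", change)
    if not groups:
        return [[change]]
    # Within one allele, the semi-colon separates the individual changes.
    return [[part.strip() for part in group.split(";")] for group in groups]
```

With these helpers, `c.[76A>C]+[83G>C]` yields two alleles with one change each, while `c.[76A>C; 83G>C]` yields a single allele carrying two changes.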
- nucleotides and amino acids
- DNA
- A = adenine
- C = cytosine
- G = guanine
- T = thymine
- RNA
- a = adenine
- c = cytosine
- g = guanine
- u = uracil
- protein
- others
- chr = chromosome (e.g. chr19 or chrX)
- del = deletion
- dup = duplication
- ext = extension (e.g. N- or C-terminus of protein)
- ins = insertion
- inv = inversion
- con = (gene) conversion
- fs = frame shift
- t = translocation; e.g. t(X;4)(p21.2;q34)
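As a rough illustration of how the abbreviations above appear in practice, the following Python sketch reports which change types occur in a description. The table copies the abbreviations from this page; the function name `change_types` and the plain substring matching are assumptions, and the matching is deliberately naive (it does not handle chr or translocations, and it does not validate the description).

```python
# Abbreviations as listed on this page; the mapping itself is a hypothetical helper.
CHANGE_TYPES = {
    "del": "deletion",
    "dup": "duplication",
    "ins": "insertion",
    "inv": "inversion",
    "con": "conversion",
    "fs": "frame shift",
    "ext": "extension",
    ">": "changes to",
}

def change_types(change):
    """Return the change types found in the description, in order of first appearance."""
    found = [(change.find(abbr), name) for abbr, name in CHANGE_TYPES.items() if abbr in change]
    return [name for _, name in sorted(found)]
```

For example, `change_types("73_88del")` reports a deletion, and `change_types("Leu30_Cys42>SerfsX3")` reports "changes to" followed by a frame shift.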
Copyright © HGVS 2007 All Rights Reserved
Website created by Rania Horaitis, nomenclature by J.T. den Dunnen