Recommendations for the description of DNA sequence variants with uncertainties


Last modified July 9, 2007

Since references to WWW-sites are not yet acknowledged as citations, please mention den Dunnen JT and Antonarakis SE (2000). Hum.Mutat. 15:7-12 when referring to these pages.



Introduction

Often clear changes can be detected in the genome, but a precise description in relation to the genome sequence is not possible. Examples include cytogenetically detectable changes and changes detected using Fluorescence In Situ Hybridisation (FISH). Other examples are cases where genomic deletions / duplications are detected indirectly using RNA analysis; although the change can be described exactly at RNA level, the genomic sequence spanning the junction of the rearrangement is required to describe the change at DNA level. Finally, there are examples, mostly from older publications, where the changes detected are not described precisely ("an 11 nucleotide deletion in exon 3") or uniquely ("changing amino acid Gly-17 to Arg").

In many diseases pathogenic changes are found which delete or duplicate (sets of) whole exons. These changes are detected with technologies like PCR, Southern blotting, FISH, MAPH and MLPA, but the analysis does not reveal the breakpoints at a molecular level. Incorporation of these mutations in sequence variation databases is essential, most importantly to be able to determine their frequency. In addition, in cases like Duchenne and Becker Muscular Dystrophy (DMD/BMD), the breakpoints determine the severity of the disease: a truncated reading frame causes DMD, whereas an open reading frame causes BMD.

There is currently a growing demand to be able to include these changes in catalogs of all DNA sequence changes found in specific regions or genes. For example, Dr. Simon Forbes (The Sanger Institute, UK), working on the cancer mutation database (COSMIC), asked us how to include variants that were published without sufficient details to describe them exactly. Although it is clear that publications need to contain exact descriptions of the changes reported, it is also true that older publications do not always meet this requirement.

With few exceptions, clear recommendations to describe such changes have not yet been made. The purpose of this page is to list all current recommendations and to start discussions on how to extend them by suggesting ways to describe changes not yet covered. Do you agree or disagree? Did we miss cases? Do you want to make suggestions? Please contact us by e-mail: ddunnen @ HumGen.nl and Stylianos.Antonarakis @ medecine.unige.ch.

Main recommendation
When the exact position of a change is not known, the range of the uncertainty is listed between brackets ("(5' border_3' border)"), and the change is described at DNA level as precisely as possible.

The deletion of genomic sequences can be detected using several different methods, including FISH, Southern blot, quantitative PCR, MAPH and MLPA. One could argue that, to be as precise as possible, the description of the deletion should include information on the probe sequences used. Although our recommendation is indeed to do this, it should be noted that this has its limitations. With some methods, the presence of a probe signal only indicates that a (large) part of the probe is present (e.g. BAC-derived FISH signals). Conversely, the absence of a signal does not mean that the entire target sequence is missing (e.g. PCR results are negative when only one of the two primer annealing sites is deleted, see FAQ).
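As an illustration of the bracket notation, the following minimal Python sketch formats an uncertain position from its two borders. The function name `uncertain_position` and the use of plain integer positions are assumptions made for this example only; they are not part of the recommendations.

```python
def uncertain_position(five_prime_border: int, three_prime_border: int) -> str:
    """Format an uncertain position as "(5' border_3' border)".

    Hypothetical helper: both borders are assumed to be integer
    positions on the same reference sequence.
    """
    if five_prime_border > three_prime_border:
        raise ValueError("the 5' border must not lie 3' of the 3' border")
    return f"({five_prime_border}_{three_prime_border})"


# Illustrative positions only, not taken from a real variant.
print(uncertain_position(100, 250))  # prints (100_250)
```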


Subjects

Two changes in one individual

When two sequence changes are found in one individual but it is unknown whether they are located on the same or on different alleles, the change is described using the format c.[76A>C(+)483G>C] (see Recommendations).
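The format above can be assembled mechanically. The short Python sketch below does so; the function name `describe_unknown_phase` and its parameters are hypothetical and serve only to show how the two changes are combined.

```python
def describe_unknown_phase(change_1: str, change_2: str, prefix: str = "c.") -> str:
    """Combine two changes found in one individual whose allelic phase is unknown.

    Hypothetical helper: the changes are passed without the "c." prefix,
    and "(+)" marks that it is unknown whether they are on the same allele.
    """
    return f"{prefix}[{change_1}(+){change_2}]"


print(describe_unknown_phase("76A>C", "483G>C"))  # prints c.[76A>C(+)483G>C]
```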


Incomplete descriptions

When the exact position of a change is not known, the range of the uncertainty is listed between brackets ("()", see Recommendation). Examples;


Exonic deletions / duplications

Since no precise molecular data have been determined, a correct description of the changes at a molecular level is rather difficult. To facilitate database incorporation, the description should be based on the reference sequence used in the database and include the position of the most extreme region(s) tested (e.g. segment PCR-ed, probes used for hybridisation, etc.). To describe deletions (duplications), it is assumed that when a probe for a specific exon scores deleted (duplicated), the entire exon is deleted (duplicated). Detailed knowledge of the exact location of the probe sequence used is not incorporated in the description of the deletion (duplication), unless two different probes for one exon score differently (see FAQ).

Deletions are designated by "del" after an indication of the first and last nucleotide(s) deleted (see Recommendations). Examples;

For duplications the same recommendations hold, except that duplications are designated by "dup" after an indication of the nucleotides flanking the duplicated region (see Recommendations).


FISH-detected rearrangements

Many chromosomal rearrangements, especially in the past, have been detected using techniques like Southern blotting and Fluorescence In Situ Hybridisation (FISH). The description of these changes was often in tabular or graphical format based on either position-ordered probe names or relative chromosomal positions. Especially when FISH was used, relatively little or often even no actual DNA sequence of the probe(s) used was known. In those days, even when a probe sequence was known, this information was of little help to determine the probe location more precisely.

With the availability of the reference human genome sequence, this situation has dramatically changed. Now, any piece of DNA probe sequence can be used to position that probe with great precision on the human genome map. In addition, the clones used to generate the human genome sequence are freely available and have become the preferred probes for new FISH experiments. The latter is especially true for genome-wide array-CGH experiments using ordered PAC/BAC-clones. As a consequence of these developments it is now possible to describe these changes based on DNA sequences. NOTE: see Discussion. Examples;


Array-detected rearrangements

In principle, chromosomal rearrangements and other DNA sequence variants detected using array technology can, based on the array-probe sequences used, be described in the same way as FISH-detected rearrangements (see above). An advantage here is that the array probes used are often exactly defined, being mostly relatively short 20-60-mer oligonucleotide sequences. Thus, these oligo-probe sequences can be described exactly at nucleotide level in relation to the human genome reference sequence. For deletions the basic format is (last-present_first-deleted)_(last-deleted_first-again-present). Examples;
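The basic deletion format can be derived from a set of ordered probe results. The Python sketch below is one possible way to do this, under several assumptions that are not part of the recommendations: the `Probe` class, the function name and the "g." prefix are hypothetical; the 3' end of the last present probe is taken as "last-present" and the 5' end of the first deleted probe as "first-deleted" (and likewise on the other side); and the positions are invented for illustration. Cases where no present probe flanks the deletion are not handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Probe:
    start: int      # first reference nucleotide covered by the probe
    end: int        # last reference nucleotide covered by the probe
    deleted: bool   # True when the probe scores deleted


def describe_deletion(probes: List[Probe], prefix: str = "g.") -> str:
    """Build (last-present_first-deleted)_(last-deleted_first-again-present)del."""
    ordered = sorted(probes, key=lambda p: p.start)
    deleted_idx = [i for i, p in enumerate(ordered) if p.deleted]
    if not deleted_idx:
        raise ValueError("no probe scored deleted")
    first, last = deleted_idx[0], deleted_idx[-1]
    # The sketch only covers one contiguous run of deleted probes.
    if any(not ordered[i].deleted for i in range(first, last + 1)):
        raise ValueError("deleted probes are not contiguous")
    # A present probe is needed on both sides to bound the uncertainty.
    if first == 0 or last == len(ordered) - 1:
        raise ValueError("no flanking present probe; not handled by this sketch")
    last_present = ordered[first - 1].end
    first_deleted = ordered[first].start
    last_deleted = ordered[last].end
    first_again_present = ordered[last + 1].start
    return (f"{prefix}({last_present}_{first_deleted})"
            f"_({last_deleted}_{first_again_present})del")


# Invented 60-mer probe positions, for illustration only.
example_probes = [
    Probe(100, 159, False),
    Probe(1000, 1059, True),
    Probe(2000, 2059, True),
    Probe(3000, 3059, False),
]
print(describe_deletion(example_probes))  # prints g.(159_1000)_(2059_3000)del
```

Raising an error when a flanking probe is missing keeps the sketch from guessing a border that the experiment did not test.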


Cytogenetic rearrangements

A nomenclature system to describe cytogenetically detectable rearrangements was suggested early on (see ISCN 1985). Current recommendations in this area are made by the "Standing Committee on Human Cytogenetic Nomenclature (2001-2006)".



Copyright © HGVS 2007 All Rights Reserved
Website Created by Rania Horaitis, Nomenclature by J.T. Den Dunnen - Disclaimer