Metadata

There are fields that are required for specific combinations of STRATEGY and TRACE_TYPE_CODE. You may check requirements in the Validation Table. Metadata can be searched at the NCBI Trace Archive.
RED: Required
GREEN: May be required, depending upon the trace type and strategy employed
BLACK: May be mandatory, optional or not allowed for a given combination of trace type and strategy.

Trace Archive RFC Required*
May be required, depending upon the trace type and strategy employed*

Metadata Field List

ACCESSION
DDBJ/EMBL/Genbank accession number. Type: varchar(30) Example: AC22227 The ACCESSION is assigned upon deposition to a public repository (DDBJ/EMBL/Genbank). This field will not be applicable to all trace types (primarily WGS). However, if this field contains a valid accession identifier, correlation between the primary sequence data (in Trace) and the secondary sequence data (in the public repository) is facilitated.
AMPLIFICATION_FORWARD*
The forward amplification primer sequence. Type: varchar(100) Example: GGATTCTGACTAACGAGC The field allows submitters to define the primers used to amplify templates for sequencing. This field is required when TRACE_TYPE_CODE=PCR or RT-PCR.
AMPLIFICATION_REVERSE*
The reverse amplification primer sequence. Type: varchar(100) Example: GGATTCTGACTAACGAGC The field allows submitters to define the primers used to amplify templates for sequencing. This field is required when TRACE_TYPE_CODE=PCR or RT-PCR.
AMPLIFICATION_SIZE
The expected amplification size for a pair of primers. Type: int Example: 500 The field allows submitters to define the expected amplification size for a pair of primers (defined in the AMPLIFICATION_FORWARD and AMPLIFICATION_REVERSE fields). This number should be given in base pairs. If TRACE_TYPE_CODE=PCR, the amplification size is based on amplification of genomic DNA. If TRACE_TYPE_CODE=RT-PCR, the amplification size is based on amplification of transcript.
ANONYMIZED_ID
Anonymous ID for an individual. Type: varchar(100) Example: 2222anonym Used in projects to maintain the anonymity of donors. In many cases, there may be a controlled access database that can map many anonymized_ids in the trace archive to a single individual id for which phenotypic information may be available.
ATTEMPT
Number of times the sequencing project has been attempted by the center and/or submitted to the Trace Archive. Type: tinyint(1-255) Example: 2
BASE_FILE
File name with base calls. Type: varchar(200) Example: ./mytraces/123clone.fasta Trace files which do not include the base calls must provide this information in a separate file. The file designations are recorded in the BASE_FILE field of the metadata file. If base calls are provided in separate files, the information in these files will overwrite any information in the trace (usually *.scf) file. If the base calls that would be provided in the BASE_FILE are the same as the information in the trace file, DO NOT PROVIDE THE FILE. If the center provides the BASE_FILE and QUAL_FILE, then the peak index information should also be provided in the file named in the PEAK_FILE field.
CENTER_NAME*
Name of the sequencing center. Type: varchar(50) Example: WUGSC Sequencing centers wishing to submit data must contact the DDBJ Trace Archive administrators to determine a center abbreviation. This abbreviation is used in the CENTER_NAME field. This field has a controlled vocabulary. For the complete list of submitting centers see: http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?view=submitting_centers

These center names are controlled separately from those of the Sequence Read Archive

CENTER_PROJECT*
Center defined project name. Type: varchar(100) Example: HBBB The CENTER_PROJECT field reflects a sequencing center's internal designation for a specific sequencing project. This field can be useful for grouping related traces.
CHEMISTRY
Description of the chemistry used in the sequencing reaction. Type: varchar(50) Example: BIGDYEV3.0
CHEMISTRY_TYPE
Type of chemistry used in the sequencing reaction. Type: char(50) Example: P The CHEMISTRY_TYPE field uses a controlled list.
Accepted values are:
p=primer; t=terminator
CHROMOSOME
Chromosome to which the trace is assigned. Type: varchar(8) Example: 11 The CHROMOSOME field indicates to which chromosome a trace has been assigned. Gene names or cytogenetic positions are not appropriate substitutes for chromosome information.
CLIP_QUALITY_LEFT
Left clip of the read, in base pairs, based on quality analysis. Type: int Example: 56 The field indicates the base at the beginning of the sequence at which the read should be clipped due to poor quality sequence. The given value would be the first base of the high quality region of the trace.
CLIP_QUALITY_RIGHT
Right clip of the read, in base pairs, based on quality analysis. Type: int Example: 256 The field indicates the base at the end of the sequence at which the read should be clipped due to poor quality sequence. The given value would be the last base of the high quality region of the trace.
CLIP_VECTOR_LEFT*
Left clip of the read, in base pairs, based on vector sequence. Type: int Example: 75 The field indicates the base at the beginning of the sequence at which the read should be clipped due to vector sequence. The given value would be the first base of non-vector sequence. This field is required for almost all combinations of STRATEGY and TRACE_TYPE_CODE. This information can be omitted if the INSERT_FLANK_LEFT field is populated or TRACE_TYPE_CODE is PCR or RT-PCR.
CLIP_VECTOR_RIGHT*
Right clip of the read, in base pairs, based on vector sequence. Type: int Example: 275 The field indicates the base at the end of the sequence at which the read should be clipped due to vector sequence. The given value would be the last base of non-vector sequence. This field is required for almost all combinations of STRATEGY and TRACE_TYPE_CODE. This information can be omitted if the INSERT_FLANK_RIGHT field is populated or TRACE_TYPE_CODE is PCR or RT-PCR.
NOTE: Many centers combine vector and quality analysis, and thus have only one set of clip values. In this case, the set of values should be placed in the CLIP_VECTOR_LEFT/CLIP_VECTOR_RIGHT fields.
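The quality and vector clip values can be combined into a single usable region of a read. The following is a minimal sketch in Python, not part of the Trace Archive itself. It assumes that clip positions are 1-based and inclusive (the field descriptions do not state the convention) and that a missing value means "no clip"; the helper name usable_region is hypothetical.

```python
from typing import Optional, Tuple


def usable_region(
    read_length: int,
    clip_quality_left: Optional[int] = None,
    clip_quality_right: Optional[int] = None,
    clip_vector_left: Optional[int] = None,
    clip_vector_right: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Return (first, last) base of the high-quality, non-vector region.

    Assumption: positions are 1-based and inclusive; None means "no clip".
    Returns None when the clips leave no usable bases.
    """
    lefts = [v for v in (clip_quality_left, clip_vector_left) if v is not None]
    rights = [v for v in (clip_quality_right, clip_vector_right) if v is not None]
    # The usable region starts at the innermost left clip ...
    first = max(lefts, default=1)
    # ... and ends at the innermost right clip, never past the read end.
    last = min(min(rights, default=read_length), read_length)
    return (first, last) if first <= last else None
```

With the example values from the four clip fields (56, 256, 75, 275), the region runs from base 75 to base 256.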
CLONE_ID*
The name of the clone from which the trace was derived. Type: varchar(30) Example: RP23-1123F10 The field is used to store the identifier related to an individual clone, for example a BAC clone, PAC clone or cDNA clone. If the clone is registered with the Clone Registry (http://www.ncbi.nlm.nih.gov/clone/), standard clone registry nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used.
This field is required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=cDNA; TRACE_TYPE_CODE=Any
STRATEGY=EST; TRACE_TYPE_CODE=Any
STRATEGY=CLONEEND; TRACE_TYPE_CODE=CLONEEND
STRATEGY=CLONE; TRACE_TYPE_CODE=Any
STRATEGY=ENCODE; TRACE_TYPE_CODE=SHOTGUN; PrimerWalk; CLONEEND
STRATEGY=FINISHING; TRACE_TYPE_CODE=Any
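A requirement list like the one above can be checked mechanically before submission. The sketch below reads each entry as a STRATEGY value paired with the TRACE_TYPE_CODE values it applies to, which is an interpretation of the list rather than an official rule set; the Validation Table remains the authority. The names CLONE_ID_REQUIRED and clone_id_required are hypothetical, and exact, case-sensitive matching is assumed.

```python
# STRATEGY -> set of TRACE_TYPE_CODE values; None stands for "Any".
# Assumption: this mirrors the CLONE_ID requirement list as read above.
CLONE_ID_REQUIRED = {
    "cDNA": None,
    "EST": None,
    "CLONEEND": {"CLONEEND"},
    "CLONE": None,
    "ENCODE": {"SHOTGUN", "PrimerWalk", "CLONEEND"},
    "FINISHING": None,
}


def clone_id_required(strategy: str, trace_type_code: str) -> bool:
    """Return True when CLONE_ID must be supplied for this combination."""
    if strategy not in CLONE_ID_REQUIRED:
        return False
    allowed = CLONE_ID_REQUIRED[strategy]
    return allowed is None or trace_type_code in allowed
```

The same table-driven shape could be reused for other fields by swapping in their own combination lists.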
CLONE_ID_LIST*
Semicolon-delimited list of clones if the Strategy is PoolClone. Type: varchar(30) Example: RP23-200A2;RP23-500P1 The field is used only if STRATEGY=PoolClone. In this case, the list of clones is provided as a semicolon-delimited list. If the clones are registered with the Clone Registry (http://www.ncbi.nlm.nih.gov/clone/), standard clone registry nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used (see the CLONE_ID field). Note: The number of clones in the list is not limited, but each individual clone name within the list is limited to 30 bytes.
This field is required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=PoolClone; TRACE_TYPE_CODE=Any
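The CLONE_ID_LIST rules (semicolon-delimited, each clone name at most 30 bytes) can be sketched as a small parser. This is an illustration only; the function name parse_clone_id_list is hypothetical, and measuring the byte length in UTF-8 and ignoring empty entries are assumptions.

```python
from typing import List


def parse_clone_id_list(value: str, max_bytes: int = 30) -> List[str]:
    """Split a CLONE_ID_LIST value and enforce the per-clone size limit."""
    # Empty entries (for example from a trailing ';') are ignored.
    clones = [part.strip() for part in value.split(";") if part.strip()]
    for clone in clones:
        # Assumption: the 30-byte limit is measured on the UTF-8 encoding.
        if len(clone.encode("utf-8")) > max_bytes:
            raise ValueError(f"clone name exceeds {max_bytes} bytes: {clone!r}")
    return clones
```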
COLLECTION_DATE*
The full date, in "Mar 2 2006 12:00AM" format, on which an environmental sample was collected. Type: datetime Example: Mar 2 2006 12:00AM The field is used to define the date and time on which an environmental sample was collected.
This field is required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Geo; TRACE_TYPE_CODE=Any
STRATEGY=Env Sample-Host; TRACE_TYPE_CODE=Any
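The "Mar 2 2006 12:00AM" format used by COLLECTION_DATE can be read and written with the Python standard library. The sketch below assumes English month abbreviations (the default C locale); the function names are hypothetical.

```python
from datetime import datetime

# Month abbreviation, day, four-digit year, 12-hour clock with AM/PM.
COLLECTION_DATE_FORMAT = "%b %d %Y %I:%M%p"


def parse_collection_date(value: str) -> datetime:
    """Parse a value such as 'Mar 2 2006 12:00AM'."""
    return datetime.strptime(value.strip(), COLLECTION_DATE_FORMAT)


def format_collection_date(moment: datetime) -> str:
    """Format a datetime without zero-padding the day or the hour."""
    hour12 = moment.hour % 12 or 12
    suffix = "AM" if moment.hour < 12 else "PM"
    month = moment.strftime("%b")
    return f"{month} {moment.day} {moment.year} {hour12}:{moment.minute:02d}{suffix}"
```

The formatter builds the string by hand because strftime would zero-pad the day and hour ("Mar 02 2006 12:00AM"), which differs from the example in the field description.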
CVECTOR_ACCESSION
Repository (DDBJ/EMBL/Genbank) accession identifier for the cloning vector. Type: varchar(50) Example: AY451994 The field holds the accession number for the cloning vector used. This cloning vector relates to the clone named in the CLONE_ID field.
CVECTOR_CODE
Center defined code for the cloning vector. Type: varchar(50) Example: PBACE3.6 The field holds the user defined identifier for the cloning vector. Submitters are encouraged to submit all vector sequence information to public repositories.
DEPTH
Depth (in meters) at which an environmental sample was collected. Type: float Example: 10M The field is applicable to water samples and earth samples. If the value of this field is NULL, it is anticipated the sample was taken from the surface of the environment. While this field is only applicable to environmental samples, it is not required.
ELEVATION
Elevation (in meters) at which an environmental sample was collected. Type: float Example: 500 If the value of this field is NULL it is assumed the data were obtained at sea level. The field is only applicable to some environmental sample data, but is not a required field.
ENVIRONMENT_TYPE*
Type of environment from which an environmental sample was collected. Type: varchar(250) Example: sea water The field is used to describe the specific environment from which an environmental sample was taken. While the LATITUDE and LONGITUDE fields describe the location, many types of environment could exist at this location (for example, soil, sludge, tree roots, etc).
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Geo; TRACE_TYPE_CODE=Any
EXTENDED_DATA
Extra ancillary information wrapped in an EXTENDED_DATA block, where actual values are provided with a special <field> tag. Type: varchar() Example:
<extended_data>
    <field name='SamplingSiteMonthChlorophyllLevel'>1.4 mg_mm</field>
    <field name='SamplingSiteYearlyChlorophyllLevel'>1.12 mg_mm</field>
    <field name='SamplingSiteYearlyChlorophyllLevelStdError'>0.19 mg_mm</field>
</extended_data>
The '=' sign and the field separator character '|' should be excluded from names and their values. No other validity checks will be performed on the data.
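An EXTENDED_DATA block in the shape shown above can be generated programmatically while enforcing the one stated rule (no '=' or '|' in names or values). This is a minimal sketch; the function name build_extended_data is hypothetical, and escaping XML special characters is an added precaution that the description does not require.

```python
from typing import Dict
from xml.sax.saxutils import escape

FORBIDDEN_CHARACTERS = ("=", "|")


def build_extended_data(fields: Dict[str, str]) -> str:
    """Render name/value pairs as an <extended_data> block."""
    lines = ["<extended_data>"]
    for name, value in fields.items():
        text_value = str(value)
        for text in (name, text_value):
            if any(ch in text for ch in FORBIDDEN_CHARACTERS):
                raise ValueError(f"'=' and '|' are not allowed: {text!r}")
        # Escape &, <, > (and the single quote inside the attribute).
        safe_name = escape(name, {"'": "&apos;"})
        safe_value = escape(text_value)
        lines.append(f"    <field name='{safe_name}'>{safe_value}</field>")
    lines.append("</extended_data>")
    return "\n".join(lines)
```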
FEATURE_ID_FILE
File describing the features and their locations on a chip. Type: varchar(200) Example: ./mytraces/chip2.cdf The FEATURE_ID_FILE provides the location and sequence of the features for a given chip when TRACE_TYPE_CODE="CHIP".
FEATURE_ID_FILE_NAME*
Reference to a common FEATURE_ID_FILE which should be submitted first. Type: varchar(200) Example: This field is required when TRACE_TYPE_CODE="CHIP".
FEATURE_SIGNAL_FILE
File giving the signal and variance for features on a chip. Type: varchar(200) Example: ./mytraces/chip2.signal The FEATURE_SIGNAL_FILE provides the signal and variance of signal for the features on a given chip when TRACE_TYPE_CODE="CHIP".
FEATURE_SIGNAL_FILE_NAME*
Reference to a common FEATURE_SIGNAL_FILE which should be submitted first. Type: varchar(200) Example: This field is required when TRACE_TYPE_CODE="CHIP".
GENE_NAME
Gene name or some other common identifier. Type: varchar(100) Example: transporter 1 Free text. This field is mainly intended for STRATEGY='Re-sequencing' or 'ENCODE'. When a group is analyzing a particular gene, they may want to refer to that gene by its name or some other common identifier.
HI_FILTER_SIZE
The largest filter used to stratify an environmental sample. Type: varchar(50) Example: 50 micron The field is applicable only to environmental sample data but is not a required field.
HOST_CONDITION
The condition of the host from which an environmental sample was obtained. Type: varchar(100) Example: HIV-positive The field is only applicable to environmental sample data and is used to describe the condition (healthy, sick, etc) of the host from which a sample was taken.
HOST_ID*
Unique identifier for the specific host from which an environmental sample was taken. Type: varchar(100) Example: yerkes pedigree #C0479 'Clint' The field is only applicable to environmental sample data and is used to capture the unique name for the specific host from which a sample was obtained.
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Host; TRACE_TYPE_CODE=Any
HOST_LOCATION*
Specific location on the host from which an environmental sample was collected. Type: varchar(100) Example: rumen The field is only applicable to environmental sample data and is used to describe the specific part of the host from which the sample was obtained, for example: dental plaque, hindgut, root surfaces.
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Host; TRACE_TYPE_CODE=Any
HOST_SPECIES*
The host from which an environmental sample was obtained. Type: varchar(100) Example: Pan troglodytes The field is only applicable to environmental sample data.
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Host; TRACE_TYPE_CODE=Any
INDIVIDUAL_ID
Publicly available identifier to denote a specific individual or sample from which a trace was derived. Type: varchar(100) Example: NA12345 The field provides a center specific unique id that can associate a specific trace to an individual. This will be used primarily for population based studies.
INSERT_FLANK_LEFT*
Flanking sequence at the cloning junction. Type: varchar(100) Example: AAGGTGCGATGCAGTGGCAGTAGCAGTGTCGACGTGACGATTCGTCCGGA The field should provide from 50 up to 100 bases of sequence (including linkers) to the left of the cloning junction. This information will allow users to perform their own vector trimming of reads. This field is required for almost all combinations of STRATEGY and TRACE_TYPE_CODE. This field can be omitted if CLIP_VECTOR_LEFT is populated. However, INSERT_FLANK_LEFT is the preferred choice. If there was no cloning step involved in the sequencing, please populate the field with 'NONE'.
INSERT_FLANK_RIGHT*
Flanking sequence at the cloning junction. Type: varchar(100) Example: AAGGCGCGATGCAGTGAGCGAGGCTGACGTCGGCTAGCGTCGCGTCGGGT The field should provide from 50 up to 100 bases of sequence (including linkers) to the right of the cloning junction. This information will allow users to perform their own vector trimming of reads. This field is required for almost all combinations of STRATEGY and TRACE_TYPE_CODE. This field can be omitted if CLIP_VECTOR_RIGHT is populated. However, INSERT_FLANK_RIGHT is the preferred choice. If there was no cloning step involved in the sequencing, please populate the field with 'NONE'. It is anticipated that if INSERT_FLANK_LEFT is populated, INSERT_FLANK_RIGHT will also be populated. It is not anticipated that a mixture of clip values and junction sequence will be specified (i.e. CLIP_VECTOR_LEFT and INSERT_FLANK_RIGHT populated for the same record).
INSERT_SIZE*
Expected size of the insert (referred to by the value in the TEMPLATE_ID field) in base pairs Type: int Example: 2000 The field indicates the expected insert size of the clone that is sequenced. It is understood that this is an estimate based upon the average insert sizes found in a given library. However, this information is critical for certain experiments, such as whole genome assembly.
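This field would be required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Any; TRACE_TYPE_CODE=WGS
STRATEGY=Any; TRACE_TYPE_CODE=WCS
STRATEGY=cDNA; TRACE_TYPE_CODE=CLONEEND
STRATEGY=CLONEEND; TRACE_TYPE_CODE=CLONEEND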
INSERT_STDEV*
Approximate standard deviation of value in INSERT_SIZE field. Type: int Example: 200 The field reflects the approximate standard deviation of the insert size. It is understood that this information is an approximation and may change as better data is obtained.
This field would be required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Any; TRACE_TYPE_CODE=WGS
STRATEGY=Any; TRACE_TYPE_CODE=WCS
STRATEGY=cDNA; TRACE_TYPE_CODE=CLONEEND
STRATEGY=CLONEEND; TRACE_TYPE_CODE=CLONEEND
LATITUDE*
The latitude measurement (using standard GPS notation) at which a sample was collected. Type: float Example: 54.736 The field is required to describe the collection of some environmental sample data. The latitude range is [-90,90], with the equator at 0 latitude and positive values north of the equator.
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Geo; TRACE_TYPE_CODE=Any
LIBRARY_ID*
The source of the clone identified in the CLONE_ID field Type: varchar(100) Example: RP23 The field documents the source library of the archival clone resource. Many genomic libraries have been registered with the Clone Registry (http://www.ncbi.nlm.nih.gov/clone) and the standard nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used for these libraries.
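This field would be required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=cDNA; TRACE_TYPE_CODE=Any
STRATEGY=EST; TRACE_TYPE_CODE=Any
STRATEGY=CLONEEND; TRACE_TYPE_CODE=CLONEEND
STRATEGY=CLONE; TRACE_TYPE_CODE=Any
STRATEGY=ENCODE; TRACE_TYPE_CODE=SHOTGUN; PrimerWalk; CLONEEND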
LONGITUDE*
The longitude measurement (using standard GPS notation) at which a sample was collected. Type: float Example: -86.403 The field is required to describe the collection of some environmental sample data. Longitude ranges from 0° at the Prime Meridian to +180° eastward and -180° westward.
This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Env Sample-Geo; TRACE_TYPE_CODE=Any
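The ranges given for LATITUDE ([-90,90]) and LONGITUDE (-180 to +180) lend themselves to a simple pre-submission check. A minimal sketch follows; the function name valid_coordinates is hypothetical, and treating the boundary values as valid is an assumption.

```python
def valid_coordinates(latitude: float, longitude: float) -> bool:
    """Check decimal-degree values against the documented ranges.

    Positive latitude is north of the equator; positive longitude is
    east of the Prime Meridian.
    """
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0
```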
LO_FILTER_SIZE
The smallest filter size used to stratify an environmental sample. Type: varchar(50) Example: 25 micron The field is only applicable to environmental sample data but is not a required field.
NCBI_PROJECT_ID
BioProject ID generated by the INSDC. Type: int Example: 7 The NCBI_PROJECT_ID field links traces to the BioProject database, making it easy to retrieve the set of traces for each project. Genome sequencing centers may register their project with the DDBJ BioProject prior to the submission of genomic sequence data. Submitters need not submit sequencing data at the time they register their project.
ORGANISM_NAME*
Description of species for BARCODE project from which trace is derived. Type: varchar(100) Example: Acanthocybium solandri The field is used to classify the read by species for BARCODE data, using the proper taxonomic name in accordance with the Taxonomy Browser. SPECIES_CODE="BARCODESPECIES" for all traces from this project. This field would be required for STRATEGY=BARCODE.
PEAK_FILE
Name of file that contains the list of peak values. Type: varchar(200) Example: ./mytraces/123clone.peak Consult the BASE_FILE and QUAL_FILE field descriptions for more information.
PH
The pH at which an environmental sample was collected. Type: float Example: 7.2 The field is only applicable to environmental sample data but is not a required field.
PICK_GROUP_ID
Id to group traces picked at the same time. Type: int Example: 939065
PLACE_NAME
Country in which the biological sample was collected and/or common name for a given location. Type: varchar(250) Example: Octopus Springs The field is applicable to environmental sample data, but is not required.
PLATE_ID
Submitter defined plate id. Type: varchar(32) Example: 203 The PLATE_ID and WELL_ID fields are intended to identify the storage location of the sequencing template (not the library well coordinate of an archival clone named in the CLONE_ID field). This may enable flipped or contaminated trays to be easily identified. If a particular experiment did not require the use of a plate, please populate this field with '0'.
POPULATION_ID
Center provided id to designate a population from which a trace (or group of traces) was derived. Type: varchar(100) Example: CEPH The field is used to capture center specific designations of groups of individuals. This will likely only be useful in population studies (usually STRATEGY=SNP).
PREP_GROUP_ID
ID that defines groups of traces prepared at the same time. Type: varchar(30) Example: A2
PRIMER
The primer sequence (used in the sequencing reaction). Type: varchar(200) Example: GAATACCTACGATCGCC The value of the field is the actual base sequence of the sequencing primer used. If a center uses a primer extensively, the primer sequence can be entered into the list of primer codes and the PRIMER_CODE field can be used instead.
PRIMER_CODE
Identifier for the sequencing primer used. Type: varchar(30) Example: Sp6
PRIMER_LIST*
A ';' delimited list of primers used in a mapping experiment (such as AFLP). Type: varchar(100) Example: AAGGTCTGCGCGTGTC;AGCTGCGTACGTAATCG; This field is required if STRATEGY="AFLP" and TRACE_TYPE_CODE="PCR".
PROGRAM_ID*
The program used to create the trace file. Type: varchar(100) Example: phred-19990722h The field is used to indicate the base calling program. This field is free text. Program name, version numbers or dates are very useful.
More example values:
  • phred-19980904e
  • abi-3.1
  • ATQA
  • TraceTuner
  • Licor
  • Megabase
  • Beckman
PROJECT_NAME
Term by which to group traces from different centers based on a common project. Type: varchar(50) Example: New Project In this way sequencing centers that are working on the same large project can group all of the traces for this project using a common term. This field has a controlled vocabulary. Sequencing centers wishing to submit data must contact the DDBJ Trace Archive to determine a name that all members of the project agree on.
QUAL_FILE
Name of file containing the quality scores. Type: varchar(200) Example: ./mytraces/123clone.fasta.qs Trace files which do not include the quality scores must provide this information in a separate file. The file designations are recorded in the QUAL_FILE field of the metadata file, and the actual quality scores are stored in the file designated there. If quality scores are provided in separate files, the information in these files will overwrite any information in the trace (usually *.scf) file. If the quality scores that would be provided in the QUAL_FILE are the same as the information in the trace file, DO NOT PROVIDE THE FILE. However, it is important to note that some formats do not include the quality scores, in which case these values must be provided as ancillary information. If the center provides the BASE_FILE and QUAL_FILE, then the peak index information should also be provided in the file named in the PEAK_FILE field.
REFERENCE_ACCESSION*
Reference accession (use accession and version to specify a particular instance of a sequence) used as the basis for a re-sequencing project. For the Comparative strategy, it indicates the basis for primer design. Type: varchar(50) Example: NT_029829.1
This field is required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Re-sequencing; Comparative; TRACE_TYPE_CODE=Any
REFERENCE_ACC_MAX*
Finish position for a particular amplicon in re-sequencing or comparative projects. Type: int Example: 30929 This field points to the finishing coordinate of the amplicon on the sequence named in the REFERENCE_ACCESSION field. All coordinates should be in 1 base coordinates (i.e. sequences start at base 1, not base 0).
This field is required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Re-sequencing; TRACE_TYPE_CODE=SHOTGUN; PCR; RT-PCR
REFERENCE_ACC_MIN*
Start position for a particular amplicon in re-sequencing or comparative projects. Type: int Example: 29829 This field points to the starting coordinate of the amplicon on the sequence named in the REFERENCE_ACCESSION field. All coordinates should be in 1 base coordinates (i.e. sequences start at base 1, not base 0).
This field is required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Re-sequencing; TRACE_TYPE_CODE=SHOTGUN; PCR; RT-PCR
REFERENCE_OFFSET*
Sequence offset of accession specified in REFERENCE_ACCESSION field to define the coordinate start position used as the basis for a re-sequencing project. Type: int Example: 1520899 This field points to the starting coordinate on the sequence named in the REFERENCE_ACCESSION field. All coordinates should be in 1 base coordinates (i.e. sequences start at base 1, not base 0).
This field is required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Re-sequencing; TRACE_TYPE_CODE=CHIP
REFERENCE_SET_MAX
Finish position for an entire re-sequencing region. This region may include several amplicons. Type: int Example: 29829 This field points to the finishing coordinate, on the sequence named in the REFERENCE_ACCESSION field, of an entire re-sequencing region. All coordinates should be in 1 base coordinates (i.e. sequences start at base 1, not base 0). The REFERENCE_ACC_[MIN|MAX] and REFERENCE_SET_[MIN|MAX] should refer to the same REFERENCE_ACCESSION.
REFERENCE_SET_MIN
Start position for an entire re-sequencing region. This region may include several amplicons. Type: int Example: 29829 This field points to the starting coordinate, on the sequence named in the REFERENCE_ACCESSION field, of an entire re-sequencing region. All coordinates should be in 1 base coordinates (i.e. sequences start at base 1, not base 0). The REFERENCE_ACC_[MIN|MAX] and REFERENCE_SET_[MIN|MAX] should refer to the same REFERENCE_ACCESSION.
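The coordinate rules for the REFERENCE_ACC_[MIN|MAX] and REFERENCE_SET_[MIN|MAX] fields can be sketched as a consistency check. The 1-based rule comes from the field descriptions; requiring MIN <= MAX and requiring the amplicon to lie inside the re-sequencing region are assumptions made for illustration, and the function name is hypothetical.

```python
from typing import List, Optional


def check_reference_coordinates(
    acc_min: int,
    acc_max: int,
    set_min: Optional[int] = None,
    set_max: Optional[int] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means no problems."""
    problems = []
    # Stated rule: sequences start at base 1, not base 0.
    if acc_min < 1:
        problems.append("REFERENCE_ACC_MIN must be 1-based (>= 1)")
    # Assumption: the start coordinate does not exceed the finish coordinate.
    if acc_max < acc_min:
        problems.append("REFERENCE_ACC_MAX is smaller than REFERENCE_ACC_MIN")
    if set_min is not None and set_max is not None:
        if set_min < 1:
            problems.append("REFERENCE_SET_MIN must be 1-based (>= 1)")
        if set_max < set_min:
            problems.append("REFERENCE_SET_MAX is smaller than REFERENCE_SET_MIN")
        # Assumption: an amplicon lies within its re-sequencing region.
        if not (set_min <= acc_min and acc_max <= set_max):
            problems.append("amplicon lies outside the re-sequencing region")
    return problems
```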
RUN_DATE
Date the sequencing reaction was run. Type: datetime Example: 2000-10-28
RUN_GROUP_ID
ID used to group traces run on the same machine. Type: varchar(30) Example: group2
RUN_LANE
Lane or capillary of the trace. Type: int Example: 1 The RUN_LANE field documents the specific lane or capillary on which a trace was obtained.
RUN_MACHINE_ID
ID of the specific sequencing machine on which a trace was obtained. Type: varchar(30) Example: machine2
RUN_MACHINE_TYPE
Type or model of machine on which a trace was obtained. Type: varchar(30) Example: ABI 310
SALINITY
The salinity at which an environmental sample was collected measured in parts per thousand units (promille). Type: float Example: 20 The field is only applicable to environmental sample data but is not a required field.
SEQ_LIB_ID*
Center specified M13/PUC library that is actually sequenced. Type: varchar(255) Example: 22194 The field is the center identifier for the M13/PUC based clone that is actually sequenced. This will allow grouping of traces by the actual ligation event and is applicable to most projects. This value will be unique within a given center.
This field would be required for the following combinations of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Any; TRACE_TYPE_CODE=SHOTGUN
STRATEGY=Any; TRACE_TYPE_CODE=WGS/WCS
SOURCE_TYPE*
Source of the DNA. Type: varchar(50) Example: GENOMIC DNA The field consists of a code. Possible values are:
  • G=Genomic DNA (includes PCR products from genomic DNA)
  • N=Non Genomic DNA (EST, cDNA, RT-PCR, screened libraries)
  • VIRAL RNA=Viral RNA
  • SYNTHETIC=Synthetic DNA
Accepted values are G, N, GENOMIC, NON GENOMIC, VIRAL RNA, SYNTHETIC
SPECIES_CODE*
Description of species from which trace is derived. Type: varchar(100) Example: Homo sapiens The field is used to classify the read by species, using proper taxonomic names where possible. This field currently is maintained as a controlled vocabulary. For a list of species currently contained within the Trace Archive, see: http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?cmd=stat&f=xml_list_species&m=obtain&s=species To submit a new species, please contact the DDBJ Trace Archive prior to submission. For cases in which the taxonomic origin of a specific trace is unclear, the taxonomic classification 'ENVIRONMENTAL SEQUENCE' can be used for environmental samples, or 'ARTIFICIAL SEQUENCE' for artificial material.
STRAIN*
Strain from which a trace is derived. Type: varchar(50) Example: C57BL/6J This field is required for STRATEGY="SNP".
STRATEGY*
Experimental strategy. Type: varchar(50) Example: MODEL VERIFY The experimental strategy used when obtaining the trace. It is proposed that this would be a controlled vocabulary, but that submitters would contribute to this list as needed to define various experiments and projects.

  • AFLP: Amplified Fragment Length Polymorphism
  • BARCODE: DNA sequence analysis of a uniform target gene to enable species identification
  • CCS: Concatenated cDNA sequencing
  • cDNA: Sequences generated in the process of sequencing cDNA clones
  • CF-S: Cot-filtered single/low-copy genomic DNA
  • CF-M: Cot-filtered moderately repetitive genomic DNA
  • CF-H: Cot-filtered highly repetitive genomic DNA
  • CF-T: Cot-filtered theoretical single-copy DNA
  • CLONE: Genomic clone based (hierarchical) sequencing
  • CLONEEND: Sequences generated from the end of a clone (BAC/PAC/Fosmid or cDNA)
  • Comparative: Sequences obtained using primers designed from related species
  • CTS: Concatenated Tag Sequencing
  • Env Sample-GEO: Geographically generated environmental sample
  • Env Sample-Host: Environmental samples collected from a specific host
  • EST: single pass sequencing of cDNA templates
  • FINISHING: a read specifically made for finishing, could be either BAC finishing or Whole Genome Assembly (WGA) finishing
  • MODEL VERIFY: Sequences obtained to verify proposed gene models
  • PoolClone: Pools of clones (BACs mostly)
  • SNP: Reads used for SNP identification
  • TARGETED LOCUS: Sequences obtained from templates generated by primers designed to amplify a specific genetic locus
  • Re-sequencing: Re-sequencing of targeted genomic regions
  • RT-PCR: Sequences obtained using templates generated by Reverse Transcriptase Polymerase Chain Reaction
  • WGA: Whole Genome Assembly
SUBMISSION_TYPE*
Type of submission. Type: varchar(50) Example: NEW The field allows the following values:
  • NEW: use to submit new data
  • UPDATE: use to renew traces and their ancillary information. Previous data will be saved with their TI's; new traces with the same trace_name's will receive new TI's and they will become active
  • UPDATEINFO: use to update or add ancillary information for already existing traces without re-submitting the entire package of data
  • WITHDRAW: use to withdraw traces
SVECTOR_ACCESSION
DDBJ/EMBL/Genbank accession of the sequencing vector. Type: varchar(50) Example: X52325
SVECTOR_CODE
Center defined code for the sequencing vector. Type: varchar(50) Example: pBluescript SK(+)
TEMPERATURE
The temperature (in °C) at which an environmental sample was collected. Type: float Example: 30 The field is only applicable to environmental sample data but it is not a required field.
TEMPLATE_ID
Submitter defined identifier for the sequencing template. Type: varchar(50) Example: HBBBA2211 The field is used to uniquely identify the actual template that is sequenced. This field, in conjunction with the TRACE_END field, can be used to identify traces that should be marked as 'mate_pairs' because they come from opposite ends of the same clone.
TRACE_END
Defines the end of the template contained in the read. Type: varchar(50) Example: F The field can have the following values:
  • F: FORWARD
  • R: REVERSE
  • N: UNKNOWN
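The TEMPLATE_ID description notes that TEMPLATE_ID together with TRACE_END can identify mate pairs. A minimal sketch of that grouping follows, assuming each record is a dictionary keyed by field name; the function name find_mate_pairs is hypothetical, and pairing only templates with exactly one forward and one reverse read is a simplifying assumption, not the archive's actual procedure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_mate_pairs(records: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (forward TRACE_NAME, reverse TRACE_NAME) pairs per template."""
    by_template = defaultdict(lambda: {"F": [], "R": []})
    for record in records:
        template = record.get("TEMPLATE_ID")
        end = record.get("TRACE_END")
        # Reads with TRACE_END 'N' (unknown) or no template cannot be paired.
        if template and end in ("F", "R"):
            by_template[template][end].append(record["TRACE_NAME"])
    pairs = []
    for ends in by_template.values():
        # Assumption: only unambiguous templates are reported as pairs.
        if len(ends["F"]) == 1 and len(ends["R"]) == 1:
            pairs.append((ends["F"][0], ends["R"][0]))
    return pairs
```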
TRACE_FILE*
Filename with the trace, relative to the top of the volume. Type: varchar(200) Example: ./traces/TRACE001.scf
TRACE_FORMAT*
Format of the trace file. Type: varchar(20) Example: scf The field can have the following values:
  • SCF - A standard file format for data from DNA sequencing instruments.
  • ABI - An ABI trace file is a binary file including the trace data and the sequence.
TRACE_NAME*
Center defined trace identifier. Type: varchar(250) Example: HBBBA1U2211 The field must be unique within a center, but is not required to be unique between centers. The combination of CENTER_NAME and TRACE_NAME acts as a unique key within the Trace Archive.
TRACE_TYPE_CODE*
Sequencing strategy by which the trace was obtained. Type: varchar(50) Example: wgs The field reflects the sequencing strategy used to obtain the trace.

  • CHIP: Sequences obtained using microarrays (also called DNA chips or gene chips)
  • CLONEEND: Sequences generated from the end of a large insert (BAC/PAC/Fosmid) or cDNA clone
  • EST: Single Pass Expressed Sequence Tag
  • HTP SELEX: High throughput SELEX
  • OTHER: Other than PCR, PrimerWalk, SHOTGUN or TRANSPOSON for FINISHING
  • PCR: Sequences obtained using templates generated by genomic Polymerase Chain Reaction
  • PrimerWalk: Sequences generated through a primer walking step
  • RT-PCR: Sequences obtained using templates generated by Reverse Transcriptase Polymerase Chain Reaction
  • SHOTGUN: Shotgun sequencing of clones (genomic or cDNA)
  • TRANSPOSON: Sequences obtained using templates generated by transposons
  • WCS: Whole Chromosome Shotgun
  • WGS: Whole Genome Shotgun
TRANSPOSON_ACC*
DDBJ/EMBL/Genbank accession for transposon used in generating sequencing template. Type: varchar(50) Example: X00913 This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Any; TRACE_TYPE_CODE=TRANSPOSON
TRANSPOSON_CODE*
Center defined code for transposon used in generating sequencing template. Type: varchar(50) Example: Mu transposon This field would be required for the following combination of STRATEGY and TRACE_TYPE_CODE:
STRATEGY=Any; TRACE_TYPE_CODE=TRANSPOSON
WELL_ID
Center defined well identifier for the sequencing reaction. Type: varchar(50) Example: A1 The WELL_ID field, in combination with the PLATE_ID field, is used to define the storage location of the sequencing reaction (see the note with the PLATE_ID field). Typically, sequencing reactions are performed in standard microtiter dishes having either 96 or 384 wells (see standard configurations below).
Standard 96 well microtiter configuration
Standard 384 well microtiter configuration
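A WELL_ID such as A1 can be converted to a row and column for the two plate sizes mentioned above. The sketch below assumes the common letter-plus-number naming (rows A-H and columns 1-12 for 96 wells, rows A-P and columns 1-24 for 384 wells); the configuration figures themselves are not reproduced here, so this layout is an assumption, and the function name parse_well_id is hypothetical.

```python
import re
from typing import Tuple

# plate size -> (number of rows, number of columns); assumed layouts
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24)}


def parse_well_id(well_id: str, plate_size: int = 96) -> Tuple[int, int]:
    """Convert a well name such as 'A1' to 1-based (row, column)."""
    rows, cols = PLATE_LAYOUTS[plate_size]
    match = re.fullmatch(r"([A-Za-z])(\d{1,2})", well_id.strip())
    if match is None:
        raise ValueError(f"unrecognized well identifier: {well_id!r}")
    row = ord(match.group(1).upper()) - ord("A") + 1
    col = int(match.group(2))
    if not (1 <= row <= rows and 1 <= col <= cols):
        raise ValueError(f"{well_id!r} is outside a {plate_size}-well plate")
    return row, col
```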

Internal Fields List

BASECALL_LENGTH
Length of the trace in base pairs. Type: int Example: 396
BASES_20
Number of base pairs for which the quality score exceeds 20. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
BASES_40
Number of base pairs for which the quality score exceeds 40. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
BASES_60
Number of base pairs for which the quality score exceeds 60. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
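The BASES_20, BASES_40 and BASES_60 counts can be reproduced from a list of per-base quality scores. This is a sketch of the definition as worded above (scores strictly greater than the threshold), not the archive's own implementation; the function name bases_above is hypothetical.

```python
from typing import Iterable


def bases_above(quality_scores: Iterable[int], threshold: int) -> int:
    """Count bases whose quality score exceeds the threshold."""
    return sum(1 for score in quality_scores if score > threshold)
```

For a read without quality scores, the input is empty and every count is 0.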
LOAD_DATE
Date on which the data was loaded. Type: smalldatetime Example: Jan 8 2001 11:59AM
MATE_PAIR
TI's of the reads obtained from the other end of the same template. Type: int Example: 203682255 MATE PAIR is the pair of reads obtained from two ends of the same template (FORWARD and REVERSE).
REPLACED_BY
TI that replaced the current TI as "active". Type: int Example: 304753779 This field points to the more recent data set. If the trace was updated, then the field stores the TI of the new trace. If only ancillary information has been updated, then replaced_by=0 and is not shown.
STATE
Indicates the status of the trace. Type: varchar Example: active
  • active
  • updated
  • withdrawn
TAXID
NCBI Taxonomy ID. Type: int Example: 10090 This field links DDBJ Trace Archive with NCBI Taxonomy Browser.
TI
Trace unique internal identifier. Type: int Example: 304753779 It is assigned to a record at the loading stage, and any record, or any number of records, can be obtained by their identifiers.
UPDATE_DATE
Date on which the data was updated/replaced. Type: smalldatetime Example: Jul 19 2001 3:48PM This field is used to store the date of the last update.