Glossary terms: 402

ACCESSION
DDBJ/EMBL/Genbank accession number. Type: varchar(30) Example: AC22227 The ACCESSION is assigned upon deposition to a public repository (DDBJ/EMBL/Genbank). This field will not be applicable to all trace types (primarily WGS). However, if this field contains a valid accession identifier, correlation between the primary sequence data (in Trace) and the secondary sequence data (in the public repository) is facilitated.
Database: Trace Archive; Category: Metadata Field List
Adapter Spec
Some technologies will require knowledge of the sequencing adapter or the last base of the adapter in order to decode the spot.
Database: Sequence Read Archive; Category: Experiment
Agency
Name of funding agency. For example: Japan Society for the Promotion of Science.
Database: BioProject; Category: General info
Agency abbreviation
Abbreviation of funding agency. For example: JSPS.
Database: BioProject; Category: General info
Alias
Name of the experiment designated by the archive. This alias is used to reference metadata objects without accession numbers.
Database: Sequence Read Archive; Category: Experiment
Alias
Name of the run designated by the archive. This alias is used to reference metadata objects without accession numbers.
Database: Sequence Read Archive; Category: Run
Alias
Name of the analysis designated by the archive. This alias is used to reference metadata objects without accession numbers.
Database: Sequence Read Archive; Category: Analysis
AMPLIFICATION_FORWARD
The forward amplification primer sequence. Type: varchar(100) Example: GGATTCTGACTAACGAGC The AMPLIFICATION_FORWARD field allows submitters to define the primers used to amplify templates for sequencing. This field is required when =PCR or RT-PCR.
Database: Trace Archive; Category: Metadata Field List
AMPLIFICATION_REVERSE
The reverse amplification primer sequence. Type: varchar(100) Example: GGATTCTGACTAACGAGC The AMPLIFICATION_REVERSE field allows submitters to define the primers used to amplify templates for sequencing. This field is required when =PCR or RT-PCR.
Database: Trace Archive; Category: Metadata Field List
AMPLIFICATION_SIZE
The expected amplification size for a pair of primers. Type: int Example: 500 The AMPLIFICATION_SIZE field allows submitters to define the expected amplification size for a pair of primers (defined in the AMPLIFICATION_FORWARD and AMPLIFICATION_REVERSE fields). This number should be given in base pairs. If =PCR, the amplification size is based on amplification of genomic DNA. If the =RT-PCR, then the amplification size is based on amplification of transcript.
Database: Trace Archive; Category: Metadata Field List
Analysis Center
If applicable, the name of the center that produced this analysis. Center Name List.
Database: Sequence Read Archive; Category: Analysis
Analysis Date
The date when this analysis was produced.
Database: Sequence Read Archive; Category: Analysis
Analysis Type
Select an Analysis type. Submit alignment data to Run in bam format.
Analysis Type: Description
De Novo Assembly: A placement of sequences including trace, SRA, GI records into a multiple alignment from which a consensus is computed.
Sequence Annotation: Per sequence annotation of named attributes and values.
Example: Processed sequencing data for submission to dbEST without assembly.
Reads have already been submitted to one of the sequence read archives in raw form.
The fasta data submitted under this analysis object result from the following treatments, which may serve to filter reads from the raw dataset:
    - sequencing adapter removal
    - low quality trimming
    - poly-A tail removal
    - strand orientation
    - contaminant removal.
Abundance Measurement: Identify the tools and processing steps used to produce the abundance measurements (coverage tracks).
Database: Sequence Read Archive; Category: Analysis
ANONYMIZED_ID
Anonymous ID for an individual. Type: varchar(100) Example: 2222anonym Used in projects to maintain the anonymity of donors. In many cases, there may be a controlled access database that can map many anonymized_ids in the trace archive to a single individual id for which phenotypic information may be available.
Database: Trace Archive; Category: Metadata Field List
Array Data File
This column contains a list of raw data files, one for each row of the SDRF file, linking these data files to their respective hybridizations. The following columns can be used to annotate Array Data File columns
Database: Omics Archive; Category: SDRF
Array Data Matrix File
This column contains a list of raw data matrix files, where data from multiple hybridizations is stored in a single file, and the data mapped to each hybridization via the Data Matrix format itself. The following columns can be used to annotate Array Data Matrix File columns
Database: Omics Archive; Category: SDRF
Array Design REF
This column contains references to the array design used for each hybridization. For DOR submissions this should be an ArrayExpress/DOR accession number, e.g. "A-DORD-1". The following columns can be used to annotate Array Design REF columns The Term Source REF column here can be used to point to the source of the array design referenced; however for DOR submissions this should always be ArrayExpress, and so this column is in effect ignored.
Database: Omics Archive; Category: SDRF
ascii
ASCII character based encoding.
Database: Sequence Read Archive; Category: Run
Assay Name
This column contains user-defined names for each Assay. "Assay Name" may be used instead of "Hybridization Name" to identify generic biological assays, such as rtPCR and sequencing. Note that this column should not be used for submission of regular microarray experiments to DOR. All Assay Name columns must be followed by a Technology Type column. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Assay Name columns
Database: Omics Archive; Category: SDRF
@
ASCII value 64. Typically used for range 0..60.
Database: Sequence Read Archive; Category: Run
ATTEMPT
Number of times the sequencing project has been attempted by the center and/or submitted to the Trace Archive. Type: tinyint(1-255) Example: 2
Database: Trace Archive; Category: Metadata Field List
Attributes
A list of attributes and their definitions can be viewed here. Besides the mandatory fields, there are several optional attribute fields. To make the BioSample record most useful, you should include all available information in the submission. Commonly used and useful attributes have been defined, with standardized nomenclature. In preparing your submission, please refer to this attributes list and fill in the relevant fields. If you have information of a type that does not appear in the standard list, you can create it as a Custom Attribute.
Database: Biosample; Category: Attributes
Base Call
The element's body contains a basecall; its attributes describe the meaning of this read as well as the matching rules.
Database: Sequence Read Archive; Category: Experiment
BASE_FILE
File name with base calls. Type: varchar(200) Example: ./mytraces/123clone.fasta Trace files which do not include the basecalls must provide this information in a separate file. The file designations are recorded in the BASE_FILE field of the metadata file. If basecalls are provided in separate files, the information in these files will overwrite any information in the trace (usually *.scf) file. If the base calls that would be provided in the BASE_FILE are the same as the information in the trace file, DO NOT PROVIDE THE FILE. If the center provides the and, then the peak index information should also be provided in a file called.
Database: Trace Archive; Category: Metadata Field List
BASECALL_LENGTH
Length of the trace in base pairs. Type: int Example: 396
Database: Trace Archive; Category: Internal Fields List
BASES_20
Number of base pairs for which the quality score exceeds 20. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
Database: Trace Archive; Category: Internal Fields List
BASES_40
Number of base pairs for which the quality score exceeds 40. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
Database: Trace Archive; Category: Internal Fields List
BASES_60
Number of base pairs for which the quality score exceeds 60. Type: smallint Example: 50 Warning: There are some depositions that do not have quality scores. This is likely due to the center submitting ABI files and not providing quality calls separately.
Database: Trace Archive; Category: Internal Fields List
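The BASES_20, BASES_40 and BASES_60 counts can be illustrated with a short sketch. It assumes the quality scores are already available as a list of integers and reads "exceeds" as a strict comparison; the function names are hypothetical and are not part of the Trace Archive.

```python
from typing import Dict, Iterable, List


def count_bases_above(quality_scores: Iterable[int], threshold: int) -> int:
    """Count the bases whose quality score is strictly greater than threshold."""
    return sum(1 for score in quality_scores if score > threshold)


def summarize_quality(quality_scores: List[int]) -> Dict[str, int]:
    """Return the three counts keyed by the glossary field names."""
    return {
        f"BASES_{threshold}": count_bases_above(quality_scores, threshold)
        for threshold in (20, 40, 60)
    }


if __name__ == "__main__":
    # Illustrative scores only; a trace without quality calls would give zeros.
    print(summarize_quality([10, 20, 21, 40, 41, 60, 61]))
```

A trace deposited without quality scores would be represented here by an empty list, which yields zero for all three fields.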
bin
Length value bin.
Database: Sequence Read Archive; Category: Experiment
Biomaterial provider
Indicate the source of the study material (e.g., ATCC ID or a Principal Investigator or lab).
Database: BioProject; Category: General info
Umbrella BioProject accession
A BioProject accession number of an initiative which is already registered in the BioProject database.
Database: BioProject; Category: General info
BioProject ID
Select a project registered to BioProject or submit a new project. For submission to BioProject, please refer to the BioProject Handbook.
Database: Sequence Read Archive; Category: BioProject
BioSample ID
Select samples registered to BioSample or create and submit new samples. For submission to BioSample, please refer to the BioSample Handbook.
Database: Sequence Read Archive; Category: BioSample
Biotic Relationship
Select a BioticRelationship.
BioticRelationship
FreeLiving
Commensal
Symbiont
Episymbiont
Intracellular
Parasite
Host
Endosymbiont
Database: BioProject; Category: Target
Capture
The scale, or type, of information that the study is designed to generate from the sample material.
Capture: Description
Whole: The project makes use of the whole sample material (most common case).
Clone Ends: Capturing clone end data.
Exome: Capturing exon-specific data.
Targeted Locus/Loci: Capturing specific loci (gene, genomic region, barcode standard).
Random Survey: Not using whole sample, an incomplete survey of the sample.
Other: Specify the scale or type of the captured material in the "Target description".
Database: BioProject; Category: Project type
Cellularity
Select a cellularity.
Cellularity
Unicellular
Multicellular
Colonial
Database: BioProject; Category: Target
Center Name

A submitter's center name. Center Name List. A center name abbreviation is required to submit data to DRA.

In the metadata creation tool, the center name is automatically filled with the account information.

The Center Name is an abbreviation operationally used by SRA and is not for indicating ownership of submission. Submitters listed in Submitter hold ownership of submission.

Database: Sequence Read Archive; Category: Submission
Center Name
Controlled vocabulary identifying the sequencing center, core facility, consortium, or laboratory responsible for the experiment. Center Name List.
Database: Sequence Read Archive; Category: Experiment
Center Name
Controlled vocabulary identifying the sequencing center, core facility, consortium, or laboratory responsible for the run. Center Name List.
Database: Sequence Read Archive; Category: Run
Center Name
Controlled vocabulary identifying the sequencing center, core facility, consortium, or laboratory responsible for the analysis. Center Name List.
Database: Sequence Read Archive; Category: Analysis
CENTER_NAME
Name of the sequencing center. Type: varchar(50) Example: WUGSC Sequencing centers wishing to submit data must contact the DDBJ Trace Archive administrators to determine a center abbreviation. This abbreviation is used in the CENTER_NAME field. This field has a controlled vocabulary. For the complete list of submitting centers see: http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?view=submitting_centers

These center names are controlled separately from those of the Sequence Read Archive.

Database: Trace Archive; Category: Metadata Field List
CENTER_PROJECT
Center defined project name. Type: varchar(100) Example: HBBB The CENTER_PROJECT field reflects a sequencing center's internal designation for a specific sequencing project. This field can be useful for grouping related traces.
Database: Trace Archive; Category: Metadata Field List
Characteristics[]
Controlled vocabulary term or measurement. Used as an attribute column following Source Name, Sample Name, Extract Name, or Labeled Extract Name. This column contains terms describing each material according to the characteristics category indicated in the column header. For example, a column headed "Characteristics[OrganismPart]" would contain individual OrganismPart terms. These terms may be user-defined (the default), from an external ontology source (indicated using a Term Source REF column), or a measurement (indicated using a Unit[] column). The following columns can be used to annotate Characteristics[<category term>]:
Database: Omics Archive; Category: SDRF
CHEMISTRY
Description of the chemistry used in the sequencing reaction. Type: varchar(50) Example: BIGDYEV3.0
Database: Trace Archive; Category: Metadata Field List
CHEMISTRY_TYPE
Type of chemistry used in the sequencing reaction. Type: char(50) Example: P The CHEMISTRY_TYPE field uses a controlled list.
Accepted values are:
Primer
Terminator
p=primer; t=terminator
Database: Trace Archive; Category: Metadata Field List
CHROMOSOME
Chromosome to which the trace is assigned. Type: varchar(8) Example: 11 The CHROMOSOME field indicates to which chromosome a trace has been assigned. Gene names or cytogenetic positions are not appropriate substitutes for chromosome information.
Database: Trace Archive; Category: Metadata Field List
CLIP_QUALITY_LEFT
Left clip of the read, in base pairs, based on quality analysis. Type: int Example: 56 The CLIP_QUALITY_LEFT field indicates the base at the beginning of the sequence at which the read should be clipped due to poor quality sequence. The given value would be the first base of the high quality region of the trace.
Database: Trace Archive; Category: Metadata Field List
CLIP_QUALITY_RIGHT
Right clip of the read, in base pairs, based on quality analysis. Type: int Example: 256 The CLIP_QUALITY_RIGHT field indicates the base at the end of the sequence at which the read should be clipped due to poor quality sequence. The given value would be the last base of the high quality region of the trace.
Database: Trace Archive; Category: Metadata Field List
CLIP_VECTOR_LEFT
Left clip of the read, in base pairs, based on vector sequence. Type: int Example: 75 The CLIP_VECTOR_LEFT field indicates the base at the beginning of the sequence at which the read should be clipped due to vector sequence. The given value would be the first base of non-vector sequence. This field is required for almost all combinations of and . This information can be omitted if the field is populated or is PCR or RT-PCR.
Database: Trace Archive; Category: Metadata Field List
CLIP_VECTOR_RIGHT
Right clip of the read, in base pairs, based on vector sequence. Type: int Example: 275 The CLIP_VECTOR_RIGHT field indicates the base at the end of the sequence at which the read should be clipped due to vector sequence. The given value would be the last base of non-vector sequence. This field is required for almost all combinations of and . This information can be omitted if the field is populated or is PCR or RT-PCR. NOTE: Many centers combine vector and quality analysis, and thus have only one set of clip values. In this case, the set of values should be placed in the / fields.
Database: Trace Archive; Category: Metadata Field List
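The four clip fields can be combined into a single usable region of a read. The sketch below is only an illustration: it assumes clip positions are 1-based and inclusive (the glossary does not state the indexing base), treats a missing value as "no clip on that side", and uses a hypothetical function name.

```python
from typing import Optional


def clip_read(
    bases: str,
    quality_left: Optional[int] = None,
    quality_right: Optional[int] = None,
    vector_left: Optional[int] = None,
    vector_right: Optional[int] = None,
) -> str:
    """Return the part of the read that is both high quality and non-vector.

    Assumption: positions are 1-based and inclusive, so the left value is the
    first base kept and the right value is the last base kept.
    """
    lefts = [v for v in (quality_left, vector_left) if v is not None]
    rights = [v for v in (quality_right, vector_right) if v is not None]
    first = max(lefts) if lefts else 1
    last = min(rights) if rights else len(bases)
    if first > last:
        return ""  # the clip regions do not overlap
    return bases[first - 1:last]


if __name__ == "__main__":
    read = "ACGT" * 100  # 400 bases of placeholder sequence
    kept = clip_read(read, quality_left=56, quality_right=256,
                     vector_left=75, vector_right=275)
    print(len(kept))
```

With the example values from the four entries (56, 256, 75, 275), the kept region runs from base 75 to base 256 under these assumptions.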
CLONE_ID
The name of the clone from which the trace was derived. Type: varchar(30) Example: RP23-1123F10 The CLONE_ID field is used to store the identifier related to an individual clone, for example a BAC clone, PAC clone or cDNA clone. If the clone is registered with the clone registry (http://www.ncbi.nlm.nih.gov/clone/), standard clone registry nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used.
This field is required for the following combination of and :
=cDNA;=Any
=EST;=Any
=CLONEEND;=CLONEEND
=CLONE;=Any
=ENCODE;=SHOTGUN; PrimerWalk; CLONEEND
=FINISHING;=Any
Database: Trace Archive; Category: Metadata Field List
CLONE_ID_LIST
Semi-colon delimited list of clones if the Strategy is PoolClone. Type: varchar(30) Example: RP23-200A2;RP23-500P1 The CLONE_ID_LIST field is used only if =PoolClone. In this case, the list of clones is provided as a semicolon delimited list. If the clones are registered with the Clone Registry (http://www.ncbi.nlm.nih.gov/clone/), standard clone registry nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used (see the CLONE_ID field). Note: The list of clones is not limited, but the size of the individual clone within the list is limited to 30 bytes.
This field is required for the following combination of and :
=PoolClone;=Any
Database: Trace Archive; Category: Metadata Field List
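A CLONE_ID_LIST value can be assembled and checked against the 30-byte limit per clone before submission. This is a minimal sketch; the function name is hypothetical, and measuring the byte length in UTF-8 is an assumption, since the glossary does not name an encoding.

```python
from typing import Iterable

MAX_CLONE_BYTES = 30  # per-clone limit stated in the CLONE_ID_LIST entry


def build_clone_id_list(clone_ids: Iterable[str]) -> str:
    """Join clone names with semicolons after checking each name."""
    cleaned = []
    for clone_id in clone_ids:
        name = clone_id.strip()
        if not name:
            raise ValueError("empty clone name")
        if ";" in name:
            raise ValueError(f"clone name contains the delimiter: {name!r}")
        # Assumption: the limit is measured on the UTF-8 encoded name.
        if len(name.encode("utf-8")) > MAX_CLONE_BYTES:
            raise ValueError(f"clone name longer than {MAX_CLONE_BYTES} bytes: {name!r}")
        cleaned.append(name)
    return ";".join(cleaned)


if __name__ == "__main__":
    print(build_clone_id_list(["RP23-200A2", "RP23-500P1"]))
```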
COLLECTION_DATE
The full date, in "Mar 2 2006 12:00AM" format, on which an environmental sample was collected. Type: datetime Example: Mar 2 2006 12:00AM The COLLECTION_DATE field is used to define the date and time on which an environmental sample was collected.
This field is required for the following combination of and :
=Env Sample-Geo; =Any
=Env Sample-Host; =Any
Database: Trace Archive; Category: Metadata Field List
Color Matrix
Matrix of code numbers (Value) and two base combinations (Dibase).
Value  Dibase  Value  Dibase
0      AA      2      AG
0      CC      2      GA
0      TT      2      CT
0      GG      2      TC
1      AC      3      AT
1      CA      3      TA
1      GT      3      CG
1      TG      3      GC
Database: Sequence Read Archive; Category: Experiment
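The relationship between dibases and code numbers can be reproduced compactly in code. The sketch below assumes the standard two-base (color space) encoding, in which the code is the XOR of 2-bit base values (A=0, C=1, G=2, T=3), so that CG and GC map to 3; the function names are hypothetical.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def dibase_to_color(dibase: str) -> int:
    """Return the code number (0-3) for a two-base combination such as 'AG'."""
    first, second = dibase.upper()
    return BASE_CODE[first] ^ BASE_CODE[second]


def decode_colors(first_base: str, colors: str) -> str:
    """Rebuild a base sequence from a known first base and a string of codes."""
    bases = [first_base.upper()]
    for color in colors:
        previous = BASE_CODE[bases[-1]]
        bases.append(CODE_BASE[previous ^ int(color)])
    return "".join(bases)


if __name__ == "__main__":
    for dibase in ("AA", "AC", "AG", "AT"):
        print(dibase, dibase_to_color(dibase))
    print(decode_colors("A", "0123"))
```

Because each code depends on two adjacent bases, decoding needs a known starting base; every later base follows from the previous base and the next code.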
Color Matrix Code
Code numbers used to encode two base combinations.
Database: Sequence Read Archive; Category: Experiment
Comment[]
This column can be used to annotate the main graph node and edge columns listed above. It is included as an extensibility mechanism, and should not generally be used to encode meaningful biological annotation. The column header should contain a name for the type of values included in the column.
Database: Omics Archive; Category: SDRF
Comment[]
A user-defined value which is associated with the investigation. For example, DOR uses "Comment [BioProject ID]" to record the BioProject ID; alternatively, tags such as "Comment [Goal]" might be used to indicate the purpose behind an investigation.
Database: Omics Archive; Category: IDF
Comment[BioProject ID]
The BioProject ID of the associated project. This is used to group the related INSDC records. See the DDBJ BioProject website for details.
Database: Omics Archive; Category: IDF
Comment[Center Name]
The center name of the associated DRA submission.
Database: Omics Archive; Category: IDF
Comment[DRA accession]
The DRA accession number(s) of the associated raw sequencing reads. This field links the processed data in the DOR and raw data in the DRA. When the data set is submitted to the DOR, the DOR registers raw data to the DRA and fills in this field.
Database: Omics Archive; Category: IDF
Comment[Laboratory Name]
The Laboratory name of the associated DRA submission.
Database: Omics Archive; Category: IDF
Consortium name
If study is carried out as part of a consortium, provide the consortium name.
Database: BioProject; Category: General info
Consortium URL
If the consortium maintains a web site, provide the URL.
Database: BioProject; Category: General info
Contact Person
Contact information of submitter(s). Questions and notifications about a submission are sent to the e-mail address(es) listed here. Personal contact information is considered confidential and is collected to be used by DDBJ staff should questions arise; the general information about the research center is used for public display.
Database: Biosample; Category: Submitter
CVECTOR_ACCESSION
Repository (DDBJ/EMBL/Genbank) accession identifier for the cloning vector. Type: varchar(50) Example: AY451994 The CVECTOR_ACCESSION field holds the accession number for the cloning vector used. This cloning vector relates to the clone named in the CLONE_ID field.
Database: Trace Archive; Category: Metadata Field List
CVECTOR_CODE
Center defined code for the cloning vector. Type: varchar(50) Example: PBACE3.6 The CVECTOR_CODE field holds the user defined identifier for the cloning vector. Submitters are encouraged to submit all vector sequence information to public repositories.
Database: Trace Archive; Category: Metadata Field List
Data provider
Indicate the data provider (data submitter) if it is someone other than the submitting organization or consortium. For example, a sequencing center.
Database: BioProject; Category: General info
Data provider
Indicate the data provider (data submitter) if it is someone other than the submitting organization or consortium. For example, a sequencing center or a DACC.
Database: Biosample; Category: General info
Data provider URL
If you would like us to present a link to the data provider then please provide the URL.
Database: BioProject; Category: General info
Data provider URL
If you would like us to present a link to the data provider then please provide the URL.
Database: Biosample; Category: General info
Data Release
Specify when this submission should be released to the public.
Data release: Description
Release: Submitted BioSample record will be released immediately after the curation process finishes.
Hold: Submitted BioSample record is released when the DDBJ, DRA, DTA and DOR record(s) referencing this BioSample ID is released. Private DDBJ record(s) referencing this BioSample ID is not released.
Database: Biosample; Category: General info
Date
Used as an attribute column following Protocol REF. The date (and time, where available) upon which the protocol was performed, in the following format: YYYY-MM-DDThh:mm:ssZ (for example, 2008-09-12T16:27:27Z)
Database: Omics Archive; Category: SDRF
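The YYYY-MM-DDThh:mm:ssZ layout required for the Date column can be produced with the standard library. This is a sketch with hypothetical helper names; treating a datetime without timezone information as already being in UTC is an assumption.

```python
from datetime import datetime, timezone

PROTOCOL_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_protocol_date(moment: datetime) -> str:
    """Render a datetime as YYYY-MM-DDThh:mm:ssZ (UTC)."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are taken to be UTC already.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime(PROTOCOL_DATE_FORMAT)


def parse_protocol_date(text: str) -> datetime:
    """Parse a YYYY-MM-DDThh:mm:ssZ value into a timezone-aware datetime."""
    return datetime.strptime(text, PROTOCOL_DATE_FORMAT).replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    performed = datetime(2008, 9, 12, 16, 27, 27, tzinfo=timezone.utc)
    print(format_protocol_date(performed))
```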
Date of Experiment
The date on which the experiment was performed. The date should be entered in the "YYYY-MM-DD" format (ex. 2011-01-01). This tag can only have one value.
Database: Omics Archive; Category: IDF
decimal
Single decimal value per quality score.
Database: Sequence Read Archive; Category: Run
DEPTH
Depth (in meters) at which an environmental sample was collected. Type: float Example: 10M The DEPTH field is applicable to water samples and earth samples. If the value of this field is NULL, it is anticipated the sample was taken from the surface of the environment. While this field is only applicable to environmental samples, it is not required.
Database: Trace Archive; Category: Metadata Field List
Derived Array Data File
This column contains a list of processed data files, one for each row of the SDRF file, linking these data files to their respective hybridizations. The following columns can be used to annotate Derived Array Data File columns
Database: Omics Archive; Category: SDRF
Derived Array Data Matrix File
This column contains a list of processed data matrix files, where data from multiple hybridizations is stored in a single file, and the data mapped to each hybridization (or scan, or normalization) via the Data Matrix format itself. The following columns can be used to annotate Derived Array Data Matrix File columns
Database: Omics Archive; Category: SDRF
Description
Describes the contents of the analysis.
Database: Sequence Read Archive; Category: Analysis
Description
Used as an attribute column following Source Name, Sample Name, Extract Name, or Labeled Extract Name. A free-text description to be attached to the corresponding material. To be used sparingly, if at all - most annotations should be provided using controlled vocabulary terms, using Characteristics[] columns.
Database: Omics Archive; Category: SDRF
Description
A brief description, to elaborate upon the brief label.
Database: BioProject; Category: Target
Description
A description of any unusual features of the replicon.
Database: BioProject; Category: Target
Description of novel organism
Enter necessary information to register an organism to the taxonomy database.
Database: BioProject; Category: Target
Disease
Enter a disease name.
Database: BioProject; Category: Target
DOI
Provide a DOI if a PubMed ID is not available. Provide the additional reference information.
<Publication id="10.1093/nar/gku1120">
	<DbType>eDOI</DbType>
</Publication>
<ProjectReleaseDate> ...
Database: BioProject; Category: Publication
DOI
Provide a DOI if a PubMed ID is not available. Provide the additional reference information.
Database: Biosample; Category: Publications
E-mail
E-mail of submitter.
Database: Sequence Read Archive; Category: Submission
E-mail
E-mail address. Enter an address from the organization's domain.
Database: BioProject; Category: Submitter
E-mail
E-mail address. Enter an address from the organization's domain.
Database: Biosample; Category: Submitter
ELEVATION
Elevation (in meters) at which an environmental sample was collected. Type: float Example: 500 If the value of this field is NULL, it is assumed the data were obtained at sea level. The ELEVATION field is only applicable to some environmental sample data, but is not a required field.
Database: Trace Archive; Category: Metadata Field List
end
Both matches and mismatches are counted. When Max Mismatch is exceeded, it is not a match. When Min Match is reached, a match is declared.
Database: Sequence Read Archive; Category: Experiment
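The counting rule in the "end" entry can be sketched as follows. This is an illustration only, not the archive's implementation: the function name is hypothetical, the two sequences are assumed to be already aligned position by position in the order they should be compared, and returning "no match" when neither threshold is reached is an assumption.

```python
def declare_match(read: str, expected: str, max_mismatch: int, min_match: int) -> bool:
    """Count matches and mismatches position by position.

    Returns False as soon as mismatches exceed max_mismatch and True as soon
    as matches reach min_match.
    """
    matches = 0
    mismatches = 0
    for observed, wanted in zip(read, expected):
        if observed == wanted:
            matches += 1
            if matches >= min_match:
                return True
        else:
            mismatches += 1
            if mismatches > max_mismatch:
                return False
    # Assumption: running out of bases before Min Match is reached is no match.
    return False


if __name__ == "__main__":
    print(declare_match("AAGTAC", "ACGTAC", max_mismatch=1, min_match=4))
```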
Endospores
Choose whether the target bacteria form endospores.
Endospores
Yes
No
Database: BioProject; Category: Target
Enveloped
Choose whether the target is enveloped.
Enveloped
Yes
No
Database: BioProject; Category: Target
ENVIRONMENT_TYPE
Type of environment from which an environmental sample was collected. Type: varchar(250) Example: sea water The ENVIRONMENT_TYPE field is used to describe the specific environment from which an environmental sample was taken. While the and fields describe the location, many types of environment could exist at this location (for example, soil, sludge, tree roots, etc.).
This field would be required for the following combination of and :
=Env Sample-Geo; =Any
Database: Trace Archive; Category: Metadata Field List
Environmental package (MIxS Sample)
No package
air
host-associated
human-associated
human-gut
human-oral
human-skin
human-vaginal
microbial mat/biofilm
miscellaneous or artificial
plant-associated
sediment
soil
wastewater/sludge
water
Database: Biosample; Category: Sample type
Environmental sample description
Describe details of sample information.
Database: BioProject; Category: Target
Environmental sample name
Unclassified sequences including metagenome and environmental samples may be found here. If an appropriate name is not found, propose a novel name and describe the details of the sample information in the Environmental sample description.
Database: BioProject; Category: Target
!
ASCII value 33. Typically used for range 0..63.
Database: Sequence Read Archive; Category: Run
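The "!" and "@" entries name the ASCII characters that represent a quality score of zero (values 33 and 64). A minimal sketch of decoding an ASCII quality string with either offset follows; the function name is hypothetical.

```python
from typing import List


def decode_quality(encoded: str, offset: int = 33) -> List[int]:
    """Convert an ASCII-encoded quality string into integer scores.

    offset is the ASCII value of the zero-score character:
    33 for '!' and 64 for '@'.
    """
    scores = []
    for character in encoded:
        score = ord(character) - offset
        if score < 0:
            raise ValueError(f"character {character!r} is below the offset {offset}")
        scores.append(score)
    return scores


if __name__ == "__main__":
    print(decode_quality("!+5I"))            # offset 33
    print(decode_quality("@Jh", offset=64))  # offset 64
```

Choosing the wrong offset shifts every score by 31, so the zero-score character declared for the run needs to agree with how the file was written.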
Experiment Description
A short paragraph describing the experiment as free-text. This tag can only have one value.
Database: Omics Archive; Category: IDF
Experiment Referenced
Select the experiment this run belongs to.
Database: Sequence Read Archive; Category: Run
Experimental Design
The experiment design types which are applicable to this study. Typically these terms should come from the MGED Ontology. The ExperimentDesignType subclasses are particularly useful here. See for example the list of BiologicalProperty terms available. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Experimental Design Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Experimental Design Term Source REF
The source of the Experimental Design terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Experimental Factor Name
A user-defined name for each experimental factor studied by the experiment. These experimental factors represent the variables within the investigation (e.g. growth condition, genotype, organism part, disease state). The actual values of these variables will be listed in the SDRF file, in "Factor Value [<factor name>]" columns. Used as an identifier within the MAGE-TAB document.
Database: Omics Archive; Category: IDF
Experimental Factor Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Experimental Factor Term Source REF
The source of the Experimental Factor Type terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Experimental Factor Type
A term describing the type of each experimental factor. These terms will usually come from the MGED Ontology. The ExperimentalFactorCategory subclasses are particularly useful here. See for example the list of BioMaterialCharacteristicCategory terms available. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
EXTENDED_DATA
Extra ancillary information wrapped in an EXTENDED_DATA block, where actual values are provided with a special <field> tag. Type: varchar() Example:
<extended_data>
    <field name='SamplingSiteMonthChlorophyllLevel'>1.4 mg_mm</field>
    <field name='SamplingSiteYearlyChlorophyllLevel'>1.12 mg_mm</field>
    <field name='SamplingSiteYearlyChlorophyllLevelStdError'>0.19 mg_mm</field>
</extended_data>
The '=' sign and the field separator character '|' should be excluded from names and their values. No other validity checks will be performed on the data.
Database: Trace Archive; Category: Metadata Field List
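An EXTENDED_DATA block like the example can be generated programmatically while enforcing the rule that '=' and '|' are excluded from names and values. The sketch below uses Python's standard xml.etree.ElementTree; the function name is hypothetical, and rejecting offending input with an error (rather than stripping the characters) is a design assumption. ElementTree writes attribute values with double quotes, unlike the single quotes in the example.

```python
import xml.etree.ElementTree as ET
from typing import Mapping

FORBIDDEN_CHARACTERS = ("=", "|")  # excluded from names and values per the entry


def build_extended_data(fields: Mapping[str, str]) -> str:
    """Serialize name/value pairs as an <extended_data> block."""
    root = ET.Element("extended_data")
    for name, value in fields.items():
        for text in (name, str(value)):
            if any(character in text for character in FORBIDDEN_CHARACTERS):
                raise ValueError(f"'=' and '|' are not allowed: {text!r}")
        field = ET.SubElement(root, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_extended_data({
        "SamplingSiteMonthChlorophyllLevel": "1.4 mg_mm",
        "SamplingSiteYearlyChlorophyllLevel": "1.12 mg_mm",
    }))
```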
External Links
A URL may be provided, with a label for the resource, to reference a resource that is directly relevant to the submitted sample.
Database: Biosample; Category: General info
Extract Name
This column contains user-defined names for each Extract material. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Extract Name columns:
Database: Omics Archive; Category: SDRF
Factor Value[]
Controlled vocabulary term or measurement. This column contains terms describing the experimental factor values (i.e., variables) for each row of the SDRF. The Experimental Factor Name to which it pertains (from the accompanying IDF) should be indicated in the column header. For example, if you have this in your IDF You could then use this factor in your SDRF (assuming you had also defined the "Mouse Anatomy" term source in your IDF)
Factor Value[Tissue]    Term Source REF
brain                   Mouse Anatomy
kidney                  Mouse Anatomy
liver                   Mouse Anatomy
intestine               Mouse Anatomy
pancreas                Mouse Anatomy
The terms in the column may be user-defined (the default), from an external ontology source (indicated using a Term Source REF column), or a measurement (indicated using a Unit[] column). In the example above, the column terms would be treated as describing organism parts. For more precise control over the treatment of these terms, the optional form "Factor Value [] ()" is available, e.g. "Factor Value [growth condition EF] (Nutrients)".
Database: Omics Archive; Category: SDRF
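Bracketed SDRF column headers such as "Factor Value[Tissue]" or "Factor Value [growth condition EF] (Nutrients)" can be split into their parts with a regular expression. This is a sketch rather than a MAGE-TAB parser: the function name, the set of recognized header kinds, and calling the parenthesized part a "qualifier" are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Matches e.g. 'Factor Value[Tissue]' or 'Factor Value [name] (type)'.
HEADER_PATTERN = re.compile(
    r"^(?P<kind>Factor Value|Characteristics|Comment|Unit)\s*"
    r"\[(?P<name>[^\]]*)\]"
    r"(?:\s*\((?P<qualifier>[^)]*)\))?$"
)


def parse_header(header: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (kind, bracketed name, optional qualifier), or None if not bracketed."""
    match = HEADER_PATTERN.match(header.strip())
    if match is None:
        return None
    return match["kind"], match["name"].strip(), match["qualifier"]


if __name__ == "__main__":
    print(parse_header("Factor Value[Tissue]"))
    print(parse_header("Factor Value [growth condition EF] (Nutrients)"))
    print(parse_header("Characteristics[OrganismPart]"))
```

Headers without brackets, such as "Sample Name", return None so that they can be handled separately.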
FEATURE_ID_FILE
File describing the features and their locations on a chip. Type: varchar(200) Example: ./mytraces/chip2.cdf The FEATURE_ID_FILE provides the location and sequence of the features for a given chip when ="CHIP".
Database: Trace Archive; Category: Metadata Field List
FEATURE_ID_FILE_NAME
Reference to a common FEATURE_ID_FILE which should be submitted first. Type: varchar(200) Example: This field is required when ="CHIP".
Database: Trace Archive; Category: Metadata Field List
FEATURE_SIGNAL_FILE
File giving the signal and variance for features on a chip. Type: varchar(200) Example: ./mytraces/chip2.signal The FEATURE_SIGNAL_FILE provides the signal and variance of signal for the features on a given chip when ="CHIP".
Database: Trace Archive; Category: Metadata Field List
FEATURE_SIGNAL_FILE_NAME
Reference to a common FEATURE_SIGNAL_FILE which should be submitted first. Type: varchar(200) Example: This field is required when ="CHIP".
Database: Trace Archive; Category: Metadata Field List
File Name
The name of a sequence data file. Uploaded filenames are automatically filled in.
Database: Sequence Read Archive; Category: Run
File Name
The name of an analysis file.
Database: Sequence Read Archive; Category: Analysis
File Type
The sequence data file format. For the fastq files with variable read length, select 'generic_fastq'. For the fastq files with constant read length, select 'fastq'.

File Type Description
generic_fastq fastq files with variable read length
fastq fastq files with constant read length
sff 454 Standard Flowgram Format file
hdf5 PacBio hdf5 Format file
SOLiD_native SOLiD csfasta and qual files. # Support for this format is planned to be deprecated in May, 2017.
bam Binary SAM format for use by loaders that combine alignment and sequencing data
tab A tab-delimited table that maps "SN in SQ line of BAM header" to "reference fasta file"
reference_fasta Reference sequence file in single fasta format used to construct SRA archive file format. Filename must end with ".fa"
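The choice between 'fastq' and 'generic_fastq' can be checked mechanically. This is a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record; the function name is hypothetical and not part of any DRA tool.

```python
def choose_fastq_file_type(lines):
    """Return 'fastq' if all reads have the same length, else 'generic_fastq'.

    `lines` is any iterable of text lines, for example an open file handle.
    Assumes strict four-line FASTQ records (header, sequence, '+', quality).
    """
    lengths = set()
    for index, line in enumerate(lines):
        if index % 4 == 1:  # second line of each record holds the bases
            lengths.add(len(line.rstrip("\r\n")))
            if len(lengths) > 1:
                return "generic_fastq"  # variable read length detected
    return "fastq"
```

A file could be checked with `with open(path) as handle: choose_fastq_file_type(handle)`.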
Database: Sequence Read Archive; Category: Run
File Type
The analysis data file format.
File Type Description
bam Binary form of the Sequence Alignment/Map format for read placements, from the SAMtools project.
See http://sourceforge.net/projects/samtools/.
tab A tab-delimited text file that can be viewed as a spreadsheet. The first line should contain column headers.
ace Multiple alignment file output from the phrap assembler and similar programs.
See http://www.phrap.org/consed/distributions/README.16.0.txt for a description of the ACE file format.
fasta Sequence data format indicating sequence base calls. The format is simple: a header line starting with the > character, followed by data lines with base calls.
wig The wiggle (WIG) format allows display of continuous-valued data in track format. This display type is useful for GC percent, probability scores, and transcriptome data.
See http://genome.ucsc.edu/goldenPath/help/wiggle.html for a description of the Wiggle Track format.
BED BED format provides a flexible way to define the data lines that are displayed in an annotation track.
See http://genome.ucsc.edu/FAQ/FAQformat#format1 for a description of the BED format.
VCF Variant Call Format.
See http://www.1000genomes.org/wiki/analysis/variant%20call%20format/vcf-variant-call-format-version-41 for a description of the VCF format.
MAF Mutation Annotation Format
GFF General Feature Format
csv
tsv
Database: Sequence Read Archive; Category: Analysis
First name
Submitter's first name.
Database: BioProject; Category: Submitter
First name
First name of author.
Database: BioProject; Category: Publication
First name
Submitter's first name.
Database: Biosample; Category: Submitter
First name
Database: Biosample; Category: Publications
Flow Sequence
The fixed sequence of challenge bases that flow across the picotiter plate.
Database: Sequence Read Archive; Category: Experiment
Flow Sequence
The fixed sequence of challenge bases that flow across the picotiter plate. This is optional in the schema now but will be required by business rules and future schema versions.
Database: Sequence Read Archive; Category: Experiment
Follows Read Index
Specify the read index that precedes this read.
Database: Sequence Read Archive; Category: Experiment
full
Only Max Mismatch influences matching process.
Database: Sequence Read Archive; Category: Experiment
GENE_NAME
Gene name or some other common identifier. Type: varchar(100) Example: transporter 1 Free text. Mainly this field would be for ='Re-sequencing' or 'ENCODE'. When a group is analyzing a particular gene, they may want to refer to that gene by its name or some other common identifier.
Database: Trace Archive; Category: Metadata Field List
Other samples (e.g. transcriptome, epigenetics etc)
Use for any sample type (e.g. transcriptome, epigenetics etc). These samples are described using common core attributes and submitter-supplied custom attributes.
Database: Biosample; Category: Sample type
Genomic Sequences Sample (MIGS)
Cultured Bacterial/Archaeal Genomic Sequences
Eukaryotic Genomic Sequences
Viral Genomic Sequences

Environmental samples do not include endosymbionts that can be reliably recovered from a particular host, organisms from a readily identifiable but uncultured field sample (e.g., many cyanobacteria), or phytoplasmas that can be reliably recovered from diseased plants (even though these cannot be grown in axenic culture). Select "Cultured Bacterial/Archaeal" or "Eukaryotic" or "Viral".

Database: Biosample; Category: Sample type
Gram
Choose gram positive or negative.
Gram
Positive
Negative
Database: BioProject; Category: Target
Grant ID
Grant number is collected to support searches (e.g., publications often cite Grant numbers). For example: JSPS KAKENHI Grant Number 12345678.
Database: BioProject; Category: General info
Grant title
Grant title may also support searches.
Database: BioProject; Category: General info
Habitat
Choose a Habitat.
Habitat
HostAssociated
Aquatic
Terrestrial
Specialized
Multiple
Unknown
Database: BioProject; Category: Target
Haploid genome size
Haploid genome size in Kb, Mb or cM.
Database: BioProject; Category: Target
hexadecimal
Single hexadecimal value per quality score.
Database: Sequence Read Archive; Category: Run
HI_FILTER_SIZE
The largest filter used to stratify an environmental sample. Type: varchar(50) Example: 50 micron The field is applicable only to environmental sample data but is not a required field.
Database: Trace Archive; Category: Metadata Field List
Hold
Released concurrently when the DDBJ, DRA, DTA and DOR record(s) citing this ID are released.
Database: BioProject; Category: Submitter
Hold
The submitted BioSample record is released when the DDBJ, DRA and DTA record(s) referencing this BioSample ID are released. Private DDBJ record(s) referencing this BioSample ID are not released.
Database: Biosample; Category: Submitter
Hold Until
Direct the DRA to release the record on or after the specified date. The submitter can set the hold date for a maximum of 2 years and can change the date before the record is released.
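The two-year limit can be expressed as a simple date check. The sketch below assumes the limit is counted from the submission date, which this entry does not state explicitly; both function names are hypothetical.

```python
from datetime import date

def latest_hold_date(submitted):
    """Latest allowed hold date, assuming 2 years counted from submission."""
    try:
        return submitted.replace(year=submitted.year + 2)
    except ValueError:
        # 29 February has no counterpart two years later; fall back to 28 February.
        return submitted.replace(year=submitted.year + 2, day=28)

def is_valid_hold_date(hold_until, submitted):
    """True if hold_until is not before submission and within the assumed limit."""
    return submitted <= hold_until <= latest_hold_date(submitted)
```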
Database: Sequence Read Archive; Category: Submission
HOST_CONDITION
The condition of the host from which an environmental sample was obtained. Type: varchar(100) Example: HIV-positive The field is only applicable to environmental sample data and is used to describe the condition (healthy, sick, etc) of the host from which a sample was taken.
Database: Trace Archive; Category: Metadata Field List
HOST_ID
Unique identifier for the specific host from which an environmental sample was taken. Type: varchar(100) Example: yerkes pedigree #C0479 'Clint' The field is only applicable to environmental sample data and is used to capture the unique name for the specific host from which a sample was obtained.
This field would be required for the following combination of and :
=Env Sample-Host; =Any
Database: Trace Archive; Category: Metadata Field List
HOST_LOCATION
Specific location on the host from which an environmental sample was collected. Type: varchar(100) Example: rumen The field is only applicable to environmental sample data and is used to describe the specific part of the host from which the sample was obtained, for example: dental plaque, hindgut, root surfaces.
This field would be required for the following combination of and :
=Env Sample-Host; =Any
Database: Trace Archive; Category: Metadata Field List
HOST_SPECIES
The host from which an environmental sample was obtained. Type: varchar(100) Example: Pan troglodytes The field is only applicable to environmental sample data.
This field would be required for the following combination of and :
=Env Sample-Host; =Any
Database: Trace Archive; Category: Metadata Field List
Hybridization Name
This column contains user-defined names for each Hybridization. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Hybridization Name columns
Database: Omics Archive; Category: SDRF
Image File
This optional column contains a list of image files, one for each row of the SDRF file, linking these image files to their respective hybridizations. Note that DOR does not store image data due to size constraints on the database. If desired, you may use this column to include links to image files stored on your local webserver. The following columns can be used to annotate Derived Array Data File columns
Database: Omics Archive; Category: SDRF
Immediate Release
Direct the DRA to release the record immediately after submission is processed.
Database: Sequence Read Archive; Category: Submission
INDIVIDUAL_ID
Publicly available identifier to denote a specific individual or sample from which a trace was derived. Type: varchar(100) Example: NA12345 The field provides a center-specific unique ID that can associate a specific trace with an individual. This will be used primarily for population-based studies.
Database: Trace Archive; Category: Metadata Field List
Initiative description
Description of an initiative.
Database: BioProject; Category: General info
INSERT_FLANK_LEFT
Flanking sequence at the cloning junction. Type: varchar(100) Example: AAGGTGCGATGCAGTGGCAGTAGCAGTGTCGACGTGACGATTCGTCCGGA The field should provide from 50 up to 100 bases of sequence (including linkers) to the left of the cloning junction. This information will allow users to perform their own vector trimming of reads. This field is required for almost all combinations of and . This field can be omitted if is populated. However, is the preferred choice. If there was no cloning step involved in the sequencing, please populate the field with 'NONE'.
Database: Trace Archive; Category: Metadata Field List
INSERT_FLANK_RIGHT
Flanking sequence at the cloning junction. Type: varchar(100) Example: AAGGCGCGATGCAGTGAGCGAGGCTGACGTCGGCTAGCGTCGCGTCGGGT The field should provide from 50 up to 100 bases of sequence (including linkers) to the right of the cloning junction. This information will allow users to perform their own vector trimming of reads. This field is required for almost all combinations of and . This field can be omitted if is populated. However, is the preferred choice. If there was no cloning step involved in the sequencing, please populate the field with 'NONE'. It is anticipated that if is populated that will also be populated. It is not anticipated that a mixture of clip values and junction sequence will be specified (i.e. and populated for the same record).
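The length rule for INSERT_FLANK_LEFT and INSERT_FLANK_RIGHT (50 to 100 bases, or the literal 'NONE') can be checked before submission. A minimal sketch; the function name and the accepted alphabet (A, C, G, T, N) are assumptions rather than documented Trace Archive rules.

```python
import re

# Assumed alphabet: unambiguous bases plus N, between 50 and 100 characters.
_FLANK_RE = re.compile(r"[ACGTN]{50,100}")

def is_valid_insert_flank(value):
    """True for 'NONE' or a 50-100 base flanking sequence (case-insensitive bases)."""
    if value == "NONE":
        return True
    return _FLANK_RE.fullmatch(value.upper()) is not None
```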
Database: Trace Archive; Category: Metadata Field List
INSERT_SIZE
Expected size of the insert (referred to by the value in the TEMPLATE_ID field) in base pairs Type: int Example: 2000 The field indicates the expected insert size of the clone that is sequenced. It is understood that this is an estimate based upon the average insert sizes found in a given library. However, this information is critical for certain experiments, such as whole genome assembly.
This field would be required for the following combination of and :
=Any;=WGS=Any;
=WCS=cDNA;=CLONEEND=CLONEEND;
=CLONEEND
Database: Trace Archive; Category: Metadata Field List
INSERT_STDEV
Approximate standard deviation of value in INSERT_SIZE field. Type: int Example: 200 The field reflects the approximate standard deviation of the insert size. It is understood that this information is an approximation and may change as better data is obtained. This field would be required for the following combination of and :
=Any;=WGS=Any;
=WCS=cDNA;
=CLONEEND=CLONEEND;=CLONEEND
Database: Trace Archive; Category: Metadata Field List
Instrument
Select a sequencing instrument model.
Instrument Model
454 GS
454 GS 20
454 GS FLX
454 GS FLX+
454 GS FLX Titanium
454 GS Junior
Illumina Genome Analyzer
Illumina Genome Analyzer II
Illumina Genome Analyzer IIx
Illumina HiSeq 1000
Illumina HiSeq 1500
Illumina HiSeq 2000
Illumina HiSeq 2500
Illumina HiSeq 3000
Illumina HiSeq 4000
Illumina MiSeq
Illumina HiScanSQ
HiSeq X Five
HiSeq X Ten
NextSeq 500
NextSeq 550
Helicos HeliScope
AB SOLiD System
AB SOLiD System 2.0
AB SOLiD System 3.0
AB SOLiD 3 Plus System
AB SOLiD 4 System
AB SOLiD 4hq System
AB SOLiD PI System
AB 5500 Genetic Analyzer
AB 5500xl Genetic Analyzer
AB 5500xl-W Genetic Analysis System
Complete Genomics
MinION
GridION
PromethION
PacBio RS
PacBio RS II
Sequel
Ion Torrent PGM
Ion Torrent Proton
AB 310 Genetic Analyzer
AB 3130 Genetic Analyzer
AB 3130xL Genetic Analyzer
AB 3500 Genetic Analyzer
AB 3500xL Genetic Analyzer
AB 3730 Genetic Analyzer
AB 3730xL Genetic Analyzer
Database: Sequence Read Archive; Category: Experiment
Instrument Name
Center-assigned name or id of the instrument used in the run.
Database: Sequence Read Archive; Category: Run
Investigation Title
The overall title of the investigation. This tag can only have one value.
Database: Omics Archive; Category: IDF
Isolate name or label
A label for an isolated sample, or name of an individual animal (e.g., Clint). Please provide this or "Strain, breed, cultivar".
Database: BioProject; Category: Target
Issue
Journal issue.
Database: BioProject; Category: Publication
Issue
Database: Biosample; Category: Publications
Journal title
Title of the journal.
Database: BioProject; Category: Publication
Journal title
Database: Biosample; Category: Publications
Key Sequence
The Key Sequence is a known sequence of four nucleotides located immediately downstream from the sequencing primer.  It is therefore the first to be sequenced in each well.
Database: Sequence Read Archive; Category: Experiment
ktile
k-tile where k is the k-th bin.
Database: Sequence Read Archive; Category: Experiment
Lab Name
Laboratory name within the submitting institution. The Lab name is pre-entered with "Lab/Group", "Department (2)", "Department (1)", "Organization" of the D-way account. The text can be edited.
Database: Sequence Read Archive; Category: Submission
Label
Controlled vocabulary term. Used as an attribute column following Labeled Extract Name. The label compound which is conjugated to an Extract to create the Labeled Extract. For DOR submissions this term should be an instance of LabelCompound from the MGED Ontology. Examples: Cy3, Cy5, biotin, alexa_546. The following columns can be used to annotate Label columns The Term Source REF column in this case would point to the ontology (defined in the IDF) from which the Label terms are taken (the MGED Ontology in the example above).
Database: Omics Archive; Category: SDRF
Labeled Extract Name
This column contains user-defined names for each Labeled Extract material. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Labeled Extract Name columns
Database: Omics Archive; Category: SDRF
Last name
Submitter's last name.
Database: BioProject; Category: Submitter
Last name
Last name of author.
Database: BioProject; Category: Publication
Last name
Submitter's last name.
Database: Biosample; Category: Submitter
Last name
Database: Biosample; Category: Publications
LATITUDE
The latitude measurement (using standard GPS notation) from which a sample was collected. Type: float Example: 54.736 The field is required to describe the collection of some environmental sample data. The latitude range is [-90,90] with the equator as 0 latitude and positive values of latitude are north of the equator. This field would be required for the following combination of and:
=Env Sample- Geo;=Any
Database: Trace Archive; Category: Metadata Field List
Library Construction Protocol

Free form text describing the protocol by which the sequencing library was constructed. Please include protocols of DNA fragmentation, ligation and enrichment. If a library preparation kit is used, include the name and version (if any) of the kit (for example, Illumina Nextera DNA Library Preparation Kit).

Reference: Alnasir J, Shanahan HP. Investigation into the annotation of protocol sequencing steps in the sequence read archive. Gigascience. 2015 May 9;4:23. doi: 10.1186/s13742-015-0064-7. eCollection 2015. PMID: 25960871 (Open Access)

Database: Sequence Read Archive; Category: Experiment
Library Name
The submitter's name for this library.
Database: Sequence Read Archive; Category: Experiment
Library Selection
Whether any method was used to select and/or enrich the material being sequenced.
Library Selection Description
RANDOM Random shearing only.
PCR Source material was selected by designed primers.
RANDOM PCR Source material was selected by randomly generated primers.
RT-PCR Source material was selected by reverse transcription PCR.
HMPR Hypo-methylated partial restriction digest.
MF Methyl Filtrated.
repeat fractionation Selection for less repetitive (and more gene rich) sequence through Cot filtration (CF) or other fractionation techniques based on DNA kinetics.
size fractionation Physical selection of size appropriate targets.
MSLL Methylation Spanning Linking Library.
cDNA complementary DNA.
cDNA_randomPriming
cDNA_oligo_dT
PolyA PolyA selection or enrichment for messenger RNA (mRNA); should replace cDNA enumeration.
Oligo-dT enrichment of messenger RNA (mRNA) by hybridization to Oligo-dT.
Inverse rRNA depletion of ribosomal RNA by oligo hybridization.
ChIP Chromatin immunoprecipitation.
MNase Micrococcal Nuclease (MNase) digestion.
DNAse Deoxyribonuclease (DNase) digestion.
Hybrid Selection Selection by hybridization in array or solution.
Reduced Representation Reproducible genomic subsets, often generated by restriction fragment size selection, containing a manageable number of loci to facilitate re-sampling.
Restriction Digest DNA fractionation using restriction enzymes.
5-methylcytidine antibody Selection of methylated DNA fragments using an antibody raised against 5-methylcytosine or 5-methylcytidine (m5C).
MBD2 protein methyl-CpG binding domain Enrichment by methyl-CpG binding domain.
CAGE Cap-analysis gene expression.
RACE Rapid Amplification of cDNA Ends.
MDA multiple displacement amplification.
padlock probes capture method Padlock Probes capture strategy to be used in conjunction with Bisulfite-Seq.
other Other library enrichment, screening, or selection process.
unspecified Library enrichment, screening, or selection is not specified.
Database: Sequence Read Archive; Category: Experiment
Library Source
The Library Source specifies the type of source material that is being sequenced.
Library Source Description
GENOMIC Genomic DNA (includes PCR products from genomic DNA).
TRANSCRIPTOMIC Transcription products or non genomic DNA (EST, cDNA, RT-PCR, screened libraries).
METAGENOMIC Mixed material from metagenome.
METATRANSCRIPTOMIC Transcription products from community targets.
SYNTHETIC Synthetic DNA.
VIRAL RNA Viral RNA.
OTHER Other, unspecified, or unknown library source material.
Database: Sequence Read Archive; Category: Experiment
Library Strategy
Sequencing technique intended for this library.
Library Strategy Description
WGS Whole genome shotgun.
WGA Whole genome amplification.
WXS Random sequencing of exonic regions selected from the genome.
RNA-Seq Random sequencing of whole transcriptome.
miRNA-Seq Micro RNA and other small non-coding RNA sequencing.
ncRNA-Seq Capture of other non-coding RNA types, including post-translation modification types such as snRNA (small nuclear RNA) or snoRNA (small nucleolar RNA), or expression regulation types such as siRNA (small interfering RNA) or piRNA/piwi/RNA (piwi-interacting RNA).
ssRNA-seq strand-specific RNA sequencing
WCS Whole chromosome (or other replicon) shotgun.
CLONE Genomic clone based (hierarchical) sequencing.
POOLCLONE Shotgun of pooled clones (usually BACs and Fosmids).
AMPLICON Sequencing of overlapping or distinct PCR or RT-PCR products.
CLONEEND Clone end (5', 3', or both) sequencing.
FINISHING Sequencing intended to finish (close) gaps in existing coverage.
RAD-Seq Restriction Site Associated DNA Sequence
ChIP-Seq Direct sequencing of chromatin immunoprecipitates.
MNase-Seq Direct sequencing following MNase digestion.
DNase-Hypersensitivity Sequencing of hypersensitive sites, or segments of open chromatin that are more readily cleaved by DNaseI.
Bisulfite-Seq Sequencing following treatment of DNA with bisulfite to convert cytosine residues to uracil depending on methylation status.
EST Single pass sequencing of cDNA templates.
FL-cDNA Full-length sequencing of cDNA templates.
CTS Concatenated Tag Sequencing.
MRE-Seq Methylation-Sensitive Restriction Enzyme Sequencing strategy.
MeDIP-Seq Methylated DNA Immunoprecipitation Sequencing strategy.
MBD-Seq Direct sequencing of methylated fractions sequencing strategy.
Tn-Seq Gene fitness determination through transposon seeding.
FAIRE-seq Formaldehyde Assisted Isolation of Regulatory Elements
SELEX Systematic Evolution of Ligands by EXponential enrichment
RIP-Seq Direct sequencing of RNA immunoprecipitates (includes CLIP-Seq, HITS-CLIP and PAR-CLIP).
ChIA-PET Direct sequencing of proximity-ligated chromatin immunoprecipitates.
Hi-C Chromosome Conformation Capture technique where a biotin-labeled nucleotide is incorporated at the ligation junction, enabling selective purification of chimeric DNA ligation junctions followed by deep sequencing
ATAC-seq Assay for Transposase-Accessible Chromatin (ATAC) strategy is used to study genome-wide chromatin accessibility. An alternative method to DNase-seq that uses an engineered Tn5 transposase to cleave DNA and to integrate primer DNA sequences into the cleaved genomic DNA.
Targeted-Capture
Tethered Chromatin Conformation Capture
Synthetic-Long-Read binning and barcoding of large DNA fragments to facilitate assembly of the fragment
Other Library strategy not listed.
Database: Sequence Read Archive; Category: Experiment
LIBRARY_ID
The source of the clone identified in the CLONE_ID field Type: varchar(100) Example: RP23 The field documents the source library of the archival clone resource. Many genomic libraries have been registered with the Clone Registry (http://www.ncbi.nlm.nih.gov/clone) and the standard nomenclature (http://www.ncbi.nlm.nih.gov/clone/content/overview/) should be used for these libraries.
This field would be required for the following combination of and :
=cDNA;=Any=EST;=Any
=CLONEEND;=CLONEEND=CLONE;
=Any=ENCODE;=SHOTGUN;PrimerWalk; CLONEEND
Database: Trace Archive; Category: Metadata Field List
Link 3'
Specify the read label at the 3' end of the gap, or NULL if it's the last tag.
Database: Sequence Read Archive; Category: Experiment
Link 5'
Specify the read label at the 5' end of the gap, or NULL if it's the first tag.
Database: Sequence Read Archive; Category: Experiment
Link description
Display name of web site that is related to this study.
Database: BioProject; Category: General info
Link description
Display name of web site that is related to this sample.
Database: Biosample; Category: General info
LO_FILTER_SIZE
The smallest filter size used to stratify an environmental sample. Type: varchar(50) Example: 25 micron The field is only applicable to environmental sample data but is not a required field.
Database: Trace Archive; Category: Metadata Field List
LOAD_DATE
Date on which the data was loaded. Type: smalldatetime Example: Jan 8 2001 11:59AM
Database: Trace Archive; Category: Internal Fields List
Location
The replicon subcellular location. For instance, the nucleus, or a differentiated organelle. Please select "Nuclear or Prokaryote" for the chromosomes of eukaryotes, bacteria or archaea.
Location
Nuclear or Prokaryote
Macronuclear
Nucleomorph
Mitochondrion
Kinetoplast
Chloroplast
Chromoplast
Plastid
Virion or Phage
Proviral or Prophage
Viroid
Extrachrom
Cyanelle
Apicoplast
Leucoplast
Proplastid
Hydrogenosome
Chromatophore
Other
Database: BioProject; Category: Target
Locus Name
Locus Name	Description
16S rRNA	Bacterial ribosomal RNA hypervariable region(s)
18S rRNA	Eukaryotic small subunit ribosomal RNA
RBCL	RuBisCO large subunit: ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit
matK	Maturase K gene
COX1	Mitochondrial cytochrome c oxidase 1 gene
ITS1-5.8S-ITS2	Internal transcribed spacers 1 and 2 plus 5.8S rRNA region
exome	All exonic regions of the genome
other	Other locus, please describe
Database: Sequence Read Archive; Category: Experiment
Locus tag prefix
Locus tag prefix generation box will appear when [Project data type="Genome Sequencing" or "Metagenome"] AND [Capture="Whole"] AND [Objective="Sequence" or "Annotation" or "Assembly"].

Registration of a unique locus tag prefix is required for studies that result in genome assemblies. Please leave the prefix box empty, when a prefix is not necessary for WGS only submission.

Locus tag prefix guideline.

Locus tag prefix format
The locus_tag prefix can contain only alphanumeric characters and must be at least 3 characters long. It should start with a letter, but numerals can be in the 2nd position or later in the string (e.g., A1C). There should be no symbols, such as -_*, in the prefix. The locus_tag prefix is to be separated from the tag value by an underscore ‘_’, e.g., A1C_00001.

DDBJ BioProject limits the maximum tag length to 12 characters. In the BioProject submission system, the locus tag is displayed in capital letters. However, the tag is reserved in a case-insensitive manner.
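The format rules above can be sketched as a small validator. This is an illustration only: it assumes the 12-character limit applies to the prefix, treats the leading letter as mandatory, and upper-cases the prefix to mirror how the submission system displays it. The function names are hypothetical.

```python
import re

# One leading letter followed by 2 to 11 alphanumeric characters
# (3 to 12 characters in total, assuming the limit applies to the prefix).
_PREFIX_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{2,11}")

def is_valid_locus_tag_prefix(prefix):
    """True if the prefix is alphanumeric, starts with a letter, and is 3-12 long."""
    return _PREFIX_RE.fullmatch(prefix) is not None

def make_locus_tag(prefix, number, width=5):
    """Join prefix and zero-padded tag value with an underscore, e.g. A1C_00001."""
    if not is_valid_locus_tag_prefix(prefix):
        raise ValueError("invalid locus tag prefix: %r" % prefix)
    return "%s_%0*d" % (prefix.upper(), width, number)
```

Because prefixes are reserved case-insensitively, a uniqueness check against existing prefixes would compare the upper-cased forms.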

Database: BioProject; Category: Project type
log-odds
The quality score is expressed as the ratio of error to non-error in log form: -10 log(p/(1-p)) where p is the probability of error, with value range -40..40.
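The formula can be written out as code. A minimal sketch, assuming the logarithm is base 10 (as in other sequencing quality scales); the function names are hypothetical, and the result is not clamped to the -40..40 range.

```python
import math

def log_odds_quality(p):
    """Quality score -10 * log10(p / (1 - p)) for an error probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("error probability must be strictly between 0 and 1")
    return -10.0 * math.log10(p / (1.0 - p))

def error_probability(q):
    """Inverse of log_odds_quality: recover p from a log-odds score q."""
    odds = 10.0 ** (-q / 10.0)  # odds = p / (1 - p)
    return odds / (1.0 + odds)
```

Under this definition an error probability of 0.5 maps to a score of 0, and smaller error probabilities give larger positive scores.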
Database: Sequence Read Archive; Category: Run
LONGITUDE
The longitude measurement (using standard GPS notation) from which a sample was collected. Type: float Example: -86.403 The field is required to describe the collection of some environmental sample data. Longitude ranges from 0° at the Prime Meridian to +180° eastward and -180° westward.
This field would be required for the following combination of and :
=Env Sample-Geo; =Any
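The ranges given for LATITUDE and LONGITUDE translate directly into range checks. A minimal sketch with hypothetical function names:

```python
def is_valid_latitude(value):
    """Latitude in decimal degrees: -90 (south pole) to +90 (north pole)."""
    return -90.0 <= value <= 90.0

def is_valid_longitude(value):
    """Longitude in decimal degrees: -180 (west) to +180 (east) of the Prime Meridian."""
    return -180.0 <= value <= 180.0
```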
Database: Trace Archive; Category: Metadata Field List
Marker Sequences Sample (MIMARKS)
Specimen Marker Sequences
Survey related Marker Sequences

MIMARKS specimen: for marker gene (e.g., COI) sequences obtained from any material identifiable by means of specimens

MIMARKS-specimen applies to the contextual data for marker gene sequences from cultured or voucher-identifiable specimens.

MIMARKS survey: for uncultured diversity marker gene (e.g., 16S rRNA, 18S rRNA, nif, amoA, rpo) surveys

MIMARKS-survey is applicable to contextual data for marker gene sequences, obtained directly from the environment, without culturing or identification of the organisms.

Database: Biosample; Category: Sample type
Mate Pair Orientation
Orientation of mate pair.
Mate Pair Orientation	Description
innie	Tags are facing towards each other.
outie	Tags are facing away from each other.
normal	Tags are facing unidirectionally.
anti-normal	Tags are facing unidirectionally on opposite strand.
Database: Sequence Read Archive; Category: Experiment
MATE_PAIR
TI's of the reads obtained from the other end of the same template. Type: int Example: 203682255 MATE PAIR is the pair of reads obtained from two ends of the same template (FORWARD and REVERSE).
Database: Trace Archive; Category: Internal Fields List
Material
The type of material that is isolated from the sample for use in the experimental study.
Material	Description
Genome	A whole genome initiative. May be only the nuclear genome. Use for DNA of a metagenome sample.
Partial Genome	One or more chromosomes or replicons were experimentally purified.
Transcriptome	Transcript data.
Reagent	Material studied was obtained by chemical reaction, precipitation.
Proteome	Protein or peptide data.
Phenotype	Phenotypic descriptive data.
Other	Specify the material that was used in the "Target description".
Database: BioProject; Category: Project type
Material Type
Controlled vocabulary term. Used as an attribute column following Source Name, Sample Name, Extract Name, or Labeled Extract Name. This column contains terms describing the type of each material. For DOR submissions this term should be an instance of MaterialType from the MGED Ontology. Examples: whole_organism, organism_part, cell, total_RNA. The following columns can be used to annotate Material Type columns The Term Source REF column in this case would point to the ontology (defined in the IDF) from which the Material Type terms are taken (the MGED Ontology in the example above).
Database: Omics Archive; Category: SDRF
max Length
Maximum length in base pairs of the interval.
Database: Sequence Read Archive; Category: Experiment
Max Mismatch
Maximum number of mismatches.
Database: Sequence Read Archive; Category: Experiment
MD5 Checksum
MD5 checksum of a sequence data file. How to obtain the MD5 checksum values.
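One way to obtain the value is with Python's standard hashlib module. A minimal sketch; the function names are illustrative, and the file is read in chunks so that large sequence files need not fit in memory.

```python
import hashlib

def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def md5_of_file(path):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)
```

The result is a 32-character hexadecimal string, the same form printed by common command-line checksum tools.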
Database: Sequence Read Archive; Category: Run
MD5 Checksum
MD5 checksum of a run data file. How to obtain the MD5 checksum values.
Database: Sequence Read Archive; Category: Analysis
mean
Mean length in base pairs of the interval.
Database: Sequence Read Archive; Category: Experiment
Member
Member name of the fraction of the analysis file that should be loaded for this data block. Used for sample multiplexed studies where the analysis data has been demultiplexed by the submitter.
Database: Sequence Read Archive; Category: Analysis
Member Name
Allow for an individual Data Block to be associated with a member of a sample pool.
Database: Sequence Read Archive; Category: Run
(Meta)Genomic Sequences Sample (MIMS)
Environmental/Metagenome Genomic Sequences
Please refer to environmental samples.
Database: Biosample; Category: Sample type
Methodology
The core experimental approach used to obtain the data that is submitted to archival databases.
Methodology	Description
Sequencing	Sequencing using Sanger, 454, Illumina, etc wit
Array	Data/Sequence are generated by hybridization arrays.
Mass Spectroscopy	Data are generated by mass spectroscopy.
Other	Please provide data description in the "Methodology description".
Database: BioProject; Category: Project type
Methodology description
Describe the methodology type when the Other is selected.
Database: BioProject; Category: Project type
MI
Middle initial.
Database: BioProject; Category: Publication
MI
Middle initial.
Database: Biosample; Category: Publications
min Length
Minimum length in base pairs of the interval.
Database: Sequence Read Archive; Category: Experiment
Min Match
Minimum number of matches to trigger identification.
Database: Sequence Read Archive; Category: Experiment
Genome, metagenome or marker sequences (MIxS compliant)
Use for genomes, metagenomes, and marker sequences. These samples include specific attributes that have been defined by the Genome Standards Consortium (GSC) to formally describe and standardize sample metadata for genomes, metagenomes, and marker sequences. The samples are validated for compliance based on the presence of the required core attributes as described in MIxS. For details, please see the GSC websites.
Database: Biosample; Category: Sample type
Motility
Choose a Motility.
Motility
Yes
No
Database: BioProject; Category: Target
Name
Name of submitter.
Database: Sequence Read Archive; Category: Submission
Name
The plate/slide/flowcell name for this data block.
Platform	Name
454	plate name
Illumina	flowcell name
SOLiD	slide name
Helicos	flowcell
Database: Sequence Read Archive; Category: Run
Name
Data block name, for use in mapping multiple analysis files to a single analysis object. This attribute is not needed if there is only one analysis file loaded for the analysis object.
Database: Sequence Read Archive; Category: Analysis
Name
The preferred standard for the replicon name.
Database: BioProject; Category: Target
NCBI_PROJECT_ID
BioProject ID generated by the INSDC. Type: int Example: 7 The field links traces to the BioProject database so that sets of traces can be retrieved easily for each project. Genome sequencing centers may register their project with DDBJ BioProject prior to the submission of genomic sequence data. Submitters need not submit sequencing data at the time they register their project.
Database: Trace Archive; Category: Metadata Field List
Nominal Length
Size of the insert for Paired reads.
Database: Sequence Read Archive; Category: Experiment
Nominal Sdev
Standard deviation of insert size.
Database: Sequence Read Archive; Category: Experiment
Normalization Name
This optional column contains user-defined names for each Normalization event. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Normalization Name columns
Database: Omics Archive; Category: SDRF
Normalization Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Normalization Term Source REF
The source of the Normalization Type terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Normalization Type
The normalization strategies used. Typically these terms should come from the MGED Ontology. See for example the list of NormalizationDescriptionType terms. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Notes
Notes about the program for primary analysis.
Database: Sequence Read Archive; Category: Experiment
Objective
Project goals with respect to the type of data that will be generated and submitted to an INSDC-associated database. Select all relevant menu options.
Objective: Description
Raw Sequence Reads: Submission of raw sequencing information as it comes out of the machine.
Sequence: Sequence which is not raw, meaning processed (clipped, mate-paired, oriented).
Analysis: Higher level interpretation of the data.
Assembly: Experiment will result in assemblies (genome or transcriptome).
Annotation: Experiment will result in annotation.
Variation: Submission of variations.
Epigenetic Markers: Experiment will result in epigenetic markers.
Expression: Submission of gene expression.
Maps: Experiment will result in cytogenetic, physical, RH, etc. maps.
Phenotype: Experiment will deliver phenotypes.
Other
Database: BioProject; Category: Project type
Optimum Temperature
Optimum temperature in Celsius.
Database: BioProject; Category: Target
Organism name

Organism name in the Taxonomy database. Unclassified sequences, including metagenomes and environmental samples, may be found here.

For a project spanning multiple species, enter a taxonomic classification common to the species (e.g., genus name).

If you intend to submit an unregistered novel organism, please provide the detailed organism information in the Description of novel organism field and the proposed organism name in the Organism Name field.

Database: BioProject; Category: Target
ORGANISM_NAME
Description of species for BARCODE project from which trace is derived. Type: varchar(100) Example: Acanthocybium solandri The field is used to classify the read by species for BARCODE data, using proper taxonomic name in accordance with Taxonomy Browser. ="BARCODESPECIES" for all traces from this project. This field would be required for the =BARCODE.
Database: Trace Archive; Category: Metadata Field List
Organization
Organization to which a contact person belongs.
Database: Biosample; Category: Submitter
Orientation
Relative orientation of the paired reads. When the reads are in the same direction relative to a reference sequence, the orientation is 5'-3'-5'-3' (forward-forward); when they are in opposite directions, it is 5'-3'-3'-5' (forward-reverse).
Database: Sequence Read Archive; Category: Experiment
Oxygen requirement
Choose an Oxygen requirement.
OxygenReq
Aerobic
Microaerophilic
Facultative
Anaerobic
Unknown
Database: BioProject; Category: Target
Pages from
Reference start page.
Database: BioProject; Category: Publication
Pages from
Database: Biosample; Category: Publications
Pages to
Reference end page.
Database: BioProject; Category: Publication
Pages to
Database: Biosample; Category: Publications
PairedEnd
Mated tags sequenced from two ends of a physical extent of genomic material.
Database: Sequence Read Archive; Category: Experiment
Parameter Value[]
Used as an attribute column following Protocol REF columns. This column contains values for the protocol parameters referenced in the column header. The following columns can be used to annotate Parameter Value[] columns. For example, if a Protocol Name "Array Hybridization" is defined in the accompanying IDF, with Protocol Parameters "hyb temp;hyb volume", the following would be valid.
Database: Omics Archive; Category: SDRF
PEAK_FILE
Name of file that contains the list of peak values. Type: varchar(200) Example: ./mytraces/123clone.peak Consult the field description for more information.
Database: Trace Archive; Category: Metadata Field List
Performer
Used as an attribute column following Protocol REF. The name of the researcher or center name who carried out the protocol. For sequencing protocol, this is used as a run center name in the DRA submission.
Database: Omics Archive; Category: SDRF
Person Address
The street address of each person associated with the experiment. The contact information is not made public.
Database: Omics Archive; Category: IDF
Person Affiliation
The organization affiliation for each person associated with the experiment. This is used for public display.
Database: Omics Archive; Category: IDF
Person Email
The email address of each person associated with the experiment. The contact information is not made public.
Database: Omics Archive; Category: IDF
Person Fax
The Fax number of each person associated with the experiment. The contact information is not made public.
Database: Omics Archive; Category: IDF
Person First Name
The first name of each person associated with the experiment.
Database: Omics Archive; Category: IDF
Person Last Name
The last name of each person associated with the experiment.
Database: Omics Archive; Category: IDF
Person Mid Initials
The middle initials of each person associated with the experiment.
Database: Omics Archive; Category: IDF
Person Phone
The telephone number of each person associated with the experiment. The contact information is not made public.
Database: Omics Archive; Category: IDF
Person Roles
The role(s) performed by each person. Typically these terms should come from the MGED Ontology. See for example the list of Roles terms. If more than one role is needed per person, the roles should be given as a semicolon (";") delimited list, for example: "submitter;data_coder;investigator". Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Person Roles Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Person Roles Term Source REF
The source of the Person Roles terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
PH
The pH at which an environmental sample was collected. Type: float Example: 7.2 The field is only applicable to environmental sample data but is not a required field.
Database: Trace Archive; Category: Metadata Field List
phred
The quality score is expressed as a probability of error in log form: -10 log10(p), where p is the probability of error, with value range 0..63, 0 meaning no base call.
Database: Sequence Read Archive; Category: Run
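As a minimal sketch of the phred definition, the conversion between an error probability and a quality score might look like the following. The function names are illustrative and not part of any archive software, and clamping to the 0..63 range is an assumption about how out-of-range values would be handled.

```python
import math


def phred_from_error(p: float) -> int:
    """Return Q = -10 * log10(p), rounded and clamped to 0..63 (clamping is an assumption)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    q = round(-10.0 * math.log10(p))
    return max(0, min(63, q))


def error_from_phred(q: int) -> float:
    """Return the error probability p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)
```

For instance, an error probability of 0.001 (one wrong call in a thousand) corresponds to a score of 30.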
PICK_GROUP_ID
Id to group traces picked at the same time. Type: int Example: 939065
Database: Trace Archive; Category: Metadata Field List
Pipeline Program
Name of the program or process for primary analysis.
Database: Sequence Read Archive; Category: Analysis
Pipeline Version
Version of the program or process for primary analysis.
Database: Sequence Read Archive; Category: Analysis
PLACE_NAME
Country in which the biological sample was collected and/or common name for a given location. Type: varchar(250) Example: Octopus Springs The field is applicable to environmental sample data, but is not required.
Database: Trace Archive; Category: Metadata Field List
PLATE_ID
Submitter defined plate id. Type: varchar(32) Example: 203 The and fields are intended to identify the storage location of the sequencing template (not the library well coordinate of an archival clone named in the field). This may enable flipped or contaminated trays to be easily identified. If a particular experiment did not require the use of a plate, please populate this field with '0'.
Database: Trace Archive; Category: Metadata Field List
Ploidy
Select a Ploidy.
Ploidy
Haploid
Diploid
Polyploid
Allopolyploid
Database: BioProject; Category: Target
Pooling Strategy
The optional pooling strategy indicates how the library or libraries are organized if multiple samples are involved.
Pooling Strategy: Description
None: There is a one-to-one correspondence with sample and library (normal case)
Simple pool: The sequencing is done on a pool of identified samples which cannot be distinguished in the sequencing result
Multiplexed samples: Multiple libraries were prepared each of which can be distinguished in the sequencing result through a molecular barcode or other indicator. Each library may be made from the same or different samples
Multiplexed libraries: Multiple libraries were prepared each of which can be distinguished in the sequencing result through a molecular barcode or other indicator. Each library may be made from the same or different samples. This option is expected when the libraries are part of the same study
Spiked library: One library is prepared with an oligonucleotide sequence included that when sequenced can help provide quality control for the library
Other: Other, unspecified, or unknown pooling strategy
Database: Sequence Read Archive; Category: Experiment
POPULATION_ID
Center provided id to designate a population from which a trace (or group of traces) was derived. Type: varchar(100) Example: CEPH The field is used to capture center specific designations of groups of individuals. This will likely only be useful in population studies (usually =SNP).
Database: Trace Archive; Category: Metadata Field List
Precedes Read Index
Specify the read index that follows this read.
Database: Sequence Read Archive; Category: Experiment
PREP_GROUP_ID
ID that defines groups of traces prepared at the same time. Type: varchar(30) Example: A2
Database: Trace Archive; Category: Metadata Field List
PRIMER
The primer sequence (used in the sequencing reaction). Type: varchar(200) Example: GAATACCTACGATCGCC The value of the field is the actual base sequence of the sequencing primer used. If a center uses a primer extensively, the primer sequence can be entered into the list of primer codes and the field can be used.
Database: Trace Archive; Category: Metadata Field List
PRIMER_CODE
Identifier for the sequencing primer used. Type: varchar(30) Example: Sp6
Database: Trace Archive; Category: Metadata Field List
PRIMER_LIST
A ';' delimited list of primers used in a mapping experiment (such as AFLP). Type: varchar(100) Example: AAGGTCTGCGCGTGTC;AGCTGCGTACGTAATCG; This field is required if ="AFLP" and ="PCR".
Database: Trace Archive; Category: Metadata Field List
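A PRIMER_LIST value such as the example above ends with a trailing ';', so a naive split yields an empty last item. The sketch below shows one way to parse such a value. The function name is hypothetical, and both the IUPAC nucleotide alphabet check and the use of the varchar(100) limit as a length check are assumptions rather than documented validation rules.

```python
from __future__ import annotations

# IUPAC nucleotide codes; treating these as the allowed alphabet is an assumption.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def parse_primer_list(value: str) -> list[str]:
    """Split a ';'-delimited primer list, dropping empty items such as a trailing ';'."""
    if len(value) > 100:  # the field is declared as varchar(100)
        raise ValueError("PRIMER_LIST value exceeds 100 characters")
    primers = [item.strip().upper() for item in value.split(";")]
    primers = [p for p in primers if p]
    for primer in primers:
        unexpected = set(primer) - IUPAC_NUCLEOTIDES
        if unexpected:
            raise ValueError(f"unexpected characters in primer {primer}: {sorted(unexpected)}")
    return primers
```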
Private comments to DDBJ staff
Use this field if you have questions for database support staff. The content is not made public. If you intend to submit an umbrella project, please inform us that "this is an umbrella project".
Database: BioProject; Category: General info
Private comments to DDBJ staff
Use this field if you have questions for database support staff. The content is not made public.
Database: Biosample; Category: Comments
Processing step name
Name of the processing step. Example: base call.
Database: Sequence Read Archive; Category: Experiment
Program
Name of the program for primary analysis. Example: Illumina GA pipeline.
Database: Sequence Read Archive; Category: Experiment
PROGRAM_ID
The program used to create the trace file. Type: varchar(100) Example: phred-19990722h The field is used to indicate the base calling program. This field is free text. Program name, version numbers or dates are very useful.
More example values:
  • phred-19980904e
  • abi-3.1
  • ATQA
  • TraceTuner
  • Licor
  • Megabase
  • Beckman
Database: Trace Archive; Category: Metadata Field List
Project data type

A general label indicating the primary study goal. Select the appropriate types; a BioProject record can have multiple project data types.

NCBI individually assigns the Project data type based on the experimental data linked to the project. This type is not used by EBI.

Project data type: Description
Genome Sequencing: whole, or partial, genome sequencing project (with or without a genome assembly)
Clone Ends: clone-end sequencing project
Epigenomics: DNA methylation, histone modification, chromatin accessibility datasets
Exome: exome resequencing project
Map: project that results in non-sequence map data such as genetic map, radiation hybrid map, cytogenetic map, optical map, etc.
Metagenome: sequence analysis of environmental samples
Phenotype and Genotype: project correlating phenotype and genotype
Proteome: large scale proteomics experiment including mass spec. analysis
Random Survey: sequence generated from a random sampling of the collected sample; not intended to be comprehensive sampling of the material.
Targeted Locus (Loci): project to sequence specific loci, such as a 16S rRNA sequencing
Transcriptome or Gene Expression: large scale RNA sequencing or expression analysis. Includes cDNA, EST, RNA_seq, and microarray.
Variation: project with a primary goal of identifying large or small sequence variation across populations.
Other: a free text description is provided to indicate Other data type
Database: BioProject; Category: Project type
Project data type description
Describe the project data type when Other is selected.
Database: BioProject; Category: Project type
Project title
Very short descriptive name of the project for captions, labels, etc., for public display. For example: Chromosome Y sequencing, Global studies of microbial diversity on human skin.
Database: BioProject; Category: General info
PROJECT_NAME
Term by which to group traces from different centers based on a common project. Type: varchar(50) Example: New Project In this way sequencing centers that are working on the same large project can group all of the traces for this project using a common term. This field has a controlled vocabulary. Sequencing centers wishing to submit data must contact the DDBJ Trace Archive to determine a name that all members of the project agree on.
Database: Trace Archive; Category: Metadata Field List
Protocol Contact
The name and contact details to be used for enquiries concerning the protocol.
Database: Omics Archive; Category: IDF
Protocol Description
A free-text description of the protocol. This text is included in a single tab-delimited field. If you wish to include tab or newline characters as part of this text, you must enclose the whole text within double quotes (").
Database: Omics Archive; Category: IDF
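The quoting rule for Protocol Description can be illustrated with Python's standard csv module, which quotes a field only when it contains the delimiter, the quote character, or a line terminator. This is a sketch under the assumption that an IDF row is a tag followed by tab-separated values; the helper name idf_row is hypothetical.

```python
import csv
import io


def idf_row(tag: str, *values: str) -> str:
    """Serialize one tab-delimited row, double-quoting fields that contain tabs or newlines."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter="\t",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only the fields that need it
        lineterminator="\n",
    )
    writer.writerow([tag, *values])
    return buffer.getvalue()
```

A description containing a newline comes out wrapped in double quotes, while plain values are written unchanged, so the row still parses back into the same fields.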
Protocol Hardware
The hardware used by the protocol.
Database: Omics Archive; Category: IDF
Protocol Name
The names of the protocols used within the MAGE-TAB document. These will be referenced in the SDRF in the "Protocol REF" columns. Used as an identifier within the MAGE-TAB document.
Database: Omics Archive; Category: IDF
Protocol Parameters
A semicolon-delimited list of parameter names; these names are used in the SDRF file (as "Parameter Value [<parameter name>]" headers) to list the values used for each protocol parameter. If more than one parameter was used for a given protocol, they should be separated with semicolons (";"). Used as an identifier within the MAGE-TAB document.
Database: Omics Archive; Category: IDF
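To make the link between the IDF and SDRF concrete, the sketch below derives the SDRF column headers from a semicolon-delimited Protocol Parameters value. The function name is illustrative only; the header pattern follows the "Parameter Value [<parameter name>]" form given in the definition above.

```python
from __future__ import annotations


def parameter_value_headers(protocol_parameters: str) -> list[str]:
    """Turn an IDF Protocol Parameters value into the matching SDRF column headers."""
    names = [name.strip() for name in protocol_parameters.split(";")]
    return [f"Parameter Value [{name}]" for name in names if name]
```

With the value "hyb temp;hyb volume", this yields the two headers "Parameter Value [hyb temp]" and "Parameter Value [hyb volume]".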
Protocol REF
This column contains references to Protocol Names defined in the IDF, or accession numbers of protocols already deposited with ArrayExpress/DOR. The following columns can be used to annotate Protocol REF columns. The Term Source REF column here can be used to point to the source of the protocol referenced, if it is not contained within the IDF; for DOR submissions this should always be ArrayExpress, and a suitable ArrayExpress Term Source should be defined in the IDF.
Database: Omics Archive; Category: SDRF
Protocol Software
The software used by the protocol.
Database: Omics Archive; Category: IDF
Protocol Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Protocol Term Source REF
The source of the Protocol Type terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below). Examples: MGED ontology, OBI.
Database: Omics Archive; Category: IDF
Protocol Type
The type of the protocol, taken from a controlled vocabulary. Typically this term should come from the MGED Ontology. See for example the list of ExperimentalProtocolType terms. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Provider
Used as an attribute column following Source Name. A free-text string identifying the organization or person from which the Source was obtained.
Database: Omics Archive; Category: SDRF
Description
Description (a paragraph) of the project goals and purposes. Provide enough information (more than 100 characters) in the description for other users to interpret the data.
Database: BioProject; Category: General info
Public Release Date
The date on which the experimental data will be/was released. The date should be entered in the "YYYY-MM-DD" format (e.g., 2011-01-01). This tag can only have one value.
Database: Omics Archive; Category: IDF
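A submitter-side check of the "YYYY-MM-DD" format could be sketched as follows. The function name is hypothetical, and rejecting non-zero-padded values such as "2011-1-1" is an assumption about how strictly the format is enforced.

```python
from datetime import date, datetime


def parse_release_date(value: str) -> date:
    """Parse a Public Release Date, accepting only the zero-padded YYYY-MM-DD form."""
    parsed = datetime.strptime(value, "%Y-%m-%d").date()
    # strptime also accepts "2011-1-1", so compare against the canonical form.
    if parsed.isoformat() != value:
        raise ValueError(f"date is not in YYYY-MM-DD format: {value!r}")
    return parsed
```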
Publication Author List
The list of authors associated with each publication.
Database: Omics Archive; Category: IDF
Publication DOI
A Digital Object Identifier (DOI) for each publication (where available).
Database: Omics Archive; Category: IDF
Publication Status
A term describing the status of each publication (e.g. "submitted", "in preparation", "published"). Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Publication Status Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Publication Status Term Source REF
The source of the Publication Status terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Publication Title
The title of each publication.
Database: Omics Archive; Category: IDF
PubMed ID
The PubMed IDs of the publication(s) associated with this investigation (where available).
Database: Omics Archive; Category: IDF
PubMed ID
Provide a PubMed ID for any publications directly related to all samples in the submission.
Database: Biosample; Category: Publications
PubMed ID
The PubMed ID(s) will be used to populate the publication information.
<Publication id="15557739">
	<DbType>ePubmed</DbType>
</Publication>
<ProjectReleaseDate> ...
Database: BioProject; Category: Publication
QUAL_FILE
Name of file containing the quality scores. Type: varchar(200) Example: ./mytraces/123clone.fasta.qs Trace files which do not include the quality scores must provide this information in a separate file. The file designations are recorded in the fields of the metadata file. The actual quality scores are stored in the file designated in the field. If quality scores are provided in separate files the information in these files will overwrite any information in the trace (usually *.scf) file. If the quality scores that would be provided in the are the same as the information in the trace file, DO NOT PROVIDE THE FILE. However, it is important to note that if some formats do not include the quality scores, then these values must be provided as ancillary information. If the center provides the and, then the peak index information should also be provided in a file called.
Database: Trace Archive; Category: Metadata Field List
Quality Control Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Quality Control Term Source REF
The source of the Quality Control Type terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Quality Control Type
The quality control procedures used. Typically these terms should come from the MGED Ontology. See for example the list of QualityControlDescriptionType terms. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Read Group Tag
When match occurs, the read will be tagged with this group membership. Used to relate a tag read to a sample member.
Database: Sequence Read Archive; Category: Experiment
Reference title
A title of reference.
Database: BioProject; Category: Publication
Reference title
Database: Biosample; Category: Publications
REFERENCE_ACC_MAX
Finish position for a particular amplicon in re-sequencing or comparative projects. Type: int Example: 30929 This field points to the finishing coordinate of the described in the field. All coordinates should be 1-based (i.e., sequences start at base 1, not base 0). This field is required for the following combination of and :
=Re-sequencing; =SHOTGUN; PCR;RT-PCR
Database: Trace Archive; Category: Metadata Field List
REFERENCE_ACC_MIN
Start position for a particular amplicon in re-sequencing or comparative projects. Type: int Example: 29829 This field points to the starting coordinate of the described in the field. All coordinates should be 1-based (i.e., sequences start at base 1, not base 0). This field is required for the following combination of and :
=Re-sequencing; =SHOTGUN; PCR;RT-PCR
Database: Trace Archive; Category: Metadata Field List
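Because REFERENCE_ACC_MIN and REFERENCE_ACC_MAX are 1-based, they must be shifted before use with 0-based, half-open indexing such as Python slices. The sketch below assumes both coordinates are inclusive, which the field descriptions do not state explicitly; the function names are illustrative.

```python
def amplicon_slice(acc_min: int, acc_max: int) -> slice:
    """Convert 1-based coordinates (assumed inclusive) to a 0-based, half-open slice."""
    if acc_min < 1 or acc_max < acc_min:
        raise ValueError("expected 1 <= acc_min <= acc_max")
    return slice(acc_min - 1, acc_max)


def amplicon_length(acc_min: int, acc_max: int) -> int:
    """Number of bases covered by an inclusive 1-based interval."""
    return acc_max - acc_min + 1
```

Under that assumption, the example values 29829 and 30929 describe an amplicon of 1101 bases.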
REFERENCE_ACCESSION
Reference accession (use accession and version to specify a particular instance of a sequence) used as the basis for a re-sequencing project. In case of Comparative strategy show the basis for primers design. Type: varchar(50) Example: NT_029829.1 This field is required for the following combination of and :
=Re-sequencing;Comparative =Any
Database: Trace Archive; Category: Metadata Field List
REFERENCE_OFFSET
Sequence offset of accession specified in REFERENCE_ACCESSION field to define the coordinate start position used as the basis for a re-sequencing project. Type: int Example: 1520899 This field points to the starting coordinate of the described in the field. All coordinates should be 1-based (i.e., sequences start at base 1, not base 0). This field is required for the following combination of and :
=Re-sequencing; =CHIP
Database: Trace Archive; Category: Metadata Field List
REFERENCE_SET_MAX
Finish position for an entire re-sequencing region. This region may include several amplicons. Type: int Example: 29829 This field points to the finishing coordinate of the described in the field for an entire re-sequencing region. All coordinates should be 1-based (i.e., sequences start at base 1, not base 0). The REFERENCE_ACC_[MIN|MAX] and REFERENCE_SET_[MIN|MAX] should refer to the same REFERENCE_ACC.
Database: Trace Archive; Category: Metadata Field List
REFERENCE_SET_MIN
Start position for an entire re-sequencing region. This region may include several amplicons. Type: int Example: 29829 This field points to the starting coordinate of the described in the field for an entire re-sequencing region. All coordinates should be 1-based (i.e., sequences start at base 1, not base 0). The REFERENCE_ACC_[MIN|MAX] and REFERENCE_SET_[MIN|MAX] should refer to the same REFERENCE_ACC.
Database: Trace Archive; Category: Metadata Field List
Region
Lower level partition of run data to which this data block pertains, typically the field of view for the imaging camera.
Platform: Region
454: 0 if whole plate is used, 1..16 for gasket partition.
Illumina: Tile number (1..200+), or use 0 if file contains all tiles.
SOLiD: Panel number (1..4096), or use 0 if file contains all panels.
Helicos: Field
Database: Sequence Read Archive; Category: Run
Release
Release project data immediately. Private DDBJ record(s) citing this ID are not released.
Database: BioProject; Category: Submitter
Release
Submitted BioSample record will be released immediately after the curation process finishes.
Database: Biosample; Category: Submitter
Relevance
Select the primary general relevance of the study.
Relevance: Description
Agricultural
Medical
Industrial: Could include bio-remediation, bio-fuels and other areas of research where there are areas of mass production.
Environmental
Evolution
ModelOrganism
Other: Unspecified major impact categories to be defined in the "Relevance description".
Database: BioProject; Category: General info
Relevance description
Describe the relevance when Other is selected.
Database: BioProject; Category: General info
REPLACED_BY
TI that replaced the current TI as "active". Type: int Example: 304753779 This field points to the more recent data set. If the trace was updated, this field stores the TI of the new trace. If only ancillary information has been updated, then replaced_by=0 and is not shown.
Database: Trace Archive; Category: Internal Fields List
Replicate Term Accession Number
The accession number for this term, taken from the indicated Term Source.
Database: Omics Archive; Category: IDF
Replicate Term Source REF
The source of the Replicate Type terms; this must reference one of the Term Source Names defined elsewhere in the IDF file (see below).
Database: Omics Archive; Category: IDF
Replicate Type
The replicate strategies used. Typically these terms should come from the MGED Ontology. See for example the list of ReplicateDescriptionType terms. Controlled vocabulary term.
Database: Omics Archive; Category: IDF
Reproduction
Select a Reproduction.
Reproduction
Sexual
Asexual
Database: BioProject; Category: Target
Run Center
The name of the contract sequencing center or company that executed the run. Center Name(s) listed in Center Name List and Run Center List can be entered.
Database: Sequence Read Archive; Category: Run
Run Date
Date when the run took place.
Database: Sequence Read Archive; Category: Run
RUN_DATE
Date the sequencing reaction was run. Type: datetime Example: 2000-10-28
Database: Trace Archive; Category: Metadata Field List
RUN_GROUP_ID
ID used to group traces run on the same machine. Type: varchar(30) Example: group2
Database: Trace Archive; Category: Metadata Field List
RUN_LANE
Lane or capillary of the trace. Type: int Example: 1 This field documents the specific lane or capillary on which a trace was obtained.
Database: Trace Archive; Category: Metadata Field List
RUN_MACHINE_ID
ID of the specific sequencing machine on which a trace was obtained. Type: varchar(30) Example: machine2
Database: Trace Archive; Category: Metadata Field List
RUN_MACHINE_TYPE
Type or model of machine on which a trace was obtained. Type: varchar(30) Example: ABI 310
Database: Trace Archive; Category: Metadata Field List
Run/Analysis
Specify whether a data file belongs to the Run or Analysis. In the web submission form, this field is not editable and is filled in automatically according to the selected Run or Analysis. When uploading metadata in a TSV file, this field must be specified manually.
Database: Sequence Read Archive; Category: Analysis
Run/Analysis
Specify whether a data file belongs to the Run or Analysis. In the web submission form, this field is not editable and is filled in automatically according to the selected Run or Analysis. When uploading metadata in a TSV file, this field must be specified manually.
Database: Sequence Read Archive; Category: Run
Run/Analysis contains files
Select a Run to which the data file belongs.
Database: Sequence Read Archive; Category: Run
Run/Analysis contains files
Select an Analysis to which the data file belongs.
Database: Sequence Read Archive; Category: Analysis
SALINITY
The salinity at which an environmental sample was collected measured in parts per thousand units (promille). Type: float Example: 20 The field is only applicable to environmental sample data but is not a required field.
Database: Trace Archive; Category: Metadata Field List
Salinity
Choose a Salinity.
Salinity
NonHalophilic
Mesophilic
ModerateHalophilic
ExtremeHalophilic
Unknown
Database: BioProject; Category: Target
Sample attributes
List of attributes.
Download BioSample worksheet which has been customised to fit models. This is a tab-delimited text file that may be opened with a spreadsheet program or a text editor.
Database: Biosample; Category: Attributes
Sample Name
This column contains user-defined names for each Sample material. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Sample Name columns:
Database: Omics Archive; Category: SDRF
Sample scope
The scope and purity of the biological sample used for the study.
Sample scope: Description
Monoisolate: A single animal, cultured cell-line, inbred population (or possibly a heterogeneous population when a single genome assembly is generated from the pooled sample; not preferred).
Multiisolate: Multiple individuals, a population (representation of a species).
Multi-species: Sample represents multiple species.
Environment: Species content of the sample is not known.
Synthetic: Sample is synthetically created by a machine.
Other: Specify the sample scope that was used in the "Target description".
Database: BioProject; Category: Project type
BioSample Used
Select the BioSample this experiment uses.
Database: Sequence Read Archive; Category: Experiment
Scan Name
This optional column contains user-defined names for each Scan event. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Scan Name columns
Database: Omics Archive; Category: SDRF
SDRF File
The name(s) of the SDRF file(s) accompanying this IDF file.
Database: Omics Archive; Category: IDF
Sector
Higher level partition of run data to which this data block pertains.
Platform: Sector
454: not used
Illumina: Lane number
SOLiD: slide
Helicos: channel
Database: Sequence Read Archive; Category: Run
SEQ_LIB_ID
Center specified M13/PUC library that is actually sequenced. Type: varchar(255) Example: 22194 The field is the center identifier for the M13/PUC based clone that is actually sequenced. This will allow grouping of traces by the actual ligation event and is applicable to most projects. This value will be unique within a given center.
This field would be required for the following combination of and :
=Any;=SHOTGUN
=Any;=WGS/WCS
Database: Trace Archive; Category: Metadata Field List
Sequence Length
The fixed number of bases expected in each raw sequence, including both mate pairs and any technical reads.
Database: Sequence Read Archive; Category: Experiment
Shape
Select all relevant menu options.
Shape: Description
Bacilli: rod-shaped
Cocci: spherical-shaped
Spirilla: spiral-shaped
Coccobacilli: elongated coccal form
Filamentous: filament-shaped (bacilli that occur in long threads)
Vibrios: vibrio-shaped (short, slightly curved rods)
Fusobacteria: fusiform or spindle-shaped (rods with tapered ends)
SquareShaped
CurvedShaped
Tailed
Database: BioProject; Category: Target
Short Name
Short name for the standard reference assembly used in the alignment. This should resolve into a community-accepted collection of reference sequences.
Short Name: Description
GRCh37: GRCh37 is the Genome Reference Consortium Human Reference 37 released 24-FEB-2009, and includes haploid and alternative loci sequences.
GRCh37-lite: GRCh37-lite is a subset of the full GRCh37 human genome assembly plus the human mitochondrial genome reference sequence (the "rCRS") from Mitomap.org. This set of sequences excludes all the alternate loci scaffolds of the full GRCh37 assembly, and has the pseudo-autosomal regions (PARs) on chromosome Y masked with Ns. This haploid representation of the genome is provided as a convenience for use in alignment pipelines that cannot handle the multiple placements expected in the PARs and in regions of the genome that are represented by the alternate loci.
HG18: The March 2006 human reference sequence (NCBI Build 36.1) was produced by the International Human Genome Sequencing Consortium and is distributed by UCSC.
NCBI36: NCBI Build 36.3 released 24 March 2008. This build consists of a reference assembly for the whole genome, alternate assemblies for the whole genome produced by Celera and by JCVI, plus alternate assemblies for some parts of the genome.
NCBI36-HG18_Broad_variant: Broad Institute variant of Build 36/HG 18.
NCBI36_BCCAGSC_variant: British Columbia Cancer Agency Genome Sequencing Center variant of Build 36/HG 18.
NCBI36_BCM_variant: Baylor College of Medicine variant of Build 36/HG 18.
NCBI36_WUGSC_variant: Washington University variant of Build 36/HG 18.
Database: Sequence Read Archive; Category: Analysis
Size
The size and unit of measurement for the estimated genome size.
Database: BioProject; Category: Target
Source Name
This column contains user-defined names for the Source materials. Used as an identifier within the MAGE-TAB document.
The following columns can be used to annotate Source Name columns:
Database: Omics Archive; Category: SDRF
SOURCE_TYPE
Source of the DNA. Type: varchar(50) Example: GENOMIC DNA The field consists of a code. Possible values are:
  • G=Genomic DNA (includes PCR products from genomic DNA)
  • N=Non Genomic DNA (EST, cDNA, RT-PCR, screened libraries)
  • VIRAL RNA=Viral RNA
  • SYNTHETIC=Synthetic DNA
Accepted values are G, N, GENOMIC, NON GENOMIC, VIRAL RNA, SYNTHETIC
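As a minimal sketch, a submitter-side check of SOURCE_TYPE against the accepted values listed above might look like the following. The function name `normalize_source_type`, the case-insensitive comparison, and the mapping of the long forms GENOMIC and NON GENOMIC onto the codes G and N are illustrative assumptions, not part of the Trace Archive specification.

```python
# Accepted SOURCE_TYPE values, taken from the list above.
ACCEPTED = {"G", "N", "GENOMIC", "NON GENOMIC", "VIRAL RNA", "SYNTHETIC"}

# Assumption: the long forms are synonyms of the one-letter codes.
SYNONYMS = {"GENOMIC": "G", "NON GENOMIC": "N"}


def normalize_source_type(value: str) -> str:
    """Return a canonical SOURCE_TYPE value or raise ValueError."""
    # Assumption: comparison ignores case and surrounding/repeated whitespace.
    key = " ".join(value.upper().split())
    if key not in ACCEPTED:
        raise ValueError(f"unsupported SOURCE_TYPE: {value!r}")
    return SYNONYMS.get(key, key)
```

Rejecting unknown values before submission keeps the check local to the submitter; whether the archive itself tolerates other spellings is not stated here.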
Database: Trace Archive; Category: Metadata Field List
SPECIES_CODE
Description of species from which trace is derived. Type: varchar(100) Example: Homo sapiens The field is used to classify the read by species, using proper taxonomic names where possible. This field is currently maintained as a controlled vocabulary. For a list of species currently contained within the Trace Archive, see: http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?cmd=stat&f=xml_list_species&m=obtain&s=species To submit a new species, please contact the DDBJ Trace Archive prior to submission. When the taxonomic origin of a specific trace is unclear, the taxonomic classification 'ENVIRONMENTAL SEQUENCE' can be used for environmental samples, or 'ARTIFICIAL SEQUENCE' for artificial material.
Database: Trace Archive; Category: Metadata Field List
Spot Length

The read length in the submitted sequencing files. For mate pairs, this number includes both mates, but does not include gap lengths.

  • When the spot length is constant, enter a constant value.
  • For 454 platforms producing reads with variable length, enter a constant flow count.
  • For fastq files with variable length, enter an average length.
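The constant and variable-length FASTQ cases can be sketched as follows. This is an illustration only: it assumes four-line FASTQ records with one read per record, and rounding the average to a whole number is an assumption, since the required precision is not stated. The helper names `read_lengths` and `spot_length` are hypothetical.

```python
import io
from statistics import mean


def read_lengths(fastq_handle):
    """Yield the length of each sequence line in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of each record holds the bases
            yield len(line.strip())


def spot_length(lengths):
    """Return the constant length if all reads agree, otherwise the rounded average."""
    lengths = list(lengths)
    if len(set(lengths)) == 1:
        return lengths[0]
    return round(mean(lengths))  # assumption: report a whole number of bases


# Two reads of different length, so the average is reported.
example = io.StringIO("@r1\nACGTAC\n+\nIIIIII\n@r2\nACGT\n+\nIIII\n")
print(spot_length(read_lengths(example)))
```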
Database: Sequence Read Archive; Category: Experiment
Spot Type
Select the layout of reads in the sequencing data files.
  • single: Single read
  • paired (FF): Paired reads with the same direction.
  • paired (FR): Paired reads with opposite directions.
Database: Sequence Read Archive; Category: Experiment
start
Both matches and mismatches are counted. If Max Mismatch is exceeded, it is not a match. Once Min Match is reached, a match is declared.
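The counting rule above can be sketched as a small function. This is a hedged illustration rather than the archive's implementation: it assumes the comparison runs base by base from the first position of the read, and that both thresholds are checked as soon as each count changes. The name `is_match` and its parameters are hypothetical.

```python
def is_match(read: str, expected: str, min_match: int, max_mismatch: int) -> bool:
    """Compare a read against an expected base call, position by position."""
    matches = mismatches = 0
    for observed, wanted in zip(read, expected):
        if observed == wanted:
            matches += 1
            if matches >= min_match:  # Min Match reached: declare a match
                return True
        else:
            mismatches += 1
            if mismatches > max_mismatch:  # Max Mismatch exceeded: not a match
                return False
    # Ran out of bases before Min Match was reached.
    return False
```

For example, `is_match("AAGTAC", "ACGTAC", min_match=4, max_mismatch=1)` tolerates the single mismatch at the second base and returns True once the fourth matching base is seen.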
Database: Sequence Read Archive; Category: Experiment
STATE
Indicates the status of the trace. Type: varchar Example: active
  • active
  • updated
  • withdrawn
Database: Trace Archive; Category: Internal Fields List
stdev
Standard deviation of length in base pairs of the interval.
Database: Sequence Read Archive; Category: Experiment
STRAIN
Strain from which a trace is derived. Type: varchar(50) Example: C57BL/6J is required for ="SNP"
Database: Trace Archive; Category: Metadata Field List
Strain, breed, cultivar
Microbial strain name, or eukaryotic breed or cultivar name. Please provide this or "Isolate name or label".
Database: BioProject; Category: Target
STRATEGY
Experimental STRATEGY. Type: varchar(50) Example: MODEL VERIFY Experimental strategy used when obtaining the trace. It is proposed that this would be a controlled vocabulary, but that submitters would contribute to this list as needed to define various experiments and projects.

  • AFLP: Amplified Fragment Length Polymorphism
  • BARCODE: DNA sequence analysis of a uniform target gene to enable species identification
  • CCS: Concatenated cDNA sequencing
  • cDNA: Sequences generated in the process of sequencing cDNA clones
  • CF-S: Cot-filtered single/low-copy genomic DNA
  • CF-M: Cot-filtered moderately repetitive genomic DNA
  • CF-H: Cot-filtered highly repetitive genomic DNA
  • CF-T: Cot-filtered theoretical single-copy DNA
  • CLONE: Genomic clone based (hierarchical) sequencing
  • CLONEEND: Sequences generated from the end of a clone (BAC/PAC/Fosmid or cDNA)
  • Comparative: Sequences obtained using primers designed from related species
  • CTS: Concatenated Tag Sequencing
  • Env Sample-GEO: Geographically generated environmental sample
  • Env Sample-Host: Environmental samples collected from a specific host
  • EST: single pass sequencing of cDNA templates
  • FINISHING: a read specifically made for finishing, could be either BAC finishing or Whole Genome Assembly (WGA) finishing
  • MODEL VERIFY: Sequences obtained to verify proposed gene models
  • PoolClone: Pools of clones (BACs mostly)
  • SNP: Reads used for SNP identification
  • TARGETED LOCUS: Sequences obtained from templates generated by primers designed to amplify a specific genetic locus
  • Re-sequencing: Re-sequencing of targeted genomic regions
  • RT-PCR: Sequences obtained using templates generated by Reverse Transcriptase Polymerase Chain Reaction
  • WGA: Whole Genome Assembly
Database: Trace Archive; Category: Metadata Field List
Study Ref
The study this experiment belongs to. The study alias will be displayed after relating the metadata objects in the Relation of MetaDefine.
Database: Sequence Read Archive; Category: Experiment
Study Ref
The study this analysis belongs to. The study alias will be displayed after relating the metadata objects in the Relation of MetaDefine.
Database: Sequence Read Archive; Category: Analysis
SUBMISSION_TYPE
Type of submission. Type: varchar(50) Example: NEW The field allows the following values:
  • NEW: use to submit new data
  • UPDATE: use to renew traces and their ancillary information. Previous data will be saved with their TIs; new traces with the same trace names will receive new TIs and will become active
  • UPDATEINFO: use to update or add ancillary information for already existing traces without re-submitting the entire package of data
  • WITHDRAW: use to withdraw traces
Database: Trace Archive; Category: Metadata Field List
Submitting organization
Full name of organization.
Database: BioProject; Category: Submitter
Submitting organization
Full name of organization.
Database: Biosample; Category: Submitter
Submitting organization URL
The URL of submitter's organization.
Database: BioProject; Category: Submitter
Submitting organization URL
The URL of submitter's organization.
Database: Biosample; Category: Submitter
Suffix
Suffix for author.
Database: BioProject; Category: Publication
Suffix
Database: Biosample; Category: Publications
SVECTOR_ACCESSION
DDBJ/EMBL/Genbank accession of the sequencing vector. Type: varchar(50) Example: X52325
Database: Trace Archive; Category: Metadata Field List
SVECTOR_CODE
Center defined code for the sequencing vector. Type: varchar(50) Example: pBluescript SK(+)
Database: Trace Archive; Category: Metadata Field List
Tandem
Tandem gaps between ligands.
Database: Sequence Read Archive; Category: Experiment
Target description
Describe the Sample scope/Material/Capture when the Other(s) is selected.
Database: BioProject; Category: Project type
TAXID
NCBI Taxonomy ID. Type: int Example: 10090 This field links DDBJ Trace Archive with NCBI Taxonomy Browser.
Database: Trace Archive; Category: Internal Fields List
Taxonomy ID
NCBI Taxonomy ID
Database: BioProject; Category: Target
Technology Type
Controlled vocabulary term. Used as an attribute column following Assay Name. This column contains terms describing the type of each generic (non-hybridization) assay. Example: high_throughput_sequencing. Technology Type columns can be annotated with a Term Source REF column, which in this case would point to the ontology (defined in the IDF) from which the Technology Type terms are taken.
Database: Omics Archive; Category: SDRF
TEMPERATURE
The temperature (in °C) at which an environmental sample was collected. Type: float Example: 30 The field is only applicable to environmental sample data, but it is not a required field.
Database: Trace Archive; Category: Metadata Field List
Temperature range
Choose a temperature range.
  • Cryophilic
  • Psychrophilic
  • Mesophilic
  • Thermophilic
  • Hyperthermophilic
  • Unknown
Database: BioProject; Category: Target
TEMPLATE_ID
Submitter defined identifier for the sequencing template. Type: varchar(50) Example: HBBBA2211 The field is used to uniquely identify the actual template that is sequenced. This field, in conjunction with the TRACE_END field, can be used to identify traces that should be marked as 'mate_pairs' because they come from opposite ends of the same clone.
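One way to picture the pairing is to group traces by TEMPLATE_ID and match the F and R ends. The sketch below is illustrative only: the function name `find_mate_pairs`, the dictionary-based record layout, and the rule of pairing only templates with exactly one forward and one reverse read are assumptions, not archive behaviour.

```python
from collections import defaultdict


def find_mate_pairs(traces):
    """Group traces by TEMPLATE_ID and pair forward (F) with reverse (R) reads.

    `traces` is an iterable of dicts with TRACE_NAME, TEMPLATE_ID and TRACE_END keys.
    Returns a list of (forward_name, reverse_name) tuples.
    """
    by_template = defaultdict(lambda: {"F": [], "R": []})
    for trace in traces:
        end = trace["TRACE_END"]
        if end in ("F", "R"):  # traces with an unknown end (N) are skipped
            by_template[trace["TEMPLATE_ID"]][end].append(trace["TRACE_NAME"])

    pairs = []
    for ends in by_template.values():
        # Assumption: only templates with exactly one F and one R read are paired.
        if len(ends["F"]) == 1 and len(ends["R"]) == 1:
            pairs.append((ends["F"][0], ends["R"][0]))
    return pairs
```

Templates with several reads from the same end are left unpaired here, because the text does not say how such cases should be resolved.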
Database: Trace Archive; Category: Metadata Field List
Term Accession Number
Used as an attribute column following Term Source REF columns. This column contains the accession numbers from the term source used to identify the ontology or database terms in question. For example:
Source Name    Characteristics[DiseaseState]    Term Source REF      Term Accession Number
Sample 1       acute lymphocytic leukemia       NCI Metathesaurus    C0023449
(This example relies on the "NCI Metathesaurus" Term Source having been pre-defined in the IDF accompanying the SDRF.)
Database: Omics Archive; Category: SDRF
Term Source File
A filename or valid URI at which the Term Source may be accessed.
Database: Omics Archive; Category: IDF
Term Source Name
The names of the Term Sources (ontologies or databases) used within the MAGE-TAB document. This name will be used in all corresponding "Term Source REF" fields. Examples: MGED Ontology, NCI MetaThesaurus, ArrayExpress. Used as an identifier within the MAGE-TAB document.
Database: Omics Archive; Category: IDF
Term Source REF
Used as an attribute column following any controlled vocabulary column (e.g., Characteristics[]), or column allowing reference of external entities (e.g., Protocol REF). This column contains references to ontology or database Term Sources defined in the IDF, and from which the values in the previous column were taken. Term Source REF columns can be annotated with a Term Accession Number column.
Database: Omics Archive; Category: SDRF
Term Source Version
The version of the Term Source used throughout the MAGE-TAB document.
Database: Omics Archive; Category: IDF
This publication has multiple authors
If this is checked, then "et al" is added to the author name provided above.
Database: BioProject; Category: Publication
This publication has multiple authors
If this is checked, then "et al" is added to the author name provided above.
Database: Biosample; Category: Publications
TI
Trace unique internal Identifier. Type: int Example: 304753779 It is assigned to a record at the loading stage, and any record, or any number of records, can be obtained by their identifiers.
Database: Trace Archive; Category: Internal Fields List
Title
Short text that can be used to call out experiment records in searches or in displays. A title like "[Sequencing Instrument Model] [paired end] sequencing of [BioSample ID]" (for example, "Illumina HiSeq 2000 paired end sequencing of SAMD00025741") is automatically constructed. To enter user-defined titles, download Experiment metadata into a tab-delimited text file, edit title values and upload it.
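The automatically constructed title pattern can be illustrated with a short sketch. The function name `default_title` is hypothetical, and omitting the "paired end" part for single-read layouts is an assumption drawn from the bracketed pattern, not a documented rule.

```python
def default_title(instrument_model: str, biosample_id: str, paired: bool) -> str:
    """Build a title of the form '<model> [paired end] sequencing of <BioSample ID>'."""
    # Assumption: the "paired end" part is simply left out for single reads.
    layout = "paired end " if paired else ""
    return f"{instrument_model} {layout}sequencing of {biosample_id}"


print(default_title("Illumina HiSeq 2000", "SAMD00025741", paired=True))
# prints: Illumina HiSeq 2000 paired end sequencing of SAMD00025741
```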
Database: Sequence Read Archive; Category: Experiment
Title
Short text that can be used to call out run records in searches or in displays. A title like "[Sequencing Instrument Model] [paired end] sequencing of [BioSample ID]" (for example, "Illumina HiSeq 2000 paired end sequencing of SAMD00025741") is automatically constructed. To enter user-defined titles, download Run metadata into a tab-delimited text file, edit title values and upload it.
Database: Sequence Read Archive; Category: Run
Title
Title of the analysis object.
Database: Sequence Read Archive; Category: Analysis
TRACE_END
Defines the end of the template contained in the read. Type: varchar(50) Example: F The field can have the following values:
  • F: FORWARD
  • R: REVERSE
  • N: UNKNOWN
Database: Trace Archive; Category: Metadata Field List
TRACE_FILE
Filename with the trace, relative to the top of the volume. Type: varchar(200) Example: ./traces/TRACE001.scf
Database: Trace Archive; Category: Metadata Field List
TRACE_FORMAT
Format of the trace file. Type: varchar(20) Example: scf The field can have the following values:
  • SCF - A standard file format for data from DNA sequencing instruments.
  • ABI - An ABI trace file is a binary file that includes the trace data and the sequence.
Database: Trace Archive; Category: Metadata Field List
TRACE_NAME
Center defined trace identifier. Type: varchar(250) Example: HBBBA1U2211 The field must be unique within a center, but is not required to be unique between centers. The combination of and act as a unique key within the Trace Archive.
Database: Trace Archive; Category: Metadata Field List
TRACE_TYPE_CODE
Sequencing strategy by which the trace was obtained. Type: varchar(50) Example: wgs The field reflects the sequencing strategy used to obtain the trace.

  • CHIP: Sequences obtained using microarrays (also called DNA chips or gene chips)
  • CLONEEND: Sequences generated from the end of a large insert (BAC/PAC/Fosmid) or cDNA clone
  • EST: Single Pass Expressed Sequence Tag
  • HTP SELEX: High throughput SELEX
  • OTHER: Other than PCR, PrimerWalk, SHOTGUN or TRANSPOSON for FINISHING
  • PCR: Sequences obtained using templates generated by genomic Polymerase Chain Reaction
  • PrimerWalk: Sequences generated through a primer walking step
  • RT-PCR: Sequences obtained using templates generated by Reverse Transcriptase Polymerase Chain Reaction
  • SHOTGUN: Shotgun sequencing of clones (genomic or cDNA)
  • TRANSPOSON: Sequences obtained using templates generated by transposons
  • WCS: Whole Chromosome Shotgun
  • WGS: Whole Genome Shotgun
Database: Trace Archive; Category: Metadata Field List
TRANSPOSON_ACC
DDBJ/EMBL/Genbank accession for transposon used in generating sequencing template. Type: varchar(50) Example: X00913 This field would be required for the following combination of and :
=Any;=TRANSPOSON
Database: Trace Archive; Category: Metadata Field List
TRANSPOSON_CODE
Center defined code for transposon used in generating sequencing template. Type: varchar(50) Example: Mu transposon This field would be required for the following combination of and :
=Any;=TRANSPOSON
Database: Trace Archive; Category: Metadata Field List
Trophic Level
Select a trophic level.
  • Autotroph
  • Heterotroph
  • Mixotroph
Database: BioProject; Category: Target
Type
Select a replicon type.
  • Chromosome
  • Plasmid
  • Linkage Group
  • Segment
  • Other
Database: BioProject; Category: Target
Unit[]
Controlled vocabulary term. Used as an attribute column following Characteristics[], Factor Value[] or Parameter Value[]. This column contains terms describing the unit(s) to be applied to the values in the preceding column. The type of unit is included in the column header, e.g. "Unit[TimeUnit]". These unit types should correspond to Unit subclasses from the MGED Ontology. Unit[] columns can be annotated with a Term Source REF column, which in this case would point to the ontology (defined in the IDF) from which the Unit terms are taken.
Database: Omics Archive; Category: SDRF
UPDATE_DATE
Date on which the data was updated/replaced. Type: smalldatetime Example: Jul 19 2001 3:48PM This field is used to store the date of the last update.
Database: Trace Archive; Category: Internal Fields List
URL
URL of a web site related to this study.
Database: BioProject; Category: General info
URL
URL of the web site.
Database: Biosample; Category: General info
value
Frequency count or 0.
Database: Sequence Read Archive; Category: Experiment
Version
Version of the program for primary analysis. Example: 1.7
Database: Sequence Read Archive; Category: Experiment
Volume
Journal volume.
Database: BioProject; Category: Publication
Volume
Database: Biosample; Category: Publications
WELL_ID
Center defined well identifier for the sequencing reaction. Type: varchar(50) Example: A1 The field in combination with the field , is used to define the storage location of the sequencing reaction (see note with the field). Typically, sequencing reactions are performed in standard microtiter dishes having either 96 or 384 wells (see standard configurations below).
[Figure: Standard 96 well microtiter configuration]
[Figure: Standard 384 well microtiter configuration]
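As a hedged illustration, a well identifier such as the example "A1" can be checked against the two standard plate layouts (8 rows by 12 columns for 96 wells, 16 rows by 24 columns for 384 wells). WELL_ID is center defined, so the row-letter plus column-number format assumed here, and the helper name `parse_well_id`, are assumptions rather than a required format.

```python
import re

# Row letters and column counts of the two standard plate formats.
PLATES = {96: ("ABCDEFGH", 12), 384: ("ABCDEFGHIJKLMNOP", 24)}


def parse_well_id(well_id: str, plate_size: int = 96):
    """Return the zero-based (row, column) of a well such as "A1" or "H12"."""
    rows, n_cols = PLATES[plate_size]
    # Assumption: one row letter followed by a one- or two-digit column number.
    m = re.fullmatch(r"([A-Za-z])(\d{1,2})", well_id.strip())
    if not m:
        raise ValueError(f"unrecognised well identifier: {well_id!r}")
    row = rows.find(m.group(1).upper())
    col = int(m.group(2))
    if row < 0 or not 1 <= col <= n_cols:
        raise ValueError(f"{well_id!r} is outside a {plate_size}-well plate")
    return row, col - 1
```

With these layouts, `parse_well_id("H12")` gives `(7, 11)`, and `parse_well_id("P24", 384)` gives `(15, 23)`, while "P24" is rejected for a 96-well plate.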
Database: Trace Archive; Category: Metadata Field List
Year
Publication year.
Database: BioProject; Category: Publication
Year
Database: Biosample; Category: Publications