Frequently asked questions


Metadata*

Do I have to register a separate BioProject/BioSample for each genome I am sequencing?

If multiple cultured genomes are part of the same research effort, then they can belong to the same BioProject. However, each culture must be registered as a separate BioSample.

For metagenomic assemblies, where multiple genomes are assembled with high confidence from a single metagenomic sample, register one BioProject for the metagenomic assembly project and one BioSample for each metagenomic sample used for assembly.

Created: February 12, 2015; Last updated: December 13, 2016

What should be provided when information is unavailable?

Please see Missing value reporting.
Created: September 2, 2014; Last updated: October 28, 2015

What is the relationship between BioSamples, SRA Experiments, SRA Runs, and my data files?

A BioSample is descriptive information about the biological source materials, or samples, used to generate experimental data in any of the primary data archives. Biological and technical replicates need to be registered as separate BioSamples, distinguished by the "replicate" attribute with values such as "biological replicate 1" and "biological replicate 2".

Each SRA Experiment is a unique sequencing library for a specific sample. Importantly, much of the descriptive information that is displayed in the public record of your data is captured at the level of the SRA Experiment.

An SRA Run is simply a manifest of the data file(s) that should be linked to a given sequencing library – no information present in the Run is displayed on the public record of your project. Note that all data files listed in a Run will be merged into a single SRA archive file (and fastq file for distribution), so files from different samples should not be grouped in the same Run. Paired-end data files (forward/reverse), conversely, MUST be listed in a single Run in order for the two files to be correctly processed as paired-end. Do not split a paired-end library into separate samples (for example, one for forward reads and one for reverse reads).
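The relationship described above can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and the `check_run` helper are hypothetical and are not part of any DDBJ or SRA submission tool or file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """A manifest of the data files that belong to one sequencing library."""
    files: List[str]


@dataclass
class Experiment:
    """One sequencing library prepared from exactly one BioSample."""
    biosample: str                      # placeholder BioSample identifier
    layout: str                         # "single" or "paired" (hypothetical values)
    runs: List[Run] = field(default_factory=list)


def check_run(experiment: Experiment, run: Run) -> List[str]:
    """Return a list of problems found in a Run (empty if none were found)."""
    problems = []
    # Forward and reverse files of a paired-end library must share one Run.
    if experiment.layout == "paired" and len(run.files) < 2:
        problems.append(
            "paired-end library needs forward and reverse files in the same Run"
        )
    return problems


library = Experiment(biosample="sample-A", layout="paired")
good_run = Run(files=["A_forward.fastq", "A_reverse.fastq"])
split_run = Run(files=["A_forward.fastq"])  # reverse file wrongly placed elsewhere
```

In this sketch `check_run(library, good_run)` reports no problems, while `check_run(library, split_run)` reports one, because the forward and reverse files were not listed together.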

Created: June 4, 2014; Last updated: January 4, 2017

How do I import a BioProject or BioSample accession into the DRA?

BioProject and BioSample submissions must be made through the Submission Portal D-way. Once you begin a BioProject or BioSample submission, it will be assigned a temporary tracking ID (PSUB/SSUB[number], respectively) – this is not the final accession! Once a BioProject is complete, it is assigned an accession like PRJDB[number]. Once a BioSample submission is complete, each sample will receive an accession like SAMD[number]. When creating DRA experiments, please specify the PSUB ID or PRJDB[number] accession as your BioProject, and SSUB ID or SAMD[number] as your BioSample. Note that a given data file can be linked to a single BioSample only.
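As a rough illustration of the difference between temporary tracking IDs and final accessions, the sketch below classifies an identifier by its prefix. The prefixes come from the text above; the number of digits is not specified there, so the patterns accept any run of digits, which is an assumption. The function name `classify_id` is hypothetical.

```python
import re

# Prefixes are taken from the text; the digit count is an assumption.
ID_PATTERNS = {
    "BioProject temporary ID": re.compile(r"PSUB\d+"),
    "BioSample temporary ID": re.compile(r"SSUB\d+"),
    "BioProject accession": re.compile(r"PRJDB\d+"),
    "BioSample accession": re.compile(r"SAMD\d+"),
}


def classify_id(identifier: str) -> str:
    """Return a label for the identifier, or "unrecognized" if no prefix fits."""
    for label, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(identifier):
            return label
    return "unrecognized"
```

A check like this can help catch a temporary ID that was mistaken for a final accession before it is cited anywhere.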

When sample preparation and sequencing are carried out by different research groups, a DRA Experiment can reference BioProject and BioSample IDs obtained under another submission account. If you need to reference external BioProject and BioSample IDs, contact the DRA team. When referencing external objects, be aware that releasing one record can trigger the release of linked BioProject, BioSample and DRA records.

Created: June 4, 2014; Last updated: December 13, 2016

How many samples do I need for my DRA submission?

A BioSample is descriptive information about the biological source materials, or samples, used to generate experimental data in any of the primary data archives. Biological and technical replicates are represented by separate BioSamples with distinct 'replicate' attributes, e.g., 'replicate = biological replicate 1'.

For environmental samples, each physical isolate should be considered a BioSample, whereas uniquely attributable reads within an isolate are not. Note that a given DRA data file can be linked to a single BioSample only.

Basic guidelines for BioSample registration are:
  • Register a separate BioSample for each unique source, e.g., RNA from the wings is a separate BioSample from RNA from the legs if those two sources were sequenced independently.
  • A genome assembly can have only one BioSample. For a genome assembled from reads of multiple BioSamples, register a new BioSample and indicate which other BioSamples were used to generate the assembly. For example, if the reads from a male and from a female were submitted to DRA separately but the reads were combined to assemble the genome, register a new BioSample for the male plus the female, providing the accessions of the male and the female BioSamples in the new BioSample registration. Example genome entry.
  • Endosymbionts: Because sequences are annotated by genome, one would need separate BioSamples for an insect and its endosymbiont. In the insect genome assembly submission, we recommend indicating that the endosymbiont’s BioSample is separate and references the insect BioSample.
Examples:
  • 23,000 unique 16S amplicons from a single seawater collection point - 1 BioSample (1 sample was collected and then analyzed to deduce 16S diversity)
  • 3 "identical" transgenic mice treated with the same drug as part of an experiment - 3 BioSamples (biological and technical replicates are represented by separate BioSamples)
  • To examine gene expression profiles, CHO cells infected with a virus and sampled at 0, 2, 4, and 8 hours post infection - 4 BioSamples (4 time points)
  • To analyze differences in gene expression levels, RNA-seq data from a single male anteater taken from the brain, heart, lungs, testes, and liver - 5 BioSamples (5 different tissues isolated)
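The counting rule behind these examples can be sketched as follows: each distinct combination of source, condition, and replicate is one BioSample, no matter how many reads or amplicons come from it. The dictionary keys (`source`, `condition`, `replicate`) and the function name are hypothetical and chosen only for this illustration; they are not BioSample attribute names.

```python
def count_biosamples(records):
    """Count distinct (source, condition, replicate) combinations."""
    return len(
        {(r["source"], r.get("condition"), r.get("replicate")) for r in records}
    )


# Many amplicons from one seawater collection point: still one sample.
seawater = [{"source": "seawater collection point"} for _ in range(1000)]

# Three mice given the same treatment are three biological replicates.
mice = [
    {
        "source": "transgenic mouse",
        "condition": "drug treated",
        "replicate": f"biological replicate {i}",
    }
    for i in (1, 2, 3)
]

# CHO cells sampled at four time points after infection.
cho = [
    {"source": "CHO cells", "condition": f"{hours} h post infection"}
    for hours in (0, 2, 4, 8)
]

# Five tissues taken from a single anteater.
anteater = [
    {"source": f"anteater {tissue}"}
    for tissue in ("brain", "heart", "lungs", "testes", "liver")
]
```

With these inputs, `count_biosamples` gives 1, 3, 4, and 5 respectively, matching the examples above.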
Created: June 4, 2014; Last updated: December 13, 2016

Update*

How do I update my BioSample?

At this time, submitters must contact the BioSample team to request updates and withdrawals. Please note that when BioSamples are updated, the submission overview page in the D-way submission portal will not reflect this change. That page is only a record of the initial submission, and does not display changes made in the BioSample database.

Created: October 13, 2015

How do I add reference information?

DDBJ Sequence Database

See the relevant item in Data Updates/Corrections and contact us through this form with "Our paper was published" in the [Subject] field.

DRA

Add publication information to the BioProject referenced by the relevant DRA submission. Contact the BioProject team to add the publication.

BioProject

Contact the BioProject team to add publication information. In general, citing the BioProject accession is not recommended.

BioSample

When sequencing data derived from relevant samples are deposited in DDBJ Sequence Database and DRA, please add publication information as described above.

For a publication about the isolation and growth condition specifications of the organism/material, add the PubMed ID or a similar identifier to isol_growth_condt. For a primary genome report, add the relevant PubMed ID or a similar identifier to ref_biomaterial.

If you want to add other types of publications, please contact the BioSample team.

Created: January 23, 2014; Last updated: September 5, 2016

Accession number*

Should I cite BioSample accession numbers in my manuscript?

Typically, it is appropriate to cite the accession numbers that are assigned to your data submissions, e.g. the DDBJ, WGS or DRA accession numbers. If individual BioSamples do need to be referenced, state that "BioSample metadata are available in the DDBJ BioSample database (http://trace.ddbj.nig.ac.jp/biosample/index_e.html) under accession number SAMDxxxxxxxx".

Created: October 13, 2015

Sample attributes*

What is the difference between env_biome, env_feature and env_material?

These three sample attributes describe environmental systems that influence living organisms.

env_biome

In the Environment Ontology (ENVO), the biome [ENVO_00000428] classes are subclasses of environmental system. The env_biome represents environmental systems to which resident ecological communities have evolved adaptations. Thus, an env_biome may be thought of as a community-centric ecosystem, whose extent is defined by the presence of the communities adapted to it. This requires that an env_biome possesses a degree of spatial and temporal stability that has allowed at least some of its constituent communities to adapt. Classes such as tundra biome [ENVO_01000180] and coniferous forest biome [ENVO_01000196] are included in ENVO. Currently, the biome branch of the ontology makes no commitment to a specific spatial or temporal scale.

env_feature

The biomes described above are useful in ecological settings; however, environments are often described by referencing a single entity that has a strong causal influence on its surrounding space. For example, a coral reef environment is determined by the presence and influence of a coral reef [ENVO_00000150]. Similarly, the human gut environment is determined by the human gut. Removal of either the coral reef or the human gut would cause the associated environmental system to collapse. Environmental systems of this kind make no specific reference to ecological communities or populations (as do biomes), but to some central, supporting ‘feature’. Entities that act in this way as the causal ‘hubs’ or supports of a given environmental system are referenced by classes in ENVO’s top-level environmental feature [ENVO_00002297] hierarchy. For example, the environmental feature seamount [ENVO_00000264] would support a seamount environment, i.e. an environmental system which is supported by, and whose properties are determined by, the presence of a seamount.

env_material

In contrast to the classes above, which identify countable entities, the subclasses of the top-level environmental material [ENVO_00010483] class refer to masses, volumes, or other portions of some medium included in an environmental system. A portion of environmental material is understood to be more complex and variable in composition than a simple collection of material entities (e.g. a collection of silicate particles). For example, the environmental material soil [ENVO_00001998] typically contains aggregates of fine rock particles, sand grains, clay particles, silt particles, communities of animals, plants, fungi and microbes, small parts of organisms, organic matter, water inclusions, and airspaces.

Created: July 24, 2014; Last updated: November 19, 2014

Data release*

How are linked BioProject/BioSample/sequence data released?

Linked BioProject, BioSample, DDBJ and DRA data are released as follows.

  • Release of a BioProject record DOES NOT trigger release of the other linked data.
  • Release of a BioSample record DOES NOT trigger release of the other linked data; however, it DOES trigger release of the referencing BioProject.
  • Release of DDBJ and DRA nucleotide sequence data DOES trigger release of the linked BioProject and BioSample records.
Release of linked BioProject/BioSample/sequence records
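The release rules listed above can be summarized as a small lookup table. This is only a sketch of the stated rules, not a description of how the DDBJ systems implement them; the constant and function names are hypothetical.

```python
# Record types named in the release rules.
BIOPROJECT = "BioProject"
BIOSAMPLE = "BioSample"
SEQUENCE = "DDBJ/DRA sequence data"

# For each record type, the linked record types whose release it triggers.
TRIGGERS = {
    BIOPROJECT: set(),                   # triggers nothing else
    BIOSAMPLE: {BIOPROJECT},             # triggers the referencing BioProject
    SEQUENCE: {BIOPROJECT, BIOSAMPLE},   # triggers both linked records
}


def released_together(record_type):
    """Return every record type that becomes public when record_type is released."""
    return {record_type} | TRIGGERS[record_type]
```

For example, `released_together(SEQUENCE)` contains all three record types, while `released_together(BIOPROJECT)` contains only the BioProject itself.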

DRA Handbook: Release of DRA
BioProject Handbook: Release of BioProject
BioSample Handbook: Release of BioSample

Created: December 15, 2014; Last updated: December 13, 2016