DDBJ BioSample Handbook




    The BioSample database was developed to serve as a central location in which to store descriptive information about biological samples used to generate experimental data in any of the primary data archives.

    The following figure depicts how BioSample records are organized and linked with other objects. This example is composed of one umbrella project that encompasses three subprojects, each of which generated data derived from two BioSample records. Users can query either the BioProject or the BioSample database to retrieve the relevant records, and then navigate through links to the corresponding experimental data, which continue to be stored in DDBJ's primary data archives, DDBJ, DRA and DOR.

    Overview of BioSample and BioProject integration with other DDBJ databases


    Given the huge diversity of sample types handled by archival databases, and the fact that appropriate sample descriptions are often dependent on the context of the study, the definition of what a BioSample represents is deliberately flexible. Typical examples of a BioSample include a cell line, a primary tissue biopsy, an individual organism or an environmental isolate.

    Biological and technical replicates are represented by separate BioSamples with distinct 'replicate' attribute values, e.g., 'biological replicate 1' and 'biological replicate 2'. FAQ: How many samples do I need for my DRA submission?

    Information about the sample will include:
    • Species
    • The material sampled, e.g., organs, tissues, cell type
    • Phenotypic information including disease states and clinical information about the individual

    The information about human subjects and access to it will be compliant with all relevant ethical requirements. The DDBJ BioSample database does not support controlled access mechanisms and thus cannot host human clinical samples that may have associated privacy concerns.

    Reference biosamples

    A particular set of biosamples submitted directly to the BioSample database may subsequently be referenced from many experiments. We refer to this set of samples as reference biosamples. Examples include commonly used cell lines or mouse strains. BioSample has pre-registered commonly used samples, making it easy to reference them from other databases at INSDC. Reference biosamples include those from ATCC and Coriell.

    Sample attributes

    A major component of a BioSample record is the sample attributes section. Attributes define the material under investigation and can include sample characteristics such as cell type, collection site and phenotypic information like disease state.

    BioSample attributes are captured as structured name: value pairs, for example, tissue:liver
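
    As a small illustration of the name: value structure, the sketch below parses such pairs into a dictionary. The function name parse_attributes and the second example pair are made up for illustration and are not part of any DDBJ tool.

```python
def parse_attributes(pairs):
    """Turn 'name:value' strings into a dict, splitting on the first colon only."""
    attributes = {}
    for pair in pairs:
        name, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"not a name:value pair: {pair!r}")
        # Surrounding whitespace is dropped so 'name: value' and 'name:value' agree.
        attributes[name.strip()] = value.strip()
    return attributes


# The second pair is a hypothetical attribute used only for this example.
print(parse_attributes(["tissue:liver", "cell type: hepatocyte"]))
```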

    The database supports and encourages use of dictionaries of attribute names.

    The first targeted dictionaries implemented in the DDBJ BioSample database are the MIxS minimum information checklists for standardizing descriptions of genomes, metagenomes and targeted locus sequences, as recently developed by the Genomic Standards Consortium.

    For the MIxS checklists, please see Nature Biotechnology 29, 415–420 (2011) | doi: 10.1038/nbt.1823 (PMID:21552244).

    MIxS check list


    For the organism name to enter in the BioSample organism attribute, see the "Organism name" page. Previously, a strain name or some other lower taxon was required as the organism name for whole genome sequences, mainly those of microorganisms. Currently, however, the value of the organism qualifier should in principle be just a scientific name, even for microbial genomes. Please describe the strain name in the strain attribute of BioSample.

    Related news: Changes in organism strain information management

    XML schema

    BioSample XML schema


    Conditionally required*



    Contact information of submitter(s). Questions and notifications about a submission are sent to the e-mail address(es) listed here. Personal contact information is considered confidential and is collected to be used by DDBJ staff should questions arise; the general information about the research center is used for public display.

    First name*
    Submitter's first name.
    Last name*
    Submitter's last name.
    E-mail address. Enter an address from the organization's domain.


    Organization to which a contact person belongs.
    Submitting organization*
    Full name of organization.
    Submitting organization URL
    The URL of submitter's organization.

    Data Release

    Select "Hold" or "Release". You cannot specify a hold date. Please see Data Release for the detailed release mechanism.

    Release
    The submitted BioSample record is released immediately after the curation process finishes.
    Hold
    The submitted BioSample record is released when a DDBJ, DRA or DTA record referencing this BioSample ID is released. Private DDBJ record(s) referencing this BioSample ID are not released by the release of the BioSample record.

    General info

    External Links
    A URL may be provided, with a label for the resource, to reference a resource that is directly relevant to the submitted sample.
    Link description
    Display name of web site that is related to this sample.
    URL of the web site.

    Sample type

    Core Package

    Genome, metagenome or marker sequences (MIxS compliant)
    Use for genomes, metagenomes, and marker sequences. These samples include specific attributes that have been defined by the Genomic Standards Consortium (GSC) to formally describe and standardize sample metadata for genomes, metagenomes, and marker sequences. The samples are validated for compliance based on the presence of the required core attributes as described in MIxS. For details, please see the GSC websites.
    Other samples (e.g. transcriptome, epigenetics etc)
    Use for any sample type (e.g. transcriptome, epigenetics etc). These samples are described using common core attributes and submitter-supplied custom attributes.


    (Meta)Genomic Sequences Sample (MIMS)
    Environmental/Metagenome Genomic Sequences
    Please refer to environmental samples.
    Genomic Sequences Sample (MIGS)
    Cultured Bacterial/Archaeal Genomic Sequences
    Eukaryotic Genomic Sequences
    Viral Genomic Sequences

    Environmental samples do not include endosymbionts that can be reliably recovered from a particular host, organisms from a readily identifiable but uncultured field sample (e.g., many cyanobacteria), or phytoplasmas that can be reliably recovered from diseased plants (even though these cannot be grown in axenic culture). For such samples, select "Cultured Bacterial/Archaeal", "Eukaryotic" or "Viral".

    Marker Sequences Sample (MIMARKS)
    Specimen Marker Sequences
    Survey related Marker Sequences

    MIMARKS specimen: for marker gene (e.g., COI) sequences obtained from any material identifiable by means of specimens

    MIMARKS-specimen applies to the contextual data for marker gene sequences from cultured or voucher-identifiable specimens.

    MIMARKS survey: for uncultured diversity marker gene (e.g., 16S rRNA, 18S rRNA, nif, amoA, rpo) surveys

    MIMARKS-survey is applicable to contextual data for marker gene sequences, obtained directly from the environment, without culturing or identification of the organisms.

    Environmental package

    Environmental package (MIxS Sample)
    No package
    microbial mat/biofilm
    miscellaneous or artificial


    Sample attributes
    List of attributes.
    Download the BioSample worksheet, which has been customised to fit the selected models. This is a tab-delimited text file that may be opened with a spreadsheet program or a text editor.
    A list of attributes and their definitions can be viewed here. Besides the mandatory fields, there are several optional attribute fields. To make the BioSample record most useful, you should include all available information in the submission. Commonly used and useful attributes have been defined, with standardized nomenclature. In preparing your submission, please refer to this attributes list and fill in the relevant fields. If you have information of a type that does not appear in the standard list, you can create it as a Custom Attribute.


    PubMed ID
    Provide a PubMed ID for any publications directly related to all samples in the submission. How do I add reference information?


    Provide a DOI if a PubMed ID is not available. Provide the additional reference information.
    Reference title*
    Journal title*
    Pages from*
    Pages to*
    First name*
    Middle initial.
    Last name*
    This publication has multiple authors
    If this is checked, then "et al" is added to the author name provided above.


    Private comments to DDBJ staff
    Use this field if you have questions for database support staff. The content is not made public.

    Submission to BioSample

    Submission of research data from human subjects
    When submitting data from human subjects (human data) to the databases of the DDBJ Center, it is the submitter's responsibility to ensure that the dignity and rights of human subjects are protected in accordance with all applicable laws, ordinances, guidelines and policies of the submitter's institution. In principle, make sure to remove any direct personal identifiers of human subjects from the data to be submitted. Before submitting human data, read "Submission of research data from human subjects".

    Submission to BioSample

    Create a new sample submission

    Obtain a submission account according to the Account Handbook.

    Go to the BioSample submission page from the “BioSample” menu at the top. Create a new sample submission by clicking the [New submission] button.

    Create a new sample submission

    To submit a BioSample, fill in the tabs from left to right.

    For details of the metadata, please see "BioSample metadata".

    Submit new samples

    Select a sample type in the "SAMPLE TYPE" tab. For genome samples, the minimum sample attributes are defined by MIxS.

    For the Sample type, please see the BioSample Handbook.

    Select a sample type

    Enter sample attributes

    Download a template text file according to the selected sample type to enter sample attributes.

    The main step of a sample submission is to describe the samples using required, optional and user-defined attributes.

    Download a text file for entering sample attributes

    The text file is tab-delimited and can be opened and edited in a spreadsheet editor (e.g. Excel®). Attribute names are in the header line. Attributes marked with "*" are required.

    From the second line onward, enter one sample per line. For a project that does not yet have a PRJD accession number, enter its PSUB submission ID in bioproject_id.
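
    Before uploading, it can help to check the attribute file for empty required cells. The following is a minimal sketch that assumes only the layout described above (a tab-separated header line in which required attribute names start with "*", followed by one sample per line). The function name find_missing_required and the sample rows are hypothetical; this is not DDBJ's own validator and it does not check attribute formats.

```python
import csv
import io


def find_missing_required(tsv_text):
    """Return (line_number, attribute) pairs for empty required cells."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    # Columns whose header name starts with "*" are treated as required.
    required = [i for i, name in enumerate(header) if name.startswith("*")]
    problems = []
    # The header is line 1, so the first sample is on line 2.
    for line_number, row in enumerate(reader, start=2):
        for i in required:
            value = row[i].strip() if i < len(row) else ""
            if not value:
                problems.append((line_number, header[i].lstrip("*")))
    return problems


# Hypothetical two-sample file; the second sample lacks its organism.
example = (
    "*sample_name\t*sample_title\tdescription\t*organism\n"
    "sample A\ttitle A\t\tStreptacidiphilus albus\n"
    "sample B\ttitle B\t\t\n"
)
print(find_missing_required(example))  # [(3, 'organism')]
```

    An empty optional cell such as description is not reported; only columns marked with "*" are checked.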

    Missing value reporting

    The International Nucleotide Sequence Database Collaboration (INSDC) has developed a standardised missing/null value reporting language to be used where a value of an expected format for sample metadata reporting cannot be provided. Submitters are strongly encouraged to always provide true values of expected formats. However, if missing/null value reporting is required, submitters are asked to use the term with the finest granularity for their reported situation. If appropriate, use a term in the "lower level"; if not, use a term in the "top level".

    To facilitate an understanding of the supported terms we enclose a table with the missing/null value terms and their definitions.

    Please use the following standardised missing value vocabulary only if a true value of an expected format for a mandatory field is missing. If a true value is missing for a recommended or an optional field then these fields should not be used for reporting at all.

    INSDC missing value reporting terms

    INSDC term (top level) | INSDC term (lower level) | Definition
    not applicable | | information is inappropriate to report, can indicate that the standard itself fails to model or represent the information appropriately
    missing | not collected | information of an expected format was not given because it has not been collected
    missing | not provided | information of an expected format was not given, a value may be given at the later stage
    missing | restricted access | information exists but can not be released openly because of privacy concerns
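
    The vocabulary above is small enough to check mechanically. The sketch below copies the terms from the table and tests whether a cell value is one of them. The names INSDC_MISSING_TERMS, is_missing_value_term and prefers_finer_term are hypothetical, and exact matching after trimming whitespace is an assumption rather than a documented rule.

```python
# Terms copied from the table above: top-level term -> lower-level terms.
INSDC_MISSING_TERMS = {
    "not applicable": [],
    "missing": ["not collected", "not provided", "restricted access"],
}


def is_missing_value_term(value):
    """Return True if value is one of the INSDC missing/null reporting terms."""
    term = value.strip()
    for top_level, lower_levels in INSDC_MISSING_TERMS.items():
        if term == top_level or term in lower_levels:
            return True
    return False


def prefers_finer_term(value):
    """Return True for a top-level term that has lower-level alternatives,
    as a reminder to use the finest-grained term that fits."""
    return bool(INSDC_MISSING_TERMS.get(value.strip()))


print(is_missing_value_term("not collected"))  # True
print(prefers_finer_term("missing"))           # True
```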

    BioSample attribute list. User-defined attributes can be added in the rightmost columns.

    Enter attributes in one line for one sample

    *sample_name | *sample_title | description | *organism | *taxonomy_id
    NBRC 100918 | S.albus | …… | Streptacidiphilus albus | 105425
    NBRC 100919 | S.carbonis | …… | Streptacidiphilus carbonis | 105422

    Real-world examples of sample attributes

    BioSample accession | Sample type | Sample attributes
    SAMD00018424 | MIGS.ba | text
    Enter sample attributes by using spreadsheet software

    A single submission can contain multiple samples, entered one sample per line in the tab-delimited sample attribute text file.

    Check the content in the final "OVERVIEW" tab and submit the samples. In the "ATTRIBUTES" area, the submitted sample attribute file can be downloaded.

    Submit BioSample

    In the "ATTRIBUTES" area, the updated attribute file is available.

    Accession numbers

    A temporary ID starting with SSUB is automatically assigned to a submitted BioSample. Until an official accession number is issued, the submitted sample is referenced by this ID. After the review process, DDBJ BioSample issues accession numbers with the prefix SAMD to the completed data. You can view the status and accession numbers of submitted samples in your submission account.

    • Do NOT cite a temporary ID starting with SSUB in references.
    • Do not submit samples that have already been registered at EBI or NCBI.
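
    To guard against citing a temporary ID by mistake, the two prefixes named above can be told apart with a simple check. This is a sketch only: the function name classify_biosample_id is hypothetical, just the prefix is inspected (the rest of the ID format is not validated), and "SSUB-example" is a made-up placeholder rather than a real ID.

```python
def classify_biosample_id(identifier):
    """Classify an ID by the prefixes named above; only the prefix is checked."""
    if identifier.startswith("SAMD"):
        return "accession"  # official accession number, suitable for citation
    if identifier.startswith("SSUB"):
        return "temporary"  # temporary submission ID, not to be cited
    return "unknown"


print(classify_biosample_id("SAMD00018424"))  # accession
print(classify_biosample_id("SSUB-example"))  # temporary
```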

    Release of BioSample

    You can select the following options:

    • Release immediately following curation
    • Release when referenced data is published

    A hold date cannot be set for BioSample.

    The submitted sample data can be kept private. Sample data are automatically released when a linked DDBJ record is published. The release of a BioSample record does not trigger the release of private DDBJ sequence record(s) referencing this BioSample accession. However, the release of a BioSample record does trigger the release of the referencing BioProject.
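
    The direction of these release triggers is easy to mix up, so the following sketch models them with two small classes. It is a simplified illustration of the rules stated above, not a description of how DDBJ implements them; the class names, function names and placeholder accessions ("PROJECT-1", "SEQ-1", "SAMPLE-1") are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """A BioProject or sequence record with a public/private flag."""
    accession: str
    public: bool = False


@dataclass
class BioSample:
    accession: str
    bioproject: Optional[Record] = None
    sequence_records: List[Record] = field(default_factory=list)
    public: bool = False


def release_biosample(sample):
    """Release a BioSample and its referencing BioProject.

    Private sequence records that reference the sample are left untouched.
    """
    sample.public = True
    if sample.bioproject is not None:
        sample.bioproject.public = True


def release_sequence_record(record, sample):
    """Publishing a linked sequence record also releases the BioSample."""
    record.public = True
    release_biosample(sample)


project = Record("PROJECT-1")
sequence = Record("SEQ-1")
sample = BioSample("SAMPLE-1", bioproject=project, sequence_records=[sequence])

release_biosample(sample)
print(sample.public, project.public, sequence.public)  # True True False
```

    Calling release_sequence_record(sequence, sample) instead would make all three records public, since publishing the sequence record releases the sample, which in turn releases the project.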

    FAQ: How are linked BioProject/BioSample/sequence data released?

    Update BioSample

    It is possible to update data after registration. Please contact us through the Message form.