BioSample Handbook 20150408

Created: March 6, 2015; Last updated: May 31, 2016

    BioSample

    Overview

    Purpose

    The BioSample database was developed to serve as a central location in which to store descriptive information about the biological samples used to generate experimental data in any of the primary data archives.

    The following figure depicts how BioSample records are organized and linked with other objects. This example is composed of one umbrella project that encompasses three subprojects, each of which generated data derived from two BioSample records. Users can query either the BioProject or the BioSample database to retrieve the relevant records, and then navigate through links to the corresponding experimental data, which continue to be stored in DDBJ's primary data archives: DDBJ, DRA and DOR.

    Overview of BioSample and BioProject integration with other DDBJ databases

    Sample

    Given the huge diversity of sample types handled by archival databases, and the fact that appropriate sample descriptions are often dependent on the context of the study, the definition of what a BioSample represents is deliberately flexible. Typical examples of a BioSample include a cell line, a primary tissue biopsy, an individual organism or an environmental isolate.

    Information about the sample will include:
    • Species
    • The material sampled, e.g., organs, tissues, cell type
    • Phenotypic information including disease states and clinical information about the individual

    The information about human subjects and access to it will be compliant with all relevant ethical requirements. The DDBJ BioSample database does not support controlled access mechanisms and thus cannot host human clinical samples that may have associated privacy concerns.

    Reference biosamples

    A particular set of biosamples submitted directly to the BioSample databases may subsequently be referenced from many experiments. We refer to this set of samples as reference biosamples. Examples include commonly used cell lines or mouse strains. The BioSample database pre-registers commonly used samples and makes it easy to reference them from other INSDC databases. Sources of reference biosamples include ATCC and Coriell.

    Sample attributes

    A major component of a BioSample record is the sample attributes section. Attributes define the material under investigation and can include sample characteristics such as cell type, collection site and phenotypic information like disease state.

    BioSample attributes are captured as structured name:value pairs, for example, tissue:liver.
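
    As a minimal sketch of the name:value idea, the snippet below splits attribute strings at the first colon and collects them into a dictionary. The function names and the second example pair (cell_type: hepatocyte) are hypothetical illustrations, not part of any DDBJ tool or record.

```python
def parse_attribute(pair):
    """Split a 'name:value' string at the first colon (illustrative helper)."""
    name, sep, value = pair.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"not a name:value pair: {pair!r}")
    return name.strip(), value.strip()


def parse_attributes(pairs):
    """Collect several name:value strings into one dictionary."""
    return dict(parse_attribute(p) for p in pairs)


# "tissue:liver" comes from the handbook; the second pair is a made-up example.
attributes = parse_attributes(["tissue:liver", "cell_type: hepatocyte"])
print(attributes)
```

    Splitting at the first colon only means that a value may itself contain a colon without being truncated.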

    The database supports and encourages use of dictionaries of attribute names.

    The first targeted dictionaries implemented in the DDBJ BioSample database are the MIxS minimum information checklists for standardizing descriptions of genomes, metagenomes and targeted locus sequences, as recently developed by the Genomic Standards Consortium.

    For the MIxS checklists, please see Nature Biotechnology 29, 415–420 (2011) | doi: 10.1038/nbt.1823 (PMID: 21552244).

    MIxS check list

    Organism

    For the organism name to enter in the BioSample organism attribute, see the "Organism name" page. Previously, a strain name or some other lower taxon was required in the organism name for whole genome sequences, mainly those of microorganisms. Currently, the value of the organism qualifier should, in principle, be just the scientific name, even for microbial genomes. Please enter the strain name in the strain attribute of the BioSample.

    Related news: Changes in organism strain information management

    XML schema

    The BioSample XML schema may be modified in the future.

    biosample.xsd

    Metadata

    Required*
    Conditionally required*

    Submitter

    Submitter

    Contact information of the submitter(s). Questions and notifications about a submission are sent to the e-mail address(es) listed here. Personal contact information is considered confidential and is collected for use by DDBJ staff should questions arise; the general information about the research center is used for public display.

    First name*
    Submitter's first name.
    Last name*
    Submitter's last name.
    E-mail*
    E-mail address. Enter an address from the organization's domain.

    Organization

    Organization
    Organization to which a contact person belongs.
    Submitting organization*
    Full name of organization.
    Submitting organization URL
    URL of the submitter's organization.

    Data Release

    Select "Hold" or "Release". You cannot specify a hold date. Please see Data Release for details of the release mechanism.

    Release
    The submitted BioSample record is released immediately after the curation process finishes.
    Hold
    The submitted BioSample record is released when the DDBJ, DRA and DTA record(s) referencing this BioSample ID are released. Private DDBJ record(s) referencing this BioSample ID are not released.
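
    The two options can be sketched as a small decision function. This is only an illustration of the rule described above, not DDBJ's implementation: the class and field names are hypothetical, and it assumes that a held sample becomes public as soon as at least one referencing record is public and that curation must finish in both cases.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReleaseOption(Enum):
    RELEASE = "Release"
    HOLD = "Hold"


@dataclass
class BioSampleSubmission:
    """Hypothetical model of a submitted BioSample record."""
    option: ReleaseOption
    curated: bool = False
    # One flag per DDBJ, DRA or DTA record referencing this sample (True = public).
    referencing_records_public: list = field(default_factory=list)


def is_released(submission):
    # Assumption: nothing is released before curation finishes.
    if not submission.curated:
        return False
    if submission.option is ReleaseOption.RELEASE:
        return True
    # Assumption: "Hold" ends when any referencing record has been released.
    return any(submission.referencing_records_public)


held = BioSampleSubmission(ReleaseOption.HOLD, curated=True,
                           referencing_records_public=[False, False])
print(is_released(held))  # all referencing records are still private
held.referencing_records_public[0] = True
print(is_released(held))  # one referencing record is now public
```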

    General info

    External Links
    A URL may be provided, with a label for the resource, to reference a resource that is directly relevant to the submitted sample.
    Link description
    Display name of the website related to this sample.
    URL
    URL of the web site.

    Sample type

    Core Package

    Genome, metagenome or marker sequences (MIxS compliant)
    Use for genomes, metagenomes, and marker sequences. These samples include specific attributes that have been defined by the Genomic Standards Consortium (GSC) to formally describe and standardize sample metadata for genomes, metagenomes, and marker sequences. The samples are validated for compliance based on the presence of the required core attributes as described in MIxS. For details, please see the GSC websites.
    Other samples (e.g. transcriptome, epigenetics etc)
    Use for any other sample type (e.g., transcriptome or epigenetics). These samples are described using common core attributes and submitter-supplied custom attributes.

    MIxS

    (Meta)Genomic Sequences Sample (MIMS)
    Environmental/Metagenome Genomic Sequences
    Please refer to environmental samples.
    Genomic Sequences Sample (MIGS)
    Cultured Bacterial/Archaeal Genomic Sequences
    Eukaryotic Genomic Sequences
    Viral Genomic Sequences

    Environmental samples do not include endosymbionts that can be reliably recovered from a particular host, organisms from a readily identifiable but uncultured field sample (e.g., many cyanobacteria), or phytoplasmas that can be reliably recovered from diseased plants (even though these cannot be grown in axenic culture). For such samples, select "Cultured Bacterial/Archaeal", "Eukaryotic" or "Viral" instead.

    Marker Sequences Sample (MIMARKS)
    Specimen Marker Sequences
    Survey related Marker Sequences

    MIMARKS specimen: for marker gene (e.g., COI) sequences obtained from any material identifiable by means of specimens

    MIMARKS-specimen applies to the contextual data for marker gene sequences from cultured or voucher-identifiable specimens.

    MIMARKS survey: for uncultured diversity marker gene (e.g., 16S rRNA, 18S rRNA, nif, amoA, rpo) surveys

    MIMARKS-survey is applicable to contextual data for marker gene sequences, obtained directly from the environment, without culturing or identification of the organisms.

    Environmental package

    Environmental package (MIxS Sample)
    No package
    air
    host-associated
    human-associated
    human-gut
    human-oral
    human-skin
    human-vaginal
    microbial mat/biofilm
    miscellaneous or artificial
    plant-associated
    sediment
    soil
    wastewater/sludge
    water

    Attributes

    Sample attributes
    List of attributes.
    Download the BioSample worksheet, which has been customised to fit the models. This is a tab-delimited text file that may be opened with a spreadsheet program or a text editor.
    Attributes
    A list of attributes and their definitions can be viewed here. Besides the mandatory fields, there are several optional attribute fields. To make the BioSample record as useful as possible, include all available information in the submission. Commonly used and useful attributes have been defined with standardized nomenclature. When preparing your submission, please refer to this attributes list and fill in the relevant fields. If you have information of a type that does not appear in the standard list, you can add it as a Custom Attribute.

    Publications

    PubMed ID
    Provide a PubMed ID for any publications directly related to all samples in the submission. How do I add reference information?

    DOI

    DOI
    Provide a DOI if a PubMed ID is not available, together with the additional reference information below.
    Reference title*
    Journal title*
    Year*
    Volume*
    Issue*
    Pages from*
    Pages to*
    First name*
    MI
    Middle initial.
    Last name*
    Suffix
    This publication has multiple authors
    If this is checked, "et al." is added to the author name provided above.

    Comments

    Private comments to DDBJ staff
    Use this field if you have questions for database support staff. The content is not made public.

    Submission to BioSample

    Data submission of human subjects research
    For all data from human subjects research submitted to DDBJ, it is the submitter's responsibility to ensure that the privacy of participants (human subjects) is protected in accordance with all applicable laws, regulations and policies of the submitter's institution.
    In principle, make sure to remove any direct personal identifiers of human subjects from your submissions.
    Before submitting data from human subjects research, read the "Data submission of human subjects research".

    Submission to BioSample

    Create a new sample submission

    Obtain a submission account according to the Account Handbook.

    Go to the BioSample submission page from the “Biosample” menu at the top. Create a new sample submission by clicking the [Submit new sample] button.

    Create a new sample submission

    To submit a BioSample, fill in the tabs from left to right.

    For details of the metadata, please see the BioSample metadata.

    Submit new samples

    Select a sample type in the "SAMPLE TYPE" tab. For genome samples, the minimum sample attributes are defined by MIxS.

    For the Sample type, please see the BioSample Handbook.

    Select a sample type

    Download the template text file for the selected sample type and enter the sample attributes in it.

    The main step of a sample submission is to describe the samples using required, optional and user-defined attributes.

    Download a text file for entering sample attributes

    The text file is tab-delimited and can be opened and edited in a spreadsheet editor (e.g., Excel®). Attribute names are in the header line. Attributes marked with "*" are required.

    From the second line onward, enter one sample per line. For a project that does not yet have a PRJD accession number, enter the PSUB submission ID in bioproject_id. For attributes without measured values, enter "missing" or "not applicable".

    BioSample attribute list. User-defined attributes can be added in the rightmost columns.

    Enter attributes in one line for one sample

    *sample_name    *sample_title    description    *organism                     *taxonomy_id
    NBRC 100918     S.albus          ……             Streptacidiphilus albus       105425
    NBRC 100919     S.carbonis       ……             Streptacidiphilus carbonis    105422
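
    Before uploading, a file of this shape can be checked locally. The sketch below reads tab-delimited text, treats header names starting with "*" as required, and reports required cells that were left empty. The function names are hypothetical, the description column is left blank, and the second row deliberately omits taxonomy_id to show what a report looks like; DDBJ's own validation may apply different or additional rules.

```python
import csv
import io


def read_samples(tsv_text):
    """Return (required attribute names, [(line number, {attribute: value})])."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    names = [h.lstrip("*") for h in header]
    required = {h.lstrip("*") for h in header if h.startswith("*")}
    samples = []
    for row in reader:
        if any(cell.strip() for cell in row):  # skip completely blank lines
            samples.append((reader.line_num, dict(zip(names, row))))
    return required, samples


def find_empty_required(required, samples):
    """List (line number, attribute) for every required cell left empty."""
    problems = []
    for line_no, sample in samples:
        for name in sorted(required):
            if not sample.get(name, "").strip():
                problems.append((line_no, name))
    return problems


EXAMPLE_TSV = (
    "*sample_name\t*sample_title\tdescription\t*organism\t*taxonomy_id\n"
    "NBRC 100918\tS.albus\t\tStreptacidiphilus albus\t105425\n"
    "NBRC 100919\tS.carbonis\t\tStreptacidiphilus carbonis\t\n"  # taxonomy_id omitted on purpose
)

required, samples = read_samples(EXAMPLE_TSV)
for line_no, name in find_empty_required(required, samples):
    print(f"line {line_no}: required attribute '{name}' is empty")
```

    An attribute with no measured value should contain "missing" or "not applicable" rather than an empty cell, so this check treats only truly empty cells as problems.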

    Real-world examples of sample attributes

    BioSample accession    Sample type    Sample attributes
    SAMD00018424           MIGS.ba        text

    Enter sample attributes by using spreadsheet software

    A single submission can contain multiple samples, entered one sample per line in the tab-delimited sample attributes file.

    Check the content in the final "OVERVIEW" tab and submit the samples. The submitted sample attribute file can be downloaded from the "ATTRIBUTES" area.

    Submit BioSample

    The updated attribute file is available in the "ATTRIBUTES" area.

    Accession numbers

    A temporary ID starting with SSUB is automatically assigned to a submitted BioSample. Until an official accession number is issued, the submitted sample is referenced by this ID. After the review process, the DDBJ BioSample database issues accession numbers with the prefix SAMD to the completed data. You can view the status and accession numbers of submitted samples in your submission account.

    • Do NOT cite a temporary ID starting with SSUB in references.
    • Do not resubmit samples that have already been registered with EBI or NCBI.
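
    A manuscript or pipeline check can tell the two kinds of identifier apart by prefix. The sketch below is illustrative only: it assumes that each prefix is followed by digits and nothing else, the function names are hypothetical, and SSUB000001 is a made-up temporary ID.

```python
import re

# Assumption: both identifier types are a fixed prefix followed only by digits.
TEMPORARY_ID = re.compile(r"^SSUB\d+$")
ACCESSION = re.compile(r"^SAMD\d+$")


def classify_biosample_id(identifier):
    """Return 'accession', 'temporary' or 'unknown' for a BioSample identifier."""
    identifier = identifier.strip()
    if ACCESSION.match(identifier):
        return "accession"
    if TEMPORARY_ID.match(identifier):
        return "temporary"
    return "unknown"


def is_citable(identifier):
    """Only official SAMD accession numbers should be cited."""
    return classify_biosample_id(identifier) == "accession"


for candidate in ["SAMD00018424", "SSUB000001"]:
    print(candidate, classify_biosample_id(candidate), is_citable(candidate))
```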

    Release of BioSample

    You can select the following options:

    • Release immediately following curation
    • Release when referenced data is published

    A hold date cannot be set for a BioSample.

    The submitted sample data can be kept private. Sample data are automatically released when the linked DDBJ record(s) are published. Private DDBJ record(s) referencing this BioSample ID are not released.

    FAQ: How are linked BioProject/BioSample/sequence data released?

    Update BioSample

    Data other than the sample name can be updated after registration. Please contact us through the Message form.