JGA Submission Handbook

Created: October 28, 2014; Last updated: November 15, 2017

    JGA overview

    The DNA Data Bank of Japan (DDBJ) Center operates the Japanese Genotype-phenotype Archive (JGA) for human genotype and phenotype data in collaboration with the National Bioscience Database Center (NBDC).

    • JGA is a controlled-access database, unlike the unrestricted-access INSDC databases.
    • The JGA submission account system is separate from the D-way account system used for the DDBJ Center's other, unrestricted-access databases.
    • Approval from NBDC is required to submit data to the JGA.
    • NBDC sends the login details for the JGA submission account.

    This page explains how to submit data to the JGA.
    For an overview of the JGA, see the slide below (Japanese only).

    Create metadata using Excel

    Enter metadata into the Excel file

    New options were added following the update to JGA xsd 1.0.8.

    Fields for phenotype and disease were added to the metadata Excel file.

    Attribute fields were added to the Study, and "Design Description" was added to the Experiment.

    The Subject ID (anonymized individual ID) and Gender (sex) were made mandatory.

    Download the metadata Excel file and complete it in English. For details of the JGA metadata, see this page.

    JGA metadata Excel file

    last updated: 2017-01-25

    The Excel file name must end with "_metadata.xlsx". The Submission ID and NBDC hum number can be freely added before "_metadata".
    Do NOT include any whitespace in the names of files to be uploaded to JGA.
    When possible, concatenate multiple data files in a Data/Analysis object to reduce the number of files per object and make downloads smoother.
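
    The two naming rules above can be checked before uploading. The following is a minimal sketch only: the function name check_metadata_name is hypothetical and is not part of the JGA tools.

```shell
# Hypothetical helper, not part of the JGA tools.
# Checks a metadata file name against the two naming rules stated above.
check_metadata_name() {
    name=$1
    # Rule: no whitespace anywhere in the file name.
    case $name in
        *[[:space:]]*)
            echo "contains whitespace"
            return 1
            ;;
    esac
    # Rule: the name must end with "_metadata.xlsx".
    case $name in
        *_metadata.xlsx)
            echo "ok"
            return 0
            ;;
        *)
            echo "must end with _metadata.xlsx"
            return 1
            ;;
    esac
}
```

    For example, check_metadata_name JGA_example-0003_metadata.xlsx prints "ok" and returns success, while a name containing a space is rejected.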

    JGA submission tool

    Download the latest JGA submission tool (last updated: 2017-11-16, v3.4.0) and run it by executing JGATool.bat.

    Run the tool with Java 8; it does not work with Java 7. For proxy settings, see "How to use in the proxy environment".


    JGA submission tool (Windows)

    Double-click the bat file in the extracted files to run the tool.

    Java Runtime Environment Version 8 Update 45 or newer is required.


    JGA submission tool (Unix)

    Run the sh file in the extracted files from a shell.

    Java SE Development Kit 8u45 or newer is required. The tool does not work with OpenJDK.
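
    The Java requirement above can be checked before starting the tool. The following is a minimal sketch: the function name java8_update_ok is hypothetical and not part of the JGA tools, and it assumes a Java 8 version string of the form 1.8.0_NN. Whether later major versions work is not stated here, so the sketch rejects them.

```shell
# Hypothetical helper, not part of the JGA tools.
# Succeeds only for Java 8 version strings of the form 1.8.0_NN with NN >= 45.
java8_update_ok() {
    case $1 in
        1.8.0_*) update=${1#1.8.0_} ;;   # keep the part after "1.8.0_"
        *) return 1 ;;                   # not a Java 8 version string
    esac
    update=${update%%[!0-9]*}            # drop any suffix such as "-b12"
    [ -n "$update" ] && [ "$update" -ge 45 ]
}
```

    The version string has to be taken from the output of "java -version"; how it is extracted depends on the output format of the installed Java, so that step is left out of the sketch.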

    Upload the Excel file

    Start the JGA tool and log in with the login ID and password sent by NBDC.

    Log in to the JGA tool

    The left window shows your local computer and the right window shows the secure JGA server.

    In the right window, select the JGA submission ID (for example, example-0003) from the Submission ID pulldown menu. In the left window, select the metadata Excel file (for example, JGA_example-0003_metadata.xlsx) and click "Encrypt & Upload".

    Select the submission and the metadata Excel file.

    The Excel file is uploaded securely to the JGA server. Ignore the error messages in the bottom window.

    After uploading the Excel file, contact the JGA team.

    Uploaded Excel file
    Do NOT send the metadata Excel file by e-mail.

    Download Excel/XML files

    Users can download Excel files whose names end with "_metadata.xlsx", as well as XML files, by using the tool.

    Right-click the Excel file (for example, JGA_example-0003_r1_metadata.xlsx) and select "Download" from the menu. The selected Excel file is downloaded to your computer.

    Download the Excel file

    Right-click the XML file (for example, example-0003_Data.xml) and select "Download" from the menu. XML files are downloaded to your computer one at a time.

    Download XML

    Upload data files

    Data file format

    The JGA submission system determines file, archive, and compression formats from the file extensions.

    • The extensions zip, tar, tar.gz, tgz, tar.bz2, tbz2, gz, and bz2 are treated as the standard archive and compression formats. Files in any other archive or compression format cause errors.
    • Do not compress bam files.
    • Do not tar-archive files that are already compressed with gz or bzip2; instead, create a tar.gz of the uncompressed files.
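
    Because formats are judged by extension, file names can be screened locally before upload. The sketch below is illustrative only: classify_data_file is a hypothetical name, and the "unsupported" extensions (xz, 7z, rar, Z) are merely examples of formats outside the accepted list, not a list published by JGA. It cannot detect a tar archive that contains already-compressed files.

```shell
# Hypothetical helper, not part of the JGA tools.
# Classifies a file name by extension according to the rules above.
classify_data_file() {
    case $1 in
        *.bam.gz|*.bam.bz2|*.bam.zip)
            echo "compressed-bam"   # rule: bam files must not be compressed
            ;;
        *.tar.gz|*.tgz|*.tar.bz2|*.tbz2|*.tar|*.zip|*.gz|*.bz2)
            echo "recognized"       # one of the standard extensions listed above
            ;;
        *.xz|*.7z|*.rar|*.Z)
            echo "unsupported"      # example formats outside the list (assumption)
            ;;
        *)
            echo "plain"            # no archive or compression extension
            ;;
    esac
}
```

    For example, classify_data_file sample.bam.gz prints "compressed-bam", which signals that the bam file should be uploaded uncompressed instead.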

    File formats for submission

    Register individual-level next-generation sequencing (NGS) fastq and bam files in the Data object, and microarray, variation, and questionnaire files (non-NGS data) in the Analysis object.

    It is important that the processed data on which the conclusions of the related manuscript are based are registered in the Analysis object.

    metadata XML file

    Select the downloaded XML and data files and upload them to the target submission by using the tool.

    Upload the metadata XML and data files

    Validation of submitted files

    Submitted metadata and data files are validated, and the data files are uploaded in encrypted form.

    If all validation steps succeed, "[INFO] upload succeeded" is displayed in the bottom window. The JGA accessions will be issued after review.

    When an error message is shown, contact the JGA team.

    Validated metadata XML and data files

    How to select files

    Users can select multiple files in the left window.

    Range selection

    Select the first file name, then select the last file name while holding Shift; all files in the range are selected. Right-click and choose "check (selected item)" to check the selected files.

    Range select the files

    How to select distinct files

    Select the individual files while holding Control, then choose "check (selected item)" to check the selected files.

    Check the selected files

    Select sub-directory

    All files in the directory are selected by clicking the folder checkbox.

    Check the sub-directory

    How to use in the proxy environment

    To use the tool behind a proxy, users need to edit a configuration file.

    Edit "proxy.properties" in the tool folder and enter the proxy server name (server=) and port number (port=).

    # Enter the server name and port number of the proxy server
    # to connect to the JGA server via the proxy.
    # For example:
    # server=proxy.example.ac.jp
    # port=8080

    When the proxy server requires authentication, enter the credentials in the window shown after logging in to the tool.

    Version 3.2.0 can handle BASIC authentication but cannot handle Digest authentication.

    Send the data files on a hard disk

    When uploading with the JGA tool would take too long because of the number or size of the files, we accept the files on hard disk drives.

    The disk should be formatted as NTFS, ext3 or ext4.
    Scan the entire disk with anti-virus software.

    Encrypt data files

    Encrypt the data files with the JGA data encryption tool and copy them to the hard disk. Upload the XML files with the JGA submission tool; do not include them on the disk.

    JGA data encryption tool

    last updated: 2015-12-09

    Encrypt the files one by one. Do NOT encrypt tar archives or directories.

    Operating environment

    • Free disk space at least as large as the total size of the target files is necessary.
    • Operation has been confirmed on CentOS 6.4.
    • Java Runtime Environment Version 8 Update 45 or newer is necessary.

    Extract "jga-data-encrypt.tar.gz" with the tar command. The following file and directory are extracted. Do NOT change the directory structure.

    jga-data-encrypt.sh -> shell script
    jar/ -> directory for execution files (do NOT change this directory)

    Move to the tool directory and execute the command in the following form.
    sh jga-data-encrypt.sh -t [target file] -o [output directory path]

    $ sh jga-data-encrypt.sh -t target.fastq -o output

    Command options

    -t --target
    Specify the path to the target file.
    The tool encrypts one file per run. Multiple files cannot be specified with a wildcard.
    Use a shell script to encrypt multiple files.

    -o --output
    Path to the directory in which the encrypted file, the encryption key and the MD5 file are written.
    If the directory does not exist, it is created automatically.
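
    Since the tool takes one target per run, a small loop can feed it several files. The following is a minimal sketch under stated assumptions: the function name encrypt_each and the variable JGA_ENCRYPT_SH are hypothetical, and the default assumes the function is called from the tool directory, as instructed above. Only the -t and -o options described here are used.

```shell
# Hypothetical wrapper, not part of the JGA tools.
# Usage: encrypt_each OUTPUT_DIR FILE [FILE...]
# JGA_ENCRYPT_SH (assumed name) may point at jga-data-encrypt.sh;
# by default the script is looked up in the current directory.
encrypt_each() {
    out_dir=$1
    shift
    for target in "$@"; do
        # The tool accepts regular files only, so skip anything else.
        if [ ! -f "$target" ]; then
            echo "skip (not a regular file): $target" >&2
            continue
        fi
        # One run of the tool per target file; stop at the first failure.
        sh "${JGA_ENCRYPT_SH:-jga-data-encrypt.sh}" -t "$target" -o "$out_dir" || return 1
    done
}
```

    Calling encrypt_each output *.fastq lets the shell expand the wildcard first, so the tool itself still receives a single file on each run.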

    Output files

    In the output directory, the following three files are generated for each target file.

    1. Encrypted file (.encrypt)
    The encrypted file name is [file name before encryption].encrypt (for example, file1.fastq becomes file1.fastq.encrypt).

    2. Key file (.encrypt.dat)
    The key file used for encryption. One key file is generated per target file. The key file is encrypted with the public key. The file name is [encrypted target file name].dat (for example, the key file for file1.fastq is file1.fastq.encrypt.dat).

    3. MD5 file before and after encryption (.md5)
    The file recording the MD5 checksum values before and after encryption. One MD5 file is generated per target file. The file name is [unencrypted file name].md5 (for example, the MD5 file for file1.fastq is file1.fastq.md5).
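
    The naming scheme above makes it easy to confirm that all three outputs exist for a target before copying them to the disk. This is a sketch only: check_encrypt_outputs is a hypothetical name, and the check tests for the presence of the files, not their contents, because the layout of the MD5 file is not described here.

```shell
# Hypothetical helper, not part of the JGA tools.
# Usage: check_encrypt_outputs OUTPUT_DIR TARGET_FILE
# Reports any of the three expected output files that is missing.
check_encrypt_outputs() {
    out_dir=$1
    base=$(basename "$2")
    missing=0
    for f in "$base.encrypt" "$base.encrypt.dat" "$base.md5"; do
        if [ ! -f "$out_dir/$f" ]; then
            echo "missing: $out_dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

    For example, check_encrypt_outputs output file1.fastq returns success only when file1.fastq.encrypt, file1.fastq.encrypt.dat and file1.fastq.md5 are all present in output.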

    Output messages

    The tool's messages are recorded in the log file ([server hostname].jga-data-encrypt.log in the tool directory) and shown on the screen. Normal messages are as follows.

    $ sh jga-data-encrypt.sh -t /home/hoge/file.txt -o /tmp/output
    START encrypt file ←start processing
    start encryption : /home/hoge/file.txt ←target filename
    encryption complete : /tmp/output/file.txt.encrypt ←output filename
    FINISH encrypt file ←processing finished

    Error messages

    Message                                               Meaning
    [code 11] encryption error : <target>                 An error occurred during the encryption process.
    [code 12] make md5 file error : <target>              An error occurred while obtaining the MD5 or writing it to the file.
    [code 13] output dir is not a directory : <target>    The path specified with -o is not a directory.
    [code 14] target is not a file : <target>             The path specified with -t is not a file.
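
    After encrypting many files, the log can be searched for the error codes listed above. The following is a minimal sketch: list_encrypt_errors is a hypothetical name, and it assumes only that each error line contains the "[code NN]" text shown in the table; the rest of the log line layout is not relied on.

```shell
# Hypothetical helper, not part of the JGA tools.
# Usage: list_encrypt_errors LOG_FILE
# Prints log lines containing error codes 11 to 14; returns non-zero if none.
list_encrypt_errors() {
    grep -E '\[code 1[1-4]\]' "$1"
}
```

    The log file is [server hostname].jga-data-encrypt.log in the tool directory; a non-zero return status from this function means no line with one of these codes was found.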

    Sending files

    For the JGA submission, three kinds of files are necessary: the encrypted data files, the key files and the MD5 files. Copy them to the hard disk.

    Do NOT copy the metadata to the disk; instead, upload the metadata by using the JGA submission tool.

    Copy the data to a USB hard disk and send the disk to the address below. Include a cash-on-delivery slip for the return shipment, filled in with the return address. We recommend labeling the disk.

    Kodama Yuichi
    1111 Yata, Mishima, Shizuoka 411-8540, Japan