MAGE-TAB example

IDF Example*

MAGE-TAB Version	1.1
Investigation Title	Homo sapiens TK6
Experimental Design	genetic_modification_design	time_series_design
Experimental Factor Name	Genetic Modification	Incubation Time
Experimental Factor Type	genetic_modification	timepoint
Experimental Factor Term Source REF	MGED Ontology	MGED Ontology
Experimental Factor Term Accession Number	MO_927	MO_738

Person Last Name	Mishima	Fuji
Person First Name	Hanako	Tarou
Person Email	hanako@ddbj.nig.ac.jp
Person Phone	+81-55-981-6853
Person Address	Yata 1111, Mishima, Shizuoka, Japan
Person Affiliation	National Institute of Genetics, DNA Data Bank of Japan
Person Roles	submitter;investigator	investigator
Person Roles Term Source REF	MGED Ontology	MGED Ontology

Quality Control Type	biological_replicate
Quality Control Term Source REF	MGED Ontology
Replicate Type	biological_replicate
Replicate Term Source REF	MGED Ontology
Date of Experiment	2005-02-28
Public Release Date	2006-01-03

PubMed ID	21062814
Publication Author List	Hanako Mishima and Tarou Fuji.
Publication Status	submitted
Experiment Description	Gene expression of TK6 cells transduced with an ... the Neomycin resistance gene (TK6neo).

Protocol Name	GROWTHPRTCL 10653	GROWTHPRTCL 10654
Protocol Type	grow	nucleic_acid_extraction
Protocol Description	TK6 cells were grown in suspension cultures in RPMI 1640 medium supplemented with...	Approximately 10^6 cells were lysed in RLT buffer (Qiagen). Total RNA was extracted...
Protocol Parameters	media;time	Extracted Product;Amplification
Protocol Term Source REF	MGED Ontology	MGED Ontology

SDRF File	e-dord-69.sdrf.txt

Term Source Name	MGED Ontology	NCI Thesaurus
Term Source File	http://mged.sourceforge.net/ontologies/MGEDontology.php	http://bioportal.bioontology.org/ontologies/NCIT
Term Source Version	1.3.0.1

A full listing of all supported IDF tags can be found in the IDF section.
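
As a rough illustration of how an IDF can be read programmatically, the sketch below assumes the file is tab-delimited text with one tag per row followed by one or more values. The function name parse_idf and the shortened sample text are hypothetical; only the tags and values are taken from the example above.

```python
def parse_idf(text):
    """Map each IDF tag to its list of values; blank rows are skipped."""
    idf = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, *values = line.split("\t")
        # Drop trailing empty cells, which spreadsheet exports often leave behind.
        while values and not values[-1].strip():
            values.pop()
        idf[tag.strip()] = [value.strip() for value in values]
    return idf


# Shortened sample built from the IDF example (assumed tab-delimited).
sample = (
    "MAGE-TAB Version\t1.1\n"
    "Experimental Factor Name\tGenetic Modification\tIncubation Time\n"
    "Experimental Factor Type\tgenetic_modification\ttimepoint\n"
    "\n"
    "Person Last Name\tMishima\tFuji\n"
    "Person Roles\tsubmitter;investigator\tinvestigator\n"
)

idf = parse_idf(sample)

# Values in the same column position belong together, e.g. factor name and type.
factors = dict(zip(idf["Experimental Factor Name"], idf["Experimental Factor Type"]))

# Several roles for one person are separated by semicolons within a cell.
roles = [cell.split(";") for cell in idf["Person Roles"]]

print(factors)
print(roles)
```

Keeping every tag as a list, even when it holds a single value, means that column position can be used to line up related tags such as Person Last Name and Person Roles.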

SDRF Conceptual Examples*

In the examples that follow, each complete path through an investigation design graph is represented by a row in the corresponding table. Column headings shown in blue in the original tables denote graph node identifier columns (the "Name" columns).

Example 1: Simple iterated design*

For extremely simple applications such as the example in Figure 1(a), a table may be as simple as shown in Figure 1(b), in which the protocol referenced by the identifier P-DORD-10 should include all the processing needed to get from the source sample to the final hybridization.

Figure 1(b) is an important use case, for example in the description of simple Affymetrix chip-based investigations. For a higher degree of granularity, where multiple protocols have been used in the processing of Sources through to Hybridizations, it is proposed that these protocols be given in order using repeated Protocol REF columns as shown in Figure 1(c).

(a) Investigation design graph
(b) Simple, unstructured representation of sample-hybridization relationships
Source Name Protocol REF Hybridization Name Array Data File Derived Array Data Matrix File
Source 1 P-DORD-10 Hybridization 1 Data1.CEL Matrix.txt
Source 2 P-DORD-10 Hybridization 2 Data2.CEL Matrix.txt
Source 3 P-DORD-10 Hybridization 3 Data3.CEL Matrix.txt
Source 4 P-DORD-10 Hybridization 4 Data4.CEL Matrix.txt
(c) Use of repeated protocol columns
Source Name Protocol REF Protocol REF Protocol REF Protocol REF Hybridization Name
Source 1 P-DORD-2 P-DORD-3 P-DORD-4 P-DORD-5 Hybridization 1
Source 2 P-DORD-2 P-DORD-3 P-DORD-4 P-DORD-5 Hybridization 2
Source 3 P-DORD-2 P-DORD-3 P-DORD-4 P-DORD-5 Hybridization 3
Source 4 P-DORD-2 P-DORD-3 P-DORD-4 P-DORD-5 Hybridization 4
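
The reading of repeated Protocol REF columns as an ordered list of protocols on a single graph edge can be sketched as follows. This is only an illustration: the function name sdrf_edges is hypothetical, and treating every column whose heading ends in "Name" as a graph node is a simplifying assumption (data file columns are ignored).

```python
def sdrf_edges(header, rows):
    """Return (from_node, to_node, protocols) triples for each graph edge."""
    edges = set()
    for row in rows:
        previous, protocols = None, []
        for column, cell in zip(header, row):
            if column == "Protocol REF":
                # Protocols are collected in the order in which they appear.
                protocols.append(cell)
            elif column.endswith(" Name"):
                if previous is not None:
                    edges.add((previous, cell, tuple(protocols)))
                previous, protocols = cell, []
    return edges


header = ["Source Name", "Protocol REF", "Protocol REF",
          "Protocol REF", "Protocol REF", "Hybridization Name"]
rows = [
    ["Source 1", "P-DORD-2", "P-DORD-3", "P-DORD-4", "P-DORD-5", "Hybridization 1"],
    ["Source 2", "P-DORD-2", "P-DORD-3", "P-DORD-4", "P-DORD-5", "Hybridization 2"],
]

edges = sdrf_edges(header, rows)
for edge in sorted(edges):
    print(edge)
```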

Example 2: Simple iterated design, all nodes given*

A more complex alternative for such simple investigations is to explicitly indicate the materials used and created during the investigation as nodes in the investigation design graph, as shown in Figure 2. Figure 2(b) describes a similar investigation to that given in Figure 1(b). The Protocol REF columns have been omitted for clarity.

(a) Investigation design graph
(b) All nodes encoded in table
Source Name Protocol REF Sample Name Extract Name Labeled Extract Name Hybridization Name Array Data File Derived Array Data Matrix File
Source 1 P-DORD-10 Sample 1 Extract 1 LabeledExtract 1 Hybridization 1 Data1.CEL Matrix.txt
Source 2 P-DORD-10 Sample 2 Extract 2 LabeledExtract 2 Hybridization 2 Data2.CEL Matrix.txt
Source 3 P-DORD-10 Sample 3 Extract 3 LabeledExtract 3 Hybridization 3 Data3.CEL Matrix.txt
Source 4 P-DORD-10 Sample 4 Extract 4 LabeledExtract 4 Hybridization 4 Data4.CEL Matrix.txt

Example 3: Iterated design incorporating technical replicates*

Use of technical replicates may be indicated by branching within the graph, as shown in Figure 3. To indicate that biological replicates were used in an investigation, the investigation design graph will typically be constructed as shown in Figure 1(a) or Figure 2(a), with additional sample annotation indicating the relationships between replicate samples.

(a) Investigation design graph
(b) SDRF representation
Source Name Sample Name Extract Name Labeled Extract Name Hybridization Name Array Data File Derived Array Data Matrix File
Source 1 Sample 1 Extract 1 LabeledExtract 1 Hybridization 1 Data1.CEL Matrix.txt
Source 1 Sample 1 Extract 2 LabeledExtract 2 Hybridization 2 Data2.CEL Matrix.txt
Source 2 Sample 2 Extract 3 LabeledExtract 3 Hybridization 3 Data3.CEL Matrix.txt
Source 2 Sample 2 Extract 4 LabeledExtract 4 Hybridization 4 Data4.CEL Matrix.txt
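
Branching of this kind can be detected mechanically. The sketch below assumes each row has already been split into cells and reports every node that leads to more than one node in the next column; the function name branch_points is hypothetical.

```python
from collections import defaultdict


def branch_points(header, rows):
    """Return nodes that have more than one distinct child in the next column."""
    children = defaultdict(set)
    for row in rows:
        for i in range(len(row) - 1):
            children[(header[i], row[i])].add(row[i + 1])
    return {node: sorted(kids) for node, kids in children.items() if len(kids) > 1}


header = ["Source Name", "Sample Name", "Extract Name",
          "Labeled Extract Name", "Hybridization Name"]
rows = [
    ["Source 1", "Sample 1", "Extract 1", "LabeledExtract 1", "Hybridization 1"],
    ["Source 1", "Sample 1", "Extract 2", "LabeledExtract 2", "Hybridization 2"],
    ["Source 2", "Sample 2", "Extract 3", "LabeledExtract 3", "Hybridization 3"],
    ["Source 2", "Sample 2", "Extract 4", "LabeledExtract 4", "Hybridization 4"],
]

branches = branch_points(header, rows)
print(branches)
```

With the rows of Figure 3(b), the branch occurs at the Sample nodes, each of which yields two Extracts.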

Example 4: Iterated design single channel, sample pooling*

Figure 4 illustrates an example of sample pooling. Graph nodes referring to the data files (Array Data File and Derived Array Data File) have been omitted in all subsequent examples for the sake of clarity.

(a) Investigation design graph
(b) SDRF representation
Source Name Sample Name Extract Name Labeled Extract Name Hybridization Name
Source 1a Sample 1 Extract 1 LabeledExtract 1 Hybridization 1
Source 1b Sample 1 Extract 1 LabeledExtract 1 Hybridization 1
Source 2a Sample 2 Extract 2 LabeledExtract 2 Hybridization 2
Source 2b Sample 2 Extract 2 LabeledExtract 2 Hybridization 2
Source 3a Sample 3 Extract 3 LabeledExtract 3 Hybridization 3
Source 3b Sample 3 Extract 3 LabeledExtract 3 Hybridization 3
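
Pooling is the mirror image of branching: several nodes converge on one node in the next column. A minimal sketch, assuming rows already split into cells and using the hypothetical function name pooled_nodes:

```python
from collections import defaultdict


def pooled_nodes(header, rows):
    """Return nodes that have more than one distinct parent in the previous column."""
    parents = defaultdict(set)
    for row in rows:
        for i in range(1, len(row)):
            parents[(header[i], row[i])].add(row[i - 1])
    return {node: sorted(found) for node, found in parents.items() if len(found) > 1}


header = ["Source Name", "Sample Name", "Extract Name",
          "Labeled Extract Name", "Hybridization Name"]
rows = [
    ["Source 1a", "Sample 1", "Extract 1", "LabeledExtract 1", "Hybridization 1"],
    ["Source 1b", "Sample 1", "Extract 1", "LabeledExtract 1", "Hybridization 1"],
    ["Source 2a", "Sample 2", "Extract 2", "LabeledExtract 2", "Hybridization 2"],
    ["Source 2b", "Sample 2", "Extract 2", "LabeledExtract 2", "Hybridization 2"],
]

pools = pooled_nodes(header, rows)
print(pools)
```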

Example 5: Iterated design, dual channel*

(a) Investigation design graph
(b) SDRF representation
Source Name Sample Name Extract Name Labeled Extract Name Label Hybridization Name
Source 1a Sample 1a Extract 1a LabeledExtract 1a Cy3 Cy3 Hybridization 1
Source 1b Sample 1b Extract 1b LabeledExtract 1b Cy5 Cy5 Hybridization 1
Source 2a Sample 2a Extract 2a LabeledExtract 2a Cy3 Cy3 Hybridization 2
Source 2b Sample 2b Extract 2b LabeledExtract 2b Cy5 Cy5 Hybridization 2
Source 3a Sample 3a Extract 3a LabeledExtract 3a Cy3 Cy3 Hybridization 3
Source 3b Sample 3b Extract 3b LabeledExtract 3b Cy5 Cy5 Hybridization 3

Example 6: Iterated design with common reference.*

(a) Investigation design graph
(b) SDRF representation
Source Name Sample Name Extract Name Labeled Extract Name Label Hybridization Name
Source 1 Sample 1 Extract 1 LabeledExtract 1 Cy3 Cy3 Hybridization 1
Source 2 Sample 2 Extract 2 LabeledExtract 2 Cy3 Cy3 Hybridization 2
Source 3 Sample 3 Extract 3 LabeledExtract 3 Cy3 Cy3 Hybridization 3
Source 4 Sample 4 Extract 4 LabeledExtract 4 Cy3 Cy3 Hybridization 4
Reference Reference Ref. Extract Reference LE Cy5 Cy5 Hybridization 1
Reference Reference Ref. Extract Reference LE Cy5 Cy5 Hybridization 2
Reference Reference Ref. Extract Reference LE Cy5 Cy5 Hybridization 3
Reference Reference Ref. Extract Reference LE Cy5 Cy5 Hybridization 4
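
In a dual-channel design, the rows that share a Hybridization Name describe the channels of one array. The sketch below groups rows by hybridization and label. It assumes the rows have been read into dictionaries keyed by column heading, and it uses only the Source Name, Label and Hybridization Name columns; the function name channels_by_hybridization is hypothetical.

```python
from collections import defaultdict


def channels_by_hybridization(rows):
    """Map each hybridization to a {label: source} dictionary."""
    channels = defaultdict(dict)
    for row in rows:
        channels[row["Hybridization Name"]][row["Label"]] = row["Source Name"]
    return dict(channels)


rows = [
    {"Source Name": "Source 1", "Label": "Cy3", "Hybridization Name": "Hybridization 1"},
    {"Source Name": "Source 2", "Label": "Cy3", "Hybridization Name": "Hybridization 2"},
    {"Source Name": "Reference", "Label": "Cy5", "Hybridization Name": "Hybridization 1"},
    {"Source Name": "Reference", "Label": "Cy5", "Hybridization Name": "Hybridization 2"},
]

channels = channels_by_hybridization(rows)
print(channels)
```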

Example 7: Complex time series.*

For simplicity, we can collapse the cascading graph into a flatter structure, using Time as a Factor Value (Figure 7(b)). This would translate into the simplified investigation design graph in Figure 7(c).

(a) Investigation design graph
(b) Flattened cascading time series structure
Source Name Protocol REF Sample Name Extract Name Labeled Extract Name Hybridization Name Factor Value [Incubation Time] Unit [Time Unit] Term Source REF
Source 1 P-DORD-12 Sample 1 Extract 1 LabeledExtract 1 Hybridization 1 0 hours MGED Ontology
Source 1 P-DORD-12 Sample 2 Extract 2 LabeledExtract 2 Hybridization 2 1 hours MGED Ontology
Source 1 P-DORD-12 Sample 3 Extract 3 LabeledExtract 3 Hybridization 3 2 hours MGED Ontology
Source 1 P-DORD-12 Sample 4 Extract 4 LabeledExtract 4 Hybridization 4 3 hours MGED Ontology
(c) Simple investigation design graph representation

SDRF Columns Use Cases*

Use Case 1: Treatment variation*

Variations in the treatments used can also be indicated in the SDRF. These can be represented as distinct protocols, as shown in Figure 8(a).

Alternatively, a single protocol could be used with different parameter values. In this case the parameter would have to be linked to its protocol via the IDF header file (Figure 8(b)). Parameter values may be specified with units (Figure 8(c)), and included in the Factor Values for an investigation by creating a separate Factor Value column containing duplicated values.

(a) Simple compound treatment using different protocols for different compounds. Two samples used in a dye swap.
Source Name Protocol REF Extract Name Labeled Extract Name Label Hybridization Name
ethanol-treated P-DORD-1 Eth. Ext Eth. LE Cy3 Cy3 Hyb 1
butanol-treated P-DORD-2 But. Ext But. LE Cy5 Cy5 Hyb 1
ethanol-treated P-DORD-1 Eth. Ext Eth. LE Cy3 Cy3 Hyb 2
butanol-treated P-DORD-2 But. Ext But. LE Cy5 Cy5 Hyb 2
(b) Same investigation as depicted in Figure 8(a), using protocol parameters rather than separate protocols.
Source Name Protocol REF Parameter Value [Compound] Extract Name Labeled Extract Name Label Hybridization Name
ethanol-treated P-DORD-3 ethanol Eth. Ext Eth. LE Cy3 Cy3 Hyb 1
butanol-treated P-DORD-3 butanol But. Ext But. LE Cy5 Cy5 Hyb 1
ethanol-treated P-DORD-3 ethanol Eth. Ext Eth. LE Cy3 Cy3 Hyb 2
butanol-treated P-DORD-3 butanol But. Ext But. LE Cy5 Cy5 Hyb 2
(c) Example of Parameters with units.
Source Name Protocol REF Parameter Value [Temperature] Unit [TemperatureUnit] Term Source REF Sample Name Extract Name Labeled Extract Name Label Hybridization Name Factor Value [Temperature] Unit [TemperatureUnit]
Source 1 P-DORD-5 22 degree_C MGED Ontology 22 deg 22 deg. Ext 22 deg. LE Cy3 Cy3 Hyb 1 22 degree_C
Source 1 P-DORD-5 37 degree_C MGED Ontology 37 deg 37 deg. Ext 37 deg. LE Cy5 Cy5 Hyb 1 37 degree_C
Source 1 P-DORD-5 22 degree_C MGED Ontology 22 deg 22 deg. Ext 22 deg. LE Cy3 Cy3 Hyb 2 22 degree_C
Source 1 P-DORD-5 37 degree_C MGED Ontology 37 deg 37 deg. Ext 37 deg. LE Cy5 Cy5 Hyb 2 37 degree_C
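
The pairing of a value column with the Unit column that follows it can be sketched as below. The helper names split_header and values_with_units are hypothetical, and the rule that a Unit column annotates the nearest preceding Parameter Value or Factor Value column is an assumption based on the column order in Figure 8(c).

```python
import re

# Matches headings such as "Parameter Value [Temperature]" or "Unit [TemperatureUnit]".
HEADER_RE = re.compile(r"^(?P<kind>[A-Za-z ]+?)\s*\[(?P<name>[^\]]+)\]$")


def split_header(column):
    """Split 'Kind [Name]' into (kind, name); plain headings give (heading, None)."""
    match = HEADER_RE.match(column.strip())
    if match:
        return match.group("kind"), match.group("name")
    return column.strip(), None


def values_with_units(header, row):
    """Attach each Unit cell to the value column immediately preceding it."""
    result, last = {}, None
    for column, cell in zip(header, row):
        kind, name = split_header(column)
        if kind in ("Parameter Value", "Factor Value"):
            last = (kind, name)
            result[last] = {"value": cell, "unit": None}
        elif kind == "Unit" and last is not None:
            result[last]["unit"] = cell
        elif kind != "Term Source REF":
            last = None  # any other column ends the current value/unit group
    return result


header = ["Source Name", "Protocol REF", "Parameter Value [Temperature]",
          "Unit [TemperatureUnit]", "Term Source REF", "Sample Name",
          "Factor Value [Temperature]", "Unit [TemperatureUnit]"]
row = ["Source 1", "P-DORD-5", "22", "degree_C", "MGED Ontology",
       "22 deg", "22", "degree_C"]

parsed = values_with_units(header, row)
print(parsed)
```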

Use Case 2: Variation in treatment application (ChIP-chip)*

For investigations in which some treatments are not applied to every sample, the corresponding cells may be left empty (or filled with the "->" symbol), still separated by tabs, to indicate this. For example, ChIP-chip investigations typically compare a chromatin immunoprecipitate to the whole genomic DNA extract from which it was derived. In this example, P-DORD-1, P-DORD-2 and P-DORD-3 reference a genomic DNA extraction protocol, an immunoprecipitation protocol, and a labeling protocol, respectively. These protocol types are specified in the IDF.

Source Name Protocol REF Extract Name Protocol REF Extract Name Protocol REF Labeled Extract Name Label Hybridization Name
yeast 1 P-DORD-1 extract 1 P-DORD-2 ip 1 P-DORD-3 ip 1 LE Cy3 Cy3 Hyb 1
yeast 1 P-DORD-1 extract 1 -> -> P-DORD-3 extract 1 LE Cy5 Cy5 Hyb 1
yeast 1 P-DORD-1 extract 2 P-DORD-2 ip 2 P-DORD-3 ip 2 LE Cy3 Cy3 Hyb 2
yeast 1 P-DORD-1 extract 2 -> -> P-DORD-3 extract 2 LE Cy5 Cy5 Hyb 2
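
A reader of such a table has to skip the gaps when following a row. The sketch below lists the steps actually applied in one row, treating both an empty cell and "->" as "not applied". The function name applied_steps is hypothetical, and the cell boundaries of the two sample rows (for instance, the labeled extract name including the dye) are an assumption about how the table above divides into columns.

```python
GAP = {"", "->"}


def applied_steps(header, row):
    """Return (protocol, node) pairs for one row, skipping gap cells."""
    steps, pending = [], None
    for column, cell in zip(header, row):
        cell = cell.strip()
        if column == "Protocol REF":
            pending = None if cell in GAP else cell
        elif column.endswith(" Name"):
            if cell not in GAP:
                steps.append((pending, cell))
            pending = None
    return steps


header = ["Source Name", "Protocol REF", "Extract Name", "Protocol REF",
          "Extract Name", "Protocol REF", "Labeled Extract Name", "Label",
          "Hybridization Name"]
ip_row = ["yeast 1", "P-DORD-1", "extract 1", "P-DORD-2", "ip 1",
          "P-DORD-3", "ip 1 LE Cy3", "Cy3", "Hyb 1"]
input_row = ["yeast 1", "P-DORD-1", "extract 1", "->", "->",
             "P-DORD-3", "extract 1 LE Cy5", "Cy5", "Hyb 1"]

print(applied_steps(header, ip_row))
print(applied_steps(header, input_row))
```

The second row omits the immunoprecipitation step, so the labeling protocol is applied directly to the genomic DNA extract.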

Use Case 3: Association of data files with hybridizations and samples*

Measured and derived data files can be associated with a specific hybridization. In each case, either an "Array Design File" or "Array Design REF" column is needed, referencing either an included ADF file or an identifier in a public repository such as ArrayExpress/DOR, respectively. Note that the repository can optionally be indicated using a "Term Source REF" column.

Figure 10: Data files linked to hybridizations on a per-hybridization basis.
Source Name Hybridization Name Array Design REF Term Source REF Array Data File Derived Array Data File
Sample 1 Hybridization 1 A-DORD-1 ArrayExpress Data1.CEL Data1.CHP
Sample 2 Hybridization 2 A-DORD-1 ArrayExpress Data2.CEL Data2.CHP
Sample 3 Hybridization 3 A-DORD-1 ArrayExpress Data3.CEL Data3.CHP
Sample 4 Hybridization 4 A-DORD-1 ArrayExpress Data4.CEL Data4.CHP

Use Case 4: Technology type, high-throughput sequencing*

For experiments using technology types other than arrays, such as high-throughput sequencing, the SDRF should have an "Assay Name" column instead of a "Hybridization Name" column. The "Assay Name" column can be followed by a "Technology Type" column that describes the specific technology used. The following table shows a hypothetical experiment using high-throughput sequencing.

Figure 11: High-throughput sequencing.
Source Name Assay Name Technology Type Term Source REF Protocol REF Array Data File Derived Array Data File
Source 1 Assay 1 high_throughput_sequencing EFO P-DORD-8 run1.fastq run1_norm.txt
Source 2 Assay 2 high_throughput_sequencing EFO P-DORD-8 run2.fastq run2_norm.txt
Source 3 Assay 3 high_throughput_sequencing EFO P-DORD-8 run3.fastq run3_norm.txt
Source 4 Assay 4 high_throughput_sequencing EFO P-DORD-8 run4.fastq run4_norm.txt

SDRF Real-world Examples*

Real-world Example 1: ArrayExpress experiment E-TABM-21*

Each node in the investigation design graph must be represented by an appropriate "Name" column, with the graph edges given as "Protocol REF" columns. Several other column types may be used to convey sample annotation. Data files are referenced using "Array Data File" and "Derived Array Data File" columns, shown here but omitted from subsequent examples for clarity (Figure 12). Note that this example has Genotype as its experimental factor. View E-TABM-21 in ArrayExpress.

Figure 12: ArrayExpress experiment E-TABM-21 - a simple iterated single-channel design.
Source Name Characteristics [Organism] Term Source REF Characteristics [Genotype] Protocol REF Extract Name Protocol REF Labeled Extract Name Label Hybridization Name Array Data File Derived Array Data File Factor Value [Genotype]
CHIP_322_A Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_322_A P-TABM-43 CHIP_322_A biotin CHIP_322_A CHIP_322_A.CEL CHIP_322_A.CHP wild_type
CHIP_322_B Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_322_B P-TABM-43 CHIP_322_B biotin CHIP_322_B CHIP_322_B.CEL CHIP_322_B.CHP wild_type
CHIP_322_C Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_322_C P-TABM-43 CHIP_322_C biotin CHIP_322_C CHIP_322_C.CEL CHIP_322_C.CHP wild_type
CHIP_323_A Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_323_A P-TABM-43 CHIP_323_A biotin CHIP_323_A CHIP_323_A.CEL CHIP_323_A.CHP co
CHIP_323_B Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_323_B P-TABM-43 CHIP_323_B biotin CHIP_323_B CHIP_323_B.CEL CHIP_323_B.CHP co
CHIP_323_C Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_323_C P-TABM-43 CHIP_323_C biotin CHIP_323_C CHIP_323_C.CEL CHIP_323_C.CHP co
CHIP_324_A Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_324_A P-TABM-43 CHIP_324_A biotin CHIP_324_A CHIP_324_A.CEL CHIP_324_A.CHP ft
CHIP_324_B Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_324_B P-TABM-43 CHIP_324_B biotin CHIP_324_B CHIP_324_B.CEL CHIP_324_B.CHP ft
CHIP_324_C Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_324_C P-TABM-43 CHIP_324_C biotin CHIP_324_C CHIP_324_C.CEL CHIP_324_C.CHP ft
CHIP_326_A Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_326_A P-TABM-43 CHIP_326_A biotin CHIP_326_A CHIP_326_A.CEL CHIP_326_A.CHP wild_type
CHIP_326_B Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_326_B P-TABM-43 CHIP_326_B biotin CHIP_326_B CHIP_326_B.CEL CHIP_326_B.CHP wild_type
CHIP_326_C Arabidopsis thaliana NCBI_tax wild type P-TABM-21 CHIP_326_C P-TABM-43 CHIP_326_C biotin CHIP_326_C CHIP_326_C.CEL CHIP_326_C.CHP wild_type
CHIP_327_A Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_327_A P-TABM-43 CHIP_327_A biotin CHIP_327_A CHIP_327_A.CEL CHIP_327_A.CHP co
CHIP_327_B Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_327_B P-TABM-43 CHIP_327_B biotin CHIP_327_B CHIP_327_B.CEL CHIP_327_B.CHP co
CHIP_327_C Arabidopsis thaliana NCBI_tax co P-TABM-21 CHIP_327_C P-TABM-43 CHIP_327_C biotin CHIP_327_C CHIP_327_C.CEL CHIP_327_C.CHP co
CHIP_328_A Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_328_A P-TABM-43 CHIP_328_A biotin CHIP_328_A CHIP_328_A.CEL CHIP_328_A.CHP ft
CHIP_328_B Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_328_B P-TABM-43 CHIP_328_B biotin CHIP_328_B CHIP_328_B.CEL CHIP_328_B.CHP ft
CHIP_328_C Arabidopsis thaliana NCBI_tax ft P-TABM-21 CHIP_328_C P-TABM-43 CHIP_328_C biotin CHIP_328_C CHIP_328_C.CEL CHIP_328_C.CHP ft

Real-world Example 2: ArrayExpress experiment E-MEXP-549*

Biological replicates should be represented by distinct biological sources, grouped together by common experimental factor values. An example of this is given in Figure 13, where biological replicates (e.g., ARP1-0h, ARP2-0h and ARP3-0h) are represented as distinct Sources sharing the same factor value ("Time", in this example). In comparison, technical replicates are represented by branching of the investigation design graph at intermediate steps of the experimental processing, as shown in Figure 3. View E-MEXP-549 in ArrayExpress.

Figure 13: ArrayExpress experiment E-MEXP-549 (excerpt). A time series in which biological replicates are represented as distinct Sources sharing the same Time factor value.
Source Name Characteristics [CellLine] Characteristics [CellType] Term Source REF Characteristics [DiseaseState] Term Source REF Characteristics [Organism] Term Source REF Characteristics [StrainOrLine] Hybridization Name Array Design REF Factor Value [Time] Unit [TimeUnit] Term Source REF
ARP1-0h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-0h A-AFFY-33 0 hours MO
ARP2-0h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-0h A-AFFY-33 0 hours MO
ARP3-0h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-0h A-AFFY-33 0 hours MO
ARP1-2h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-2h A-AFFY-33 2 hours MO
ARP2-2h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-2h A-AFFY-33 2 hours MO
ARP3-2h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-2h A-AFFY-33 2 hours MO
ARP1-4h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-4h A-AFFY-33 4 hours MO
ARP2-4h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-4h A-AFFY-33 4 hours MO
ARP3-4h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-4h A-AFFY-33 4 hours MO
ARP1-6h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-6h A-AFFY-33 6 hours MO
ARP2-6h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-6h A-AFFY-33 6 hours MO
ARP3-6h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-6h A-AFFY-33 6 hours MO
ARP1-8h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-8h A-AFFY-33 8 hours MO
ARP2-8h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-8h A-AFFY-33 8 hours MO
ARP3-8h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-8h A-AFFY-33 8 hours MO
ARP1-10h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-10h A-AFFY-33 10 hours MO
ARP2-10h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-10h A-AFFY-33 10 hours MO
ARP3-10h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-10h A-AFFY-33 10 hours MO
ARP1-12h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP1-12h A-AFFY-33 12 hours MO
ARP2-12h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP2-12h A-AFFY-33 12 hours MO
ARP3-12h MOLT4 T cell CTO acute lymphoblastic leukemia NCI_meta Homo sapiens NCBI_tax CFARP011 H-ARP3-12h A-AFFY-33 12 hours MO
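
Grouping Sources by a shared factor value recovers the sets of biological replicates. A minimal sketch, assuming the rows have been read into dictionaries keyed by column heading; the function name replicates_by_factor is hypothetical, and only the first two time points of Figure 13 are included.

```python
from collections import defaultdict


def replicates_by_factor(rows, factor="Factor Value [Time]"):
    """Group Source names by the value of the given factor column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row["Source Name"])
    return dict(groups)


rows = [
    {"Source Name": "ARP1-0h", "Factor Value [Time]": "0"},
    {"Source Name": "ARP2-0h", "Factor Value [Time]": "0"},
    {"Source Name": "ARP3-0h", "Factor Value [Time]": "0"},
    {"Source Name": "ARP1-2h", "Factor Value [Time]": "2"},
    {"Source Name": "ARP2-2h", "Factor Value [Time]": "2"},
    {"Source Name": "ARP3-2h", "Factor Value [Time]": "2"},
]

groups = replicates_by_factor(rows)
print(groups)
```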

Real-world Example 3: ArrayExpress experiment E-MTAB-533*

For a high-throughput sequencing experiment, the generic "Assay Name" column should be used in place of the "Hybridization Name" column used for array-based experiments. A "Technology Type" column with the value "high_throughput_sequencing" should follow and annotate the "Assay Name" column. The LIBRARY_LAYOUT, LIBRARY_SOURCE, LIBRARY_STRATEGY and LIBRARY_SELECTION information should be provided using "Comment" columns.

Some further platform-specific attributes are also required. In this example, "Comment[SEQUENCE_LENGTH]" is mandatory because the "Illumina Genome Analyzer II" platform (defined in the "Protocol Hardware" of the IDF) was used.

Please see the Instruction for Sequencing Data.

View E-MTAB-533 in ArrayExpress.

Figure 14: ArrayExpress experiment E-MTAB-533 (excerpt, divided into two tables).
Source Name Characteristics [Organism] Protocol REF Extract Name Comment [LIBRARY_LAYOUT] Comment [LIBRARY_SOURCE] Comment [LIBRARY_STRATEGY] Comment [LIBRARY_SELECTION]
K562_GM12878_cell_lines Homo sapiens P-MTAB-19064 CCL-243 & GM12878 SINGLE TRANSCRIPTOMIC AMPLICON RT-PCR
L3_L4_stages Caenorhabditis elegans P-MTAB-19065 N2 L2 #4 and N2 L4-3 SINGLE TRANSCRIPTOMIC AMPLICON RT-PCR

Comment [LIBRARY_SELECTION] Protocol REF Performer Assay Name Comment [SEQUENCE_LENGTH] Technology Type Protocol REF Array Data File Factor Value [ORGANISM]
RT-PCR P-MTAB-19066 LGTF s_3_1_all 75 high_throughput_sequencing P-MTAB-19067 s_3_1_all.qseq.gz Homo sapiens
RT-PCR P-MTAB-19066 LGTF s_4_1_all 75 high_throughput_sequencing P-MTAB-19067 s_4_1_all.qseq.gz Caenorhabditis elegans
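
The column requirements described for sequencing experiments can be checked against an SDRF header with a few lines of code. The sketch below is an illustration only: the function names are hypothetical, and it checks just the columns named in the text above, not platform-specific attributes such as Comment[SEQUENCE_LENGTH].

```python
import re

REQUIRED_COMMENTS = ("LIBRARY_LAYOUT", "LIBRARY_SOURCE",
                     "LIBRARY_STRATEGY", "LIBRARY_SELECTION")


def normalise(column):
    """Treat 'Comment [X]' and 'Comment[X]' as the same heading."""
    return re.sub(r"\s*\[\s*", "[", column.strip())


def check_sequencing_header(header):
    """Return a list of problems found in a sequencing SDRF header."""
    columns = [normalise(column) for column in header]
    problems = []
    if "Hybridization Name" in columns:
        problems.append("use 'Assay Name' instead of 'Hybridization Name'")
    if "Assay Name" not in columns:
        problems.append("missing 'Assay Name'")
    elif "Technology Type" not in columns[columns.index("Assay Name") + 1:]:
        problems.append("'Technology Type' should follow 'Assay Name'")
    for name in REQUIRED_COMMENTS:
        if f"Comment[{name}]" not in columns:
            problems.append(f"missing 'Comment[{name}]'")
    return problems


sequencing_header = [
    "Source Name", "Extract Name", "Comment [LIBRARY_LAYOUT]",
    "Comment [LIBRARY_SOURCE]", "Comment [LIBRARY_STRATEGY]",
    "Comment [LIBRARY_SELECTION]", "Assay Name", "Comment [SEQUENCE_LENGTH]",
    "Technology Type", "Array Data File",
]
array_header = ["Source Name", "Hybridization Name", "Array Data File"]

print(check_sequencing_header(sequencing_header))
print(check_sequencing_header(array_header))
```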

ADF Example*

Figure 15 shows a simple case, where there is a one-to-many (or one-to-one) mapping between Composite Elements and Reporters. Note how the Reporter and Composite Element information is duplicated to indicate that every synthetic sequence is spotted more than once on the array.

Figure 15: Simple design, one Reporter/one Composite Element relationship
Block Column Block Row Column Row Reporter Name Reporter Sequence Reporter Group [role] Control Type Control Type Term Source REF Composite Element Name
1 1 1 1 R1 ATGGTTGGTTACGTGT experimental PTEN
1 1 1 2 R2 CCGCGTTGCCCCGCC experimental PAX2
1 1 1 3 R3 CGTAGCTGATCGGATGA experimental WWOX
1 1 1 4 R4 GGTTGGCTGAGATCGT experimental MAPK8
1 1 2 1 R1 ATGGTTGGTTACGTGT experimental PTEN
1 1 2 2 R2 CCGCGTTGCCCCGCC experimental PAX2
1 1 2 3 R3 CGTAGCTGATCGGATGA experimental WWOX
1 1 2 4 R4 GGTTGGCTGAGATCGT experimental MAPK8
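
The duplication of Reporter information across features can be summarised by grouping feature coordinates by reporter. A minimal sketch, assuming each ADF row has been reduced to its four coordinates, Reporter Name and Composite Element Name; the function name features_by_reporter is hypothetical and only two of the four reporters are included.

```python
from collections import defaultdict


def features_by_reporter(rows):
    """Map (reporter, composite element) to the list of feature coordinates."""
    spots = defaultdict(list)
    for block_column, block_row, column, row, reporter, composite in rows:
        spots[(reporter, composite)].append((block_column, block_row, column, row))
    return dict(spots)


rows = [
    (1, 1, 1, 1, "R1", "PTEN"),
    (1, 1, 1, 2, "R2", "PAX2"),
    (1, 1, 2, 1, "R1", "PTEN"),
    (1, 1, 2, 2, "R2", "PAX2"),
]

spots = features_by_reporter(rows)
for (reporter, composite), coordinates in spots.items():
    print(reporter, composite, len(coordinates), coordinates)
```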

MAGE-TAB Real-world Examples*

The following table lists real-world MAGE-TAB example files for microarray and sequencing data deposited in ArrayExpress.

MAGE-TAB real-world examples
Accession Number IDF Text SDRF Text Experiment Summary Platform
E-TABM-18 idf text sdrf text Transcription profiling of 35 different Arabidopsis thaliana ecotypes Affymetrix
E-DORD-69 idf text sdrf text Translation profiling of Arabidopsis cell cultures, dual channel Agilent
E-TABM-985 idf text sdrf text Transcription profiling by array of mouse testis development, dye swap design Compugen
E-MEXP-2665 idf text sdrf text ChIP-chip by array of mouse cell lines and fresh tissues Multiple platforms
E-TABM-855 idf text sdrf text Genotyping of human lymphoblastoid cell lines from pairs of HapMap individuals Illumina GoldenGate SNP array
E-MTAB-371 idf text sdrf text ChIP-seq of human myofibroblasts Illumina Sequencing
E-GEOD-26444 idf text sdrf text Deep sequencing-based analysis of the anaerobic stimulon SOLiD sequencing
E-GEOD-24366 idf text sdrf text Global Changes following N-deprivation in Chlamydomonas 454 sequencing