Usage Instructions

Step 1: Choose an organism

Select one of the following organisms supported by GOrilla:

Arabidopsis thaliana
Saccharomyces cerevisiae
Caenorhabditis elegans
Drosophila melanogaster
Danio rerio (Zebrafish)
Homo sapiens
Mus musculus
Rattus norvegicus

Step 2: Choose running mode

Single ranked list of genes - Supply a single list in which the genes are ranked according to some biological measurement (for example, expression level). The software searches for GO terms that are enriched at the top of the list compared to the rest of the list, using the mHG (minimum hypergeometric) statistic. Note that the list should not be redundant, i.e. each gene should appear in the list only once.
Two lists of genes (target and background sets) - Supply two lists of genes (the ranking within each list does not affect the results). The software searches for GO terms that are enriched in the target set compared to the background set, using the standard hypergeometric statistic. Choosing the correct background set is very important. For example, if your gene list comes from a microarray experiment, the background should be all the genes on the array.
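The two running modes can be illustrated with a minimal Python sketch. It is not GOrilla's own code: the function names and the 0/1 label representation are assumptions made for illustration. `hypergeom_tail` computes the hypergeometric tail probability used in the two-list mode, and `mhg_statistic` takes the minimum of that tail over all cutoffs of a ranked list. Note that this minimum is not itself a p-value, because it is taken over many cutoffs; the sketch does not apply that correction.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n genes are drawn from N genes, B of which carry the GO term."""
    upper = min(n, B)
    favorable = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return favorable / comb(N, n)


def mhg_statistic(labels):
    """Minimum hypergeometric tail over all cutoffs of a ranked 0/1 label list.

    labels[i] is 1 if the gene at rank i carries the GO term, else 0.
    Returns (smallest tail probability, cutoff at which it occurs).
    """
    N = len(labels)
    B = sum(labels)
    best, best_cutoff = 1.0, 0
    b = 0
    for n, label in enumerate(labels, start=1):
        b += label  # annotated genes among the top n
        p = hypergeom_tail(N, B, n, b)
        if p < best:
            best, best_cutoff = p, n
    return best, best_cutoff


# Two-list mode: 5 of 10 background genes carry the term, and all 5 target genes do.
two_list_p = hypergeom_tail(N=10, B=5, n=5, b=5)

# Ranked mode: the three annotated genes sit at the top of a six-gene list.
ranked_stat, ranked_cutoff = mhg_statistic([1, 1, 1, 0, 0, 0])
```

In the ranked example the minimum is reached at a cutoff of three genes, which is where the annotated genes end.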

Step 3: Paste a ranked list of gene/protein names

Paste here a list of gene (or protein) names. For example:

CHMP5
RAB6C
ZNF394
FAM3B
NM_003174
etc...

Remarks:

* Each line should contain one gene name.

* The preferred format is gene symbol. Other supported formats are: gene and protein RefSeq, UniProt, UniGene and Ensembl.

* Note that the processed list of genes may contain fewer genes than you entered. This is because not all genes have a GO record.
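Before pasting, it can help to check that the list has one name per line and no repeats. The following is a minimal Python sketch of such a check; the function name `clean_gene_list` is hypothetical, and the sketch assumes names are compared exactly after trimming whitespace, which may differ from how GOrilla itself matches identifiers.

```python
def clean_gene_list(text):
    """Split pasted text into unique gene names (first occurrence kept) and duplicates.

    Blank lines are skipped and rank order is preserved.
    """
    seen = set()
    unique, duplicates = [], []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue  # ignore empty lines
        if name in seen:
            duplicates.append(name)
        else:
            seen.add(name)
            unique.append(name)
    return unique, duplicates


pasted = "CHMP5\n\nRAB6C\nCHMP5\n ZNF394 \n"
genes, repeated = clean_gene_list(pasted)
print("\n".join(genes))
```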

Step 4: Choose Ontology

Select one of the following ontologies:

Biological process
Molecular function
Cellular component

Parameters

P-value threshold - Only GO terms with a p-value below this threshold are reported.

Note that this p-value is not corrected for multiple hypothesis testing over the number of tested GO terms. To correct for this, multiply the p-value by the number of GO terms tested, as reported on the results page.
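The correction described here can be written as a short Python sketch. The function name and the example numbers are illustrative assumptions; capping the result at 1 is a common convention for this kind of correction and is not stated on this page.

```python
def corrected_p_value(p_value, num_go_terms):
    """Multiply a raw p-value by the number of tested GO terms, capped at 1."""
    return min(1.0, p_value * num_go_terms)


# Hypothetical numbers: a raw p-value of 1e-5 with 2,000 GO terms tested.
print(corrected_p_value(1e-5, 2000))
```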

Output results in Microsoft Excel format - If checked, an Excel file with the results will also be generated.

Output unresolved and duplicate genes - If checked, additional files are generated listing all genes GOrilla did not recognize and all genes that appeared in the list more than once.

Run GOrilla in fast mode - If checked, GOrilla will only try placing the threshold that separates the target set from the background set within the top 10% of your list (or at least 1,000 genes).
This parameter applies only to the single ranked list mode, not to the two-list mode.
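One reading of the fast mode rule is that cutoffs are searched up to the larger of 10% of the list and 1,000 genes, and never beyond the list itself. The Python sketch below encodes that reading; it is an assumption about the wording above, not a description of GOrilla's internals, and the function name is hypothetical.

```python
def fast_mode_max_cutoff(list_length, min_genes=1000):
    """Largest rank at which the target/background threshold is tried.

    Assumed rule: the larger of 10% of the list and min_genes,
    limited to the length of the list.
    """
    ten_percent = list_length // 10
    return min(list_length, max(ten_percent, min_genes))


for size in (20000, 5000, 600):
    print(size, fast_mode_max_cutoff(size))
```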

Output

The output of the software consists of a color-coded, trimmed DAG (directed acyclic graph) of all significantly enriched GO terms. Clicking on a node in the DAG shows the enrichment p-value, the genes related to this GO term, and a link to more information on the term.

[Figure: Output example]