Gene set enrichment analysis (GSEA)

Calculate enrichment scores for gene sets from the MSigDB

Written by Andrew Goodspeed

Differential expression analysis generates a large list of potentially interesting genes. A list of this size can be hard to interpret in terms of biological pathways. Pathway analysis algorithms summarize the measurements from individual genes at the broader pathway level. Gene Set Enrichment Analysis (GSEA) is one of the most popular forms of pathway analysis.

Glossary

  • Gene Set - a list of genes that are known to be associated with a biological pathway

    • Example: the “Hypoxia” gene set consists of 200 genes known to increase expression in response to low oxygen levels (hypoxia)

  • Collection - a collection/set of Gene Sets from the Molecular Signatures Database (MSigDB), a database of collections and gene sets

    • Example: the “Immunologic signature gene sets collection” contains 5,200 gene sets related to immune system response to different conditions

  • DEG table - the results / plot data produced by a differential expression analysis
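
To make the idea of an enrichment score concrete, here is a minimal sketch of the classic weighted running-sum statistic used in GSEA. It is an illustration only: the article does not describe Pluto's exact implementation, and the function name, parameters, and toy gene names below are assumptions made for this example. The sketch walks down a ranked gene list, steps up when a gene belongs to the gene set, steps down otherwise, and reports the largest deviation from zero.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the running-sum enrichment score for one gene set.

    ranked_genes: gene names sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order as ranked_genes.
    gene_set: names of the genes that belong to the gene set.
    weight: exponent applied to the ranking metric (hypothetical default).
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_total  # step up on a member gene
        else:
            running -= 1.0 / n_miss                  # step down otherwise
        if abs(running) > abs(best):
            best = running                           # keep the largest deviation
    return best


# Toy ranked list (invented for illustration, not real expression data).
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top_score = enrichment_score(genes, metric, {"A", "B"})
bottom_score = enrichment_score(genes, metric, {"E", "F"})
```

A gene set concentrated at the top of the ranking gets a positive score, and one concentrated at the bottom gets a negative score. The normalized enrichment score (NES) additionally rescales this value against a null distribution obtained by permutation, which this sketch leaves out.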

Running gene set enrichment analysis in Pluto

In order to run GSEA, you first need to create a differential expression analysis. Once at least one differential expression analysis has been created, the Gene Set Enrichment analysis type will become enabled in the analysis sidebar in Pluto for that experiment.

Add a new analysis, select Gene Set Enrichment, and then select the differential expression comparison that you'd like to analyze with GSEA. Finally, select the gene set collection to use and click "Run analysis."

Creating an enrichment plot for one gene set

By default, when your GSEA analysis completes, Pluto will display the most significantly enriched gene set according to p-value.

To view the enrichment plot for a different gene set, go to the "Plot" tab in the sidebar and select a new gene set from the dropdown. You can also choose to show the adjusted p-value or the normalized enrichment score for the new gene set. Click "Save Changes" to view the new gene set's interactive enrichment plot.



You can also edit the plot title and other aspects of the plot's appearance from the Plot menu.

Viewing all gene sets

GSEA outputs an adjusted p-value and a normalized enrichment score (NES) for each gene set in the collection that you selected when setting up your analysis. The adjusted p-value represents significance, and the NES represents the magnitude and direction of enrichment. Together, they tell a researcher whether the gene set / pathway was significantly changed in their differential expression comparison of interest.

To view the entire table of gene sets, click the "Expand" button next to the gene sets dropdown. The gene sets table can be searched and downloaded as well.
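
Once the gene sets table is downloaded, it can be filtered outside of Pluto. The following is a minimal sketch that assumes the export is a CSV file; the column names (`gene_set`, `nes`, `adj_p_value`), the thresholds, and the numbers in the example table are all hypothetical and should be replaced with the headers and values in your actual download.

```python
import csv
import io

# Hypothetical export with invented numbers, for illustration only.
EXAMPLE_CSV = """gene_set,nes,adj_p_value
HALLMARK_HYPOXIA,2.10,0.001
HALLMARK_GLYCOLYSIS,1.20,0.300
HALLMARK_MYC_TARGETS_V1,-1.85,0.010
"""


def significant_gene_sets(csv_text, max_adj_p=0.05, min_abs_nes=0.0):
    """Return (name, NES, adjusted p-value) for gene sets passing both cutoffs."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        nes = float(row["nes"])
        adj_p = float(row["adj_p_value"])
        # Keep sets that are significant and enriched strongly enough
        # in either direction.
        if adj_p <= max_adj_p and abs(nes) >= min_abs_nes:
            kept.append((row["gene_set"], nes, adj_p))
    return kept


significant = significant_gene_sets(EXAMPLE_CSV)
for name, nes, adj_p in significant:
    print(f"{name}: NES={nes:+.2f}, adjusted p={adj_p:.3f}")
```

To read a real file instead of the inline example, pass the contents of the downloaded file as `csv_text`.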



Plotting multiple gene sets by NES

Another useful way to visualize GSEA results is with a bar plot showing multiple gene sets by their NES values.



In these plots, gene sets (x-axis) are ranked and displayed according to their NES values (y-axis). Positively enriched gene sets (NES value > 0) are shown above the y=0 line in a dark navy color, and negatively enriched gene sets (NES value < 0) are shown below the y=0 line in a light blue color.

To create this plot, choose the Score Bar Plot plot type from the "Edit Plot" menu. Optionally check the box to show the plot as full-width, and filter gene sets by their adjusted p-value and/or NES value. Click "View Changes" to view your plot.
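
The ordering behind this kind of plot can be sketched in a few lines. The example below is an illustration under stated assumptions, not Pluto's plotting code: the function name and the input format (pairs of gene set name and NES) are hypothetical, descending order is assumed, and text bars stand in for the rendered chart.

```python
def rank_for_score_bar_plot(results, bar_scale=10):
    """Order gene sets by NES and label each one by direction.

    results: list of (gene_set_name, nes) pairs (hypothetical input format).
    bar_scale: number of characters drawn per unit of NES in the text bar.
    """
    # Highest NES first, so positively enriched sets lead the ranking.
    ordered = sorted(results, key=lambda item: item[1], reverse=True)
    rows = []
    for name, nes in ordered:
        if nes > 0:
            direction = "positive"   # drawn above the zero line
        elif nes < 0:
            direction = "negative"   # drawn below the zero line
        else:
            direction = "zero"
        bar = "#" * round(abs(nes) * bar_scale)
        rows.append((name, nes, direction, bar))
    return rows


# Placeholder gene set names and NES values, invented for illustration.
rows = rank_for_score_bar_plot([("SET_A", -1.5), ("SET_B", 2.0), ("SET_C", 0.5)])
for name, nes, direction, bar in rows:
    print(f"{name:<6} {nes:+.2f} {direction:<9} {bar}")
```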



As with all Pluto plots, you can customize the appearance of the Score Bar Plot using the "Plot" menu in the sidebar.
