Single cell RNA-seq preprocessing: Normalize

Normalize your single cell RNA-seq data

Written by Caitlin Winkler, PhD

 

Normalize is the fourth preprocessing step in the Preprocessing phase of the Experiment roadmap, and follows the Filter step.

During the Normalize step, you will normalize your single cell RNA-seq data. Normalization makes gene expression comparable between individual cells and minimizes potential technical biases, such as differences in sequencing depth. We offer two approaches to normalize your data: Global-scaling (log normalized) and SCTransform (1, 2).
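To make the global-scaling idea concrete, here is a minimal sketch of log normalization in Python with NumPy. It is an illustration only, not the platform's implementation: the function name log_normalize is hypothetical, and the scale factor of 10,000 is assumed because it is the default in Seurat's LogNormalize method (1); this article does not state which value is used here.

```python
import numpy as np

def log_normalize(counts, scale_factor=10_000):
    """Global-scaling log normalization (illustrative sketch).

    counts: cells x genes array of raw UMI counts.
    scale_factor: assumed value; not taken from the platform.
    """
    counts = np.asarray(counts, dtype=float)
    # Total counts per cell (a proxy for sequencing depth)
    totals = counts.sum(axis=1, keepdims=True)
    # Guard against division by zero for cells with no counts
    safe_totals = np.where(totals == 0, 1.0, totals)
    # Scale each cell to the same total, then apply log(1 + x)
    return np.log1p(counts / safe_totals * scale_factor)

# Two cells with identical composition but a 10x difference in depth
raw = np.array([[10, 0, 90],
                [1, 0, 9]])
normalized = log_normalize(raw)
```

After normalization, both example cells have identical values, which is the point of the method: the difference in sequencing depth no longer dominates the comparison.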

Running the preprocess

To run the Normalize preprocess, first select either Global-scaling (log normalized) or SCTransform as your normalization method. Next, adjust the parameters to suit your analysis goals and data set. For example, you can regress out sources of unwanted variation from your data, or pick the number of principal components to use when reducing the dimensionality of your data for visualization.
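As a rough illustration of what regressing out a source of unwanted variation means, the sketch below fits a linear model of each gene's expression against a single per-cell covariate and keeps the residuals. This is an assumption-laden example rather than the platform's method: the function name regress_out and the toy values are hypothetical, and percent mitochondrial content is used only as an example covariate.

```python
import numpy as np

def regress_out(expression, covariate):
    """Remove the linear effect of one covariate from each gene (sketch).

    expression: cells x genes array of normalized expression.
    covariate: one value per cell (for example, percent mitochondrial reads).
    """
    expression = np.asarray(expression, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    # Design matrix with an intercept column and the covariate
    design = np.column_stack([np.ones_like(covariate), covariate])
    # Least-squares fit for all genes at once
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    # Residuals are the part of expression the covariate does not explain
    return expression - design @ coef

# Toy data: gene 0 is fully explained by the covariate, gene 1 is not
percent_mito = np.array([1.0, 2.0, 3.0, 4.0])
expression = np.column_stack([
    2.0 * percent_mito + 1.0,
    np.array([0.5, -0.5, 0.5, -0.5]),
])
residuals = regress_out(expression, percent_mito)
```

In this toy case the residuals for gene 0 are essentially zero, while gene 1 keeps the variation that is unrelated to the covariate.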

Once you've selected your parameters, click Run preprocessing step. Feel free to navigate away from the modal window while the preprocess is running; you will get an email notification once the preprocess has successfully completed and you are able to move on to the next step.

Check out the Instructions & Tips tab in the modal window for more information about the preprocess, as well as recommendations on what to consider when normalizing your data.

As with all single cell RNA-seq preprocesses, your sample-level metadata is available for reference within the modal window under the Samples tab.

Navigating the results

The Normalize step returns several plots for you to review. These plots are important to assess, especially when deciding whether to run the next preprocess step, Integrate. Most plots are interactive. For the categorical dimensionality reduction plots (Sample and Cell cycle), you can toggle between groups of cells by clicking on the legend: one click highlights a group, and clicking the same group again returns the plot to the original view. For the Elbow plot, you can hover over individual points for more information. The continuous dimensionality reduction plots (% Mito and UMI) are not interactive.
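For readers unfamiliar with elbow plots, the sketch below shows one common way the underlying values can be computed: the standard deviation captured by each principal component, ranked from largest to smallest. It is a hedged illustration using NumPy on simulated data, not the platform's code; the function name pc_standard_deviations and the simulated matrix are hypothetical.

```python
import numpy as np

def pc_standard_deviations(matrix, n_pcs=10):
    """Standard deviation explained by each principal component (sketch).

    matrix: cells x genes array of normalized, scaled expression.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Center each gene so PCA describes variation around the mean
    centered = matrix - matrix.mean(axis=0)
    # Singular values come back sorted from largest to smallest
    singular_values = np.linalg.svd(centered, compute_uv=False)
    stdev = singular_values / np.sqrt(matrix.shape[0] - 1)
    return stdev[:n_pcs]

# Simulated data with one dominant source of variation
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 50))
data[:, 0] *= 10
stdev = pc_standard_deviations(data, n_pcs=5)
```

Plotting these values against the component number gives the elbow plot. The point where the curve flattens is often used as a guide for how many principal components to keep.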

Accepting the results

If you want to change any of the preprocess parameters, you can rerun the preprocess by updating your parameters and clicking the Apply new updates button. Once you are ready to move on to the next preprocessing step in the workflow, click Accept results & proceed. This opens a confirmation window, where you can click Yes, accept & proceed to continue with the workflow, or No, take me back if you would like to keep modifying the Normalize step.

What's next?

After you have successfully completed the Normalize step, you will move on to the Integrate preprocess, which is optional. Integration can be useful when working with multiple samples, batches, or experiments where variation clearly exists in low-dimensional space. However, integration is not always necessary.

References

  1. Hao and Hao et al. Integrated analysis of multimodal single-cell data. Cell (2021). doi: 10.1016/j.cell.2021.04.048
  2. Hafemeister and Satija. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biology (2019). doi: 10.1186/s13059-019-1874-1