CellKb: Cell type annotation suggestions

Leverage CellKb's database to help identify cell types in scRNA-seq

In single-cell RNA sequencing (scRNA-seq) experiments, researchers often analyze groups of cells - or clusters - to identify different cell types or states. Annotating these clusters involves identifying the specific cell types represented by the genes expressed in each cluster. This is crucial for understanding how cells function, interact, and contribute to biological processes.

However, annotating clusters can be challenging. Each cluster may express hundreds or thousands of genes, and it is often hard to assign an accurate cell type label without prior knowledge or a reference. This is where CellKb, a curated database of cell type markers, comes in.

CellKb makes annotating single-cell clusters easier and more reliable. It contains a collection of curated cell type signatures, which are essentially lists of genes that are characteristic of different cell types. These signatures have been gathered from published studies and are annotated using standard cell type classifications. When you use CellKb, it compares the gene list from your cluster with its cell type signatures and predicts which cell type best matches your cluster.

Using CellKb in Pluto

To annotate your clusters using CellKb in Pluto, navigate to your scRNA-seq experiment and click into the Annotation page. From there, click into the Annotation set that you would like to annotate.

Select a cluster, then click "Help identify this cluster". From the CellKb suggestions tab, click the "Load CellKb suggestions" button at the bottom of the menu.

CellKb will generate a list of the top 3 predicted cell types, as well as a list of the 10 best matching cell types. You can select any cell type to label your cluster by clicking "Use as label" next to the cell type name.

Continue reading to learn more about how CellKb works, and what the scores mean for the predicted and best matching cell types.

How CellKb works

To generate cell type annotations for a cluster, CellKb first compares the gene list from your cluster with its database of cell type signatures. It looks for the best match and identifies the top cell types that most closely match your cluster's gene expression profile.

CellKb then predicts the most likely cell types for your cluster based on the top matching signatures. For each of the top 3 predicted cell types, it calculates a cell type prediction score, which represents how well that cell type matches your cluster's gene expression. The prediction scores of the top 3 cell types always add up to 1, so a higher score means that cell type is a better match relative to the other two. These scores are relative to your specific query and can't be directly compared across different queries.

In other words, the top predicted cell type with the highest score is the one that most closely matches your cluster's gene expression.
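To see why scores that add up to 1 cannot be compared across queries, consider the minimal sketch below. It assumes, purely for illustration, that each candidate cell type has some raw match strength and that the top 3 are rescaled by their sum. The function name, the raw values, and the rescaling itself are hypothetical; CellKb's actual calculation is not described here.

```python
def normalize_top_scores(raw_scores, top_n=3):
    """Keep the top_n highest raw scores and rescale them so they sum to 1."""
    top = sorted(raw_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    total = sum(score for _, score in top)
    if total <= 0:
        raise ValueError("raw scores must sum to a positive value")
    return {cell_type: score / total for cell_type, score in top}


# Hypothetical raw match strengths for two different clusters.
# Cluster A matches its candidates ten times more strongly than cluster B.
cluster_a = {"T cell": 8.0, "NK cell": 1.0, "B cell": 1.0, "Monocyte": 0.5}
cluster_b = {"T cell": 0.8, "NK cell": 0.1, "B cell": 0.1, "Monocyte": 0.05}

scores_a = normalize_top_scores(cluster_a)
scores_b = normalize_top_scores(cluster_b)

print(scores_a)
print(scores_b)
```

Both clusters end up with nearly identical relative scores even though the underlying matches differ in strength, which is why a prediction score only tells you how the top 3 candidates compare with one another within a single query.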

For the top 10 best matching cell types, CellKb ranks them based on how closely the genes in your query match the cell type signatures. A higher rank-based match score indicates a stronger match between your cluster's gene list and the cell type signature. These match scores are comparable across different queries, so you can see how well your cluster’s genes align with the cell type signatures in the database. To evaluate the statistical significance of these matches, CellKb uses Fisher's exact test with a False Discovery Rate (FDR) adjustment. A lower FDR value indicates a more reliable match, reducing the chance of a false positive.

A high rank-based score and a low FDR indicate a strong, reliable match between your cluster's gene list and the cell type signature.
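The statistical test can be sketched as follows. This is a simplified illustration rather than CellKb's implementation: it assumes a one-sided Fisher's exact test on the overlap between the query gene list and each signature, a fixed background of 20,000 genes, and the Benjamini-Hochberg procedure as the FDR adjustment. The function names, gene counts, and background size are all hypothetical.

```python
from math import comb


def fisher_enrichment_p(overlap, query_size, signature_size, background_size):
    """One-sided Fisher's exact test: probability of seeing at least
    `overlap` shared genes by chance (hypergeometric upper tail)."""
    max_overlap = min(query_size, signature_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(signature_size, k)
        * comb(background_size - signature_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical query: 100 cluster genes against a 20,000-gene background.
background_size = 20000
query_size = 100
# signature name -> (genes shared with the query, signature size)
signatures = {"T cell": (30, 200), "B cell": (5, 150), "Fibroblast": (1, 300)}

names = list(signatures)
p_values = [
    fisher_enrichment_p(signatures[n][0], query_size, signatures[n][1], background_size)
    for n in names
]
fdr = dict(zip(names, benjamini_hochberg(p_values)))

for name, p in zip(names, p_values):
    print(f"{name}: p={p:.3g}, FDR={fdr[name]:.3g}")
```

In this made-up example, the signature sharing 30 genes with the query receives a far lower FDR than the one sharing a single gene, mirroring how a low FDR flags a match that is unlikely to be due to chance.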

Interpreting your results

Once you've generated potential cell types with CellKb, it’s time to interpret the results. Here are a few common scenarios you might encounter:

  1. Clear match to one cell type

    If a single cell type has a high prediction score and is supported by several matching reference signatures, it's a strong indication that your cluster represents that cell type. A low FDR means the match is statistically reliable, which suggests good data quality and well-separated clusters.

  2. Multiple cell types with similar prediction scores

    When multiple cell types have similar prediction scores, it indicates that your cluster may contain a mix of cell types, or its gene expression profile closely resembles multiple related cell types. It might also suggest that the clusters are not well separated, making annotation trickier.

    You can refine your annotation by examining marker genes from the top predicted cell types. Alternatively, you can explore different clustering resolutions or annotation sets to see if they provide more refined groups, which may lead to clearer annotations.

  3. Low confidence in the prediction

    If CellKb predicts a cell type with a high prediction score but the rank-based score is low and the FDR is high, this suggests the gene list doesn't closely match any known signature. Because prediction scores are relative to the other top candidates, a high prediction score on its own does not guarantee a strong match. This pattern could indicate the presence of stressed, dead, or contaminated cells, or possibly an uncharacterized cell type.

    To improve results, check your data quality for potential issues like contamination or cell stress, and revisit the filtering step in your preprocessing workflow if needed.
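The three scenarios above can be summarized as a rough decision rule. The sketch below is only an illustration: the function name and every threshold are hypothetical, the rank-based match score is assumed to lie on a 0 to 1 scale purely for the example, and none of these cutoffs come from CellKb. Treat it as a way to organize your own review, not as a substitute for inspecting marker genes.

```python
def interpret_result(prediction_scores, best_match_score, best_match_fdr,
                     min_match_score=0.5, max_fdr=0.05, dominance_margin=0.2):
    """Classify a result into one of the three scenarios.

    prediction_scores: dict of cell type -> prediction score (summing to 1)
    best_match_score:  rank-based match score of the best matching signature
    best_match_fdr:    FDR of the best matching signature
    All thresholds are illustrative assumptions, not CellKb defaults.
    """
    ranked = sorted(prediction_scores.values(), reverse=True)
    # Scenario 3: weak or unreliable match, whatever the prediction score says.
    if best_match_score < min_match_score or best_match_fdr > max_fdr:
        return "low confidence"
    # Scenario 2: the top two candidates are too close to call.
    if len(ranked) > 1 and ranked[0] - ranked[1] < dominance_margin:
        return "multiple candidates"
    # Scenario 1: one cell type clearly dominates and the match is reliable.
    return "clear match"


print(interpret_result({"T cell": 0.8, "NK cell": 0.1, "B cell": 0.1}, 0.9, 0.001))
print(interpret_result({"T cell": 0.4, "NK cell": 0.35, "B cell": 0.25}, 0.9, 0.001))
print(interpret_result({"T cell": 0.8, "NK cell": 0.1, "B cell": 0.1}, 0.1, 0.6))
```

The low-confidence check runs first on purpose, so that a high prediction score cannot mask a weak rank-based score or a high FDR.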

Conclusion

With CellKb now available in Pluto Bio, annotating your single-cell clusters has never been easier. You can access CellKb directly through our platform’s intuitive point-and-click interface - no complex integration required. Simply explore the cell type annotations and let CellKb help you make accurate, reliable predictions based on your data.

If you have any questions or need assistance, don’t hesitate to reach out to our support team at support@pluto.bio. We're here to help!