Step 6: Clustering
In Step 6, cells are clustered into groups with similar expression profiles using the Seurat implementation of Louvain community detection, with the PCA dimensionality reduction as input (Macosko et al. 2015).
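To make the roles of `par_FindNeighbors_k.param` and `par_FindNeighbors_prune.SNN` more concrete, the following is a minimal, simplified sketch of how a shared nearest-neighbour (SNN) graph can be built from a low-dimensional embedding. It is not the Seurat code: the function names `knn` and `snn_graph`, the brute-force neighbour search, and the toy coordinates are all assumptions made for illustration. Each edge is weighted by the Jaccard index of the two cells' neighbour sets, and edges at or below the cutoff are dropped.

```python
import math


def knn(points, k):
    """For each point, return the set of indices of its k nearest points.

    The point itself is counted as its own nearest neighbour
    (an assumption of this sketch). Brute force, for illustration only.
    """
    neighbours = []
    for p in points:
        # Sort all indices by distance to p; ties are broken by index.
        order = sorted(range(len(points)),
                       key=lambda j: (math.dist(p, points[j]), j))
        neighbours.append(set(order[:k]))
    return neighbours


def snn_graph(points, k, prune):
    """Return {(i, j): jaccard} for pairs whose neighbour overlap exceeds `prune`."""
    nn = knn(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nn[i] & nn[j])
            if shared == 0:
                continue
            jaccard = shared / len(nn[i] | nn[j])
            # Edges at or below the cutoff are removed from the graph.
            if jaccard > prune:
                edges[(i, j)] = jaccard
    return edges


# Hypothetical 2-D "embedding": two well-separated groups of three cells.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = snn_graph(points, k=3, prune=1 / 15)
for (i, j), weight in sorted(edges.items()):
    print(f"cell {i} - cell {j}: Jaccard = {weight:.2f}")
```

In this toy example only cells within the same group remain connected. A larger k makes neighbourhoods overlap more broadly, while a higher pruning cutoff removes weaker edges before community detection is run on the graph.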
The following parameters are adjustable for Step 6 (~/working_directory/job_info/parameters/step6_par.txt):
Parameter | Default | Description |
---|---|---|
par_save_RNA | Yes | Whether or not to export an RNA expression matrix |
par_save_metadata | Yes | Whether or not to export a metadata dataframe |
par_seurat_object | NULL | If users already have a Seurat object, they may provide the path to the Seurat object to initiate the pipeline at Step 6 |
par_skip_integration | No | Whether or not the user skipped integration in Step 5 |
par_FindNeighbors_dims | 25 | Number of dimensions from linear dimensional reduction used as input to identify neighbours. Can be informed by the elbow and Jackstraw plots produced in Step 5 |
par_RunUMAP_dims | 25 | Number of dimensions to use as input features for uniform manifold approximation and projection (UMAP) |
par_FindNeighbors_k.param | 45 | Defines k for the k-nearest neighbor algorithm |
par_FindNeighbors_prune.SNN | 1/15 | Sets the cutoff for acceptable Jaccard index when computing the neighborhood overlap for the shared nearest-neighbour (SNN) construction |
par_FindClusters_resolution | 0, 0.05, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 2.0 | Value of the clustering resolution parameter. You may provide multiple resolution values |
par_compute_ARI | Yes | Whether or not to compute the Adjusted Rand Index (ARI) between repeated clusterings at a given clustering resolution |
par_RI_reps | 25 | Number of iterations for clustering the data at a given resolution in order to calculate the ARI |
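For readers unfamiliar with the ARI referenced by `par_compute_ARI` and `par_RI_reps`, the sketch below computes it from two label vectors using the standard pair-counting formula. How the pipeline aggregates the index across repetitions is not stated here, so the helper `mean_pairwise_ari` (averaging over all pairs of runs) is purely an assumption for illustration, as are the function names and the toy labels.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare; treated as perfect agreement here

    # Pairs of cells grouped together in both, in A only, and in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (max_index - expected)


def mean_pairwise_ari(runs):
    """Hypothetical stability summary: mean ARI over all pairs of clustering runs."""
    pairs = list(combinations(runs, 2))
    return sum(adjusted_rand_index(a, b) for a, b in pairs) / len(pairs)


# Toy labels from three hypothetical repeated clusterings of six cells.
runs = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],  # same partition as the first run, labels renamed
    [0, 0, 1, 1, 1, 2],  # one cell assigned differently
]
print(adjusted_rand_index(runs[0], runs[1]))
print(mean_pairwise_ari(runs))
```

The index is 1 when two clusterings define the same partition regardless of how clusters are numbered, and it is close to 0 when agreement is no better than chance. Values near 1 across repetitions therefore suggest that the clusters at that resolution are stable.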
To run Step 6, use the following command:
bash $SCRNABOX_HOME/launch_scrnabox.sh \
-d ${SCRNABOX_PWD} \
--steps 6
The resulting output files are deposited into ~/working_directory/step6. For a description of the outputs, see here.