commot.pp.infer_spatial_information

commot.pp.infer_spatial_information(adata_sc, adata_sp, cost_sc_sp=None, cost_sc=None, cost_sp=None, ot_alpha=0.2, ot_rho=0.05, ot_epsilon=0.01, exp_pred_prone=0.0, loc_pred_k=1, return_gamma=False)

Infer spatial information.

Given paired spatial data and scRNA-seq data, estimate the spatial origin of the cells in the scRNA-seq data and impute gene expression for the spatial data [Cang2020].

Parameters
  • adata_sc (AnnData) – The data matrix for scRNA-seq data of shape n_obs(sc) × n_vars(sc). Rows correspond to cells and columns to genes.

  • adata_sp (AnnData) – The data matrix for spatial data of shape n_obs(sp) × n_vars(sp). Rows correspond to positions and columns to genes.

  • cost_sc_sp (Optional[ndarray]) – The dissimilarity matrix between scRNA-seq data and spatial data of shape n_obs(sc) × n_obs(sp). If not given, 1 - Spearman’s r on common genes is used.

  • cost_sc (Optional[ndarray]) – The dissimilarity matrix within scRNA-seq data of shape n_obs(sc) × n_obs(sc). Only needed when structured optimal transport is used (ot_alpha > 0). If not given, the Euclidean distance in PCA space is used.

  • cost_sp (Optional[ndarray]) – The distance matrix within spatial data of shape n_obs(sp) × n_obs(sp). Only needed when structured optimal transport is used (ot_alpha > 0). If not given, the spatial distance among spatial locations is used.

  • ot_alpha (float) – Weight of the structured component in optimal transport, in [0, 1].

  • ot_rho (float) – Marginal relaxation term (>0). Traditional OT is recovered when ot_rho=inf.

  • ot_epsilon (float) – Entropy regularization term (>0). A higher value will generate a denser mapping matrix.

  • exp_pred_prone (float) – The percentage of cells with low weights to ignore when predicting gene expression for each spatial position. A higher percentage will increase the sparseness of the predicted spatial data due to the sparseness in the scRNA-seq data.

  • loc_pred_k (int) – Number of top spatial matches for predicting spatial origin of cells.

  • return_gamma (bool) – Whether to return the optimal transport plan (gamma matrix).
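To make the default choice for cost_sc_sp more concrete, the following is a minimal NumPy sketch of a "1 - Spearman's r on common genes" dissimilarity. It is an illustration only and not necessarily how the library computes it. The helper names average_ranks and spearman_cost, the gene-name arrays, and the toy matrices are all hypothetical.

```python
import numpy as np

def average_ranks(x):
    """Rank a 1-D array (1-based), giving tied values their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)              # last 1-based position of each tie group
    avg = ends - (counts - 1) / 2.0       # mean position within each tie group
    return avg[inverse]

def spearman_cost(X_sc, X_sp):
    """Return 1 - Spearman's r between every row of X_sc and every row of X_sp.

    Both inputs must share the same gene columns in the same order.
    Rows with constant expression have undefined correlation (zero norm)
    and are not handled in this sketch.
    """
    R_sc = np.apply_along_axis(average_ranks, 1, X_sc)
    R_sp = np.apply_along_axis(average_ranks, 1, X_sp)
    # Pearson correlation of the ranks: center, normalize, take dot products.
    R_sc = R_sc - R_sc.mean(axis=1, keepdims=True)
    R_sp = R_sp - R_sp.mean(axis=1, keepdims=True)
    R_sc = R_sc / np.linalg.norm(R_sc, axis=1, keepdims=True)
    R_sp = R_sp / np.linalg.norm(R_sp, axis=1, keepdims=True)
    return 1.0 - R_sc @ R_sp.T            # shape n_obs(sc) x n_obs(sp)

# Toy data (hypothetical): 2 cells x 5 genes and 1 spatial position x 4 genes.
genes_sc = np.array(["g0", "g1", "g2", "g3", "g4"])
genes_sp = np.array(["g0", "g1", "g2", "g3"])
X_sc = np.array([[1.0, 2.0, 3.0, 4.0, 0.0],
                 [4.0, 3.0, 2.0, 1.0, 0.0]])
X_sp = np.array([[10.0, 20.0, 30.0, 40.0]])

# Restrict both matrices to the genes they share before comparing.
common, idx_sc, idx_sp = np.intersect1d(genes_sc, genes_sp, return_indices=True)
cost_sc_sp = spearman_cost(X_sc[:, idx_sc], X_sp[:, idx_sp])
```

A matrix built this way has shape n_obs(sc) × n_obs(sp) and could be passed as cost_sc_sp in place of the default. Values lie in [0, 2], where 0 means identical gene rankings.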

Returns

  • adata_sc_pred (anndata.AnnData) – The scRNA-seq data with predicted spatial origins in .obsm['spatial'].

  • adata_sp_pred (anndata.AnnData) – The spatial data with imputed gene expression.

  • gamma (np.ndarray) – The connectivity matrix between scRNA-seq data and spatial data, which is used as weights to generate the predicted datasets adata_sc_pred and adata_sp_pred. Returned only when return_gamma=True.
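As an illustration of how a gamma matrix can act as weights for both predictions, here is a small NumPy sketch. It assumes gamma has shape n_obs(sc) × n_obs(sp), that imputation is a gamma-weighted average over cells, and that a cell's location is the weighted average of its top-k spatial matches. The exact normalization used by the library is not stated here, and the function names impute_expression and predict_locations are hypothetical.

```python
import numpy as np

def impute_expression(gamma, X_sc):
    """Impute expression at each spatial position as a gamma-weighted
    average of the scRNA-seq expression profiles (assumed scheme)."""
    weights = gamma / gamma.sum(axis=0, keepdims=True)   # each column sums to 1
    return weights.T @ X_sc                               # (n_sp, n_genes)

def predict_locations(gamma, coords_sp, k=1):
    """Predict each cell's coordinates from its k highest-weight spatial
    positions, averaging their coordinates by weight (assumed scheme)."""
    top = np.argsort(-gamma, axis=1)[:, :k]               # (n_sc, k) indices
    w = np.take_along_axis(gamma, top, axis=1)            # matching weights
    w = w / w.sum(axis=1, keepdims=True)                  # normalize per cell
    return np.einsum("ik,ikd->id", w, coords_sp[top])     # (n_sc, n_dims)

# Toy data (hypothetical): 2 cells, 2 spatial positions, 2 genes.
gamma = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
X_sc = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
coords_sp = np.array([[0.0, 0.0],
                      [10.0, 0.0]])

X_sp_pred = impute_expression(gamma, X_sc)
loc_k1 = predict_locations(gamma, coords_sp, k=1)
loc_k2 = predict_locations(gamma, coords_sp, k=2)
```

With k=1 each cell is placed exactly on its best-matching position, which mirrors the default loc_pred_k=1. Larger k interpolates between the top matches.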

References

Cang2020

Cang, Z., & Nie, Q. (2020). Inferring spatial and signaling relationships between cells from single cell transcriptomic data. Nature Communications, 11(1), 1-13.