commot.tl.cluster_communication_spatial_permutation

commot.tl.cluster_communication_spatial_permutation(adata, df_ligrec=None, database_name=None, heteromeric=False, heteromeric_rule='min', heteromeric_delimiter='_', dis_thr=None, cost_scale=None, cost_type='euc', cot_eps_p=0.1, cot_eps_mu=None, cot_eps_nu=None, cot_rho=10.0, cot_nitermax=100, cot_weights=(0.25, 0.25, 0.25, 0.25), smooth=False, smth_eta=None, smth_nu=None, smth_kernel='exp', clustering=None, perm_type='within_cluster', n_permutations=100, random_seed=1, verbose=True, copy=False)

Infer cluster-cluster communication and compute p-values by permuting cell/spot locations.

The cluster_communication function, which uses label permutations, is computationally efficient but may overestimate communication between neighboring cell clusters. This function instead computes the p-value by permuting the locations of cells or spots, which requires more computation time because the communication matrix is recomputed for each permutation. To avoid repeated calculation of the cell-level CCC matrices, the cluster-level CCC is summarized for all LR pairs and signaling pathways.
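The idea of a location-permutation p-value can be illustrated with a minimal NumPy sketch. This is not COMMOT's implementation: toy_cluster_score is a hypothetical stand-in for the collective optimal transport step, the p-value is taken as the fraction of permuted scores at least as large as the observed one (an assumption about the exact formula), and np.random.default_rng is used here purely for convenience.

```python
import numpy as np

def toy_cluster_score(coords, ligand, receptor, labels, n_clusters, dis_thr):
    """Hypothetical stand-in for the cluster-level CCC summary (not optimal transport)."""
    # Pairwise Euclidean distances between cells/spots.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Toy cell-level signal: ligand_i * receptor_j for pairs within dis_thr.
    cell_signal = np.outer(ligand, receptor) * (dist <= dis_thr)
    # Average the cell-level signal over each sender/receiver cluster pair.
    onehot = np.eye(n_clusters)[labels]
    sums = onehot.T @ cell_signal @ onehot
    sizes = onehot.sum(axis=0)
    return sums / np.maximum(np.outer(sizes, sizes), 1)

def location_permutation_pvalue(coords, ligand, receptor, labels, n_clusters,
                                dis_thr, n_permutations=100, random_seed=1):
    rng = np.random.default_rng(random_seed)
    observed = toy_cluster_score(coords, ligand, receptor, labels, n_clusters, dis_thr)
    n_extreme = np.zeros_like(observed)
    for _ in range(n_permutations):
        # Shuffle which location each cell occupies; expression and labels stay with the cell.
        perm = rng.permutation(coords.shape[0])
        permuted = toy_cluster_score(coords[perm], ligand, receptor, labels,
                                     n_clusters, dis_thr)
        # The score is recomputed from scratch for every permutation.
        n_extreme += permuted >= observed
    return observed, n_extreme / n_permutations

# Toy data: cluster 0 expresses only the ligand, cluster 1 only the receptor.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(60, 2))
labels = (coords[:, 0] > 0.5).astype(int)
ligand = rng.uniform(size=60) * (labels == 0)
receptor = rng.uniform(size=60) * (labels == 1)
observed, pvalue = location_permutation_pvalue(
    coords, ligand, receptor, labels, n_clusters=2, dis_thr=0.2, n_permutations=50
)
```

Because the score is recomputed inside the loop, the cost grows linearly with n_permutations, which is the trade-off described above.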

Parameters
  • adata (AnnData) – The data matrix of shape n_obs × n_var. Rows correspond to cells or spots and columns to genes. If the spatial distance is absent in .obsp['spatial_distance'], Euclidean distance determined from .obsm['spatial'] will be used.

  • df_ligrec (Optional[DataFrame]) – A data frame where each row corresponds to a ligand-receptor pair with ligands, receptors, and the associated signaling pathways in the three columns, respectively.

  • database_name (Optional[str]) – Name of the ligand-receptor interaction database. Will be included in the keywords for anndata slots.

  • heteromeric (bool) – Whether the ligands or receptors are made of heteromeric complexes.

  • heteromeric_rule (str) – Use either ‘min’ (minimum) or ‘ave’ (average) expression of the components as the complex level.

  • heteromeric_delimiter (str) – The character in ligand and receptor names separating individual components.

  • dis_thr (Optional[float]) – The spatial distance threshold for signaling.

  • cost_scale (Optional[dict]) – Weight coefficients of the cost matrix for each ligand-receptor pair, e.g. cost_scale[('ligA','recA')] specifies the weight for the pair ligA and recA. If None, all pairs have the same weight.

  • cost_type (str) – If ‘euc’, the original Euclidean distance will be used as cost matrix. If ‘euc_square’, the square of the Euclidean distance will be used.

  • cot_eps_p (float) – The coefficient of entropy regularization for transport plan.

  • cot_eps_mu (Optional[float]) – The coefficient of entropy regularization for untransported source (ligand). Set equal to cot_eps_p for the fast algorithm.

  • cot_eps_nu (Optional[float]) – The coefficient of entropy regularization for unfulfilled target (receptor). Set equal to cot_eps_p for the fast algorithm.

  • cot_rho (float) – The coefficient of penalty for unmatched mass.

  • cot_nitermax (int) – Maximum number of iterations for the collective optimal transport algorithm. The default here (100) is much smaller than the default in spatial_communication, to speed up the repeated OT calculations. The resulting communication matrices will differ slightly from those obtained using 10000 for cot_nitermax but remain very similar.

  • cot_weights (tuple) – A tuple of four weights that add up to one. The weights correspond to four setups of collective optimal transport: 1) all ligands-all receptors, 2) each ligand-all receptors, 3) all ligands-each receptor, 4) each ligand-each receptor.

  • smooth (bool) – Whether to (spatially) smooth the gene expression to identify more global signaling trends.

  • smth_eta (Optional[float]) – Kernel bandwidth for smoothing.

  • smth_nu (Optional[float]) – Kernel sharpness for smoothing.

  • smth_kernel (str) – The smoothing kernel: ‘exp’ for the exponential kernel or ‘lorentz’ for the Lorentz kernel.

  • clustering (Optional[str]) – Name of clustering with the labels stored in .obs[clustering].

  • perm_type (str) – The type of permutation to perform. If “within_cluster”, the cells/spots are permuted within each cluster. If “all_cell”, the cells/spots are permuted all together.

  • n_permutations (int) – Number of location permutations for computing the p-value.

  • random_seed (int) – The numpy random_seed for reproducible random permutations.

  • verbose (bool) – Whether to print the permutation iterations.

  • copy (bool) – Whether to return a copy of the anndata.AnnData.
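The two perm_type options can be pictured as two ways of generating a location index permutation. The sketch below is an illustration under stated assumptions, not the library's code: permute_location_indices is a hypothetical helper, and np.random.default_rng is used for the random draws.

```python
import numpy as np

def permute_location_indices(labels, perm_type="within_cluster", rng=None):
    """Return idx such that cell i is assigned the location of cell idx[i]."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    n = labels.shape[0]
    if perm_type == "all_cell":
        # Any cell may take the location of any other cell.
        return rng.permutation(n)
    if perm_type == "within_cluster":
        # Locations are only exchanged between cells that share a cluster label.
        idx = np.arange(n)
        for cluster in np.unique(labels):
            members = np.flatnonzero(labels == cluster)
            idx[members] = rng.permutation(members)
        return idx
    raise ValueError(f"Unknown perm_type: {perm_type!r}")

labels = np.array(["A", "A", "B", "B", "B", "A"])
rng = np.random.default_rng(1)
idx_within = permute_location_indices(labels, "within_cluster", rng)
idx_all = permute_location_indices(labels, "all_cell", rng)
```

With "within_cluster", the set of locations occupied by each cluster is unchanged and only the assignment of cells to those locations is shuffled. With "all_cell", the spatial arrangement of the clusters themselves is randomized.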

Returns

adata – For example, the cluster-level communication summary computed by this location permutation method for the LR pair ‘ligA’ and ‘recA’ from the database ‘databaseX’ is stored in adata.uns['commot_cluster_spatial_permutation-databaseX-clustering-ligA-recA']. If copy=True, the AnnData object is returned; otherwise None is returned.

Return type

anndata.AnnData
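The result key documented under Returns can be assembled programmatically. The sketch below is only a convenience under assumptions: spatial_permutation_key is a hypothetical helper, the segment order is copied from the documented example, and it is assumed (not stated above) that the ‘clustering’ segment is filled with the value passed as clustering. A plain dict stands in for adata.uns.

```python
def spatial_permutation_key(database_name, clustering, ligand, receptor):
    # Assumption: segments follow the order of the documented example key.
    return "-".join(
        ["commot_cluster_spatial_permutation", database_name, clustering, ligand, receptor]
    )

key = spatial_permutation_key("databaseX", "clustering", "ligA", "recA")

# Plain dict standing in for adata.uns; the stored value is a placeholder.
uns = {key: "placeholder for the stored summary", "unrelated": None}

# Listing keys by prefix avoids relying on the assumed segment order.
matches = [k for k in uns if k.startswith("commot_cluster_spatial_permutation-")]
```

In practice, listing the matching keys in adata.uns after a run is the safer way to confirm the exact naming.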