commot.tl.cluster_communication_spatial_permutation
- commot.tl.cluster_communication_spatial_permutation(adata, df_ligrec=None, database_name=None, heteromeric=False, heteromeric_rule='min', heteromeric_delimiter='_', dis_thr=None, cost_scale=None, cost_type='euc', cot_eps_p=0.1, cot_eps_mu=None, cot_eps_nu=None, cot_rho=10.0, cot_nitermax=100, cot_weights=(0.25, 0.25, 0.25, 0.25), smooth=False, smth_eta=None, smth_nu=None, smth_kernel='exp', clustering=None, perm_type='within_cluster', n_permutations=100, random_seed=1, verbose=True, copy=False)
Infer cluster-cluster communication and compute p-values by permuting cell/spot locations.
The cluster_communication function, which uses label permutations, is computationally efficient but may overestimate communication between neighboring cell clusters. This function instead computes the p-value by permuting the locations of cells or spots, which requires more computation time because the communication matrix is recomputed for each permutation. To avoid repeated calculation of the cell-level CCC matrices, the cluster-level CCC is summarized for all LR pairs and signaling pathways.
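The idea of a location-permutation test can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: `toy_cluster_score` is a hypothetical stand-in for a cluster-level CCC score (it only counts nearby cell pairs), and the add-one p-value estimate is a common general convention, not necessarily the one COMMOT uses internally.

```python
import numpy as np

def toy_cluster_score(coords, labels, src, dst, dis_thr):
    """Hypothetical stand-in for a cluster-level CCC score:
    the number of (src, dst) cell pairs closer than dis_thr."""
    a = coords[labels == src]
    b = coords[labels == dst]
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(np.sum(dist <= dis_thr))

def location_permutation_p_value(coords, labels, src, dst, dis_thr,
                                 n_permutations=100, random_seed=1):
    """Shuffle locations among all cells and recompute the score each time."""
    rng = np.random.default_rng(random_seed)
    observed = toy_cluster_score(coords, labels, src, dst, dis_thr)
    null = np.empty(n_permutations)
    for k in range(n_permutations):
        shuffled = coords[rng.permutation(len(coords))]
        null[k] = toy_cluster_score(shuffled, labels, src, dst, dis_thr)
    # Add-one estimate (an assumption; keeps the p-value above zero).
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, float(p_value)

# Clusters A and B share one location; cluster C is spread far apart.
near = np.zeros((20, 2))
far = np.column_stack([100.0 * np.arange(1, 31), np.zeros(30)])
coords = np.vstack([near, far])
labels = np.array(["A"] * 10 + ["B"] * 10 + ["C"] * 30)

observed, p_value = location_permutation_p_value(coords, labels, "A", "B", dis_thr=5.0)
```

Because the score is recomputed for every shuffle, the cost grows linearly with the number of permutations, which is the trade-off described above.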
- Parameters
adata (AnnData) – The data matrix of shape n_obs × n_var. Rows correspond to cells or spots and columns to genes. If the spatial distance is absent in .obsp['spatial_distance'], the Euclidean distance determined from .obsm['spatial'] will be used.
df_ligrec (Optional[DataFrame]) – A data frame where each row corresponds to a ligand-receptor pair, with ligands, receptors, and the associated signaling pathways in the three columns, respectively.
database_name (Optional[str]) – Name of the ligand-receptor interaction database. Will be included in the keywords for anndata slots.
heteromeric (bool) – Whether the ligands or receptors are made of heteromeric complexes.
heteromeric_rule (str) – Use either 'min' (minimum) or 'ave' (average) expression of the components as the complex level.
heteromeric_delimiter (str) – The character in ligand and receptor names separating individual components.
dis_thr (Optional[float]) – The threshold of spatial distance of signaling.
cost_scale (Optional[dict]) – Weight coefficients of the cost matrix for each ligand-receptor pair, e.g. cost_scale[('ligA', 'recA')] specifies the weight for the pair ligA and recA. If None, all pairs have the same weight.
cost_type (str) – If 'euc', the original Euclidean distance will be used as the cost matrix. If 'euc_square', the square of the Euclidean distance will be used.
cot_eps_p (float) – The coefficient of entropy regularization for the transport plan.
cot_eps_mu (Optional[float]) – The coefficient of entropy regularization for untransported source (ligand). Set equal to cot_eps_p for the fast algorithm.
cot_eps_nu (Optional[float]) – The coefficient of entropy regularization for unfulfilled target (receptor). Set equal to cot_eps_p for the fast algorithm.
cot_rho (float) – The coefficient of penalty for unmatched mass.
cot_nitermax (int) – Maximum number of iterations for the collective optimal transport algorithm. The default here (100) is much smaller than the default in spatial_communication, to speed up the repeated OT calculations. The resulting communication matrices differ slightly from those obtained with cot_nitermax=10000 but are very similar.
cot_weights (tuple) – A tuple of four weights that add up to one. The weights correspond to four setups of collective optimal transport: 1) all ligands-all receptors, 2) each ligand-all receptors, 3) all ligands-each receptor, 4) each ligand-each receptor.
smooth (bool) – Whether to (spatially) smooth the gene expression to identify more global signaling trends.
smth_eta (Optional[float]) – Kernel bandwidth for smoothing.
smth_nu (Optional[float]) – Kernel sharpness for smoothing.
smth_kernel (str) – 'exp' for the exponential kernel, 'lorentz' for the Lorentz kernel.
clustering (Optional[str]) – Name of the clustering, with the labels stored in .obs[clustering].
perm_type (str) – The type of permutation to perform. If 'within_cluster', the cells/spots are permuted within each cluster. If 'all_cell', the cells/spots are permuted all together.
n_permutations (int) – Number of location permutations for computing the p-value.
random_seed (int) – The numpy random seed for reproducible random permutations.
verbose (bool) – Whether to print the permutation iterations.
copy (bool) – Whether to return a copy of the anndata.AnnData.
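The difference between the two perm_type options can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not COMMOT's implementation: `permute_locations` is a hypothetical helper, and the coordinates and labels stand in for .obsm['spatial'] and .obs[clustering].

```python
import numpy as np

def permute_locations(coords, labels, perm_type="within_cluster", rng=None):
    """Return a copy of coords with rows reassigned among cells.

    'all_cell' shuffles locations across every cell/spot;
    'within_cluster' shuffles only among cells sharing a label.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = coords.shape[0]
    if perm_type == "all_cell":
        return coords[rng.permutation(n)]
    if perm_type == "within_cluster":
        idx = np.arange(n)
        for cluster in np.unique(labels):
            members = np.flatnonzero(labels == cluster)
            # Each cell receives the location of another cell in its cluster.
            idx[members] = rng.permutation(members)
        return coords[idx]
    raise ValueError(f"unknown perm_type: {perm_type!r}")

coords = np.arange(12, dtype=float).reshape(6, 2)
labels = np.array(["a", "a", "a", "b", "b", "b"])

within = permute_locations(coords, labels, "within_cluster", np.random.default_rng(1))
everywhere = permute_locations(coords, labels, "all_cell", np.random.default_rng(1))
```

With 'within_cluster', each cluster keeps its overall spatial footprint and only the arrangement of cells inside it changes; with 'all_cell', the footprint of each cluster is randomized as well.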
- Returns
adata – For example, the cluster-level communication summary computed by this location permutation method for the LR pair 'ligA' and 'recA' from the database 'databaseX' is stored in
adata.uns['commot_cluster_spatial_permutation-databaseX-clustering-ligA-recA']
If copy=True, the AnnData object is returned; otherwise None is returned.
- Return type
anndata.AnnData
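One way to locate the stored results is to filter the keys of .uns by the prefix shown in the example key above, rather than rebuilding the full key by hand. The sketch below is an assumption-laden illustration: `spatial_permutation_keys` is a hypothetical helper, a plain dict stands in for adata.uns, and the matching on the database name assumes it appears as a hyphen-delimited segment of the key.

```python
def spatial_permutation_keys(uns, database_name=None):
    """List .uns keys written by the spatial permutation method.

    uns is any mapping with string keys (for example adata.uns).
    """
    prefix = "commot_cluster_spatial_permutation-"
    keys = [k for k in uns if isinstance(k, str) and k.startswith(prefix)]
    if database_name is not None:
        # Assumes the database name is a hyphen-delimited segment of the key.
        keys = [k for k in keys if f"-{database_name}-" in k]
    return sorted(keys)

# A plain dict standing in for adata.uns.
uns = {
    "commot_cluster_spatial_permutation-databaseX-clustering-ligA-recA": "summary",
    "some_other_entry": "ignored",
}
found = spatial_permutation_keys(uns, database_name="databaseX")
```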