commot.tl.spatial_communication
- commot.tl.spatial_communication(adata, database_name=None, df_ligrec=None, pathway_sum=False, heteromeric=False, heteromeric_rule='min', heteromeric_delimiter='_', dis_thr=None, cost_scale=None, cost_type='euc', cot_eps_p=0.1, cot_eps_mu=None, cot_eps_nu=None, cot_rho=10.0, cot_nitermax=10000, cot_weights=(0.25, 0.25, 0.25, 0.25), smooth=False, smth_eta=None, smth_nu=None, smth_kernel='exp', copy=False)
Infer spatial communication.
- Parameters
adata (
AnnData
) – The data matrix of shapen_obs
×n_var
. Rows correspond to cells or spots and columns to genes. If the spatial distance is absent in.obsp['spatial_distance']
, Euclidean distance determined from.obsm['spatial']
will be used.database_name (
Optional
[str
]) – Name of the ligand-receptor interaction database. Will be included in the keywords for anndata slots.df_ligrec (
Optional
[DataFrame
]) – A data frame where each row corresponds to a ligand-receptor pair with ligands, receptors, and the associated signaling pathways in the three columns, respectively.pathway_sum (
bool
) – Whether to sum over all ligand-receptor pairs of each pathway.heteromeric (
bool
) – Whether the ligands or receptors are made of heteromeric complexes.heteromeric_rule (
str
) – Use either ‘min’ (minimum) or ‘ave’ (average) expression of the components as the level for the heteromeric complex.heteromeric_delimiter (
str
) – The character in ligand and receptor names separating individual components.dis_thr (
Optional
[float
]) – The threshold of spatial distance of signaling.cost_scale (
Optional
[dict
]) – Weight coefficients of the cost matrix for each ligand-receptor pair, e.g. cost_scale[(‘ligA’,’recA’)] specifies weight for the pair ligA and recA. If None, all pairs have the same weight.cost_type (
str
) – If ‘euc’, the original Euclidean distance will be used as cost matrix. If ‘euc_square’, the square of the Euclidean distance will be used.cot_eps_p (
float
) – The coefficient of entropy regularization for transport plan.cot_eps_mu (
Optional
[float
]) – The coefficient of entropy regularization for untransported source (ligand). Set to equal to cot_eps_p for fast algorithm.cot_eps_nu (
Optional
[float
]) – The coefficient of entropy regularization for unfulfilled target (receptor). Set to equal to cot_eps_p for fast algorithm.cot_rho (
float
) – The coefficient of penalty for unmatched mass.cot_nitermax (
int
) – Maximum iteration for collective optimal transport algorithm.cot_weights (
tuple
) – A tuple of four weights that add up to one. The weights corresponds to four setups of collective optimal transport: 1) all ligands-all receptors, 2) each ligand-all receptors, 3) all ligands-each receptor, 4) each ligand-each receptor.smooth (
bool
) – Whether to (spatially) smooth the gene expression for identifying more global signaling trend.smth_eta (
Optional
[float
]) – Kernel bandwidth for smoothingsmth_nu (
Optional
[float
]) – Kernel sharpness for smoothingsmth_kernel (
str
) – ‘exp’ exponential kernel. ‘lorentz’ Lorentz kernel.copy (
bool
) – Whether to return a copy of theanndata.AnnData
.
- Returns
adata – Signaling matrices are added to
.obsp
, e.g., for a LR interaction database named “databaseX”,.obsp['commot-databaseX-ligA-recA']
is an_obs
×n_obs
matrix with the ij th entry being the “score” of cell i sending signal to cell j through ligA and recA. The marginal sums (sender and receiver) of the signaling matrices are stored in.obsm['commot-databaseX-sum-sender']
and.obsm['commot-databaseX-sum-receiver']
. Metadata of the analysis is added to.uns['commot-databaseX-info']
. If copy=True, return the AnnData object and return None otherwise.- Return type
anndata.AnnData
Examples
>>> import anndata >>> import commot as ct >>> import pandas as pd >>> import numpy as np >>> # create a toy data with four cells and four genes >>> X = np.array([[1,0,0,0],[2,0,0,0],[0,1,1,4],[0,3,4,1]], float) >>> adata = anndata.AnnData(X=X, var=pd.DataFrame(index=['LigA','RecA','RecB','RecC'])) >>> adata.obsm['spatial'] = np.array([[0,0],[1,0],[0,1],[1,2]], float) >>> # create a toy ligand-receptor database with two pairs sharing the same ligand >>> df_ligrec = pd.DataFrame([['LigA','RecA_RecB'],['LigA','RecC']]) >>> df_ligrec = pd.DataFrame([['LigA','RecA_RecB','PathwayA'],['LigA','RecC','PathwayA']]) >>> # infer CCC with a distance threshold of 1.1. Due the threshold, the last cell won't be communicating with other cells. >>> ct.tl.spatial_communication(adata, database_name='toyDB',df_ligrec=df_ligrec, pathway_sum=True, heteromeric=True, dis_thr=1.1) >>> adata AnnData object with n_obs × n_vars = 4 × 4 uns: 'commot-toyDB-info' obsm: 'spatial', 'commot-toyDB-sum-sender', 'commot-toyDB-sum-receiver' obsp: 'commot-toyDB-LigA-RecA_RecB', 'commot-toyDB-LigA-RecC', 'commot-toyDB-PathwayA', 'commot-toyDB-total-total' >>> print(adata.obsp['commot-toyDB-LigA-RecA_RecB']) (0, 2) 0.600000000000005 >>> print(adata.obsp['commot-toyDB-LigA-RecC']) (0, 2) 0.9000000000039632 >>> print(adata.obsm['commot-toyDB-sum-receiver']) r-LigA-RecA_RecB r-LigA-RecC r-total-total r-PathwayA 0 0.0 0.0 0.0 0.0 1 0.0 0.0 0.0 0.0 2 0.6 0.9 1.5 1.5 3 0.0 0.0 0.0 0.0 >>> print(adata.obsm['commot-toyDB-sum-sender']) s-LigA-RecA_RecB s-LigA-RecC s-total-total s-PathwayA 0 0.6 0.9 1.5 1.5 1 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 3 0.0 0.0 0.0 0.0 >>> print(adata.obsp['commot-toyDB-PathwayA']) (0, 2) 1.5000000000039682 >>> print(adata.obsp['commot-toyDB-total-total']) (0, 2) 1.5000000000039682 >>> adata.uns['commot-toyDB-info'] {'df_ligrec': ligand receptor pathway 0 LigA RecA_RecB PathwayA 1 LigA RecC PathwayA, 'distance_threshold': 1.1}