commot.tl.communication_impact
- commot.tl.communication_impact(adata, database_name=None, pathway_name=None, pathway_sum_only=False, heteromeric_delimiter='_', normalize=False, method=None, corr_method='spearman', tree_method='rf', tree_ntrees=100, tree_repeat=100, tree_max_depth=5, tree_max_features='sqrt', tree_learning_rate=0.1, tree_subsample=1.0, tree_combined=False, ds_genes=None, bg_genes=100)
Analyze impact of communication.
When ‘treebased_score’ is used as the method, importance can be diluted between the LR (ligand-receptor) pairs if ‘tree_combined’ is set to True. Therefore, if the unique impact of individual LR pairs on the target genes is not the focus, ‘tree_combined’ can be set to False. If the unique impact of signaling, beyond the intracellular regulatory impact of target genes, is not of interest, ‘bg_genes’ can be set to 0.
- Parameters
adata (AnnData) – The data matrix of shape n_obs × n_var. Rows correspond to cells or positions and columns to genes. The full normalized dataset should be available in adata.raw.
pathway_name (Optional[str]) – Name of the signaling pathway.
normalize (bool) – Whether to perform normalization before determining variable genes.
method (Optional[str]) – ‘partial_corr’: partial correlation. ‘semipartial_corr’: semipartial correlation. ‘treebased_score’: machine learning based score (ensemble of trees).
corr_method (str) – The correlation coefficient to use when method is ‘partial_corr’ or ‘semipartial_corr’. ‘spearman’: Spearman’s r. ‘pearson’: Pearson’s r.
tree_method (str) – The ensemble of trees method to use when method is ‘treebased_score’. ‘gbt’: gradient boosted trees. ‘rf’: random forest.
tree_ntrees (int) – Number of trees when using ‘treebased_score’.
tree_repeat (int) – Number of times to repeat to account for randomness when using ‘treebased_score’.
tree_max_depth – Max depth of trees when using ‘treebased_score’.
tree_max_features (str) – Max features for trees when using ‘treebased_score’.
tree_learning_rate (float) – Learning rate when using ‘treebased_score’.
tree_subsample (float) – Subsample (between 0 and 1) when using ‘treebased_score’.
tree_combined (bool) – If True, use a single model for each target gene with all features.
ds_genes (Optional[list]) – A list of genes for analyzing the correlation with cell-cell communication.
bg_genes (Union[list, int]) – If an integer, the top number of variable genes are used. Alternatively, a list of genes.
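The difference between the ‘partial_corr’ and ‘semipartial_corr’ options can be illustrated with the textbook first-order formulas for a single control variable. This is a minimal conceptual sketch in pure Python, not the routine COMMOT uses internally; the helper names and the toy values (a communication score, a target gene, and one background gene) are made up for illustration, and Pearson's r is used for simplicity.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing z from both x and y."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def semipartial_corr(x, y, z):
    """Correlation of x with the part of y not explained by z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz ** 2)

# Toy values, invented for illustration only.
signal = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]       # received communication score
target = [1.0, 1.3, 1.9, 2.2, 2.4, 3.1]       # expression of a gene in ds_genes
background = [0.5, 0.2, 0.9, 0.6, 1.1, 0.8]   # expression of one background gene

p = partial_corr(signal, target, background)
sp = semipartial_corr(signal, target, background)
print(f"partial: {p:.3f}, semipartial: {sp:.3f}")
```

With these formulas the partial correlation equals the semipartial correlation divided by sqrt(1 - r_xz^2), so its magnitude is never smaller than that of the semipartial correlation.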
- Returns
df_impact – A data frame describing the correlation between the ds_genes and cell-cell communication.
- Return type
pd.DataFrame
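A call might look like the sketch below. The keyword names and option strings come from the signature and parameter list above; everything else is an assumption: the `ct` import alias, the placeholder database, pathway, and gene names, the helper `run_communication_impact`, and the expectation that `adata` already holds communication results computed under the same `database_name`.

```python
# Keyword arguments taken from the documented signature.
# All values marked "placeholder" are hypothetical and must be replaced.
impact_kwargs = dict(
    database_name="example_db",        # placeholder
    pathway_name="example_pathway",    # placeholder
    method="treebased_score",          # or 'partial_corr' / 'semipartial_corr'
    tree_method="rf",                  # or 'gbt'
    tree_ntrees=100,
    tree_repeat=100,
    tree_combined=False,
    ds_genes=["GeneA", "GeneB"],       # placeholder gene names
    bg_genes=100,                      # top 100 variable genes as background
    normalize=True,
)

def run_communication_impact(adata, **overrides):
    """Hypothetical wrapper: call communication_impact with the defaults above."""
    import commot as ct  # imported lazily so this sketch loads without commot
    kwargs = {**impact_kwargs, **overrides}
    # Per the documentation above, the result is a pandas DataFrame (df_impact).
    return ct.tl.communication_impact(adata, **kwargs)
```

Individual settings can be overridden per call, for example `run_communication_impact(adata, method="partial_corr", corr_method="spearman")`.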