Statistical Comparison
-
class
pygna.statistical_comparison.
StatisticalComparison
(comparison_statistic, network: networkx.classes.graph.Graph, n_proc: int = 1, diz: dict = {}, degree_bins=1) This class implements the statistical comparison between two genesets. See the documentation of each method for its return values.
-
comparison_empirical_pvalue
(genesetA: set, genesetB: set, alternative: str = 'less', max_iter: int = 100, keep: bool = False) → [<class 'int'>, <class 'float'>, <class 'float'>, <class 'int'>, <class 'int'>] Calculate the empirical p-value of the comparison between two genesets
Parameters: - genesetA – the first geneset to compare
- genesetB – the second geneset to compare
- alternative – the direction of the test used to compute the p-value from the observed statistic (default 'less')
- max_iter – the maximum number of iterations
- keep – whether geneset B should be kept as given (not resampled) when building the null distribution
Returns: a list containing observed, pvalue, null_distribution, len(mapped_genesetA) and len(mapped_genesetB)
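As an illustration of how an empirical p-value can be derived from an observed statistic and a null distribution, here is a minimal standard-library sketch. It is not pygna's implementation: the helper name `empirical_pvalue`, the `'greater'` alternative and the +1 correction are assumptions chosen for the example.

```python
def empirical_pvalue(observed, null_distribution, alternative="less"):
    """Empirical p-value with a +1 correction (an assumed convention)."""
    if alternative == "less":
        # Count null values at least as small as the observed statistic.
        extreme = sum(1 for value in null_distribution if value <= observed)
    elif alternative == "greater":
        # Count null values at least as large as the observed statistic.
        extreme = sum(1 for value in null_distribution if value >= observed)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return (extreme + 1) / (len(null_distribution) + 1)


# Toy null distribution; the observed value lies below every null value.
null = [3.0, 4.0, 5.0, 6.0]
p_less = empirical_pvalue(2.5, null, alternative="less")
p_greater = empirical_pvalue(2.5, null, alternative="greater")
```

With the +1 correction the p-value can never be exactly zero, so its smallest possible value is bounded by the number of iterations.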
-
get_comparison_null_distribution
(genesetA: list, genesetB: list, n_samples: int, keep: bool, sampling_p_a=None, sampling_p_b=None) → list Calculate the null distribution between two genesets with a single CPU
Parameters: - genesetA – the first geneset to compare
- genesetB – the second geneset to compare
- n_samples – the number of samples to be taken
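- keep – whether geneset B should be kept as given (not resampled)
- sampling_p_a – random sampling probability for geneset A
- sampling_p_b – random sampling probability for geneset B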
Returns: the random distribution calculated
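The resampling idea behind a null distribution can be sketched with the standard library alone. This is a simplified illustration, not pygna's code. It samples uniformly and ignores `sampling_p_a`, `sampling_p_b` and degree bins. The `overlap` statistic, the `universe` argument and the `seed` parameter are hypothetical, and the reading of `keep` (leave geneset B unchanged) is an assumption.

```python
import random


def overlap(geneset_a, geneset_b):
    """Hypothetical comparison statistic: number of shared genes."""
    return len(set(geneset_a) & set(geneset_b))


def null_distribution(universe, geneset_a, geneset_b, n_samples, keep,
                      statistic, seed=0):
    """Recompute the statistic on randomly drawn genesets of matching size."""
    rng = random.Random(seed)
    universe = sorted(universe)  # fixed order keeps runs reproducible
    results = []
    for _ in range(n_samples):
        random_a = rng.sample(universe, len(geneset_a))
        # Assumption: keep=True leaves geneset B as given;
        # otherwise B is resampled as well.
        if keep:
            random_b = list(geneset_b)
        else:
            random_b = rng.sample(universe, len(geneset_b))
        results.append(statistic(random_a, random_b))
    return results


genes = {f"g{i}" for i in range(20)}
null = null_distribution(genes, ["g1", "g2", "g3"], ["g2", "g3", "g4", "g5"],
                         n_samples=50, keep=True, statistic=overlap)
```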
-
get_comparison_null_distribution_mp
(genesetA: list, genesetB: list, max_iter: int = 100, keep: bool = False, sampling_p_a=None, sampling_p_b=None) → numpy.ndarray Calculate the null distribution between two genesets with multiple CPUs
Parameters: - genesetA – the first geneset to compare
- genesetB – the second geneset to compare
- max_iter – the maximum number of iterations to perform
- keep – whether geneset B should be kept as given (not resampled)
- sampling_p_a – random sampling probability for geneset A
- sampling_p_b – random sampling probability for geneset B
Returns: the array with null distribution
-
-
pygna.statistical_comparison.
comparison_shortest_path
(network: networkx.classes.graph.Graph, genesetA: set, genesetB: set, diz: dict) → float Evaluate the shortest path between two genesets
Parameters: - network – the graph representing the network
- genesetA – the first geneset
- genesetB – the second geneset
- diz – the dictionary containing the node names and indices
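To make the idea of a shortest-path comparison concrete, the sketch below averages breadth-first-search distances between every pair of nodes drawn from two sets. It is an illustration only: the adjacency-dict graph, the helper names and the choice of the mean as the aggregate are assumptions, and pygna may aggregate distances differently.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Unweighted shortest-path distances from source to reachable nodes."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def mean_shortest_path(adjacency, geneset_a, geneset_b):
    """Mean distance over all connected (a, b) pairs; the mean is an assumption."""
    total, pairs = 0, 0
    for a in geneset_a:
        distances = bfs_distances(adjacency, a)
        for b in geneset_b:
            if b in distances:  # skip pairs with no connecting path
                total += distances[b]
                pairs += 1
    if pairs == 0:
        raise ValueError("no connected pairs between the two genesets")
    return total / pairs


# Path graph A - B - C - D
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
score = mean_shortest_path(adjacency, {"A"}, {"C", "D"})
```

Here A is two steps from C and three steps from D, so the mean distance is 2.5.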
-
pygna.statistical_comparison.
calculate_sum
(n: numpy.ndarray, m: numpy.ndarray, diz: dict) → numpy.ndarray Evaluate the sum of the columns of two matrices
Parameters: - n – the first column
- m – the second column
- diz – the dictionary containing the data
-
pygna.statistical_comparison.
comparison_random_walk
(network: networkx.classes.graph.Graph, genesetA: list, genesetB: list, diz: dict = {}) → float Evaluate the random walk on two genesets
Parameters: - network – the graph representing the network
- genesetA – the first geneset list
- genesetB – the second geneset list
- diz – the dictionary containing the node names and indices