Commit ccb0025: Pass flake8
1 parent: b17600a

1 file changed: src/diffupy/diffuse.py (2 additions, 87 deletions)
@@ -28,99 +28,14 @@ def diffuse(
 ) -> Matrix:
     """Run diffusion on a network given an input and a diffusion method.
-
-    The diffusion methods provided in this package differ in:
-    (a) How they distinguish positive, negative and unlabelled examples.
-    (b) Their statistical normalisation.
-
-    Input scores can be specified in three formats. A single set of scores to smooth can be represented as
-    (1) a named numeric vector; if several such vectors that share the node names need to be smoothed,
-    they can be provided as
-    (2) a column-wise matrix; and if the unlabelled entities are not the same from one case to another,
-    (3) a named list of such score matrices can be passed to this function. The path format is kept in the
-    output.
-
-    If the path labels are not quantitative, i.e. positive (1), negative (0) and possibly unlabelled, all the scores
-    raw, gm, ml, z, mc, ber_s, ber_p can be used.
-
-    Methods (the method attribute chooses one):
-
-    - Methods without statistical normalisation:
-
-        {raw}: positive nodes introduce unitary flow {y_raw[i] = 1} into
-        the network, whereas both negative and unlabelled
-        nodes introduce null diffusion {y_raw[j] = 0}
-        [Vandin, 2011]. The scores are computed as
-
-            f_{raw} = k · y_{raw}
-
-        where k is a graph kernel, see kernels.py.
-        These scores treat negative and unlabelled nodes equivalently.
-
-        {ml}: same as raw, but negative nodes introduce a negative unit of flow
-        and are therefore not equivalent to unlabelled nodes
-        [Zoidi, 2015].
-
-        {gm}: same as ml, but the unlabelled nodes are assigned
-        a (generally non-null) bias term based on the total
-        number of positives, negatives and unlabelled nodes
-        [Mostafavi, 2008].
-
-        {ber_s}: a quantification of the relative change in the node score before
-        and after the network smoothing. The score for a particular node i can be written as
-
-            f_{ber_s}[i] = f_{raw}[i] / (y_{raw}[i] + eps)
-
-        where eps is a parameter controlling the importance of the relative change.
-
-    - Methods with statistical normalisation:
-
-        {z}: a parametric alternative in which the raw score of each node has its mean
-        subtracted and is divided by its standard deviation. A differential trait of this package:
-        the statistical moments have a closed analytical form,
-        see the main vignette; inspired by [Harchaoui, 2013].
-
-        {mc}: the score of node {i} is based on its empirical p-value, computed by permuting
-        the path {n.perm} times.
-        It is roughly the proportion of path permutations that led to a diffusion score as
-        high as or higher than the original diffusion score.
-
-        {ber_p}: as used in [Bersanelli, 2016], this score combines raw and mc, taking into
-        account both the magnitude of the {raw} scores and the effect of the network topology:
-        it is a quantification of the relative change in the node score before and after the
-        network smoothing.
-
-    Methods summary table:
-
-    | Scores | y+ | y- | yn | Normalized | Stochastic | Quantitative | Reference         |
-    |--------|----|----|----|------------|------------|--------------|-------------------|
-    Unnormalized:
-    | raw    | 1  | 0  | 0  | No         | No         | Yes          | Vandin (2010)     |
-    | ml     | 1  | -1 | 0  | No         | No         | No           | Tsuda (2010)      |
-    | gm     | 1  | -1 | k  | No         | No         | No           | Mostafavi (2008)  |
-    | ber_s  | 1  | 0  | 0  | No         | No         | Yes          | Bersanelli (2016) |
-    Normalized:
-    | ber_p  | 1  | 0  | 0* | Yes        | Yes        | Yes          | Bersanelli (2016) |
-    | mc     | 1  | 0  | 0* | Yes        | Yes        | Yes          | Bersanelli (2016) |
-    | z      | 1  | 0  | 0* | Yes        | No         | Yes          | Harchaoui (2013)  |
-
-    :param input_scores: score collection, supplied as n-dimensional array.
-        Could be 1-dimensional (List) or n-dimensional (Matrix).
-    :param method: Elected method, among those described in the table above.
-        Possible values: ["raw", "ml", "gm", "ber_s", "ber_p", "mc", "z"]
+    :param input_scores: score collection, supplied as n-dimensional array. Could be 1-dimensional (List) or n-dimensional (Matrix).
+    :param method: Elected method ["raw", "ml", "gm", "ber_s", "ber_p", "mc", "z"]
     :param graph: A network as a graph. It could be optional if a Kernel is provided
     :param kwargs: Optional arguments:
         - k: a kernel [matrix] stemming from a graph, thus sparing the graph transformation process
         - Other arguments which would differ depending on the chosen method
     :return: The diffused scores within the matrix transformation of the network, with the diffusion operation
         [k x input_vector] performed
-
     """
 
     # Sanity checks
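The unnormalized scores the deleted docstring describes (raw, ml, ber_s) can be illustrated with a small numpy sketch. This is only an illustration of the label encodings and the kernel multiplication f = k · y, not diffupy's actual implementation; the toy path graph, the regularised-Laplacian kernel and alpha = 0.5 are assumptions:

```python
import numpy as np

# Toy 4-node path graph. K = (I + alpha * L)^-1 is one common graph
# kernel; diffupy's own kernels live in kernels.py.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
K = np.linalg.inv(np.eye(4) + 0.5 * L)  # alpha = 0.5 (assumed)

labels = np.array([1, 0, -1, 0])  # 1 = positive, -1 = negative, 0 = unlabelled

# raw: positives contribute unitary flow; negatives and unlabelled
# nodes both contribute zero, so they are treated equivalently.
y_raw = (labels == 1).astype(float)
f_raw = K @ y_raw

# ml: negatives contribute a negative unit of flow, distinguishing
# them from unlabelled nodes.
y_ml = np.where(labels == 1, 1.0, np.where(labels == -1, -1.0, 0.0))
f_ml = K @ y_ml

# ber_s: relative change before/after smoothing; eps weights it.
eps = 1.0
f_ber_s = f_raw / (y_raw + eps)
```

With this kernel the positive node keeps the highest raw score, and the negative label pulls its own ml score below its raw score, which is the qualitative behaviour the docstring describes.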

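The statistically normalised scores (mc and z) admit a similar hypothetical sketch. Note that diffupy computes the z moments in closed analytical form, whereas this sketch estimates everything by Monte Carlo permutation of the input labels; the toy kernel and n_perm = 1000 are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Same toy regularised-Laplacian kernel on a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
K = np.linalg.inv(np.eye(4) + 0.5 * (np.diag(A.sum(axis=1)) - A))
y = np.array([1.0, 0.0, 0.0, 0.0])  # one positive, rest unlabelled
f = K @ y                            # observed diffusion scores

# mc: permute the input labels n_perm times; the empirical p-value per
# node is the proportion of permutations scoring as high or higher.
n_perm = 1000
perm_scores = np.stack([K @ rng.permutation(y) for _ in range(n_perm)])
p_emp = (perm_scores >= f).mean(axis=0)
f_mc = 1.0 - p_emp  # high score = rarely matched by chance

# z: subtract the permutation mean and divide by the permutation
# standard deviation (diffupy derives these moments analytically).
f_z = (f - perm_scores.mean(axis=0)) / perm_scores.std(axis=0)
```

The same kernel-times-vector core is shared by both families; only the post-hoc normalisation of f differs.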