Commit d7f08c7

sarahbeenie and sarah.mubeen authored
Datasets (#21)

* update example datasets
* fix docs
* update sample data
* fix docs

Co-authored-by: sarah.mubeen <sarah.mubeen@scai.fraunhofer.de>
1 parent 2515d91 commit d7f08c7

4 files changed

Lines changed: 83 additions & 40 deletions

docs/source/intro.rst

Lines changed: 36 additions & 19 deletions
@@ -14,6 +14,7 @@ You can submit your dataset in any of the following formats:
 Please ensure that the dataset minimally has a column 'Node' containing node IDs. You can also optionally add the
 following columns to your dataset:

+- NodeType
 - LogFC [*]_
 - p-value

@@ -42,39 +43,55 @@ details.
 | D |
 +------------+

-2. You can also choose to provide a dataset with a column 'Node' containing node IDs as well as a column 'logFC' with
-their abs(LogFC).
+2. You can also provide a dataset with a column 'Node' containing node IDs as well as a column 'NodeType', indicating
+the entity type of the node to run diffusion by entity type.
+
++------------+--------------+
+| Node | NodeType |
++============+==============+
+| A | Gene |
++------------+--------------+
+| B | Gene |
++------------+--------------+
+| C | Metabolite |
++------------+--------------+
+| D | Gene |
++------------+--------------+
+
+3. You can also choose to provide a dataset with a column 'Node' containing node IDs as well as a column 'logFC' with
+their | logFC | . You may also add a 'NodeType' column to run diffusion by entity type.

 +--------------+------------+
 | Node | LogFC |
 +==============+============+
-| Gene A | 4 |
+| A | 4 |
 +--------------+------------+
-| Gene B | -1 |
+| B | -1 |
 +--------------+------------+
-| Metabolite C | 1.5 |
+| C | 1.5 |
 +--------------+------------+
-| Gene D | 3 |
+| D | 3 |
 +--------------+------------+

-3. Finally, you can provide a dataset with a column 'Node' containing node IDs, a column 'logFC' with their abs(LogFC)
-and a column 'p-value' with adjusted p-values.
+.. | logFC | replace:: Log\ :sub:`2`\ FC
+
+4. Finally, you can provide a dataset with a column 'Node' containing node IDs, a column 'logFC' with their | logFC |
+and a column 'p-value' with adjusted p-values. You may also add a 'NodeType' column to run diffusion by entity type.

 +--------------+------------+---------+
 | Node | LogFC | p-value |
 +==============+============+=========+
-| Gene A | 4 | 0.03 |
+| A | 4 | 0.03 |
 +--------------+------------+---------+
-| Gene B | -1 | 0.05 |
+| B | -1 | 0.05 |
 +--------------+------------+---------+
-| Metabolite C | 1.5 | 0.001 |
+| C | 1.5 | 0.001 |
 +--------------+------------+---------+
-| Gene D | 3 | 0.07 |
+| D | 3 | 0.07 |
 +--------------+------------+---------+

-You can also take a look at our `sample datasets <https://github.com/multipaths/DiffuPy/tree/master/examples/datasets>`_
-folder for some examples files.
-
+See the `sample datasets <https://github.com/multipaths/DiffuPy/tree/master/examples/datasets>`_ directory for example
+files.

 Networks
 --------
@@ -119,13 +136,13 @@ Custom-network example
 ~~~~~~~~~~~~~~~~~~~~~~

 +-----------+--------------+-------------+
-| Source | Target | Relation |
+| Source | Target | Relation |
 +===========+==============+=============+
-| Gene A | Gene B | Increase |
+| A | B | Increase |
 +-----------+--------------+-------------+
-| Gene B | Metabolite C | Association |
+| B | C | Association |
 +-----------+--------------+-------------+
-| Gene A | Pathology D | Association |
+| A | D | Association |
 +-----------+--------------+-------------+

 You can also take a look at our `sample networks <https://github.com/multipaths/DiffuPy/tree/master/examples/networks>`_

examples/README.rst

Lines changed: 31 additions & 15 deletions
@@ -11,6 +11,7 @@ You can submit your dataset in any of the following formats:
 Please ensure that the dataset minimally has a column 'Node' containing node IDs. You can also optionally add the
 following columns to your dataset:

+- NodeType
 - LogFC [*]_
 - p-value

@@ -39,36 +40,51 @@ details.
 | D |
 +------------+

-2. You can also choose to provide a dataset with a column 'Node' containing node IDs as well as a column 'logFC' with
-their | logFC |.
+2. You can also provide a dataset with a column 'Node' containing node IDs as well as a column 'NodeType', indicating
+the entity type of the node to run diffusion by entity type.
+
++------------+--------------+
+| Node | NodeType |
++============+==============+
+| A | Gene |
++------------+--------------+
+| B | Gene |
++------------+--------------+
+| C | Metabolite |
++------------+--------------+
+| D | Gene |
++------------+--------------+
+
+3. You can also choose to provide a dataset with a column 'Node' containing node IDs as well as a column 'logFC' with
+their | logFC |. You may also add a 'NodeType' column to run diffusion by entity type.

 +--------------+------------+
 | Node | LogFC |
 +==============+============+
-| Gene A | 4 |
+| A | 4 |
 +--------------+------------+
-| Gene B | -1 |
+| B | -1 |
 +--------------+------------+
-| Metabolite C | 1.5 |
+| C | 1.5 |
 +--------------+------------+
-| Gene D | 3 |
+| D | 3 |
 +--------------+------------+

 .. | logFC | replace:: Log\ :sub:`2`\ FC

-3. Finally, you can provide a dataset with a column 'Node' containing node IDs, a column 'logFC' with their | logFC | and
-a column 'p-value' with adjusted p-values.
+4. Finally, you can provide a dataset with a column 'Node' containing node IDs, a column 'logFC' with their | logFC |
+and a column 'p-value' with adjusted p-values. You may also add a 'NodeType' column to run diffusion by entity type.

 +--------------+------------+---------+
 | Node | LogFC | p-value |
 +==============+============+=========+
-| Gene A | 4 | 0.03 |
+| A | 4 | 0.03 |
 +--------------+------------+---------+
-| Gene B | -1 | 0.05 |
+| B | -1 | 0.05 |
 +--------------+------------+---------+
-| Metabolite C | 1.5 | 0.001 |
+| C | 1.5 | 0.001 |
 +--------------+------------+---------+
-| Gene D | 3 | 0.07 |
+| D | 3 | 0.07 |
 +--------------+------------+---------+

 See the `sample datasets <https://github.com/multipaths/DiffuPy/tree/master/examples/datasets>`_ directory for example
@@ -118,11 +134,11 @@ Custom-network example
 +-----------+--------------+-------------+
 | Source | Target | Relation |
 +===========+==============+=============+
-| Gene A | Gene B | Increase |
+| A | B | Increase |
 +-----------+--------------+-------------+
-| Gene B | Metabolite C | Association |
+| B | C | Association |
 +-----------+--------------+-------------+
-| Gene A | Pathology D | Association |
+| A | D | Association |
 +-----------+--------------+-------------+

 See the `sample networks <https://github.com/multipaths/DiffuPy/tree/master/examples/networks>`_ directory for some

examples/datasets/node_type.csv

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Node,NodeType
+A,Gene
+B,Gene
+C,Metabolite
+D,Gene
+E,Metabolite
+F,Gene
+G,Pathology
Lines changed: 8 additions & 6 deletions
@@ -1,6 +1,8 @@
-NodeType,Node
-Gene,A
-Gene,B
-Metabolite,C
-Gene,D
-Gene,E
+Node,NodeType
+A,Gene
+B,Gene
+C,Metabolite
+D,Gene
+E,Metabolite
+F,Gene
+G,Pathology
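
Note: a minimal sketch, not DiffuPy's own loader, of how a file in the `Node,NodeType` layout from this commit could be read and split by entity type. The function name `group_nodes_by_type` and the `"Unknown"` fallback label are hypothetical and used only for illustration. The only rule taken from the docs is that a 'Node' column is required and 'NodeType' is optional.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the sample data added in this commit.
SAMPLE = """Node,NodeType
A,Gene
B,Gene
C,Metabolite
D,Gene
E,Metabolite
F,Gene
G,Pathology
"""


def group_nodes_by_type(handle):
    """Group node IDs by their NodeType column (hypothetical helper)."""
    reader = csv.DictReader(handle)
    # The docs require at least a 'Node' column.
    if reader.fieldnames is None or "Node" not in reader.fieldnames:
        raise ValueError("dataset must contain a 'Node' column")
    groups = defaultdict(list)
    for row in reader:
        # 'NodeType' is optional; fall back to an illustrative label.
        groups[row.get("NodeType") or "Unknown"].append(row["Node"])
    return dict(groups)


groups = group_nodes_by_type(io.StringIO(SAMPLE))
print(groups)
```

Because the columns are looked up by header name, the same sketch would also read the previous `NodeType,Node` column order.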
