README.Rmd
48 lines changed: 0 additions & 48 deletions
@@ -56,54 +56,6 @@ Once you have entered the data and metadata in a template you can use the `read_

Details about the data guide format and how to write one as well as about how to design a template can be found in the package vignettes.

-### Data guide
-
-The *data guide* is a human-readable and editable file in [YAML](https://yaml.org/spec/1.2.2/) format that specifies the structure and location of the data in the Excel file. It contains a list of data types, each of which is defined by a name and a set of parameters. As the name suggests, the **excelDataGuide** package uses the *data guide* to extract all indexed data from the Excel file and convert it into proper R objects. Part of the *data guide* from the example in the package, *i.e.*, `system.file("extdata", "example_guide.yml", package = "excelDataGuide")`, is shown below:
-
-```yaml
-guide.version: '1.0'
-template.name: competition
-template.min.version: '9.3'
-template.max.version: ~
-plate.format: 96
-locations:
-  - sheet: description
-    type: cells
-    varname: .template
-    translate: false
-    variables:
-      - name: version
-        cell: B2
-  - sheet: description
-    type: keyvalue
-    translate: true
-    atomicclass:
-      - character
-      - character
-      - character
-      - character
-      - character
-      - date
-      - character
-      - numeric
-      - character
-      - numeric
-      - character
-      - numeric
-      - character
-      - character
-    varname: metadata
-    ranges:
-      - A10:B21
-      - A24:B25
-```
-
-We provide a CUE schema for the data guide, allowing you to check the validity of
-guides that you write. The schema is available in the package as
-`system.file("extdata", "excelguide_schema.cue", package = "excelDataGuide")`. To
-check a guide against the schema, you can use the [CUE](https://cuelang.org/docs/) validator.
-More details can be found in the vignette (to be done, see below).
-
## Future work
- Complete the vignette ([issue](https://github.com/SystemsBioinformatics/excelDataGuide/issues/2))
We urge you to use the `NA()` function to represent missing values in your templates, in particular in calculations. The advantage of `NA()` is that calculations in the sheets automatically pass it on to subsequent calculations, avoiding errors and producing sensible results. A disadvantage is that detecting and handling missing values in formulas requires special care. One particularly odd problem is that you cannot generically detect missing values by testing for the string "#N/A" in a cell, even though this "solution" is often presented on internet forums and even in official documentation. The reason is that Excel uses different string representations for missing values under different language settings. Instead, use the `ISNA()` function consistently throughout your entire template to detect `NA()` values.
+
+### Labeled values (bad values)
+
+You may have obtained raw measurements that you do not want to include in your analysis. You should not delete these measurements from the spreadsheet, because labelling a value as a "bad" measurement is, to some degree, a subjective action with which another user or your future self may disagree. Instead, you can label them as "bad". An easy way to do this is to add a star before or after the value, *e.g.* `1000*` or `*1000`. You should also add a note explaining why the value is bad, in a table with columns for cell addresses and remarks at a logical position in the same sheet. You can detect such "starred" values in Excel by using, for example, the `ISERROR()`, `ISNUMBER()` or `ISNONTEXT()` functions in a clause in calculations with these values, and set a calculated cell to `NA()` based on the result. For example, `=IF(NOT(ISNUMBER(A1)), NA(), A1)` will set the cell with this formula to `NA()` if the value in `A1` is not a number. An additional visual aid is to give "starred" cells a different font color or cell background using conditional formatting.
+
+In the excelDataGuide package we provide the functions `has_star()` and `star_to_number()` to detect "starred" values and convert them back to numbers, while labelling them as "bad" in a separate column in the template output.
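
The idea behind "starred" values can be sketched in base R as follows. This is only an illustration of the approach described above, not the package's implementation: the helper names `has_star_sketch()` and `star_to_number_sketch()` are hypothetical, and the actual signatures and behavior of `has_star()` and `star_to_number()` may differ.

```r
# Hypothetical helpers (assumptions for illustration only); they are named with
# a `_sketch` suffix to keep them distinct from the package's own functions.

# TRUE where a value carries a leading or trailing star, e.g. "*1000" or "1000*".
has_star_sketch <- function(x) {
  x <- trimws(as.character(x))
  !is.na(x) & grepl("^\\*|\\*$", x)
}

# Strip a leading or trailing star and convert to numeric.
# Anything that is still not a number after stripping becomes NA.
star_to_number_sketch <- function(x) {
  x <- trimws(as.character(x))
  cleaned <- sub("^\\*", "", sub("\\*$", "", x))
  suppressWarnings(as.numeric(cleaned))
}

# Raw cell contents as they might be read from a sheet.
raw <- c("1000", "1000*", "*250.5", NA, "n.d.")

# Keep the numeric value and record the "bad" label in a separate column.
result <- data.frame(
  value = star_to_number_sketch(raw),
  bad   = has_star_sketch(raw)
)
print(result)
```

Keeping the label in its own column means the original measurement is preserved, and the decision to exclude it can be revisited later.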