Updated data schema documentation #1859
| val orders: List<_DataFrameType11> | ||
| val user: kotlin.String | ||
| } | ||
| data class Customer( |
Please regenerate the classes; the result will have a nested structure and the domain name `Orders` instead of `Customer1`.
|
Btw, as for closing #309:
Can be used on the type directly:
Keep in mind that compiler plugin operations change the initial type.
Workaround: use `cast` in a `DataRow` context:
|
@koperagen should we add this trick to the documentation? |
| <primary> | ||
| <title>First steps</title> | ||
| <a href="SetupKotlinNotebook.md"/> | ||
| <a href="SetupGradle.md"/> |
Hmm, maybe we should keep the Kotlin Notebook setup page for a while and just move it further down.
| ```kotlin | ||
| val df = DataFrame.readCsv("example.csv").convertTo<Person>() | ||
| val df = DataFrame.readCsv("example.csv").cast<Person>() |
This is an interesting example... CSV is inherently flat, yet we have a nested type here XD. This can only occur if there's JSON inside the CSV, which is not that common.
Maybe specify that it's not "just a CSV file", but that it contains a JSON column
| with a separate class representing the schema of each column group or nested `DataFrame`. | ||
| For example, consider a simple hierarchical DataFrame from | ||
| For example, consider a simple hierarchical dataframe from |
Yes, here we should also mention that this is a CSV with a JSON column, which is why it has a hierarchical structure.
| ## `@DataSchema` annotation | ||
| `@DataSchema` is a Kotlin annotation that marks a data class or interface as a data schema. | ||
| Compiler plugin generates [extension properties](extensionPropertiesApi.md) for the `DataFrame` |
The compiler plugin (and link to it)
| `@DataSchema` is a Kotlin annotation that marks a data class or interface as a data schema. | ||
| Compiler plugin generates [extension properties](extensionPropertiesApi.md) for the `DataFrame` | ||
| (or `DataRow`, `ColumnGroup`, etc.) |
| > we highly recommend using it only on interfaces and data classes specially made | ||
| > for representing the data schema of a `DataFrame`. | ||
| > | ||
| > Use only trivial properties, avoiding computed, `lateinit` or delegated properties. |
Yes please, somewhere in the docs about data schemas, as an alternative to having computed properties right in the classes.
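For reference, a minimal sketch of what "trivial properties only, derived values elsewhere" could look like. This is an illustration under assumptions, not the snippet from the earlier comment: the `Person` schema, the `fullName` property and the `PersonRecord` class are hypothetical, and `DataSchema` is declared locally as a stand-in so the code compiles with only the standard library (in a real project it would be imported from `org.jetbrains.kotlinx.dataframe.annotations`).

```kotlin
// Local stand-in so the sketch compiles with only the standard library.
// Assumption: in a real project this is replaced by
// import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
@Target(AnnotationTarget.CLASS)
annotation class DataSchema

// Schema interface (hypothetical): only trivial properties, one per column.
@DataSchema
interface Person {
    val firstName: String
    val lastName: String
}

// The derived value lives outside the schema as an extension property,
// so the schema itself carries no computed, lateinit or delegated members.
val Person.fullName: String
    get() = "$firstName $lastName"

// Hypothetical implementation, used only to exercise the extension property.
data class PersonRecord(
    override val firstName: String,
    override val lastName: String
) : Person

fun main() {
    val person: Person = PersonRecord("Ada", "Lovelace")
    println(person.fullName) // prints "Ada Lovelace"
}
```

With the library itself, the same idea would presumably be written as an extension on `DataRow<Person>` rather than on `Person`; that form is left out here because it depends on the generated accessors.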
Closes #309.
We now explicitly recommend against using “complex” classes for data schemas in the documentation.
I also think it's better to use interfaces for describing data schemas, especially for beginners, so I put interfaces first and updated the example to use them.