
Commit c123e69

Merge pull request #312 from lonvia/improve-docs
Updates to the manual
2 parents 7c45537 + 283e41c commit c123e69

7 files changed

Lines changed: 352 additions & 19 deletions

docs/index.md

Lines changed: 12 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -25,27 +25,31 @@ the following additional dependencies need to be available:
2525
* [libosmium](https://github.com/osmcode/libosmium) >= 2.16.0
2626
* [protozero](https://github.com/mapbox/protozero)
2727
* [cmake](https://cmake.org/)
28-
* [Pybind11](https://github.com/pybind/pybind11) >= 2.2
2928
* [expat](https://libexpat.github.io/)
3029
* [libz](https://www.zlib.net/)
3130
* [libbz2](https://www.sourceware.org/bzip2/)
3231
* [Boost](https://www.boost.org/) variant and iterator >= 1.41
3332
* [Python Requests](https://docs.python-requests.org/en/master/)
34-
* Python setuptools
3533
* a recent C++ compiler (Clang 3.4+, GCC 4.8+)
3634

35+
The following additional dependencies are automatically installed as part
36+
of the build process:
37+
38+
* [scikit-build-core](https://scikit-build-core.readthedocs.io/en/latest/)
39+
* [Pybind11](https://github.com/pybind/pybind11)
40+
3741
On Debian/Ubuntu-like systems, the following command installs all required
3842
packages:
3943

4044
sudo apt-get install python3-dev build-essential cmake libboost-dev \
4145
libexpat1-dev zlib1g-dev libbz2-dev
4246

43-
libosmium, protozero and pybind11 are shipped with the source wheel. When
44-
building from source, you need to download the source code and put it
45-
in the subdirectory 'contrib'. Alternatively, if you want to put the sources
46-
somewhere else, point pyosmium to the source code location by setting the
47-
CMake variables `LIBOSMIUM_PREFIX`, `PROTOZERO_PREFIX` and
48-
`PYBIND11_PREFIX` respectively.
47+
Compatible versions of libosmium and protozero are shipped with the source
48+
wheel. When building from source, you need to download the source code of these
49+
two libraries and put it in the subdirectory 'contrib'. Alternatively,
50+
if you already have the sources somewhere else,
51+
point pyosmium to the source code location by setting the
52+
CMake variables `Libosmium_ROOT` and `Protozero_ROOT`.
4953

5054
To compile and install the bindings, run
5155

docs/user_manual/01-First-Steps.md

Lines changed: 6 additions & 2 deletions
@@ -140,7 +140,7 @@ out about the tags. It is also always useful to consult
140140
different keys and values in actual use.
141141

142142
Tags are common to all OSM objects. After that there are three kinds of
143-
objects in OSM: nodes, ways and relations.
143+
object types in OSM: nodes, ways and relations.
144144

145145
### Nodes
146146

@@ -187,7 +187,7 @@ backward references when talking about the dependencies between objects:
187187

188188
* A __forward reference__ means that an object is referenced by another.
189189
Nodes appear in ways. Ways appear in relations. And a node may even have
190-
an indirect forward reference to a relation through a way it appear in.
190+
an indirect forward reference to a relation through a way it appears in.
191191
Forward references are important when tracking changes. When the location
192192
of a node changes, then all its forward references have to be reevaluated.
193193

@@ -198,6 +198,10 @@ backward references when talking about the dependencies between objects:
198198
to follow the backward references for ways and relations until we reach
199199
the nodes.
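
The direct and indirect forward references described above can be pictured as a reverse index. The following is a minimal sketch on hypothetical toy data (plain dictionaries, not pyosmium objects); the names `ways`, `relations` and `forward_references` are assumptions made for illustration only.

```python
from collections import defaultdict

# Toy data (hypothetical): way ID -> node IDs, relation ID -> member way IDs.
ways = {10: [1, 2, 3], 11: [3, 4]}
relations = {100: [10]}


def forward_references(ways, relations):
    """Map every node ID to the ways and relations that depend on it."""
    # Direct forward references: node -> ways it appears in.
    node_to_ways = defaultdict(set)
    for way_id, node_ids in ways.items():
        for node_id in node_ids:
            node_to_ways[node_id].add(way_id)

    # Direct forward references: way -> relations it appears in.
    way_to_relations = defaultdict(set)
    for rel_id, way_ids in relations.items():
        for way_id in way_ids:
            way_to_relations[way_id].add(rel_id)

    # Indirect forward references: node -> way -> relation.
    node_to_relations = defaultdict(set)
    for node_id, way_ids in node_to_ways.items():
        for way_id in way_ids:
            node_to_relations[node_id] |= way_to_relations.get(way_id, set())

    return node_to_ways, node_to_relations


node_to_ways, node_to_relations = forward_references(ways, relations)
print(sorted(node_to_ways[3]))       # ways to reevaluate when node 3 moves
print(sorted(node_to_relations[3]))  # relations reached indirectly through a way
```

When the location of node 3 changes in this toy dataset, both ways and the relation containing way 10 would need to be reevaluated.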
200200

201+
Closely related to backward references is the concept of __reference
202+
completeness__. A dataset or file is considered reference complete when
203+
all backward references can be resolved.
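
To make the definition concrete, here is a small sketch that checks a toy dataset for reference completeness. It works on plain Python sets and dictionaries rather than on pyosmium objects, and the function names `missing_references` and `is_reference_complete` are hypothetical helpers, not part of the library.

```python
def missing_references(nodes, ways, relations):
    """Return the (type, ID) pairs that are referenced but not present.

    nodes:     set of node IDs
    ways:      dict of way ID -> list of node IDs
    relations: dict of relation ID -> list of (type, ID) members,
               where type is 'n', 'w' or 'r'
    """
    present = {'n': set(nodes), 'w': set(ways), 'r': set(relations)}
    missing = set()
    # Backward references of ways point to nodes.
    for node_ids in ways.values():
        missing.update(('n', n) for n in node_ids if n not in present['n'])
    # Backward references of relations point to any object type.
    for members in relations.values():
        missing.update(m for m in members if m[1] not in present[m[0]])
    return missing


def is_reference_complete(nodes, ways, relations):
    """A dataset is reference complete when nothing is missing."""
    return not missing_references(nodes, ways, relations)


nodes = {1, 2, 3}
ways = {10: [1, 2, 3], 11: [3, 4]}
relations = {100: [('w', 10), ('w', 12)]}

print(sorted(missing_references(nodes, ways, relations)))
print(is_reference_complete(nodes, ways, relations))
```

In this example node 4 and way 12 are referenced but absent, so the dataset is not reference complete.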
204+
201205
## Order in OSM files
202206

203207
OSM files usually follow a sorting convention to make life easier for

docs/user_manual/02-Extracting-Object-Data.md

Lines changed: 20 additions & 3 deletions
@@ -14,7 +14,8 @@ Finally, there is a type for changesets, which contains information about
1414
edits in the OSM database. It can only appear in special changeset files
1515
and is explained in more detail [below](#changeset).
1616

17-
The FileProcessor may return any of these objects, when iterating over a file.
17+
When iterating over a file, the FileProcessor may return any of these
18+
objects.
1819
Therefore, a script will usually first need to determine the type of object
1920
received. There are a couple of ways to do this.
2021

@@ -83,7 +84,7 @@ You can simply test for this object type:
8384
## Reading object tags
8485

8586
Every object has a list of properties, the tags. They can be accessed through
86-
the `tags` property, which provides a simple dictionary-like view of the tags.
87+
the `tags` property. It provides a simple dictionary-like view of the tags.
8788
You can use the bracket notation to access a specific tag or use the more
8889
explicit `get()` function. Just like for Python dictionaries, an access by
8990
bracket raises a `KeyError` when the key you are looking for does not exist,
@@ -140,7 +141,23 @@ list into a Python dictionary:
140141
## Other common meta information
141142

142143
Next to the tags, every OSM object also carries some meta information
143-
describing its ID, version and information regarding the editor.
144+
that can all be accessed through read-only properties.
145+
146+
The most important meta information is the object's ID in the `id` property.
147+
This is the ID used when objects reference each other.
148+
149+
The other meta fields contain information about when and by whom the object was edited.
150+
The following table gives a quick overview of these fields:
151+
152+
| Property | Description |
153+
|-----------|--------------------------|
154+
| version | Version of the object. A newly created object starts with version 1. |
155+
| deleted | A boolean property stating if the object should be used or ignored. Only relevant for [change](08-Working-With-Change-Files.md) and [history](09-Working-With-History-Files.md) files. |
156+
| changeset | The ID of the change set this object was created with. A change set contains a set of edits that have been uploaded by an editor in a single session. |
157+
| timestamp | UTC time at which the object was created, or more precisely, added to the database. |
158+
| uid | The ID of the user who created this version of the object. User IDs are unique and permanent. |
159+
| user | The name of the user who created this version of the object. This is the name the user had when the object was created. User names may be changed over time. The same name in different objects doesn't necessarily reference the same user. |
160+
144161

145162
## Properties of OSM object types
146163

docs/user_manual/06-Writing-Data.md

Lines changed: 57 additions & 3 deletions
@@ -18,6 +18,18 @@ pyosmium will refuse to overwrite any existing files. Either make sure to
1818
delete the files before instantiating a writer or use the parameter
1919
`overwrite=True`.
2020

21+
All writers are [context managers](https://docs.python.org/3/reference/datamodel.html#context-managers). To ensure that the file is properly closed in the
22+
end, the recommended way to use them is in a with statement:
23+
24+
!!! example
25+
```python
26+
with osmium.SimpleWriter('my_extra_data.osm.pbf') as writer:
27+
    pass  # add objects to the writer here
28+
```
29+
30+
When a writer is not used inside a with block, don't forget to call the `close()`
31+
function explicitly to close the writer.
32+
2133
Once a writer is instantiated, one of the `add*` functions can be used to
2234
add an OSM object to the file. You can either use one of the
2335
`add_node/way/relation` functions to force writing a specific type of
@@ -27,9 +39,6 @@ they are given to the writer object. It is your responsibility as a user to
2739
make sure that the order is correct with respect to the
2840
[conventions for object order][order-in-osm-files].
2941

30-
After writing all data the writer needs to be closed using the `close()`
31-
function. It is usually easier to use a writer as a context manager.
32-
3342
Here is a complete example for a script that converts a file from OPL format
3443
to PBF format:
3544

@@ -129,3 +138,48 @@ pyosmium implements three different writer classes: the basic
129138
the two reference-completing writers
130139
[ForwardReferenceWriter][osmium.ForwardReferenceWriter] and
131140
[BackReferenceWriter][osmium.BackReferenceWriter].
141+
142+
### Writing specific objects only
143+
144+
The [SimpleWriter][osmium.SimpleWriter] creates an OSM data file by directly
145+
writing out any OSM object that it receives in the chosen format.
146+
147+
148+
### Writing reference-complete files
149+
150+
The [BackReferenceWriter][osmium.BackReferenceWriter] will make sure that the
151+
file that is written out is reference-complete, meaning all objects that are
152+
directly referenced by the object written are added to the output file as well.
153+
This is needed when you want to make sure that geometries can be recreated
154+
from the objects in the file.
155+
156+
Creating a file with backward references is a two-stage process: while the
157+
writer is open, it writes all objects received through one of the `add_*()`
158+
functions into a temporary file and keeps a record of which objects are needed
159+
to make the file reference-complete. Once the writer is closed, it collects the
160+
missing objects from a given reference file, merges them with the data from
161+
the temporary file and writes out the final result.
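
The two-stage process can be illustrated with a toy model. The class below is a hypothetical stand-in written for this illustration; it is not the pyosmium `BackReferenceWriter` and does not reproduce its interface. An in-memory dictionary plays the role of the temporary file and another dictionary plays the role of the reference file.

```python
class ToyBackReferenceWriter:
    """Hypothetical stand-in that mimics the two stages on toy data."""

    def __init__(self, reference_nodes):
        self.reference_nodes = reference_nodes  # node ID -> (lon, lat)
        self.buffered_ways = {}   # plays the role of the temporary file
        self.needed_nodes = set()  # IDs required for reference completeness
        self.output = None

    def add_way(self, way_id, node_ids):
        # Stage 1: buffer the object and record what it references.
        self.buffered_ways[way_id] = list(node_ids)
        self.needed_nodes.update(node_ids)

    def close(self):
        # Stage 2: collect the missing objects from the reference data
        # and merge them with the buffered objects.
        nodes = {n: self.reference_nodes[n]
                 for n in sorted(self.needed_nodes)
                 if n in self.reference_nodes}
        self.output = {'nodes': nodes, 'ways': dict(self.buffered_ways)}
        return self.output


reference_nodes = {1: (7.74, 48.58), 2: (7.75, 48.58),
                   3: (7.76, 48.59), 4: (7.80, 48.60)}

writer = ToyBackReferenceWriter(reference_nodes)
writer.add_way(10, [1, 2, 3])
result = writer.close()
print(sorted(result['nodes']))  # only the nodes that way 10 needs
```

Node 4 exists in the reference data but is not referenced by any written way, so it does not end up in the output.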
162+
163+
### Writing files with forward references
164+
165+
The [ForwardReferenceWriter][osmium.ForwardReferenceWriter] completes the
166+
written objects with forward references. This is particularly useful when
167+
creating geographic extracts of any kind: one selects the nodes of interest
168+
in a particular area and then lets the ForwardReferenceWriter complete the
169+
ways and relations referring to the nodes.
170+
171+
Files written by the ForwardReferenceWriter are not necessarily
172+
reference-complete. That is easy to see when considering the example of the
173+
geographic extract: there may be ways in the area that cross the boundary
174+
of the area chosen but only the nodes within the area are written out. This
175+
might be useful in many situations as the way would simply seem to be cut
176+
on the area of interest. However, it has the disadvantage that some objects
177+
will get invalid geometries, especially when they represent areas.
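
The boundary effect can be reproduced with a few lines of plain Python. This is only a sketch on hypothetical toy data; `forward_complete` is an illustrative helper and says nothing about how the ForwardReferenceWriter is implemented.

```python
def forward_complete(nodes, ways, bbox):
    """Select nodes inside bbox and add every way that refers to one of them.

    nodes: dict of node ID -> (lon, lat)
    ways:  dict of way ID -> list of node IDs
    bbox:  (min_lon, min_lat, max_lon, max_lat)
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    selected = {nid for nid, (lon, lat) in nodes.items()
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat}

    # Forward completion: keep each way that references a selected node.
    completed_ways = {wid: refs for wid, refs in ways.items()
                      if selected.intersection(refs)}

    # Nodes referenced by the completed ways but not written out.
    dangling = {}
    for wid, refs in completed_ways.items():
        outside = [n for n in refs if n not in selected]
        if outside:
            dangling[wid] = outside

    return selected, completed_ways, dangling


nodes = {1: (0.1, 0.1), 2: (0.5, 0.5), 3: (1.5, 0.5), 4: (2.0, 2.0)}
ways = {10: [1, 2], 11: [2, 3], 12: [3, 4]}

selected, completed_ways, dangling = forward_complete(nodes, ways, (0, 0, 1, 1))
print(sorted(selected))
print(sorted(completed_ways))
print(dangling)  # way 11 crosses the boundary and refers to a node outside
```

Way 11 is part of the result, yet one of its nodes lies outside the selected area, which is why the result is not reference-complete.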
178+
179+
The other thing to consider during forward completion are indirect references.
180+
When completing relations indirectly referenced through ways or other relations,
181+
the resulting file can become big very quickly. For example, a seemingly
182+
small extract of the city of Strasbourg can suddenly contain not only the
183+
relations for France and Germany but also electoral boundaries and entire
184+
timezones. For that reason, when forward-completing relations, it is not
185+
recommended to use backward completion.

docs/user_manual/07-Input-Formats-And-Other-Sources.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ The special file name `-` can be used to read from standard input or
5555
write to standard output.
5656

5757
When reading data, use a `File` object to specify the file format. With
58-
the SimpleReader, you need to use the parameter `filetype`.
58+
the SimpleWriter, you need to use the parameter `filetype`.
5959

6060
!!! example
6161
This code snippet dumps all ids of your input file to the console.
