> The CF metadata conventions are designed to promote the processing and sharing of files created with the NetCDF API. The conventions define metadata that provide a definitive description of what the data in each variable represents, and the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities. The CF convention includes a standard name table, which defines strings that identify physical quantities.
CF metadata conventions set common expectations for metadata names and locations across datasets. In this tutorial, we will use tools such as [cf_xarray]() that leverage CF conventions to add programmatic handling of CF metadata to Xarray objects, meaning that users can spend less time wrangling metadata.
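As a toy illustration of the idea behind cf_xarray (not its actual API), a shared `standard_name` vocabulary lets tooling locate a variable by what it *means* rather than by whatever it happens to be called in a given file. The variable names and attributes below are invented for the example:

```python
# Toy sketch: look up variables by CF standard_name rather than by their
# dataset-specific names. Variable names and attributes here are invented.
variables = {
    "vx": {"standard_name": "land_ice_surface_x_velocity", "units": "m/yr"},
    "vy": {"standard_name": "land_ice_surface_y_velocity", "units": "m/yr"},
    "t": {"standard_name": "time"},
}

def find_by_standard_name(variables, standard_name):
    """Return the names of variables whose CF standard_name matches."""
    return [
        name
        for name, attrs in variables.items()
        if attrs.get("standard_name") == standard_name
    ]

print(find_by_standard_name(variables, "land_ice_surface_x_velocity"))  # ['vx']
```

cf_xarray applies this same idea directly to Xarray objects, so you can write `ds.cf[...]`-style lookups instead of hard-coding per-dataset names.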
SpatioTemporal Asset Catalog (STAC)
STAC is a metadata specification for geospatial data that allows the data to be more easily "worked with, indexed, and discovered" [$\tiny \nearrow$](https://stacspec.org/en). It does this by setting a common format for how metadata will be structured. This functions like setting a common expectation that all users of the data can rely on so that they know where certain information will be located and how it will be stored.
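To make the "common format" idea concrete, here is a heavily stripped-down sketch of a STAC Item as a Python dict. The field names (`id`, `geometry`, `properties`, `assets`) follow the STAC spec, but the values are invented and many required fields (`stac_version`, `links`, `bbox`, ...) are omitted:

```python
# A heavily simplified STAC Item; values are invented for illustration and
# several fields required by the STAC spec are omitted.
item = {
    "type": "Feature",
    "id": "example-scene-20220101",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-108.3, 39.0], [-107.1, 39.0], [-107.1, 39.8],
                         [-108.3, 39.8], [-108.3, 39.0]]],
    },
    "properties": {"datetime": "2022-01-01T00:00:00Z"},
    "assets": {
        "data": {"href": "s3://example-bucket/scene.tif", "type": "image/tiff"}
    },
}

# Because every Item shares this layout, tooling always knows where to look:
print(item["properties"]["datetime"])  # 2022-01-01T00:00:00Z
```

Search tools and clients can index and filter millions of such Items precisely because the acquisition time, footprint, and asset locations always live in the same places.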
| ITS_LIVE |[ITS_LIVE project, NASA JPL](https://its-live.jpl.nasa.gov/)| Zarr | AWS S3|
ITS_LIVE is a dataset of ice velocity observations derived from applying a feature tracking algorithm to pairs of satellite imagery. Ice velocity refers to the downslope movement of glaciers and ice sheets {cite}`Gardner_Scambos_2022`. Because glaciers and ice sheets are dynamic elements of our climate system, they lose or gain mass in response to changes in climate conditions such as warmer temperatures or increased snowfall. Measuring variability in the speed of ice flow can therefore help scientists better understand trends in glacier dynamics and interactions between glaciers and climate.
Part of what is so exciting about ITS_LIVE is that it combines image pairs from a number of satellites, including imagery from optical (Landsat 4, 5, 7, 8, and 9, and Sentinel-2) and synthetic aperture radar (Sentinel-1) sensors. For this reason, ITS_LIVE time series data can be quite large. Another exciting aspect of the ITS_LIVE dataset is that the image pair time series data is made available as Zarr data cubes stored in cloud object storage on Amazon Web Services (AWS), meaning that users don't need to download massive files to start working with the data!
ITS_LIVE produces a number of data products in addition to the image pair time series that we use in this tutorial, and provides different options to access the data. Check them out [here](https://its-live.jpl.nasa.gov/#access).
**Documentation & References**:
Be sure to also check out the ITS_LIVE image pair velocities [documentation](http://its-live-data.jpl.nasa.gov.s3.amazonaws.com/documentation/ITS_LIVE-Landsat-Scene-Pair-Velocities-v01.pdf) and papers on the ITS_LIVE processing methodology:
- [Autonomous Repeat Image Feature Tracking (autoRIFT) and its application for tracking ice displacement](https://www.mdpi.com/2072-4292/13/4/749). {cite}`lei_2021_AutonomousRepeatImage`
- [Processing methodology for the ITS_LIVE Sentinel-1 ice velocity products](https://doi.org/10.5194/essd-14-5111-2022). {cite}`Lei_2022_Processing`
**Further reading on ice velocities**:
- [NASA/USGS Provide Global View of Speed of Ice](https://www.jpl.nasa.gov/news/nasausgs-provide-global-view-of-speed-of-ice/)
| Sentinel-1 RTC |[Alaska Satellite Facility](https://asf.alaska.edu/)| COG (locally as GeoTIFF) | Local |
We use Sentinel-1 RTC imagery processed by Alaska Satellite Facility's Hybrid Pluggable Processing Pipeline (**HyP3**) {cite}`hogenson_2024_10903242`. This is a processing platform that allows users to perform the processing steps necessary to produce analysis-ready SAR data through ASF.
From the [ASF HyP3 Documentation](https://hyp3-docs.asf.alaska.edu/):
HyP3 is a service for processing Synthetic Aperture Radar (SAR) imagery that addresses many common issues for users of SAR data:
HyP3 solves these problems by providing a free service where people can request SAR processing on-demand. These processing requests are picked up by automated systems, which handle the complexity of SAR processing on behalf of the user. HyP3 doesn't require users to have a lot of knowledge of SAR processing before getting started; users only need to submit the input data and set a few optional parameters if desired. With HyP3, analysis-ready products are just a few clicks away.
The data in this tutorial was processed using HyP3 {cite}`andrew_johnston_2022_6629125` and then published via Zenodo [here](https://zenodo.org/records/7236413#.Y1rNi37MJ-0). For more on how to use HyP3 for your own data processing needs, check out their [tutorials page](https://hyp3-docs.asf.alaska.edu/tutorials/).
:::
:::{tab-item} Microsoft Planetary Computer
Further reading on SAR data and Sentinel-1:
- [ASF Introduction to SAR](https://hyp3-docs.asf.alaska.edu/guides/introduction_to_sar/)
- [NASA Earth Observation Data Basics - SAR](https://www.earthdata.nasa.gov/learn/earth-observation-data-basics/sar#toc-resources)
- [University of Alaska Fairbanks - Microwave Remote Sensing](https://radar.community.uaf.edu/)
- Mathematical tutorial on SAR {cite}`cheney_SAR_2001`, publicly available via NASA [EarthData](https://www.earthdata.nasa.gov/s3fs-public/2024-06/sar%20mathematical%20tutorial.pdf)
## *Vector data*
The Randolph Glacier Inventory (RGI) is a community-driven public dataset that provides outlines and auxiliary information such as the area, length, and aspect of glaciers across the world {cite}`RGI_Consortium_2023`. RGI is a subset of the Global Land Ice Measurements from Space ([GLIMS](https://www.glims.org/)) initiative, and RGI data is hosted by the National Snow and Ice Data Center ([NSIDC](https://nsidc.org/data/nsidc-0770/versions/7)). Read more about the RGI project [here](http://www.glims.org/rgi_user_guide/01_introduction.html).
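As a small aside on the area attribute: the planar area enclosed by a glacier outline can in principle be computed from its vertices with the shoelace formula. This is a toy sketch over plain (x, y) pairs in projected coordinates; RGI's published areas come from the inventory itself, not from this calculation:

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon from its (x, y) vertices.
    Assumes projected (planar) coordinates; toy illustration only."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A 4 km x 3 km rectangular "outline" (coordinates in metres) -> 12 km^2:
print(polygon_area([(0, 0), (4000, 0), (4000, 3000), (0, 3000)]) / 1e6)  # 12.0
```

In practice, libraries like GeoPandas and Shapely expose this as an `area` property on polygon geometries.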
These tutorials were initially developed while Emma Marshall interned with the Summer Internships in Parallel Computational Sciences ([SIParCS](https://www.cisl.ucar.edu/outreach/internships)) program at the National Center for Atmospheric Research ([NCAR](https://ncar.ucar.edu/)). Jessica Scheick, Scott Henderson, and Deepak Cherian were internship supervisors for this project. The internship was also supported by a NASA Open Source Tools, Frameworks, and Libraries program (Award 80NSSC22K0345), with a specific focus on developing educational resources for working with cloud-hosted data using Xarray. Tutorial development continued after the conclusion of the SIParCS internship when Emma Marshall returned to the University of Utah as a Ph.D. student, where she was supported by a FINESST Fellowship Grant (80NSSC22K1536).
## Contributing
If you'd like to contribute to this book, please start a discussion or raise an issue in the GitHub [repository](https://github.com/e-marshall/cloud-open-source-geospatial-datacube-workflows).
## Citation
If you use this material, please include the following citation:
This book is currently under review at the [Journal of Open Source Education](https://jose.theoj.org/); please check back later for a citation.
## Acknowledgements
This book is the product of many discussions and developments within the open-source community. All of the workflows demonstrated throughout these tutorials are made possible by open-source developers and maintainers. Below is a full list of the open-source libraries used in this book:
This book was made using Jupyter Book {cite}`executable_books_community_2020_4539666`, which uses a number of tools developed by the [Executable Books](https://executablebooks.org/) project.
# Appendix
While developing this book, we encountered examples that didn't fit the overall scope of the tutorials but may still be useful to others; we've included them here.
## 1. Troubleshooting visualizations with different geometry types[$\tiny \nearrow$](nbs/1_handle_mult_geom_types.ipynb)
*From the ITS_LIVE [tutorial](../itslive/itslive_intro.md).*
In the first tutorial, while making an [interactive visualization of vector dataframes](../itslive_nbs/3_combining_raster_vector_data.ipynb), we encountered a warning. This notebook includes a step-by-step demonstration of troubleshooting this warning, identifying its source and resolving it.
## 2. Reading multiple files with `xr.open_mfdataset()`[$\tiny \nearrow$](nbs/2_read_w_xropen_mfdataset.ipynb)
*From the Sentinel-1 RTC [tutorial](../sentinel1/s1_intro.md).*
Xarray's `xr.open_mfdataset()` [function](https://docs.xarray.dev/en/stable/generated/xarray.open_mfdataset.html) allows the user to read in and combine multiple files at once to produce a single `xr.Dataset` object. This approach was explored when developing the [Read ASF-processed Sentinel-1 RTC data notebook](../sentinel1/nbs/1_read_asf_data.ipynb). However, `xr.open_mfdataset()` didn't work well for this purpose because, while the stack of raster files used in this example covers a common area of interest, it includes several different spatial footprints. This creates problems when specifying a chunking strategy.
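A rough in-memory sketch (assuming `xarray` is installed) of what `xr.open_mfdataset()` does conceptually: open each file as a dataset, then combine along shared coordinates. Here two tiny hand-built datasets stand in for files on disk:

```python
import xarray as xr

# Two tiny datasets standing in for two files that share a "time" dimension.
ds1 = xr.Dataset({"v": ("time", [1.0, 2.0])}, coords={"time": [0, 1]})
ds2 = xr.Dataset({"v": ("time", [3.0, 4.0])}, coords={"time": [2, 3]})

# xr.open_mfdataset(paths, combine="by_coords") does roughly this,
# after opening each path (and optionally chunking with dask):
combined = xr.combine_by_coords([ds1, ds2])
print(combined.sizes["time"])  # 4
```

When the files have mismatched spatial coordinates, as in the Sentinel-1 stack described above, this combine step is exactly where alignment and chunking problems surface.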
```{note}
If you wanted to select scenes from a single viewing geometry at the expense of a denser time series, `xr.open_mfdataset()` might work a bit better (this approach has not been tested).
```
## 3. Another regridding approach using `xESMF`[$\tiny \nearrow$](nbs/3_regridding_w_xesmf.ipynb)
This notebook demonstrates an alternative approach to the regridding shown in [notebook 5](../sentinel1/nbs/5_comparing_s1_rtc_datasets.ipynb) of Tutorial 2, this time using a different regridding package.
Both tutorials use functions that are stored in scripts associated with each dataset. You can find these scripts here: [`itslive_tools.py`](../itslive/nbs/itslive_tools.py) and [`s1_tools.py`](../s1/nbs/s1_tools.py).
# Cloud-native geospatial datacube workflows with open-source tools
Welcome to `Cloud-native geospatial datacube workflows with open-source tools`! This tutorial demonstrates steps of typical scientific workflows involving Earth observation data, with a focus on cloud-optimized data formats, larger-than-memory data, and manipulating multi-dimensional datasets to prepare them for analysis.
We focus on data derived from different types of satellite imagery that are publicly available in cloud-hosted repositories such as [AWS S3](https://aws.amazon.com/s3/). These tutorials demonstrate how to work with this data in Python, using software packages from the popular [Pangeo](https://www.pangeo.io/) ecosystem that are built on and around the [Xarray](https://xarray.dev/) data model.
This tutorial will spend a lot of time discussing the following concepts; if they're unfamiliar to you, we recommend first heading to [Relevant Concepts](../background/relevant_concepts.md).
1. {term}`Larger than memory data`
2. {term}`Dask`
3. {term}`Chunking`
:::
:::{tab-item} Learning goals
For more background on the data used in this tutorial, head to [Tutorial Data](../background/tutorial_data.md).
::::
To get started with this tutorial, make sure you've followed the instructions on the [Software](../intro/software.md) page for downloading the necessary material and setting up a virtual environment, then head to the first notebook.
In addition to passing `url` to `xr.open_dataset()`, we include `chunks='auto'`. This introduces [dask](https://www.dask.org/) into our workflow; `chunks='auto'` will choose chunk sizes that match the underlying data structure, which is often ideal, but sometimes you may need to specify different chunking schemes. You can read more about choosing good chunk sizes [here](https://blog.dask.org/2021/11/02/choosing-dask-chunk-sizes); subsequent notebooks in this tutorial will explore different approaches to dask chunking.
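As a back-of-the-envelope illustration of what a chunking scheme implies (a toy calculation, not dask itself), the number of chunks along each dimension is just a ceiling division of the array shape by the chunk shape:

```python
import math

def chunk_grid(shape, chunks):
    """Chunks per dimension for a given chunk shape — the same count dask
    exposes as `array.numblocks`. Toy illustration; shapes here are invented."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))

# e.g. an (840, 10000, 10000) (time, y, x) cube with (840, 500, 500) chunks:
print(chunk_grid((840, 10000, 10000), (840, 500, 500)))  # (1, 20, 20)
```

Schemes that keep the time dimension in one chunk, as above, favor per-pixel time-series operations; chunking along time instead would favor spatial operations on single time steps.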