|
14 | 14 | }, |
15 | 15 | "source": [ |
16 | 16 | "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n", |
17 | | - "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Introduction\" data-toc-modified-id=\"Introduction-1\"><span class=\"toc-item-num\">1 </span>Introduction</a></span></li><li><span><a href=\"#RandLA-Net-architecture\" data-toc-modified-id=\"RandLA-Net-architecture-2\"><span class=\"toc-item-num\">2 </span>RandLA-Net architecture</a></span><ul class=\"toc-item\"><li><span><a href=\"#Local-Feature-Aggregation\" data-toc-modified-id=\"Local-Feature-Aggregation-2.1\"><span class=\"toc-item-num\">2.1 </span>Local Feature Aggregation</a></span></li></ul></li><li><span><a href=\"#Implementation-in-arcgis.learn\" data-toc-modified-id=\"Implementation-in-arcgis.learn-3\"><span class=\"toc-item-num\">3 </span>Implementation in <code>arcgis.learn</code></a></span><ul class=\"toc-item\"><li><span><a href=\"#For-advanced-users\" data-toc-modified-id=\"For-advanced-users-3.1\"><span class=\"toc-item-num\">3.1 </span>For advanced users</a></span></li></ul></li><li><span><a href=\"#Setting-up-the-environment\" data-toc-modified-id=\"Setting-up-the-environment-4\"><span class=\"toc-item-num\">4 </span>Setting up the environment</a></span><ul class=\"toc-item\"><li><ul class=\"toc-item\"><li><span><a href=\"#For-ArcGIS-Pro-users:\" data-toc-modified-id=\"For-ArcGIS-Pro-users:-4.0.1\"><span class=\"toc-item-num\">4.0.1 </span>For ArcGIS Pro users:</a></span></li><li><span><a href=\"#For-Anaconda-users-(Windows-and-Linux-platforms):\" data-toc-modified-id=\"For-Anaconda-users-(Windows-and-Linux-platforms):-4.0.2\"><span class=\"toc-item-num\">4.0.2 </span>For Anaconda users (Windows and Linux platforms):</a></span></li></ul></li></ul></li><li><span><a href=\"#Best-practices-for-RandLA-Net-workflow\" data-toc-modified-id=\"Best-practices-for-RandLA-Net-workflow-5\"><span class=\"toc-item-num\">5 </span>Best practices for RandLA-Net workflow</a></span></li><li><span><a href=\"#References\" data-toc-modified-id=\"References-6\"><span class=\"toc-item-num\">6 
</span>References</a></span></li></ul></div>" |
| 17 | + "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Introduction\" data-toc-modified-id=\"Introduction-1\"><span class=\"toc-item-num\">1 </span>Introduction</a></span></li><li><span><a href=\"#RandLA-Net-architecture\" data-toc-modified-id=\"RandLA-Net-architecture-2\"><span class=\"toc-item-num\">2 </span>RandLA-Net architecture</a></span><ul class=\"toc-item\"><li><span><a href=\"#Local-Feature-Aggregation\" data-toc-modified-id=\"Local-Feature-Aggregation-2.1\"><span class=\"toc-item-num\">2.1 </span>Local Feature Aggregation</a></span></li></ul></li><li><span><a href=\"#Implementation-in-arcgis.learn\" data-toc-modified-id=\"Implementation-in-arcgis.learn-3\"><span class=\"toc-item-num\">3 </span>Implementation in <code>arcgis.learn</code></a></span><ul class=\"toc-item\"><li><span><a href=\"#For-advanced-users\" data-toc-modified-id=\"For-advanced-users-3.1\"><span class=\"toc-item-num\">3.1 </span>For advanced users</a></span></li></ul></li><li><span><a href=\"#Best-practices-for-RandLA-Net-workflow\" data-toc-modified-id=\"Best-practices-for-RandLA-Net-workflow-4\"><span class=\"toc-item-num\">4 </span>Best practices for RandLA-Net workflow</a></span></li><li><span><a href=\"#References\" data-toc-modified-id=\"References-5\"><span class=\"toc-item-num\">5 </span>References</a></span></li></ul></div>" |
18 | 18 | ] |
19 | 19 | }, |
20 | 20 | { |
|
52 | 52 | "cell_type": "markdown", |
53 | 53 | "metadata": {}, |
54 | 54 | "source": [ |
55 | | - "Point cloud classification is a task where each point in the point cloud is assigned a label, representing a real-world entity (see Figure 1.). And similar to how it's done in traditional methods, for deep learning, the point cloud classification process involves training – where the neural network learns from an already classified (labeled) point cloud dataset, where each point has a unique class code. These class codes are used to represent the features that we want the neural network to recognize. \n", |
| 55 | + "Point cloud classification is a task in which each point in the point cloud is assigned a label representing a real-world entity (see Figure 1). As with traditional methods, deep learning based point cloud classification involves a training step, in which the neural network learns from an already classified (labeled) point cloud dataset where each point has a unique class code. These class codes represent the features that we want the neural network to recognize. \n", |
56 | 56 | "\n", |
57 | 57 | "In deep learning workflows for point cloud classification, one should not use a ‘thinned-out’ representation of a point cloud dataset that preserves only class codes of interest but drops a majority of the undesired return points, because we would like the neural network to learn to differentiate points of interest from those that are not. Likewise, additional attributes that are present in training datasets, for example, intensity, RGB, number of returns, etc., will improve the model’s accuracy but could adversely affect it if those attributes are not correct in the datasets that are used for inferencing." |
58 | 58 | ] |
|
90 | 90 | "cell_type": "markdown", |
91 | 91 | "metadata": {}, |
92 | 92 | "source": [ |
93 | | - "RandLA-Net is an architecture that allows for the learning of point features within a point cloud by using an encoder-decoder sequence with skip connections. The network applies shared MLP layers along with four encoding and decoding layers, as well as three fully-connected layers and a dropout layer to predict the semantic label of each point (see Figure 2.).\n", |
| 93 | + "RandLA-Net is an architecture that allows for the learning of point features within a point cloud by using an encoder-decoder sequence with skip connections. The network applies shared MLP layers along with four encoding and decoding layers, as well as three fully-connected layers and a dropout layer to predict the semantic label of each point (see Figure 2).\n", |
94 | 94 | "\n", |
95 | 95 | "\n", |
96 | 96 | "- The input to the architecture is a large-scale point cloud consisting of N points with feature dimensions of d<sub>in</sub>, where the batch dimension is dropped for simplicity.\n", |
|
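Reviewer note: the encoder described above progressively downsamples the input cloud, and RandLA-Net's distinguishing choice (per reference [1]) is plain random sampling, which costs O(1) per point. A minimal NumPy sketch of that sampling step, purely illustrative and not the `arcgis.learn` implementation (the function name and the 4x ratio are assumptions for the example):

```python
import numpy as np

def random_downsample(points, ratio=4, rng=None):
    """Randomly keep N // ratio points, as in RandLA-Net's encoder layers.

    points: (N, d) array of point features (e.g. xyz coordinates).
    ratio: downsampling factor applied at each encoder layer.
    """
    if rng is None:
        rng = np.random.default_rng()
    n_keep = points.shape[0] // ratio
    # Uniform sampling without replacement: O(1) decision per point,
    # unlike farthest-point sampling which is O(N^2).
    idx = rng.choice(points.shape[0], size=n_keep, replace=False)
    return points[idx]

# Toy usage: a cloud of 1024 xyz points reduced by a factor of 4.
pts = np.random.default_rng(1).normal(size=(1024, 3))
print(random_downsample(pts, ratio=4).shape)  # (256, 3)
```

Random sampling alone risks discarding informative points, which is exactly why the paper pairs it with the local feature aggregation module discussed in the next section.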
140 | 140 | "- attentive pooling,\n", |
141 | 141 | "- and dilated residual block.\n", |
142 | 142 | "\n", |
143 | | - "These units work together to learn complex local structures by preserving local geometric features while progressively increasing the receptive field size in each neural layer (see Figure 3.). The LocSE unit is introduced first to capture the local spatial encoding of the point. Then, the attentive pooling unit is leveraged to select the most useful local features that contribute the most to the classification task. Finally, the multiple LocSE and attentive pooling units are stacked together as a dilated residual block to further enhance the effective receptive field for each point in a computationally efficient way." |
| 143 | + "These units work together to learn complex local structures by preserving local geometric features while progressively increasing the receptive field size in each neural layer (see Figure 3). The LocSE unit is introduced first to capture the local spatial encoding of the point. Then, the attentive pooling unit is leveraged to select the most useful local features that contribute the most to the classification task. Finally, the multiple LocSE and attentive pooling units are stacked together as a dilated residual block to further enhance the effective receptive field for each point in a computationally efficient way." |
144 | 144 | ] |
145 | 145 | }, |
146 | 146 | { |
|
164 | 164 | "\n", |
165 | 165 | "In an attentive pooling unit, the attention mechanism is used to automatically learn important local features and aggregate neighboring point features while avoiding the loss of crucial information. It also maintains the focus on the overall objective, which is to learn complex local structures in a point cloud by considering the relative importance of neighboring point features.\n", |
166 | 166 | "\n", |
167 | | - "Lastly in the dilated residual block unit, the receptive field is increased for each point by stacking multiple LocSE and Attentive Pooling units. This dilated residual block operates by cheaply dilating the receptive field and expanding the effective neighborhood through feature propagation (see Figure 4.). Stacking more and more units enhances the receptive field and makes the block more powerful, which may compromise the overall computation efficiency and lead to overfitting. Hence, in RandLA-Net, two sets of LocSE and Attentive Pooling are stacked as a standard residual block to achieve a balance between efficiency and effectiveness <a href=\"#References\">[1]</a>." |
| 167 | + "Lastly, in the dilated residual block unit, the receptive field is increased for each point by stacking multiple LocSE and Attentive Pooling units. This dilated residual block operates by cheaply dilating the receptive field and expanding the effective neighborhood through feature propagation (see Figure 4). Stacking more and more units enhances the receptive field and makes the block more powerful, but it may also compromise overall computational efficiency and lead to overfitting. Hence, in RandLA-Net, two sets of LocSE and Attentive Pooling units are stacked as a standard residual block to achieve a balance between efficiency and effectiveness <a href=\"#References\">[1]</a>." |
168 | 168 | ] |
169 | 169 | }, |
170 | 170 | { |
|
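Reviewer note: the attentive pooling step described in this hunk can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the `arcgis.learn` implementation: the single linear layer `w` stands in for the paper's shared scoring function g(), and learning of `w` is omitted.

```python
import numpy as np

def attentive_pooling(neighbor_feats, w):
    """Aggregate K neighbor features into one vector via attention.

    neighbor_feats: (K, d) features of a point's K nearest neighbors.
    w: (d, d) weight matrix standing in for the shared scoring function.
    Returns one (d,) aggregated feature vector for the center point.
    """
    # Score each neighbor feature, then normalize across the K neighbors
    # with a numerically stable softmax.
    scores = neighbor_feats @ w                   # (K, d)
    scores = np.exp(scores - scores.max(axis=0))  # subtract max for stability
    attn = scores / scores.sum(axis=0)            # softmax over the K axis
    # Element-wise weighted sum: informative neighbor features dominate,
    # instead of max/mean pooling which discards most of the neighborhood.
    return (attn * neighbor_feats).sum(axis=0)

# Toy usage: 16 neighbors, 8-dimensional features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
w = rng.normal(size=(8, 8))
agg = attentive_pooling(feats, w)
print(agg.shape)  # (8,)
```

The element-wise product followed by a sum over neighbors matches the aggregation described in reference [1]; the attention weights sum to 1 along the neighbor axis, so the output stays on the scale of the input features.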
198 | 198 | "cell_type": "markdown", |
199 | 199 | "metadata": {}, |
200 | 200 | "source": [ |
201 | | - "For this step of exporting the data into an intermediate format, use <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/prepare-point-cloud-training-data.htm\" target=\"_blank\">Prepare Point Cloud Training Data</a> tool, in the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/an-overview-of-the-3d-analyst-toolbox.htm\" target=\"_blank\">3D Analyst extension</a>, available from ArcGIS Pro 2.8 onwards (see Figure 5.)." |
| 201 | + "For this step of exporting the data into an intermediate format, use the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/prepare-point-cloud-training-data.htm\" target=\"_blank\">Prepare Point Cloud Training Data</a> tool in the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/an-overview-of-the-3d-analyst-toolbox.htm\" target=\"_blank\">3D Analyst extension</a> (see Figure 5)." |
202 | 202 | ] |
203 | 203 | }, |
204 | 204 | { |
|
232 | 232 | "cell_type": "markdown", |
233 | 233 | "metadata": {}, |
234 | 234 | "source": [ |
235 | | - "For inferencing, use <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/classify-point-cloud-using-trained-model.htm\" target=\"_blank\">Classify Points Using Trained Model</a> tool, in the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/an-overview-of-the-3d-analyst-toolbox.htm\" target=\"_blank\">3D Analyst extension</a>, available from ArcGIS Pro 2.8 onwards (see Figure 6.).\n", |
| 235 | + "For inferencing, use the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/classify-point-cloud-using-trained-model.htm\" target=\"_blank\">Classify Points Using Trained Model</a> tool in the <a href=\"https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/an-overview-of-the-3d-analyst-toolbox.htm\" target=\"_blank\">3D Analyst extension</a> (see Figure 6).\n", |
236 | 236 | "\n", |
237 | 237 | "Main features available during the inferencing step:\n", |
238 | 238 | " \n", |
|
285 | 285 | "```" |
286 | 286 | ] |
287 | 287 | }, |
288 | | - { |
289 | | - "cell_type": "markdown", |
290 | | - "metadata": {}, |
291 | | - "source": [ |
292 | | - "## Setting up the environment" |
293 | | - ] |
294 | | - }, |
295 | | - { |
296 | | - "cell_type": "markdown", |
297 | | - "metadata": {}, |
298 | | - "source": [ |
299 | | - "<i>Make sure to update your 'GPU driver' to a recent version and use 'Administrator Rights' for all the steps, written in this guide.</i>\n", |
300 | | - "\n", |
301 | | - "_**Below, are the instructions to set up the required 'conda environment':**_" |
302 | | - ] |
303 | | - }, |
304 | | - { |
305 | | - "cell_type": "markdown", |
306 | | - "metadata": {}, |
307 | | - "source": [ |
308 | | - "#### For ArcGIS Pro users:" |
309 | | - ] |
310 | | - }, |
311 | | - { |
312 | | - "cell_type": "markdown", |
313 | | - "metadata": {}, |
314 | | - "source": [ |
315 | | - "<a href=\"https://github.com/esri/deep-learning-frameworks\" target=\"_blank\">Deep learning frameworks</a>\n", |
316 | | - "can be used to install all the required dependencies in ArcGIS Pro's default python environment using an MSI installer. \n", |
317 | | - "\n", |
318 | | - "Alternatively, \n", |
319 | | - "for a cloned environment of ArcGIS Pro's default environment, `deep-learning-essentials` metapackage can be used to install the required dependencies which can be done using the following command, in the _`Python Command Prompt`_ <i>(included with ArcGIS Pro)</i>:\n", |
320 | | - "\n", |
321 | | - "`conda install deep-learning-essentials`" |
322 | | - ] |
323 | | - }, |
324 | | - { |
325 | | - "cell_type": "markdown", |
326 | | - "metadata": {}, |
327 | | - "source": [ |
328 | | - "#### For Anaconda users (Windows and Linux platforms):" |
329 | | - ] |
330 | | - }, |
331 | | - { |
332 | | - "cell_type": "markdown", |
333 | | - "metadata": {}, |
334 | | - "source": [ |
335 | | - "`arcgis_learn` metapackage can be used for both `windows` and `linux` installations of `Anaconda` in a new environment.\n", |
336 | | - "\n", |
337 | | - "The following command will update `Anaconda` to the latest version. \n", |
338 | | - "\n", |
339 | | - "`conda update conda`\n", |
340 | | - "\n", |
341 | | - "After that, metapackage can be installed using the command below:\n", |
342 | | - "\n", |
343 | | - "`conda install -c esri arcgis_learn=3.9`" |
344 | | - ] |
345 | | - }, |
346 | 288 | { |
347 | 289 | "cell_type": "markdown", |
348 | 290 | "metadata": {}, |
|
410 | 352 | "cell_type": "markdown", |
411 | 353 | "metadata": {}, |
412 | 354 | "source": [ |
413 | | - "- `mask_class` functionality in `show_results()` can be used for analyzing any inter-class noises present in the validation output. This can be used to understand which classes need more diversity in training data or need an increase in its number of labeled points _(As shown below, in Figure 7.)_.\n", |
| 355 | + "- `mask_class` functionality in `show_results()` can be used for analyzing any inter-class noise present in the validation output. This can be used to understand which classes need more diversity in training data or an increase in their number of labeled points _(See Figure 7)_.\n", |
414 | 356 | "\n", |
415 | 357 | "\n", |
416 | 358 | "<p align=\"center\"><center><img src=\"../../static/img/pointcnn_guide_gif_1.gif\" /></center></p>\n", |
|
471 | 413 | "cell_type": "markdown", |
472 | 414 | "metadata": {}, |
473 | 415 | "source": [ |
474 | | - "\n", |
475 | 416 | "[1] Hu, Q., Yang, B., Xie, L., Rosa, S., Guo, Y., Wang, Z., Trigoni, N., & Markham, A. (2020). Randla-Net: Efficient semantic segmentation of large-scale point clouds. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 11105–11114. https://doi.org/10.1109/CVPR42600.2020.01112" |
476 | 417 | ] |
477 | 418 | } |
|
492 | 433 | "name": "python", |
493 | 434 | "nbconvert_exporter": "python", |
494 | 435 | "pygments_lexer": "ipython3", |
495 | | - "version": "3.9.18" |
| 436 | + "version": "3.11.10" |
496 | 437 | }, |
497 | 438 | "toc": { |
498 | 439 | "base_numbering": 1, |
|