7 changes: 0 additions & 7 deletions _typos.toml
@@ -40,12 +40,6 @@ Simle = "Simle"
Sovler = "Sovler"
Successed = "Successed"
classfy = "classfy"
contxt = "contxt"
convertion = "convertion"
convinience = "convinience"
correponding = "correponding"
corresonding = "corresonding"
correspoinding = "correspoinding"
corss = "corss"
creatation = "creatation"
creats = "creats"
@@ -135,7 +129,6 @@ similary = "similary"
simplier = "simplier"
skiped = "skiped"
softwares = "softwares"
sould = "sould"
specail = "specail"
sperated = "sperated"
splited = "splited"
4 changes: 2 additions & 2 deletions docs/design/data_type/float16.md
@@ -93,7 +93,7 @@ To support the above features, two fundamental conversion functions are provided
float16 float_to_half_rn(float f); // convert to half precision in round-to-nearest-even mode
float half_to_float(float16 h);
```
which provides one-to-one conversion between float32 and float16. These twos functions will do different conversion routines based on the current hardware. CUDA/ARM instrinsics will be used when the corresonding hardware is available. If the hardware or compiler level does not support float32 to float16 conversion, software emulation will be performed to do the conversion.
which provide one-to-one conversion between float32 and float16. These two functions use different conversion routines depending on the current hardware: CUDA/ARM intrinsics are used when the corresponding hardware is available. If the hardware or compiler does not support float32-to-float16 conversion, software emulation is used to perform the conversion.
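As a quick illustration of these semantics, a numpy sketch behaves analogously (this is not Fluid's own API; numpy also narrows to half precision in round-to-nearest-even mode):

```Python
import numpy as np

# float32 -> float16 loses precision; float16 -> float32 is exact,
# since every half-precision value is representable in single precision.
f = np.float32(0.1)
h = np.float16(f)   # analogous to float_to_half_rn
g = np.float32(h)   # analogous to half_to_float
# half keeps an 11-bit significand, so the relative round-trip error is <= 2**-11
assert abs(g - f) <= 2.0**-11 * abs(f)
```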

## float16 inference
In Fluid, a neural network is represented as a protobuf message called [ProgramDesc](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/concepts/program.md), whose Python wrapper is a [Program](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/modules/python_api.md#program). The basic structure of a program is some nested [blocks](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/modules/python_api.md#block), where each block consists of some [variable](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/modules/python_api.md#variable) definitions and a sequence of [operators](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/modules/python_api.md#operator). An [executor](https://github.com/PaddlePaddle/docs/blob/develop/docs/design/concepts/executor.md) will run a given program desc by executing the sequence of operators in the entrance block of the program one by one.
@@ -112,7 +112,7 @@ Operators including convolution and multiplication (used in fully-connected laye

When these operators are running in float16 mode, the float16 kernel requires those parameter variables to contain weights of Fluid float16 data type. Thus, we need a convenient way to convert the original float weights to float16 weights.

In Fluid, we use tensor to hold actual data for a variable on the c++ end. [Pybind](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/tensor_py.h) is used to bind c++ tensors of certain data type with numpy array of the correponding numpy data type on the Python end. Each common c++ built-in data type has a corresponding numpy data type of the same name. However, since there is no built-in float16 type in c++, we cannot directly bind numpy float16 data type with the Fluid float16 class. Since both Fluid float16 and numpy float16 use uint16 as the internal data storage type, we use c++ built-in type `uint16_t` and the corresponding numpy uint16 data type to bridge the gap via [Pybind](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/tensor_py.h).
In Fluid, we use a tensor to hold the actual data for a variable on the c++ end. [Pybind](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/tensor_py.h) is used to bind c++ tensors of a certain data type with numpy arrays of the corresponding numpy data type on the Python end. Each common c++ built-in data type has a corresponding numpy data type of the same name. However, since there is no built-in float16 type in c++, we cannot directly bind the numpy float16 data type with the Fluid float16 class. Since both Fluid float16 and numpy float16 use uint16 as the internal data storage type, we use the c++ built-in type `uint16_t` and the corresponding numpy uint16 data type to bridge the gap via [Pybind](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/pybind/tensor_py.h).

The following code sketches how such a tensor conversion could be done (the snippet is a hedged reconstruction using numpy; `tensor.set` and `place` are illustrative names):
```Python
import numpy as np

# Convert float32 weights to float16, then reinterpret the bits as
# uint16, the storage type that Pybind binds to the Fluid float16 tensor.
w_fp32 = np.random.rand(3, 4).astype(np.float32)
w_fp16 = w_fp32.astype(np.float16)   # round-to-nearest-even narrowing
w_u16 = w_fp16.view(np.uint16)       # bit reinterpretation, no value change
# tensor.set(w_u16, place)           # hypothetical: hand the uint16 array to the c++ tensor
```
2 changes: 1 addition & 1 deletion docs/design/dynamic_rnn/rnn_design.md
@@ -62,7 +62,7 @@ public:
LODTensor LODSliceShared(int level, int elem_begin, int elem_end) const;

// copy other's lod_start_pos_, to share LOD info.
// NOTE the LOD info sould not be changed.
// NOTE the LOD info should not be changed.
void ShareConstLODFrom(const LODTensor &other) {
lod_start_pos_ = other.lod_start_pos_;
}
2 changes: 1 addition & 1 deletion docs/design/dynamic_rnn/rnn_design_en.md
@@ -53,7 +53,7 @@ public:
LODTensor LODSliceShared(int level, int elem_begin, int elem_end) const;

// copy other's lod_start_pos_, to share LOD info.
// NOTE the LOD info sould not be changed.
// NOTE the LOD info should not be changed.
void ShareConstLODFrom(const LODTensor &other) {
lod_start_pos_ = other.lod_start_pos_;
}
2 changes: 1 addition & 1 deletion docs/design/mkldnn/int8/QAT/C++.md
@@ -51,7 +51,7 @@ To download other Quant models, set the `QUANT_MODEL_NAME` variable to one of the
- `ResNet50_qat_channelwise`, with input/output scales in `fake_quantize_range_abs_max` operators and the `out_threshold` attributes, with weight scales in `fake_channel_wise_dequantize_max_abs` operators


### Model convertion
### Model conversion

To run this quantization approach, you first need to set up `AnalysisConfig` and use the `EnableMkldnnInt8` function, which converts the fake-quant model to an INT8 OneDNN one.
Examples:
2 changes: 1 addition & 1 deletion docs/design/motivation/api.md
@@ -54,7 +54,7 @@ def f(in):
return o

# Create 3 topologies (subnets), they share parameters because all
# correspoinding layers have the same parameter names.
# corresponding layers have the same parameter names.
fA = f(paddle.layer.data(input_name="A"))
fB = f(paddle.layer.data(input_name="B"))
fQ = f(paddle.layer.data(input_name="Q"))
2 changes: 1 addition & 1 deletion docs/dev_guides/sugon/paddle_c86_cn.md
@@ -33,7 +33,7 @@ The ROCm software stack itself is fairly mature and complete; based on what ROCm
- Dynamic library loading: the ROCm acceleration libraries and the required APIs are loaded dynamically under the [paddle/phi/backends/dynload](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/backends/dynload) directory, e.g. [hiprand.h](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/backends/dynload/hiprand.h), [miopen.h](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/backends/dynload/miopen.h), [rocblas.h](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/backends/dynload/rocblas.h), etc.
- Driver/Runtime adaptation: the relevant HIP and CUDA APIs are wrapped mainly under the [paddle/fluid/platform/device/gpu](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/platform/device/gpu) directory; [gpu_types.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/platform/device/gpu/gpu_types.h) lightly wraps a few data type definitions that differ only slightly from CUDA, and some ROCm-specific code lives in the [paddle/phi/core/platform/device/gpu/rocm](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/core/platform/device/gpu/rocm) directory
- Memory management: the Driver/Runtime APIs wrapped in the previous step are used to implement [memcpy.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/memory/memcpy.cc#L574) and the various memory allocators under the [paddle/phi/core/memory/allocation](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/phi/core/memory/allocation) directory
- Device Context management: the wrapped APIs are used to manage the device context and initialize the device pool, located in [device_contxt.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/platform/device_context.h)
- Device Context management: the wrapped APIs are used to manage the device context and initialize the device pool, located in [device_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/platform/device_context.h)
- Other device-management adaptations, such as Profiler, Tracer, Error Message, and NCCL; the code mainly lives under the [Paddle/platform](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/platform) directory
3. Operator registration: mainly the registration of HIP kernel operators, as well as the registration of MIOpen operators on the ROCm platform
- Data type support: in addition to the common data types, the special data types supported by Paddle also need to be adapted, including [float16.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/float16.h#L144), [complex.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/complex.h#L88), [bfloat16.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/bfloat16.h#L65), etc.
2 changes: 1 addition & 1 deletion docs/guides/model_convert/update_en.md
@@ -67,7 +67,7 @@ In order to make the API organization more concise and clear, the original direc

### API alias rule

- APIs are created with aliases in different paths for better convinience:
- APIs are created with aliases in different paths for convenience:
- All APIs under the device, framework, and tensor directories are aliased in the paddle root directory; apart from a few special APIs, no other APIs are aliased in the paddle root directory.
- All APIs in the paddle.nn directory except for the functional directory have aliases in the paddle.nn directory; all APIs in the functional directory have no aliases in the paddle.nn directory.
- **It is recommended to give preference to aliases with shorter paths**; for example, for `paddle.add -> paddle.tensor.add`, `paddle.add` is recommended.