Merged
12 changes: 0 additions & 12 deletions _typos.toml
@@ -26,9 +26,6 @@ Nervana = "Nervana"

# These words need to be fixed
Creenshot = "Creenshot"
Embeddding = "Embeddding"
Embeding = "Embeding"
Engish = "Engish"
Learing = "Learing"
Moible = "Moible"
Operaton = "Operaton"
@@ -57,15 +54,6 @@ dimention = "dimention"
dimentions = "dimentions"
dirrectories = "dirrectories"
disucssion = "disucssion"
egde = "egde"
enviornment = "enviornment"
erros = "erros"
evalute = "evalute"
exampels = "exampels"
exection = "exection"
exlusive = "exlusive"
exmaple = "exmaple"
exsits = "exsits"
feeded = "feeded"
flaot = "flaot"
fliters = "fliters"
2 changes: 1 addition & 1 deletion ci_scripts/check_api_docs_en.py
@@ -124,6 +124,6 @@ def check_system_message_in_doc(doc_file):
if error_files:
print("error files: ", error_files)
print(
"ERROR: these docs exsits System Message: WARNING/ERROR, please check and fix them"
"ERROR: these docs exists System Message: WARNING/ERROR, please check and fix them"
)
sys.exit(1)
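The intent of this check — fail CI when Sphinx leaves "System Message: WARNING/ERROR" blocks in the built English docs — can be condensed into a few lines (names such as `check_docs` are hypothetical sketches, not the actual `ci_scripts` implementation):

```python
import re

def find_system_messages(doc_text):
    # Sphinx embeds build problems into rendered output as "System Message: ..."
    return re.findall(r"System Message: (?:WARNING|ERROR)", doc_text)

def check_docs(docs):
    # docs: mapping of file name -> rendered text; returns a shell-style exit code
    error_files = [name for name, text in docs.items() if find_system_messages(text)]
    if error_files:
        print("error files: ", error_files)
        return 1
    return 0

status = check_docs({
    "ok.html": "<p>fine</p>",
    "bad.html": "System Message: ERROR\nUnexpected indentation.",
})
print(status)  # 1 -- bad.html contains a System Message
```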
2 changes: 1 addition & 1 deletion ci_scripts/check_api_docs_en.sh
@@ -13,7 +13,7 @@ function check_system_message(){
fi
}

echo "RUN Engish API Docs Checks"
echo "RUN English API Docs Checks"
jsonfn=$1
output_path=$2
need_check_api_py_files="${3}"
2 changes: 1 addition & 1 deletion docs/design/dist_train/README.md
@@ -48,7 +48,7 @@ The training process of asynchronous training can be:
2. Trainer gets all parameters back from pserver.

### Note:
There are also some conditions that need to consider. For exmaple:
There are also some conditions that need to consider. For example:

1. If trainer needs to wait for the pserver to apply it's gradient and then get back the parameters back.
1. If we need a lock between parameter update and parameter fetch.
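The asynchronous flow above (the pserver applies gradients as they arrive; trainers fetch parameters back) can be sketched as a toy single-process simulation — `pserver`, `apply_gradient`, and `fetch_parameters` are illustrative names, not the real Paddle API:

```python
import numpy as np

# Toy parameter server: parameter name -> value.
pserver = {"w": np.zeros(4)}

def apply_gradient(grads, lr=0.1):
    # The pserver applies each trainer's gradient as soon as it arrives.
    for name, g in grads.items():
        pserver[name] -= lr * g

def fetch_parameters():
    # A trainer gets all parameters back from the pserver.
    return {name: value.copy() for name, value in pserver.items()}

# Two trainers push gradients asynchronously (no lock, no barrier),
# then one of them fetches the result.
apply_gradient({"w": np.ones(4)})
apply_gradient({"w": 2 * np.ones(4)})
params = fetch_parameters()
print(params["w"])  # [-0.3 -0.3 -0.3 -0.3]
```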
2 changes: 1 addition & 1 deletion docs/design/memory/memory_optimization.md
@@ -60,7 +60,7 @@ We can leran these techniques from compilers. There are mainly two stages to mak


#### Control Flow Graph
To perform analysis on a program, it is often useful to make a control flow graph. A [control flow graph](https://en.wikipedia.org/wiki/Control_flow_graph) (CFG) in computer science is a representation, using graph notation, of all paths that might be traversed through a program during its execution. Each statement in the program is a node in the flow graph; if statemment x can be followed by statement y, there is an egde from x to y.
To perform analysis on a program, it is often useful to make a control flow graph. A [control flow graph](https://en.wikipedia.org/wiki/Control_flow_graph) (CFG) in computer science is a representation, using graph notation, of all paths that might be traversed through a program during its execution. Each statement in the program is a node in the flow graph; if statemment x can be followed by statement y, there is an edge from x to y.
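Concretely, the statement-to-edge relation described above can be sketched as a plain adjacency map (a minimal illustration, not Paddle's actual graph representation):

```python
# CFG for:  s1; while (cond) { s2; }  s3
# Each statement is a node; an edge x -> y means y can execute right after x.
cfg = {
    "s1": ["cond"],
    "cond": ["s2", "s3"],   # branch: enter the loop body, or exit
    "s2": ["cond"],         # back edge closing the loop
    "s3": [],
}

def successors(stmt):
    return cfg[stmt]

def predecessors(stmt):
    return [n for n, succs in cfg.items() if stmt in succs]

print(successors("cond"))    # ['s2', 's3']
print(predecessors("cond"))  # ['s1', 's2']
```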

Following is the flow graph for a simple loop.

2 changes: 1 addition & 1 deletion docs/design/mkldnn/inplace/inplace.md
@@ -94,4 +94,4 @@ replace this original name in all of next op instances.

\* oneDNN gelu kernel is able to perform in-place execution, but currently gelu op does not support in-place execution.

\*\* sum kernel is using oneDNN sum primitive that does not provide in-place exection, so in-place computation is done faked through external buffer. So it was not added into oneDNN inplace pass.
\*\* sum kernel is using oneDNN sum primitive that does not provide in-place execution, so in-place computation is done faked through external buffer. So it was not added into oneDNN inplace pass.
2 changes: 1 addition & 1 deletion docs/design/phi/kernel_migrate_cn.md
@@ -159,7 +159,7 @@ void LogSoftmaxKernel(const Context& dev_ctx,
| `auto* ptr = out->mutbale_data()` | `auto* ptr = out->data()` |
| `out->mutbale_data(dims, place)` | `out->Resize(dims); dev_ctx.template Alloc(out)` |
| `out->mutbale_data(place, dtype)` | `dev_ctx.Alloc(out, dtype)` |
| `platform::erros::XXX` | `phi::erros::XXX` |
| `platform::errors::XXX` | `phi::errors::XXX` |
| `platform::float16/bfloat16/complex64/complex128` | `dtype::float16/bfloat16/complex64/complex128` |
| `framework::Eigen***` | `Eigen***` |
| `platform::XXXPlace` | `phi::XXXPlace` |
2 changes: 1 addition & 1 deletion docs/design/phi/kernel_migrate_en.md
@@ -159,7 +159,7 @@ Secondly, it is necessary to replace some of the types or functions that were on
| `auto* ptr = out->mutbale_data()` | `auto* ptr = out->data()` |
| `out->mutbale_data(dims, place)` | `out->Resize(dims); dev_ctx.template Alloc(out)` |
| `out->mutbale_data(place, dtype)` | `dev_ctx.Alloc(out, dtype)` |
| `platform::erros::XXX` | `phi::erros::XXX` |
| `platform::errors::XXX` | `phi::errors::XXX` |
| `platform::float16/bfloat16/complex64/complex128` | `dtype::float16/bfloat16/complex64/complex128` |
| `framework::Eigen***` | `Eigen***` |
| `platform::XXXPlace` | `phi::XXXPlace` |
@@ -10,7 +10,7 @@ In this section we will walk through the steps required to extend a fake hardwar

**InitPlugin**

As a custom runtime entry function, InitPlugin is required to be implemented by the plug-in. The parameter in InitPlugin should also be checked, device information should be filled in, and the runtime API should be registered. In the initialization, PaddlePaddle loads the plug-in and invokes InitPlugin to initialize it, and register runtime (The whole process can be done automatically by the framework, only if the dynamic-link library is in site-packages/paddle-plugins/ or the designated directory of the enviornment variable of CUSTOM_DEVICE_ROOT).
As a custom runtime entry function, InitPlugin is required to be implemented by the plug-in. The parameter in InitPlugin should also be checked, device information should be filled in, and the runtime API should be registered. In the initialization, PaddlePaddle loads the plug-in and invokes InitPlugin to initialize it, and register runtime (The whole process can be done automatically by the framework, only if the dynamic-link library is in site-packages/paddle-plugins/ or the designated directory of the environment variable of CUSTOM_DEVICE_ROOT).

Example:

2 changes: 1 addition & 1 deletion docs/guides/06_distributed_training/model_parallel_cn.rst
@@ -29,7 +29,7 @@

对于 Embedding 操作,可以将其理解为一种查找表操作。即,将输入看做索引,将 Embedding 参数看做查找表,根据该索引查表得到相应的输出,如下图(a)所示。当采用模型并行时,Embedding 的参数被均匀切分到多个卡上。假设 Embedding 参数的维度为 N*D,并采用 K 张卡执行模型并行,那么模型并行模式下每张卡上的 Embedding 参数的维度为 N//K*D。当参数的维度 N 不能被卡数 K 整除时,最后一张卡的参数维度值为(N//K+N%K)*D。以下图(b)为例,Embedding 参数的维度为 8*D,采用 2 张卡执行模型并行,那么每张卡上 Embedding 参数的维度为 4*D。

为了便于说明,以下我们均假设 Embedding 的参数维度值 D 可以被模型并行的卡数 D 整除。此时,每张卡上 Embedding 参数的索引值为[0, N/K),逻辑索引值为[k*N/K, (k+1)*N/K),其中 k 表示卡序号,0<=k<K。对于输入索引 I,如果该索引在该卡表示的逻辑索引范围内,则返回该索引所表示的表项(索引值为 I-k*N/K;否则,返回值为全 0 的虚拟表项。随后,通过 AllReduce 操作获取所有输出表项的和,即对应该 Embeding 操作的输出;整个查表过程如下图(b)所示。
为了便于说明,以下我们均假设 Embedding 的参数维度值 D 可以被模型并行的卡数 D 整除。此时,每张卡上 Embedding 参数的索引值为[0, N/K),逻辑索引值为[k*N/K, (k+1)*N/K),其中 k 表示卡序号,0<=k<K。对于输入索引 I,如果该索引在该卡表示的逻辑索引范围内,则返回该索引所表示的表项(索引值为 I-k*N/K;否则,返回值为全 0 的虚拟表项。随后,通过 AllReduce 操作获取所有输出表项的和,即对应该 Embedding 操作的输出;整个查表过程如下图(b)所示。
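The sharded lookup described above (a local hit returns the table row, an out-of-range index returns an all-zero entry, and AllReduce sums across cards) can be simulated in a few lines of NumPy; the plain sum over per-card results stands in for AllReduce:

```python
import numpy as np

N, K, D = 8, 2, 3  # 8 rows, 2 cards, embedding width 3 (toy sizes)
table = np.arange(N * D, dtype=np.float64).reshape(N, D)  # full embedding table
shards = np.split(table, K)  # card k holds rows [k*N/K, (k+1)*N/K)

def parallel_lookup(idx):
    per_card = []
    for k in range(K):
        lo, hi = k * N // K, (k + 1) * N // K
        if lo <= idx < hi:
            per_card.append(shards[k][idx - lo])  # index in this card's range
        else:
            per_card.append(np.zeros(D))          # otherwise: all-zero dummy entry
    return np.sum(per_card, axis=0)               # "AllReduce" recovers the row

print(parallel_lookup(5))  # [15. 16. 17.] == table[5]
```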

.. image:: ./images/parallel_embedding.png
:width: 600
4 changes: 2 additions & 2 deletions docs/guides/advanced/customize_cn.ipynb
@@ -318,8 +318,8 @@
" def on_epoch_end(self, epoch, logs=None) 每轮训练结束后,`Model.fit`接口中调用\n",
" def on_train_batch_begin(self, step, logs=None) 单个Batch训练开始前,`Model.fit`和`Model.train_batch`接口中调用\n",
" def on_train_batch_end(self, step, logs=None) 单个Batch训练结束后,`Model.fit`和`Model.train_batch`接口中调用\n",
" def on_eval_batch_begin(self, step, logs=None) 单个Batch评估开始前,`Model.evalute`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_end(self, step, logs=None) 单个Batch评估结束后,`Model.evalute`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_begin(self, step, logs=None) 单个Batch评估开始前,`Model.evaluate`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_end(self, step, logs=None) 单个Batch评估结束后,`Model.evaluate`和`Model.eval_batch`接口中调用\n",
" def on_predict_batch_begin(self, step, logs=None) 单个Batch推理开始前,`Model.predict`和`Model.test_batch`接口中调用\n",
" def on_predict_batch_end(self, step, logs=None) 单个Batch推理结束后,`Model.predict`和`Model.test_batch`接口中调用\n",
" \"\"\"\n",
2 changes: 1 addition & 1 deletion docs/guides/advanced/layer_and_model_en.md
@@ -264,7 +264,7 @@ Tensor(shape=[10, 1], dtype=float32, place=CPUPlace, stop_gradient=True,
...
```

Here we first set the execution mode to **eval**, and soon after to **train**. The two execution modes are exlusive therefore the latter mode will override the former.
Here we first set the execution mode to **eval**, and soon after to **train**. The two execution modes are exclusive therefore the latter mode will override the former.

### Perform an execution

@@ -30,5 +30,5 @@ paddle.nn.functional.avg_pool1d(x, kernel_size, stride=None, padding=0, exclusiv
torch.nn.functional.avg_pool1d(input=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, count_include_pad=False)

# Paddle 写法
paddle.nn.functional.avg_pool1d(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exlusive=True)
paddle.nn.functional.avg_pool1d(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exclusive=True)
```
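The `count_include_pad=False` ↔ `exclusive=True` correspondence behind this fix can be checked with a pure-NumPy sketch (a toy 1-D implementation with `ceil_mode` omitted, not the framework kernels):

```python
import numpy as np

def avg_pool1d(x, kernel_size=2, stride=2, padding=1, exclusive=True):
    padded = np.pad(x, padding)              # zero-pad both ends
    mask = np.pad(np.ones_like(x), padding)  # 1 on real values, 0 on padding
    out = []
    for start in range(0, len(padded) - kernel_size + 1, stride):
        window = padded[start:start + kernel_size]
        # exclusive=True (count_include_pad=False): divide by real elements only
        divisor = mask[start:start + kernel_size].sum() if exclusive else kernel_size
        out.append(window.sum() / divisor)
    return np.array(out)

x = np.array([2.0, 4.0, 6.0, 8.0])
print(avg_pool1d(x, exclusive=True))   # [2. 5. 8.]  edge windows ignore the padding
print(avg_pool1d(x, exclusive=False))  # [1. 5. 4.]  padding zeros dilute the edges
```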
@@ -31,5 +31,5 @@ paddle.nn.functional.avg_pool2d(x, kernel_size, stride=None, padding=0, ceil_mod
torch.nn.functional.avg_pool2d(input=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, count_include_pad=False)

# Paddle 写法
paddle.nn.AvgPool2D(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exlusive=True)
paddle.nn.AvgPool2D(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exclusive=True)
```
@@ -31,5 +31,5 @@ paddle.nn.functional.avg_pool3d(x, kernel_size, stride=None, padding=0, ceil_mod
torch.nn.functional.avg_pool3d(input=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, count_include_pad=False)

# Paddle 写法
paddle.nn.functional.avg_pool3d(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exlusive=True)
paddle.nn.functional.avg_pool3d(x=input, kernel_size=2, stride=2, padding=1, ceil_mode=True, exclusive=True)
```
@@ -387,7 +387,7 @@ def _init_weights(self, module):
| [torch.nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html?highlight=dropout#torch.nn.Dropout) | [paddle.nn.Dropout](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/Dropout_cn.html#dropout) | PyTorch 有 inplace 参数,表示在不更改变量的内存地址的情况下,直接修改变量的值,飞桨无此参数。 |
| [torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html?highlight=linear#torch.nn.Linear) | [paddle.nn.Linear](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/Linear_cn.html#linear) | PyTorch `bias`默认为 True,表示使用可更新的偏置参数。飞桨 `weight_attr`/`bias_attr`默认使用默认的权重/偏置参数属性,否则为指定的权重/偏置参数属性,具体用法参见[ParamAttr](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/ParamAttr_cn.html#paramattr);当`bias_attr`设置为 bool 类型与 PyTorch 的作用一致。 |
| [torch.nn.LayerNorm](https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html?highlight=layernorm#torch.nn.LayerNorm) | [paddle.nn.LayerNorm](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/nn/LayerNorm_cn.html#layernorm) | 注意参数 epsilon 不同模型参数值,可能不同,对模型精度影响大。 |
| [torch.nn.Embedding](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html?highlight=embedding#torch.nn.Embedding) | [paddle.nn.Embedding](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/Embedding_cn.html#embedding) | PyTorch:当 max_norm 不为`None`时,如果 Embeddding 向量的范数(范数的计算方式由 norm_type 决定)超过了 max_norm 这个界限,就要再进行归一化。PaddlePaddle:PaddlePaddle 无此要求,因此不需要归一化。PyTorch:若 scale_grad_by_freq 设置为`True`,会根据单词在 mini-batch 中出现的频率,对梯度进行放缩。 PaddlePaddle:PaddlePaddle 无此功能。 |
| [torch.nn.Embedding](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html?highlight=embedding#torch.nn.Embedding) | [paddle.nn.Embedding](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/Embedding_cn.html#embedding) | PyTorch:当 max_norm 不为`None`时,如果 Embedding 向量的范数(范数的计算方式由 norm_type 决定)超过了 max_norm 这个界限,就要再进行归一化。PaddlePaddle:PaddlePaddle 无此要求,因此不需要归一化。PyTorch:若 scale_grad_by_freq 设置为`True`,会根据单词在 mini-batch 中出现的频率,对梯度进行放缩。 PaddlePaddle:PaddlePaddle 无此功能。 |

### 3.2 权重转换

12 changes: 6 additions & 6 deletions docs/practices/nlp/seq2seq_with_attention.ipynb
@@ -670,20 +670,20 @@
"encoder.eval()\n",
"atten_decoder.eval()\n",
"\n",
"num_of_exampels_to_evaluate = 10\n",
"num_of_examples_to_evaluate = 10\n",
"\n",
"indices = np.random.choice(\n",
" len(train_en_sents), num_of_exampels_to_evaluate, replace=False\n",
" len(train_en_sents), num_of_examples_to_evaluate, replace=False\n",
")\n",
"x_data = train_en_sents[indices]\n",
"sent = paddle.to_tensor(x_data)\n",
"en_repr = encoder(sent)\n",
"\n",
"word = np.array([[cn_vocab[\"<bos>\"]]] * num_of_exampels_to_evaluate)\n",
"word = np.array([[cn_vocab[\"<bos>\"]]] * num_of_examples_to_evaluate)\n",
"word = paddle.to_tensor(word)\n",
"\n",
"hidden = paddle.zeros([num_of_exampels_to_evaluate, 1, hidden_size])\n",
"cell = paddle.zeros([num_of_exampels_to_evaluate, 1, hidden_size])\n",
"hidden = paddle.zeros([num_of_examples_to_evaluate, 1, hidden_size])\n",
"cell = paddle.zeros([num_of_examples_to_evaluate, 1, hidden_size])\n",
"\n",
"decoded_sent = []\n",
"for i in range(MAX_LEN + 2):\n",
@@ -693,7 +693,7 @@
" word = paddle.unsqueeze(word, axis=-1)\n",
"\n",
"results = np.stack(decoded_sent, axis=1)\n",
"for i in range(num_of_exampels_to_evaluate):\n",
"for i in range(num_of_examples_to_evaluate):\n",
" en_input = \" \".join(filtered_pairs[indices[i]][0])\n",
" ground_truth_translate = \"\".join(filtered_pairs[indices[i]][1])\n",
" model_translate = \"\"\n",
4 changes: 2 additions & 2 deletions docs/practices/quick_start/high_level_api.ipynb
@@ -941,8 +941,8 @@
" def on_epoch_end(self, epoch, logs=None) 每轮训练结束后,`Model.fit`接口中调用 \n",
" def on_train_batch_begin(self, step, logs=None) 单个Batch训练开始前,`Model.fit`和`Model.train_batch`接口中调用\n",
" def on_train_batch_end(self, step, logs=None) 单个Batch训练结束后,`Model.fit`和`Model.train_batch`接口中调用\n",
" def on_eval_batch_begin(self, step, logs=None) 单个Batch评估开始前,`Model.evalute`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_end(self, step, logs=None) 单个Batch评估结束后,`Model.evalute`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_begin(self, step, logs=None) 单个Batch评估开始前,`Model.evaluate`和`Model.eval_batch`接口中调用\n",
" def on_eval_batch_end(self, step, logs=None) 单个Batch评估结束后,`Model.evaluate`和`Model.eval_batch`接口中调用\n",
" def on_test_batch_begin(self, step, logs=None) 单个Batch预测测试开始前,`Model.predict`和`Model.test_batch`接口中调用\n",
" def on_test_batch_end(self, step, logs=None) 单个Batch预测测试结束后,`Model.predict`和`Model.test_batch`接口中调用\n",
" \"\"\"\n",