Commit 97402b5
FIX: Allow multiple inlined image data links in html clean
Add a lazy quantifier to the `_find_image_dataurls` regex
so that it matches as few characters as possible
and stops at the first occurrence of `;base64,`.
For example, with the greedy quantifier:
```py
>>> import re
>>> _find_image_dataurls = re.compile(r'data:image/(.+);base64,', re.I).findall
>>> _find_image_dataurls('<div style="background: url(data:image/jpeg;base64,foo); background-image: url(data:image/jpeg;base64,foo);"></div>')
['jpeg;base64,foo); background-image: url(data:image/jpeg']
```
With the lazy quantifier, each data URL is matched separately:
```py
>>> _find_image_dataurls = re.compile(r'data:image/(.+?);base64,', re.I).findall
>>> _find_image_dataurls('<div style="background: url(data:image/jpeg;base64,foo); background-image: url(data:image/jpeg;base64,foo);"></div>')
['jpeg', 'jpeg']
```
This allows multiple image data links on the same line,
which happens, for instance, in inline styles.
Without this change, `_has_javascript_scheme` returns `True`
because the count of safe image URLs is lower than the number of
possibly malicious schemes,
and the whole style is then dropped as malicious.
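The counting check described above can be sketched as follows. This is a simplified illustration, not the cleaner's actual code: `looks_malicious`, `SCHEMES`, `GREEDY`, and `LAZY` are hypothetical names, and the scheme list is a shortened assumption.

```py
import re

# Hypothetical, simplified patterns; the real cleaner's patterns may differ.
GREEDY = re.compile(r'data:image/(.+);base64,', re.I).findall
LAZY = re.compile(r'data:image/(.+?);base64,', re.I).findall
SCHEMES = re.compile(r'(javascript|vbscript|data):', re.I).findall


def looks_malicious(style, find_image_dataurls):
    """Flag the style when scheme-like tokens outnumber image data URLs."""
    safe_image_urls = len(find_image_dataurls(style))
    return len(SCHEMES(style)) > safe_image_urls


STYLE = ('background: url(data:image/jpeg;base64,foo); '
         'background-image: url(data:image/jpeg;base64,foo);')

# Greedy: one image match against two "data:" schemes, so the style is flagged.
print(looks_malicious(STYLE, GREEDY))  # True
# Lazy: two image matches against two "data:" schemes, so the style is kept.
print(looks_malicious(STYLE, LAZY))    # False
```

With the greedy pattern the two data URLs collapse into a single match, so the count of safe image URLs (1) falls below the number of `data:` schemes (2).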
Co-authored-by: Christophe Simonis <chs@odoo.com>
1 parent 2dfd5ac commit 97402b5
2 files changed
Lines changed: 26 additions & 1 deletion
First changed file: line 57 replaced (1 addition, 1 deletion).
Second changed file: 25 lines added (new lines 258-282).