author | Timothy Gu <timothygu99@gmail.com> | 2017-04-19 11:34:35 -0700 |
---|---|---|
committer | Timothy Gu <timothygu99@gmail.com> | 2017-04-24 16:36:03 -0700 |
commit | b2870a4f8c9e68c01ad998cf72ed5964327ccef5 (patch) | |
tree | fae01888fb3e8e84a3ed92eb653ce6424b96c908 /doc/api/url.md | |
parent | 75bfdad0371444ec4aa69a6f60062d0d6f0fe9ad (diff) | |
url: update WHATWG URL API to latest spec
- Update to spec
  - Add opaque hosts
  - File state did not correctly deal with lack of base URL
  - Cleanup API for file and non-special URLs
  - Allow % and IPv6 addresses in non-special URL hosts
  - Use specific names for percent-encode sets
  - Add empty host concept for file and non-special URLs
  - Clarify IPv6 serializer
- Fix existing mistakes
  - Add missing ':' to forbidden host code point list.
  - Correct IPv4 parser empty label behavior
- Maintain type equivalence in URLContext with spec
  - scheme, username, and password should always be strings
  - host, port, query, and fragment may be strings or null
- Align scheme state more closely with the spec
  - Make sure the `special` variable is always synced with
    URL_FLAG_SPECIAL.

PR-URL: https://github.com/nodejs/node/pull/12523
Fixes: https://github.com/nodejs/node/issues/10608
Fixes: https://github.com/nodejs/node/issues/10634
Refs: https://github.com/whatwg/url/pull/185
Refs: https://github.com/whatwg/url/pull/225
Refs: https://github.com/whatwg/url/pull/224
Refs: https://github.com/whatwg/url/pull/218
Refs: https://github.com/whatwg/url/pull/243
Refs: https://github.com/whatwg/url/pull/260
Refs: https://github.com/whatwg/url/pull/268
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Diffstat (limited to 'doc/api/url.md')
-rw-r--r-- | doc/api/url.md | 28 |
1 files changed, 15 insertions, 13 deletions
diff --git a/doc/api/url.md b/doc/api/url.md
index 7a0d56c01f..934e65fabc 100644
--- a/doc/api/url.md
+++ b/doc/api/url.md
@@ -1053,23 +1053,25 @@ located within the structure of the URL. The WHATWG URL Standard uses a more
 selective and fine grained approach to selecting encoded characters than that
 used by the older [`url.parse()`][] and [`url.format()`][] methods.
 
-The WHATWG algorithm defines three "encoding sets" that describe ranges of
-characters that must be percent-encoded:
+The WHATWG algorithm defines three "percent-encode sets" that describe ranges
+of characters that must be percent-encoded:
 
-* The *simple encode set* includes code points in range U+0000 to U+001F
-  (inclusive) and all code points greater than U+007E.
+* The *C0 control percent-encode set* includes code points in range U+0000 to
+  U+001F (inclusive) and all code points greater than U+007E.
 
-* The *default encode set* includes the *simple encode set* and code points
-  U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
+* The *path percent-encode set* includes the *C0 control percent-encode set*
+  and code points U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060,
+  U+007B, and U+007D.
 
-* The *userinfo encode set* includes the *default encode set* and code points
-  U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and
-  U+007C.
+* The *userinfo encode set* includes the *path percent-encode set* and code
+  points U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D,
+  U+005E, and U+007C.
 
-The *simple encode set* is used primary for URL fragments and certain specific
-conditions for the path. The *userinfo encode set* is used specifically for
-username and passwords encoded within the URL. The *default encode set* is used
-for all other cases.
+The *userinfo percent-encode set* is used exclusively for username and
+passwords encoded within the URL. The *path percent-encode set* is used for the
+path of most URLs. The *C0 control percent-encode set* is used for all
+other cases, including URL fragments in particular, but also host and path
+under certain specific conditions.
 
 When non-ASCII characters appear within a hostname, the hostname is encoded
 using the [Punycode][] algorithm. Note, however, that a hostname *may* contain
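
The three percent-encode sets described in the added documentation lines nest inside one another, which a short JavaScript sketch can make concrete. The helper names below (`inC0ControlSet`, `inPathSet`, `inUserinfoSet`, `percentEncode`) are hypothetical and are not part of Node.js or of this commit. The code point lists are copied from the documentation text in the diff above. This is only an illustration of the wording in this patch, not the implementation in `src/node_url.cc`, and later revisions of the WHATWG URL Standard may define the sets differently.

```javascript
'use strict';

// Hypothetical helpers illustrating the sets as worded in the diff above.

// C0 control percent-encode set: U+0000 to U+001F (inclusive) and all code
// points greater than U+007E.
function inC0ControlSet(cp) {
  return cp <= 0x1f || cp > 0x7e;
}

// Path percent-encode set: the C0 control set plus the code points listed in
// the documentation.
const PATH_EXTRA = new Set(
  [0x20, 0x22, 0x23, 0x3c, 0x3e, 0x3f, 0x60, 0x7b, 0x7d]);
function inPathSet(cp) {
  return inC0ControlSet(cp) || PATH_EXTRA.has(cp);
}

// Userinfo percent-encode set: the path set plus the code points listed in
// the documentation.
const USERINFO_EXTRA = new Set(
  [0x2f, 0x3a, 0x3b, 0x3d, 0x40, 0x5b, 0x5c, 0x5d, 0x5e, 0x7c]);
function inUserinfoSet(cp) {
  return inPathSet(cp) || USERINFO_EXTRA.has(cp);
}

// Percent-encode every UTF-8 byte of each code point that belongs to the
// given set; all other code points are copied through unchanged.
function percentEncode(input, inSet) {
  let out = '';
  for (const ch of input) {           // iterates by code point, not code unit
    if (inSet(ch.codePointAt(0))) {
      for (const byte of Buffer.from(ch, 'utf8')) {
        out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
      }
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(percentEncode('a b', inC0ControlSet));   // a b
console.log(percentEncode('a b', inPathSet));        // a%20b
console.log(percentEncode('u:p@x', inPathSet));      // u:p@x
console.log(percentEncode('u:p@x', inUserinfoSet));  // u%3Ap%40x
```

Because each set is defined as a superset of the previous one, the same input is encoded more aggressively as it moves from fragment-style contexts to paths to userinfo.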