path: root/doc/api/url.md
author: Timothy Gu <timothygu99@gmail.com>, 2017-04-19 11:34:35 -0700
committer: Timothy Gu <timothygu99@gmail.com>, 2017-04-24 16:36:03 -0700
commit: b2870a4f8c9e68c01ad998cf72ed5964327ccef5
tree: fae01888fb3e8e84a3ed92eb653ce6424b96c908 (doc/api/url.md)
parent: 75bfdad0371444ec4aa69a6f60062d0d6f0fe9ad
url: update WHATWG URL API to latest spec
- Update to spec
- Add opaque hosts
- File state did not correctly deal with lack of base URL
- Cleanup API for file and non-special URLs
- Allow % and IPv6 addresses in non-special URL hosts
- Use specific names for percent-encode sets
- Add empty host concept for file and non-special URLs
- Clarify IPv6 serializer
- Fix existing mistakes
- Add missing ':' to forbidden host code point list.
- Correct IPv4 parser empty label behavior
- Maintain type equivalence in URLContext with spec
- scheme, username, and password should always be strings
- host, port, query, and fragment may be strings or null
- Align scheme state more closely with the spec
- Make sure the `special` variable is always synced with URL_FLAG_SPECIAL.

PR-URL: https://github.com/nodejs/node/pull/12523
Fixes: https://github.com/nodejs/node/issues/10608
Fixes: https://github.com/nodejs/node/issues/10634
Refs: https://github.com/whatwg/url/pull/185
Refs: https://github.com/whatwg/url/pull/225
Refs: https://github.com/whatwg/url/pull/224
Refs: https://github.com/whatwg/url/pull/218
Refs: https://github.com/whatwg/url/pull/243
Refs: https://github.com/whatwg/url/pull/260
Refs: https://github.com/whatwg/url/pull/268
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
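
The documentation hunk in this commit renames the percent-encode sets and states which URL component each one applies to. As a rough illustration only (not code from this commit), the sketch below uses the global WHATWG `URL` class to show that the same characters are serialized differently depending on the component. The helper name `describeEncoding` and the sample URLs are assumptions made for this example.

```javascript
// Hypothetical helper, not part of Node.js or of this commit: it parses a few
// sample URLs and reports how individual components are serialized.
function describeEncoding() {
  // '{' and '}' are in the path percent-encode set, so they are escaped in
  // the path but left untouched in the fragment.
  const braces = new URL('https://example.org/{x}#{x}');

  // '@' and ':' are not in the path percent-encode set, so they survive in
  // the path unchanged.
  const pathOnly = new URL('https://example.org/a@b:c');

  // '/' and '@' are in the userinfo percent-encode set, so the username
  // setter escapes them.
  const withUser = new URL('https://example.org/');
  withUser.username = 'a/b@c';

  // Code points above U+007E are percent-encoded as UTF-8 bytes.
  const nonAscii = new URL('https://example.org/é');

  return {
    bracesPath: braces.pathname,      // '/%7Bx%7D'
    bracesHash: braces.hash,          // '#{x}'
    plainPath: pathOnly.pathname,     // '/a@b:c'
    username: withUser.username,      // 'a%2Fb%40c'
    nonAsciiPath: nonAscii.pathname,  // '/%C3%A9'
  };
}

console.log(describeEncoding());
```

Comparing `bracesPath` with `bracesHash`, and `plainPath` with `username`, shows the component-specific behavior that the renamed sets describe.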
Diffstat (limited to 'doc/api/url.md')
-rw-r--r-- doc/api/url.md 28
1 files changed, 15 insertions, 13 deletions
diff --git a/doc/api/url.md b/doc/api/url.md
index 7a0d56c01f..934e65fabc 100644
--- a/doc/api/url.md
+++ b/doc/api/url.md
@@ -1053,23 +1053,25 @@ located within the structure of the URL. The WHATWG URL Standard uses a more
selective and fine grained approach to selecting encoded characters than that
used by the older [`url.parse()`][] and [`url.format()`][] methods.
-The WHATWG algorithm defines three "encoding sets" that describe ranges of
-characters that must be percent-encoded:
+The WHATWG algorithm defines three "percent-encode sets" that describe ranges
+of characters that must be percent-encoded:
-* The *simple encode set* includes code points in range U+0000 to U+001F
- (inclusive) and all code points greater than U+007E.
+* The *C0 control percent-encode set* includes code points in range U+0000 to
+ U+001F (inclusive) and all code points greater than U+007E.
-* The *default encode set* includes the *simple encode set* and code points
- U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
+* The *path percent-encode set* includes the *C0 control percent-encode set*
+ and code points U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060,
+ U+007B, and U+007D.
-* The *userinfo encode set* includes the *default encode set* and code points
- U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and
- U+007C.
+* The *userinfo encode set* includes the *path percent-encode set* and code
+ points U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D,
+ U+005E, and U+007C.
-The *simple encode set* is used primary for URL fragments and certain specific
-conditions for the path. The *userinfo encode set* is used specifically for
-username and passwords encoded within the URL. The *default encode set* is used
-for all other cases.
+The *userinfo percent-encode set* is used exclusively for username and
+passwords encoded within the URL. The *path percent-encode set* is used for the
+path of most URLs. The *C0 control percent-encode set* is used for all
+other cases, including URL fragments in particular, but also host and path
+under certain specific conditions.
When non-ASCII characters appear within a hostname, the hostname is encoded
using the [Punycode][] algorithm. Note, however, that a hostname *may* contain