What is URL encoding?
URL encoding, also known as percent encoding, is a method to encode arbitrary data in a Uniform Resource Identifier (URI) using only the limited set of US-ASCII characters. It is also used to prepare data of the application/x-www-form-urlencoded media type, which is often used when submitting HTML form data in HTTP requests.
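As a small illustration of the form media type mentioned above, the sketch below uses the standard URLSearchParams class, available in browsers and Node.js, to serialize two form fields. The field names and values are made up for the example. Note that this media type writes a space as "+", whereas plain percent encoding writes it as "%20".

```javascript
// Hypothetical form fields, serialized as application/x-www-form-urlencoded.
const form = new URLSearchParams({ name: "Ada Lovelace", topic: "a&b=c" });

// Spaces become "+", while "&" and "=" inside a value are percent-encoded.
const formBody = form.toString();
console.log(formBody); // "name=Ada+Lovelace&topic=a%26b%3Dc"

// For comparison, percent encoding of the same value writes the space as %20.
const percentEncodedName = encodeURIComponent("Ada Lovelace");
console.log(percentEncodedName); // "Ada%20Lovelace"
```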
Details of URL encoding
A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value. For example, "%20" is the percent-encoding for the binary octet "00100000" (ABNF: %x20), which in US-ASCII corresponds to the space character (SP).
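The triplet format described above can be sketched in a few lines of JavaScript. The function names percentEncodeOctet and percentDecodeTriplet are hypothetical helpers written for this illustration, not part of any library. The sketch emits uppercase hexadecimal digits, which is the form RFC 3986 recommends.

```javascript
// Percent-encode a single octet (an integer from 0 to 255) as "%HH".
function percentEncodeOctet(octet) {
  if (!Number.isInteger(octet) || octet < 0 || octet > 255) {
    throw new RangeError("octet must be an integer from 0 to 255");
  }
  // Two uppercase hexadecimal digits, zero-padded on the left.
  return "%" + octet.toString(16).toUpperCase().padStart(2, "0");
}

// Decode a "%HH" triplet back to the numeric value of its octet.
function percentDecodeTriplet(triplet) {
  if (!/^%[0-9A-Fa-f]{2}$/.test(triplet)) {
    throw new SyntaxError("expected '%' followed by two hexadecimal digits");
  }
  return parseInt(triplet.slice(1), 16);
}

console.log(percentEncodeOctet(0x20)); // "%20"
console.log(String.fromCharCode(percentDecodeTriplet("%20")) === " "); // true
```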
According to RFC 3986, the characters allowed in a URI are either reserved or unreserved (or a percent character as part of a percent-encoding).
Reserved characters are those characters that sometimes have special meaning. For example, forward slash characters are used to separate different parts of a URL (or more generally, a URI).
The reserved characters are: ! * ' ( ) ; : @ & = + $ , / ? # [ ]
Unreserved characters can be percent-encoded, but should not be. They are the letters A to Z and a to z, the digits 0 to 9, and the characters - _ . ~
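A minimal sketch of an encoder built on this rule: every unreserved character is left as it is, and every other octet is percent-encoded. The function name percentEncode is hypothetical, and the sketch assumes that the text is first converted to UTF-8 bytes, which is a common choice but an assumption here. It also encodes all reserved characters, which is appropriate for a single value but not for a complete URL whose delimiters must stay intact.

```javascript
// Unreserved characters per RFC 3986: letters, digits, and - _ . ~
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

// Percent-encode every octet of `text` that is not an unreserved character.
// Assumption: non-ASCII text is converted to UTF-8 bytes before encoding.
function percentEncode(text) {
  const bytes = new TextEncoder().encode(text);
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // unreserved: copy through unchanged
    } else {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b~c/d"));   // "a%20b~c%2Fd"
console.log(percentEncode("caf\u00e9")); // "caf%C3%A9" (U+00E9 is C3 A9 in UTF-8)
```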
The JavaScript function encodeURIComponent performs URL encoding and is widely used in web development. It does not encode the characters ! * ' ( ), however, even though RFC 3986 lists them as reserved, so its output does not strictly follow that specification. The tools on this page aim to follow RFC 3986 more closely than encodeURIComponent does.
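One common way to close this gap is to post-process the output of encodeURIComponent and encode the five remaining characters by hand. The sketch below shows that approach. The name strictEncodeURIComponent is hypothetical, and this is not necessarily how the tools on this page are implemented.

```javascript
// Wrap encodeURIComponent so that ! ' ( ) * are percent-encoded as well.
function strictEncodeURIComponent(text) {
  return encodeURIComponent(text).replace(
    /[!'()*]/g,
    // Each of these characters has a code of at least 0x21, so the
    // hexadecimal form always has two digits and needs no padding.
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (ok)!*"));       // "it's%20(ok)!*"
console.log(strictEncodeURIComponent("it's (ok)!*")); // "it%27s%20%28ok%29%21%2A"
```

Decoding is unaffected: decodeURIComponent turns both forms back into the original string.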