---
c: Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
Title: curl_url_set
Section: 3
Source: libcurl
See-also:
  - CURLOPT_CURLU (3)
  - curl_url (3)
  - curl_url_cleanup (3)
  - curl_url_dup (3)
  - curl_url_get (3)
  - curl_url_strerror (3)
Protocol:
  - All
Added-in: 7.62.0
---

# NAME

curl_url_set - set a URL part

# SYNOPSIS

~~~c
#include <curl/curl.h>

CURLUcode curl_url_set(CURLU *url,
                       CURLUPart part,
                       const char *content,
                       unsigned int flags);
~~~

# DESCRIPTION

The *url* handle to work on, passed in as the first argument, must be a
handle previously created by curl_url(3) or curl_url_dup(3).

This function sets or updates individual URL components, or parts, held by the
URL object the handle identifies.

The *part* argument should identify the particular URL part (see list below)
to set or change, with *content* pointing to a null-terminated string with the
new contents for that URL part. The contents should be in the form and
encoding they would use in a URL: URL encoded.

When setting a part in the URL object that was previously already set, it
replaces the data that was previously stored for that part with the new
*content*.

The caller does not have to keep *content* around after a successful call
as this function copies the content.

Setting a part to a NULL pointer removes that part's contents from the *CURLU*
handle.

This function has an 8 MB maximum length limit for all provided input strings.
In the real world, excessively long fields in URLs cause problems even if this
function accepts them.

When setting or updating contents of individual URL parts, curl_url_set(3)
might accept data that would not be otherwise possible to set in the string
when it gets populated as a result of a full URL parse. Beware.
Extracting a full URL from such components later on might produce an invalid
URL.

The *flags* argument is a bitmask with independent features.

# PARTS

## CURLUPART_URL

Allows the full URL of the handle to be replaced. If the handle already is
populated with a URL, the new URL can be relative to the previous.

When successfully setting a new URL, relative or absolute, the handle contents
are replaced with the components of the newly set URL.

Pass a pointer to a null-terminated string in the *content* parameter. The
string must be a correctly formatted "RFC 3986+" URL or the pointer must be
NULL. The URL parser only understands and parses the subset of URLs that are
"hierarchical" and therefore contain a `://` separator - not the ones that are
normally specified with only a colon separator.

By default this API only parses URLs using schemes for protocols that are
supported built-in. To make libcurl parse URLs generically even for schemes it
does not know about, the **CURLU_NON_SUPPORT_SCHEME** flags bit must be set.
Otherwise, this function returns *CURLUE_UNSUPPORTED_SCHEME* for URL schemes
it does not recognize.

Unless *CURLU_NO_AUTHORITY* is set, a blank hostname is not allowed in
the URL.

When a full URL is set (parsed), the hostname component is stored URL decoded.

It is considered fine to set a blank URL ("") as a redirect, but not as a
normal URL. Therefore, setting a "" URL works fine if the handle already holds
a URL, otherwise it triggers an error.

## CURLUPART_SCHEME

Scheme cannot be URL decoded on set. libcurl only accepts setting schemes up
to 40 bytes long.

## CURLUPART_USER

If only the user part is set and not the password, the URL is represented with
a blank password.

## CURLUPART_PASSWORD

If only the password part is set and not the user, the URL is represented with
a blank user.

## CURLUPART_OPTIONS

The options field is an optional field that might follow the password in the
userinfo part. It is only recognized/used when parsing URLs for the following
schemes: pop3, smtp and imap. This function however allows users to
independently set this field.

## CURLUPART_HOST

The hostname. If it is an International Domain Name (IDN), the string must be
encoded as your locale says or as UTF-8 (when WinIDN is used). If it is a
bracketed IPv6 numeric address it may contain a zone id (or you can use
*CURLUPART_ZONEID*).

Note that if you set an IPv6 address, it gets ruined and causes an error if
you also set the CURLU_URLENCODE flag.

Unless *CURLU_NO_AUTHORITY* is set, setting a blank hostname is not allowed.

## CURLUPART_ZONEID

If the hostname is a numeric IPv6 address, this field can also be set.

## CURLUPART_PORT

The port number cannot be URL encoded on set. The given port number is
provided as a string and the decimal number in it must be between 0 and
65535. Anything else returns an error.

## CURLUPART_PATH

If a path is set in the URL without a leading slash, a slash is prepended
automatically.

## CURLUPART_QUERY

The query part gets spaces converted to pluses when asked to URL encode on set
with the *CURLU_URLENCODE* bit.

If used together with the *CURLU_APPENDQUERY* bit, the provided part is
appended on the end of the existing query.

The question mark in the URL is not part of the actual query contents.

## CURLUPART_FRAGMENT

The hash sign in the URL is not part of the actual fragment contents.

# FLAGS

The flags argument is zero, one or more bits set in a bitmask.

## CURLU_APPENDQUERY

Can be used when setting the *CURLUPART_QUERY* component.
The provided new part is then appended at the end of the existing query - and
if the previous part did not end with an ampersand (&), an ampersand gets
inserted before the new appended part.

When *CURLU_APPENDQUERY* is used together with *CURLU_URLENCODE*, the
first '=' symbol is not URL encoded.

## CURLU_NON_SUPPORT_SCHEME

If set, allows curl_url_set(3) to set a non-supported scheme. It then of
course cannot know if the provided scheme is a valid one or not.

## CURLU_URLENCODE

When set, curl_url_set(3) URL encodes the part on entry, except for
**scheme**, **port** and **URL**.

When setting the path component with URL encoding enabled, the slash character
is skipped.

The query part gets space-to-plus converted before the URL conversion is
applied.

This URL encoding is charset unaware and converts the input in a byte-by-byte
manner.

## CURLU_DEFAULT_SCHEME

If set, allows the URL to be set without a scheme and then sets that to the
default scheme: HTTPS. Overrides the *CURLU_GUESS_SCHEME* option if both are
set.

## CURLU_GUESS_SCHEME

If set, allows the URL to be set without a scheme and it instead "guesses"
which scheme was intended based on the hostname. If the outermost
subdomain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then that scheme is
used, otherwise it picks HTTP. Conflicts with the *CURLU_DEFAULT_SCHEME*
option which takes precedence if both are set.

If guessing is not allowed and there is no default scheme set, trying to parse
a URL without a scheme returns an error.

If the scheme ends up set as a result of guessing, i.e. it is not actually
present in the parsed URL, it can later be figured out by using the
**CURLU_NO_GUESS_SCHEME** flag when subsequently getting the URL or the scheme
with curl_url_get(3).

## CURLU_NO_AUTHORITY

If set, skips authority checks. The RFC allows individual schemes to omit the
host part (normally the only mandatory part of the authority), but libcurl
cannot know whether this is permitted for custom schemes. Specifying the flag
permits empty authority sections, similar to how the file scheme is handled.

## CURLU_PATH_AS_IS

When set for **CURLUPART_URL**, this skips the normalization of the
path. That is the procedure where libcurl otherwise removes sequences of
dot-slash and dot-dot etc. The same option used for transfers is called
CURLOPT_PATH_AS_IS(3).

## CURLU_ALLOW_SPACE

If set, the URL parser allows space (ASCII 32) where possible. URL syntax
normally does not allow spaces anywhere; they should instead be encoded as %20
or '+'. When spaces are allowed, they are still not allowed in the scheme.
When space is used and allowed in a URL, it is stored as-is unless
*CURLU_URLENCODE* is also set, which then makes libcurl URL encode the
space before it is stored. This affects how the URL is constructed when
curl_url_get(3) is subsequently used to extract the full URL or
individual parts. (Added in 7.78.0)

## CURLU_DISALLOW_USER

If set, the URL parser does not accept embedded credentials for the
**CURLUPART_URL**, and instead returns **CURLUE_USER_NOT_ALLOWED** for
such URLs.

# %PROTOCOLS%

# EXAMPLE

~~~c
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    /* change it to an FTP URL */
    rc = curl_url_set(url, CURLUPART_SCHEME, "ftp", 0);
  }
  curl_url_cleanup(url);
  return rc ? 1 : 0;
}
~~~

# %AVAILABILITY%

# RETURN VALUE

Returns a *CURLUcode* error value, which is CURLUE_OK (0) if everything
went fine. See the libcurl-errors(3) man page for the full list with
descriptions.

The input string passed to curl_url_set(3) must be shorter than eight
million bytes. Otherwise this function returns **CURLUE_MALFORMED_INPUT**.

If this function returns an error, no URL part is set.