---
c: Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
Title: libcurl-tutorial
Section: 3
Source: libcurl
See-also:
  - libcurl-easy (3)
  - libcurl-errors (3)
  - libcurl-multi (3)
  - libcurl-url (3)
Protocol:
  - All
Added-in: n/a
---

# NAME

libcurl-tutorial - libcurl programming tutorial

# Objective

This document attempts to describe the general principles and some basic
approaches to consider when programming with libcurl. The text focuses on the
C interface but should apply fairly well to other language bindings as well,
as they usually follow the C API pretty closely.

This document refers to 'the user' as the person writing the source code that
uses libcurl. That would probably be you or someone in your position. What is
generally referred to as 'the program' is the collected source code that you
write that is using libcurl for transfers. The program is outside libcurl and
libcurl is outside of the program.

To get more details on all options and functions described herein, please
refer to their respective man pages.

# Building

There are many different ways to build C programs. This chapter assumes a Unix
style build process. If you use a different build system, you can still read
this to get general information that may apply to your environment as well.

## Compiling the Program

Your compiler needs to know where the libcurl headers are located. Therefore
you must set your compiler's include path to point to the directory where you
installed them. The 'curl-config'[3] tool can be used to get this information:
~~~c
$ curl-config --cflags
~~~

## Linking the Program with libcurl

Once you have compiled the program, you need to link your object files to
create a single executable.
For that to succeed, you need to link with libcurl and
possibly also with other libraries that libcurl itself depends on, such as the
OpenSSL libraries. Even some standard OS libraries may be needed on the
command line. To figure out which flags to use, once again the 'curl-config'
tool comes to the rescue:
~~~c
$ curl-config --libs
~~~

## SSL or Not

libcurl can be built and customized in many ways. One of the things that
varies between different libraries and builds is the support for SSL-based
transfers, like HTTPS and FTPS. If a supported SSL library was detected
properly at build-time, libcurl is built with SSL support. To figure out if an
installed libcurl has been built with SSL support enabled, use *curl-config*
like this:

~~~c
$ curl-config --feature
~~~

If SSL is supported, the keyword *SSL* is written to stdout, possibly together
with other features that could be either on or off for different libcurls.

See also the "Features libcurl Provides" further down.

## autoconf macro

When you write your configure script to detect libcurl and set up variables
accordingly, we offer a macro that probably does everything you need in this
area. See the docs/libcurl/libcurl.m4 file - it includes docs on how to use it.

# Portable Code in a Portable World

The people behind libcurl have put considerable effort into making libcurl
work on a large number of different operating systems and environments.

You program libcurl the same way on all platforms that libcurl runs on. There
are only a few minor details that differ. If you just make sure to write your
code portable enough, you can create a portable program. libcurl should not
stop you from that.

# Global Preparation

The program must initialize some of the libcurl functionality globally.
That means it should be done exactly once, no matter how many times you intend
to use the library. Once for your program's entire lifetime. This is done using
~~~c
curl_global_init()
~~~
and it takes one parameter which is a bit pattern that tells libcurl what to
initialize. Using *CURL_GLOBAL_ALL* makes it initialize all known internal
sub modules, and might be a good default option. The current two bits that are
specified are:

## CURL_GLOBAL_WIN32

which only does anything on Windows machines. When used on a Windows machine,
it makes libcurl initialize the Win32 socket stuff. Without having that
initialized properly, your program cannot use sockets properly. You should
only do this once for each application, so if your program already does this
or if another library in use does it, you should not tell libcurl to do this
as well.

## CURL_GLOBAL_SSL

which only does anything on libcurls compiled and built SSL-enabled. On these
systems, this makes libcurl initialize the SSL library properly for this
application. This only needs to be done once for each application so if your
program or another library already does this, this bit should not be needed.

libcurl has a default protection mechanism that detects if
curl_global_init(3) has not been called by the time
curl_easy_perform(3) is called and if that is the case, libcurl runs the
function itself with a guessed bit pattern. Please note that depending solely
on this is not considered nice nor good.

When the program no longer uses libcurl, it should call
curl_global_cleanup(3), which is the opposite of the init call. It
performs the reversed operations to clean up the resources the
curl_global_init(3) call initialized.

Repeated calls to curl_global_init(3) and curl_global_cleanup(3)
should be avoided. They should only be called once each.

# Features libcurl Provides

It is considered best-practice to determine libcurl features at runtime rather
than at build-time (if possible of course). By calling
curl_version_info(3) and checking out the details of the returned
struct, your program can figure out exactly what the currently running libcurl
supports.

# Two Interfaces

libcurl first introduced the so-called easy interface. All operations in the
easy interface are prefixed with 'curl_easy'. The easy interface lets you do
single transfers with a synchronous and blocking function call.

libcurl also offers another interface that allows multiple simultaneous
transfers in a single thread, the so-called multi interface. More about that
interface is detailed in a separate chapter further down. You still need to
understand the easy interface first, so please continue reading for better
understanding.

# Handle the Easy libcurl

To use the easy interface, you must first create yourself an easy handle. You
need one handle for each easy session you want to perform. Basically, you
should use one handle for every thread you plan to use for transferring. You
must never share the same handle in multiple threads.

Get an easy handle with
~~~c
handle = curl_easy_init();
~~~
It returns an easy handle. Using that you proceed to the next step: setting
up your preferred actions. A handle is just a logical entity for the upcoming
transfer or series of transfers.

You set properties and options for this handle using
curl_easy_setopt(3). They control how the subsequent transfer or
transfers using this handle are made. Options remain set in the handle until
set again to something different. They are sticky. Multiple requests using the
same handle use the same options.

If you at any point would like to blank all previously set options for a
single easy handle, you can call curl_easy_reset(3) and you can also
make a clone of an easy handle (with all its set options) using
curl_easy_duphandle(3).

Many of the options you set in libcurl are "strings", pointers to data
terminated with a zero byte. When you set strings with
curl_easy_setopt(3), libcurl makes its own copy so that they do not need
to be kept around in your application after being set[4].

One of the most basic properties to set in the handle is the URL. You set your
preferred URL to transfer with CURLOPT_URL(3) in a manner similar to:

~~~c
curl_easy_setopt(handle, CURLOPT_URL, "http://example.com/");
~~~

Let's assume for a while that you want to receive data as the URL identifies a
remote resource you want to get here. Since you write a sort of application
that needs this transfer, I assume that you would like to get the data passed
to you directly instead of simply getting it passed to stdout. So, you write
your own function that matches this prototype:
~~~c
size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp);
~~~
You tell libcurl to pass all data to this function by issuing a call
similar to this:
~~~c
curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, write_data);
~~~
You can control what data your callback function gets in the fourth argument
by setting another property:
~~~c
curl_easy_setopt(handle, CURLOPT_WRITEDATA, &internal_struct);
~~~
Using that property, you can easily pass local data between your application
and the function that gets invoked by libcurl. libcurl itself does not touch
the data you pass with CURLOPT_WRITEDATA(3).

libcurl offers its own default internal callback that takes care of the data
if you do not set the callback with CURLOPT_WRITEFUNCTION(3).
It simply outputs the received data to stdout. You can have the default
callback write the data to a different file handle by passing a 'FILE *' to a
file opened for writing with the CURLOPT_WRITEDATA(3) option.

Now, we need to take a step back and take a deep breath. Here is one of those
rare platform-dependent nitpicks. Did you spot it? On some platforms[2],
libcurl is not able to operate on file handles opened by the
program. Therefore, if you use the default callback and pass in an open file
handle with CURLOPT_WRITEDATA(3), libcurl crashes. You should avoid this
to make your program run fine virtually everywhere.

(CURLOPT_WRITEDATA(3) was formerly known as *CURLOPT_FILE*. Both names still
work and do the same thing).

If you are using libcurl as a Windows DLL, you MUST use the
CURLOPT_WRITEFUNCTION(3) if you set CURLOPT_WRITEDATA(3) - or experience
crashes.

There are of course many more options you can set, and we get back to a few of
them later. Let's instead continue to the actual transfer:

~~~c
success = curl_easy_perform(handle);
~~~

curl_easy_perform(3) connects to the remote site, does the necessary commands
and performs the transfer. Whenever it receives data, it calls the callback
function we previously set. The function may get one byte at a time, or it may
get many kilobytes at once. libcurl delivers as much as possible as often as
possible. Your callback function should return the number of bytes it "took
care of". If that is not the same amount of bytes that was passed to it,
libcurl aborts the operation and returns with an error code.

When the transfer is complete, the function returns a return code that informs
you if it succeeded in its mission or not.
If a return code is not enough for
you, you can use the CURLOPT_ERRORBUFFER(3) to point libcurl to a buffer of
yours where it stores a human readable error message as well.

If you then want to transfer another file, the handle is ready to be used
again. It is even preferred and encouraged that you reuse an existing handle
if you intend to make another transfer. libcurl then attempts to reuse a
previous connection.

For some protocols, downloading a file can involve a complicated process of
logging in, setting the transfer mode, changing the current directory and
finally transferring the file data. libcurl takes care of all that
complication for you. Given simply the URL to a file, libcurl takes care of
all the details needed to get the file moved from one machine to another.

# Multi-threading Issues

libcurl is thread safe but there are a few exceptions. Refer to
libcurl-thread(3) for more information.

# When It does not Work

There are times when the transfer fails for some reason. You might have set
the wrong libcurl option or misunderstood what the libcurl option actually
does, or the remote server might return non-standard replies that confuse the
library which then confuses your program.

There is one golden rule when these things occur: set the
CURLOPT_VERBOSE(3) option to 1. It causes the library to spew out the
entire protocol details it sends, some internal info and some received
protocol data as well (especially when using FTP). If you are using HTTP,
adding the headers in the received output to study is also a clever way to get
a better understanding of why the server behaves the way it does. Include
headers in the normal body output with CURLOPT_HEADER(3) set to 1.

Of course, there are bugs left. We need to know about them to be able to fix
them, so we are quite dependent on your bug reports.
When you do report
suspected bugs in libcurl, please include as many details as you possibly can:
a protocol dump that CURLOPT_VERBOSE(3) produces, library version, as
much as possible of your code that uses libcurl, operating system name and
version, compiler name and version etc.

If CURLOPT_VERBOSE(3) is not enough, you can increase the level of debug
data your application receives by using the CURLOPT_DEBUGFUNCTION(3).

Getting some in-depth knowledge about the protocols involved is never wrong,
and if you are trying to do funny things, you might understand libcurl and how
to use it better if you study the appropriate RFC documents at least briefly.

# Upload Data to a Remote Site

libcurl tries to keep a protocol independent approach to most transfers, thus
uploading to a remote FTP site is similar to uploading data to an HTTP server
with a PUT request.

Of course, first you either create an easy handle or you reuse an existing
one. Then you set the URL to operate on just like before. This is the remote
URL that we now upload to.

Since we write an application, we most likely want libcurl to get the upload
data by asking us for it. To make it do that, we set the read callback and the
custom pointer libcurl passes to our read callback. The read callback should
have a prototype similar to:
~~~c
size_t function(char *bufptr, size_t size, size_t nitems, void *userp);
~~~
Where *bufptr* is the pointer to a buffer we fill in with data to upload
and *size* multiplied by *nitems* is the size of the buffer and therefore also
the maximum amount of data we can return to libcurl in this call. The *userp*
pointer is the custom pointer we set to point to a struct of ours to pass
private data between the application and the callback.
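
The read callback described above could look roughly like the following
sketch when the data to upload already sits in memory. The *upload_state*
struct and its fields are made up for this illustration and are not part of
libcurl, and the function name *read_function* is only chosen to match the
name this chapter passes to CURLOPT_READFUNCTION(3).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upload state: a memory buffer plus a read position. */
struct upload_state {
  const char *data; /* bytes to upload */
  size_t len;       /* total number of bytes in data */
  size_t pos;       /* number of bytes already handed to libcurl */
};

/* Matches the read callback prototype. Copies at most size * nitems bytes
   into bufptr and returns the number of bytes copied. Returning 0 signals
   the end of the upload. */
size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp)
{
  struct upload_state *st = (struct upload_state *)userp;
  size_t room = size * nitems;     /* capacity of libcurl's buffer */
  size_t left = st->len - st->pos; /* bytes not yet handed out */
  size_t n = left < room ? left : room;

  if(n) {
    memcpy(bufptr, st->data + st->pos, n);
    st->pos += n;
  }
  return n;
}
```

A pointer to a filled-in *upload_state* would then be the value set with
CURLOPT_READDATA(3), so that libcurl hands it back as *userp* on every call.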
~~~c
curl_easy_setopt(handle, CURLOPT_READFUNCTION, read_function);

curl_easy_setopt(handle, CURLOPT_READDATA, &filedata);
~~~
Tell libcurl that we want to upload:
~~~c
curl_easy_setopt(handle, CURLOPT_UPLOAD, 1L);
~~~
A few protocols do not behave properly when uploads are done without any prior
knowledge of the expected file size. So, set the upload file size using the
CURLOPT_INFILESIZE_LARGE(3) for all known file sizes like this[1]:

~~~c
/* in this example, file_size must be a curl_off_t variable */
curl_easy_setopt(handle, CURLOPT_INFILESIZE_LARGE, file_size);
~~~

When you call curl_easy_perform(3) this time, it performs all the
necessary operations and when it has invoked the upload it calls your supplied
callback to get the data to upload. The program should return as much data as
possible in every invocation, as that is likely to make the upload perform as
fast as possible. The callback should return the number of bytes it wrote in
the buffer. Returning 0 signals the end of the upload.

# Passwords

Many protocols use or even require that username and password are provided
to be able to download or upload the data of your choice. libcurl offers
several ways to specify them.

Most protocols support that you specify the name and password in the URL
itself. libcurl detects this and uses them accordingly. This is written like
this:
~~~c
protocol://user:password@example.com/path/
~~~
If you need any odd letters in your username or password, you should enter
them URL encoded, as %XX where XX is a two-digit hexadecimal number.

libcurl also provides options to set various passwords. The username and
password as shown embedded in the URL can instead get set with the
CURLOPT_USERPWD(3) option. The argument passed to libcurl should be a
char * to a string in the format "user:password".
It is set in a manner like this:

~~~c
curl_easy_setopt(handle, CURLOPT_USERPWD, "myname:thesecret");
~~~

Another case where name and password might be needed at times, is for those
users who need to authenticate themselves to a proxy they use. libcurl offers
another option for this, the CURLOPT_PROXYUSERPWD(3). It is used quite
similarly to the CURLOPT_USERPWD(3) option like this:

~~~c
curl_easy_setopt(handle, CURLOPT_PROXYUSERPWD, "myname:thesecret");
~~~

There is a long time Unix "standard" way of storing FTP usernames and
passwords, namely in the $HOME/.netrc file (on Windows, libcurl also checks
the *%USERPROFILE%* environment variable if *%HOME%* is unset, and tries
"_netrc" as name). The file should be made private so that only the user may
read it (see also the "Security Considerations" chapter), as it might contain
the password in plain text. libcurl has the ability to use this file to figure
out what set of username and password to use for a particular host. As an
extension to the normal functionality, libcurl also supports this file for
non-FTP protocols such as HTTP. To make curl use this file, use the
CURLOPT_NETRC(3) option:

~~~c
curl_easy_setopt(handle, CURLOPT_NETRC, 1L);
~~~

A basic example of what such a .netrc file may look like:

~~~c
machine myhost.mydomain.com
login userlogin
password secretword
~~~

All these examples have been cases where the password has been optional, or
at least you could leave it out and have libcurl attempt to do its job
without it. There are times when the password is not optional, like when
you are using an SSL private key for secure transfers.
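
Earlier in this chapter, odd letters in a username or password embedded in a
URL are said to need URL encoding as %XX. The following is a minimal sketch of
what that encoding involves, assuming that every byte outside the RFC 3986
"unreserved" set gets encoded. The helper name *encode_userinfo* is made up
for this illustration and is not a libcurl function. libcurl's own
curl_easy_escape(3) performs URL encoding and is probably the better choice in
a real program.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libcurl: percent-encodes every byte of
   'in' that is not an RFC 3986 "unreserved" character and stores the
   zero-terminated result in 'out'. Returns 0 on success, or -1 if 'out'
   (of 'outlen' bytes) is too small. */
int encode_userinfo(const char *in, char *out, size_t outlen)
{
  static const char unreserved[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
  size_t o = 0;

  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;
    if(strchr(unreserved, c)) {
      if(o + 1 >= outlen)
        return -1; /* no room for the byte plus the terminating zero */
      out[o++] = (char)c;
    }
    else {
      if(o + 3 >= outlen)
        return -1; /* no room for %XX plus the terminating zero */
      snprintf(out + o, 4, "%%%02X", (unsigned int)c);
      o += 3;
    }
  }
  if(o >= outlen)
    return -1;
  out[o] = '\0';
  return 0;
}
```

The username and the password would be encoded separately and then joined
with the ':' and '@' separators when the URL is built.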

To pass the known private key password to libcurl:
~~~c
curl_easy_setopt(handle, CURLOPT_KEYPASSWD, "keypassword");
~~~

# HTTP Authentication

The previous chapter showed how to set username and password for getting URLs
that require authentication. When using the HTTP protocol, there are many
different ways a client can provide those credentials to the server and you
can control which way libcurl uses them. The default HTTP authentication
method is called 'Basic', which is sending the name and password in clear-text
in the HTTP request, base64-encoded. This is insecure.

At the time of this writing, libcurl can be built to use: Basic, Digest, NTLM,
Negotiate (SPNEGO). You can tell libcurl which one to use with
CURLOPT_HTTPAUTH(3) as in:

~~~c
curl_easy_setopt(handle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
~~~

When you send authentication to a proxy, you can also set authentication type
the same way but instead with CURLOPT_PROXYAUTH(3):

~~~c
curl_easy_setopt(handle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM);
~~~

Both these options allow you to set multiple types (by ORing them together),
to make libcurl pick the most secure one out of the types the server/proxy
claims to support. This method does however add a round-trip since libcurl
must first ask the server what it supports:

~~~c
curl_easy_setopt(handle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST|CURLAUTH_BASIC);
~~~

For convenience, you can use the *CURLAUTH_ANY* define (instead of a list with
specific types) which allows libcurl to use whatever method it wants.

When asking for multiple types, libcurl picks the available one it considers
"best" in its own internal order of preference.

# HTTP POSTing

We get many questions regarding how to issue HTTP POSTs with libcurl the
proper way.
This chapter thus includes examples using both
versions of HTTP POST that libcurl supports.

The first version is the simple POST, the most common version, that most HTML
pages using the \<form\> tag use. We provide a pointer to the data and tell
libcurl to post it all to the remote site:

~~~c
char *data="name=daniel&project=curl";
curl_easy_setopt(handle, CURLOPT_POSTFIELDS, data);
curl_easy_setopt(handle, CURLOPT_URL, "http://posthere.com/");

curl_easy_perform(handle); /* post away! */
~~~

Simple enough, huh? Since you set the POST options with the
CURLOPT_POSTFIELDS(3), this automatically switches the handle to use
POST in the upcoming request.

What if you want to post binary data that also requires you to set the
Content-Type: header of the post? Well, binary posts prevent libcurl from being
able to do strlen() on the data to figure out the size, so therefore we must
tell libcurl the size of the post data. Setting headers in libcurl requests is
done in a generic way, by building a list of our own headers and then passing
that list to libcurl.

~~~c
struct curl_slist *headers=NULL;
headers = curl_slist_append(headers, "Content-Type: text/xml");

/* post binary data */
curl_easy_setopt(handle, CURLOPT_POSTFIELDS, binaryptr);

/* set the size of the postfields data */
curl_easy_setopt(handle, CURLOPT_POSTFIELDSIZE, 23L);

/* pass our list of custom made headers */
curl_easy_setopt(handle, CURLOPT_HTTPHEADER, headers);

curl_easy_perform(handle); /* post away! */

curl_slist_free_all(headers); /* free the header list */
~~~

While the simple examples above cover the majority of all cases where HTTP
POST operations are required, they do not do multi-part formposts.
Multi-part
formposts were introduced as a better way to post (possibly large) binary data
and were first documented in the RFC 1867 (updated in RFC 2388). They are
called multi-part because they are built by a chain of parts, each part being
a single unit of data. Each part has its own name and contents. You can in
fact create and post a multi-part formpost with the regular libcurl POST
support described above, but that would require that you build a formpost
yourself and provide it to libcurl.

To make that easier, libcurl provides a MIME API consisting of several
functions: using those, you can create and fill a multi-part form. Function
curl_mime_init(3) creates a multi-part body; you can then append new parts
to a multi-part body using curl_mime_addpart(3).

There are three possible data sources for a part: memory using
curl_mime_data(3), file using curl_mime_filedata(3) and user-defined data
read callback using curl_mime_data_cb(3). curl_mime_name(3) sets a part's
(i.e. form field) name, while curl_mime_filename(3) fills in the remote
filename. With curl_mime_type(3), you can tell the MIME type of a part, while
curl_mime_headers(3) allows defining the part's headers. When a multi-part
body is no longer needed, you can destroy it using curl_mime_free(3).

The following example sets two simple text parts with plain textual contents,
and then a file with binary contents and uploads the whole thing.

~~~c
curl_mime *multipart = curl_mime_init(handle);
curl_mimepart *part = curl_mime_addpart(multipart);
curl_mime_name(part, "name");
curl_mime_data(part, "daniel", CURL_ZERO_TERMINATED);
part = curl_mime_addpart(multipart);
curl_mime_name(part, "project");
curl_mime_data(part, "curl", CURL_ZERO_TERMINATED);
part = curl_mime_addpart(multipart);
curl_mime_name(part, "logotype-image");
curl_mime_filedata(part, "curl.png");

/* Set the form info */
curl_easy_setopt(handle, CURLOPT_MIMEPOST, multipart);

curl_easy_perform(handle); /* post away! */

/* free the post data again */
curl_mime_free(multipart);
~~~

To post multiple files for a single form field, you must supply each file in
a separate part, all with the same field name. Although function
curl_mime_subparts(3) implements nested multi-parts, this way of
multiple files posting is deprecated by RFC 7578, chapter 4.3.

To set the data source from an already opened FILE pointer, use:

~~~c
curl_mime_data_cb(part, filesize, (curl_read_callback) fread,
                  (curl_seek_callback) fseek, NULL, filepointer);
~~~

A deprecated curl_formadd(3) function is still supported in libcurl.
It should however not be used anymore for new designs and programs using it
ought to be converted to the MIME API. It is however described here as an
aid to conversion.

Using *curl_formadd*, you add parts to the form. When you are done adding
parts, you post the whole form.

The MIME API example above is expressed as follows using this function:

~~~c
struct curl_httppost *post=NULL;
struct curl_httppost *last=NULL;
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "name",
             CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "project",
             CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "logotype-image",
             CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);

/* Set the form info */
curl_easy_setopt(handle, CURLOPT_HTTPPOST, post);

curl_easy_perform(handle); /* post away! */

/* free the post data again */
curl_formfree(post);
~~~

Multipart formposts are chains of parts using MIME-style separators and
headers. It means that each one of these separate parts gets a few headers set
that describe the individual content-type, size etc. To enable your
application to handicraft this formpost even more, libcurl allows you to
supply your own set of custom headers to such an individual form part. You can
of course supply headers to as many parts as you like, but this little example
shows how you set headers to one specific part when you add that to the post
handle:

~~~c
struct curl_slist *headers=NULL;
headers = curl_slist_append(headers, "Content-Type: text/xml");

curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "logotype-image",
             CURLFORM_FILECONTENT, "curl.xml",
             CURLFORM_CONTENTHEADER, headers,
             CURLFORM_END);

curl_easy_perform(handle); /* post away! */

curl_formfree(post); /* free post */
curl_slist_free_all(headers); /* free custom header list */
~~~

Since all options on an easy handle are "sticky", they remain the same until
changed even if you do call curl_easy_perform(3). You may therefore need to
tell curl to go back to a plain GET request if you intend to do one as your
next request. You force an easy handle to go back to GET by using the
CURLOPT_HTTPGET(3) option:
~~~c
curl_easy_setopt(handle, CURLOPT_HTTPGET, 1L);
~~~
Just setting CURLOPT_POSTFIELDS(3) to "" or NULL does *not* stop libcurl
from doing a POST. It just makes it POST without any data to send!

# Converting from deprecated form API to MIME API

Four rules have to be respected in building the multi-part:

- The easy handle must be created before building the multi-part.

- The multi-part is always created by a call to curl_mime_init(handle).

- Each part is created by a call to curl_mime_addpart(multipart).

- When complete, the multi-part must be bound to the easy handle using
CURLOPT_MIMEPOST(3) instead of CURLOPT_HTTPPOST(3).

Here are some examples of *curl_formadd* calls converted to MIME API
sequences:

~~~c
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "id",
             CURLFORM_COPYCONTENTS, "daniel",
             CURLFORM_CONTENTHEADER, headers,
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "id");
curl_mime_data(part, "daniel", CURL_ZERO_TERMINATED);
curl_mime_headers(part, headers, FALSE);
~~~

Setting the last curl_mime_headers(3) argument to TRUE would have caused
the headers to be automatically released upon destroying the multi-part, thus
saving a clean-up call to curl_slist_free_all(3).

~~~c
curl_formadd(&post, &last,
             CURLFORM_PTRNAME, "logotype-image",
             CURLFORM_FILECONTENT, "-",
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "logotype-image");
curl_mime_data_cb(part, (curl_off_t) -1, fread, fseek, NULL, stdin);
~~~

curl_mime_name(3) always copies the field name. The special filename "-" is
not supported by curl_mime_filename(3): to read an open file, use a callback
source using fread(). The transfer is chunk-encoded since the data size is
unknown.

~~~c
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "datafile[]",
             CURLFORM_FILE, "file1",
             CURLFORM_FILE, "file2",
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "datafile[]");
curl_mime_filedata(part, "file1");
part = curl_mime_addpart(multipart);
curl_mime_name(part, "datafile[]");
curl_mime_filedata(part, "file2");
~~~

The deprecated multipart/mixed implementation of multiple files field is
translated to two distinct parts with the same name.

~~~c
curl_easy_setopt(handle, CURLOPT_READFUNCTION, myreadfunc);
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "stream",
             CURLFORM_STREAM, arg,
             CURLFORM_CONTENTLEN, (curl_off_t) datasize,
             CURLFORM_FILENAME, "archive.zip",
             CURLFORM_CONTENTTYPE, "application/zip",
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "stream");
curl_mime_data_cb(part, (curl_off_t) datasize,
                  myreadfunc, NULL, NULL, arg);
curl_mime_filename(part, "archive.zip");
curl_mime_type(part, "application/zip");
~~~

The CURLOPT_READFUNCTION(3) callback is not used: it is replaced by directly
setting the part source data from the callback read function.

~~~c
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "memfile",
             CURLFORM_BUFFER, "memfile.bin",
             CURLFORM_BUFFERPTR, databuffer,
             CURLFORM_BUFFERLENGTH, (long) sizeof databuffer,
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "memfile");
curl_mime_data(part, databuffer, (curl_off_t) sizeof databuffer);
curl_mime_filename(part, "memfile.bin");
~~~

curl_mime_data(3) always copies the initial data: the data buffer is thus
free for immediate reuse.

~~~c
curl_formadd(&post, &last,
             CURLFORM_COPYNAME, "message",
             CURLFORM_FILECONTENT, "msg.txt",
             CURLFORM_END);
~~~
becomes:
~~~c
part = curl_mime_addpart(multipart);
curl_mime_name(part, "message");
curl_mime_filedata(part, "msg.txt");
curl_mime_filename(part, NULL);
~~~

Use of curl_mime_filedata(3) sets the remote filename as a side effect: it is
therefore necessary to clear it for *CURLFORM_FILECONTENT* emulation.

# Showing Progress

For historical and traditional reasons, libcurl has a built-in progress meter
that can be switched on to present a progress meter in your terminal.

Switch on the progress meter by, oddly enough, setting
CURLOPT_NOPROGRESS(3) to zero. This option is set to 1 by default.

For most applications however, the built-in progress meter is useless and what
instead is interesting is the ability to specify a progress callback. The
function pointer you pass to libcurl is then called at irregular intervals
with information about the current transfer.

Set the progress callback by using CURLOPT_PROGRESSFUNCTION(3).
Pass a pointer
to a function that matches this prototype:

~~~c
int progress_callback(void *clientp,
                      double dltotal,
                      double dlnow,
                      double ultotal,
                      double ulnow);
~~~

If any of the input arguments is unknown, a 0 is provided. The first argument,
'clientp', is the pointer you pass to libcurl with
CURLOPT_PROGRESSDATA(3). libcurl does not touch it.

# libcurl with C++

There is basically only one thing to keep in mind when using C++ instead of C
when interfacing libcurl:

The callbacks CANNOT be non-static class member functions

Example C++ code:

~~~c
class AClass {
  static size_t write_data(void *ptr, size_t size, size_t nmemb,
                           void *ourpointer)
  {
    /* do what you want with the data */
    return size * nmemb; /* tell libcurl all bytes were handled */
  }
};
~~~

# Proxies

What "proxy" means according to Merriam-Webster: "a person authorized to act
for another" but also "the agency, function, or office of a deputy who acts as
a substitute for another".

Proxies are exceedingly common these days. Companies often only offer Internet
access to employees through their proxies. Network clients or user-agents ask
the proxy for documents, the proxy does the actual request and then it returns
them.

libcurl supports SOCKS and HTTP proxies. When a given URL is wanted, libcurl
asks the proxy for it instead of trying to connect to the actual remote host
identified in the URL.

If you are using a SOCKS proxy, you may find that libcurl does not quite support
all operations through it.

For HTTP proxies: the fact that the proxy is an HTTP proxy puts certain
restrictions on what can actually happen. A requested URL that might not be an
HTTP URL is passed to the HTTP proxy to deliver back to libcurl. This happens
transparently, and an application may not need to know.
I say "may", because
at times it is important to understand that all operations over an HTTP proxy
use the HTTP protocol. For example, you cannot invoke your own custom FTP
commands or even proper FTP directory listings.

## Proxy Options

To tell libcurl to use a proxy at a given port number:
~~~c
curl_easy_setopt(handle, CURLOPT_PROXY, "proxy-host.com:8080");
~~~
Some proxies require user authentication before allowing a request, and you
pass that information similar to this:
~~~c
curl_easy_setopt(handle, CURLOPT_PROXYUSERPWD, "user:password");
~~~
If you want to, you can specify the hostname only in the
CURLOPT_PROXY(3) option, and set the port number separately with
CURLOPT_PROXYPORT(3).

Tell libcurl what kind of proxy it is with CURLOPT_PROXYTYPE(3) (if not,
it defaults to assuming an HTTP proxy):
~~~c
curl_easy_setopt(handle, CURLOPT_PROXYTYPE, (long)CURLPROXY_SOCKS4);
~~~

## Environment Variables

libcurl automatically checks and uses a set of environment variables to know
what proxies to use for certain protocols. The names of the variables follow
an old tradition and are built up as "[protocol]_proxy" (note the
lower casing). This means the variable 'http_proxy' is checked for the name of
a proxy to use when the input URL is HTTP. Following the same rule, the
variable named 'ftp_proxy' is checked for FTP URLs. Again, the proxies are
always HTTP proxies, the different names of the variables simply allow
different HTTP proxies to be used.

The proxy environment variable contents should be in the format
"[protocol://][user:password@]machine[:port]". The protocol:// part
specifies which type of proxy it is, and the optional port number specifies on
which port the proxy operates. If not specified, the internal default port
number is used and that is most likely not the one you would like it to be.

There are two special environment variables. 'all_proxy' sets the proxy
for any URL in case the protocol-specific variable was not set, and 'no_proxy'
defines a list of hosts that should not use a proxy even though a variable may
say so. If 'no_proxy' is a plain asterisk ("*") it matches all hosts.

To explicitly disable libcurl's checking for and using the proxy environment
variables, set the proxy name to "" - an empty string - with
CURLOPT_PROXY(3).

## SSL and Proxies

SSL is for secure point-to-point connections. This involves strong encryption
and similar things, which effectively makes it impossible for a proxy to
operate as a "man in between", which is the proxy's task, as previously
discussed. Instead, the only way to have SSL work over an HTTP proxy is to ask
the proxy to tunnel everything through without being able to check or fiddle
with the traffic.

Opening an SSL connection over an HTTP proxy is therefore a matter of asking the
proxy for a straight connection to the target host on a specified port. This
is done with the HTTP request CONNECT. ("please dear proxy, connect me to that
remote host").

Because of the nature of this operation, where the proxy has no idea what kind
of data is passed in and out through this tunnel, this breaks some of the
few advantages that come from using a proxy, such as caching. Many
organizations prevent this kind of tunneling to other destination port numbers
than 443 (which is the default HTTPS port number).

## Tunneling Through Proxy

As explained above, tunneling is required for SSL to work and is often even
restricted to the operation intended for SSL: HTTPS.

This is however not the only time proxy-tunneling might offer benefits to
you or your application.

As tunneling opens a direct connection from your application to the remote
machine, it suddenly also re-introduces the ability to do non-HTTP
operations over an HTTP proxy. You can in fact use things such as FTP
upload or FTP custom commands this way.

Again, this is often prevented by the administrators of proxies and is
rarely allowed.

Tell libcurl to use proxy tunneling like this:
~~~c
curl_easy_setopt(handle, CURLOPT_HTTPPROXYTUNNEL, 1L);
~~~
In fact, there might even be times when you want to do plain HTTP operations
using a tunnel like this, as it then enables you to operate on the remote
server instead of asking the proxy to do so. libcurl does not stand in the way
of such innovative actions either!

## Proxy Auto-Config

Netscape first came up with this. It is basically a webpage (usually using a
.pac extension) containing JavaScript that, when executed by the browser with
the requested URL as input, returns information to the browser on how to
connect to the URL. The returned information might be "DIRECT" (which means no
proxy should be used), "PROXY host:port" (to tell the browser where the proxy
for this particular URL is) or "SOCKS host:port" (to direct the browser to a
SOCKS proxy).

libcurl has no means to interpret or evaluate JavaScript and thus it does not
support this. If you get yourself in a position where you face this nasty
invention, the following advice has been mentioned and used in the past:

- Depending on the JavaScript complexity, write up a script that translates it
  to another language and execute that.

- Read the JavaScript code and rewrite the same logic in another language.

- Implement a JavaScript interpreter; people have successfully used the
  Mozilla JavaScript engine in the past.

- Ask your admins to stop this, for a static proxy setup or similar.
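
The second suggestion above, rewriting the PAC logic by hand, might look like
the following minimal sketch. It assumes a hypothetical PAC file that returns
"DIRECT" for hosts under .internal.example.com and "PROXY
proxy.example.com:8080" for everything else. The function name proxy_for_host
and both hostnames are made up for illustration.

~~~c
#include <string.h>

/* Returns the string to hand to CURLOPT_PROXY for a given hostname:
   "" means connect directly, anything else names the proxy.
   The domain and proxy names are made up for this sketch. */
static const char *proxy_for_host(const char *host)
{
  static const char internal[] = ".internal.example.com";
  size_t hlen = strlen(host);
  size_t ilen = sizeof(internal) - 1;

  /* PAC equivalent:
     if(dnsDomainIs(host, ".internal.example.com")) return "DIRECT"; */
  if(hlen >= ilen && !strcmp(host + hlen - ilen, internal))
    return "";

  /* PAC equivalent: return "PROXY proxy.example.com:8080"; */
  return "proxy.example.com:8080";
}
~~~

The returned string could then be passed to CURLOPT_PROXY(3), where an empty
string tells libcurl to use no proxy.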

# Persistence Is The Way to Happiness

Re-cycling the same easy handle several times when doing multiple requests is
the way to go.

After each single curl_easy_perform(3) operation, libcurl keeps the
connection alive and open. A subsequent request using the same easy handle to
the same host might just be able to use the already open connection! This
reduces network impact a lot.

Even if the connection is dropped, subsequent SSL connections to the same
host benefit from libcurl's session ID cache, which drastically reduces
re-connection time.

FTP connections that are kept alive save a lot of time, as the
command-response round-trips are skipped, and also you do not risk getting
blocked without permission to log in again like on many FTP servers only
allowing N persons to be logged in at the same time.

libcurl caches DNS name resolving results, to make lookups of a previously
looked up name a lot faster.

Other interesting details that improve performance for subsequent requests
may also be added in the future.

Each easy handle attempts to keep the last few connections alive for a while
in case they are to be used again. You can set the size of this "cache" with
the CURLOPT_MAXCONNECTS(3) option. Default is 5. There is rarely any
point in changing this value, and if you think of changing this it is often
just a matter of thinking again.

To force your upcoming request to not use an already existing connection,
set CURLOPT_FRESH_CONNECT(3) to 1. In a similar spirit, you can also forbid
the connection used by the upcoming request from "lying" around and possibly
getting reused after the request by setting CURLOPT_FORBID_REUSE(3) to 1.

# HTTP Headers Used by libcurl

When you use libcurl to do HTTP requests, it passes along a series of headers
automatically.
It might be good for you to know and understand these. You can
replace or remove them by using the CURLOPT_HTTPHEADER(3) option.

## Host

This header is required by HTTP 1.1 and even many 1.0 servers and should be
the name of the server we want to talk to. This includes the port number if
anything but default.

## Accept

"*/*"

## Expect

When doing POST requests, libcurl sets this header to "100-continue" to ask
the server for an "OK" message before it proceeds with sending the data part
of the post. If the posted data amount is deemed "small", libcurl does not use
this header.

# Customizing Operations

There is an ongoing development today where more and more protocols are built
upon HTTP for transport. This has obvious benefits as HTTP is a tested and
reliable protocol that is widely deployed and has excellent proxy-support.

When you use one of these protocols, and even when doing other kinds of
programming you may need to change the traditional HTTP (or FTP or...)
manners. You may need to change words, headers or various data.

libcurl is your friend here too.

## CURLOPT_CUSTOMREQUEST

If just changing the actual HTTP request keyword is what you want, like when
GET, HEAD or POST is not good enough for you, CURLOPT_CUSTOMREQUEST(3)
is there for you. It is simple to use:

~~~c
curl_easy_setopt(handle, CURLOPT_CUSTOMREQUEST, "MYOWNREQUEST");
~~~

When using the custom request, you change the request keyword of the actual
request you are performing. Thus, by default you make a GET request but you
can also make a POST operation (as described before) and then replace the POST
keyword if you want to. You are the boss.

## Modify Headers

HTTP-like protocols pass a series of headers to the server when doing the
request, and you are free to pass any number of extra headers that you
think fit. Adding headers is this easy:

~~~c
struct curl_slist *headers = NULL; /* init to NULL is important */

headers = curl_slist_append(headers, "Hey-server-hey: how are you?");
headers = curl_slist_append(headers, "X-silly-content: yes");

/* pass our list of custom made headers */
curl_easy_setopt(handle, CURLOPT_HTTPHEADER, headers);

curl_easy_perform(handle); /* transfer http */

curl_slist_free_all(headers); /* free the header list */
~~~

... and if you think some of the internally generated headers, such as Accept:
or Host: do not contain the data you want them to contain, you can replace
them by simply setting them too:

~~~c
headers = curl_slist_append(headers, "Accept: Agent-007");
headers = curl_slist_append(headers, "Host: munged.host.line");
~~~

## Delete Headers

If you replace an existing header with one with no contents, you prevent the
header from being sent. For instance, if you want to completely prevent the
"Accept:" header from being sent, you can disable it with code similar to
this:

    headers = curl_slist_append(headers, "Accept:");

Both replacing and canceling internal headers should be done with careful
consideration and you should be aware that you may violate the HTTP protocol
when doing so.

## Enforcing chunked transfer-encoding

By making sure a request uses the custom header "Transfer-Encoding: chunked"
when doing a non-GET HTTP operation, libcurl switches over to "chunked"
upload, even though the size of the data to upload might be known.
By default,
libcurl usually switches over to chunked upload automatically if the upload
data size is unknown.

## HTTP Version

All HTTP requests include the version number to tell the server which version
we support. libcurl speaks HTTP 1.1 by default. Some old servers do not like
getting 1.1-requests and when dealing with stubborn old things like that, you
can tell libcurl to use 1.0 instead by doing something like this:

    curl_easy_setopt(handle, CURLOPT_HTTP_VERSION, (long)CURL_HTTP_VERSION_1_0);

## FTP Custom Commands

Not all protocols are HTTP-like, and thus the above may not help you when
you want to make, for example, your FTP transfers behave differently.

Sending custom commands to an FTP server means that you need to send the
commands exactly as the FTP server expects them (RFC 959 is a good guide
here), and you can only use commands that work on the control-connection
alone. All kinds of commands that require data interchange and thus need a
data-connection must be left to libcurl's own judgment. Also be aware that
libcurl does its best to change directory to the target directory before doing
any transfer, so if you change directory (with CWD or similar) you might
confuse libcurl and then it might not attempt to transfer the file in the
correct remote directory.

A little example that deletes a given file before an operation:

~~~c
headers = curl_slist_append(headers, "DELE file-to-remove");

/* pass the list of custom commands to the handle */
curl_easy_setopt(handle, CURLOPT_QUOTE, headers);

curl_easy_perform(handle); /* transfer ftp data! */

curl_slist_free_all(headers); /* free the header list */
~~~

If you instead want this operation (or chain of operations) to happen
_after_ the data transfer took place, use the CURLOPT_POSTQUOTE(3) option
with curl_easy_setopt(3) in the exact same way.

The custom FTP commands are issued to the server in the same order they are
added to the list, and if a command gets an error code returned back from the
server, no more commands are issued and libcurl bails out with an error code
(CURLE_QUOTE_ERROR). Note that if you use CURLOPT_QUOTE(3) to send
commands before a transfer, no transfer actually takes place when a quote
command has failed.

If you set CURLOPT_HEADER(3) to 1, you tell libcurl to get
information about the target file and output "headers" about it. The headers
are in "HTTP-style", looking like they do in HTTP.

The option to enable headers or to run custom FTP commands may be useful to
combine with CURLOPT_NOBODY(3). If this option is set, no actual file
content transfer is performed.

## FTP Custom CURLOPT_CUSTOMREQUEST

If you do want to list the contents of an FTP directory using your own defined
FTP command, CURLOPT_CUSTOMREQUEST(3) does just that. "LIST" is the default
command for listing directories ("NLST" when CURLOPT_DIRLISTONLY(3) is set),
but you are free to pass in your idea of a good alternative.

# Cookies Without Chocolate Chips

In the HTTP sense, a cookie is a name with an associated value. A server sends
the name and value to the client, and expects it to get sent back on every
subsequent request to the server that matches the particular conditions set.
The conditions include that the domain name and path match and that the cookie
has not become too old.

In real-world cases, servers send new cookies to replace existing ones to
update them.
Servers use cookies to "track" users and to keep "sessions".

Cookies are sent from servers to clients with the header Set-Cookie: and
they are sent from clients to servers with the Cookie: header.

To just send whatever cookie you want to a server, you can use
CURLOPT_COOKIE(3) to set a cookie string like this:

~~~c
curl_easy_setopt(handle, CURLOPT_COOKIE, "name1=var1; name2=var2;");
~~~

In many cases, that is not enough. You might want to dynamically save whatever
cookies the remote server passes to you, and make sure those cookies are then
used accordingly on later requests.

One way to do this is to save all headers you receive in a plain file and
when you make a request, you tell libcurl to read the previous headers to
figure out which cookies to use. Set the header file to read cookies from with
CURLOPT_COOKIEFILE(3).

The CURLOPT_COOKIEFILE(3) option also automatically enables the cookie
parser in libcurl. Until the cookie parser is enabled, libcurl does not parse
or understand incoming cookies and they are just ignored. However, when the
parser is enabled the cookies are understood and the cookies are kept in
memory and used properly in subsequent requests when the same handle is
used. Many times this is enough, and you may not have to save the cookies to
disk at all. Note that the file you specify to CURLOPT_COOKIEFILE(3)
does not have to exist to enable the parser, so a common way to just enable
the parser and not read any cookies is to use the name of a file you know does
not exist.

If you would rather use existing cookies that you have previously received
with your Netscape or Mozilla browsers, you can make libcurl use that cookie
file as input. The CURLOPT_COOKIEFILE(3) is used for that too, as
libcurl automatically finds out what kind of file it is and acts accordingly.

Perhaps the most advanced cookie operation libcurl offers is saving the
entire internal cookie state back into a Netscape/Mozilla formatted cookie
file. We call that the cookie-jar. When you set a filename with
CURLOPT_COOKIEJAR(3), that file is created and all received cookies get
stored in it when curl_easy_cleanup(3) is called. This enables cookies to get
passed on properly between multiple handles without any information getting
lost.

# FTP Peculiarities We Need

FTP transfers use a second TCP/IP connection for the data transfer. This is
usually a fact you can forget and ignore but at times this detail comes back
to haunt you. libcurl offers several different ways to customize how the
second connection is being made.

libcurl can either connect to the server a second time or tell the server to
connect back to it. The first option is the default and it is also what works
best for all the people behind firewalls, NATs or IP-masquerading setups.
libcurl then tells the server to open up a new port and wait for a second
connection. This is by default attempted with EPSV first, and if that does not
work it tries PASV instead. (EPSV is an extension to the original FTP spec
and does not exist or work on all FTP servers.)

You can prevent libcurl from first trying the EPSV command by setting
CURLOPT_FTP_USE_EPSV(3) to zero.

In some cases, you want to have the server connect back to you for the second
connection. This might be when the server is perhaps behind a firewall or
something and only allows connections on a single port. libcurl then informs
the remote server which IP address and port number to connect to. This is done
with the CURLOPT_FTPPORT(3) option. If you set it to "-", libcurl uses your
system's "default IP address".
If you want to use a particular IP, you can set
the full IP address, a hostname to resolve to an IP address or even a local
network interface name that libcurl gets the IP address from.

When doing the "PORT" approach, libcurl attempts to use the EPRT and LPRT
commands before trying PORT, as they work with more protocols. You can disable
this behavior by setting CURLOPT_FTP_USE_EPRT(3) to zero.

# MIME API revisited for SMTP and IMAP

In addition to supporting HTTP multi-part form fields, the MIME API can be used
to build structured email messages and send them via SMTP or append such
messages to IMAP directories.

A structured email message may contain several parts: some are displayed
inline by the MUA, some are attachments. Parts can also be structured as
multi-part, for example to include another email message or to offer several
alternative text formats. This can be nested to any level.

To build such a message, you prepare the nth-level multi-part and then include
it as a source to the parent multi-part using function
curl_mime_subparts(3). Once it has been
bound to its parent multi-part, an nth-level multi-part belongs to it and
should not be freed explicitly.

Email message data is supposed to be ASCII only and line length is
limited: fortunately, some transfer encodings are defined by the standards to
support the transmission of data that does not meet these constraints. Function
curl_mime_encoder(3) tells a part that its source data must be encoded
before being sent. It also generates the corresponding header for that part.
If the part data you want to send is already encoded in such a scheme, do not
use this function (this would over-encode it), but explicitly set the
corresponding part header.

Upon sending such a message, libcurl prepends it with the header list
set with CURLOPT_HTTPHEADER(3), as zero-level mime part headers.

Here is an example building an email message with an inline plain/html text
alternative and a file attachment encoded in base64:

~~~c
curl_mime *message = curl_mime_init(handle);

/* The inline part is an alternative proposing the html and the text
   versions of the email. */
curl_mime *alt = curl_mime_init(handle);

/* HTML message. */
curl_mimepart *part = curl_mime_addpart(alt);
curl_mime_data(part, "<html><body><p>This is HTML</p></body></html>",
               CURL_ZERO_TERMINATED);
curl_mime_type(part, "text/html");

/* Text message. */
part = curl_mime_addpart(alt);
curl_mime_data(part, "This is plain text message",
               CURL_ZERO_TERMINATED);

/* Create the inline part. */
part = curl_mime_addpart(message);
curl_mime_subparts(part, alt);
curl_mime_type(part, "multipart/alternative");
struct curl_slist *headers = curl_slist_append(NULL,
                   "Content-Disposition: inline");
curl_mime_headers(part, headers, 1); /* the part takes ownership */

/* Add the attachment. */
part = curl_mime_addpart(message);
curl_mime_filedata(part, "manual.pdf");
curl_mime_encoder(part, "base64");

/* Build the mail headers. */
headers = curl_slist_append(NULL, "From: me@example.com");
headers = curl_slist_append(headers, "To: you@example.com");

/* Set these into the easy handle. */
curl_easy_setopt(handle, CURLOPT_HTTPHEADER, headers);
curl_easy_setopt(handle, CURLOPT_MIMEPOST, message);
~~~

It should be noted that appending a message to an IMAP directory requires
the message size to be known prior to upload. It is therefore not possible to
include parts with unknown data size in this context.

# Headers Equal Fun

Some protocols provide "headers", meta-data separated from the normal
data. These headers are by default not included in the normal data stream, but
you can make them appear in the data stream by setting CURLOPT_HEADER(3)
to 1.

What might be even more useful is libcurl's ability to separate the headers
from the data and thus make the callbacks differ. You can for example set a
different pointer to pass to the ordinary write callback by setting
CURLOPT_HEADERDATA(3).

Or, you can set an entirely separate function to receive the headers, by using
CURLOPT_HEADERFUNCTION(3).

The headers are passed to the callback function one by one, and you can
depend on that fact. It makes it easier for you to add custom header parsers
etc.

"Headers" for FTP transfers equal all the FTP server responses. They are not
actually true headers, but in this case we pretend they are! ;-)

# Post Transfer Information

See curl_easy_getinfo(3).

# The multi Interface

The easy interface as described in detail in this document is a synchronous
interface that transfers one file at a time and does not return until it is
done.

The multi interface, on the other hand, allows your program to transfer
multiple files in both directions at the same time, without forcing you to use
multiple threads. The name might make it seem that the multi interface is for
multi-threaded programs, but the truth is almost the reverse. The multi
interface allows a single-threaded application to perform the same kinds of
multiple, simultaneous transfers that multi-threaded programs can perform. It
allows many of the benefits of multi-threaded transfers without the complexity
of managing and synchronizing many threads.

To complicate matters somewhat more, there are even two versions of the multi
interface: the event-based one, also called multi_socket, and the "normal one"
designed for use with select(). See the libcurl-multi(3) man page for details
on the multi_socket event-based API; this description here is for the select()
oriented one.

To use this interface, you are better off if you first understand the basics
of how to use the easy interface. The multi interface is simply a way to make
multiple transfers at the same time by adding up multiple easy handles into
a "multi stack".

You create the easy handles you want, one for each concurrent transfer, and
you set all the options just like you learned above, and then you create a
multi handle with curl_multi_init(3) and add all those easy handles to
that multi handle with curl_multi_add_handle(3).

When you have added the handles you have for the moment (you can still add new
ones at any time), you start the transfers by calling
curl_multi_perform(3).

curl_multi_perform(3) is asynchronous. It only performs what can be done
now and then returns control to your program. It is designed to never
block. You need to keep calling the function until all transfers are
completed.

The best usage of this interface is when you do a select() on all possible
file descriptors or sockets to know when to call libcurl again. This also
makes it easy for you to wait and respond to actions on your own application's
sockets/handles. You figure out what to select() for by using
curl_multi_fdset(3), which fills in a set of *fd_set* variables for
you with the particular file descriptors libcurl uses for the moment.

When you then call select(), it returns when one of the file handles signals
action and you then call curl_multi_perform(3) to allow libcurl to do
what it wants to do.
Take note that libcurl does also feature some time-out
code so we advise you to never use long timeouts on select() before you call
curl_multi_perform(3) again. curl_multi_timeout(3) is provided to
help you get a suitable timeout period.

Another precaution you should use: always call curl_multi_fdset(3)
immediately before the select() call since the current set of file descriptors
may change in any curl function call.

If you want to stop the transfer of one of the easy handles in the stack, you
can use curl_multi_remove_handle(3) to remove individual easy
handles. Remember that easy handles should still be cleaned up with
curl_easy_cleanup(3).

When a transfer within the multi stack has finished, the counter of running
transfers (as filled in by curl_multi_perform(3)) decreases. When the
number reaches zero, all transfers are done.

curl_multi_info_read(3) can be used to get information about completed
transfers. It then returns the CURLcode for each easy transfer, to allow you
to figure out success on each individual transfer.

# SSL, Certificates and Other Tricks

[ seeding, passwords, keys, certificates, ENGINE, ca certs ]

# Sharing Data Between Easy Handles

You can share some data between easy handles when the easy interface is used,
and some data is shared automatically when you use the multi interface.

When you add easy handles to a multi handle, these easy handles automatically
share a lot of the data that otherwise would be kept on a per-easy handle
basis when the easy interface is used.

The DNS cache is shared between handles within a multi handle, making
subsequent name resolving faster, and the connection pool that is kept to
better allow persistent connections and connection reuse is also shared.
If
you are using the easy interface, you can still share these between specific
easy handles by using the share interface, see libcurl-share(3).

Some things are never shared automatically, not even within multi handles.
Cookies are one example, so the only way to share them is with the share
interface.

# Footnotes

## [1]

libcurl 7.10.3 and later have the ability to switch over to chunked
Transfer-Encoding in cases where HTTP uploads are done with data of an unknown
size.

## [2]

This happens on Windows machines when libcurl is built and used as a
DLL. However, you can still do this on Windows if you link with a static
library.

## [3]

The curl-config tool is generated at build-time (on Unix-like systems) and
should be installed with the 'make install' or similar instruction that
installs the library, header files, man pages etc.

## [4]

This behavior was different in versions before 7.17.0, where strings had to
remain valid past the end of the curl_easy_setopt(3) call.