<!--
Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.

SPDX-License-Identifier: curl
-->

# String parsing with `strparse`

The functions take input via a pointer to a pointer, which lets them advance
the pointer on success. This in turn allows "chaining" of functions, as in
this example that gets a word, a space and then a second word:

~~~c
if(curlx_str_word(&line, &word1, MAX) ||
   curlx_str_singlespace(&line) ||
   curlx_str_word(&line, &word2, MAX))
  fprintf(stderr, "ERROR\n");
~~~

The input pointer **must** point to a null-terminated buffer area or these
functions risk continuing "off the edge".

## Strings

The functions that return string information do so by populating a
`struct Curl_str`:

~~~c
struct Curl_str {
  char *str;
  size_t len;
};
~~~

Access the struct fields with `curlx_str()` for the pointer and `curlx_strlen()`
for the length rather than using the struct fields directly.

## `curlx_str_init`

~~~c
void curlx_str_init(struct Curl_str *out);
~~~

This initializes a string struct. The parser functions that store info in
strings always init the string themselves, so this stand-alone use is often
not necessary.

## `curlx_str_assign`

~~~c
void curlx_str_assign(struct Curl_str *out, const char *str, size_t len);
~~~

Set a pointer and associated length in the string struct.

## `curlx_str_word`

~~~c
int curlx_str_word(char **linep, struct Curl_str *out, const size_t max);
~~~

Get a sequence of bytes until the first space or the end of the string. Return
non-zero on error. There is no way to include a space in the word, no sort of
escaping. The word must be at least one byte, otherwise it is considered an
error.

`max` is the longest accepted word length; a longer word returns an error.
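
As a rough illustration of the pointer-to-pointer style and the `max` limit,
the following sketch re-implements a word parser and a single-space check
outside of curl. The names `sketch_str`, `sketch_word`, `sketch_singlespace`
and `sketch_two_words` are hypothetical stand-ins, not curl functions, and the
error codes are arbitrary non-zero values picked for this example.

~~~c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str; not curl's own definition. */
struct sketch_str {
  const char *str;
  size_t len;
};

/* Stand-in word parser: collects bytes up to the first space or the end of
   the string. Returns non-zero on error and, in this sketch, leaves *linep
   untouched in that case. */
static int sketch_word(const char **linep, struct sketch_str *out,
                       size_t max)
{
  const char *s;
  size_t len = 0;
  assert(linep && *linep && out);
  s = *linep;
  out->str = NULL;
  out->len = 0;
  while(s[len] && s[len] != ' ') {
    if(++len > max)
      return 1; /* longer than the accepted maximum */
  }
  if(!len)
    return 2; /* an empty word is an error */
  out->str = s;
  out->len = len;
  *linep = s + len; /* advance past the word only on success */
  return 0;
}

/* Stand-in for the single-space check: advance over exactly one space. */
static int sketch_singlespace(const char **linep)
{
  assert(linep && *linep);
  if(**linep != ' ')
    return 1;
  (*linep)++;
  return 0;
}

/* Chain the parsers: word, one space, word. Returns 0 when all succeed. */
static int sketch_two_words(const char *line, struct sketch_str *w1,
                            struct sketch_str *w2, size_t max)
{
  return sketch_word(&line, w1, max) ||
         sketch_singlespace(&line) ||
         sketch_word(&line, w2, max);
}
~~~

With the input `"GET /index"` and a `max` of 16, `sketch_two_words` returns
zero and the two strings hold `GET` (3 bytes) and `/index` (6 bytes). A line
with two consecutive spaces fails because the second word would be empty.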

On a successful return, `linep` is updated to point to the byte immediately
following the parsed word.

## `curlx_str_until`

~~~c
int curlx_str_until(char **linep, struct Curl_str *out, const size_t max,
                    char delim);
~~~

Like `curlx_str_word` but instead of parsing to space, it parses to a given
custom delimiter non-zero byte `delim`.

`max` is the longest accepted word length; a longer word returns an error.

The parsed word must be at least one byte, otherwise it is considered an
error.

## `curlx_str_untilnl`

~~~c
int curlx_str_untilnl(char **linep, struct Curl_str *out, const size_t max);
~~~

Like `curlx_str_until` but instead parses until it finds a "newline byte".
That means either a CR (ASCII 13) or an LF (ASCII 10) octet.

`max` is the longest accepted word length; a longer word returns an error.

The parsed word must be at least one byte, otherwise it is considered an
error.

## `curlx_str_cspn`

~~~c
int curlx_str_cspn(const char **linep, struct Curl_str *out, const char *cspn);
~~~

Get a sequence of characters until one of the bytes in the `cspn` string
matches. Similar to the `strcspn` function.

## `curlx_str_quotedword`

~~~c
int curlx_str_quotedword(char **linep, struct Curl_str *out, const size_t max);
~~~

Get a "quoted" word. This means everything that is provided within a leading
and an ending double quote character. No escaping possible.

`max` is the longest accepted word length; a longer word returns an error.

The parsed word must be at least one byte, otherwise it is considered an
error.

## `curlx_str_single`

~~~c
int curlx_str_single(char **linep, char byte);
~~~

Advance over a single character provided in `byte`. Return non-zero on error.

## `curlx_str_singlespace`

~~~c
int curlx_str_singlespace(char **linep);
~~~

Advance over a single ASCII space.
Return non-zero on error.

## `curlx_str_passblanks`

~~~c
void curlx_str_passblanks(char **linep);
~~~

Advance over all spaces and tabs.

## `curlx_str_trimblanks`

~~~c
void curlx_str_trimblanks(struct Curl_str *out);
~~~

Trim off blanks (spaces and tabs) from the start and the end of the given
string.

## `curlx_str_number`

~~~c
int curlx_str_number(char **linep, curl_size_t *nump, size_t max);
~~~

Get an unsigned decimal number not larger than `max`. Leading zeroes are just
swallowed. Return non-zero on error. Returns error if there was not a single
digit.

## `curlx_str_numblanks`

~~~c
int curlx_str_numblanks(char **linep, curl_size_t *nump);
~~~

Get an unsigned 63-bit decimal number. Leading blanks and zeroes are skipped.
Returns non-zero on error. Returns error if there was not a single digit.

## `curlx_str_hex`

~~~c
int curlx_str_hex(char **linep, curl_size_t *nump, size_t max);
~~~

Get an unsigned hexadecimal number not larger than `max`. Leading zeroes are
just swallowed. Return non-zero on error. Returns error if there was not a
single digit. Does *not* handle a `0x` prefix.

## `curlx_str_octal`

~~~c
int curlx_str_octal(char **linep, curl_size_t *nump, size_t max);
~~~

Get an unsigned octal number not larger than `max`. Leading zeroes are just
swallowed. Return non-zero on error. Returns error if there was not a single
digit.

## `curlx_str_newline`

~~~c
int curlx_str_newline(char **linep);
~~~

Check for a single CR or LF. Return non-zero on error.

## `curlx_str_casecompare`

~~~c
int curlx_str_casecompare(struct Curl_str *str, const char *check);
~~~

Returns true if the provided string in the `str` argument matches the `check`
string case insensitively.
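
Because the parsed string is a pointer plus a length and is not
null-terminated, a comparison against a C string has to check the length as
well as the bytes. The following is a minimal sketch of such a
case-insensitive match, assuming ASCII-only case folding. `sketch_str`,
`sketch_lower` and `sketch_casecompare` are hypothetical names used for
illustration and are not curl's implementation.

~~~c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str; not curl's own definition. */
struct sketch_str {
  const char *str;
  size_t len;
};

/* ASCII-only lowercase conversion (an assumption of this sketch). */
static char sketch_lower(char c)
{
  return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Returns 1 when the length-delimited string equals `check` ignoring ASCII
   case, otherwise 0. Note the "true on match" convention. */
static int sketch_casecompare(const struct sketch_str *s, const char *check)
{
  size_t i;
  size_t clen;
  assert(s && check);
  clen = strlen(check);
  if(s->len != clen)
    return 0; /* different lengths can never match */
  for(i = 0; i < clen; i++) {
    if(sketch_lower(s->str[i]) != sketch_lower(check[i]))
      return 0;
  }
  return 1;
}
~~~

For a string that covers the first 12 bytes of `"Content-Type: text"`, the
sketch matches `"content-type"` but rejects both the shorter `"content-typ"`
and the longer `"content-types"`.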

## `curlx_str_cmp`

~~~c
int curlx_str_cmp(struct Curl_str *str, const char *check);
~~~

Returns true if the provided string in the `str` argument matches the `check`
string case sensitively. This is *not* the same return code as `strcmp`.

## `curlx_str_nudge`

~~~c
int curlx_str_nudge(struct Curl_str *str, size_t num);
~~~

Removes `num` bytes from the beginning (left) of the string kept in `str`. If
`num` is larger than the string, it instead returns an error.
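
The case-sensitive comparison and the nudge operation described above can be
sketched as follows. This is an illustration under stated assumptions, not
curl's code: `sketch_str`, `sketch_cmp` and `sketch_nudge` are hypothetical
names, and the non-zero error value is arbitrary.

~~~c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str; not curl's own definition. */
struct sketch_str {
  const char *str;
  size_t len;
};

/* Returns 1 on an exact, case-sensitive match and 0 otherwise. This is the
   opposite of strcmp(), which returns 0 when the strings are equal. */
static int sketch_cmp(const struct sketch_str *s, const char *check)
{
  size_t clen;
  assert(s && check);
  clen = strlen(check);
  if(s->len != clen)
    return 0;
  return !clen || !memcmp(s->str, check, clen);
}

/* Drops `num` bytes from the left of the string. Returns non-zero and leaves
   the string unchanged when `num` exceeds the string length. */
static int sketch_nudge(struct sketch_str *s, size_t num)
{
  assert(s);
  if(num > s->len)
    return 1; /* cannot remove more bytes than the string holds */
  s->str += num;
  s->len -= num;
  return 0;
}
~~~

Starting from the 9-byte string `key=value`, nudging by 4 leaves `value`, which
`sketch_cmp` matches against `"value"` but not against `"valuE"`. A further
nudge by 6 fails because only 5 bytes remain.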