path: root/src/string_bytes.cc
Age | Commit message | Author
2015-01-20 | src: silence clang warnings | Trevor Norris
Mark several methods "override" in order to remove build warnings. PR-URL: https://github.com/iojs/io.js/pull/531 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2015-01-12 | Remove excessive copyright/license boilerplate | isaacs
The copyright and license notice is already in the LICENSE file. There is no justifiable reason to also require that it be included in every file, since the individual files are not individually distributed except as part of the entire package.
2015-01-07 | src: pass Isolate to additional locations | Trevor Norris
Due to a recent V8 upgrade, more methods require Isolate as an argument. PR-URL: https://github.com/iojs/io.js/pull/244 Reviewed-by: Ben Noordhuis <info@bnoordhuis.nl>
2014-12-14 | src: move BE/LE buffer conversion to StringSlice() | Ben Noordhuis
Move the big endian to little endian conversion logic for UCS2 input from src/string_bytes.cc to src/node_buffer.cc; StringSlice() is the only function that actually needs it and with this commit, a second copy is avoided on big endian architectures.
2014-12-14 | src: redo unaligned access workaround | Ben Noordhuis
Introduce two-byte overloads of node::Encode() and StringBytes::Encode() that ensure that the input is suitably aligned. Revisits commit 535fec8 from yesterday.
2014-12-09 | src: fix unaligned access in ucs2 string encoder | Ben Noordhuis
Seen with g++ 4.9.2 on x86_64 Linux: a SIGSEGV is generated when the input to v8::String::NewFromTwoByte() is not suitably aligned. g++ 4.9.2 emits SSE instructions for copy loops. That requires aligned input but that was something StringBytes::Encode() did not enforce until now. Make a properly aligned copy before handing off the input to V8. We could, as an optimization, check that the pointer is aligned on a two-byte boundary but that is technically still UB; pointers-to-char are allowed to alias other pointers but the reverse is not true: a pointer-to-uint16_t that aliases a pointer-to-char is in violation of the pointer aliasing rules. See https://code.google.com/p/v8/issues/detail?id=3694 Fixes segfaulting test simple/test-stream2-writable. PR-URL: https://github.com/iojs/io.js/pull/127 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
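The "properly aligned copy" described above can be illustrated with a minimal sketch. `CopyToAlignedUint16` is a hypothetical helper written for this illustration, not the code from the commit; it assumes the input is a byte sequence holding two-byte code units in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (an assumption, not the commit's code): copy raw bytes
// that may start at an odd address into storage that is correctly aligned
// for uint16_t.
std::vector<uint16_t> CopyToAlignedUint16(const char* bytes, size_t nbytes) {
  std::vector<uint16_t> out(nbytes / 2);  // a trailing odd byte is dropped
  if (!out.empty()) {
    // memcpy places no alignment requirement on its source and does not
    // run afoul of the pointer aliasing rules.
    std::memcpy(out.data(), bytes, out.size() * sizeof(uint16_t));
  }
  return out;
}
```

The vector's buffer is suitably aligned for `uint16_t`, so it can be handed to an API that reads two-byte units without the aliasing problem the commit message describes.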
2014-11-14 | src: fixups after v8 upgrade | Ben Noordhuis
* v8::Platform has a new MonotonicallyIncreasingTime() method, implement it.
* The ASCII apocalypse continues with the replacement of external ASCII strings with external one byte strings.
2014-10-23 | src: mark more destructors with override keyword | Ben Noordhuis
The previous commits fixed oversights in destructors that should have been marked virtual but weren't. This commit marks destructors from derived classes with the override keyword.
2014-10-23 | src: replace NULL with nullptr | Ben Noordhuis
Now that we are building with C++11 features enabled, replace use of NULL with nullptr. The benefit of using nullptr is that it can never be confused for an integral type because it does not support implicit conversions to integral types except boolean - unlike NULL, which is defined as a literal `0`.
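A small sketch of why this matters for overload resolution. The `Describe` overloads are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical overload pair, used only to show which one is selected.
std::string Describe(int) { return "integer"; }
std::string Describe(const char*) { return "pointer"; }

// Describe(0) selects the int overload, because the literal 0 is an int
// first and a null pointer constant second.
// Describe(nullptr) selects the pointer overload, because nullptr cannot be
// converted to int.
```

With `NULL` the outcome depends on how the implementation defines the macro, which is the kind of confusion the commit removes.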
2014-10-16 | build: add x32 support | Ben Noordhuis
This commit adds preliminary x32 support. Configure with: $ ./configure --dest-cpu=x32 PR-URL: https://github.com/node-forward/node/pull/24 Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-10-12 | src: replace assert() with CHECK() | Ben Noordhuis
Mechanically replace assert() statements with UNREACHABLE(), CHECK(), or CHECK_{EQ,NE,LT,GT,LE,GE}() statements. The exceptions are src/node.h and src/node_object_wrap.h because they are public headers. PR-URL: https://github.com/node-forward/node/pull/16 Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-06-10 | Merge remote-tracking branch 'upstream/v0.10' | Timothy J Fontaine
Conflicts: AUTHORS ChangeLog deps/v8/src/api.cc deps/v8/src/unicode-inl.h deps/v8/src/unicode.h lib/_stream_readable.js lib/http.js src/cares_wrap.cc src/node.cc src/node_crypto.cc src/node_dtrace.cc src/node_file.cc src/node_stat_watcher.cc src/node_version.h src/process_wrap.cc src/string_bytes.cc src/string_bytes.h src/udp_wrap.cc src/util.h test/simple/test-buffer.js test/simple/test-stream2-compatibility.js
2014-06-06 | string_bytes: Guarantee valid utf-8 output | Felix Geisendörfer
Previously v8's WriteUtf8 function would produce invalid utf-8 output when encountering unmatched surrogate code units [1]. The new REPLACE_INVALID_UTF8 option fixes that by replacing invalid code points with the unicode replacement character. [1]: JS Strings are defined as arrays of 16 bit unsigned integers. There is no unicode enforcement, so one can easily end up with invalid unicode code unit sequences inside a string.
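The replacement behaviour can be sketched independently of V8. The encoder below is an illustrative stand-in, not V8's WriteUtf8; the function names are assumptions. It converts UTF-16 code units to UTF-8 and substitutes U+FFFD for any surrogate that is not part of a valid pair.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8 (hypothetical helper).
void AppendUtf8(std::string* out, uint32_t cp) {
  if (cp < 0x80) {
    out->push_back(static_cast<char>(cp));
  } else if (cp < 0x800) {
    out->push_back(static_cast<char>(0xC0 | (cp >> 6)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else if (cp < 0x10000) {
    out->push_back(static_cast<char>(0xE0 | (cp >> 12)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else {
    out->push_back(static_cast<char>(0xF0 | (cp >> 18)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
}

// Encode UTF-16 code units as UTF-8, replacing unmatched surrogates with
// U+FFFD so that the output is always valid UTF-8.
std::string Utf16ToUtf8Replacing(const uint16_t* units, size_t n) {
  std::string out;
  for (size_t i = 0; i < n; i++) {
    uint32_t cp = units[i];
    if (cp >= 0xD800 && cp <= 0xDBFF) {  // lead surrogate
      if (i + 1 < n && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
        i++;  // consume the trail surrogate as well
      } else {
        cp = 0xFFFD;  // lead surrogate without a trail
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // trail surrogate without a lead
    }
    AppendUtf8(&out, cp);
  }
  return out;
}
```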
2014-05-21 | string_bytes: ucs2 support big endian | Andrew Low
64bit constants are keyed for x64 platforms only, add PowerPC based platform constants. Node's "ucs2" encoding wants LE character data stored in the Buffer, so we need to reorder on BE platforms. See http://nodejs.org/api/buffer.html regarding Node's "ucs2" encoding specification Signed-off-by: Timothy J Fontaine <tjfontaine@gmail.com>
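One way to guarantee little endian storage on any host is to write each code unit byte by byte. This is a sketch of that alternative, not the conditional reordering the commit adds; `WriteUcs2LE` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: store two-byte code units in little endian order,
// regardless of the byte order of the host. `dst` must have room for 2 * n
// bytes.
void WriteUcs2LE(const uint16_t* units, size_t n, unsigned char* dst) {
  for (size_t i = 0; i < n; i++) {
    dst[2 * i] = static_cast<unsigned char>(units[i] & 0xFF);    // low byte
    dst[2 * i + 1] = static_cast<unsigned char>(units[i] >> 8);  // high byte
  }
}
```

Because the shifts operate on values rather than on memory layout, the same code produces identical bytes on big endian and little endian machines.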
2014-05-12 | src: fix StringBytes::Write if string is external | Refael Ackermann
Signed-off-by: Fedor Indutny <fedor@indutny.com>
2014-03-16 | src: don't call DecodeWrite() on Buffers | Ben Noordhuis
Don't call DecodeWrite() with a Buffer as its argument because it in turn calls StringBytes::Write() and that method expects a Local<String>. "Why then does that function take a Local<Value>?" I hear you ask. Good question but I don't have the answer. I added a CHECK for good measure and what do you know, all of a sudden a large number of crypto tests started failing. Calling DecodeWrite(BINARY) on a buffer is nonsensical anyway: if you want the contents of the buffer, just copy out the data, there is no need to decode it - and that's exactly what this commit does. Fixes a great many instances of the following run-time error in debug builds: FATAL ERROR: v8::String::Cast() Could not convert to string
2014-03-16 | src: fix segfaults, fix 32 bits integer negation | Ben Noordhuis
Make calls to v8::Isolate::AdjustAmountOfExternalAllocatedMemory() take special care when negating 32 bits unsigned types like size_t. Before this commit, values were negated before they got promoted to 64 bits, meaning that on 32 bits architectures, a value like 42 got cast to 4294967254 instead of -42. That in turn made the garbage collector start scavenging like crazy because it thought the system was out of memory. That's bad enough but calls to AdjustAmountOfExternalAllocatedMemory() were made from weak callbacks, i.e. at a time when the garbage collector was already busy. It triggered asserts in debug builds and caused random crashes and memory corruption in release builds. The behavior in release builds is arguably a V8 bug and should perhaps be reported upstream. Partially fixes #7309 but requires further bug fixes to src/smalloc.cc that I'll address in a follow-up commit.
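The arithmetic behind the bug can be reproduced in isolation. The sketch below assumes a platform where `int` is 32 bits wide, so that `uint32_t` is not promoted to a signed type; the function names are hypothetical.

```cpp
#include <cstdint>

// Buggy order: the value is negated while it is still a 32-bit unsigned
// type, so it wraps modulo 2^32 and is only then widened to 64 bits.
int64_t NegateThenWiden(uint32_t size) {
  return static_cast<int64_t>(-size);
}

// Fixed order: widen to a signed 64-bit type first, then negate.
int64_t WidenThenNegate(uint32_t size) {
  return -static_cast<int64_t>(size);
}
```

For an input of 42 the first function yields 4294967254 (2^32 - 42) and the second yields -42, matching the values quoted in the commit message.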
2014-03-16 | src: squelch -Wmaybe-uninitialized warning | Ben Noordhuis
The variable isn't actually used uninitialized but g++ 4.8 doesn't know that. Set it to NULL to silence the following compiler warning:

    ../src/string_bytes.cc:247:29: warning: 'data' may be used uninitialized in this function [-Wmaybe-uninitialized]
        unsigned a = hex2bin(src[i * 2 + 0]);
                                ^
    ../src/string_bytes.cc:299:15: note: 'data' was declared here
      const char* data;
                  ^
2014-03-16 | src: remove unused ExternString constructor | Ben Noordhuis
Remove an unused (and unsafe) constructor. Unsafe because it doesn't initialize the data_ field.
2014-03-13 | src: update to v8 3.24 APIs | Fedor Indutny
2014-02-22 | src: remove `node_isolate` from source | Fedor Indutny
fix #6899
2013-11-11 | v8: upgrade to 3.22.24 | Ben Noordhuis
This commit removes the simple/test-event-emitter-memory-leak test for being unreliable with the new garbage collector: the memory pressure exerted by the test case is too low for the garbage collector to kick in. It can be made to work again by limiting the heap size with the --max_old_space_size=x flag but that won't be very reliable across platforms and architectures.
2013-10-17 | cpplint: disallow if one-liners | Fedor Indutny
2013-09-03 | string_bytes: use extern for length and write utf8 | Trevor Norris
If the string is external then the length can be quickly retrieved. This is especially faster for large strings that are being treated as UTF8. Also, if the string is external then there's no need for a full String::WriteUtf8 operation. A simple memcpy will do.
2013-08-09 | src: use v8::String::NewFrom*() functions | Ben Noordhuis
* Change calls to String::New() and String::NewSymbol() to their respective one-byte, two-byte and UTF-8 counterparts.
* Add a FIXED_ONE_BYTE_STRING macro that takes a string literal and turns it into a v8::Local<v8::String>.
* Add helper functions that make v8::String::NewFromOneByte() easier to work with. Said function expects a `const uint8_t*` but almost every call site deals with `const char*` or `const unsigned char*`. Helps us avoid doing reinterpret_casts all over the place.
* Code that handles file system paths keeps using UTF-8 for backwards compatibility reasons. At least now the use of UTF-8 is explicit.
* Remove v8::String::NewSymbol() entirely. Almost all call sites were effectively minor de-optimizations. If you create a string only once, there is no point in making it a symbol. If you create the same string repeatedly, it should probably be cached in a persistent handle.
2013-07-31 | src: more lint after cpplint tightening | Ben Noordhuis
Commit 847c6d9 adds a 'project headers before system headers' check to cpplint. Update the files in src/ to make the linter pass again.
2013-07-31 | src: lint c++ code | Fedor Indutny
2013-07-30 | string_bytes: add StringBytes::IsValidString() | Ben Noordhuis
Performs a quick, non-exhaustive check on the input string to see if it's compatible with the specified string encoding. Currently it only checks that hex strings have a length that is a multiple of two.
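The hex check described above amounts to a parity test on the length. A minimal sketch, using a hypothetical free function rather than the real StringBytes::IsValidString() signature:

```cpp
#include <string>

// Hypothetical stand-in for the hex branch of the check: every byte is
// written as two hex digits, so a valid hex string has an even length.
// The digits themselves are deliberately not inspected.
bool IsValidHexLength(const std::string& hex) {
  return hex.size() % 2 == 0;
}
```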
2013-07-30 | fs: write strings directly to disk | Trevor Norris
Previously, strings were first converted to a Buffer before being written to disk. That intermediate step has been removed. Other changes of note:
* Class member "must_free" was added to req_wrap to track whether the memory needs to be manually cleaned up after use.
* External String Resource support, so the memory will be used directly instead of copying out the data.
* Docs have been updated to reflect that if position is not a number then it will assume null. Previously it specified the argument must be null, but that was not how the code worked. An attempt was made to only support == null, but there were too many tests that assumed != number would be enough.
* Docs update to show that some of the write/writeSync arguments are optional.
2013-07-30 | string_bytes: export GetExternalParts | Trevor Norris
The method is useful elsewhere when code needs to check whether a string is external and grab its data.
2013-07-30 | Merge remote-tracking branch 'origin/v0.10' | Ben Noordhuis
Conflicts: AUTHORS ChangeLog deps/uv/ChangeLog deps/uv/src/version.c deps/uv/src/win/fs.c src/node.cc src/node_crypto.cc src/node_os.cc src/node_version.h
2013-07-06 | string_bytes: stop using String::AsciiValue | Ben Noordhuis
Debug builds of V8 now actively check that the string only contains ASCII characters (i.e. doesn't contain bytes with the high bit set.)
2013-06-25 | Merge remote-tracking branch 'ry/v0.10' into master | isaacs
Conflicts: ChangeLog deps/uv/ChangeLog deps/uv/src/unix/stream.c deps/uv/src/version.c deps/v8/build/common.gypi deps/v8/src/frames.h deps/v8/src/runtime.cc deps/v8/test/mjsunit/debug-set-variable-value.js lib/http.js src/node_version.h
2013-06-19 | string_bytes: properly detect 64bit | Timothy J Fontaine
2013-06-18 | buffer: use smalloc as backing data store | Trevor Norris
Memory allocations are now done through smalloc. The Buffer cc class has been removed completely, but for backwards compatibility have left the namespace as Buffer. The .parent attribute is only set if the Buffer is a slice of an allocation. Which is then set to the alloc object (not a Buffer). The .offset attribute is now a ReadOnly set to 0, for backwards compatibility. I'd like to remove it in the future (pre v1.0). A few alterations have been made to how arguments are either coerced or thrown. All primitives will now be coerced to their respective values, and (most) all out of range index requests will throw. The indexes that are coerced were left for backwards compatibility. For example: Buffer slice operates more like Array slice, and coerces instead of throwing out of range indexes. This may change in the future. The reason for wanting to throw for out of range indexes is because giving js access to raw memory has high potential risk. To mitigate that it's easier to make sure the developer is always quickly alerted to the fact that their code is attempting to access beyond memory bounds. Because SlowBuffer will be deprecated, and simply returns a new Buffer instance, all tests on SlowBuffer have been removed. Heapdumps will now show usage under "smalloc" instead of "Buffer". ParseArrayIndex was added to node_internals to support proper uint argument checking/coercion for external array data indexes. SlabAllocator had to be updated since handle_ no longer exists.
2013-06-17 | src: clean up `using` directives | Ben Noordhuis
Remove the unused ones and alphabetically sort the ones that remain.
2013-06-12 | string_bytes: write strings using new API | Trevor Norris
StringBytes::Write now uses the new v8 API. It also first checks whether the string is external and, if so, uses the external memory directly.
2013-06-12 | string_bytes: use external for large strings | Trevor Norris
When large strings are used they cause v8's GC to spend a lot more time cleaning up. In these cases it's much faster to use external string resources. UTF8 strings do not use external string resources because only one and two byte external strings are supported. EXTERN_APEX is the value at which v8's GC overtakes performance.

The following table gives rough estimates of the percentage performance gain from this patch, by encoding and buffer size (UTF8 is missing because UTF8 strings cannot be externalized).

    encoding   128KB     1MB     5MB
    --------------------------------
    ASCII        58%    208%    250%
    HEX          15%     74%     86%
    BASE64       11%     74%     71%
    UCS2          2%    225%    398%
    BINARY     2234%   1728%   2305%

BINARY is so much faster across the board because of using the new v8 WriteOneByte API.
2013-06-11 | string_bytes: implement new v8 API | Trevor Norris
v8 has a new API to write out strings to memory. This has been implemented. One other change of note is BINARY encoded strings have a new implementation. This has improved performance substantially.
2013-06-11 | lint: add missing isolates and minor style fixes | Trevor Norris
2013-05-20 | Merge remote-tracking branch ry/v0.10 into master | isaacs
Conflicts: AUTHORS ChangeLog src/node_crypto.cc src/node_version.h
2013-05-20 | string_bytes: strip padding from base64 strings | Trevor Norris
Because of variations between base64 implementations, it's been decided to strip all padding from the end of a base64 string and calculate its size from that.
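The size calculation can be sketched as follows. `Base64DecodedSize` is a hypothetical function written for illustration, not the implementation in string_bytes.cc; it assumes the input contains only base64 characters plus optional trailing `=` padding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: ignore any trailing '=' padding, then derive the
// decoded byte count from the remaining character count. Every 4 characters
// carry 3 bytes; a remainder of 2 or 3 characters carries 1 or 2 bytes.
size_t Base64DecodedSize(const std::string& input) {
  size_t n = input.size();
  while (n > 0 && input[n - 1] == '=') n--;  // strip all padding
  return n / 4 * 3 + (n % 4) * 3 / 4;
}
```

Computing the size from the unpadded length means that "TQ==", "TQ=" and "TQ" all report one byte, whatever padding convention the producer followed.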
2013-05-17 | src: Remove superfluous static_cast | isaacs
2013-05-17 | Merge remote-tracking branch 'ry/v0.10' into master | isaacs
Conflicts: AUTHORS ChangeLog deps/uv/ChangeLog deps/uv/config-unix.mk deps/uv/src/unix/stream.c deps/uv/src/version.c deps/uv/uv.gyp src/node.cc src/node_buffer.cc src/node_crypto.cc src/node_version.h src/stream_wrap.cc src/stream_wrap.h
2013-05-16 | buffer, crypto: fix default encoding regression | Ben Noordhuis
The default encoding is 'buffer'. When the input is a string, treat it as 'binary'. Fixes the following assertion: node: ../src/string_bytes.cc:309: static size_t node::StringBytes::StorageSize(v8::Handle<v8::Value>, node::encoding): Assertion `0 && "buffer encoding specified but string provided"' failed. Introduced in 64fc34b2. Fixes #5482.
2013-05-14 | src: Add StringBytes static util class | isaacs
Four functions:
- StringBytes::StorageSize()
- StringBytes::Size()
- StringBytes::Write()
- StringBytes::Encode()