[Likwid-commit] [likwid] 01/01: Add bstrlib documentation and remove STREAM benchmark
Thomas Röhl
eatmyfear-guest at moszumanska.debian.org
Tue May 5 14:36:49 UTC 2015
This is an automated email from the git hooks/post-receive script.
eatmyfear-guest pushed a commit to branch master
in repository likwid.
commit 8fd47f5b56f0ea5d36b5561742728646cec6da13
Author: Thomas.Roehl <Thomas.Roehl at googlemail.com>
Date: Tue May 5 16:36:39 2015 +0200
Add bstrlib documentation and remove STREAM benchmark
---
debian/patches/07-bstrlib.patch | 3213 +++++++++++++++++++++++++++++++++++++
debian/patches/08-no-stream.patch | 465 ++++++
debian/patches/series | 2 +
3 files changed, 3680 insertions(+)
diff --git a/debian/patches/07-bstrlib.patch b/debian/patches/07-bstrlib.patch
new file mode 100644
index 0000000..67234d1
--- /dev/null
+++ b/debian/patches/07-bstrlib.patch
@@ -0,0 +1,3213 @@
+Description: Add bstrlib.txt to documentation
+ The debian guys recognized that the license in bstrlib.c mentions
+ "accompanying documentation". So here it is.
+ Author: Thomas Roehl <thomas.roehl at fau.de>
+ Last-Update: 2015-05-05
+ ---
+ This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
+
+diff -Nur likwid-3.1.3.orig/doc/bstrlib.txt likwid-3.1.3/doc/bstrlib.txt
+--- likwid-3.1.3.orig/doc/bstrlib.txt 1970-01-01 01:00:00.000000000 +0100
++++ likwid-3.1.3/doc/bstrlib.txt 2015-05-05 16:29:42.856001000 +0200
+@@ -0,0 +1,3201 @@
++Better String library
++---------------------
++
++by Paul Hsieh
++
++The bstring library is an attempt to provide improved string processing
++functionality to the C and C++ languages. At the heart of the bstring library
++(Bstrlib for short) is the management of "bstring"s which are a significant
++improvement over '\0' terminated char buffers.
++
++===============================================================================
++
++Motivation
++----------
++
++The standard C string library has serious problems:
++
++ 1) Its use of '\0' to denote the end of the string means knowing a
++ string's length is O(n) when it could be O(1).
++ 2) It imposes an interpretation for the character value '\0'.
++ 3) gets() always exposes the application to a buffer overflow.
++ 4) strtok() modifies the string it is parsing and thus may not be usable in
++ programs which are re-entrant or multithreaded.
++ 5) fgets has the unusual semantic of ignoring '\0's that occur before
++ '\n's are consumed.
++ 6) There is no memory management, and actions performed such as strcpy,
++ strcat and sprintf are common places for buffer overflows.
++ 7) strncpy() doesn't '\0' terminate the destination in some cases.
++ 8) Passing NULL to C library string functions causes an undefined NULL
++ pointer access.
++ 9) Parameter aliasing (overlapping, or self-referencing parameters)
++ within most C library functions has undefined behavior.
++ 10) Many C library string function calls take integer parameters with
++ restricted legal ranges. Parameters passed outside these ranges are
++ not typically detected and cause undefined behavior.
++
++So the desire is to create an alternative string library that does not suffer
++from the above problems and adds in the following functionality:
++
++ 1) Incorporate string functionality seen from other languages.
++ a) MID$() - from BASIC
++ b) split()/join() - from Python
++ c) string/char x n - from Perl
++ 2) Implement analogs to functions that combine stream IO and char buffers
++ without creating a dependency on stream IO functionality.
++ 3) Implement the basic text editor-style functions insert, delete, find,
++ and replace.
++ 4) Implement reference based sub-string access (as a generalization of
++ pointer arithmetic.)
++ 5) Implement runtime write protection for strings.
++
++There is also a desire to avoid "API-bloat". So functionality that can be
++implemented trivially in other functionality is omitted. So there is no
++left$() or right$() or reverse() or anything like that as part of the core
++functionality.
++
++Explaining Bstrings
++-------------------
++
++A bstring is basically a header which wraps a pointer to a char buffer. Let's
++start with the declaration of a struct tagbstring:
++
++ struct tagbstring {
++ int mlen;
++ int slen;
++ unsigned char * data;
++ };
++
++This definition is considered exposed, not opaque (though it is neither
++necessary nor recommended that low level maintenance of bstrings be performed
++whenever the abstract interfaces are sufficient). The mlen field (usually)
++describes a lower bound for the memory allocated for the data field. The
++slen field describes the exact length for the bstring. The data field is a
++single contiguous buffer of unsigned chars. Note that the existence of a '\0'
++character in the unsigned char buffer pointed to by the data field does not
++necessarily denote the end of the bstring.
++
++To be a well formed modifiable bstring the mlen field must be at least the
++length of the slen field, and slen must be non-negative. Furthermore, the
++data field must point to a valid buffer in which access to the first mlen
++characters has been acquired. So the minimal check for correctness is:
++
++ (slen >= 0 && mlen >= slen && data != NULL)
++
++bstrings returned by bstring functions can be assumed to be either NULL or
++satisfy the above property. (When bstrings are only readable, the mlen >=
++slen restriction is not required; this is discussed later in this section.)
++A bstring itself is just a pointer to a struct tagbstring:
++
++ typedef struct tagbstring * bstring;
++
++Note that use of the prefix "tag" in struct tagbstring is required to work
++around the inconsistency between C and C++'s struct namespace usage. This
++definition is also considered exposed.
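The two correctness conditions given in this section can be written out as small predicates. This is only a sketch: the struct is copied from the text above, while the helper names is_writable_wellformed and is_readable_wellformed are hypothetical and are not part of Bstrlib.

```c
#include <stddef.h>

/* Declaration as given in the text above. */
struct tagbstring {
    int mlen;
    int slen;
    unsigned char *data;
};

/* Hypothetical helper: the minimal check for a modifiable bstring,
   (slen >= 0 && mlen >= slen && data != NULL), plus a NULL header test. */
int is_writable_wellformed(const struct tagbstring *b) {
    return b != NULL && b->slen >= 0 && b->mlen >= b->slen && b->data != NULL;
}

/* Hypothetical helper: the relaxed check for read-only use, where the
   mlen >= slen restriction is dropped. */
int is_readable_wellformed(const struct tagbstring *b) {
    return b != NULL && b->slen >= 0 && b->data != NULL;
}
```

A header that fails the first predicate but passes the second may still be handed to functions that only read it.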
++
++Bstrlib basically manages bstrings allocated as a header and an associated
++data-buffer. Since the implementation is exposed, they can also be
++constructed manually. Functions which mutate bstrings assume that the header
++and data buffer have been malloced; the bstring library may perform free() or
++realloc() on both the header and data buffer of any bstring parameter.
++Functions which return bstrings create new bstrings. The string memory is
++freed by a bdestroy() call (or using the bstrFree macro).
++
++The following related typedef is also provided:
++
++ typedef const struct tagbstring * const_bstring;
++
++which is also considered exposed. These are directly bstring compatible (no
++casting required) but are just used for parameters which are meant to be
++non-mutable. So in general, bstring parameters which are read as input but
++not meant to be modified will be declared as const_bstring, and bstring
++parameters which may be modified will be declared as bstring. This convention
++is recommended for user written functions as well.
++
++Since bstrings maintain interoperability with C library char-buffer style
++strings, all functions which modify, update or create bstrings also append a
++'\0' character into the position slen + 1. This trailing '\0' character is
++not required for bstrings input to the bstring functions; this is provided
++solely as a convenience for interoperability with standard C char-buffer
++functionality.
++
++Analogs for the ANSI C string library functions have been created when they
++are necessary, but have also been left out when they are not. In particular
++there are no functions analogous to fwrite, or puts just for the purposes of
++bstring. The ->data member of any string is exposed, and therefore can be
++used just as easily as char buffers for C functions which read strings.
++
++For those that wish to hand construct bstrings, the following should be kept
++in mind:
++
++ 1) While bstrlib can accept constructed bstrings without terminating
++ '\0' characters, the rest of the C language string library will not
++ function properly on such non-terminated strings. This is obvious
++ but must be kept in mind.
++ 2) If it is intended that a constructed bstring be written to by the
++ bstring library functions then the data portion should be allocated
++ by the malloc function and the slen and mlen fields should be entered
++ properly. The struct tagbstring header is not reallocated, and only
++ freed by bdestroy.
++ 3) Writing arbitrary '\0' characters at various places in the string
++ will not modify its length as perceived by the bstring library
++ functions. In fact, '\0' is a legitimate non-terminating character
++ for a bstring to contain.
++ 4) For read only parameters, bstring functions do not check the mlen.
++ I.e., the minimal correctness requirements are reduced to:
++
++ (slen >= 0 && data != NULL)
++
++Better pointer arithmetic
++-------------------------
++
++One built-in feature of '\0' terminated char * strings is that it is very easy
++and fast to obtain a reference to the tail of any string using pointer
++arithmetic. Bstrlib does one better by providing a way to get a reference to
++any substring of a bstring (or any other length delimited block of memory.)
++So rather than just having pointer arithmetic, with bstrlib one essentially
++has segment arithmetic. This is achieved using the macro blk2tbstr() which
++builds a reference to a block of memory and the macro bmid2tbstr() which
++builds a reference to a segment of a bstring. Bstrlib also includes
++functions for direct consumption of memory blocks into bstrings, namely
++bcatblk () and blk2bstr ().
++
++One scenario where this can be extremely useful is when a string contains many
++substrings which one would like to pass as read-only reference parameters to
++some string consuming function without the need to allocate entire new
++containers for the string data. More concretely, imagine parsing a command
++line string whose parameters are space delimited. This can only be done for
++tails of the string with '\0' terminated char * strings.
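The idea of a reference to a segment can be sketched without the library itself. The helper below is hypothetical (it is not the bmid2tbstr macro) and only shows how a second header can describe part of an existing buffer without copying. Setting mlen to a negative value to mark the reference as not writable is an assumption made for this sketch.

```c
#include <stddef.h>

struct tagbstring {
    int mlen;
    int slen;
    unsigned char *data;
};

/* Hypothetical helper: make *ref describe the segment of src that starts at
   pos and holds at most len characters. No data is copied. Requests that
   reach outside src are clamped to the part that exists. */
int segment_ref(struct tagbstring *ref, const struct tagbstring *src,
                int pos, int len) {
    if (ref == NULL || src == NULL || src->data == NULL || src->slen < 0)
        return -1;
    if (len < 0) len = 0;
    if (pos < 0) { len += pos; pos = 0; }
    if (len < 0) len = 0;
    if (pos > src->slen) pos = src->slen;
    if (len > src->slen - pos) len = src->slen - pos;
    ref->data = src->data + pos;
    ref->slen = len;
    ref->mlen = -1; /* assumption: a negative mlen marks "do not write" */
    return 0;
}
```

With a command line such as "cmd -a -b" held in one buffer, each parameter can then be passed on as its own read-only header, which '\0' terminated strings can only do for the final parameter.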
++
++Improved NULL semantics and error handling
++------------------------------------------
++
++Unless otherwise noted, if a NULL pointer is passed as a bstring or any other
++detectably illegal parameter, the called function will return with an error
++indicator (either NULL or BSTR_ERR) rather than simply performing a NULL
++pointer access, or having undefined behavior.
++
++To illustrate the value of this, consider the following example:
++
++ strcpy (p = malloc (13 * sizeof (char)), "Hello,");
++ strcat (p, " World");
++
++This is not correct because malloc may return NULL (due to an out of memory
++condition), and the behaviour of strcpy is undefined if either of its
++parameters are NULL. However:
++
++ bstrcat (p = bfromcstr ("Hello,"), q = bfromcstr (" World"));
++ bdestroy (q);
++
++is well defined, because if either p or q are assigned NULL (indicating a
++failure to allocate memory) both bstrcat and bdestroy will recognize it and
++perform no detrimental action.
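For comparison, the plain C version only becomes well defined once every failure is checked by hand. A minimal sketch using only the standard library follows; the helper name concat_checked is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated concatenation of a and b.
   Returns NULL if either input is NULL or if the allocation fails, instead
   of passing a NULL pointer on to strcpy or strcat. */
char *concat_checked(const char *a, const char *b) {
    size_t la, lb;
    char *p;

    if (a == NULL || b == NULL) return NULL;
    la = strlen(a);
    lb = strlen(b);
    p = malloc(la + lb + 1);
    if (p == NULL) return NULL;
    memcpy(p, a, la);
    memcpy(p + la, b, lb + 1); /* the + 1 copies the terminating '\0' */
    return p;
}
```

The caller still has to test the result and release it with free, which is the bookkeeping the bstring calls above take over.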
++
++Note that it is not necessary to check any of the members of a returned
++bstring for internal correctness (in particular the data member does not need
++to be checked against NULL when the header is non-NULL), since this is
++assured by the bstring library itself.
++
++bStreams
++--------
++
++In addition to the bgets and bread functions, bstrlib can abstract streams
++with a high performance read only stream called a bStream. In general, the
++idea is to open a core stream (with something like fopen) then pass its
++handle as well as a bNread function pointer (like fread) to the bsopen
++function which will return a handle to an open bStream. Then the functions
++bsread, bsreadln or bsreadlns can be called to read portions of the stream.
++Finally, the bsclose function is called to close the bStream -- it will
++return a handle to the original (core) stream. So bStreams, essentially,
++wrap other streams.
++
++The bStreams have two main advantages over the bgets and bread (as well as
++fgets/ungetc) paradigms:
++
++1) Improved functionality via the bunread function which allows a stream to
++ unread characters, giving the bStream stack-like functionality if so
++ desired.
++2) A very high performance bsreadln function. The C library function fgets()
++ (and the bgets function) can typically be written as a loop on top of
++ fgetc(), thus paying all of the overhead costs of calling fgetc on a per
++ character basis. bsreadln will read blocks at a time, thus amortizing the
++ overhead of fread calls over many characters at once.
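The amortization argument in point 2 can be illustrated with a small standalone reader. This is a sketch, not the bStream implementation: the callback type read_fn is assumed to have an fread-like shape, and struct line_reader, lr_init, lr_readln, struct mem_src and mem_read are all hypothetical names. One callback invocation refills a whole block, and lines are then cut out of that block.

```c
#include <stddef.h>
#include <string.h>

/* Assumed fread-like callback: returns the number of elements read,
   0 at end of input. */
typedef size_t (*read_fn)(void *buf, size_t elsize, size_t nelem, void *parm);

struct line_reader {
    read_fn read;
    void *parm;
    char block[64];   /* one refill serves many characters */
    size_t pos, len;
};

void lr_init(struct line_reader *lr, read_fn read, void *parm) {
    lr->read = read;
    lr->parm = parm;
    lr->pos = lr->len = 0;
}

/* Stores the next line in out (keeping the '\n' if there is one) and
   terminates it with '\0'. Stops early when out is full. Returns the
   number of characters stored, 0 at end of input. */
size_t lr_readln(struct line_reader *lr, char *out, size_t cap) {
    size_t n = 0;
    if (lr == NULL || out == NULL || cap == 0) return 0;
    while (n + 1 < cap) {
        if (lr->pos == lr->len) {   /* block drained: refill in one call */
            lr->len = lr->read(lr->block, 1, sizeof lr->block, lr->parm);
            lr->pos = 0;
            if (lr->len == 0) break;
        }
        out[n] = lr->block[lr->pos++];
        if (out[n++] == '\n') break;
    }
    out[n] = '\0';
    return n;
}

/* Hypothetical in-memory source, so the sketch needs no file. */
struct mem_src { const char *p; size_t left; };

size_t mem_read(void *buf, size_t elsize, size_t nelem, void *parm) {
    struct mem_src *m = parm;
    size_t want = elsize * nelem;
    if (want > m->left) want = m->left;
    memcpy(buf, m->p, want);
    m->p += want;
    m->left -= want;
    return elsize ? want / elsize : 0;
}
```

Because the reader holds characters it has already taken from the source, it is a poor fit for interactive input, which matches the caveat about stdin in this section.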
++
++However, clearly bStreams are suboptimal or unusable for certain kinds of
++streams (stdin) or certain usage patterns (a few spotty, or non-sequential
++reads from a slow stream.) For those situations, using bgets will be more
++appropriate.
++
++The semantics of bStreams allows practical construction of layerable data
++streams. What this means is that by writing a bNread compatible function on
++top of a bStream, one can construct a new bStream on top of it. This can be
++useful for writing multi-pass parsers that don't actually read the entire
++input more than once and don't require the use of intermediate storage.
++
++Aliasing
++--------
++
++Aliasing occurs when a function is given two parameters which point to data
++structures which overlap in the memory they occupy. While this does not
++disturb read only functions, for many libraries this can make functions that
++write to these memory locations malfunction. This is a common problem of the
++C standard library and especially the string functions in the C standard
++library.
++
++The C standard string library is entirely char by char oriented (as is
++bstring) which makes conforming implementations alias safe for some
++scenarios. However no actual detection of aliasing is typically performed,
++so it is easy to find cases where the aliasing will cause anomalous or
++undesirable behaviour (consider: strcat (p, p).) The C99 standard includes
++the "restrict" pointer modifier which allows the compiler to document and
++assume a no-alias condition on usage. However, only the most trivial cases
++can be caught (if at all) by the compiler at compile time, and thus there is
++no actual enforcement of non-aliasing.
++
++Bstrlib, by contrast, permits aliasing and is completely aliasing safe, in
++the C99 sense of aliasing. That is to say, under the assumption that
++pointers of incompatible types from distinct objects can never alias, bstrlib
++is completely aliasing safe. (In practice this means that the data buffer
++portion of any bstring and header of any bstring are assumed to never alias.)
++With the exception of the reference building macros, the library behaves as
++if all read-only parameters are first copied and replaced by temporary
++non-aliased parameters before any writing to any output bstring is performed
++(though actual copying is extremely rarely ever done.)
++
++Besides being a useful safety feature, bstring searching/comparison
++functions can improve to O(1) execution when aliasing is detected.
++
++Note that aliasing detection and handling code in Bstrlib is generally
++extremely cheap. There is almost never any appreciable performance penalty
++for using aliased parameters.
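One way an append can stay well defined when the source overlaps the destination is sketched below. This is an illustration written for this text, not the Bstrlib source, and append_alias_safe is a hypothetical name. It records the offset of the source inside the destination buffer before any reallocation, so the source can be found again if the buffer moves.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct tagbstring {
    int mlen;
    int slen;
    unsigned char *data;
};

/* Hypothetical sketch: append src to dst. src may be dst itself or a
   reference into dst's buffer. dst->data must come from malloc. */
int append_alias_safe(struct tagbstring *dst, const struct tagbstring *src) {
    ptrdiff_t off = -1;
    const unsigned char *from;
    uintptr_t base, s;
    int n, need;

    if (!dst || !src || !dst->data || !src->data) return -1;
    if (dst->slen < 0 || dst->mlen < dst->slen || src->slen < 0) return -1;
    n = src->slen;
    if (n > INT_MAX - dst->slen - 1) return -1;
    need = dst->slen + n + 1;

    /* Aliasing check: does src start inside dst's buffer? */
    base = (uintptr_t) dst->data;
    s = (uintptr_t) src->data;
    if (s >= base && s - base < (uintptr_t) dst->mlen)
        off = (ptrdiff_t) (s - base);

    if (need > dst->mlen) {
        unsigned char *p = realloc(dst->data, (size_t) need);
        if (p == NULL) return -1;   /* dst is left unchanged on failure */
        dst->data = p;
        dst->mlen = need;
    }

    /* After a realloc the old src->data may dangle, so use the offset. */
    from = (off >= 0) ? dst->data + off : src->data;
    memmove(dst->data + dst->slen, from, (size_t) n);
    dst->slen += n;
    dst->data[dst->slen] = '\0';
    return 0;
}
```

The check costs two integer comparisons, which is in line with the remark above that alias handling is cheap.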
++
++Reentrancy
++----------
++
++Nearly every function in Bstrlib is a leaf function, and is completely
++reentrant with the exception of writing to common bstrings. The split
++functions which use a callback mechanism require only that the source string
++not be destroyed by the callback function unless the callback function returns
++with an error status (note that Bstrlib functions which return an error do
++not modify the string in any way.) The string can in fact be modified by the
++callback and the behaviour is deterministic. See the documentation of the
++various split functions for more details.
++
++Undefined scenarios
++-------------------
++
++One of the basic important premises for Bstrlib is not to increase the
++propagation of undefined situations from parameters that are otherwise legal
++in and of themselves. In particular, except for extremely marginal cases, usages
++of bstrings that use the bstring library functions alone cannot lead to any
++undefined action. But due to C/C++ language and library limitations, there
++is no way to define a non-trivial library that is completely without
++undefined operations. All such possible undefined operations are described
++below:
++
++1) bstrings or struct tagbstrings that are not explicitly initialized cannot
++ be passed as a parameter to any bstring function.
++2) The members of the NULL bstring cannot be accessed directly. (Though all
++ APIs and macros detect the NULL bstring.)
++3) A bstring whose data member has not been obtained from a malloc or
++ compatible call and which is write accessible passed as a writable
++ parameter will lead to undefined results. (i.e., do not writeAllow any
++ constructed bstrings unless the data portion has been obtained from the
++ heap.)
++4) If the headers of two strings alias but are not identical (which can only
++ happen via a defective manual construction), then passing them to a
++ bstring function in which one is writable is not defined.
++5) If the mlen member is larger than the actual accessible length of the data
++ member for a writable bstring, or if the slen member is larger than the
++ readable length of the data member for a readable bstring, then the
++ corresponding bstring operations are undefined.
++6) Any bstring definition whose header or accessible data portion has been
++ assigned to inaccessible or otherwise illegal memory clearly cannot be
++ acted upon by the bstring library in any way.
++7) Destroying the source of an incremental split from within the callback
++ and not returning with a negative value (indicating that it should abort)
++ will lead to undefined behaviour. (Though *modifying* or adjusting the
++ state of the source data, even if those modifications fail within the
++ bstrlib API, has well defined behavior.)
++8) Modifying a bstring which is write protected by direct access has
++ undefined behavior.
++
++While this may seem like a long list, with the exception of invalid uses of
++the writeAllow macro, and source destruction during an iterative split
++without an accompanying abort, no usage of the bstring API alone can cause
++any undefined scenario to occur. I.e., the policy of restricting usage of
++bstrings to the bstring API can significantly reduce the risk of runtime
++errors (in practice it should eliminate them) related to string manipulation
++due to undefined action.
++
++C++ wrapper
++-----------
++
++A C++ wrapper has been created to enable bstring functionality for C++ in the
++most natural (for C++ programmers) way possible. The mandate for the C++
++wrapper is different from the base C bstring library. Since the C++ language
++has far more abstracting capabilities, the CBString structure is considered
++fully abstracted -- i.e., hand generated CBStrings are not supported (though
++conversion from a struct tagbstring is allowed) and all detectable errors are
++manifest as thrown exceptions.
++
++- The C++ class definitions are all under the namespace Bstrlib. bstrwrap.h
++ enables this namespace (with a using namespace Bstrlib; directive at the
++ end) unless the macro BSTRLIB_DONT_ASSUME_NAMESPACE has been defined before
++ it is included.
++
++- Erroneous accesses result in an exception being thrown. The exception
++ parameter is of type "struct CBStringException" which is derived from
++ std::exception if STL is used. A verbose description of the error message
++ can be obtained from the what() method.
++
++- CBString is a C++ structure derived from a struct tagbstring. An address
++ of a CBString cast to a bstring must not be passed to bdestroy. The bstring
++ C API has been made C++ safe and can be used directly in a C++ project.
++
++- It includes constructors which can take a char, '\0' terminated char
++ buffer, tagbstring, (char, repeat-value), a length delimited buffer or a
++ CBStringList to initialize it.
++
++- Concatenation is performed with the + and += operators. Comparisons are
++ done with the ==, !=, <, >, <= and >= operators. Note that == and != use
++ the biseq call, while <, >, <= and >= use bstrcmp.
++
++- CBStrings can be directly cast to const character buffers.
++
++- CBStrings can be directly cast to double, float, int or unsigned int so
++ long as the CBStrings are decimal representations of those types (otherwise
++ an exception will be thrown). Converting the other way should be done with
++ the format(a) method(s).
++
++- CBString contains the length, character and [] accessor methods. The
++ character and [] accessors are aliases of each other. If the bounds for
++ the string are exceeded, an exception is thrown. To avoid the overhead for
++ this check, first cast the CBString to a (const char *) and use [] to
++ dereference the array as normal. Note that the character and [] accessor
++ methods allow both reading and writing of individual characters.
++
++- The methods: format, formata, find, reversefind, findcaseless,
++ reversefindcaseless, midstr, insert, insertchrs, replace, findreplace,
++ findreplacecaseless, remove, findchr, nfindchr, alloc, toupper, tolower,
++ gets, read are analogous to the functions that can be found in the C API.
++
++- The caselessEqual and caselessCmp methods are analogous to biseqcaseless
++ and bstricmp functions respectively.
++
++- Note that just like the bformat function, the format and formata methods do
++ not automatically cast CBStrings into char * strings for "%s"-type
++ substitutions:
++
++ CBString w("world");
++ CBString h("Hello");
++ CBString hw;
++
++ /* The casts are necessary */
++ hw.format ("%s, %s", (const char *)h, (const char *)w);
++
++- The methods trunc and repeat have been added instead of using pattern.
++
++- ltrim, rtrim and trim methods have been added. These remove characters
++ from a given character string set (defaulting to the whitespace characters)
++ from either the left, right or both ends of the CBString, respectively.
++
++- The method setsubstr is also analogous in functionality to bsetstr, except
++ that it cannot be passed NULL. Instead the method fill and the fill-style
++ constructor have been supplied to enable this functionality.
++
++- The writeprotect(), writeallow() and iswriteprotected() methods are
++ analogous to the bwriteprotect(), bwriteallow() and biswriteprotected()
++ macros in the C API. Write protection semantics in CBString are stronger
++ than with the C API in that indexed character assignment is checked for
++ write protection. However, unlike with the C API, a write protected
++ CBString can be destroyed by the destructor.
++
++- CBStream is a C++ structure which wraps a struct bStream (it is not derived
++ from it, since destruction is slightly different). It is constructed by
++ passing in a bNread function pointer and a stream parameter cast to void *.
++ This structure includes methods for detecting eof, setting the buffer
++ length, reading the whole stream or reading entries line by line or block
++ by block, an unread function, and a peek function.
++
++- If STL is available, the CBStringList structure is derived from a vector of
++ CBString with various split methods. The split method has been overloaded
++ to accept either a character or CBString as the second parameter (when the
++ split parameter is a CBString any character in that CBString is used as a
++ separator). The splitstr method takes a CBString as a substring separator.
++ Joins can be performed via a CBString constructor which takes a
++ CBStringList as a parameter, or just using the CBString::join() method.
++
++- If there is proper support for std::iostreams, then the >> and << operators
++ and the getline() function have been added (with semantics the same as
++ those for std::string).
++
++Multithreading
++--------------
++
++A mutable bstring is kind of analogous to a small (two entry) linked list
++allocated by malloc, with all aliasing completely under programmer control.
++I.e., manipulation of one bstring will never affect any other distinct
++bstring unless explicitly constructed to do so by the programmer via hand
++construction or via building a reference. Bstrlib also does not use any
++static or global storage, so there are no hidden unremovable race conditions.
++Bstrings are also clearly not inherently thread local. So just like
++char *'s, bstrings can be passed around from thread to thread and shared and
++so on, so long as modifications to a bstring correspond to some kind of
++exclusive access lock as should be expected (or if the bstring is read-only,
++which can be enforced by bstring write protection) for any sort of shared
++object in a multithreaded environment.
++
++Bsafe module
++------------
++
++For convenience, a bsafe module has been included. The idea is that if this
++module is included, inadvertent usage of the most dangerous C functions will
++be overridden and lead to an immediate run time abort. Of course, it should
++be emphasized that usage of this module is completely optional. The
++intention is essentially to provide an option for creating project safety
++rules which can be enforced mechanically rather than socially. This is
++useful for larger, or open development projects where it is more difficult to
++enforce social rules or "coding conventions".
++
++Problems not solved
++-------------------
++
++Bstrlib is written for the C and C++ languages, which have inherent weaknesses
++that cannot be easily solved:
++
++1. Memory leaks: Forgetting to call bdestroy on a bstring that is about to be
++ unreferenced leaks memory, just as forgetting to call free on a heap buffer
++ that is about to be unreferenced does. Bstrlib itself is leak free, though.
++2. Read before write usage: In C, declaring an auto bstring does not
++ automatically fill it with legal/valid contents. This problem has been
++ somewhat mitigated in C++. (The bstrDeclare and bstrFree macros from
++ bstraux can be used to help mitigate this problem.)
++
++Other problems not addressed:
++
++3. Built-in mutex usage to automatically avoid all bstring internal race
++ conditions in multitasking environments: The problem with trying to
++ implement such things at this low a level is that it is typically more
++ efficient to use locks in higher level primitives. There is also no
++ platform independent way to implement locks or mutexes.
++4. Unicode/widecharacter support.
++
++Note that except for spotty support of wide characters, the default C
++standard library does not address any of these problems either.
++
++Configurable compilation options
++--------------------------------
++
++All configuration options are meant solely for the purpose of compiler
++compatibility. Configuration options are not meant to change the semantics
++or capabilities of the library, except where it is unavoidable.
++
++Since some C++ compilers don't include the Standard Template Library and some
++have the options of disabling exception handling, a number of macros can be
++used to conditionally compile support for each of these:
++
++BSTRLIB_CAN_USE_STL
++
++ - defining this will enable the use of the Standard Template Library.
++ Defining BSTRLIB_CAN_USE_STL overrides the BSTRLIB_CANNOT_USE_STL macro.
++
++BSTRLIB_CANNOT_USE_STL
++
++ - defining this will disable the use of the Standard Template Library.
++ Defining BSTRLIB_CAN_USE_STL overrides the BSTRLIB_CANNOT_USE_STL macro.
++
++BSTRLIB_CAN_USE_IOSTREAM
++
++ - defining this will enable the use of streams from class std. Defining
++ BSTRLIB_CAN_USE_IOSTREAM overrides the BSTRLIB_CANNOT_USE_IOSTREAM macro.
++
++BSTRLIB_CANNOT_USE_IOSTREAM
++
++ - defining this will disable the use of streams from class std. Defining
++ BSTRLIB_CAN_USE_IOSTREAM overrides the BSTRLIB_CANNOT_USE_IOSTREAM macro.
++
++BSTRLIB_THROWS_EXCEPTIONS
++
++ - defining this will enable the exception handling within bstring.
++ Defining BSTRLIB_THROWS_EXCEPTIONS overrides the
++ BSTRLIB_DOESNT_THROW_EXCEPTIONS macro.
++
++BSTRLIB_DOESNT_THROW_EXCEPTIONS
++
++ - defining this will disable the exception handling within bstring.
++ Defining BSTRLIB_THROWS_EXCEPTIONS overrides the
++ BSTRLIB_DOESNT_THROW_EXCEPTIONS macro.
++
++Note that these macros must be defined consistently throughout all modules
++that use CBStrings including bstrwrap.cpp.
++
++Some older C compilers do not support functions such as vsnprintf. This is
++handled by the following macro variables:
++
++BSTRLIB_NOVSNP
++
++ - defining this indicates that the compiler does not support vsnprintf.
++ This will cause bformat and bformata to not be declared. Note that
++ for some compilers, such as Turbo C, this is set automatically.
++ Defining BSTRLIB_NOVSNP overrides the BSTRLIB_VSNP_OK macro.
++
++BSTRLIB_VSNP_OK
++
++ - defining this will disable the autodetection of compilers that do not
++ support vsnprintf.
++ Defining BSTRLIB_NOVSNP overrides the BSTRLIB_VSNP_OK macro.
++
++Semantic compilation options
++----------------------------
++
++Bstrlib comes with very few compilation options for changing the semantics of
++the library. These are described below.
++
++BSTRLIB_DONT_ASSUME_NAMESPACE
++
++ - Defining this before including bstrwrap.h will disable the automatic
++ enabling of the Bstrlib namespace for the C++ declarations.
++
++BSTRLIB_DONT_USE_VIRTUAL_DESTRUCTOR
++
++ - Defining this will make the CBString destructor non-virtual.
++
++BSTRLIB_MEMORY_DEBUG
++
++ - Defining this will cause the bstrlib modules bstrlib.c and bstrwrap.cpp
++ to invoke a #include "memdbg.h". memdbg.h has to be supplied by the user.
++
++Note that these macros must be defined consistently throughout all modules
++that use bstrings or CBStrings including bstrlib.c, bstraux.c and
++bstrwrap.cpp.
++
++===============================================================================
++
++Files
++-----
++
++bstrlib.c - C implementation of bstring functions.
++bstrlib.h - C header file for bstring functions.
++bstraux.c - C example that implements trivial additional functions.
++bstraux.h - C header for bstraux.c
++bstest.c - C unit/regression test for bstrlib.c
++
++bstrwrap.cpp - C++ implementation of CBString.
++bstrwrap.h - C++ header file for CBString.
++test.cpp - C++ unit/regression test for bstrwrap.cpp
++
++bsafe.c - C runtime stubs to abort usage of unsafe C functions.
++bsafe.h - C header file for bsafe.c functions.
++
++C projects need only include bstrlib.h and compile/link bstrlib.c to use the
++bstring library. C++ projects need to additionally include bstrwrap.h and
++compile/link bstrwrap.cpp. For both, there may be a need to make choices
++about feature configuration as described in the "Configurable compilation
++options" in the section above.
++
++Other files that are included in this archive are:
++
++license.txt - The BSD license for Bstrlib
++gpl.txt - The GPL version 2
++security.txt - A security statement useful for auditing Bstrlib
++porting.txt - A guide to porting Bstrlib
++bstrlib.txt - This file
++
++===============================================================================
++
++The functions
++-------------
++
++ extern bstring bfromcstr (const char * str);
++
++ Take a standard C library style '\0' terminated char buffer and generate
++ a bstring with the same contents as the char buffer. If an error occurs
++ NULL is returned.
++
++ So for example:
++
++ bstring b = bfromcstr ("Hello");
++ if (!b) {
++ fprintf (stderr, "Out of memory");
++ } else {
++ puts ((char *) b->data);
++ }
++
++ ..........................................................................
++
++ extern bstring bfromcstralloc (int mlen, const char * str);
++
++ Create a bstring which contains the contents of the '\0' terminated
++ char * buffer str. The memory buffer backing the bstring is at least
++ mlen characters in length. If an error occurs NULL is returned.
++
++ So for example:
++
++ bstring b = bfromcstralloc (64, someCstr);
++ if (b) b->data[63] = 'x';
++
++ The idea is that this will set the 64th character of b to 'x' if it is at
++ least 64 characters long otherwise do nothing. And we know this is well
++ defined so long as b was successfully created, since it will have been
++ allocated with at least 64 characters.
++
++ ..........................................................................
++
++ extern bstring blk2bstr (const void * blk, int len);
++
++ Create a bstring whose contents are described by the contiguous buffer
++ pointed to by blk with a length of len bytes. Note that this function
++ creates a copy of the data in blk, rather than simply referencing it.
++ Compare with the blk2tbstr macro. If an error occurs NULL is returned.
++
++ ..........................................................................
++
++ extern char * bstr2cstr (const_bstring s, char z);
++
++ Create a '\0' terminated char buffer which contains the contents of the
++ bstring s, except that any contained '\0' characters are converted to the
++ character in z. This returned value should be freed with bcstrfree(), by
++ the caller. If an error occurs NULL is returned.
++
++ ..........................................................................
++
++ extern int bcstrfree (char * s);
++
++ Frees a C-string generated by bstr2cstr (). This is normally unnecessary
++ since it just wraps a call to free (), however, if malloc () and free ()
++ have been redefined as macros within the bstrlib module (via macros in
++ the memdbg.h backdoor) with some difference in behaviour from the std
++ library functions, then this allows a correct way of freeing the memory
++ that allows higher level code to be independent from these macro
++ redefinitions.
++
++ ..........................................................................
++
++ extern bstring bstrcpy (const_bstring b1);
++
++ Make a copy of the passed in bstring. The copied bstring is returned if
++ there is no error, otherwise NULL is returned.
++
++ ..........................................................................
++
++ extern int bassign (bstring a, const_bstring b);
++
++ Overwrite the bstring a with the contents of bstring b. Note that the
++ bstring a must be a well defined and writable bstring. If an error
++ occurs BSTR_ERR is returned and a is not overwritten.
++
++ ..........................................................................
++
++ int bassigncstr (bstring a, const char * str);
++
++ Overwrite the bstring a with the contents of the char * string str. Note that
++ the bstring a must be a well defined and writable bstring. If an error
++ occurs BSTR_ERR is returned and a may be partially overwritten.
++
++ ..........................................................................
++
++ int bassignblk (bstring a, const void * s, int len);
++
++ Overwrite the bstring a with the contents of the block (s, len). Note that
++ the bstring a must be a well defined and writable bstring. If an error
++ occurs BSTR_ERR is returned and a is not overwritten.
++
++ ..........................................................................
++
++ extern int bassignmidstr (bstring a, const_bstring b, int left, int len);
++
++ Overwrite the bstring a with the middle of contents of bstring b
++ starting from position left and running for a length len. left and
++ len are clamped to the ends of b as with the function bmidstr. Note that
++ the bstring a must be a well defined and writable bstring. If an error
++ occurs BSTR_ERR is returned and a is not overwritten.
++
++ ..........................................................................
++
++ extern bstring bmidstr (const_bstring b, int left, int len);
++
++ Create a bstring which is the substring of b starting from position left
++ and running for a length len (clamped by the end of the bstring b.) If
++ there was no error, the value of this constructed bstring is returned
++ otherwise NULL is returned.
++
++ ..........................................................................
++
++ extern int bdelete (bstring s1, int pos, int len);
++
++ Removes characters from pos to pos+len-1 and shifts the tail of the
++ bstring starting from pos+len to pos. len must be positive for this call
++ to have any effect. The section of the bstring described by (pos, len)
++ is clamped to the boundaries of the bstring s1. The value BSTR_OK is returned
++ if the operation is successful, otherwise BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int bconcat (bstring b0, const_bstring b1);
++
++ Concatenate the bstring b1 to the end of bstring b0. The value BSTR_OK
++ is returned if the operation is successful, otherwise BSTR_ERR is
++ returned.
++
++ ..........................................................................
++
++ extern int bconchar (bstring b, char c);
++
++ Concatenate the character c to the end of bstring b. The value BSTR_OK
++ is returned if the operation is successful, otherwise BSTR_ERR is
++ returned.
++
++ ..........................................................................
++
++ extern int bcatcstr (bstring b, const char * s);
++
++ Concatenate the char * string s to the end of bstring b. The value
++ BSTR_OK is returned if the operation is successful, otherwise BSTR_ERR is
++ returned.
++
++ ..........................................................................
++
++ extern int bcatblk (bstring b, const void * s, int len);
++
++ Concatenate a fixed length buffer (s, len) to the end of bstring b. The
++ value BSTR_OK is returned if the operation is successful, otherwise
++ BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int biseq (const_bstring b0, const_bstring b1);
++
++ Compare the bstring b0 and b1 for equality. If the bstrings differ, 0
++ is returned, if the bstrings are the same, 1 is returned, if there is an
++ error, -1 is returned. If the lengths of the bstrings are different, this
++ function has O(1) complexity. Contained '\0' characters are not treated
++ as a termination character.
++
++ Note that the semantics of biseq are not completely compatible with
++ bstrcmp because of its different treatment of the '\0' character.
++
++ ..........................................................................
++
++ extern int bisstemeqblk (const_bstring b, const void * blk, int len);
++
++ Compare the beginning of bstring b with a block of memory of length len for
++ equality. If the beginning of b differs from the memory block (or if b
++ is too short), 0 is returned, if they are the same, 1 is returned, and
++ if there is an error, -1 is returned.
++
++ ..........................................................................
++
++ extern int biseqcaseless (const_bstring b0, const_bstring b1);
++
++ Compare two bstrings for equality without differentiating between case.
++ If the bstrings differ other than in case, 0 is returned, if the bstrings
++ are the same, 1 is returned, if there is an error, -1 is returned. If
++ the lengths of the bstrings are different, this function is O(1). '\0'
++ termination characters are not treated in any special way.
++
++ ..........................................................................
++
++ extern int bisstemeqcaselessblk (const_bstring b0, const void * blk, int len);
++
++ Compare beginning of bstring b0 with a block of memory of length len
++ without differentiating between case for equality. If the beginning of b0
++ differs from the memory block other than in case (or if b0 is too short),
++ 0 is returned, if the bstrings are the same, 1 is returned, if there is an
++ error, -1 is returned.
++
++ ..........................................................................
++
++ extern int biseqcstr (const_bstring b, const char *s);
++
++ Compare the bstring b and char * string s. The C string s must be '\0'
++ terminated at exactly the length of the bstring b, and the contents
++ between the two must be identical with the bstring b with no '\0'
++ characters for the two contents to be considered equal. This is
++ equivalent to the condition that their current contents will always be
++ equal when comparing them in the same format after converting one or the
++ other. If they are equal 1 is returned, if they are unequal 0 is
++ returned and if there is a detectable error BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int biseqcstrcaseless (const_bstring b, const char *s);
++
++ Compare the bstring b and char * string s. The C string s must be '\0'
++ terminated at exactly the length of the bstring b, and the contents
++ between the two must be identical except for case with the bstring b with
++ no '\0' characters for the two contents to be considered equal. This is
++ equivalent to the condition that their current contents will always be
++ equal ignoring case when comparing them in the same format after
++ converting one or the other. If they are equal, except for case, 1 is
++ returned, if they are unequal regardless of case 0 is returned and if
++ there is a detectable error BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int bstrcmp (const_bstring b0, const_bstring b1);
++
++ Compare the bstrings b0 and b1 for ordering. If there is an error,
++ SHRT_MIN is returned, otherwise a value less than or greater than zero,
++ indicating that the bstring pointed to by b0 is lexicographically less
++ than or greater than the bstring pointed to by b1 is returned. If the
++ bstring lengths are unequal but the characters up until the length of the
++ shorter are equal then a value less than, or greater than zero,
++ indicating that the bstring pointed to by b0 is shorter or longer than the
++ bstring pointed to by b1 is returned. 0 is returned if and only if the
++ two bstrings are the same. If the lengths of the bstrings are different,
++ this function is O(n). Like its standard C library counterpart, the
++ comparison does not proceed past any '\0' termination characters
++ encountered.
++
++ The seemingly odd error return value merely provides slightly more
++ granularity than the undefined situation given in the C library function
++ strcmp. The function otherwise behaves very much like strcmp().
++
++ Note that the semantics of bstrcmp are not completely compatible with
++ biseq because of its different treatment of the '\0' termination
++ character.
++
++ ..........................................................................
++
++ extern int bstrncmp (const_bstring b0, const_bstring b1, int n);
++
++ Compare the bstrings b0 and b1 for ordering for at most n characters. If
++ there is an error, SHRT_MIN is returned, otherwise a value is returned as
++ if b0 and b1 were first truncated to at most n characters then bstrcmp
++ was called with these new bstrings as parameters. If the lengths of the
++ bstrings are different, this function is O(n). Like its standard C
++ library counterpart, the comparison does not proceed past any '\0'
++ termination characters encountered.
++
++ The seemingly odd error return value merely provides slightly more
++ granularity than the undefined situation given in the C library function
++ strncmp. The function otherwise behaves very much like strncmp().
++
++ ..........................................................................
++
++ extern int bstricmp (const_bstring b0, const_bstring b1);
++
++ Compare two bstrings without differentiating between case. The return
++ value is the difference of the values of the characters where the two
++ bstrings first differ, otherwise 0 is returned indicating that the
++ bstrings are equal. If the lengths are different, then a difference from
++ 0 is given, but if the first extra character is '\0', then it is taken to
++ be the value UCHAR_MAX+1.
++
++ ..........................................................................
++
++ extern int bstrnicmp (const_bstring b0, const_bstring b1, int n);
++
++ Compare two bstrings without differentiating between case for at most n
++ characters. If the position where the two bstrings first differ is
++ before the nth position, the return value is the difference of the values
++ of the characters, otherwise 0 is returned. If the lengths are different
++ and less than n characters, then a difference from 0 is given, but if the
++ first extra character is '\0', then it is taken to be the value
++ UCHAR_MAX+1.
++
++ ..........................................................................
++
++ extern int bdestroy (bstring b);
++
++ Deallocate the bstring passed. Passing NULL in as a parameter will have
++ no effect. Note that both the header and the data portion of the bstring
++ will be freed. No other bstring function which modifies one of its
++ parameters will free or reallocate the header. Because of this, in
++ general, bdestroy cannot be called on any declared struct tagbstring even
++ if it is not write protected. A bstring which is write protected cannot
++ be destroyed via the bdestroy call. Any attempt to do so will result in
++ no action taken, and BSTR_ERR will be returned.
++
++ Note to C++ users: Passing in a CBString cast to a bstring will lead to
++ undefined behavior (free will be called on the header, rather than the
++ CBString destructor.) Instead just use the ordinary C++ language
++ facilities to dealloc a CBString.
++
++ ..........................................................................
++
++ extern int binstr (const_bstring s1, int pos, const_bstring s2);
++
++ Search for the bstring s2 in s1 starting at position pos and looking in a
++ forward (increasing) direction. If it is found then it returns with the
++ first position after pos where it is found, otherwise it returns BSTR_ERR.
++ The algorithm used is brute force; O(m*n).
++
++ ..........................................................................
++
++ extern int binstrr (const_bstring s1, int pos, const_bstring s2);
++
++ Search for the bstring s2 in s1 starting at position pos and looking in a
++ backward (decreasing) direction. If it is found then it returns with the
++ first position at or before pos where it is found, otherwise it returns BSTR_ERR.
++ Note that the current position at pos is tested as well -- so to be
++ disjoint from a previous forward search it is recommended that the
++ position be backed up (decremented) by one position. The algorithm used
++ is brute force; O(m*n).
++
++ ..........................................................................
++
++ extern int binstrcaseless (const_bstring s1, int pos, const_bstring s2);
++
++ Search for the bstring s2 in s1 starting at position pos and looking in a
++ forward (increasing) direction but without regard to case. If it is
++ found then it returns with the first position after pos where it is
++ found, otherwise it returns BSTR_ERR. The algorithm used is brute force;
++ O(m*n).
++
++ ..........................................................................
++
++ extern int binstrrcaseless (const_bstring s1, int pos, const_bstring s2);
++
++ Search for the bstring s2 in s1 starting at position pos and looking in a
++ backward (decreasing) direction but without regard to case. If it is
++ found then it returns with the first position at or before pos where it is
++ found, otherwise it returns BSTR_ERR. Note that the current position at pos
++ is tested as well -- so to be disjoint from a previous forward search it
++ is recommended that the position be backed up (decremented) by one
++ position. The algorithm used is brute force; O(m*n).
++
++ ..........................................................................
++
++ extern int binchr (const_bstring b0, int pos, const_bstring b1);
++
++ Search for the first position in b0 starting from pos or after, in which
++ one of the characters in b1 is found. This function has an execution
++ time of O(b0->slen + b1->slen). If such a position does not exist in b0,
++ then BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int binchrr (const_bstring b0, int pos, const_bstring b1);
++
++ Search for the last position in b0 no greater than pos, in which one of
++ the characters in b1 is found. This function has an execution time
++ of O(b0->slen + b1->slen). If such a position does not exist in b0,
++ then BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int bninchr (const_bstring b0, int pos, const_bstring b1);
++
++ Search for the first position in b0 starting from pos or after, in which
++ none of the characters in b1 is found and return it. This function has
++ an execution time of O(b0->slen + b1->slen). If such a position does
++ not exist in b0, then BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int bninchrr (const_bstring b0, int pos, const_bstring b1);
++
++ Search for the last position in b0 no greater than pos, in which none of
++ the characters in b1 is found and return it. This function has an
++ execution time of O(b0->slen + b1->slen). If such a position does not
++ exist in b0, then BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int bstrchr (const_bstring b, int c);
++
++ Search for the character c in the bstring b forwards from the start of
++ the bstring. Returns the position of the found character or BSTR_ERR if
++ it is not found.
++
++ NOTE: This has been implemented as a macro on top of bstrchrp ().
++
++ ..........................................................................
++
++ extern int bstrrchr (const_bstring b, int c);
++
++ Search for the character c in the bstring b backwards from the end of the
++ bstring. Returns the position of the found character or BSTR_ERR if it is
++ not found.
++
++ NOTE: This has been implemented as a macro on top of bstrrchrp ().
++
++ ..........................................................................
++
++ extern int bstrchrp (const_bstring b, int c, int pos);
++
++ Search for the character c in b forwards from the position pos
++ (inclusive). Returns the position of the found character or BSTR_ERR if
++ it is not found.
++
++ ..........................................................................
++
++ extern int bstrrchrp (const_bstring b, int c, int pos);
++
++ Search for the character c in b backwards from the position pos in bstring
++ (inclusive). Returns the position of the found character or BSTR_ERR if
++ it is not found.
++
++ ..........................................................................
++
++ extern int bsetstr (bstring b0, int pos, const_bstring b1, unsigned char fill);
++
++ Overwrite the bstring b0 starting at position pos with the bstring b1. If
++ the position pos is past the end of b0, then the character "fill" is
++ appended as necessary to make up the gap between the end of b0 and pos.
++ If b1 is NULL, it behaves as if it were a 0-length bstring. The value
++ BSTR_OK is returned if the operation is successful, otherwise BSTR_ERR is
++ returned.
++
++ ..........................................................................
++
++ extern int binsert (bstring s1, int pos, const_bstring s2, unsigned char fill);
++
++ Inserts the bstring s2 into s1 at position pos. If the position pos is
++ past the end of s1, then the character "fill" is appended as necessary to
++ make up the gap between the end of s1 and pos. The value BSTR_OK is
++ returned if the operation is successful, otherwise BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int binsertch (bstring s1, int pos, int len, unsigned char fill);
++
++ Inserts the character fill repeatedly into s1 at position pos for a
++ length len. If the position pos is past the end of s1, then the
++ character "fill" is appended as necessary to make up the gap between the
++ end of s1 and the position pos + len (exclusive). The value BSTR_OK is
++ returned if the operation is successful, otherwise BSTR_ERR is returned.
++
++ ..........................................................................
++
++ extern int breplace (bstring b1, int pos, int len, const_bstring b2,
++ unsigned char fill);
++
++ Replace a section of a bstring from pos for a length len with the bstring
++ b2. If the position pos is past the end of b1 then the character "fill"
++ is appended as necessary to make up the gap between the end of b1 and
++ pos.
++
++ ..........................................................................
++
++ extern int bfindreplace (bstring b, const_bstring find,
++ const_bstring replace, int position);
++
++ Replace all occurrences of the find substring with a replace bstring
++ after a given position in the bstring b. The find bstring must have a
++ length > 0 otherwise BSTR_ERR is returned. This function does not
++ perform recursive per character replacement; that is to say successive
++ searches resume at the position after the last replace.
++
++ So for example:
++
++ bfindreplace (a0 = bfromcstr("aabaAb"), a1 = bfromcstr("a"),
++ a2 = bfromcstr("aa"), 0);
++
++ Should result in changing a0 to "aaaabaaAb".
++
++ This function performs exactly (b->slen - position) bstring comparisons,
++ and data movement is bounded above by character volume equivalent to size
++ of the output bstring.
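++
++ The non-recursive replacement rule can be sketched on plain C strings.
++ This is not Bstrlib code: replace_all is a hypothetical helper written
++ for illustration, and it assumes that matches at or after pos are
++ replaced and that an empty find string is an error.
++
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper on plain C strings; NOT part of bstrlib.
   Replaces every occurrence of find in src at or after pos with repl and
   returns a newly malloc'd result, or NULL on error (including an empty
   find string). The caller frees the result. */
char *replace_all (const char *src, const char *find, const char *repl,
                   size_t pos) {
    size_t slen = strlen (src), flen = strlen (find), rlen = strlen (repl);
    size_t count = 0, i, o;
    char *out;

    if (flen == 0 || pos > slen) return NULL;

    /* First pass: count non-overlapping matches to size the output. */
    for (i = pos; i + flen <= slen; ) {
        if (memcmp (src + i, find, flen) == 0) { count++; i += flen; }
        else i++;
    }

    out = malloc (slen - count * flen + count * rlen + 1);
    if (out == NULL) return NULL;

    /* Second pass: copy, resuming after each replaced span so that the
       inserted text is never rescanned. */
    memcpy (out, src, pos);
    o = pos;
    for (i = pos; i < slen; ) {
        if (i + flen <= slen && memcmp (src + i, find, flen) == 0) {
            memcpy (out + o, repl, rlen);
            o += rlen;
            i += flen;
        } else {
            out[o++] = src[i++];
        }
    }
    out[o] = '\0';
    return out;
}
```
++
++ Calling replace_all ("aabaAb", "a", "aa", 0) yields "aaaabaaAb", matching
++ the example above: each original "a" is doubled once, and the doubled
++ text is not searched again.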
++
++ ..........................................................................
++
++ extern int bfindreplacecaseless (bstring b, const_bstring find,
++ const_bstring replace, int position);
++
++ Replace all occurrences of the find substring, ignoring case, with a
++ replace bstring after a given position in the bstring b. The find bstring
++ must have a length > 0 otherwise BSTR_ERR is returned. This function
++ does not perform recursive per character replacement; that is to say
++ successive searches resume at the position after the last replace.
++
++ So for example:
++
++ bfindreplacecaseless (a0 = bfromcstr("AAbaAb"), a1 = bfromcstr("a"),
++ a2 = bfromcstr("aa"), 0);
++
++ Should result in changing a0 to "aaaabaaaab".
++
++ This function performs exactly (b->slen - position) bstring comparisons,
++ and data movement is bounded above by character volume equivalent to size
++ of the output bstring.
++
++ ..........................................................................
++
++ extern int balloc (bstring b, int length);
++
++ Increase the allocated memory backing the data buffer for the bstring b
++ to a length of at least length. If the memory backing the bstring b is
++ already large enough, no action is performed. This has no effect on the
++ bstring b that is visible to the bstring API. Usually this function will
++ only be used when a minimum buffer size is required coupled with a direct
++ access to the ->data member of the bstring structure.
++
++ Be warned that like any other bstring function, the bstring must be well
++ defined upon entry to this function. I.e., doing something like:
++
++ b->slen *= 2; /* ?? Most likely incorrect */
++ balloc (b, b->slen);
++
++ is invalid, and should be implemented as:
++
++ int t;
++ if (BSTR_OK == balloc (b, t = (b->slen * 2))) b->slen = t;
++
++ This function will return with BSTR_ERR if b is not detected as a valid
++ bstring or length is not greater than 0, otherwise BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int ballocmin (bstring b, int length);
++
++ Change the amount of memory backing the bstring b to at least length.
++ This operation will never truncate the bstring data including the
++ extra terminating '\0' and thus will not decrease the length to less than
++ b->slen + 1. Note that repeated use of this function may cause
++ performance problems (realloc may be called on the bstring more than
++ the O(log(INT_MAX)) times). This function will return with BSTR_ERR if b
++ is not detected as a valid bstring or length is not greater than 0,
++ otherwise BSTR_OK is returned.
++
++ So for example:
++
++ if (BSTR_OK == ballocmin (b, 64)) b->data[63] = 'x';
++
++ The idea is that this will set the 64th character of b to 'x' if it is at
++ least 64 characters long otherwise do nothing. And we know this is well
++ defined so long as the ballocmin call was successful, since it will
++ ensure that b has been allocated with at least 64 characters.
++
++ ..........................................................................
++
++ int btrunc (bstring b, int n);
++
++ Truncate the bstring to at most n characters. This function will return
++ with BSTR_ERR if b is not detected as a valid bstring or n is less than
++ 0, otherwise BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int bpattern (bstring b, int len);
++
++ Replicate the starting bstring, b, end to end repeatedly until it
++ surpasses len characters, then chop the result to exactly len characters.
++ This function operates in-place. This function will return with BSTR_ERR
++ if b is NULL or of length 0, otherwise BSTR_OK is returned.
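++
++ The replicate-then-chop behaviour can be sketched on a plain char buffer.
++ This is not Bstrlib code: tile_pattern is a hypothetical helper, and it
++ assumes the caller has already provided room for len + 1 bytes (Bstrlib
++ manages the allocation itself).
++
```c
#include <stddef.h>

/* Hypothetical helper on a caller-supplied char buffer; NOT part of bstrlib.
   buf holds a pattern of plen bytes and has room for at least len + 1 bytes.
   Tiles the pattern end to end in place and cuts the result at exactly len
   bytes. Returns 0 on success, -1 if buf is NULL, the pattern is empty, or
   len is negative. */
int tile_pattern (char *buf, int plen, int len) {
    int i;
    if (buf == NULL || plen <= 0 || len < 0) return -1;
    for (i = plen; i < len; i++) {
        buf[i] = buf[i - plen];            /* copy from one period earlier */
    }
    buf[len] = '\0';                       /* chop (or truncate) to len */
    return 0;
}
```
++
++ Starting from the pattern "abc" with len equal to 7, the buffer ends up
++ holding "abcabca".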
++
++ ..........................................................................
++
++ extern int btoupper (bstring b);
++
++ Convert contents of bstring to upper case. This function will return with
++ BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int btolower (bstring b);
++
++ Convert contents of bstring to lower case. This function will return with
++ BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int bltrimws (bstring b);
++
++ Delete whitespace contiguous from the left end of the bstring. This
++ function will return with BSTR_ERR if b is NULL or of length 0, otherwise
++ BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int brtrimws (bstring b);
++
++ Delete whitespace contiguous from the right end of the bstring. This
++ function will return with BSTR_ERR if b is NULL or of length 0, otherwise
++ BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int btrimws (bstring b);
++
++ Delete whitespace contiguous from both ends of the bstring. This function
++ will return with BSTR_ERR if b is NULL or of length 0, otherwise BSTR_OK
++ is returned.
++
++ ..........................................................................
++
++ extern struct bstrList* bstrListCreate (void);
++
++ Create an empty struct bstrList. The struct bstrList output structure is
++ declared as follows:
++
++ struct bstrList {
++     int qty, mlen;
++     bstring * entry;
++ };
++
++ The entry field actually is an array with qty number entries. The mlen
++ record counts the maximum number of bstring's for which there is memory
++ in the entry record.
++
++ The Bstrlib API does *NOT* include a comprehensive set of functions for
++ full management of struct bstrList in an abstracted way. The reason for
++ this is because aliasing semantics of the list are best left to the user
++ of this function, and performance varies wildly depending on the
++ assumptions made. For a complete list-of-bstrings data type it is
++ recommended that the C++ std::vector<CBString> be used, since its
++ semantics and usage are more standard.
++
++ ..........................................................................
++
++ extern int bstrListDestroy (struct bstrList * sl);
++
++ Destroy a struct bstrList structure that was returned by the bsplit
++ function. Note that this will destroy each bstring in the ->entry array
++ as well. See bstrListCreate() above for structure of struct bstrList.
++
++ ..........................................................................
++
++ extern int bstrListAlloc (struct bstrList * sl, int msz);
++
++ Ensure that there is memory for at least msz number of entries for the
++ list.
++
++ ..........................................................................
++
++ extern int bstrListAllocMin (struct bstrList * sl, int msz);
++
++ Try to allocate the minimum amount of memory for the list to include at
++ least msz entries or sl->qty whichever is greater.
++
++ ..........................................................................
++
++ extern struct bstrList * bsplit (bstring str, unsigned char splitChar);
++
++ Create an array of sequential substrings from str divided by the
++ character splitChar. Successive occurrences of the splitChar will be
++ divided by empty bstring entries, following the semantics from the Python
++ programming language. To reclaim the memory from this output structure,
++ bstrListDestroy () should be called. See bstrListCreate() above for
++ structure of struct bstrList.
++
++ ..........................................................................
++
++ extern struct bstrList * bsplits (bstring str, const_bstring splitStr);
++
++ Create an array of sequential substrings from str divided by any
++ character contained in splitStr. An empty splitStr causes a single entry
++ bstrList containing a copy of str to be returned. See bstrListCreate()
++ above for structure of struct bstrList.
++
++ ..........................................................................
++
++ extern struct bstrList * bsplitstr (bstring str, const_bstring splitStr);
++
++ Create an array of sequential substrings from str divided by the entire
++ substring splitStr. An empty splitStr causes a single entry bstrList
++ containing a copy of str to be returned. See bstrListCreate() above for
++ structure of struct bstrList.
++
++ ..........................................................................
++
++ extern bstring bjoin (const struct bstrList * bl, const_bstring sep);
++
++ Join the entries of a bstrList into one bstring by sequentially
++ concatenating them with the sep bstring in between. If sep is NULL, it
++ is treated as if it were the empty bstring. Note that:
++
++ bjoin (l = bsplit (b, s->data[0]), s);
++
++ should result in a copy of b, if s->slen is 1. If there is an error NULL
++ is returned, otherwise a bstring with the correct result is returned.
++ See bstrListCreate() above for structure of struct bstrList.
++
++ ..........................................................................
++
++ extern int bsplitcb (const_bstring str, unsigned char splitChar, int pos,
++ int (* cb) (void * parm, int ofs, int len), void * parm);
++
++ Iterate the set of disjoint sequential substrings over str starting at
++ position pos divided by the character splitChar. The parm passed to
++ bsplitcb is passed on to cb. If the function cb returns a value < 0,
++ then further iterating is halted and this value is returned by bsplitcb.
++
++ Note: Non-destructive modification of str from within the cb function
++ while performing this split is well defined. bsplitcb behaves in
++ sequential lock step with calls to cb. I.e., after returning from a cb
++ that returns a non-negative integer, bsplitcb continues from the position
++ 1 character after the last detected split character and it will halt
++ immediately if the length of str falls below this point. However, if the
++ cb function destroys str, then it *must* return with a negative value,
++ otherwise bsplitcb will continue in an undefined manner.
++
++ This function is provided as an incremental alternative to bsplit that is
++ abortable and which does not impose additional memory allocation.
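++
++ The callback protocol can be sketched without Bstrlib. split_cb below is
++ a hypothetical stand-in that works on a (data, length) buffer and uses
++ the same callback shape, int (* cb) (void * parm, int ofs, int len);
++ struct field_counter and count_fields are likewise made up for the
++ example.
++
```c
#include <stddef.h>

/* Hypothetical stand-in on a (data, len) buffer; NOT the bstrlib function.
   Calls cb (parm, ofs, len) once per field of str[pos..slen) separated by
   splitChar. Stops early and returns cb's value if that value is negative;
   otherwise returns 0. No memory is allocated. */
int split_cb (const char *str, int slen, unsigned char splitChar, int pos,
              int (*cb) (void *parm, int ofs, int len), void *parm) {
    int i, ret;
    if (str == NULL || cb == NULL || pos < 0 || pos > slen) return -1;
    for (;;) {
        /* Scan to the next separator or to the end of the buffer. */
        for (i = pos; i < slen; i++) {
            if ((unsigned char) str[i] == splitChar) break;
        }
        ret = cb (parm, pos, i - pos);
        if (ret < 0) return ret;           /* callback asked to abort */
        if (i >= slen) return 0;           /* last field delivered */
        pos = i + 1;                       /* resume after the separator */
    }
}

/* Example callback: counts fields and aborts once a limit is reached. */
struct field_counter { int seen, limit; };

int count_fields (void *parm, int ofs, int len) {
    struct field_counter *fc = parm;
    (void) ofs; (void) len;
    if (fc->seen >= fc->limit) return -2;
    fc->seen++;
    return 0;
}
```
++
++ Splitting "a,b,,c" on ',' delivers four fields, the third of which is
++ empty. With a limit of 2, the callback returns -2 on the third field and
++ split_cb passes that value back to its caller.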
++
++ ..........................................................................
++
++ extern int bsplitscb (const_bstring str, const_bstring splitStr, int pos,
++ int (* cb) (void * parm, int ofs, int len), void * parm);
++
++ Iterate the set of disjoint sequential substrings over str starting at
++ position pos divided by any of the characters in splitStr. An empty
++ splitStr causes the whole str to be iterated once. The parm passed to
++ bsplitscb is passed on to cb. If the function cb returns a value < 0,
++ then further iterating is halted and this value is returned by bsplitscb.
++
++ Note: Non-destructive modification of str from within the cb function
++ while performing this split is well defined. bsplitscb behaves in
++ sequential lock step with calls to cb. I.e., after returning from a cb
++ that returns a non-negative integer, bsplitscb continues from the position
++ 1 character after the last detected split character and it will halt
++ immediately if the length of str falls below this point. However, if the
++ cb function destroys str, then it *must* return with a negative value,
++ otherwise bsplitscb will continue in an undefined manner.
++
++ This function is provided as an incremental alternative to bsplits that
++ is abortable and which does not impose additional memory allocation.
++
++ ..........................................................................
++
++ extern int bsplitstrcb (const_bstring str, const_bstring splitStr, int pos,
++ int (* cb) (void * parm, int ofs, int len), void * parm);
++
++ Iterate the set of disjoint sequential substrings over str starting at
++ position pos divided by the entire substring splitStr. An empty splitStr
++ causes each character of str to be iterated. The parm passed to
++ bsplitstrcb is passed on to cb. If the function cb returns a value < 0,
++ then further iterating is halted and this value is returned by bsplitstrcb.
++
++ Note: Non-destructive modification of str from within the cb function
++ while performing this split is well defined. bsplitstrcb behaves in
++ sequential lock step with calls to cb. I.e., after returning from a cb
++ that returns a non-negative integer, bsplitstrcb continues from the
++ position immediately after the last detected occurrence of splitStr and it
++ will halt immediately if the length of str falls below this point. However,
++ if the cb function destroys str, then it *must* return with a negative
++ value, otherwise bsplitstrcb will continue in an undefined manner.
++
++ This function is provided as an incremental alternative to bsplitstr that
++ is abortable and which does not impose additional memory allocation.
++
++ ..........................................................................
++
++ extern bstring bformat (const char * fmt, ...);
++
++ Takes the same parameters as printf (), but rather than outputting
++ results to stdio, it forms a bstring which contains what would have been
++ output. Note that if there is an early generation of a '\0' character,
++ the bstring will be truncated to this end point.
++
++ Note that %s format tokens correspond to '\0' terminated char * buffers,
++ not bstrings. To print a bstring, first dereference the data element of
++ the bstring:
++
++ /* b1->data needs to be '\0' terminated, so tagbstrings generated
++ by blk2tbstr () might not be suitable. */
++ b0 = bformat ("Hello, %s", b1->data);
++
++ Note that if the BSTRLIB_NOVSNP macro was set when bstrlib was compiled,
++ the bformat function is not present.
++
++ ..........................................................................
++
++ extern int bformata (bstring b, const char * fmt, ...);
++
++ In addition to the initial output buffer b, bformata takes the same
++ parameters as printf (), but rather than outputting results to stdio, it
++ appends the results to the initial bstring parameter. Note that if
++ there is an early generation of a '\0' character, the bstring will be
++ truncated to this end point.
++
++ Note that %s format tokens correspond to '\0' terminated char * buffers,
++ not bstrings. To print a bstring, first dereference the data element of
++ the bstring:
++
++ /* b1->data needs to be '\0' terminated, so tagbstrings generated
++ by blk2tbstr () might not be suitable. */
++ bformata (b0 = bfromcstr ("Hello"), ", %s", b1->data);
++
++ Note that if the BSTRLIB_NOVSNP macro was set when bstrlib was compiled,
++ the bformata function is not present.
++
++ ..........................................................................
++
++ extern int bassignformat (bstring b, const char * fmt, ...);
++
++ After the first parameter, it takes the same parameters as printf (), but
++ rather than outputting results to stdio, it outputs the results to
++ the bstring parameter b. Note that if there is an early generation of a
++ '\0' character, the bstring will be truncated to this end point.
++
++ Note that %s format tokens correspond to '\0' terminated char * buffers,
++ not bstrings. To print a bstring, first dereference the data element of
++ the bstring:
++
++ /* b1->data needs to be '\0' terminated, so tagbstrings generated
++ by blk2tbstr () might not be suitable. */
++ bassignformat (b0 = bfromcstr (""), "Hello, %s", b1->data);
++
++ Note that if the BSTRLIB_NOVSNP macro was set when bstrlib was compiled,
++ the bassignformat function is not present.
++
++ ..........................................................................
++
++ extern int bvcformata (bstring b, int count, const char * fmt, va_list arglist);
++
++ The bvcformata function formats data under control of the format control
++ string fmt and attempts to append the result to b. The fmt parameter is
++ the same as that of the printf function. The variable argument list is
++ replaced with arglist, which has been initialized by the va_start macro.
++ The size of the output is upper bounded by count. If the required output
++ exceeds count, the string b is not augmented with any contents and a value
++ below BSTR_ERR is returned. If a value below -count is returned then it
++ is recommended that the negative of this value be used as an update to the
++ count in a subsequent pass. On other errors, such as running out of
++ memory, parameter errors or numeric wrap around BSTR_ERR is returned.
++ BSTR_OK is returned when the output is successfully generated and
++ appended to b.
++
++ Note: There is no sanity checking of arglist, and this function is
++ destructive of the contents of b from the b->slen point onward. If there
++ is an early generation of a '\0' character, the bstring will be truncated
++ to this end point.
++
++ Although this function is part of the external API for Bstrlib, the
++ interface and semantics (length limitations, and unusual return codes)
++ are fairly atypical. The real purpose for this function is to provide an
++ engine for the bvformata macro.
++
++ Note that if the BSTRLIB_NOVSNP macro was set when bstrlib was compiled,
++ the bvcformata function is not present.
++
++ ..........................................................................
++
++ extern bstring bread (bNread readPtr, void * parm);
++ typedef size_t (* bNread) (void *buff, size_t elsize, size_t nelem,
++ void *parm);
++
++ Read an entire stream into a bstring, verbatim. The readPtr function
++ pointer is compatible with fread semantics, except that it need not obtain
++ the stream data from a file. The intention is that parm would contain
++ the stream data context/state required (similar to the role of the FILE*
++ I/O stream parameter of fread.)
++
++ Abstracting the block read function allows for block devices other than
++ file streams to be read if desired. Note that there is an ANSI
++ compatibility issue if "fread" is used directly; see the ANSI issues
++ section below.
++
++ ..........................................................................
++
++ extern int breada (bstring b, bNread readPtr, void * parm);
++
++ Read an entire stream and append it to a bstring, verbatim. Behaves
++ like bread, except that it appends its results to the bstring b.
++ BSTR_ERR is returned on error, otherwise 0 is returned.
++
++ ..........................................................................
++
++ extern bstring bgets (bNgetc getcPtr, void * parm, char terminator);
++ typedef int (* bNgetc) (void * parm);
++
++ Read a bstring from a stream. As many bytes as necessary are read
++ until the terminator is consumed or no more characters are available from
++ the stream. If read from the stream, the terminator character will be
++ appended to the end of the returned bstring. The getcPtr function must
++ have the same semantics as the fgetc C library function (i.e., returning
++ an integer whose value is negative when there are no more characters
++ available, otherwise the value of the next available unsigned character
++ from the stream.) The intention is that parm would contain the stream
++ data context/state required (similar to the role of the FILE* I/O stream
++ parameter of fgets.) If no characters are read, or there is some other
++ detectable error, NULL is returned.
++
++ bgets will never call the getcPtr function more often than necessary to
++ construct its output (including a single call, if required, to determine
++ that the stream contains no more characters.)
++
++ Abstracting the character stream function and terminator character allows
++ for different stream devices and string formats other than '\n'
++ terminated lines in a file if desired (consider \032 terminated email
++ messages, in a UNIX mailbox for example.)
++
++ For files, this function can be used analogously as fgets as follows:
++
++ fp = fopen ( ... );
++ if (fp) b = bgets ((bNgetc) fgetc, fp, '\n');
++
++ (Note that only one terminator character can be used, and that '\0' is
++ not assumed to terminate the stream in addition to the terminator
++ character. This is consistent with the semantics of fgets.)
++
++ ..........................................................................
++
++ extern int bgetsa (bstring b, bNgetc getcPtr, void * parm, char terminator);
++
++ Read from a stream and concatenate to a bstring. Behaves like bgets,
++ except that it appends its results to the bstring b. The value 1 is
++ returned if no characters are read before a negative result is returned
++ from getcPtr. Otherwise BSTR_ERR is returned on error, and 0 is returned
++ in other normal cases.
++
++ ..........................................................................
++
++ extern int bassigngets (bstring b, bNgetc getcPtr, void * parm, char terminator);
++
++ Read from a stream and assign the result to a bstring. Behaves like bgets,
++ except that it assigns the results to the bstring b. The value 1 is
++ returned if no characters are read before a negative result is returned
++ from getcPtr. Otherwise BSTR_ERR is returned on error, and 0 is returned
++ in other normal cases.
++
++ ..........................................................................
++
++ extern struct bStream * bsopen (bNread readPtr, void * parm);
++
++ Wrap a given open stream (described by a fread compatible function
++ pointer and stream handle) into an open bStream suitable for the bstring
++ library streaming functions.
++
++ ..........................................................................
++
++ extern void * bsclose (struct bStream * s);
++
++ Close the bStream, and return the handle to the stream that was
++ originally used to open the given stream. If s is NULL or detectably
++ invalid, NULL will be returned.
++
++ ..........................................................................
++
++ extern int bsbufflength (struct bStream * s, int sz);
++
++ Set the length of the buffer used by the bStream. If sz is the macro
++ BSTR_BS_BUFF_LENGTH_GET (which is 0), the length is not set. If s is
++ NULL or sz is negative, the function will return with BSTR_ERR, otherwise
++ this function returns with the previous length.
++
++ ..........................................................................
++
++ extern int bsreadln (bstring r, struct bStream * s, char terminator);
++
++ Read a bstring terminated by the terminator character or the end of the
++ stream from the bStream (s) and return it into the parameter r. The
++ matched terminator, if found, appears at the end of the line read. If
++ the stream has been exhausted of all available data, before any can be
++ read, BSTR_ERR is returned. This function may read additional characters
++ into the stream buffer from the core stream that are not returned, but
++ will be retained for subsequent read operations. When reading from high
++ speed streams, this function can perform significantly faster than bgets.
++
++ ..........................................................................
++
++ extern int bsreadlna (bstring r, struct bStream * s, char terminator);
++
++ Read a bstring terminated by the terminator character or the end of the
++ stream from the bStream (s) and concatenate it to the parameter r. The
++ matched terminator, if found, appears at the end of the line read. If
++ the stream has been exhausted of all available data, before any can be
++ read, BSTR_ERR is returned. This function may read additional characters
++ into the stream buffer from the core stream that are not returned, but
++ will be retained for subsequent read operations. When reading from high
++ speed streams, this function can perform significantly faster than bgets.
++
++ ..........................................................................
++
++ extern int bsreadlns (bstring r, struct bStream * s, bstring terminators);
++
++ Read a bstring terminated by any character in the terminators bstring or
++ the end of the stream from the bStream (s) and return it into the
++ parameter r. This function may read additional characters from the core
++ stream that are not returned, but will be retained for subsequent read
++ operations.
++
++ ..........................................................................
++
++ extern int bsreadlnsa (bstring r, struct bStream * s, bstring terminators);
++
++ Read a bstring terminated by any character in the terminators bstring or
++ the end of the stream from the bStream (s) and concatenate it to the
++ parameter r. If the stream has been exhausted of all available data,
++ before any can be read, BSTR_ERR is returned. This function may read
++ additional characters from the core stream that are not returned, but
++ will be retained for subsequent read operations.
++
++ ..........................................................................
++
++ extern int bsread (bstring r, struct bStream * s, int n);
++
++ Read a bstring of length n (or, if it is fewer, as many bytes as are
++ remaining) from the bStream. This function will read the minimum
++ required number of additional characters from the core stream. When the
++ stream is at the end of the file BSTR_ERR is returned, otherwise BSTR_OK
++ is returned.
++
++ ..........................................................................
++
++ extern int bsreada (bstring r, struct bStream * s, int n);
++
++ Read a bstring of length n (or, if it is fewer, as many bytes as are
++ remaining) from the bStream and concatenate it to the parameter r. This
++ function will read the minimum required number of additional characters
++ from the core stream. When the stream is at the end of the file BSTR_ERR
++ is returned, otherwise BSTR_OK is returned.
++
++ ..........................................................................
++
++ extern int bsunread (struct bStream * s, const_bstring b);
++
++ Insert a bstring into the bStream at the current position. These
++ characters will be read prior to those that actually come from the core
++ stream.
++
++ ..........................................................................
++
++ extern int bspeek (bstring r, const struct bStream * s);
++
++ Return the number of currently buffered characters from the bStream that
++ will be read prior to reads from the core stream, and append it to the
++ parameter r.
++
++ ..........................................................................
++
++ extern int bssplitscb (struct bStream * s, const_bstring splitStr,
++ int (* cb) (void * parm, int ofs, const_bstring entry), void * parm);
++
++ Iterate the set of disjoint sequential substrings over the stream s
++ divided by any character from the bstring splitStr. The parm passed to
++ bssplitscb is passed on to cb. If the function cb returns a value < 0,
++ then further iterating is halted and this return value is returned by
++ bssplitscb.
++
++ Note: At the point of calling the cb function, the bStream pointer is
++ pointed exactly at the position right after having read the split
++ character. The cb function can act on the stream by causing the bStream
++ pointer to move, and bssplitscb will continue by starting the next split
++ at the position of the pointer after the return from cb.
++
++ However, if the cb causes the bStream s to be destroyed then the cb must
++ return with a negative value, otherwise bssplitscb will continue in an
++ undefined manner.
++
++ This function is provided as a way to incrementally parse through a file
++ or other generic stream that in total size may otherwise exceed the
++ practical or desired memory available. As with the other split callback
++ based functions this is abortable and does not impose additional memory
++ allocation.
++
++ ..........................................................................
++
++ extern int bssplitstrcb (struct bStream * s, const_bstring splitStr,
++ int (* cb) (void * parm, int ofs, const_bstring entry), void * parm);
++
++ Iterate the set of disjoint sequential substrings over the stream s
++ divided by the entire substring splitStr. The parm passed to
++ bssplitstrcb is passed on to cb. If the function cb returns a
++ value < 0, then further iterating is halted and this return value is
++ returned by bssplitstrcb.
++
++ Note: At the point of calling the cb function, the bStream pointer is
++ pointed exactly at the position right after having read the split
++ string. The cb function can act on the stream by causing the bStream
++ pointer to move, and bssplitstrcb will continue by starting the next
++ split at the position of the pointer after the return from cb.
++
++ However, if the cb causes the bStream s to be destroyed then the cb must
++ return with a negative value, otherwise bssplitstrcb will continue in an
++ undefined manner.
++
++ This function is provided as a way to incrementally parse through a file
++ or other generic stream that in total size may otherwise exceed the
++ practical or desired memory available. As with the other split callback
++ based functions this is abortable and does not impose additional memory
++ allocation.
++
++ ..........................................................................
++
++ extern int bseof (const struct bStream * s);
++
++ Return the de facto "EOF" (end of file) state of a stream (1 if the
++ bStream is in an EOF state, 0 if not, and BSTR_ERR if the stream is closed
++ or detectably erroneous.) When the readPtr callback returns a value <= 0
++ the stream reaches its "EOF" state. Note that bsunread with non-empty
++ content will essentially turn off this state, and the stream will not be
++ in its "EOF" state so long as it is possible to read more data out of it.
++
++ Also note that the semantics of bseof() are slightly different from
++ something like feof(). I.e., reaching the end of the stream does not
++ necessarily guarantee that bseof() will return with a value indicating
++ that this has happened. bseof() will only indicate that the stream has
++ reached "EOF" once an attempt has been made to read past the end of
++ the bStream.
++
++The macros
++----------
++
++ The macros described below are shown in a prototype form indicating their
++ intended usage. Note that the parameters passed to these macros will be
++ referenced multiple times. As with all macros, programmer care is
++ required to guard against unintended side effects.
++
++ int blengthe (const_bstring b, int err);
++
++ Returns the length of the bstring. If the bstring is NULL err is
++ returned.
++
++ ..........................................................................
++
++ int blength (const_bstring b);
++
++ Returns the length of the bstring. If the bstring is NULL, the length
++ returned is 0.
++
++ ..........................................................................
++
++ int bchare (const_bstring b, int p, int c);
++
++ Returns the p'th character of the bstring b. If the position p refers to
++ a position that does not exist in the bstring or the bstring is NULL,
++ then c is returned.
++
++ ..........................................................................
++
++ char bchar (const_bstring b, int p);
++
++ Returns the p'th character of the bstring b. If the position p refers to
++ a position that does not exist in the bstring or the bstring is NULL,
++ then '\0' is returned.
++
++ ..........................................................................
++
++ char * bdatae (bstring b, char * err);
++
++ Returns the char * data portion of the bstring b. If b is NULL, err is
++ returned.
++
++ ..........................................................................
++
++ char * bdata (bstring b);
++
++ Returns the char * data portion of the bstring b. If b is NULL, NULL is
++ returned.
++
++ ..........................................................................
++
++ char * bdataofse (bstring b, int ofs, char * err);
++
++ Returns the char * data portion of the bstring b offset by ofs. If b is
++ NULL, err is returned.
++
++ ..........................................................................
++
++ char * bdataofs (bstring b, int ofs);
++
++ Returns the char * data portion of the bstring b offset by ofs. If b is
++ NULL, NULL is returned.
++
++ ..........................................................................
++
++ struct tagbstring var = bsStatic ("...");
++
++ The bsStatic macro allows for static declarations of literal string
++ constants as struct tagbstring structures. The resulting tagbstring does
++ not need to be freed or destroyed. Note that this macro is only well
++ defined for string literal arguments. For more general string pointers,
++ use the btfromcstr macro.
++
++ The resulting struct tagbstring is permanently write protected. Attempts
++ to write to this struct tagbstring from any bstrlib function will lead to
++ BSTR_ERR being returned. Invoking the bwriteallow macro onto this struct
++ tagbstring has no effect.
++
++ ..........................................................................
++
++ <void * blk, int len> <- bsStaticBlkParms ("...")
++
++ The bsStaticBlkParms macro emits a pair of comma separated parameters
++ corresponding to the block parameters for the block functions in Bstrlib
++ (i.e., blk2bstr, bcatblk, blk2tbstr, bisstemeqblk, bisstemeqcaselessblk.)
++ Note that this macro is only well defined for string literal arguments.
++
++ Examples:
++
++ bstring b = blk2bstr (bsStaticBlkParms ("Fast init. "));
++ bcatblk (b, bsStaticBlkParms ("No frills fast concatenation."));
++
++ These are faster than using bfromcstr() and bcatcstr() respectively
++ because the length of the inline string is known as a compile time
++ constant. Also note that separate struct tagbstring declarations for
++ holding the output of a bsStatic() macro are not required.
++
++ ..........................................................................
++
++ void btfromcstr (struct tagbstring& t, const char * s);
++
++ Fill in the tagbstring t with the '\0' terminated char buffer s. This
++ action is purely reference oriented; no memory management is done. The
++ data member is just assigned s, and slen is assigned the strlen of s.
++ The s parameter is accessed exactly once in this macro.
++
++ The resulting struct tagbstring is initially write protected. Attempts
++ to write to this struct tagbstring in a write protected state from any
++ bstrlib function will lead to BSTR_ERR being returned. Invoke the
++ bwriteallow macro on this struct tagbstring to make it writeable (though
++ this requires that s be obtained from a function compatible with malloc.)
++
++ ..........................................................................
++
++ void btfromblk (struct tagbstring& t, void * s, int len);
++
++ Fill in the tagbstring t with the data buffer s with length len. This
++ action is purely reference oriented; no memory management is done. The
++ data member of t is just assigned s, and slen is assigned len. Note that
++ the buffer is not appended with a '\0' character. The s and len
++ parameters are accessed exactly once each in this macro.
++
++ The resulting struct tagbstring is initially write protected. Attempts
++ to write to this struct tagbstring in a write protected state from any
++ bstrlib function will lead to BSTR_ERR being returned. Invoke the
++ bwriteallow macro on this struct tagbstring to make it writeable (though
++ this requires that s be obtained from a function compatible with malloc.)
++
++ ..........................................................................
++
++ void btfromblkltrimws (struct tagbstring& t, void * s, int len);
++
++ Fill in the tagbstring t with the data buffer s with length len after it
++ has been left trimmed. This action is purely reference oriented; no
++ memory management is done. The data member of t is just assigned to a
++ pointer inside the buffer s. Note that the buffer is not appended with a
++ '\0' character. The s and len parameters are accessed exactly once each
++ in this macro.
++
++ The resulting struct tagbstring is permanently write protected. Attempts
++ to write to this struct tagbstring from any bstrlib function will lead to
++ BSTR_ERR being returned. Invoking the bwriteallow macro onto this struct
++ tagbstring has no effect.
++
++ ..........................................................................
++
++ void btfromblkrtrimws (struct tagbstring& t, void * s, int len);
++
++ Fill in the tagbstring t with the data buffer s with length len after it
++ has been right trimmed. This action is purely reference oriented; no
++ memory management is done. The data member of t is just assigned to a
++ pointer inside the buffer s. Note that the buffer is not appended with a
++ '\0' character. The s and len parameters are accessed exactly once each
++ in this macro.
++
++ The resulting struct tagbstring is permanently write protected. Attempts
++ to write to this struct tagbstring from any bstrlib function will lead to
++ BSTR_ERR being returned. Invoking the bwriteallow macro onto this struct
++ tagbstring has no effect.
++
++ ..........................................................................
++
++ void btfromblktrimws (struct tagbstring& t, void * s, int len);
++
++ Fill in the tagbstring t with the data buffer s with length len after it
++ has been left and right trimmed. This action is purely reference
++ oriented; no memory management is done. The data member of t is just
++ assigned to a pointer inside the buffer s. Note that the buffer is not
++ appended with a '\0' character. The s and len parameters are accessed
++ exactly once each in this macro.
++
++ The resulting struct tagbstring is permanently write protected. Attempts
++ to write to this struct tagbstring from any bstrlib function will lead to
++ BSTR_ERR being returned. Invoking the bwriteallow macro onto this struct
++ tagbstring has no effect.
++
++ ..........................................................................
++
++ void bmid2tbstr (struct tagbstring& t, bstring b, int pos, int len);
++
++ Fill the tagbstring t with the substring from b, starting from position
++ pos with a length len. The segment is clamped by the boundaries of
++ the bstring b. This action is purely reference oriented; no memory
++ management is done. Note that the buffer is not appended with a '\0'
++ character. Note that the t parameter to this macro may be accessed
++ multiple times. Note that the contents of t will become undefined
++ if the contents of b change or are destroyed.
++
++ The resulting struct tagbstring is permanently write protected. Attempts
++ to write to this struct tagbstring in a write protected state from any
++ bstrlib function will lead to BSTR_ERR being returned. Invoking the
++ bwriteallow macro on this struct tagbstring will have no effect.
++
++ ..........................................................................
++
++ void bvformata (int& ret, bstring b, const char * format, lastarg);
++
++ Append the bstring b with printf like formatting with the format control
++ string, and the arguments taken from the ... list of arguments after
++ lastarg passed to the containing function. If the containing function
++ does not have ... parameters or lastarg is not the last named parameter
++ before the ... then the results are undefined. If successful, the
++ results are appended to b and BSTR_OK is assigned to ret. Otherwise
++ BSTR_ERR is assigned to ret.
++
++ Example:
++
++ void dbgerror (FILE * fp, const char * fmt, ...) {
++ int ret;
++ bstring b;
++ bvformata (ret, b = bfromcstr ("DBG: "), fmt, fmt);
++ if (BSTR_OK == ret) fputs ((char *) bdata (b), fp);
++ bdestroy (b);
++ }
++
++ Note that if the BSTRLIB_NOVSNP macro was set when bstrlib was compiled,
++ code that uses the bvformata macro will not link properly. If the
++ BSTRLIB_NOVSNP macro has been set, the bvformata macro will not be
++ available.
++
++ ..........................................................................
++
++ void bwriteprotect (struct tagbstring& t);
++
++ Disallow bstring from being written to via the bstrlib API. Attempts to
++ write to the resulting tagbstring from any bstrlib function will lead to
++ BSTR_ERR being returned.
++
++ Note: bstrings which are write protected cannot be destroyed via bdestroy.
++
++ Note to C++ users: Setting a CBString as write protected will not prevent
++ it from being destroyed by the destructor.
++
++ ..........................................................................
++
++ void bwriteallow (struct tagbstring& t);
++
++ Allow bstring to be written to via the bstrlib API. Note that such an
++ action makes the bstring both writable and destroyable. If the bstring is
++ not legitimately writable (as is the case for struct tagbstrings
++ initialized with a bsStatic value), the results of this are undefined.
++
++ Note that invoking the bwriteallow macro may increase the number of
++ reallocs by one more than necessary for every call to bwriteallow
++ interleaved with any bstring API which writes to this bstring.
++
++ ..........................................................................
++
++ int biswriteprotected (struct tagbstring& t);
++
++ Returns 1 if the bstring is write protected, otherwise 0 is returned.
++
++===============================================================================
++
++The bstest module
++-----------------
++
++The bstest module is just a unit test for the bstrlib module. For correct
++implementations of bstrlib, it should execute with 0 failures being reported.
++This test should be utilized if modifications/customizations to bstrlib have
++been performed. It tests each core bstrlib function with bstrings of every
++mode (read-only, NULL, static and mutable) and ensures that the expected
++semantics are observed (including results that should indicate an error). It
++also tests for aliasing support. Passing bstest is a necessary but not a
++sufficient condition for ensuring the correctness of the bstrlib module.
++
++
++The test module
++---------------
++
++The test module is just a unit test for the bstrwrap module. For correct
++implementations of bstrwrap, it should execute with 0 failures being
++reported. This test should be utilized if modifications/customizations to
++bstrwrap have been performed. It tests each core bstrwrap function with
++CBStrings write protected or not and ensures that the expected semantics are
++observed (including expected exceptions.) Note that exceptions cannot be
++disabled to run this test. Passing test is a necessary but not a sufficient
++condition for ensuring the correctness of the bstrwrap module.
++
++===============================================================================
++
++Using Bstring and CBString as an alternative to the C library
++-------------------------------------------------------------
++
++First let us give a table of C library functions and the alternative bstring
++functions and CBString methods that should be used instead of them.
++
++C-library       Bstring alternative             CBString alternative
++---------       -------------------             --------------------
++gets            bgets                           ::gets
++strcpy          bassign                         = operator
++strncpy         bassignmidstr                   ::midstr
++strcat          bconcat                         += operator
++strncat         bconcat + btrunc                += operator + ::trunc
++strtok          bsplit, bsplits                 ::split
++sprintf         b(assign)format                 ::format
++snprintf        b(assign)format + btrunc        ::format + ::trunc
++vsprintf        bvformata                       bvformata
++
++vsnprintf       bvformata + btrunc              bvformata + btrunc
++vfprintf        bvformata + fputs               use bvformata + fputs
++strcmp          biseq, bstrcmp                  comparison operators
++strncmp         bstrncmp, memcmp                bstrncmp, memcmp
++strlen          ->slen, blength                 ::length
++strdup          bstrcpy                         constructor
++strset          bpattern                        ::fill
++strstr          binstr                          ::find
++strpbrk         binchr                          ::findchr
++stricmp         bstricmp                        cast & use bstricmp
++strlwr          btolower                        cast & use btolower
++strupr          btoupper                        cast & use btoupper
++strrev          bReverse (aux module)           cast & use bReverse
++strchr          bstrchr                         cast & use bstrchr
++strspnp         use strspn                      use strspn
++ungetc          bsunread                        bsunread
++
++The top 9 C functions listed here are troublesome in that they impose memory
++management in the calling function. The Bstring and CBString interfaces have
++built-in memory management, so there is far less code with far less potential
++for buffer overrun problems. strtok can only be reliably called as a "leaf"
++calculation, since it (quite bizarrely) maintains hidden internal state. And
++gets is well known to be broken no matter what. The Bstrlib alternatives do
++not suffer from those sorts of problems.
++
++The substitute for strncat can be performed with higher performance by using
++the blk2tbstr macro to create a presized second operand for bconcat.
++
++C-library       Bstring alternative             CBString alternative
++---------       -------------------             --------------------
++strspn          strspn acceptable               strspn acceptable
++strcspn         strcspn acceptable              strcspn acceptable
++strnset         strnset acceptable              strnset acceptable
++printf          printf acceptable               printf acceptable
++puts            puts acceptable                 puts acceptable
++fprintf         fprintf acceptable              fprintf acceptable
++fputs           fputs acceptable                fputs acceptable
++memcmp          memcmp acceptable               memcmp acceptable
++
++Remember that Bstring (and CBString) functions will automatically append the
++'\0' character to the character data buffer. So by simply accessing the data
++buffer directly, ordinary C string library functions can be called directly
++on them. Note that bstrcmp is not the same as memcmp in exactly the same way
++that strcmp is not the same as memcmp.
++
++C-library       Bstring alternative             CBString alternative
++---------       -------------------             --------------------
++fread           balloc + fread                  ::alloc + fread
++fgets           balloc + fgets                  ::alloc + fgets
++
++These are odd ones because of the exact sizing of the buffer required. The
++Bstring and CBString alternatives require that the buffers are forced to
++hold at least the prescribed length, then just use fread or fgets directly.
++However, the automatic memory management of Bstring and CBString will
++typically make the use of fgets and fread to read specifically sized strings
++unnecessary.
++
++Implementation Choices
++----------------------
++
++Overhead:
++.........
++
++The bstring library has more overhead versus straight char buffers for most
++functions. This overhead is essentially just the memory management and
++string header allocation. This overhead usually only shows up for small
++string manipulations. The performance loss has to be considered in
++light of the following:
++
++1) What would be the performance loss of trying to write this management
++ code in one's own application?
++2) Since the bstring library source code is given, a sufficiently powerful
++ modern inlining globally optimizing compiler can remove function call
++ overhead.
++
++Since the data type is exposed, a developer can replace any unsatisfactory
++function with their own inline implementation. But that is beside the main
++point of what the better string library is meant to provide. Any overhead
++incurred has to be weighed against the value of the safe abstraction for
++coupling memory management and string functionality.
++
++Performance of the C interface:
++...............................
++
++The algorithms used have performance advantages versus the analogous C
++library functions. For example:
++
++1. bfromcstr/blk2bstr/bstrcpy versus strcpy/strdup. By using memmove instead
++ of strcpy, the break condition of the copy loop is based on an independent
++ counter (that should be allocated in a register) rather than having to
++ check the results of the load. Modern out-of-order executing CPUs can
++ parallelize the final branch mis-predict penalty with the loading of the
++ source string. Some CPUs will also tend to have better built-in hardware
++ support for counted memory moves than load-compare-store. (This is a
++ minor, but non-zero gain.)
++2. biseq versus strcmp. If the strings are unequal in length, biseq will
++ return in O(1) time. If the strings are aliased, or have aliased data
++ buffers, biseq will return in O(1) time. strcmp will always be O(k),
++ where k is the length of the common prefix or the whole string if they are
++ identical.
++3. ->slen versus strlen. ->slen is obviously always O(1), while strlen is
++ always O(n) where n is the length of the string.
++4. bconcat versus strcat. Both rely on precomputing the length of the
++ destination string argument, which will favor the bstring library. On
++ iterated concatenations the performance difference can be enormous.
++5. bsreadln versus fgets. The bsreadln function reads large blocks at a time
++ from the given stream, then parses out lines from the buffers directly.
++ Some C libraries will implement fgets as a loop over single fgetc calls.
++ Testing indicates that the bsreadln approach can be several times faster
++ for fast stream devices (such as a file that has been entirely cached.)
++6. bsplits/bsplitscb versus strspn. Accelerators for the set of match
++ characters are generated only once.
++7. binstr versus strstr. The binstr implementation unrolls the loops to
++ help reduce loop overhead. This will matter if the target string is
++ long and source string is not found very early in the target string.
++ With strstr, while it is possible to unroll the source contents, it is
++ not possible to do so with the destination contents in a way that is
++ effective because every destination character must be tested against
++ '\0' before proceeding to the next character.
++8. bReverse versus strrev. The C function must find the end of the string
++ first before swapping character pairs.
++9. bstrrchr versus no comparable C function. It's not hard to write some C
++ code to search for a character from the end going backwards. But there
++ is no way to do this without computing the length of the string with
++ strlen.
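++
++As a rough illustration of items 2 and 9, the following minimal sketch uses a
++hypothetical length-prefixed type (struct lstr, lstr_eq and lstr_rchr are
++made-up stand-ins for this example, not Bstrlib's actual definitions). It
++only shows why knowing the length up front permits O(1) early exits and a
++backward scan with no strlen pass:
++
```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a length-prefixed string header. */
struct lstr {
    int slen;                   /* number of characters, known in O(1) */
    const unsigned char *data;  /* character data */
};

/* Equality test with the two O(1) early exits described in item 2. */
int lstr_eq(const struct lstr *a, const struct lstr *b) {
    assert(a != NULL && b != NULL);
    if (a->slen != b->slen) return 0;   /* different lengths: O(1) */
    if (a->data == b->data) return 1;   /* aliased buffers: O(1) */
    return memcmp(a->data, b->data, (size_t) a->slen) == 0;
}

/* Reverse character search as in item 9: starts at the known end. */
int lstr_rchr(const struct lstr *s, int c) {
    int i;
    assert(s != NULL);
    for (i = s->slen - 1; i >= 0; i--) {
        if (s->data[i] == (unsigned char) c) return i;
    }
    return -1;                          /* not found */
}
```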
++
++Practical testing indicates that in general Bstrlib is never significantly
++slower than the C library for common operations, while very often having a
++performance advantage that ranges from significant to massive. Even for
++functions like b(n)inchr versus str(c)spn() (where, in theory, there is no
++advantage for the Bstrlib architecture) the performance of Bstrlib is vastly
++superior to most tested C library implementations.
++
++Some of Bstrlib's extra functionality also leads to inevitable performance
++advantages over typical C solutions. For example, using the blk2tbstr macro,
++one can (in O(1) time) generate an internal substring by reference while not
++disturbing the original string. If disturbing the original string is not an
++option, typically, a comparable char * solution would have to make a copy of
++the substring to provide similar functionality. Another example is reverse
++character set scanning -- the str(c)spn functions only scan in a forward
++direction which can complicate some parsing algorithms.
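++
++The substring-by-reference idea can be sketched without Bstrlib itself. The
++struct strref type and the strref_mid function below are hypothetical names
++used only for this example (they are not the blk2tbstr macro), and the
++sketch assumes the caller keeps the original buffer alive and unmodified
++while the reference is in use:
++
```c
#include <assert.h>
#include <string.h>

/* Hypothetical read-only reference into someone else's buffer. */
struct strref {
    const unsigned char *data;  /* points into the original buffer */
    int slen;                   /* length of the referenced piece */
};

/*
 * Build a reference to at most len bytes of buf starting at pos.
 * Nothing is copied and buf is not modified, so this is O(1).
 * An out-of-range request yields an empty reference.
 */
struct strref strref_mid(const unsigned char *buf, int buflen,
                         int pos, int len) {
    struct strref r = { NULL, 0 };
    if (buf == NULL || pos < 0 || len < 0 || pos > buflen) return r;
    if (len > buflen - pos) len = buflen - pos;  /* clamp to the buffer */
    r.data = buf + pos;
    r.slen = len;
    return r;
}
```
++
++A comparable '\0' terminated solution would either write a '\0' into the
++original buffer or copy the piece out before it could be handed to the
++standard string functions.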
++
++Where high performance char * based algorithms are available, Bstrlib can
++still leverage them by accessing the ->data field on bstrings. So
++realistically Bstrlib can never be significantly slower than any standard
++'\0' terminated char * based solutions.
++
++Performance of the C++ interface:
++.................................
++
++The C++ interface has been designed with an emphasis on abstraction and safety
++first. However, since it is substantially a wrapper for the C bstring
++functions, for longer strings the performance comments described in the
++"Performance of the C interface" section above still apply. Note that the
++(CBString *) type can be directly cast to a (bstring) type, and passed as
++parameters to the C functions (though a CBString must never be passed to
++bdestroy.)
++
++Probably the most controversial choice is performing full bounds checking on
++the [] operator. This decision was made because 1) the fast alternative of
++not bounds checking is still available by first casting the CBString to a
++(const char *) buffer or to a (struct tagbstring) then dereferencing .data and
++2) because the lack of bounds checking is seen as one of the main weaknesses
++of C/C++ versus other languages. This check being done on every access leads
++to individual character extraction being actually slower than other languages
++in this one respect (other languages' compilers will normally dedicate more
++resources to hoisting or removing bounds checking as necessary) but otherwise
++brings C++ up to the level of other languages in terms of functionality.
++
++It is common for other C++ libraries to leverage the abstractions provided by
++C++ to use reference counting and "copy on write" policies. While these
++techniques can speed up some scenarios, they impose a problem with respect to
++thread safety. bstrings and CBStrings can be properly protected with
++"per-object" mutexes, meaning that two bstrlib calls can be made and execute
++simultaneously, so long as the bstrings and CBStrings are distinct. With a
++reference count and alias before copy on write policy, global mutexes are
++required that prevent multiple calls to the strings library from executing
++simultaneously regardless of whether or not the strings represent the same
++string.
++
++One interesting trade off in CBString is that the default constructor is not
++trivial. I.e., it always prepares a ready to use memory buffer. The purpose
++is to ensure that there is a uniform internal composition for any functioning
++CBString that is compatible with bstrings. It also means that the other
++methods in the class are not forced to perform "late initialization" checks.
++In the end it means that construction of CBStrings is slower than other
++comparable C++ string classes. Initial testing, however, indicates that
++CBString outperforms std::string and MFC's CString, for example, in all other
++operations. So to work around this weakness it is recommended that CBString
++declarations be pushed outside of inner loops.
++
++Practical testing indicates that with the exception of the caveats given
++above (constructors and safe index character manipulations) the C++ API for
++Bstrlib generally outperforms popular standard C++ string classes. Amongst
++the standard libraries and compilers, the quality of concatenation operations
++varies wildly and very little care has gone into search functions. Bstrlib
++dominates those performance benchmarks.
++
++Memory management:
++..................
++
++The bstring functions which write and modify bstrings will automatically
++reallocate the backing memory for the char buffer whenever it is required to
++grow. The resizing algorithm snaps the buffer size up to a power of two
++that is sufficient to hold the intended new size. Memory reallocation is not
++performed when the required size of the buffer is decreased. This behavior
++can be relied on, and is necessary to make the behavior of balloc
++deterministic. This trades off additional memory usage for a lower frequency
++of required reallocations:
++
++1. For any bstring whose size never exceeds n, its buffer is not ever
++ reallocated more than log_2(n) times for its lifetime.
++2. For any bstring whose size never exceeds n, its buffer is never more than
++ 2*(n+1) in length. (The extra characters beyond 2*n are to allow for the
++ implicit '\0' which is always added by the bstring modifying functions.)
++
++Decreasing the buffer size when the string decreases in size would violate 1)
++above and in real-world cases lead to pathological heap thrashing. Similarly,
++allocating more tightly than "least power of 2 greater than necessary" would
++lead to a violation of 1) and have the same potential for heap thrashing.
++
++Property 2) needs emphasizing. Although the memory allocated is always a
++power of 2, for a bstring that grows linearly in size, its buffer memory also
++grows linearly, not exponentially. The reason is that the amount of extra
++space increases with each reallocation, which decreases the frequency of
++future reallocations.
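++
++The growth policy can be made concrete with a small simulation. This is a
++sketch under stated assumptions, not Bstrlib's actual allocation code:
++snap_up and count_growths are hypothetical helpers, the buffer is assumed to
++start empty, and one extra byte is reserved for the implicit '\0':
++
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: least power of two that is >= n (n >= 1). */
size_t snap_up(size_t n) {
    size_t p = 1;
    assert(n >= 1);
    while (p < n) p <<= 1;
    return p;
}

/*
 * Simulate appending one character at a time until the string holds
 * final_len characters. Returns how many times the capacity had to
 * grow (the first allocation is counted too) and reports the final
 * capacity through final_cap.
 */
int count_growths(size_t final_len, size_t *final_cap) {
    size_t cap = 0, len;
    int growths = 0;
    for (len = 1; len <= final_len; len++) {
        size_t need = len + 1;      /* room for the implicit '\0' */
        if (need > cap) {
            cap = snap_up(need);    /* grow to a power of two */
            growths++;
        }
    }
    if (final_cap != NULL) *final_cap = cap;
    return growths;
}
```
++
++Under these assumptions, a string built up to 1000 characters one character
++at a time changes capacity only 10 times and ends with a 1024 byte buffer,
++which stays within the 2*(n+1) limit of property 2).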
++
++Obviously, given that bstring writing functions may reallocate the data
++buffer backing the target bstring, one should not attempt to cache the data
++buffer address and use it after such bstring functions have been called.
++This includes making reference struct tagbstrings which alias to a writable
++bstring.
++
++balloc or bfromcstralloc can be used to preallocate the minimum amount of
++space used for a given bstring. This will reduce even further the number of
++times the data portion is reallocated. If the length of the string is never
++more than one less than the memory length then there will be no further
++reallocations.
++
++Note that invoking the bwriteallow macro may increase the number of reallocs
++by one more than necessary for every call to bwriteallow interleaved with any
++bstring API which writes to this bstring.
++
++The library does not use any mechanism for automatic clean up for the C API.
++Thus explicit clean up via calls to bdestroy() is required to avoid memory
++leaks.
++
++Constant and static tagbstrings:
++................................
++
++A struct tagbstring can be write protected from any bstrlib function using
++the bwriteprotect macro. A write protected struct tagbstring can then be
++reset to being writable via the bwriteallow macro. There is, of course, no
++protection from attempts to directly access the bstring members. Modifying a
++bstring which is write protected by direct access has undefined behavior.
++
++static struct tagbstrings can be declared via the bsStatic macro. They are
++considered permanently unwritable. Such struct tagbstrings are declared
++such that attempts to write to them are not well defined. Invoking either
++bwriteallow or bwriteprotect on static struct tagbstrings has no effect.
++
++struct tagbstrings initialized via btfromcstr or blk2tbstr are protected by
++default but can be made writable via the bwriteallow macro. If bwriteallow
++is called on such struct tagbstrings, it is the programmer's responsibility
++to ensure that:
++
++1) the buffer supplied was allocated from the heap.
++2) bdestroy is not called on this tagbstring (unless the header itself has
++ also been allocated from the heap.)
++3) free is called on the buffer to reclaim its memory.
++
++bwriteallow and bwriteprotect can be invoked on ordinary bstrings (they have
++to be dereferenced with the (*) operator to get the levels of indirection
++correct) to give them write protection.
++
++Buffer declaration:
++...................
++
++The memory buffer is actually declared "unsigned char *" instead of "char *".
++The reason for this is to trigger compiler warnings whenever uncast char
++buffers are assigned to the data portion of a bstring. This will draw more
++diligent programmers into taking a second look at the code where they
++have carelessly left off the typically required cast. (Research from
++AT&T/Lucent indicates that additional programmer eyeballs are one of the most
++effective mechanisms for ferreting out bugs.)
++
++Function pointers:
++..................
++
++The bgets, bread and bStream functions use function pointers to obtain
++strings from data streams. The function pointer declarations have been
++specifically chosen to be compatible with the fgetc and fread functions.
++While this may seem to be a convoluted way of implementing fgets and fread
++style functionality, it has been specifically designed this way to ensure
++that there is no dependency on a single narrowly defined set of device
++interfaces, such as just stream I/O. In the embedded world, it's quite
++possible to have environments where such interfaces may not exist in the
++standard C library form. Furthermore, the generalization that this opens up
++allows for more sophisticated uses for these functions (performing an fgets
++like function on a socket, for example.) By using function pointers, it also
++allows such abstract stream interfaces to be created using the bstring library
++itself while not creating a circular dependency.
++
++Use of int's for sizes:
++.......................
++
++This is just a recognition that 16bit platforms with requirements for strings
++that are larger than 64K and 32bit+ platforms with requirements for strings
++that are larger than 4GB are pretty marginal. The main focus is for 32bit
++platforms, and emerging 64bit platforms with reasonable < 4GB string
++requirements. Using ints allows for negative values, which have meaning
++internally to bstrlib.
++
++Semantic consideration:
++.......................
++
++Certain care needs to be taken when copying and aliasing bstrings. A bstring
++is essentially a pointer type which points to a multipart abstract data
++structure. Thus the usage and lifetime of bstrings have semantics that follow
++these considerations. For example:
++
++ bstring a, b;
++ struct tagbstring t;
++
++ a = bfromcstr("Hello"); /* Create new bstring and copy "Hello" into it. */
++ b = a; /* Alias b to the contents of a. */
++ t = *a; /* Create a current instance pseudo-alias of a. */
++ bconcat (a, b); /* Double a and b, t is now undefined. */
++ bdestroy (a); /* Destroy the contents of both a and b. */
++
++Variables of type bstring are really just references that point to real
++bstring objects. The equal operator (=) creates aliases, and the asterisk
++dereference operator (*) creates a kind of alias to the current instance (which
++is generally not useful for any purpose.) Using bstrcpy() is the correct way
++of creating duplicate instances. The ampersand operator (&) is useful for
++creating aliases to struct tagbstrings (remembering that constructed struct
++tagbstrings are not writable by default.)
++
++CBStrings use complete copy semantics for the equal operator (=), and thus do
++not have these sorts of issues.
++
++Debugging:
++..........
++
++Bstrings have a simple, exposed definition and construction, and the library
++itself is open source. So most debugging is going to be fairly
++straightforward. But the memory for bstrings comes from the heap, which can
++often be corrupted indirectly, and it might not be obvious what has happened
++even from direct examination of the contents in a debugger or a core dump.
++There are some tools such as Purify, Insure++ and Electric Fence which can
++help solve such problems. However, another common approach is to directly
++instrument the calls to malloc, realloc, calloc, free, memcpy, memmove and/or
++other calls by overriding them with macro definitions.
++
++Although the user could hack on the Bstrlib sources directly as necessary to
++perform such an instrumentation, Bstrlib comes with a built-in mechanism for
++doing this. Defining the macro BSTRLIB_MEMORY_DEBUG and providing an
++include file named memdbg.h will force the core Bstrlib modules to
++attempt to include this file. In such a file, macros could be defined which
++override Bstrlib's usage of the C standard library.
++
++Rather than calling malloc, realloc, free, memcpy or memmove directly, Bstrlib
++emits the macros bstr__alloc, bstr__realloc, bstr__free, bstr__memcpy and
++bstr__memmove in their place respectively. By default these macros are simply
++assigned to be equivalent to their corresponding C standard library function
++call. However, if they are given earlier macro definitions (via the back
++door include file) they will not be given their default definition. In this
++way Bstrlib's interface to the standard library can be changed but without
++having to directly redefine or link standard library symbols (both of which
++are not strictly ANSI C compliant.)
++
++An example definition might include:
++
++ #define bstr__alloc(sz) X_malloc ((sz), __LINE__, __FILE__)
++
++which might help contextualize heap entries in a debugging environment.
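++
++A slightly fuller sketch of what such a memdbg.h might contain is given
++below. X_malloc is the name used in the definition above; its body, the
++counters x_alloc_calls and x_last_line, and the #ifndef fallbacks are
++assumptions made for illustration and are not taken from the Bstrlib
++sources:
++
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for the tracking allocator. */
unsigned long x_alloc_calls = 0;
int x_last_line = 0;

/* Hypothetical tracking allocator: logs the call site, then allocates. */
void *X_malloc(size_t sz, int line, const char *file) {
    void *p = malloc(sz);
    x_alloc_calls++;
    x_last_line = line;
    fprintf(stderr, "alloc %lu bytes at %s:%d -> %p\n",
            (unsigned long) sz, file, line, p);
    return p;
}

/* The override, defined before the library assigns its defaults. */
#define bstr__alloc(sz) X_malloc ((sz), __LINE__, __FILE__)

/*
 * Assumed shape of the library side: a default is assigned only when
 * no earlier definition exists, so the override above is kept.
 */
#ifndef bstr__alloc
#define bstr__alloc(sz) malloc (sz)
#endif
#ifndef bstr__free
#define bstr__free(p) free (p)
#endif
```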
++
++The NULL parameter and sanity checking of bstrings is part of the Bstrlib
++API, and thus Bstrlib itself does not present any different modes which would
++correspond to "Debug" or "Release" modes. Bstrlib always contains mechanisms
++which one might think of as debugging features, but retains the performance
++and small memory footprint one would normally associate with release mode
++code.
++
++Integration with Microsoft's Visual Studio debugger:
++....................................................
++
++Microsoft's Visual Studio debugger has a capability of customizable mouse
++float over data type descriptions. This is accomplished by editing the
++AUTOEXP.DAT file to include the following:
++
++ ; new for CBString
++ tagbstring =slen=<slen> mlen=<mlen> <data,st>
++ Bstrlib::CBStringList =count=<size()>
++
++In Visual C++ 6.0 this file is located in the directory:
++
++ C:\Program Files\Microsoft Visual Studio\Common\MSDev98\Bin
++
++and in Visual Studio .NET 2003 it's located here:
++
++ C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Packages\Debugger
++
++This will improve the ability of debugging with Bstrlib under Visual Studio.
++
++Security
++--------
++
++Bstrlib does not come with explicit security features outside of its fairly
++comprehensive error detection, coupled with its strict semantic support.
++That is to say that certain common security problems, such as buffer overrun,
++constant overwrite, arbitrary truncation etc, are far less likely to happen
++inadvertently. Where it does help, Bstrlib maximizes its advantage by
++providing developers a simple adoption path that lets them leave less secure
++string mechanisms behind. The library will not leave developers wanting, so
++they will be less likely to add new code using a less secure string library
++to add functionality that might be missing from Bstrlib.
++
++That said there are a number of security ideas not addressed by Bstrlib:
++
++1. Race condition exploitation (i.e., verifying a string's contents, then
++raising the privilege level and executing it as a shell command as two
++non-atomic steps) is well beyond the scope of what Bstrlib can provide. It
++should be noted that MFC's built-in string mutex actually does not solve this
++problem either -- it just removes immediate data corruption as a possible
++outcome of such exploit attempts (it can be argued that this is worse, since
++it will leave no trace of the exploitation). In general race conditions have
++to be dealt with by careful design and implementation; a string library
++cannot help with them.
++
++2. Any kind of access control or security attributes to prevent usage in
++dangerous interfaces such as system(). Perl includes a "trust" attribute
++which can be endowed upon strings that are intended to be passed to such
++dangerous interfaces. However, Perl's solution reflects its own limitations
++-- notably that it is not a strongly typed language. In the example code for
++Bstrlib, there is a module called taint.cpp. It demonstrates how to write a
++simple wrapper class for managing "untainted" or trusted strings using the
++type system to prevent questionable mixing of ordinary untrusted strings with
++untainted ones then passing them to dangerous interfaces. In this way the
++security correctness of the code reduces to auditing the direct usages of
++dangerous interfaces or promotions of tainted strings to untainted ones.
++
++3. Encryption of string contents is way beyond the scope of Bstrlib.
++Maintaining encrypted string contents in the futile hopes of thwarting things
++like using system-level debuggers to examine sensitive string data is likely
++to be a wasted effort (imagine a debugger that runs at a higher level than a
++virtual processor where the application runs). For more standard encryption
++usages, since the bstring contents are simply binary blocks of data, this
++should pose no problem for usage with other standard encryption libraries.
++
++Compatibility
++-------------
++
++The Better String Library is known to compile and function correctly with the
++following compilers:
++
++ - Microsoft Visual C++
++ - Watcom C/C++
++ - Intel's C/C++ compiler (Windows)
++ - The GNU C/C++ compiler (cygwin and Linux on PPC64)
++ - Borland C
++ - Turbo C
++
++Setting of configuration options should be unnecessary for these compilers
++(unless exceptions are being disabled or STLport has been added to WATCOM
++C/C++). Bstrlib has been developed with an emphasis on portability. As such
++porting it to other compilers should be straightforward. This package
++includes a porting guide (called porting.txt) which explains what issues may
++exist for porting Bstrlib to different compilers and environments.
++
++ANSI issues
++-----------
++
++1. The function pointer types bNgetc and bNread have prototypes which are very
++similar to, but not exactly the same as fgetc and fread respectively.
++Basically the FILE * parameter is replaced by void *. The purpose of this
++was to allow one to create other functions with fgetc and fread like
++semantics without being tied to ANSI C's file streaming mechanism. I.e., one
++could very easily adapt it to sockets, or simply reading a block of memory,
++or procedurally generated strings (for fractal generation, for example.)
++
++The problem is that invoking the functions (bNgetc)fgetc and (bNread)fread is
++not technically legal in ANSI C. The reason is that the compiler is only
++able to coerce the function pointers themselves into the target type, but
++is unable to perform any cast (implicit or otherwise) on the parameters
++passed once invoked. I.e., if internally void * and FILE * need some kind of
++mechanical coercion, the compiler will not properly perform this conversion,
++which leads to undefined behavior.
++
++Apparently a platform from Data General called "Eclipse" and another from
++Tandem called "NonStop" have a different representation for pointers to bytes
++and pointers to words, for example, where coercion via casting is necessary.
++(Actual confirmation of the existence of such machines is hard to come by, so
++it is prudent to be skeptical about this information.) However, this is not
++an issue for any known contemporary platforms. One may conclude that such
++platforms are effectively apocryphal even if they do exist.
++
++To correctly work around this problem to the satisfaction of the ANSI
++limitations, one needs to create wrapper functions for fgetc and/or
++fread with the prototypes of bNgetc and/or bNread respectively which perform
++no action other than to explicitly cast the void * parameter to a
++FILE *, and simply pass the remaining parameters straight through to fgetc
++or fread.
++
++The wrappers themselves are trivial:
++
++ size_t freadWrap (void * buff, size_t esz, size_t eqty, void * parm) {
++ return fread (buff, esz, eqty, (FILE *) parm);
++ }
++
++ int fgetcWrap (void * parm) {
++ return fgetc ((FILE *) parm);
++ }
++
++These have not been supplied in bstrlib or bstraux to prevent unnecessary
++linking with file I/O functions.
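++
++The same prototype shape also covers sources that are not FILE streams at
++all, such as the block of memory mentioned above. In the sketch below the
++typedef repeats the bNgetc shape described in this section, while struct
++memsrc, memgetc and read_line are hypothetical names invented for this
++example and are not part of bstrlib or bstraux:
++
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as described above: fgetc with FILE * replaced by void *. */
typedef int (*bNgetc) (void *parm);

/* Hypothetical in-memory character source. */
struct memsrc {
    const char *p;   /* next character to deliver */
    size_t left;     /* characters remaining */
};

/* fgetc-like reader over a block of memory; returns EOF when drained. */
int memgetc(void *parm) {
    struct memsrc *s = (struct memsrc *) parm;
    if (s->left == 0) return EOF;
    s->left--;
    return (unsigned char) *s->p++;
}

/*
 * Hypothetical consumer: reads up to and including the next '\n'
 * through any bNgetc-shaped function. Returns the number of
 * characters stored; out is always '\0' terminated.
 */
size_t read_line(bNgetc getc_fn, void *parm, char *out, size_t cap) {
    size_t n = 0;
    int c;
    assert(getc_fn != NULL && out != NULL && cap > 0);
    while (n + 1 < cap && (c = getc_fn(parm)) != EOF) {
        out[n++] = (char) c;
        if (c == '\n') break;
    }
    out[n] = '\0';
    return n;
}
```
++
++The fgetcWrap function shown earlier could be handed to read_line in the
++same way, with the FILE * passed as the void * parameter.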
++
++2. vsnprintf is not available on all compilers. Because of this, the bformat
++and bformata functions (and format and formata methods) are not guaranteed to
++work properly. For those compilers that don't have vsnprintf, the
++BSTRLIB_NOVSNP macro should be set before compiling bstrlib, and the format
++functions/method will be disabled.
++
++The more recent ANSI C standards have specified the required inclusion of a
++vsnprintf function.
++
++3. The bstrlib function names are not unique in the first 6 characters. This
++is only an issue for older C compiler environments which do not store more
++than 6 characters for function names.
++
++4. The bsafe module defines macros and function names which are part of the
++C library. This simply overrides the definition as expected on all platforms
++tested, however it is not sanctioned by the ANSI standard. This module is
++clearly optional and should be omitted on platforms which disallow its
++undefined semantics.
++
++In practice the real issue is that some compilers in some modes of operation
++can/will inline these standard library functions on a module by module basis
++as they appear in each. The linker will thus have no opportunity to override
++the implementation of these functions for those cases. This can lead to
++inconsistent behaviour of the bsafe module on different platforms and
++compilers.
++
++===============================================================================
++
++Comparison with Microsoft's CString class
++-----------------------------------------
++
++Although developed independently, CBStrings have very similar functionality to
++Microsoft's CString class. However, the bstring library has significant
++advantages over CString:
++
++1. Bstrlib is a C-library as well as a C++ library (using the C++ wrapper).
++
++ - Thus it is compatible with more programming environments and
++ available to a wider population of programmers.
++
++2. The internal structure of a bstring is considered exposed.
++
++ - A single contiguous block of data can be cut into read-only pieces by
++ simply creating headers, without allocating additional memory to create
++ reference copies of each of these sub-strings.
++ - In this way, using bstrings in a totally abstracted way becomes a choice
++ rather than an imposition. Further this choice can be made differently
++ at different layers of applications that use it.
++
++3. Static declaration support precludes the need for constructor
++ invocation.
++
++ - Allows for static declarations of constant strings that have no
++ additional constructor overhead.
++
++4. Bstrlib is not attached to another library.
++
++ - Bstrlib is designed to be easily plugged into any other library
++ collection, without dependencies on other libraries or paradigms (such
++ as "MFC".)
++
++The bstring library also comes with a few additional functions that are not
++available in the CString class:
++
++ - bsetstr
++ - bsplit
++ - bread
++ - breplace (this is different from CString::Replace())
++ - Writable indexed characters (for example a[i]='x')
++
++Interestingly, although Microsoft did implement mid$(), left$() and right$()
++functional analogues (these are functions from GWBASIC) they seem to have
++forgotten that mid$() could be also used to write into the middle of a string.
++This functionality exists in Bstrlib with the bsetstr() and breplace()
++functions.
++
++Among the disadvantages of Bstrlib is that there is no special support for
++localization or wide characters. Such things are considered beyond the scope
++of what bstrings are trying to deliver. CString essentially supports the
++older UCS-2 version of Unicode via wchar_t as an application-wide compile
++time switch.
++
++CStrings also use built-in mechanisms for ensuring thread safety under all
++situations. While this makes writing thread safe code that much easier, this
++built-in safety feature has a price -- the inner loops of each CString method
++run in their own critical section (grabbing and releasing a light weight mutex
++on every operation.) The usual way to decrease the impact of a critical
++section performance penalty is to amortize more operations per critical
++section. But since the implementation of CStrings is fixed as a one critical
++section per-operation cost, there is no way to leverage this common
++performance enhancing idea.
++
++The search facilities in Bstrlib are comparable to those in MFC's CString
++class, though it is missing locale specific collation. But because Bstrlib
++is interoperable with C's char buffers, it will allow programmers to write
++their own string searching mechanism (such as Boyer-Moore), or be able to
++choose from a variety of available existing string searching libraries (such
++as those for regular expressions) without difficulty.
++
++Microsoft used a very non-ANSI conforming trick in its implementation to
++allow printf() to use the "%s" specifier to output a CString correctly. This
++can be convenient, but it is inherently not portable. CBString requires an
++explicit cast, while bstring requires the data member to be dereferenced.
++Microsoft's own documentation recommends casting, instead of relying on this
++feature.
++
++Comparison with C++'s std::string
++---------------------------------
++
++This is the C++ language's standard STL based string class.
++
++1. There is no C implementation.
++2. The [] operator is not bounds checked.
++3. Missing a lot of useful functions like printf-like formatting.
++4. Some sub-standard std::string implementations (SGI) are necessarily unsafe
++ to use with multithreading.
++5. Limited by STL's std::iostream which in turn is limited by ifstream which
++ can only take input from files. (Compare to CBStream's API which can take
++ abstracted input.)
++6. Extremely uneven performance across implementations.
++
++Comparison with ISO C TR 24731 proposal
++---------------------------------------
++
++Following the ISO C99 standard, Microsoft has proposed a group of C library
++extensions which are supposedly "safer and more secure". This proposal is
++expected to be adopted by the ISO C standard which follows C99.
++
++The proposal reveals itself to be very similar to Microsoft's "StrSafe"
++library. The functions are basically the same as other standard C library
++string functions except that destination parameters are paired with an
++additional length parameter of type rsize_t. rsize_t is the same as size_t,
++however, the range is checked to make sure it is between 1 and RSIZE_MAX. Like
++Bstrlib, the functions perform a "parameter check". Unlike Bstrlib, when a
++parameter check fails, rather than simply outputting accumulable error
++statuses, they call a user settable global error function handler, and upon
++return of control perform no (additional) detrimental action. The proposal
++covers basic string functions as well as a few non-reentrant functions
++(asctime, ctime, and strtok).
++
++1. Still based solely on char * buffers (and therefore strlen() and strcat()
++ are still O(n), and there are no faster streq() comparison functions.)
++2. No growable string semantics.
++3. Requires manual buffer length synchronization in the source code.
++4. No attempt to enhance functionality of the C library.
++5. Introduces a new error scenario (strings exceeding RSIZE_MAX length).
++
++The hope is that by exposing the buffer length requirements there will be
++fewer buffer overrun errors. However, the error modes are really just
++transformed, rather than removed. The real problem of buffer overflows is
++that they all happen as a result of erroneous programming. So forcing
++programmers to manually deal with buffer limits, will make them more aware of
++the problem but doesn't remove the possibility of erroneous programming. So
++a programmer that erroneously mixes up the rsize_t parameters is no better off
++than a programmer that introduces potential buffer overflows through other
++more typical lapses. So at best this may reduce the rate of erroneous
++programming, rather than making any attempt at removing failure modes.
++
++The error handler can discriminate between types of failures, but does not
++take into account any callsite context. So the problem is that the error is
++going to be manifest in a piece of code, but there is no pointer to that
++code. It would seem that passing in the call site __FILE__, __LINE__ as
++parameters would be very useful, but the API clearly doesn't support such a
++thing (it would increase code bloat even more than the extra length
++parameter does, and would require macro tricks to implement).
++
++The Bstrlib C API takes the position that error handling needs to be done at
++the callsite, and just tries to make it as painless as possible. Furthermore,
++error modes are removed by supporting auto-growing strings and aliasing. For
++capturing errors in more central code fragments, Bstrlib's C++ API uses
++exception handling extensively, which is superior to the leaf-only error
++handler approach.
++
++Comparison with Managed String Library CERT proposal
++----------------------------------------------------
++
++The main webpage for the managed string library:
++http://www.cert.org/secure-coding/managedstring.html
++
++Robert Seacord at CERT has proposed a C string library that he calls the
++"Managed String Library" for C. Like Bstrlib, it introduces a new type
++which is called a managed string. The structure of a managed string
++(string_m) is like a struct tagbstring but missing the length field. This
++internal structure is considered opaque. The length is, like the C standard
++library, always computed on the fly by searching for a terminating NUL on
++every operation that requires it. So it suffers from every performance
++problem that the C standard library suffers from. Interoperating with C
++string APIs (like printf, fopen, or anything else that takes a string
++parameter) requires copying to additionally allocated buffers that have to
++be manually freed -- this makes this library probably slower and more
++cumbersome than any other string library in existence.
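++
++To make the cost concrete, the sketch below appends the same pieces to a
++buffer in two ways and counts how many bytes of the destination must be
++scanned to find its end. The names append_scanning, append_tracked and
++struct lstr are invented for this illustration and belong to neither
++library; the fixed 256 byte capacity is an arbitrary choice.
++
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-tracked string for this sketch (fixed capacity). */
struct lstr {
    size_t len;
    char data[256];
};

/* NUL-terminated style: find the end by scanning, then copy.
   Returns the number of destination bytes scanned to find the '\0'. */
static size_t append_scanning(char *dst, size_t cap, const char *piece) {
    size_t end = strlen(dst);          /* O(current length) on every call */
    size_t n = strlen(piece);
    assert(end + n < cap);
    memcpy(dst + end, piece, n + 1);   /* copy the piece and its '\0' */
    return end;
}

/* Length-tracked style: the end is already known, so no destination
   bytes are scanned. Returns 0 to mirror append_scanning's result. */
static size_t append_tracked(struct lstr *s, const char *piece) {
    size_t n = strlen(piece);
    assert(s->len + n < sizeof s->data);
    memcpy(s->data + s->len, piece, n + 1);
    s->len += n;
    return 0;
}
```
++
++Appending ten 4 byte pieces scans 0 + 4 + ... + 36 = 180 destination bytes
++in the first style and none in the second; the gap grows quadratically with
++the number of appends.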
++
++The library gives a fully populated error status as the return value of every
++string function. The hope is to be able to diagnose all problems
++specifically from the return code alone. Comparing this to Bstrlib, which
++always returns one consistent error message, might make it seem that Bstrlib
++would be harder to debug; but this is not true. With Bstrlib, if an error
++occurs there is always enough information from just knowing there was an error
++and examining the parameters to deduce exactly what kind of error has
++happened. The managed string library thus gives up nested function calls
++while achieving little benefit, while Bstrlib does not.
++
++One interesting feature that "managed strings" has is the idea of data
++sanitization via character set whitelisting. That is to say, a globally
++definable filter that makes any attempt to put invalid characters into strings
++lead to an error and not modify the string. The author gives the following
++example:
++
++ // create valid char set
++ if (retValue = strcreate_m(&str1, "abc") ) {
++ fprintf(
++ stderr,
++ "Error %d from strcreate_m.\n",
++ retValue
++ );
++ }
++ if (retValue = setcharset(str1)) {
++ fprintf(
++ stderr,
++ "Error %d from setcharset().\n",
++ retValue
++ );
++ }
++ if (retValue = strcreate_m(&str1, "aabbccabc")) {
++ fprintf(
++ stderr,
++ "Error %d from strcreate_m.\n",
++ retValue
++ );
++ }
++ // create string with invalid char set
++ if (retValue = strcreate_m(&str1, "abbccdabc")) {
++ fprintf(
++ stderr,
++ "Error %d from strcreate_m.\n",
++ retValue
++ );
++ }
++
++Which we can compare with a more Bstrlib way of doing things:
++
++ bstring bCreateWithFilter (const char * cstr, const_bstring filter) {
++ bstring b = bfromcstr (cstr);
++ if (BSTR_ERR != bninchr (b, 0, filter) && NULL != b) {
++ fprintf (stderr, "Filter violation.\n");
++ bdestroy (b);
++ b = NULL;
++ }
++ return b;
++ }
++
++ struct tagbstring charFilter = bsStatic ("abc");
++ bstring str1 = bCreateWithFilter ("aabbccabc", &charFilter);
++ bstring str2 = bCreateWithFilter ("aabbccdabc", &charFilter);
++
++The first thing we should notice is that with the Bstrlib approach you can
++have different filters for different strings if necessary. Furthermore,
++selecting a charset filter in the Managed String Library is uni-contextual.
++That is to say, there can only be one such filter active for the entire
++program, which means its usage is not well defined for intermediate library
++usage (a library that uses it will interfere with user code that uses it, and
++vice versa.) It is also likely to be poorly defined in multi-threading
++environments.
++
++There is also a question as to whether the data sanitization filter is checked
++on every operation, or just on creation operations. Since the charset can be
++set arbitrarily at run time, it might be set *after* some managed strings have
++been created. This would seem to imply that all functions should run this
++additional check every time if there is an attempt to enforce this. This
++would make things tremendously slow. On the other hand, if it is assumed that
++only creates and other operations that take char *'s as input need be checked
++because the charset was only supposed to be set once, before any other
++managed string was created, then one can see that it's easy to cover
++Bstrlib with equivalent functionality via a few wrapper calls such as the
++example given above.
++
++And finally we have to question the value of sanitization in the first place.
++For example, for httpd servers, there is generally a requirement that the
++URLs parsed have some form that avoids undesirable translation to local file
++system filenames or resources. The problem is that, because of the way URLs
++can be encoded, a URL must be completely parsed and translated to know if it
++is using certain invalid character combinations. That is to say, merely
++filtering each character one at a time is not necessarily the right way to
++ensure that a string has safe contents.
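++
++A minimal sketch of why per-character filtering falls short: the encoded
++path used below contains only letters, digits, '%' and '/', yet it decodes
++to a parent-directory traversal. The functions percent_decode and
++has_dotdot_segment are hypothetical helpers written for this example, and
++the decoder handles only simple %XX escapes.
++
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode %XX escapes from src into dst (dst must hold strlen(src)+1 bytes).
   Returns 0 on success, -1 on a malformed escape. */
static int percent_decode(char *dst, const char *src) {
    while (*src) {
        if (*src == '%') {
            int hi = hexval((unsigned char) src[1]);
            /* Only look at src[2] when src[1] was a valid digit. */
            int lo = (hi < 0) ? -1 : hexval((unsigned char) src[2]);
            if (hi < 0 || lo < 0) return -1;
            *dst++ = (char) (hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return 0;
}

/* Return 1 if any '/'-separated segment of path is exactly "..". */
static int has_dotdot_segment(const char *path) {
    const char *p = path;
    while (*p) {
        size_t n = strcspn(p, "/");
        if (n == 2 && p[0] == '.' && p[1] == '.') return 1;
        p += n;
        if (*p == '/') p++;
    }
    return 0;
}
```
++
++The string "/docs/%2e%2e/%2E%2E/etc/passwd" passes a character whitelist and
++contains no ".." segment as written, but has_dotdot_segment reports one once
++percent_decode has been applied.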
++
++In the article that describes this proposal, it is claimed that it fairly
++closely approximates the existing C API semantics. On this point we should
++compare this "closeness" with Bstrlib:
++
++                     Bstrlib                  Managed String Library
++                     -------                  ----------------------
++
++Pointer arithmetic   Segment arithmetic       N/A
++
++Use in C Std lib     ->data, or bdata{e}      getstr_m(x,*) ... free(x)
++
++String literals      bsStatic, bsStaticBlk    strcreate_m()
++
++Transparency         Complete                 None
++
++It's pretty clear that the semantic mapping from C strings to Bstrlib is fairly
++straightforward, and that in general semantic capabilities are the same or
++superior in Bstrlib. On the other hand the Managed String Library is either
++missing semantics or changes things fairly significantly.
++
++Comparison with Annexia's c2lib library
++---------------------------------------
++
++This library is available at:
++http://www.annexia.org/freeware/c2lib
++
++1. Still based solely on char * buffers (and therefore strlen() and strcat()
++ are still O(n), and there are no faster streq() comparison functions.)
++ Their suggestion that alternatives which wrap the string data type (such as
++ bstring does) impose a difficulty in interoperating with the C language's
++ ordinary C string library is not founded.
++2. Introduction of memory (and vector?) abstractions imposes a learning
++ curve, and some kind of memory usage policy that is outside of the strings
++ themselves (and therefore must be maintained by the developer.)
++3. The API is massive, and filled with all sorts of trivial (pjoin) and
++ controversial (pmatch -- regular expressions are not sufficiently
++ standardized, and there is a very large difference in performance between
++ compiled and non-compiled REs) functions. Bstrlib takes a decidedly
++ minimal approach -- none of the functionality in c2lib is difficult or
++ challenging to implement on top of Bstrlib (except the regex stuff, which
++ is going to be difficult, and controversial no matter what.)
++4. Understanding why c2lib is the way it is pretty much requires a working
++ knowledge of Perl. Bstrlib requires only knowledge of the C string library
++ while providing just a very select few worthwhile extras.
++5. It is attached to a lot of cruft like a matrix math library (that doesn't
++ include any functions for getting the determinant, eigenvectors,
++ eigenvalues, the matrix inverse, test for singularity, test for
++ orthogonality, a Gram-Schmidt orthogonalization, LU decomposition ... I
++ mean why bother?)
++
++Convincing a development house to use c2lib is likely quite difficult. It
++introduces too much, while not being part of any kind of standards body. The
++code must therefore be trusted, or maintained by those that use it. While
++bstring offers nothing more on this front, since it's so much smaller, covers
++far less in terms of scope, and will typically improve string performance,
++the barrier to usage should be much smaller.
++
++Comparison with stralloc/qmail
++------------------------------
++
++More information about this library can be found here:
++http://www.canonical.org/~kragen/stralloc.html or here:
++http://cr.yp.to/lib/stralloc.html
++
++1. Library is very very minimal. A little too minimal.
++2. Untargeted source parameters are not declared const.
++3. Slightly different expected emphasis (like _cats function which takes an
++ ordinary C string char buffer as a parameter.) It's clear that the
++ remainder of the C string library is still required to perform more
++ useful string operations.
++
++The struct declaration for their string header is essentially the same as that
++for bstring. But it's clear that this was a quickly written hack whose goals
++are clearly a subset of what Bstrlib supplies. For anyone who is served by
++stralloc, Bstrlib is a complete substitute that just adds more functionality.
++
++stralloc actually uses the interesting policy that a NULL data pointer
++indicates an empty string. In this way, non-static empty strings can be
++declared without construction. This advantage is minimal, since static empty
++bstrings can be declared inline without construction, and if the string needs
++to be written to it should be constructed from an empty string (or its first
++initializer) in any event.
++
++wxString class
++--------------
++
++This is the string class used in the wxWindows project. A description of
++wxString can be found here:
++http://www.wxwindows.org/manuals/2.4.2/wx368.htm#wxstring
++
++This C++ library is similar to CBString. However, it is littered with
++trivial functions (IsAscii, UpperCase, RemoveLast etc.)
++
++1. There is no C implementation.
++2. The memory management strategy is to allocate a bounded fixed amount of
++ additional space on each resize, meaning that it does not have the
++ log_2(n) property that Bstrlib has (it will thrash very easily, cause
++ massive fragmentation in common heap implementations, and can easily be a
++ common source of performance problems).
++3. The library uses a "copy on write" strategy, meaning that it has to deal
++ with multithreading problems.
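++
++The difference between the two growth policies in point 2 can be checked
++with a little arithmetic. The sketch below counts how many resizes are needed
++to grow a buffer up to a target capacity, first with a fixed increment and
++then with capacity doubling. The function names, the starting capacity and
++the 16 byte increment are arbitrary values chosen for this example; they are
++not taken from wxString or Bstrlib.
++
```c
#include <assert.h>
#include <stddef.h>

/* Count resizes when capacity grows by a fixed step each time it is full. */
static size_t resizes_fixed_step(size_t start_cap, size_t step, size_t target) {
    size_t cap = start_cap, count = 0;
    assert(step > 0);
    while (cap < target) {
        cap += step;
        count++;
    }
    return count;
}

/* Count resizes when capacity doubles each time it is full.
   (Overflow for very large targets is ignored in this sketch.) */
static size_t resizes_doubling(size_t start_cap, size_t target) {
    size_t cap = start_cap, count = 0;
    assert(start_cap > 0);
    while (cap < target) {
        cap *= 2;
        count++;
    }
    return count;
}
```
++
++Growing from 16 bytes to 1 MiB takes 65535 resizes with a 16 byte step but
++only 16 with doubling, which is the logarithmic behavior referred to above.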
++
++Vstr
++----
++
++This is a highly orthogonal C string library with an emphasis on
++networking/realtime programming. It can be found here:
++http://www.and.org/vstr/
++
++1. The convoluted internal structure does not contain a '\0' char * compatible
++ buffer, so interoperability with the C library is a non-starter.
++2. The API and implementation are very large (owing to its orthogonality) and
++ can lead to difficulty in understanding its exact functionality.
++3. An obvious dependency on gnu tools (confusing make configure step)
++4. Uses a reference counting system, meaning that it is not likely to be
++ thread safe.
++
++The implementation has an extreme emphasis on performance for nontrivial
++actions (adds, inserts and deletes are all constant or roughly O(#operations)
++time) following the "zero copy" principle. This trades off performance of
++trivial functions (character access, char buffer access/coercion, alias
++detection) which become significantly slower, as well as incremental
++accumulative costs for its searching/parsing functions. Whether or not Vstr
++wins any particular performance benchmark will depend a lot on the benchmark,
++but it should handily win on some, while losing dreadfully on others.
++
++The learning curve for Vstr is very steep, and it doesn't come with any
++obvious way to build for Windows or other platforms without gnu tools. At
++least one mechanism (the iterator) introduces a new undefined scenario
++(writing to a Vstr while iterating through it.) Vstr has a very large
++footprint, and is very ambitious in its total functionality. Vstr has no C++
++API.
++
++Vstr usage requires context initialization via vstr_init() which must be run
++in a thread-local context. Given the totally reference based architecture
++this means that sharing Vstrings across threads is not well defined, or at
++least not safe from race conditions. This API is clearly geared to the older
++standard of fork() style multitasking in UNIX, and is not safely transportable
++to modern shared memory multithreading available in Linux and Windows. There
++is no portable external solution making the library thread safe (since it
++requires a mutex around each Vstr context -- not each string.)
++
++In the documentation for this library, a big deal is made of its self hosted
++s(n)printf-like function. This is an issue for older compilers that don't
++include vsnprintf(), but also an issue because Vstr has a slow conversion to
++'\0' terminated char * mechanism. That is to say, using "%s" to format data
++that originates from Vstr would be slow without some sort of native function
++to do so. Bstrlib sidesteps the issue by relying on what snprintf-like
++functionality does exist and having a high performance conversion to a char *
++compatible string so that "%s" can be used directly.
++
++Str Library
++-----------
++
++This is a fairly extensive string library that includes full Unicode support
++and is targeted at the goal of outperforming MFC and STL. The architecture,
++similarly to MFC's CStrings, is a copy on write reference counting mechanism.
++
++http://www.utilitycode.com/str/default.aspx
++
++1. Commercial.
++2. C++ only.
++
++This library, like Vstr, uses a ref counting system. There is only so deeply
++I can analyze it, since I don't have a license for it. However, performance
++improvements over MFC and STL do not seem like a sufficient reason to
++move your source base to it. For example, in the future, Microsoft may
++improve the performance of CString.
++
++It should be pointed out that performance testing of Bstrlib has indicated
++that its relative performance advantage versus MFC's CString and STL's
++std::string is at least as high as that for the Str library.
++
++libmib astrings
++---------------
++
++A handful of functional extensions to the C library that add dynamic string
++functionality.
++http://www.mibsoftware.com/libmib/astring/
++
++This package basically references strings through char ** pointers and assumes
++they are pointing to the top of an allocated heap entry (or NULL, in which
++case memory will be newly allocated from the heap.) So it's still up to the
++user to mix and match the older C string functions with these functions whenever
++pointer arithmetic is used (i.e., there is no leveraging of the type system
++to assert semantic differences between references and base strings as Bstrlib
++does since no new types are introduced.) Unlike Bstrlib, exact string length
++meta data is not stored, thus requiring a strlen() call on *every* string
++writing operation. The library is very small, covering only a handful of C's
++functions.
++
++While this is better than nothing, it is clearly slower than even the
++standard C library, less safe and less functional than Bstrlib.
++
++To explain the advantage of using libmib, their website shows an example of
++how dangerous C code:
++
++ char buf[256];
++ char *pszExtraPath = ";/usr/local/bin";
++
++ strcpy(buf,getenv("PATH")); /* oops! could overrun! */
++ strcat(buf,pszExtraPath); /* Could overrun as well! */
++
++ printf("Checking...%s\n",buf); /* Some printfs overrun too! */
++
++is avoided using libmib:
++
++ char *pasz = 0; /* Must initialize to 0 */
++ char *paszOut = 0;
++ char *pszExtraPath = ";/usr/local/bin";
++
++ if (!astrcpy(&pasz,getenv("PATH"))) /* malloc error */ exit(-1);
++ if (!astrcat(&pasz,pszExtraPath)) /* malloc error */ exit(-1);
++
++ /* Finally, a "limitless" printf! we can use */
++ asprintf(&paszOut,"Checking...%s\n",pasz);fputs(paszOut,stdout);
++
++ astrfree(&pasz); /* Can use free(pasz) also. */
++ astrfree(&paszOut);
++
++However, compare this to Bstrlib:
++
++ bstring b, out;
++
++ bcatcstr (b = bfromcstr (getenv ("PATH")), ";/usr/local/bin");
++ out = bformat ("Checking...%s\n", bdatae (b, "<Out of memory>"));
++ /* if (out && b) */ fputs (bdatae (out, "<Out of memory>"), stdout);
++ bdestroy (b);
++ bdestroy (out);
++
++Besides being shorter, we can see that error handling can be deferred right
++to the very end. Also, unlike the above two versions, if getenv() returns
++with NULL, the Bstrlib version will not exhibit undefined behavior.
++Initialization starts with the relevant content rather than an extra
++autoinitialization step.
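++
++Deferring the checks works because every call tolerates a failed or NULL
++argument and reports a plain status. A minimal sketch of that pattern
++follows, using a hypothetical gstr type and helpers (gstr_from, gstr_cat,
++gstr_free) invented for this example; they are not Bstrlib functions, and
++unlike Bstrlib this sketch makes no attempt to handle aliased arguments.
++
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GSTR_OK   0
#define GSTR_ERR (-1)

/* Hypothetical growable string for this sketch. */
struct gstr {
    size_t len, cap;
    char *data;
};

/* Create from a C string; returns NULL if cstr is NULL or allocation fails. */
static struct gstr *gstr_from(const char *cstr) {
    struct gstr *s;
    if (cstr == NULL) return NULL;
    s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->len = strlen(cstr);
    s->cap = s->len + 1;
    s->data = malloc(s->cap);
    if (s->data == NULL) { free(s); return NULL; }
    memcpy(s->data, cstr, s->cap);
    return s;
}

/* Append a C string, growing as needed. A NULL argument is reported as an
   error rather than dereferenced, so an earlier failure simply propagates.
   cstr must not point into s->data in this sketch. */
static int gstr_cat(struct gstr *s, const char *cstr) {
    size_t n, need;
    if (s == NULL || cstr == NULL) return GSTR_ERR;
    n = strlen(cstr);
    need = s->len + n + 1;
    if (need > s->cap) {
        size_t cap = s->cap;
        char *p;
        while (cap < need) cap *= 2;   /* cap is always at least 1 */
        p = realloc(s->data, cap);
        if (p == NULL) return GSTR_ERR;
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, cstr, n + 1);
    s->len += n;
    return GSTR_OK;
}

/* Free a string; NULL is accepted. */
static void gstr_free(struct gstr *s) {
    if (s != NULL) { free(s->data); free(s); }
}
```
++
++A caller can OR the statuses of several gstr_cat calls together and test
++the result once at the end; if gstr_from was handed NULL (as getenv may
++return), the later calls report GSTR_ERR instead of crashing.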
++
++libclc
++------
++
++An attempt to add to the standard C library with a number of common useful
++functions, including additional string functions.
++http://libclc.sourceforge.net/
++
++1. Uses standard char * buffer, and adopts C 99's usage of "restrict" to pass
++ the responsibility to guard against aliasing to the programmer.
++2. Adds no safety or memory management whatsoever.
++3. Most of the supplied string functions are completely trivial.
++
++The goals of libclc and Bstrlib are clearly quite different.
++
++fireString
++----------
++
++http://firestuff.org/
++
++1. Uses standard char * buffer, and adopts C 99's usage of "restrict" to pass
++ the responsibility to guard against aliasing to the programmer.
++2. Mixes char * and length wrapped buffers (estr) functions, doubling the API
++ size, with safety limited to only half of the functions.
++
++Firestring was originally just a wrapper of char * functionality with extra
++length parameters. However, it has been augmented with the inclusion of the
++estr type which has similar functionality to stralloc. But firestring does
++not nearly cover the functional scope of Bstrlib.
++
++Safe C String Library
++---------------------
++
++A library written for the purpose of increasing safety and power to C's string
++handling capabilities.
++http://www.zork.org/safestr/safestr.html
++
++1. While the safestr_* functions are safe in and of themselves, interoperating
++ with char * strings has dangerous unsafe modes of operation.
++2. The architecture of safestr causes the base pointer to change. Thus,
++ it's not practical/safe to store a safestr in multiple locations if any
++ single instance can be manipulated.
++3. Dependent on an additional error handling library.
++4. Uses reference counting, meaning that it is either not thread safe or
++ slow and not portable.
++
++I think the idea of reallocating (and hence potentially changing) the base
++pointer is a serious design flaw that is fatal to this architecture. True
++safety is obtained by having automatic handling of all common scenarios
++without creating implicit constraints on the user.
++
++Because of its automatic temporary clean up system, it cannot use "const"
++semantics on input arguments. Interesting anomalies such as:
++
++ safestr_t s, t;
++ s = safestr_replace (t = SAFESTR_TEMP ("This is a test"),
++ SAFESTR_TEMP (" "), SAFESTR_TEMP ("."));
++ /* t is now undefined. */
++
++are possible. If one defines a function which takes a safestr_t as a
++parameter, then the function would not know whether or not the safestr_t is
++defined after it passes it to a safestr library function. The author's
++recommended method for working around this problem is to examine the
++attributes of the safestr_t within the function which is to modify any of
++its parameters and play games with its reference count. I think, therefore,
++that the whole SAFESTR_TEMP idea is also fatally broken.
++
++The library implements immutability, optional non-resizability, and a "trust"
++flag. This trust flag is interesting, and suggests that applying any
++arbitrary sequence of safestr_* function calls on any set of trusted strings
++will result in a trusted string. It seems to me, however, that if one wanted
++to implement a trusted string semantic, one might do so by actually creating
++a different *type* and only implement the subset of string functions that are
++deemed safe (i.e., user input would be excluded, for example.) This, in
++essence, would allow the compiler to enforce trust propagation at compile
++time rather than run time. Non-resizability is also interesting, however,
++it seems marginal (i.e., to want a string that cannot be resized, yet can be
++modified and yet where a fixed sized buffer is undesirable.)
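++
++The compile-time alternative suggested above can be sketched with two
++distinct struct types. Everything here (struct raw_str, struct trusted_str,
++validate, run_query) is hypothetical and only illustrates the idea; neither
++safestr nor Bstrlib provides these names, and the validation rule is an
++arbitrary choice made for the example.
++
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Untrusted text, e.g. straight from user input. */
struct raw_str { const char *data; };

/* Text that has passed validation. By convention only validate() fills
   one in; an opaque type would be needed to enforce that strictly. */
struct trusted_str { const char *data; };

/* Promote raw text to trusted text if it is non-empty and contains only
   [a-z0-9_]. Returns 1 and fills *out on success, 0 otherwise. */
static int validate(struct raw_str in, struct trusted_str *out) {
    static const char allowed[] = "abcdefghijklmnopqrstuvwxyz0123456789_";
    if (in.data == NULL || in.data[0] == '\0') return 0;
    if (strspn(in.data, allowed) != strlen(in.data)) return 0;
    out->data = in.data;
    return 1;
}

/* A sensitive operation that accepts only the trusted type. Passing a
   struct raw_str here is rejected by the compiler as a type error, so no
   run-time trust flag has to be consulted. Returns the name's length as
   a stand-in for real work. */
static size_t run_query(struct trusted_str table) {
    return strlen(table.data);
}
```
++
++With this arrangement run_query (good) does not compile when good is a
++struct raw_str; the value has to go through validate first.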
++
++===============================================================================
++
++Examples
++--------
++
++ Dumping a line numbered file:
++
++ FILE * fp;
++ int i, ret;
++ struct bstrList * lines;
++ struct tagbstring prefix = bsStatic ("-> ");
++
++ if (NULL != (fp = fopen ("bstrlib.txt", "rb"))) {
++ bstring b = bread ((bNread) fread, fp);
++ fclose (fp);
++ if (NULL != (lines = bsplit (b, '\n'))) {
++ for (i=0; i < lines->qty; i++) {
++ binsert (lines->entry[i], 0, &prefix, '?');
++ printf ("%04d: %s\n", i, bdatae (lines->entry[i], "NULL"));
++ }
++ bstrListDestroy (lines);
++ }
++ bdestroy (b);
++ }
++
++For numerous other examples, see bstraux.c, bstraux.h and the example archive.
++
++===============================================================================
++
++License
++-------
++
++The Better String Library is available under either the BSD license (see the
++accompanying license.txt) or the Gnu Public License version 2 (see the
++accompanying gpl.txt) at the option of the user.
++
++===============================================================================
++
++Acknowledgements
++----------------
++
++The following individuals have made significant contributions to the design
++and testing of the Better String Library:
++
++Bjorn Augestad
++Clint Olsen
++Darryl Bleau
++Fabian Cenedese
++Graham Wideman
++Ignacio Burgueno
++International Business Machines Corporation
++Ira Mica
++John Kortink
++Manuel Woelker
++Marcel van Kervinck
++Michael Hsieh
++Richard A. Smith
++Simon Ekstrom
++Wayne Scott
++
++===============================================================================
diff --git a/debian/patches/08-no-stream.patch b/debian/patches/08-no-stream.patch
new file mode 100644
index 0000000..70b7c1f
--- /dev/null
+++ b/debian/patches/08-no-stream.patch
@@ -0,0 +1,465 @@
+Description: No STREAM benchmark because of license problems
+ The debian guys don't like the STREAM benchmark license, so we
+ remove the STREAM benchmark for debian.
+ Author: Thomas Roehl <thomas.roehl at fau.de>$
+ Last-Update: 2015-05-05$
+ ---$
+ This patch header follows DEP-3: http://dep.debian.net/deps/dep3/$
+
+diff -Nur likwid-3.1.3.orig/test/Makefile likwid-3.1.3/test/Makefile
+--- likwid-3.1.3.orig/test/Makefile 2015-05-05 16:29:22.011134000 +0200
++++ likwid-3.1.3/test/Makefile 2015-05-05 16:32:21.055211000 +0200
+@@ -1,17 +1,11 @@
+ LIKWID_LIB = -L../ -llikwid
+ INCLUDES = -I../src/includes -I../ -I../MIC
+
+-all: testmarker testmarkerF90 stream streamM
++all: testmarker testmarkerF90
+
+ testmarkerF90: chaos.F90
+ ifort $(INCLUDES) -O3 -o $@ chaos.F90 $(LIKWID_LIB) -lpthread
+
+-stream: stream.c
+- icc -O3 $(INCLUDES) -mmic -openmp -o $@ -DLIKWID_PERFMON stream.c $(LIKWID_LIB) -lm
+-
+-streamM: stream.c
+- gcc -O3 $(INCLUDES) -fopenmp -o $@ -DLIKWID_PERFMON stream.c $(LIKWID_LIB) -lm
+-
+ testmarker:
+ gcc -O3 -std=c99 $(INCLUDES) -fopenmp -DLIKWID_PERFMON -o $@ testmarker-cnt.c $(LIKWID_LIB) -lm
+
+@@ -24,6 +18,6 @@
+ .PHONY: clean
+
+ clean:
+- rm -f stream streamM testmarker testmarkerF90
++ rm -f testmarker testmarkerF90
+
+
+diff -Nur likwid-3.1.3.orig/test/stream.c likwid-3.1.3/test/stream.c
+--- likwid-3.1.3.orig/test/stream.c 2015-05-05 16:29:22.008137000 +0200
++++ likwid-3.1.3/test/stream.c 1970-01-01 01:00:00.000000000 +0100
+@@ -1,423 +0,0 @@
+-/*-----------------------------------------------------------------------*/
+-/* Program: Stream */
+-/* Revision: $Id: stream.c,v 5.8 2007/02/19 23:57:39 mccalpin Exp mccalpin $ */
+-/* Original code developed by John D. McCalpin */
+-/* Programmers: John D. McCalpin */
+-/* Joe R. Zagar */
+-/* */
+-/* This program measures memory transfer rates in MB/s for simple */
+-/* computational kernels coded in C. */
+-/*-----------------------------------------------------------------------*/
+-/* Copyright 1991-2005: John D. McCalpin */
+-/*-----------------------------------------------------------------------*/
+-/* License: */
+-/* 1. You are free to use this program and/or to redistribute */
+-/* this program. */
+-/* 2. You are free to modify this program for your own use, */
+-/* including commercial use, subject to the publication */
+-/* restrictions in item 3. */
+-/* 3. You are free to publish results obtained from running this */
+-/* program, or from works that you derive from this program, */
+-/* with the following limitations: */
+-/* 3a. In order to be referred to as "STREAM benchmark results", */
+-/* published results must be in conformance to the STREAM */
+-/* Run Rules, (briefly reviewed below) published at */
+-/* http://www.cs.virginia.edu/stream/ref.html */
+-/* and incorporated herein by reference. */
+-/* As the copyright holder, John McCalpin retains the */
+-/* right to determine conformity with the Run Rules. */
+-/* 3b. Results based on modified source code or on runs not in */
+-/* accordance with the STREAM Run Rules must be clearly */
+-/* labelled whenever they are published. Examples of */
+-/* proper labelling include: */
+-/* "tuned STREAM benchmark results" */
+-/* "based on a variant of the STREAM benchmark code" */
+-/* Other comparable, clear and reasonable labelling is */
+-/* acceptable. */
+-/* 3c. Submission of results to the STREAM benchmark web site */
+-/* is encouraged, but not required. */
+-/* 4. Use of this program or creation of derived works based on this */
+-/* program constitutes acceptance of these licensing restrictions. */
+-/* 5. Absolutely no warranty is expressed or implied. */
+-/*-----------------------------------------------------------------------*/
+-#define _GNU_SOURCE
+-#include <stdlib.h>
+-# include <stdio.h>
+-# include <math.h>
+-# include <float.h>
+-# include <omp.h>
+-# include <limits.h>
+-# include <sys/time.h>
+-#include <sys/types.h>
+-#include <sys/syscall.h>
+-#include <unistd.h>
+-#include <sched.h>
+-#include <time.h>
+-#include <pthread.h>
+-
+-/* INSTRUCTIONS:
+- *
+- * 1) Stream requires a good bit of memory to run. Adjust the
+- * value of 'N' (below) to give a 'timing calibration' of
+- * at least 20 clock-ticks. This will provide rate estimates
+- * that should be good to about 5% precision.
+- */
+-
+-# define N 60000000
+-# define NTIMES 10
+-# define OFFSET 0
+-
+-/*
+- * 3) Compile the code with full optimization. Many compilers
+- * generate unreasonably bad code before the optimizer tightens
+- * things up. If the results are unreasonably good, on the
+- * other hand, the optimizer might be too smart for me!
+- *
+- * Try compiling with:
+- * cc -O stream_omp.c -o stream_omp
+- *
+- * This is known to work on Cray, SGI, IBM, and Sun machines.
+- *
+- *
+- * 4) Mail the results to mccalpin at cs.virginia.edu
+- * Be sure to include:
+- * a) computer hardware model number and software revision
+- * b) the compiler flags
+- * c) all of the output from the test case.
+- * Thanks!
+- *
+- */
+-#define gettid() syscall(SYS_gettid)
+-#include <likwid.h>
+-
+-# define HLINE "-------------------------------------------------------------\n"
+-
+-# ifndef MIN
+-# define MIN(x,y) ((x)<(y)?(x):(y))
+-# endif
+-# ifndef MAX
+-# define MAX(x,y) ((x)>(y)?(x):(y))
+-# endif
+-
+-static double a[N+OFFSET],
+- b[N+OFFSET],
+- c[N+OFFSET];
+-
+-static double avgtime[4] = {0}, maxtime[4] = {0},
+- mintime[4] = {FLT_MAX,FLT_MAX,FLT_MAX,FLT_MAX};
+-
+-static char *label[4] = {"Copy: ", "Scale: ",
+- "Add: ", "Triad: "};
+-
+-static double bytes[4] = {
+- 2 * sizeof(double) * N,
+- 2 * sizeof(double) * N,
+- 3 * sizeof(double) * N,
+- 3 * sizeof(double) * N
+- };
+-
+-static int
+-getProcessorID(cpu_set_t* cpu_set)
+-{
+- int processorId;
+-
+- for (processorId=0;processorId<128;processorId++)
+- {
+- if (CPU_ISSET(processorId,cpu_set))
+- {
+- break;
+- }
+- }
+- return processorId;
+-}
+-
+-int threadGetProcessorId()
+-{
+- cpu_set_t cpu_set;
+- CPU_ZERO(&cpu_set);
+- sched_getaffinity(gettid(),sizeof(cpu_set_t), &cpu_set);
+-
+- return getProcessorID(&cpu_set);
+-}
+-
+-extern double mysecond();
+-extern void checkSTREAMresults();
+-#ifdef _OPENMP
+-extern int omp_get_num_threads();
+-#endif
+-int
+-main()
+- {
+- int quantum, checktick();
+- int BytesPerWord;
+- register int j, k;
+- double scalar, t, times[4][NTIMES];
+-
+- /* --- SETUP --- determine precision and check timing --- */
+-
+- printf(HLINE);
+- printf("STREAM version $Revision: 5.8 $\n");
+- printf(HLINE);
+- BytesPerWord = sizeof(double);
+- printf("This system uses %d bytes per DOUBLE PRECISION word.\n",
+- BytesPerWord);
+-
+- printf(HLINE);
+- printf("Array size = %d, Offset = %d\n" , N, OFFSET);
+- printf("Total memory required = %.1f MB.\n",
+- (3.0 * BytesPerWord) * ( (double) N / 1048576.0));
+- printf("Each test is run %d times, but only\n", NTIMES);
+- printf("the *best* time for each is used.\n");
+-
+-#ifdef LIKWID_PERFMON
+- printf("Using likwid\n");
+-#endif
+-
+- LIKWID_MARKER_INIT;
+-
+-#ifdef _OPENMP
+- printf(HLINE);
+-#pragma omp parallel
+- {
+- LIKWID_MARKER_THREADINIT;
+-#pragma omp master
+- {
+- k = omp_get_num_threads();
+- printf ("Number of Threads requested = %i\n",k);
+- }
+-
+- printf ("Thread %d running on processor %d ....\n",omp_get_thread_num(),threadGetProcessorId());
+- }
+-#endif
+-
+- LIKWID_MARKER_START("init");
+- /* Get initial value for system clock. */
+-//#pragma omp parallel for
+- for (j=0; j<N; j++) {
+- a[j] = 1.0;
+- b[j] = 2.0;
+- c[j] = 0.0;
+- }
+- LIKWID_MARKER_STOP("init");
+-
+- printf(HLINE);
+-
+- if ( (quantum = checktick()) >= 1)
+- printf("Your clock granularity/precision appears to be "
+- "%d microseconds.\n", quantum);
+- else {
+- printf("Your clock granularity appears to be "
+- "less than one microsecond.\n");
+- quantum = 1;
+- }
+-
+- t = mysecond();
+-#pragma omp parallel for
+- for (j = 0; j < N; j++)
+- a[j] = 2.0E0 * a[j];
+- t = 1.0E6 * (mysecond() - t);
+-
+- printf("Each test below will take on the order"
+- " of %d microseconds.\n", (int) t );
+- printf(" (= %d clock ticks)\n", (int) (t/quantum) );
+- printf("Increase the size of the arrays if this shows that\n");
+- printf("you are not getting at least 20 clock ticks per test.\n");
+-
+- printf(HLINE);
+-
+- printf("WARNING -- The above is only a rough guideline.\n");
+- printf("For best results, please be sure you know the\n");
+- printf("precision of your system timer.\n");
+- printf(HLINE);
+-
+- /* --- MAIN LOOP --- repeat test cases NTIMES times --- */
+-
+- scalar = 3.0;
+- for (k=0; k<NTIMES; k++)
+- {
+- times[0][k] = mysecond();
+-#pragma omp parallel
+- {
+- LIKWID_MARKER_START("copy");
+-#pragma omp for
+- for (j=0; j<N; j++)
+- c[j] = a[j];
+- LIKWID_MARKER_STOP("copy");
+- }
+- times[0][k] = mysecond() - times[0][k];
+-
+- times[1][k] = mysecond();
+-#pragma omp parallel
+- {
+- LIKWID_MARKER_START("scale");
+-#pragma omp for
+- for (j=0; j<N; j++)
+- b[j] = scalar*c[j];
+- LIKWID_MARKER_STOP("scale");
+- }
+- times[1][k] = mysecond() - times[1][k];
+-
+- times[2][k] = mysecond();
+-#pragma omp parallel
+- {
+- LIKWID_MARKER_START("add");
+-#pragma omp for
+- for (j=0; j<N; j++)
+- c[j] = a[j]+b[j];
+- LIKWID_MARKER_STOP("add");
+- }
+- times[2][k] = mysecond() - times[2][k];
+-
+- times[3][k] = mysecond();
+-#pragma omp parallel
+- {
+- LIKWID_MARKER_START("triad");
+-#pragma omp for
+- for (j=0; j<N; j++)
+- a[j] = b[j]+scalar*c[j];
+- LIKWID_MARKER_STOP("triad");
+- }
+- times[3][k] = mysecond() - times[3][k];
+- }
+-
+- /* --- SUMMARY --- */
+-
+- for (k=1; k<NTIMES; k++) /* note -- skip first iteration */
+- {
+- for (j=0; j<4; j++)
+- {
+- avgtime[j] = avgtime[j] + times[j][k];
+- mintime[j] = MIN(mintime[j], times[j][k]);
+- maxtime[j] = MAX(maxtime[j], times[j][k]);
+- }
+- }
+-
+- printf("Function Rate (MB/s) Avg time Min time Max time\n");
+- for (j=0; j<4; j++) {
+- avgtime[j] = avgtime[j]/(double)(NTIMES-1);
+-
+- printf("%s%11.4f %11.4f %11.4f %11.4f\n", label[j],
+- 1.0E-06 * bytes[j]/mintime[j],
+- avgtime[j],
+- mintime[j],
+- maxtime[j]);
+- }
+- printf(HLINE);
+-
+- /* --- Check Results --- */
+- checkSTREAMresults();
+- printf(HLINE);
+-
+- LIKWID_MARKER_CLOSE;
+- return 0;
+-}
+-
+-# define M 20
+-
+-int
+-checktick()
+- {
+- int i, minDelta, Delta;
+- double t1, t2, timesfound[M];
+-
+-/* Collect a sequence of M unique time values from the system. */
+-
+- for (i = 0; i < M; i++) {
+- t1 = mysecond();
+- while( ((t2=mysecond()) - t1) < 1.0E-6 )
+- ;
+- timesfound[i] = t1 = t2;
+- }
+-
+-/*
+- * Determine the minimum difference between these M values.
+- * This result will be our estimate (in microseconds) for the
+- * clock granularity.
+- */
+-
+- minDelta = 1000000;
+- for (i = 1; i < M; i++) {
+- Delta = (int)( 1.0E6 * (timesfound[i]-timesfound[i-1]));
+- minDelta = MIN(minDelta, MAX(Delta,0));
+- }
+-
+- return(minDelta);
+- }
+-
+-
+-
+-/* A gettimeofday routine to give access to the wall
+- clock timer on most UNIX-like systems. */
+-
+-#include <sys/time.h>
+-
+-double mysecond()
+-{
+- struct timeval tp;
+- struct timezone tzp;
+- int i;
+-
+- i = gettimeofday(&tp,&tzp);
+- return ( (double) tp.tv_sec + (double) tp.tv_usec * 1.e-6 );
+-}
+-
+-void checkSTREAMresults ()
+-{
+- double aj,bj,cj,scalar;
+- double asum,bsum,csum;
+- double epsilon;
+- int j,k;
+-
+- /* reproduce initialization */
+- aj = 1.0;
+- bj = 2.0;
+- cj = 0.0;
+- /* a[] is modified during timing check */
+- aj = 2.0E0 * aj;
+- /* now execute timing loop */
+- scalar = 3.0;
+- for (k=0; k<NTIMES; k++)
+- {
+- cj = aj;
+- bj = scalar*cj;
+- cj = aj+bj;
+- aj = bj+scalar*cj;
+- }
+- aj = aj * (double) (N);
+- bj = bj * (double) (N);
+- cj = cj * (double) (N);
+-
+- asum = 0.0;
+- bsum = 0.0;
+- csum = 0.0;
+- for (j=0; j<N; j++) {
+- asum += a[j];
+- bsum += b[j];
+- csum += c[j];
+- }
+-
+-#ifndef abs
+-#define abs(a) ((a) >= 0 ? (a) : -(a))
+-#endif
+- epsilon = 1.e-8;
+-
+- if (abs(aj-asum)/asum > epsilon) {
+- printf ("Failed Validation on array a[]\n");
+- printf (" Expected : %f \n",aj);
+- printf (" Observed : %f \n",asum);
+- }
+- else if (abs(bj-bsum)/bsum > epsilon) {
+- printf ("Failed Validation on array b[]\n");
+- printf (" Expected : %f \n",bj);
+- printf (" Observed : %f \n",bsum);
+- }
+- else if (abs(cj-csum)/csum > epsilon) {
+- printf ("Failed Validation on array c[]\n");
+- printf (" Expected : %f \n",cj);
+- printf (" Observed : %f \n",csum);
+- }
+- else {
+- printf ("Solution Validates\n");
+- }
+-}
+-
diff --git a/debian/patches/series b/debian/patches/series
index 8dae98c..2887ed0 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -4,4 +4,6 @@
04-Makefile-overload.patch
05-ghashc-copyright.patch
06-ghashh-copyright.patch
+07-bstrlib.patch
+08-no-stream.patch
10-debianize.patch
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/likwid/likwid.git