Version 0.66
shablool@users.sf.net
Strinx Motivation
Need to solve real-life problems
Insufficient and error-prone functionality of
C/C++ strings
Avoid heap exhaustion due to extensive
usage of std::basic_string
Avoid performance degradation of STL
containers in multi-threaded environment
Open source
Lightweight
C++ Design Aims
(Stroustrup, An Overview of the C++ Programming Language, 1999)
STL History
1979: Alexander Stepanov, Generic
Programming
1987: Stepanov & Musser, List processing
in Ada
1992: Stepanov & Lee, Initial C++ version
(HP)
1994: Approved by C++ ANSI/ISO standard
committee
The Power of Generic
Programming
// Print items of list to std::cout, separated with
// newline.
template <typename ListT>
void print_list(const ListT& ls)
{
typedef typename ListT::value_type value_type;
typedef std::ostream_iterator<value_type> output_iterator;
std::copy(ls.begin(), ls.end(),
output_iterator(std::cout, "\n"));
}
Typical Usage of STL Containers
// Split s into two substrings, using whitespace as delimiter.
// WARN: THIS CODE HAS BUG!
std::list<std::string>
stl_naive_split(const std::string& s)
{
size_t i, j;
std::list<std::string> toks_list;
i = s.find(' ');
j = s.find_first_not_of(' ', i);
toks_list.push_back(s.substr(0, i));
if (j < s.size())
{
toks_list.push_back(s.substr(j, s.size() - j));
}
return toks_list;
}
Example: Split & Print
// Split argv[1] and print.
int main(int argc, char* argv[])
{
if (argc > 1)
{
std::string s(argv[1]);
print_list(stl_naive_split(s));
}
exit(EXIT_SUCCESS);
return 0;
}
What is a String?
Overview std::basic_string
template<typename _CharT, typename _Traits,
typename _Alloc>
class basic_string;
std::string s1("hello, world"), s2;
std::string s3 = s1;
s2 = s3;
if (s2[0] == 'H') { ... }
[Figure: s1, s2 and s3 all refer to one shared buffer, labelled 11,24, holding "hello, world\0"]
What is Wrong with
std::basic_string?
Always uses heap allocation
“Behind the scenes” allocations
Monoliths "Unstrung" (GotW #84)
Non-elegant interface
Prone to errors (e.g., tokenizing)
Strinx substring
A semantically rich wrapper over a
const char pointer and a size.
Refers to existing strings. No memory
allocations!
[Figure: substring SS (size=5) referring to the first five characters of S1's buffer "hello, world\0"]
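The pointer-plus-size idea can be sketched as follows. This is a minimal illustration only: `View` and its members are hypothetical names, not the Strinx SubString class, whose real interface is not shown in these slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a substring type: it refers to characters
// owned by someone else and never allocates.
struct View
{
    const char* ptr;
    std::size_t len;

    View() : ptr(""), len(0) {}
    View(const char* s) : ptr(s), len(std::strlen(s)) {}
    View(const char* s, std::size_t n) : ptr(s), len(n) {}

    std::size_t size() const { return len; }

    // Sub-range [pos, pos + n), clamped to the end of the view.
    // Only the pointer and the size change; no characters are copied.
    View substr(std::size_t pos, std::size_t n) const
    {
        assert(pos <= len);
        if (n > len - pos)
            n = len - pos;
        return View(ptr + pos, n);
    }

    // Compare the viewed characters with a NUL-terminated string.
    bool equals(const char* s) const
    {
        return std::strlen(s) == len && std::memcmp(ptr, s, len) == 0;
    }
};
```

With `const char* s1 = "hello, world";`, the expression `View(s1).substr(0, 5)` yields a view of size 5 whose pointer is `s1` itself. The view is only valid while the underlying buffer is alive.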
Operations with substrings
All immutable operations of
std::basic_string
substr, chop, split, trim
Token parsing
Generics (test_if, count_if)
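As a rough sketch of how immutable operations such as strip and split can work on a pointer/size pair without copying, assuming hypothetical names (`Span`, `make_span`, `strip`, `split`, `copy_out`) that are not the Strinx API:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning character range.
struct Span
{
    const char* ptr;
    std::size_t len;
};

Span make_span(const char* s)
{
    Span v = { s, std::strlen(s) };
    return v;
}

// Drop leading and trailing whitespace by moving the two ends inward.
Span strip(Span v)
{
    while (v.len > 0 && std::isspace(static_cast<unsigned char>(v.ptr[0])))
    {
        ++v.ptr;
        --v.len;
    }
    while (v.len > 0 && std::isspace(static_cast<unsigned char>(v.ptr[v.len - 1])))
        --v.len;
    return v;
}

// Split at the first occurrence of sep. If sep is absent, head is the
// whole input and tail is empty.
void split(Span v, char sep, Span& head, Span& tail)
{
    std::size_t i = 0;
    while (i < v.len && v.ptr[i] != sep)
        ++i;
    head.ptr = v.ptr;
    head.len = i;
    if (i < v.len)
    {
        tail.ptr = v.ptr + i + 1;
        tail.len = v.len - i - 1;
    }
    else
    {
        tail.ptr = v.ptr + v.len;
        tail.len = 0;
    }
}

// Copy the viewed characters out; this is the only step that allocates.
std::string copy_out(Span v)
{
    return std::string(v.ptr, v.len);
}
```

Every result is a sub-range of the input buffer, so a chain such as strip, split, strip touches the heap only if the caller finally copies a piece out.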
Example: Strip & Split
int main(int argc, char* argv[])
{
using namespace strinx;
SubString ss, key, val;
for (int i = 1; i < argc; ++i)
{
// Strip leading and trailing whitespaces
ss = strip(SubString(argv[i]));
// Expecting key=value.
tie(key, val) = split(ss, '=');
key = strip(key); val = strip(val);
// Output
if (key.size() && val.size())
std::cout << "KEY: " << key
<< " VALUE: " << val << std::endl;
}
return 0;
}
Example: Token Parsing
int main(int argc, char* argv[])
{
using namespace strinx;
// Parse 'line' into tokens, delimited by ':', ';' or '/'.
// Output is: Ant Bee Cat Dog Eel Frog Giraffe
SString line("/Ant:::Bee;:Cat:Dog;Eel/Frog:/Giraffe///");
const char seps[] = ":;/";
SubString tok = find_token(line, seps);
while (tok.size() > 0)
{
std::cout << tok << ' ';
tok = find_next_token(line, tok, seps);
}
std::cout << std::endl;
return 0;
}
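One plausible way to implement this kind of token scan over a NUL-terminated buffer uses the standard `strspn` and `strcspn` functions. The names `Token`, `next_token` and `all_tokens` are assumptions made for illustration and say nothing about how Strinx implements `find_token`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical token: a range inside the caller's buffer.
struct Token
{
    const char* ptr;
    std::size_t len;
};

// Find the first token at or after 'from'. An empty token (len == 0)
// means the end of the line was reached.
Token next_token(const char* from, const char* seps)
{
    from += std::strspn(from, seps);               // skip leading separators
    Token t = { from, std::strcspn(from, seps) };  // run up to the next separator
    return t;
}

// Collect all tokens. The copies into std::string exist only so the
// result is easy to inspect; the scan itself does not allocate.
std::vector<std::string> all_tokens(const char* line, const char* seps)
{
    std::vector<std::string> out;
    Token t = next_token(line, seps);
    while (t.len > 0)
    {
        out.push_back(std::string(t.ptr, t.len));
        t = next_token(t.ptr + t.len, seps);
    }
    return out;
}
```

Applied to the line from the example with separators ":;/", the scan visits Ant, Bee, Cat, Dog, Eel, Frog and Giraffe in that order; runs of consecutive separators produce no empty tokens.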
Char-Containers Alternatives to
std::basic_string
Class xstring: Dynamically allocated string.
Max-size is set upon construction. Uses an
internal buffer for short strings.
Example: Stringify Types
typedef strinx::sstring<30, char> TString;
typedef std::complex<int> Complex;
// Converts a complex number to its string representation.
TString str(const Complex& c)
{
TString s;
strinx::format(s, "(%d,%d)", c.real(), c.imag());
return s;
}
// Stringify and print complex number
int main(void)
{
Complex c(5, 12);
std::cout << str(c) << std::endl;
return 0;
}
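The effect of a fixed-capacity string with printf-style formatting can be approximated with standard facilities alone. The class below is a hypothetical sketch (`FixedString` is not a Strinx type) that keeps its characters inside the object and truncates on overflow; how Strinx itself handles overflow is not stated in these slides.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical fixed-capacity string: storage lives inside the object,
// so an instance on the stack needs no heap allocation.
template <std::size_t N>
class FixedString
{
public:
    FixedString() : len_(0) { buf_[0] = '\0'; }

    const char* c_str() const { return buf_; }
    std::size_t size() const { return len_; }
    std::size_t capacity() const { return N; }

    // printf-style formatting. Output longer than N characters is
    // truncated (an assumption of this sketch).
    void format(const char* fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        const int n = std::vsnprintf(buf_, N + 1, fmt, ap);
        va_end(ap);
        if (n < 0)
        {
            // Encoding error: leave the string empty.
            buf_[0] = '\0';
            len_ = 0;
            return;
        }
        len_ = static_cast<std::size_t>(n) > N ? N : static_cast<std::size_t>(n);
    }

private:
    char buf_[N + 1];   // N characters plus the terminating NUL
    std::size_t len_;
};
```

For instance, `FixedString<30> s; s.format("(%d,%d)", 5, 12);` leaves "(5,12)" in `s` without touching the heap.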
Logging
Complex c(5, 12);
printf("Complex: (%d,%d)\n", c.real(), c.imag());
std::cout << "Complex: (" << c.real() << ','
<< c.imag() << ")" << std::endl;
printf("Complex: %s\n", str(c).c_str());
std::cout << "Complex: " << str(c) << std::endl;
Format Types “on the Stack”
// "C" style string formatting
void to_string(strinx::SString& s,
const std::complex<int>& c,
const basic_fmt* fmt = 0)
{
format(s, "(%d,%d)", c.real(), c.imag());
}
using namespace strinx;
std::complex<int> c(3, 4);
SString s;
format(s, "<%20r>", c);
std::cout << s << std::endl;
Locale
Relatively new feature in the C++ standard
Not well integrated with std::basic_string
Strinx solution: a set of functions over
substring:
SubString abc_lower("abcdef"), abc_upper("ABCDEF");
SubString xdigs_lower("0123456789abcdef");
// Case insensitive compare.
int n = icompare(abc_lower, abc_upper, std::locale("C"));
// Remove any leading and trailing alphabetic characters.
SubString ss = strip_alphas(xdigs_lower, std::locale("C"));
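Locale-aware helpers of this kind can be built on the `std::tolower` and `std::isalpha` overloads from `<locale>`. The functions below are a sketch over std::string with hypothetical names (`icompare_sketch`, `strip_alphas_sketch`); they illustrate the approach and are not the Strinx implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Returns <0, 0 or >0 like strcmp, ignoring case according to 'loc'.
int icompare_sketch(const std::string& a, const std::string& b,
                    const std::locale& loc = std::locale::classic())
{
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        const char x = std::tolower(a[i], loc);
        const char y = std::tolower(b[i], loc);
        if (x != y)
            return static_cast<unsigned char>(x) <
                   static_cast<unsigned char>(y) ? -1 : 1;
    }
    if (a.size() == b.size())
        return 0;
    return a.size() < b.size() ? -1 : 1;
}

// Removes leading and trailing alphabetic characters according to 'loc'.
std::string strip_alphas_sketch(const std::string& s,
                                const std::locale& loc = std::locale::classic())
{
    std::size_t first = 0;
    std::size_t last = s.size();
    while (first < last && std::isalpha(s[first], loc))
        ++first;
    while (last > first && std::isalpha(s[last - 1], loc))
        --last;
    return s.substr(first, last - first);
}
```

Passing the locale explicitly keeps the functions independent of the global locale, which matters when several threads format or compare text at the same time.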
Memory Spaces of Linux
Process
Memory Allocator
Dynamic Memory Allocations in
Multithreaded Environments
Doug Lea allocator (glibc) does not
distinguish between threads
Modern allocators are better designed for
multithreaded environments running on
multiprocessor architectures (Hoard,
Solaris mtmalloc, Google TCMalloc)
Some Storage Management Techniques for
Container Classes
(Doug Lea, C++ Report 1989)
The State of the Language
(Concurrency)
(Stroustrup, August 2008)
Memory Allocation in STL
Containers
// A standard container with linear time access to elements,
// and fixed time insertion/deletion at any point in the sequence.
template<typename _Tp, typename _Alloc = std::allocator<_Tp> >
class list;
_List_node<_Tp>*
_M_get_node()
{ return _M_impl._Node_alloc_type::allocate(1); }
void
_M_put_node(_List_node<_Tp>* __p)
{ _M_impl._Node_alloc_type::deallocate(__p, 1); }
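The consequence of the node-at-a-time scheme above is roughly one allocator call per inserted element. A small experiment can make that visible; it assumes C++11 (for the minimal allocator interface), and `CountingAlloc`, `g_allocs` and `count_list_allocations` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Global counter, for illustration only; it is not thread-safe.
static std::size_t g_allocs = 0;

// Minimal C++11 allocator that counts calls to allocate().
template <typename T>
struct CountingAlloc
{
    typedef T value_type;

    CountingAlloc() {}
    template <typename U>
    CountingAlloc(const CountingAlloc<U>&) {}

    T* allocate(std::size_t n)
    {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }

template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Number of allocate() calls made while filling a list with n_elems items.
std::size_t count_list_allocations(int n_elems)
{
    const std::size_t before = g_allocs;
    std::list<int, CountingAlloc<int> > ls;
    for (int i = 0; i < n_elems; ++i)
        ls.push_back(i);
    return g_allocs - before;
}
```

The exact count depends on the standard library implementation, since some also allocate a sentinel node, but it is at least one call per element. Each of those calls reaches the general-purpose heap unless the allocator pools memory itself.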
Memory Pools
GNU libstdc++ Custom
Allocators
new_allocator, malloc_allocator:
Simply wrap ::operator new/malloc and
::operator delete/free.
__pool_alloc: A high-performance, single
pool allocator. The reusable memory is
shared among identical instantiations of
this type.
__mt_alloc: A high-performance fixed-size
allocator with exponentially-increasing
allocations.
Associative Containers
Set, Map, MultiSet, MultiMap
Underlying data structure: BST
AVL, Red-Black, TREAP (Skip-List)
Boost.Intrusive, Boost.Pool
Example: Random Numbers
// Generate random numbers within the range [first, last)
template <typename FwdIterator>
void stl_make_randoms(FwdIterator first, FwdIterator last)
{
while (first != last)
{
*first = rand();
++first;
}
}
template <typename ContainerT>
void strinx_make_randoms(ContainerT& c, int n)
{
while (n-- > 0)
{
c.push_back(rand());
}
}
Example: Random Numbers
int main(int argc, char* argv[])
{
// Calls T's constructor 10 times!
int n_elems = 10;
std::list<int> stl_list(n_elems);
stl_make_randoms(stl_list.begin(), stl_list.end());
print_list(stl_list);
strinx::list<int> strinx_list(n_elems);
strinx_make_randoms(strinx_list, n_elems);
print_list(strinx_list);
strinx::bounded_list<int> strinx_list2(n_elems);
strinx_make_randoms(strinx_list2, n_elems);
print_list(strinx_list2);
return 0;
}
Abnormal Allocations
(Multithreaded)
What if n_elems=1?
What happens in case of a runtime allocation
failure?
How can we know which thread failed to allocate
memory?
Strinx Containers
Normal: Allocates memory as needed via
allocator and retains an internal pool
(free-list) of unused memory, associated with
the object's instance.
Bounded: Allocates a single contiguous
memory region upon the object's construction
and manages it internally.
Static Bounded: Manages an internal
memory region, whose max-size is set as
a template parameter.
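The bounded variant can be pictured as a fixed block of slots plus a free list. The class below is a sketch under that assumption, requiring C++11 for `alignas`; `BoundedPool` and its members are hypothetical names and not the Strinx implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounded pool: one contiguous region is reserved at
// construction, and released slots are chained into a free list.
template <typename T>
class BoundedPool
{
public:
    explicit BoundedPool(std::size_t capacity)
        : slots_(capacity), free_(0), in_use_(0)
    {
        // Initially every slot is on the free list.
        for (std::size_t i = 0; i < capacity; ++i)
        {
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }

    // Returns raw storage for one T, or 0 when the pool is exhausted.
    // The caller constructs the object with placement new.
    void* acquire()
    {
        if (free_ == 0)
            return 0;
        Slot* s = free_;
        free_ = s->next;
        ++in_use_;
        return s->raw;
    }

    // Returns a slot obtained from acquire() to the free list.
    void release(void* p)
    {
        assert(p != 0 && in_use_ > 0);
        Slot* s = static_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
        --in_use_;
    }

    std::size_t in_use() const { return in_use_; }

private:
    union Slot
    {
        Slot* next;                               // valid while the slot is free
        alignas(T) unsigned char raw[sizeof(T)];  // valid while the slot is in use
    };

    std::vector<Slot> slots_;  // the single region, allocated once
    Slot* free_;
    std::size_t in_use_;
};
```

Because the pool belongs to one container instance, exhaustion is reported to the thread that owns that container (here as a null return) instead of surfacing as a process-wide heap failure.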
Buffer & List
Strinx buffer is a fixed-size container, with
double-ended queue semantics.
Designed to work with atomic_t (single-reader,
single-writer threads)
bounded_list is designed for small-sized
containers with list semantics
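A fixed-size buffer shared by one writer thread and one reader thread is commonly built as a ring with two atomic indices. The sketch below uses C++11 `std::atomic` in place of `atomic_t` and covers only the FIFO subset of the queue semantics; `SpscRing` is a hypothetical name and the design is an assumption, not the Strinx buffer.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer, single-consumer ring holding up to N items.
template <typename T, std::size_t N>
class SpscRing
{
public:
    SpscRing() : head_(0), tail_(0) {}

    // Call from the producer thread only. Returns false when full.
    bool push(const T& v)
    {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % (N + 1);
        if (next == head_.load(std::memory_order_acquire))
            return false;
        data_[t] = v;
        tail_.store(next, std::memory_order_release);  // publish the item
        return true;
    }

    // Call from the consumer thread only. Returns false when empty.
    bool pop(T& out)
    {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;
        out = data_[h];
        head_.store((h + 1) % (N + 1), std::memory_order_release);  // free the slot
        return true;
    }

private:
    T data_[N + 1];  // one slot stays unused to tell "full" from "empty"
    std::atomic<std::size_t> head_;  // next slot to read
    std::atomic<std::size_t> tail_;  // next slot to write
};
```

Each index is written by exactly one thread, so no lock is needed, and the storage is part of the object, so neither thread allocates while exchanging items.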
Example: Retokenize
#define MAX_TOKENS (10)
// Tokenize string (using ':', '/' or ';' as delimiters).
// Put result in list and dump to std::cout.
int main(int argc, char* argv[])
{
using namespace strinx;
list< SubString > tok_list(MAX_TOKENS);
SString line(" /Ant:::Bee;:Cat:Dog;Eel/Frog:/Giraffe/// ");
tokens(tok_list, strip(line), ":/;");
std::copy(tok_list.begin(), tok_list.end(),
std::ostream_iterator< xstring<char> >(std::cout, " "));
std::cout << std::endl;
return 0;
}
Performance Benchmarks
Intel Xeon, 8 cores, 2992.536 MHz, 6144 KB cache size, Linux Kernel
2.6.18, GCC 4.1.2 using -O3
Questions?