Power up C++ with STL: Part II (string, set, map)

See Power up C++ with STL: Part I (introduction && vector)

String
There is a special container for manipulating strings. The string container differs from vector<char> in a few ways, mostly in its string manipulation functions and its memory management policy.

String has a substring function without iterators, just indices:
 
 
 string s = "hello";
 string
     s1 = s.substr(0, 3), // "hel"
     s2 = s.substr(1, 3), // "ell"
     s3 = s.substr(0, s.length() - 1), // "hell"
     s4 = s.substr(1); // "ello"

Beware of (s.length()-1) on empty string because s.length() is unsigned and unsigned(0) – 1 is definitely not what you are expecting!
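
The pitfall is easiest to see in a loop bound or an index. The following is a minimal sketch, assuming two hypothetical helpers, count_adjacent_equal() and last_char(), that are not part of the original text. The first writes its bound as i + 1 < s.length(), so nothing is ever subtracted from the unsigned length; the second makes the non-empty precondition explicit before using s.length() - 1.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: counts positions i where s[i] == s[i + 1].
// The bound "i + 1 < s.length()" avoids "s.length() - 1", which
// wraps around to a huge unsigned value when s is empty.
int count_adjacent_equal(const std::string& s) {
    int count = 0;
    for (std::string::size_type i = 0; i + 1 < s.length(); i++) {
        if (s[i] == s[i + 1]) {
            count++;
        }
    }
    return count;
}

// Hypothetical helper: returns the last character of a non-empty string.
// The assert documents that s.length() - 1 is only safe when s is not empty.
char last_char(const std::string& s) {
    assert(!s.empty());
    return s[s.length() - 1];
}
```

For an empty string, count_adjacent_equal() simply returns 0, because the loop body never runs.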

Set
It’s always hard to decide which kind of container to describe first – set or map. My opinion is that, if the reader has a basic knowledge of algorithms, beginning from 'set' should be easier to understand.

Suppose we need a container with the following features:
  • add an element, but do not allow duplicates
  • remove elements
  • get the count of elements (distinct elements)
  • check whether an element is present in the set
This is a frequent task, and STL provides a special container for it: set. Set can add, remove and check the presence of a particular element in O(log N), where N is the count of objects in the set. While adding elements to set, duplicates are discarded. The count of elements in the set, N, is returned in O(1). We will speak of the algorithmic implementation of set and map later; for now, let’s investigate its interface:
 
 
 set<int> s;

 for (int i = 1; i <= 100; i++) {
     s.insert(i); // Insert 100 elements, [1..100]
 }

 s.insert(42); // does nothing, 42 already exists in set

 for (int i = 2; i <= 100; i += 2) {
     s.erase(i); // Erase even values
 }

 int n = int(s.size()); // n will be 50

The push_back() member may not be used with set. It makes sense: since the order in which elements are added to set does not matter, push_back() is not applicable here.

Since set is not a linear container, it’s impossible to access an element of set by index. Therefore, the only way to traverse the elements of set is to use iterators.
 
 
 // Calculate the sum of elements in set
 set<int> S;
 // ...
 int r = 0;
 for (set<int>::const_iterator it = S.begin(); it != S.end(); it++) {
     r += *it;
 }
 

It's more elegant to use traversing macros here. Why? Imagine you have a set< pair<string, pair< int, vector<int> > > >. How do you traverse it? Write down the iterator type name? Oh, no. Use our traverse macros instead.
 
 
 set< pair<string, pair< int, vector<int> > > > SS;
 int total = 0;
 tr(SS, it) {
     total += it->second.first;
 }
 

Notice the 'it->second.first' syntax. Since 'it' is an iterator, we need to take an object from 'it' before operating on it. So, the fully spelled-out syntax would be '(*it).second.first'. However, it’s easier to write 'something->' than '(*something).'. The full explanation would be quite long; just remember that, for iterators, both syntaxes are allowed.
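
The equivalence of the two spellings can be checked directly. The following is a small sketch, assuming a hypothetical alias NamedValues and a hypothetical function sum_seconds() (neither appears in the original text), which traverses a set of pairs and asserts on every step that both forms read the same value.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical alias used only for this illustration.
typedef std::set< std::pair<std::string, int> > NamedValues;

// Sums the second member of every pair in the set.
int sum_seconds(const NamedValues& values) {
    int total = 0;
    for (NamedValues::const_iterator it = values.begin(); it != values.end(); ++it) {
        // it->second and (*it).second refer to the same object.
        assert(it->second == (*it).second);
        total += it->second;
    }
    return total;
}
```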

To determine whether some element is present in set, use the 'find()' member function. Don’t be confused, though: there are several 'find()'s in STL. There is a global algorithm 'find()', which takes two iterators and an element, and runs in O(N). It is possible to use it for searching for an element in set, but why use an O(N) algorithm when an O(log N) one exists? When searching in set and map (and also in multiset/multimap, hash_map/hash_set, etc.) do not use the global find; instead, use the member function 'set::find()'. Like the ordinary find, set::find returns an iterator, either to the element found or to 'end()'. So, the element presence check looks like this:
 
 
 set<int> s;
 // ...
 if (s.find(42) != s.end()) {
     // 42 is present in set
 }
 else {
     // 42 is not present in set
 }
 

Another algorithm that runs in O(log N) when called as a member function is count. Some people think that
 
 
 if (s.count(42) != 0) {
     // …
 }
 

or even
 
 
 if (s.count(42)) {
     // …
 }
 

is easier to write. Personally, I don’t think so. Using count() in set/map is nonsense: the element is either present or not. As for me, I prefer to use the following two macros:

 #define present(container, element) (container.find(element) != container.end())
 #define cpresent(container, element) (find(all(container),element) != container.end())

(Remember that all(c) stands for “c.begin(), c.end()”)

Here, 'present()' returns whether the element is present in a container that has the member function 'find()' (i.e. set/map, etc.), while 'cpresent' is for vector.
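
For readers who prefer functions to macros, the same two checks can be sketched as follows. The names in_set() and in_vector() are hypothetical and chosen only for this illustration; the first uses the member find(), the second the global std::find().

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// O(log N): the member find() relies on the set's internal ordering.
bool in_set(const std::set<int>& s, int x) {
    return s.find(x) != s.end();
}

// O(N): the global std::find() scans the vector element by element.
bool in_vector(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}
```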

To erase an element from set use the erase() function.
 
 
 set<int> s;
 // …
 s.insert(54);
 s.erase(29);

The erase() function also has the interval form:
 set<int> s;
 // ..

 set<int>::iterator it1, it2;
 it1 = s.find(10);
 it2 = s.find(100);
 // Will work if it1 and it2 are valid iterators, i.e. values 10 and 100 are present in set.
 s.erase(it1, it2); // Note that 10 will be deleted, but 100 will remain in the container
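
The validity condition in the comment above can be checked in code. The following is a minimal sketch, assuming a hypothetical helper erase_between() that is not part of the original text: it erases the range only when both boundary values are found and are in the right order, and reports whether it did so.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: erases the elements in [first_value, last_value)
// only if both values are present in s and first_value <= last_value.
// Returns true if the range erase was performed.
bool erase_between(std::set<int>& s, int first_value, int last_value) {
    std::set<int>::iterator it1 = s.find(first_value);
    std::set<int>::iterator it2 = s.find(last_value);
    if (it1 == s.end() || it2 == s.end() || first_value > last_value) {
        return false; // erase(it1, it2) would not be given a valid range
    }
    s.erase(it1, it2); // first_value is removed, last_value stays
    return true;
}
```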

Set has an interval constructor:
 
 
 int data[5] = { 5, 1, 4, 2, 3 };
 set<int> S(data, data + 5);

It gives us a simple way to get rid of duplicates in vector, and sort it:
 
 
 vector<int> v;
 // …
 set<int> s(all(v));
 vector<int> v2(all(s));

Here 'v2' will contain the same elements as 'v' but sorted in ascending order and with duplicates removed.
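
Wrapped into a function, the trick might look like the sketch below. The name sorted_unique() is hypothetical, and the macro all() is spelled out as explicit begin()/end() calls so that the block stands on its own.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: returns the elements of v in ascending order
// with duplicates removed, by passing them through a set.
std::vector<int> sorted_unique(const std::vector<int>& v) {
    std::set<int> s(v.begin(), v.end());         // set drops duplicates and orders elements
    return std::vector<int>(s.begin(), s.end()); // copy back in ascending order
}
```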

Any comparable elements can be stored in set. This will be described later.

Map
There are two explanations of map. The simple explanation is the following:
 
 
 map<string, int> M;
 M["Top"] = 1;
 M["Coder"] = 2;
 M["SRM"] = 10;

 int x = M["Top"] + M["Coder"];

 if (M.find("SRM") != M.end()) {
     M.erase(M.find("SRM")); // or even M.erase("SRM")
 }
 

Very simple, isn’t it?

Actually map is very much like set, except it contains not just values but pairs <key, value>. Map ensures that at most one pair with specific key exists. Another quite pleasant thing is that map has operator [] defined.

Traversing map is easy with our 'tr()' macro. Notice that the iterator points to an std::pair of key and value. So, to get the value use it->second. An example follows:
 
 
 map<string, int> M;
 // …
 int r = 0;
 tr(M, it) {
     r += it->second;
 }
 

Don’t change the key of map element by iterator, because it may break the integrity of map internal data structure (see below).

There is one important difference between map::find() and map::operator []. While map::find() will never change the contents of map, operator [] will create an element if it does not exist. In some cases this could be very convenient, but it's definitely a bad idea to use operator [] many times in a loop when you do not want to add new elements. Because operator [] may modify the map, it cannot be used if map is passed as a const reference parameter to some function:
 void f(const map<string, int>& M) {
     if (M["the meaning"] == 42) { // Error! Cannot use [] on const map objects!
     }

     if (M.find("the meaning") != M.end()
         && M.find("the meaning")->second == 42) { // Correct
         cout << "Don't Panic!" << endl;
     }
 }
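
The side effect of operator [] can be observed through the size of the map. The following is a small sketch, assuming a hypothetical alias Dict and two hypothetical helpers, lookup_or() and size_after_bracket(), none of which appear in the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical alias used only for this illustration.
typedef std::map<std::string, int> Dict;

// Works on a const map: find() never inserts anything.
int lookup_or(const Dict& m, const std::string& key, int fallback) {
    Dict::const_iterator it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

// Takes the map by value, so the caller's map is untouched.
// operator [] inserts an entry with value 0 when the key is missing,
// which shows up as a larger size.
std::size_t size_after_bracket(Dict m, const std::string& key) {
    int value = m[key];
    (void)value; // only the side effect on m matters here
    return m.size();
}
```

With a map that holds only the key "Top", size_after_bracket() reports 2 for a missing key and 1 for "Top", while lookup_or() leaves the size unchanged in both cases.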
 

Notice on Map and Set
Internally map and set are almost always stored as red-black trees. We do not need to worry about the internal structure; the thing to remember is that the elements of map and set are always sorted in ascending order while traversing these containers. That’s why changing the key value while traversing map or set is strongly discouraged: a modification that breaks the order will, at the very least, make the container's algorithms misbehave.

But the fact that the elements of map and set are always ordered can be practically used while solving TopCoder problems.
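
One simple way to exploit the ordering, sketched here with two hypothetical helpers smallest() and largest(): since traversal is in ascending order, the minimum sits at begin() and the maximum at rbegin(), with no scan over the elements.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: the first element in traversal order is the minimum.
int smallest(const std::set<int>& s) {
    assert(!s.empty()); // begin() must not be dereferenced on an empty set
    return *s.begin();
}

// Hypothetical helper: the first element in reverse order is the maximum.
int largest(const std::set<int>& s) {
    assert(!s.empty()); // rbegin() must not be dereferenced on an empty set
    return *s.rbegin();
}
```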

Another important thing is that operators ++ and -- are defined on iterators in map and set. Thus, if the value 42 is present in set, and it is neither the first nor the last element, then the following code will work:
 
 
 set<int> S;
 // ...
 set<int>::iterator it = S.find(42);
 set<int>::iterator it1 = it, it2 = it;
 it1--;
 it2++;
 int a = *it1, b = *it2;

Here 'a' will contain the first neighbor of 42 to the left and 'b' the first one to the right.
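
The preconditions (the value is present and is neither the first nor the last element) can be checked before moving the iterators. The following is a minimal sketch, assuming a hypothetical helper neighbors_of() that is not part of the original text.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: stores the left and right neighbors of value in
// left and right. Returns false, leaving the outputs untouched, when value
// is absent or has no neighbor on one of its sides.
bool neighbors_of(const std::set<int>& s, int value, int& left, int& right) {
    std::set<int>::const_iterator it = s.find(value);
    if (it == s.end() || it == s.begin()) {
        return false; // not found, or nothing to the left
    }
    std::set<int>::const_iterator next = it;
    ++next;
    if (next == s.end()) {
        return false; // nothing to the right
    }
    std::set<int>::const_iterator prev = it;
    --prev;
    left = *prev;
    right = *next;
    return true;
}
```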

More on algorithms
It’s time to speak about algorithms a bit more deeply. Most algorithms are declared in the <algorithm> standard header. To begin with, STL provides three very simple algorithms: min(a,b), max(a,b), swap(a,b). Here min(a,b) and max(a,b) return the minimum and maximum of two elements, while swap(a,b) swaps two elements.

Algorithm sort() is also widely used. The call to sort(begin, end) sorts an interval in ascending order. Notice that sort() requires random access iterators, so it will not work on all containers. However, you probably won't ever call sort() on set, which is already ordered.
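
As a quick illustration of min(), max() and sort() together, here is a small sketch with two hypothetical helpers, clamp_value() and median_of(); the names and the choice of the upper median for even sizes are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: restricts value to the range [lo, hi] using min and max.
int clamp_value(int value, int lo, int hi) {
    return std::max(lo, std::min(value, hi));
}

// Hypothetical helper: returns the middle element after sorting
// (the upper median when the size is even). The vector is taken by
// value, so the caller's data keeps its original order.
int median_of(std::vector<int> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end()); // vector provides random access iterators
    return v[v.size() / 2];
}
```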

You’ve already heard of the algorithm find(). The call to find(begin, end, element) returns the iterator where ‘element’ first occurs, or end if the element is not found. Similarly, count(begin, end, element) returns the number of occurrences of an element in a container or a part of a container. Remember that set and map have the member functions find() and count(), which work in O(log N), while std::find() and std::count() take O(N).
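
On a vector, the two global algorithms might be used as in the sketch below. The helper names first_index_of() and occurrences() are hypothetical, and returning -1 for a missing element is a convention chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: index of the first occurrence of x, or -1 if absent.
int first_index_of(const std::vector<int>& v, int x) {
    std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), x);
    return it == v.end() ? -1 : int(it - v.begin());
}

// Hypothetical helper: number of elements of v equal to x.
int occurrences(const std::vector<int>& v, int x) {
    return int(std::count(v.begin(), v.end(), x));
}
```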

Other useful algorithms are next_permutation() and prev_permutation(). Let’s speak about next_permutation. The call to next_permutation(begin, end) makes the interval [begin, end) hold the next permutation of the same elements, or returns false if the current permutation is the last one. Accordingly, next_permutation makes many tasks quite easy. If you want to check all permutations, just write:
 
 
 vector<int> v;

 for (int i = 0; i < 10; i++) {
     v.push_back(i);
 }

 do {
     Solve(..., v);
 } while (next_permutation(all(v)));

Don’t forget to ensure that the elements in a container are sorted before your first call to next_permutation(...). Their initial state should form the very first permutation; otherwise, some permutations will not be checked.
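
The effect of the starting order can be measured by counting loop iterations. The following is a small sketch with two hypothetical helpers, count_permutations() and count_from_current(), which differ only in whether the vector is sorted first.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: visits every distinct permutation of v,
// starting from the sorted (very first) permutation.
int count_permutations(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    int count = 0;
    do {
        ++count;
    } while (std::next_permutation(v.begin(), v.end()));
    return count;
}

// Hypothetical helper: same loop without sorting, so only the
// permutations from the current one onward are visited.
int count_from_current(std::vector<int> v) {
    int count = 0;
    do {
        ++count;
    } while (std::next_permutation(v.begin(), v.end()));
    return count;
}
```

For the vector {3, 1, 2}, the sorted version visits all 6 permutations, while the unsorted version visits only {3, 1, 2} and {3, 2, 1}.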

String Streams
You often need to do some string processing/input/output. C++ provides two interesting objects for it: 'istringstream' and 'ostringstream'. They are both declared in #include <sstream>.

Object istringstream allows you to read from a string as you do from standard input. It's easiest to show in code:
 
 
 void f(const string& s) {
     // Construct an object to parse strings
     istringstream is(s);

     // Vector to store data
     vector<int> v;

     // Read integer while possible and add it to the vector
     int tmp;
     while (is >> tmp) {
         v.push_back(tmp);
     }
 }
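
The function above discards its vector when it returns. A variant that hands the parsed numbers back to the caller could be sketched like this; the name parse_ints() is hypothetical and not part of the original text.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers from s
// and stops at the first token that is not an integer.
std::vector<int> parse_ints(const std::string& s) {
    std::istringstream is(s);
    std::vector<int> v;
    int tmp;
    while (is >> tmp) {
        v.push_back(tmp);
    }
    return v;
}
```

Note that reading stops at the first failure, so for the input "4 x 5" only the 4 is returned.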

The ostringstream object is used to do formatted output. Here is the code:
 
 
 string f(const vector<int>& v) {
     // Construct an object to do formatted output
     ostringstream os;

     // Copy all elements from vector<int> to string stream as text
     tr(v, it) {
         os << ' ' << *it;
     }

     // Get string from string stream
     string s = os.str();

     // Remove first space character
     if (!s.empty()) { // Beware of empty string here
         s = s.substr(1);
     }

     return s;
 }
 

Summary
To go on with STL, I would like to summarize the list of templates to be used. This will simplify the reading of code samples and, I hope, improve your TopCoder skills. The short list of templates and macros follows:
 
 
 typedef vector<int> vi;
 typedef vector<vi> vvi;
 typedef pair<int, int> ii;
 #define sz(a) int((a).size())
 #define pb push_back
 #define all(c) (c).begin(),(c).end()
 #define tr(c,i) for(typeof((c).begin()) i = (c).begin(); i != (c).end(); i++)
 #define present(c,x) ((c).find(x) != (c).end())
 #define cpresent(c,x) (find(all(c),x) != (c).end())

The container vector<int> is here because it's really very popular. Actually, I found it convenient to have short aliases to many containers (especially for vector<string>, vector<ii>, vector< pair<double, ii> >). But this list only includes the macros that are required to understand the following text.

Another note to keep in mind: when a macro parameter from the left-hand side of #define appears on the right-hand side, it should be placed in parentheses to avoid many nontrivial problems.
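
A classic example of such a problem, sketched with two hypothetical macros SQR_UNSAFE and SQR that are not part of the list above: without parentheses, operator precedence changes the meaning of the expanded expression.

```cpp
#include <cassert>

// Hypothetical macros for illustration only.
#define SQR_UNSAFE(x) x * x   // the parameter is pasted as-is
#define SQR(x) ((x) * (x))    // the parameter and the whole result are parenthesized

// Expands to 1 + 2 * 1 + 2, which is 5 rather than 9.
int unsafe_square_of_sum() {
    return SQR_UNSAFE(1 + 2);
}

// Expands to ((1 + 2) * (1 + 2)), which is 9.
int safe_square_of_sum() {
    return SQR(1 + 2);
}
```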
 
