Effective STL: Item 2: Beware the illusion of container-independent

C++ 专栏收录该内容
4 篇文章 0 订阅

    The STL is based on generalization. Arrays are generalized into con-tainers
and parameterized on the types of objects they contain. Func-tions
are generalized into algorithms and parameterized on the types
of iterators they use. Pointers are generalized into iterators and
parameterized on the type of objects they point to.
That¡¯s just the beginning. Individual container types are generalized
into sequence and associative containers, and similar containers are
given similar functionality. Standard contiguous-memory containers
(see Item 1) offer random-access iterators, while standard node-based
containers (again, see Item 1) provide bidirectional iterators. Sequence
containers support push_front and/or push_back, while associative
containers don¡¯t. Associative containers offer logarithmic-time
lower_bound, upper_bound,andequal_range member functions, but
sequence containers don¡¯t.
      With all this generalization going on, it¡¯s natural to want to join the
movement. This sentiment is laudable, and when you write your own
containers, iterators, and algorithms, you¡¯ll certainly want to pursue
it. Alas, many programmers try to pursue it in a different manner.
Instead of committing to particular types of containers in their soft-ware,
they try to generalize the notion of a container so that they can
use, say, a vector, but still preserve the option of replacing it with
something like a deque or a list later ¡ª all without changing the code
that uses it. That is, they strive to write container-independent code.
This kind of generalization, well-intentioned though it is, is almost
always misguided.
        Even the most ardent advocate of container-independent code soon
realizes that it makes little sense to try to write software that will work
with both sequence and associative containers. Many member func-tions
exist for only one category of container, e.g., only sequence con-tainers
support push_front or push_back, and only associative
containers support count and lower_bound,etc.Even such basicsas
insert and erase have signatures and semantics that vary from category
to category. For example, when you insert an object into a sequence
container, it stays where you put it, but if you insert an object into anഊassociative container, the container moves the object to where it
belongs in the container¡¯s sort order. For another example, the form of
erase taking an iterator returns a new iterator when invoked on a
sequence container, but it returns nothing when invoked on an asso-ciative
container. (Item 9 gives an example of how this can affect the
code you write.)
      Suppose, then, you aspire to write code that can be used with the
most common sequence containers: vector, deque,andlist. Clearly,
you must program to the intersection of their capabilities, and that
means nouses ofreserve or capacity (see Item 14), because deque and
list don¡¯t offer them. The presence of list also means you give up opera-tor[],
and you limit yourself to the capabilities of bidirectional itera-tors.
That, in turn, means you must stay away from algorithms that
demand random access iterators, including sort, stable_sort,
partial_sort,andnth_element (see Item 31).
On the other hand, your desire to support vector rules out use of
push_front and pop_front,andbothvector and deque put the kibosh on
splice and the member form of sort. In conjunction with the con-straints
above, this latter prohibition means that there is no form of
sort you can call on your ¡°generalized sequence container.¡±
That¡¯s the obvious stuff. If you violate any of those restrictions, your
code will fail to compile with at least one of the containers you want to
be able to use. The code that will compile is more insidious.
The main culprit is the different rules for invalidation of iterators,
pointers, and references that apply to different sequence containers.
To write code that will work correctly with vector, deque,andlist,you
must assume that any operation invalidating iterators, pointers, or
references in any of those containers invalidates them in the container
you¡¯re using. Thus, you must assume that every call to insert invali-dates
everything, because deque::insert invalidates all iterators and,
lacking the ability to call capacity, vector::insert must be assumed to
invalidate all pointers and references. (Item 1 explains that deque is
unique in sometimes invalidating its iterators without invalidating its
pointers and references.) Similar reasoning leads to the conclusion
that every call to erase must be assumed to invalidate everything.
Want more? You can¡¯t pass the data in the container to a C interface,
because only vector supports that (see Item 16). You can¡¯t instantiate
your container with bool asthe type of objects to be stored,because,
as Item 18 explains, vector<bool> doesn¡¯t always behave like a vector,
and it never actually stores bools. You can¡¯t assume list¡¯s constant-ഊtime insertions and erasures, because vector and deque take linear
time to perform those operations.
         When all is said and done, you¡¯re left with a ¡°generalized sequence
container¡± where you can¡¯t call reserve, capacity, operator[], push_front,
pop_front, splice, or any algorithm requiring random access iterators; a
container where every call to insert and erase takes linear time and
invalidates all iterators, pointers, and references; and a container
incompatible with C where bools can¡¯t be stored. Is that really the
kind of container you want to use in your applications? I suspect not.
If you rein in your ambition and decide you¡¯re willing to drop support
for list, you still give up reserve, capacity, push_front,andpop_front;you
still must assume that all calls to insert and erase take linear time and
invalidate everything; you still lose layout compatibility with C; and
you still can¡¯t store bools.
     If you abandon the sequence containers and shoot instead for code
that can work with different associative containers, the situation isn¡¯t
much better. Writing for both set and map is close to impossible,
because sets store single objects while maps storepairs of objects.
Even writing for both set and multiset (or map and multimap)is tough.
The insert member function taking only a value has different return
types for sets/maps than fortheirmulti cousins, and you must reli-giously
avoid making any assumptions about how many copies of a
value are stored in a container. With map and multimap,youmust
avoid using operator[], because that member function exists only for
     Face the truth: it¡¯s not worth it. The different containers are different,
and they have strengths and weaknesses that vary in significant
ways. They¡¯re not designed to be interchangeable, and there¡¯s little
you can do to paper that over. If you try, you¡¯re merely tempting fate,
and fate doesn¡¯tlike to be tempted.
Still, the day will dawn when you¡¯ll realize that a container choice you
made was, er, suboptimal, and you¡¯ll need to use a different container
type. You now know that when you change container types, you¡¯ll not
only need to fix whatever problems your compilers diagnose, you¡¯ll
also need to examine all the code using the container to see what
needs to be changed in light of the new container¡¯s performance char-acteristics
and rules for invalidation of iterators, pointers, and refer-ences.
If you switch from a vector to something else, you¡¯ll also have to
make sure you¡¯re no longer relying on vector¡¯s C-compatible memory
layout, and if you switch to a vector,you¡¯llhave to ensure that you¡¯re
not using it to store bools.ഊGiven the inevitability of having to change container types from time to time, you can facilitate such changes in the usual manner: by
encapsulating, encapsulating, encapsulating. One of the easiest ways
to do this is through the liberal use of typedefs for container and iter-ator
types. Hence, instead of writing this,
class Widget { ... };
vector<Widget> vw;
Widget bestWidget;
... // give bestWidget a value
vector<Widget>::iterator i = // find a Widget with the
find(vw.begin(), vw.end(), bestWidget); // same value as bestWidget
write this:
class Widget { ... };
typedef vector<Widget> WidgetContainer;
typedef WidgetContainer::iterator WCIterator;
WidgetContainer vw;
Widget bestWidget;
WCIterator i = find(vw.begin(), vw.end(), bestWidget);
This makes it a lot easier to change container types, something that¡¯s
especially convenient if the change in question is simply to add a cus-tom
allocator. (Such a change doesn¡¯t affect the rules for iterator/
pointer/reference invalidation.)
class Widget { ... };
template<typename T> // see Item 10 for why this
SpecialAllocator { ... }; // needs to be a template
typedef vector<Widget, SpecialAllocator<Widget> > WidgetContainer;
typedef WidgetContainer::iterator WCIterator;
WidgetContainer vw; // still works
Widget bestWidget;
WCIterator i = find(vw.begin(), vw.end(), bestWidget); // still works
If the encapsulating aspects of typedefs mean nothing to you, you¡¯re
still likely to appreciate the work they can save. For example, if you
have an object of typeഊmap<string,
CIStringCompare> // CIStringCompare is ¡°case-//
insensitive string compare;¡±
// Item 19 describes it
and you want to walk through the map using const_iterators, do you
really want to spell out
map<string, vector<Widget>::iterator, CIStringCompare>::const_iterator
more than once? Once you¡¯ve used the STL a little while, you¡¯ll realize
that typedefs are your friends.
A typedef is just a synonym for some other type, so the encapsulation
it affords is purely lexical. A typedef doesn¡¯t prevent a client from
doing (or depending on) anything they couldn¡¯t already do (or depend
on). You need bigger ammunition if you want to limit client exposure
to the container choices you¡¯ve made. You need classes.
To limit the code that may require modification if you replace one con-tainer
type with another, hide the container in a class, and limit the
amount of container-specific information visible through the class
interface. For example, if you need to create a customer list, don¡¯t use
a list directly.Instead,create aCustomerList class, and hide a list in its
private section:
class CustomerList {
typedef list<Customer> CustomerContainer;
typedef CustomerContainer::iterator CCIterator;
CustomerContainer customers;
public: // limit the amount of list-specific
... // information visible through
}; // this interface
At first, this may seem silly. After all a customer list is a list,right?
Well, maybe. Later you may discover that you don¡¯t need to insert or
erase customers from the middle of the list as often as you¡¯d antici-pated,
but you do need to quickly identify the top 20% of your cus-tomers
¡ª a task tailor-made for the nth_element algorithm (see
Item 31). But nth_element requires random access iterators. It won¡¯t
work with a list. In that case, your customer ¡°list¡± might be better
implemented as a vector or a deque.
When you consider this kind of change, you still have to check every
CustomerList member function and every friend to see how they¡¯ll be
affected (in terms of performance and iterator/pointer/reference
invalidation, etc.), but if you¡¯ve done a good job of encapsulating Cus-ഊtomerList¡¯s implementation details, the impact on CustomerList clients
should be small. You can¡¯t write container-independent code, but they
might be able to.

  • 0
  • 0
  • 0
  • 一键三连
  • 扫一扫,分享海报

<p> <span style="font-size:14px;color:#337FE5;">【为什么学爬虫?】</span> </p> <p> <span style="font-size:14px;">       1、爬虫入手容易,但是深入较难,如何写出高效率的爬虫,如何写出灵活性高可扩展的爬虫都是一项技术活。另外在爬虫过程中,经常容易遇到被反爬虫,比如字体反爬、IP识别、验证码等,如何层层攻克难点拿到想要的数据,这门课程,你都能学到!</span> </p> <p> <span style="font-size:14px;">       2、如果是作为一个其他行业的开发者,比如app开发,web开发,学习爬虫能让你加强对技术的认知,能够开发出更加安全的软件和网站</span> </p> <p> <br /> </p> <span style="font-size:14px;color:#337FE5;">【课程设计】</span> <p class="ql-long-10663260"> <span> </span> </p> <p class="ql-long-26664262" style="font-size:11pt;color:#494949;"> 一个完整的爬虫程序,无论大小,总体来说可以分成三个步骤,分别是: </p> <ol> <li class="" style="font-size:11pt;color:#494949;"> 网络请求:模拟浏览器的行为从网上抓取数据。 </li> <li class="" style="font-size:11pt;color:#494949;"> 数据解析:将请求下来的数据进行过滤,提取我们想要的数据。 </li> <li class="" style="font-size:11pt;color:#494949;"> 数据存储:将提取到的数据存储到硬盘或者内存中。比如用mysql数据库或者redis等。 </li> </ol> <p class="ql-long-26664262" style="font-size:11pt;color:#494949;"> 那么本课程也是按照这几个步骤循序渐进的进行讲解,带领学生完整的掌握每个步骤的技术。另外,因为爬虫的多样性,在爬取的过程中可能会发生被反爬、效率低下等。因此我们又增加了两个章节用来提高爬虫程序的灵活性,分别是: </p> <ol> <li class="" style="font-size:11pt;color:#494949;"> 爬虫进阶:包括IP代理,多线程爬虫,图形验证码识别、JS加密解密、动态网页爬虫、字体反爬识别等。 </li> <li class="" style="font-size:11pt;color:#494949;"> Scrapy和分布式爬虫:Scrapy框架、Scrapy-redis组件、分布式爬虫等。 </li> </ol> <p class="ql-long-26664262" style="font-size:11pt;color:#494949;"> 通过爬虫进阶的知识点我们能应付大量的反爬网站,而Scrapy框架作为一个专业的爬虫框架,使用他可以快速提高我们编写爬虫程序的效率和速度。另外如果一台机器不能满足你的需求,我们可以用分布式爬虫让多台机器帮助你快速爬取数据。 </p> <p style="font-size:11pt;color:#494949;">   </p> <p class="ql-long-26664262" style="font-size:11pt;color:#494949;"> 从基础爬虫到商业化应用爬虫,本套课程满足您的所有需求! </p> <p class="ql-long-26664262" style="font-size:11pt;color:#494949;"> <br /> </p> <p> <br /> </p> <p> <span style="font-size:14px;background-color:#FFFFFF;color:#337FE5;">【课程服务】</span> </p> <p> <span style="font-size:14px;">专属付费社群+定期答疑</span> </p> <p> <br /> </p> <p class="ql-long-24357476"> <span style="font-size:16px;"><br /> </span> </p> <p> <br /> </p> <p class="ql-long-24357476"> <span style="font-size:16px;"></span> </p>
©️2020 CSDN 皮肤主题: 大白 设计师:CSDN官方博客 返回首页
钱包余额 0