Headers and Includes: Why and How

大唐游子

Original article: http://www.cplusplus.com/articles/Gw6AC542/

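Introduction

This article covers a question that trips up many beginners: what #include does, and how header files and source files relate to each other.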

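Why header files are needed

If you are new to C++, you may wonder why you need to #include files at all, and why a program should have more than one .cpp file. The reasons are simple:

It speeds up compilation. As a program grows, so does its code, and if everything lives in a single source file, then everything has to be recompiled after even the smallest change. For a small program this may not matter, but for a reasonably large one a full compile can take several minutes. Can you imagine waiting that long after every minor change?

Compile -> wait 8 minutes -> "Oops, forgot a semicolon" -> compile -> wait 8 minutes -> debug -> compile -> wait 8 minutes
It keeps your code better organized. Putting separate concepts into separate files makes it easy to find the code you need when you want to change it.
It lets you separate interface from implementation. If that does not make sense yet, don't worry; it is explained later.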

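Building a C++ program is a two-stage process. First, each source file is compiled on its own. The compiler produces an intermediate result for each source file, called an object file. Once every source file has been compiled, the object files are linked together to produce the final binary (the executable program).

This means each source file is compiled separately from all the others. As a result, when a.cpp is being compiled, the compiler knows nothing about what is in b.cpp. Here is an example: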

// in myclass.cpp

class MyClass
{
public:
  void foo();
  int bar;
};

void MyClass::foo()
{
  // do stuff
}

// in main.cpp

int main()
{
  MyClass a; // Compiler error: 'MyClass' is unidentified
  return 0;
}

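MyClass is declared in myclass.cpp, but it is not declared in main.cpp, so compiling main.cpp produces an error.

This is where header files come in. A header file lets you make the interface (here, the MyClass class) visible to other source files while keeping the implementation (here, the bodies of MyClass's member functions) in its own .cpp file, like this: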

// in myclass.h

class MyClass
{
public:
  void foo();
  int bar;
};

// in myclass.cpp
#include "myclass.h"

void MyClass::foo()
{
}

//in main.cpp
#include "myclass.h"  // defines MyClass

int main()
{
  MyClass a; // no longer produces an error, because MyClass is defined
  return 0;
}

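The #include statement works like a copy/paste. When a file is compiled, the #include line is replaced with the contents of the included file before the rest of the compilation happens.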
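To make the copy/paste analogy concrete, here is a rough sketch of what the compiler sees for main.cpp once the #include "myclass.h" line has been replaced by the header's contents. The helper function make_default_bar() is hypothetical and was added only so the result can be observed; a real preprocessor also emits line markers that are omitted here.

```cpp
// main.cpp as the compiler sees it after #include "myclass.h" is expanded.

// ---- begin: contents pasted in from myclass.h ----
class MyClass
{
public:
  void foo();
  int bar;
};
// ---- end: contents pasted in from myclass.h ----

// Hypothetical helper (not in the original article). It only needs the class
// definition above; the body of MyClass::foo() stays in myclass.cpp.
int make_default_bar()
{
  MyClass a;   // fine: MyClass is fully defined at this point
  a.bar = 7;
  return a.bar;
}
```

Because foo() is declared but never called here, this compiles and links without myclass.cpp; calling foo() would require linking in its definition.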

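The differences between .h, .cpp, .hpp, .cc, etc.

All of these are ultimately plain text files, but different kinds of files should use different extensions:

Header files should use a .h-style extension (.h / .hpp / .hxx);
C++ source files should use a .c-style extension (.cpp / .cxx / .cc);
C source files should use .c only.

C and C++ source files are kept distinct because some compilers treat the two differently.

So what is the difference between a header file and a source file? Basically, header files are #included and not compiled, whereas source files are compiled and not #included.

There are (very, very rare) times when #including a source file is useful, such as when instantiating templates. As a basic rule, though: don't #include source files!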

Include guards

Including the same file more than once produces maddening errors:

// myclass.h

class MyClass
{
  void DoSomething() { }
};

// main.cpp
#include "myclass.h"   // define MyClass
#include "myclass.h"   // Compiler error - MyClass already defined

You might say, "Why would I ever include the same file twice?" The case above is indeed unlikely, but the following one comes up all the time:

// x.h
class X { };

// a.h
#include "x.h"

class A { X x; };

// b.h
#include "x.h"

class B { X x; };

// main.cpp

#include "a.h"  // also includes "x.h"
#include "b.h"  // includes x.h again!  ERROR

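Some people will tell you never to put #include statements in header files. Don't listen to them. There is nothing wrong with #include in a header, as long as you take care of two things:

Only #include what you really need to include (covered in the next section).
Guard against multiple inclusion with include guards.

An include guard is a technique that uses #define at the top of a file to define a unique identifier, like this: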

//x.h

#ifndef X_H_INCLUDED   // if x.h hasn't been included yet...
#define X_H_INCLUDED   //   #define this so the compiler knows it has been included

class X { };
#endif

The first time x.h is included, it defines the macro X_H_INCLUDED. When x.h is included again, the #ifndef check fails and the contents of x.h are skipped, so nothing is defined twice.

Remember: always guard your header files!

Why not guard .cpp files as well? Because you never #include .cpp files.
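The effect of a guard can be seen in a single file by pasting the guarded header in twice, which is what main.cpp looks like after a.h and b.h have both included x.h. This is only a sketch: the guard name X_H_INCLUDED, the member id, and the helper x_id() are assumptions made for illustration.

```cpp
// Simulates main.cpp after x.h has been pasted in twice
// (once through a.h and once through b.h).

// ---- first copy of x.h ----
#ifndef X_H_INCLUDED
#define X_H_INCLUDED
class X { public: int id = 1; };
#endif

// ---- second copy of x.h ----
#ifndef X_H_INCLUDED      // already defined, so everything up to #endif is skipped
#define X_H_INCLUDED
class X { public: int id = 1; };   // never reaches the compiler
#endif

// Hypothetical helper showing that exactly one definition of X is in effect.
int x_id()
{
  X x;
  return x.id;
}
```

Deleting the #ifndef/#define/#endif lines from both copies turns this into the redefinition error described earlier.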

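The right way to include

The classes you write will often depend on other classes. A derived class, for example, always depends on its parent, because the compiler must know the parent's full definition at compile time in order to derive from it.

There are two kinds of dependency you need to know about:

Things that can be forward declared
Things that must be #included

If class A uses class B, then B is one of A's dependencies. Whether it can be forward declared or needs to be included depends on how A uses B:

Do nothing: A makes no reference to B at all;
Do nothing: the only reference to B is in a friend declaration;
Forward declare B: A contains a pointer or reference to B, e.g. B* myb;
Forward declare B: one or more functions take a B object/pointer/reference as a parameter or return type, e.g. B MyFunction(B myb);
#include "b.h": B is a parent class of A;
#include "b.h": A contains a B object, e.g. B myb;

Pick the least drastic option possible: do nothing if you can, forward declare if you can't, and #include only as a last resort.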
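The forward-declaration cases can be checked in one file. In the sketch below, class A is written while B is still an incomplete type, and B's full definition (standing in for the contents of b.h) only appears afterwards. The member value, the parameter name arg, and the helper demo_forward_declaration() are hypothetical.

```cpp
class B;                 // forward declaration: B is an incomplete type from here on

class A
{
public:
  B* myb = nullptr;      // pointer to B: a forward declaration is enough
  B MyFunction(B arg);   // B by value in a declaration: still enough
};

// Full definition of B, standing in for the contents of b.h.
class B
{
public:
  int value = 0;
};

// Defining (not just declaring) MyFunction needs the complete type, so in a
// real project this body would live in a.cpp, which includes b.h.
B A::MyFunction(B arg)
{
  arg.value += 1;
  return arg;
}

// Hypothetical helper that exercises the classes above.
int demo_forward_declaration()
{
  A a;
  B b;
  a.myb = &b;
  B result = a.MyFunction(b);           // works on a copy of b
  return result.value + a.myb->value;   // copy was incremented, b itself was not
}
```

Replacing the pointer member with a plain `B myb;` inside A would fail to compile at that point, which is the case where #include "b.h" becomes necessary.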

Ideally, a class's dependencies are laid out in its header. Here is an example of a header done the "right way":

//=================================
// include guard
#ifndef MYCLASS_H_INCLUDED
#define MYCLASS_H_INCLUDED

//=================================
// forward declared dependencies
class Foo;
class Bar;

//=================================
// included dependencies
#include <vector>
#include "parent.h"

//=================================
// the actual class
class MyClass : public Parent  // Parent object, so #include "parent.h"
{
public:
  std::vector<int> avector;    // vector object, so #include <vector>
  Foo* foo;                    // Foo pointer, so forward declare Foo
  void Func(Bar& bar);         // Bar reference, so forward declare Bar

  friend class MyFriend;       // friend declaration is not a dependency
                               //   don't do anything about MyFriend
};

#endif // MYCLASS_H_INCLUDED

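The example above shows the two kinds of dependency and how each is handled. Because MyClass only uses a pointer to Foo and not a full Foo object, we can forward declare Foo and do not need to #include "foo.h". Forward declare whenever you can, and never #include what you don't need. Unnecessary #includes cause trouble.

Why this is the right way to include

The whole idea is to make "myclass.h" self-contained, so that no other part of the program has to know how MyClass works internally. If some other class needs to use MyClass, it can simply #include "myclass.h" and be done.

The alternative approach requires you to #include every one of MyClass's dependencies before you #include "myclass.h", because myclass.h does not pull in its dependencies by itself. That is a headache, because it makes using the class anything but intuitive.

This example shows why the self-contained approach is good: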

//example.cpp

//  I want to use MyClass
#include "myclass.h"   // will always work, no matter what MyClass looks like.
                       // You're done
               //  (provided myclass.h follows my outline above and does
               //   not make unnecessary #includes)

And here is why the alternative approach is bad:

//example.cpp

//  I want to use MyClass
#include "myclass.h"
   // ERROR 'Parent' undefined

That fails, so include parent.h as well:

#include "parent.h"
#include "myclass.h"
   // ERROR 'std::vector' undefined

#include "parent.h"
#include <vector>
#include "myclass.h"
   // ERROR 'Support' undefined

What? My class doesn't even use Support! Oh well, keep on including...

#include "parent.h"
#include <vector>
#include "support.h"
#include "myclass.h"
   // ERROR 'Support' undefined

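parent.h uses Support, so you have to #include "support.h" before "parent.h".

And what if support.h depends on yet other headers? With the alternative approach you have to remember not only every class's dependencies but also the order in which to #include them. That quickly turns into a nightmare.

What happens when you make a small change to MyClass, say replacing std::vector with std::list? With the alternative approach you have to edit every file that #includes "myclass.h" and change #include <vector> to #include <list>. With my approach you only need to change "myclass.h" and possibly "myclass.cpp".

The "right way" shown above is all about encapsulation. Files that use MyClass don't need to know what MyClass uses, and don't need to #include any of its dependencies. To use MyClass, all you have to do is #include "myclass.h". The header is self-contained, OO friendly, and easy to use and maintain.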
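The std::vector to std::list switch can be sketched in a single file. The first part stands in for a self-contained myclass.h and the second for a file that uses it. The members Add, Count, and items, and the helper count_after_two_adds(), are hypothetical and not part of the original MyClass.

```cpp
// ---- myclass.h (sketch): the header pulls in its own dependency ----
#include <list>          // was <vector>; only this header had to change

class MyClass
{
public:
  void Add(int v) { items.push_back(v); }
  int Count() const { return static_cast<int>(items.size()); }

private:
  std::list<int> items;  // was std::vector<int>
};

// ---- example.cpp (sketch): untouched by the switch above ----
int count_after_two_adds()
{
  MyClass m;
  m.Add(1);
  m.Add(2);
  return m.Count();
}
```

The using code never names the container or its header, so it needs no edit when the container changes.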

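Circular dependencies

A circular dependency is when two classes depend on each other: class A depends on B, and class B depends on A. If you stick to the "right way" described above and forward declare wherever you can, you will usually not run into this problem.

The following example shows why you should include only the headers you need: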

// a.h -- assume it's guarded
#include "b.h"

class A { B* b; };

// b.h -- assume it's guarded
#include "a.h"

class B { A* a; };

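At first glance nothing seems wrong. B depends on A, so it includes a.h; A depends on B, so it includes b.h.

This is a circular inclusion (also known as infinite inclusion) problem. Suppose you compile "a.cpp":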

// a.cpp
#include "a.h"

The compiler will do the following:

#include "a.h"

   // start compiling a.h
   #include "b.h"

      // start compiling b.h
      #include "a.h"

         // compilation of a.h skipped because it's guarded

      // resume compiling b.h
      class B { A* a; };       // <--- ERROR, A is undeclared

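Even though you included "a.h", the compiler does not see class A until after class B has been compiled. That is the circular inclusion problem, and it is why you should forward declare whenever you only use a pointer or reference. Here, "a.h" should not #include "b.h"; forward declaring B is enough. Likewise, b.h should forward declare A.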
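A minimal sketch of that fix, with both headers shown in one file in the order a.cpp would see them. The members are made public and the helper demo_mutual_pointers() is added purely so the result can be checked; both are assumptions.

```cpp
// ---- a.h: forward declares B instead of including b.h ----
class B;
class A
{
public:
  B* b = nullptr;
};

// ---- b.h: forward declares A instead of including a.h ----
class A;   // redundant in this single file, but harmless
class B
{
public:
  A* a = nullptr;
};

// ---- a.cpp or main.cpp: both types are complete from here on ----
// Hypothetical helper that links one A and one B to each other.
bool demo_mutual_pointers()
{
  A a;
  B b;
  a.b = &b;
  b.a = &a;
  return a.b->a == &a && b.a->b == &b;
}
```

Neither header includes the other, so the order in which a.h and b.h are included no longer matters.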

Circular inclusion also occurs when both classes are include dependencies of each other (that is, neither can be forward declared):

// a.h (guarded)

#include "b.h"

class A
{
  B b;   // B is an object, can't be forward declared
};

// b.h (guarded)

#include "a.h"

class B
{
  A a;   // A is an object, can't be forward declared
};

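This situation, however, is conceptually impossible; it is a design flaw. If A contains a B object and B contains an A object, then that A contains a B, which contains another A... The recursion is infinite, and neither class can be instantiated. The fix is to have one or both classes hold a pointer or reference to the other and forward declare it.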
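One way that fix might look: A keeps its B object, while B holds only a pointer back to A. The member name owner, the constructor, and the helper demo_owner_link() are hypothetical.

```cpp
// ---- b.h: only needs a pointer to A, so a forward declaration suffices ----
class A;
class B
{
public:
  A* owner = nullptr;   // pointer back to the A that contains this B
};

// ---- a.h: holds a B object, so it must #include "b.h" (pasted above) ----
class A
{
public:
  A() { b.owner = this; }
  B b;                  // B is a complete type here
};

// Hypothetical helper that checks the back-pointer.
bool demo_owner_link()
{
  A a;
  return a.b.owner == &a;
}
```

Only one direction is an include dependency now, so the infinite nesting is gone and both classes can be instantiated.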

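Function inlining

The body of an inline function must be present in every .cpp file that calls it, otherwise you get linker errors (the body is meant to be compiled into the calling code at compile time rather than resolved at link time).

This can lead to circular references: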

class B
{
public:
  void Func(const A& a)   // parameter, so forward declare is okay
  {
    a.DoSomething();      // but now that we've dereferenced it, it
                          //  becomes an #include dependency
               // = we now have a potential circular inclusion
  }
};

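The key point is that although inline functions need to live in the header, they do not need to live inside the class definition. We can take advantage of that loophole:

// b.h  (assume it's guarded)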

//------------------
class A;  // forward declared dependency

//------------------
class B
{
public:
  void Func(const A& a);  // okay, A is forward declared
};

//------------------
#include "a.h"        // A is now an include dependency

inline void B::Func(const A& a)
{
  a.DoSomething();    // okay!  a.h has been included
}

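This is perfectly safe. It avoids the circular dependency problem entirely, even if a.h includes b.h, because the #include does not appear until after class B is fully defined.

But putting an #include at the end of a header is ugly. Is there another way? Yes: move the function bodies into a separate header: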

// b.h

    // blah blah

class B { /* blah blah */ };

#include "b_inline.h"  // or I sometimes use "b.hpp"

// b_inline.h (or b.hpp -- whatever)

#include "a.h"
#include "b.h"  // not necessary, but harmless
                //  you can do this to make this "feel" like a source
                //  file, even though it isn't

inline void B::Func(const A& a)
{
  a.DoSomething();
}

This separates interface from implementation while still letting the implementation be inlined.
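The ordering that makes this work can be seen in a single translation unit. In the sketch below, Func and DoSomething return int rather than void so that the call has an observable result; that change, the value 42, and the helper demo_inline_call() are assumptions made for illustration.

```cpp
// ---- b.h, first part ----
class A;                         // forward declared dependency
class B
{
public:
  int Func(const A& a) const;    // declared here, defined further down
};

// ---- a.h, as pasted in by the #include "a.h" near the end of b.h ----
class A
{
public:
  int DoSomething() const { return 42; }
};

// ---- b.h, last part: A is complete now, so the body may call into it ----
inline int B::Func(const A& a) const
{
  return a.DoSomething();
}

// Hypothetical helper that exercises the inline function.
int demo_inline_call()
{
  A a;
  B b;
  return b.Func(a);
}
```

Moving the body of B::Func back inside class B would fail here, because A is still incomplete at that point.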

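Forward declaring templates

Forward declaration is straightforward for simple classes, but less so for template classes. Consider the following scenario: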

// a.h

// included dependencies
#include "b.h"

// the class template
template <typename T>
class Tem
{
 /*...*/
  B b;
};

// class most commonly used with 'int'
typedef Tem<int> A;  // typedef'd as 'A'

// b.h

// forward declared dependencies
class A;  // error!

// the class
class B
{
 /* ... */
  A* ptr;
};

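It looks logical, but it doesn't work! A is not really a class; it is a typedef. Note also that we can't just #include "a.h", because of the circular dependency.

To forward declare A, we have to typedef it, which means forward declaring the template it is a typedef of. Like this: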

template <typename T> class Tem;  // forward declare our template
typedef Tem<int> A;               // then typedef 'A'

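This is uglier than a plain forward declaration of class A. Worse, it undermines encapsulation: it fully exposes how A is built from the template. If that ever needs to change, it becomes a big problem.

One solution is to create a separate header that holds the forward declaration of the template, like this: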

//a.h

#include "b.h"

template <typename T>
class Tem
{
 /*...*/
  B b;
};

//a_fwd.h

template <typename T> class Tem;
typedef Tem<int> A;

//b.h

#include "a_fwd.h"

class B
{
 /*...*/
  A* ptr;
};
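Laid out in one file in the order the compiler would see them when a.h is included, the three headers might look like the sketch below. The member value and the helper demo_template_forward() are hypothetical, added so the result can be checked.

```cpp
// ---- a_fwd.h ----
template <typename T> class Tem;   // forward declare the template
typedef Tem<int> A;                // A names a specialization that is still incomplete

// ---- b.h (includes a_fwd.h) ----
class B
{
public:
  A* ptr = nullptr;                // a pointer to an incomplete type is fine
};

// ---- a.h (includes b.h) ----
template <typename T>
class Tem
{
public:
  T value{};
  B b;                             // B is complete here
};

// Hypothetical helper that exercises the typedef.
int demo_template_forward()
{
  A a;             // Tem<int> is instantiated here; B and Tem are both defined
  a.value = 5;
  a.b.ptr = &a;
  return a.b.ptr->value;
}
```

If A later becomes a typedef for a different specialization, only a_fwd.h has to change; b.h keeps including it as before.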
