A Software Renderer in 500 Lines of C++ - 1. Bresenham's Line Drawing Algorithm

Latest recommended article published on 2023-06-23 14:00:00

xuzhimin1991

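First attempt

The goal of this lesson is to render a wireframe. To do that, we first need to learn how to draw line segments. We could simply read up on Bresenham's line algorithm, but let's write the code ourselves. What does the simplest code that draws a segment between the points (x0, y0) and (x1, y1) look like? Apparently, something like this: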

void line(int x0, int y0, int x1, int y1, TGAImage &image, TGAColor color) { 
    for (float t=0.; t<1.; t+=.01) { 
        int x = x0*(1.-t) + x1*t; 
        int y = y0*(1.-t) + y1*t; 
        image.set(x, y, color); 
    } 
}

The resulting line is shown below; the complete code is available here.

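Second attempt

Efficiency aside, the problem with the first attempt is the choice of the constant, which the code sets to 0.01. If we set it to 0.1 instead, the segment we draw will look like this: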
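To make the effect of the constant concrete, here is a minimal sketch that counts how many distinct pixels the first attempt touches for a given number of samples. The `PixelSet` alias and the `samples` parameter are assumptions introduced for this illustration only: the original loop hard-codes a step of 0.01 and writes into a TGAImage.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-in for TGAImage: remembers which pixels were set.
using PixelSet = std::set<std::pair<int, int>>;

// Same interpolation as the first attempt, but the number of samples is an
// explicit parameter (an assumption of this sketch) instead of a fixed step.
PixelSet sample_line(int x0, int y0, int x1, int y1, int samples) {
    assert(samples > 0);
    PixelSet pixels;
    for (int i = 0; i < samples; i++) {
        float t = i / static_cast<float>(samples);  // t runs over [0, 1)
        int x = static_cast<int>(x0 * (1. - t) + x1 * t);
        int y = static_cast<int>(y0 * (1. - t) + y1 * t);
        pixels.insert({x, y});
    }
    return pixels;
}
```

For the segment (13,20)-(80,40), 10 samples can touch at most 10 pixels, while 100 samples reach every column from 13 to 79. Since t never reaches 1, the end point itself is not drawn, just as in the original loop.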

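The necessary step is easy to find: the number of samples should equal the number of pixels to be drawn. The simplest (but incorrect) improvement looks like this: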

void line(int x0, int y0, int x1, int y1, TGAImage &image, TGAColor color) { 
    for (int x=x0; x<=x1; x++) { 
        float t = (x-x0)/(float)(x1-x0); 
        int y = y0*(1.-t) + y1*t; 
        image.set(x, y, color); 
    } 
}

Caution! In my students' code, the first source of errors is integer division, as in (x-x0)/(x1-x0). Now let's try to draw a few segments with this code:

line(13, 20, 80, 40, image, white); 
line(20, 13, 40, 80, image, red); 
line(80, 40, 13, 20, image, red);

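The result looks like this:

The first segment is fine, the second one is full of holes, and the third one is not drawn at all. Note that the first and third calls draw the same segment, only in a different color and in the opposite direction. We have already seen the white segment, and it is drawn well. We hoped to repaint it in red, but nothing happened. This is a test of symmetry: the result of drawing a segment must not depend on the order of its end points, so segment (a,b) must be exactly the same as segment (b,a).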
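The three outcomes can be reproduced without an image file. The sketch below runs the second attempt against a set of pixels instead of a TGAImage; `PixelSet` and `line_v2` are names made up for this illustration, and the assertion on the segment width is an added guard, not part of the original code.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-in for TGAImage: remembers which pixels were set.
using PixelSet = std::set<std::pair<int, int>>;

// The second attempt, unchanged except that pixels are collected in a set.
PixelSet line_v2(int x0, int y0, int x1, int y1) {
    assert(x0 != x1);  // t would be 0/0 for a zero-width segment
    PixelSet pixels;
    for (int x = x0; x <= x1; x++) {
        float t = (x - x0) / static_cast<float>(x1 - x0);
        int y = static_cast<int>(y0 * (1. - t) + y1 * t);
        pixels.insert({x, y});
    }
    return pixels;
}
```

With the three test segments, `line_v2(13, 20, 80, 40)` yields 68 pixels, `line_v2(20, 13, 40, 80)` yields only 21 pixels for a segment that spans 68 rows (hence the holes), and `line_v2(80, 40, 13, 20)` yields none because the loop condition fails immediately.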

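Third attempt

We fix the missing red segment by swapping the two points so that x0 is always less than x1.

The other segment has holes because its height is greater than its width. My students often suggest the following fix:

if (dx>dy) {for (int x)} else {for (int y)}

Oh, come on!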

void line(int x0, int y0, int x1, int y1, TGAImage &image, TGAColor color) { 
    bool steep = false; 
    if (std::abs(x0-x1)<std::abs(y0-y1)) { // if the line is steep, we transpose the image 
        std::swap(x0, y0); 
        std::swap(x1, y1); 
        steep = true; 
    } 
    if (x0>x1) { // make it left-to-right 
        std::swap(x0, x1); 
        std::swap(y0, y1); 
    } 
    for (int x=x0; x<=x1; x++) { 
        float t = (x-x0)/(float)(x1-x0); 
        int y = y0*(1.-t) + y1*t; 
        if (steep) { 
            image.set(y, x, color); // if transposed, de-transpose 
        } else { 
            image.set(x, y, color); 
        } 
    } 
}
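As a quick check that the third attempt fixes both problems, the sketch below repeats the same logic against a pixel set and compares the results. `PixelSet` and `line_v3` are hypothetical names for this illustration. The assertion is an added guard: when both end points coincide, x1-x0 is zero and t is undefined, a case the original code does not handle.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>
#include <utility>

// Hypothetical stand-in for TGAImage: remembers which pixels were set.
using PixelSet = std::set<std::pair<int, int>>;

// The third attempt, collecting pixels in a set instead of drawing them.
PixelSet line_v3(int x0, int y0, int x1, int y1) {
    PixelSet pixels;
    bool steep = false;
    if (std::abs(x0 - x1) < std::abs(y0 - y1)) {  // steep: transpose
        std::swap(x0, y0);
        std::swap(x1, y1);
        steep = true;
    }
    if (x0 > x1) {  // make it left-to-right
        std::swap(x0, x1);
        std::swap(y0, y1);
    }
    assert(x0 != x1);  // only possible when both end points coincide
    for (int x = x0; x <= x1; x++) {
        float t = (x - x0) / static_cast<float>(x1 - x0);
        int y = static_cast<int>(y0 * (1. - t) + y1 * t);
        if (steep) {
            pixels.insert({y, x});  // de-transpose
        } else {
            pixels.insert({x, y});
        }
    }
    return pixels;
}
```

Both orderings of the end points now go through the same loop after the swaps, so `line_v3(13, 20, 80, 40)` and `line_v3(80, 40, 13, 20)` produce identical pixel sets, and the steep segment `line_v3(20, 13, 40, 80)` gets one pixel per row, 68 in total.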

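Fourth attempt - timings

Reminder: compiler optimization (g++ -O3) often does a better job than optimizing the code by hand. This section exists for historical reasons.

The third attempt works well, and its complexity is just what I want to see in the final renderer. It is clearly not efficient, but it is short and readable. Note also that it contains no assertions and no bounds checks, which is bad. In this series I try not to overload the code with such checks, because the code is read a lot. At the same time, I will keep reminding you that the checks are necessary.

So although the third attempt works fine, we can still optimize it. Optimization is a dangerous thing. We need to be clear about the platform the code will run on. Optimizing for a graphics card and optimizing for a CPU are completely different things. Before starting any optimization, the code must be profiled. Try to guess which operation is the most expensive here.

For the test, I drew the three segments from before 1,000,000 times. My CPU is an Intel® Core(TM) i5-3450 CPU @ 3.10GHz. For every pixel, this code calls the TGAColor copy constructor. In total that is about 1000000 * 3 segments * roughly 50 pixels per segment (the profile below reports 204,000,000 calls to TGAImage::set, that is 68 pixels per segment). Quite a lot of calls, right? So where do we start optimizing? The profiler will tell us.

I compiled the code with g++ -ggdb -g -pg -O0 and then ran gprof, with the following result: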

%   cumulative   self              self     total 
 time   seconds   seconds    calls  ms/call  ms/call  name 
 69.16      2.95     2.95  3000000     0.00     0.00  line(int, int, int, int, TGAImage&, TGAColor) 
 19.46      3.78     0.83 204000000     0.00     0.00  TGAImage::set(int, int, TGAColor) 
  8.91      4.16     0.38 207000000     0.00     0.00  TGAColor::TGAColor(TGAColor const&) 
  1.64      4.23     0.07        2    35.04    35.04  TGAColor::TGAColor(unsigned char, unsigned char, unsigned char, unsigned char) 
  0.94      4.27     0.04                             TGAImage::get(int, int)

About 10% of the time is spent copying the color, but about 70% is spent in line() itself. That is the function we need to optimize.
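If gprof is not at hand, a rough wall-clock measurement gives a comparable overall picture, although it cannot attribute time to individual functions. The helper below is a minimal sketch based on std::chrono; `time_repeated` is a hypothetical name, and this is not the method used for the numbers in this article.

```cpp
#include <chrono>

// Calls `draw` `repetitions` times and returns the elapsed wall-clock time
// in seconds. `draw` can be any callable taking no arguments, for example
//   [&] { line(13, 20, 80, 40, image, white); }
template <typename Draw>
double time_repeated(Draw draw, long repetitions) {
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < repetitions; i++) {
        draw();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Keep in mind that with optimization enabled the compiler may remove work whose result is never used, so the drawn image should be written out or otherwise consumed after the timed loop.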

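Fourth attempt - continued

Notice that the division inside the for loop always has the same divisor, so we can move it out of the loop. The error variable holds the distance from the current (x, y) pixel to the ideal line. Each time error exceeds half a pixel (the error>.5 test in the code), we move y one step towards y1 and subtract one from error.

Here is the source code: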

void line(int x0, int y0, int x1, int y1, TGAImage &image, TGAColor color) { 
    bool steep = false; 
    if (std::abs(x0-x1)<std::abs(y0-y1)) { 
        std::swap(x0, y0); 
        std::swap(x1, y1); 
        steep = true; 
    } 
    if (x0>x1) { 
        std::swap(x0, x1); 
        std::swap(y0, y1); 
    } 
    int dx = x1-x0; 
    int dy = y1-y0; 
    float derror = std::abs(dy/float(dx)); 
    float error = 0; 
    int y = y0; 
    for (int x=x0; x<=x1; x++) { 
        if (steep) { 
            image.set(y, x, color); 
        } else { 
            image.set(x, y, color); 
        } 
        error += derror; 
        if (error>.5) { 
            y += (y1>y0?1:-1); 
            error -= 1.; 
        } 
    } 
}

Here is the gprof output:

%   cumulative   self              self     total 
 time   seconds   seconds    calls  ms/call  ms/call  name 
 38.79      0.93     0.93  3000000     0.00     0.00  line(int, int, int, int, TGAImage&, TGAColor) 
 37.54      1.83     0.90 204000000     0.00     0.00  TGAImage::set(int, int, TGAColor) 
 19.60      2.30     0.47 204000000     0.00     0.00  TGAColor::TGAColor(int, int) 
  2.09      2.35     0.05        2    25.03    25.03  TGAColor::TGAColor(unsigned char, unsigned char, unsigned char, unsigned char) 
  1.25      2.38     0.03                             TGAImage::get(int, int)
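To watch the error accumulator of the fourth attempt at work, the sketch below isolates it from the drawing code and returns the row chosen for each column. The function name and the restriction to a segment starting at the origin with 0 <= dy <= dx are simplifications made for this illustration.

```cpp
#include <cassert>
#include <vector>

// Returns the row chosen for each column x = 0..dx of a shallow segment from
// (0,0) to (dx,dy), using the floating-point accumulator of the fourth attempt.
// Assumes dx > 0 and 0 <= dy <= dx.
std::vector<int> rows_with_float_error(int dx, int dy) {
    assert(dx > 0 && dy >= 0 && dy <= dx);
    const float derror = dy / static_cast<float>(dx);  // slope, hoisted out of the loop
    float error = 0;
    int y = 0;
    std::vector<int> rows;
    for (int x = 0; x <= dx; x++) {
        rows.push_back(y);  // the pixel is set before the error is updated
        error += derror;
        if (error > .5f) {  // more than half a pixel away: step to the next row
            y += 1;
            error -= 1.f;
        }
    }
    return rows;
}
```

For dx = 4 and dy = 1 the error takes the values 0.25, 0.5, 0.75 (step, back to -0.25), 0, and the rows come out as 0, 0, 0, 1, 1.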

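Fifth and final attempt

Why do we need floating point at all? The only reason is the single division by dx and the comparison with 0.5 in the loop body. We can get rid of floating point by replacing the error variable with another one. Let's call it error2 and assume it equals error*dx*2. Here is the equivalent code: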

void line(int x0, int y0, int x1, int y1, TGAImage &image, TGAColor color) { 
    bool steep = false; 
    if (std::abs(x0-x1)<std::abs(y0-y1)) { 
        std::swap(x0, y0); 
        std::swap(x1, y1); 
        steep = true; 
    } 
    if (x0>x1) { 
        std::swap(x0, x1); 
        std::swap(y0, y1); 
    } 
    int dx = x1-x0; 
    int dy = y1-y0; 
    int derror2 = std::abs(dy)*2; 
    int error2 = 0; 
    int y = y0; 
    for (int x=x0; x<=x1; x++) { 
        if (steep) { 
            image.set(y, x, color); 
        } else { 
            image.set(x, y, color); 
        } 
        error2 += derror2; 
        if (error2 > dx) { 
            y += (y1>y0?1:-1); 
            error2 -= dx*2; 
        } 
    } 
}

%   cumulative   self              self     total 
 time   seconds   seconds    calls  ms/call  ms/call  name 
 42.77      0.91     0.91 204000000     0.00     0.00  TGAImage::set(int, int, TGAColor) 
 30.08      1.55     0.64  3000000     0.00     0.00  line(int, int, int, int, TGAImage&, TGAColor) 
 21.62      2.01     0.46 204000000     0.00     0.00  TGAColor::TGAColor(int, int) 
  1.88      2.05     0.04        2    20.02    20.02  TGAColor::TGAColor(unsigned char, unsigned char, unsigned char, unsigned char)
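The substitution error2 = error*dx*2 in the fifth attempt can be checked term by term against the fourth attempt. Writing e for error and e_2 for error2, and assuming dx > 0 (which holds after the swaps unless both end points coincide), each floating-point operation maps to an integer one:

```latex
\begin{align*}
e_2 &= 2\,dx \cdot e \\
e \leftarrow e + \frac{|dy|}{dx} \quad &\Longrightarrow \quad e_2 \leftarrow e_2 + 2\,|dy| \\
e > \tfrac{1}{2} \quad &\Longleftrightarrow \quad e_2 > dx \\
e \leftarrow e - 1 \quad &\Longrightarrow \quad e_2 \leftarrow e_2 - 2\,dx
\end{align*}
```

These are exactly the updates `error2 += derror2`, `error2 > dx` and `error2 -= dx*2` in the code, with `derror2 = std::abs(dy)*2`.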

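Now it is enough to remove the unnecessary copies by passing the color by reference, and we are done. The final version of line() uses no division and no floating-point arithmetic; the only multiplications left are the doublings by 2. The time spent in line() itself drops from 2.95 seconds to 0.64 seconds.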
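A minimal sketch of what passing the color by reference could look like is given below. `Color` and `Canvas` are hypothetical stand-ins for TGAColor and TGAImage so that the example compiles on its own; hoisting `dx*2` and the step direction out of the loop is an extra choice made here, not something the article's code does.

```cpp
#include <cstdlib>
#include <set>
#include <utility>

struct Color { unsigned char r, g, b, a; };  // hypothetical stand-in for TGAColor

struct Canvas {  // hypothetical stand-in for TGAImage
    std::set<std::pair<int, int>> pixels;
    void set(int x, int y, const Color&) { pixels.insert({x, y}); }
};

// Fifth attempt with the color taken by const reference, so no copy is made
// per call or per pixel.
void line(int x0, int y0, int x1, int y1, Canvas& image, const Color& color) {
    bool steep = false;
    if (std::abs(x0 - x1) < std::abs(y0 - y1)) {
        std::swap(x0, y0);
        std::swap(x1, y1);
        steep = true;
    }
    if (x0 > x1) {
        std::swap(x0, x1);
        std::swap(y0, y1);
    }
    const int dx = x1 - x0;
    const int dx2 = dx * 2;                      // hoisted out of the loop
    const int derror2 = std::abs(y1 - y0) * 2;
    const int ystep = (y1 > y0 ? 1 : -1);        // hoisted out of the loop
    int error2 = 0;
    int y = y0;
    for (int x = x0; x <= x1; x++) {
        if (steep) {
            image.set(y, x, color);
        } else {
            image.set(x, y, color);
        }
        error2 += derror2;
        if (error2 > dx) {
            y += ystep;
            error2 -= dx2;
        }
    }
}
```

Drawing (13,20)-(80,40) in either direction produces the same 68 pixels, including both end points, and the steep segment (20,13)-(40,80) also gets 68 pixels.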

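Wireframe rendering

Now we are ready to render a wireframe. You can find the source code and the test model here. I used the Wavefront obj file format to store the model. All our renderer needs from the file is the array of vertices, which is given in lines of the following form: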

v 0.608654 -0.568839 -0.416318

Each such line holds the x, y, z coordinates of one vertex. The triangle faces are given like this:

f 1193/1240/1193 1180/1227/1180 1179/1226/1179

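We are interested in the first number after each space: it is the index of the vertex in the vertex array read earlier. So this line says that the vertices numbered 1193, 1180 and 1179 form a triangle. The file model.cpp in the source code contains a simple parser. With the following loop in main.cpp, our wireframe model gets drawn.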

for (int i=0; i<model->nfaces(); i++) { 
    std::vector<int> face = model->face(i); 
    for (int j=0; j<3; j++) { 
        Vec3f v0 = model->vert(face[j]); 
        Vec3f v1 = model->vert(face[(j+1)%3]); 
        int x0 = (v0.x+1.)*width/2.; 
        int y0 = (v0.y+1.)*height/2.; 
        int x1 = (v1.x+1.)*width/2.; 
        int y1 = (v1.y+1.)*height/2.; 
        line(x0, y0, x1, y1, image, white); 
    } 
}
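The face-line format described above can be parsed with a few lines of standard C++. The function below is a minimal sketch, not the parser from model.cpp; `parse_face` is a hypothetical name. It assumes that indices are stored zero-based: obj indices start at 1, so one is subtracted. Negative (relative) obj indices are not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Extracts the zero-based vertex indices from one obj face line such as
// "f 1193/1240/1193 1180/1227/1180 1179/1226/1179".
// Returns an empty vector if the line is not a face line.
std::vector<int> parse_face(const std::string& text) {
    std::istringstream in(text);
    std::string tag;
    in >> tag;  // leading keyword, "f" for faces
    std::vector<int> indices;
    if (tag != "f") {
        return indices;
    }
    std::string token;
    while (in >> token) {
        // std::stoi stops at the first character that is not part of the
        // number, so only the index before the first '/' is read.
        indices.push_back(std::stoi(token) - 1);  // obj indices start at 1
    }
    return indices;
}
```

For the sample face line, this yields the indices 1192, 1179 and 1178, which can be passed to a vertex lookup such as `model->vert(...)` in the loop above.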