Program memory


Building a C++ program takes three steps:

(1) Preprocessor: recognizes meta-information about the code.

This meta-information takes the form of directives. The preprocessor handles lines that start with the # character, such as #include <iostream>:

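-----#include <file>: inserts the specified header file into the code at the location of the directive

-----#define [key]  [value]: replaces the key with the value everywhere it appears; used to define constants or macros

-----#ifndef [key] ... #endif: processes the enclosed lines only if [key] has not been defined yet (the usual include-guard pattern):

       #ifndef [key]

       #define [key]

      .............................

       #endif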
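To make the three directives concrete, here is a minimal sketch in C. The guard name DEMO_CONFIG_H, the macros BUFFER_SIZE and SQUARE, and the two helper functions are all hypothetical names made up for this illustration; in a real project the guarded part would normally live in its own header file.

```c
#include <stddef.h>   /* #include: pulls in the declaration of size_t */

/* Include guard: the lines between #ifndef and #endif are processed
   only if DEMO_CONFIG_H (a hypothetical name) is not defined yet. */
#ifndef DEMO_CONFIG_H
#define DEMO_CONFIG_H

#define BUFFER_SIZE 64            /* object-like macro: a named constant */
#define SQUARE(x)   ((x) * (x))   /* function-like macro */

#endif /* DEMO_CONFIG_H */

/* After preprocessing, the compiler sees: static char buffer[64]; */
static char buffer[BUFFER_SIZE];

/* Size of the buffer in bytes, as fixed by the macro above. */
size_t buffer_capacity(void)
{
    return sizeof buffer;
}

/* After preprocessing, the body is: return ((n + 1) * (n + 1)); */
int square_of_next(int n)
{
    return SQUARE(n + 1);
}
```

The parentheses in SQUARE matter: the preprocessor substitutes text, not values, so without them SQUARE(n + 1) would expand to n + 1 * n + 1.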


 

(2) Compiler: translates the source code into machine-dependent object code.

(3) Linker: links all the individual object files together into an application (.exe).


Once the .exe file has been generated, we can run it.

For example, take this program:

#include <stdio.h>

int g_i = 100; /* A global variable */
int g_j; /* An uninitialized global variable */

int main(void) /* A function */
{
    int l_i = 1; /* A local variable */
    static int s_i = 2; /* A static local variable */
    int c;
    for (c = 0; c < 1000; c++)
    {
        l_i += c;
    }

    return 0;
}

Compiling the program above produces an executable (.exe) file.

When this code is compiled and linked, we have an executable “program”. When we execute the program, this code is loaded into the virtual address space that the operating system allocates for it. In this post I will talk about how the instructions and the data of the program are arranged in its virtual address space.

Basically, the memory space of a program has two parts: the code segment, which holds the program’s executable instructions, and the data segment, which holds the data manipulated by those instructions.

Once the executable has been loaded into the virtual address space, the program is laid out there as follows:

[Figures: two diagrams of the memory layout of a program]

Notes on each segment:

Code Segment

The code segment, more often called the text segment, usually starts at a low address and contains the executable instructions (code) of the program. The text segment has a fixed size and is normally mapped read-only, which means that once it is loaded, the program cannot modify its contents.

In our example, the text segment contains the machine instructions corresponding to our main() function, including the initialization of the local variable l_i, the initialization of the loop counter c, and the loop itself.

“Data Segment”

The “data segment” is more complicated than the code segment, and also more “active”. It can be further divided into four sections.

Initialized Data Segment

The Initialized Data Segment is the portion of the memory space that contains the global variables and static variables (both file-scope statics and static locals) that are explicitly initialized in the code. In our example, the global variable g_i and the static variable s_i are stored in the Initialized Data Segment.
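One visible consequence of this placement is that a static local keeps its value between calls, while an automatic local is recreated on every call. The sketch below reuses the names g_i, s_i and l_i from the example program; the two bump_ functions are hypothetical helpers added only for illustration.

```c
int g_i = 100;   /* initialized global: stored in the initialized data segment */

/* s_i is initialized once, when the program is loaded, and then
   lives outside the stack, so every call sees the previous value. */
int bump_static(void)
{
    static int s_i = 2;
    s_i += 1;
    return s_i;
}

/* l_i is an automatic variable: it is created and set to 1 again
   on every call, so the result never changes. */
int bump_local(void)
{
    int l_i = 1;
    l_i += 1;
    return l_i;
}
```

Calling bump_static() twice returns 3 and then 4, whereas bump_local() returns 2 both times.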

If you read carefully enough, you may have noticed that I put the title of this section in quotes. That is because “data segment” usually refers to the Initialized Data Segment alone. I used the term “Data Segment” in the title and in the illustration as a general term meaning “a segment containing data”, which covers the Initialized Data Segment, the Uninitialized Data Segment, the heap and the stack. If you are familiar with x86 assembly language, you would probably say “start a data section with the .DATA directive” and “allocate stack space with the .STACK directive”. Here, though, I use “Data Segment” to refer to all the sections that are used to store data.

Uninitialized Data Segment

The Uninitialized Data Segment, also referred to as BSS, contains all the global variables and static variables that are not explicitly initialized by the programmer; they start out as zero. In our example, the global variable g_j will be stored in the Uninitialized Data Segment.
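C guarantees that objects with static storage duration and no explicit initializer start out as zero, and toolchains typically satisfy this by placing them in BSS. A small sketch, assuming the names s_table and bss_starts_zeroed (both made up here) next to g_j from the example:

```c
#include <stddef.h>

int g_j;                   /* uninitialized global: typically placed in BSS */
static int s_table[256];   /* uninitialized static array: typically BSS too */

/* Returns 1 if all the objects without an initializer read as zero. */
int bss_starts_zeroed(void)
{
    static int s_local;    /* uninitialized static local */
    size_t i;

    if (g_j != 0 || s_local != 0)
        return 0;
    for (i = 0; i < sizeof s_table / sizeof s_table[0]; i++) {
        if (s_table[i] != 0)
            return 0;
    }
    return 1;
}
```

An automatic (stack) variable gets no such guarantee: reading it before assigning a value is an error.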

Stack section and Heap area

The stack section and the heap area, facing each other, occupy the rest of the virtual memory space of the program. Usually, the stack section starts at the highest address of the virtual memory space and grows toward lower addresses. Conversely, the heap area starts at the lowest address after the uninitialized data segment and grows toward higher addresses.

The stack section is used to store automatic variables (that is, non-static local variables) and the calling environment each time a function is called. In the stack section, space for variables is allocated dynamically by moving the stack pointer, which indicates the top of the stack, up and down. When a variable goes out of scope, the stack pointer simply moves back up (the stack grows downward, so popping moves the pointer toward higher addresses) and the variable's space is no longer usable. This way of managing memory makes allocation on the stack very fast. I will talk about memory allocation and variable scope in future posts.
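To see that every call gets its own space on the stack, the following sketch records the address of the same local variable at several recursion depths. The function record_frames and the constant MAX_DEPTH are hypothetical names for this illustration. The addresses are guaranteed to differ because all the frames are alive at the same time; whether they decrease with depth depends on the platform, so the code makes no claim about the direction.

```c
#include <stdint.h>

#define MAX_DEPTH 4   /* illustrative recursion depth */

/* Stores the address of this call's automatic variable in addrs[depth],
   then recurses. Returns the deepest level that was reached. */
int record_frames(uintptr_t addrs[], int depth)
{
    volatile int local = depth;   /* lives in this call's stack frame */
    int deepest = depth;

    addrs[depth] = (uintptr_t)(void *)&local;
    if (depth + 1 < MAX_DEPTH)
        deepest = record_frames(addrs, depth + 1);

    /* Reading local after the call keeps this frame in use across it;
       local still equals depth, so the sum is just deepest. */
    return deepest + (local - depth);
}
```

Printing the recorded addresses on a typical desktop system shows them getting smaller at each depth, which matches the downward growth described above.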

The heap area is a region commonly used for dynamic memory allocation, and is managed by malloc, realloc and free. Allocating space for a new variable on the heap is usually much slower than on the stack, because repeated allocating and freeing can leave the heap fragmented into non-contiguous regions. The heap area is shared by all threads of the program’s process. In contrast, each thread possesses its own stack section.
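Unlike stack space, heap space stays valid after the function that allocated it returns, until it is explicitly freed. A minimal sketch using malloc, realloc and free; the function names make_squares and grow_squares are invented for this example.

```c
#include <stdlib.h>

/* Returns a heap array holding the squares 0*0 .. (n-1)*(n-1), or NULL
   if the allocation fails. The caller owns the memory and must free() it. */
int *make_squares(size_t n)
{
    size_t i;
    int *a = malloc(n * sizeof *a);

    if (a == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        a[i] = (int)(i * i);
    return a;
}

/* Grows the array from old_n to new_n elements and fills in the new
   squares. On failure, returns NULL and leaves the old block untouched. */
int *grow_squares(int *a, size_t old_n, size_t new_n)
{
    size_t i;
    int *bigger = realloc(a, new_n * sizeof *bigger);

    if (bigger == NULL)
        return NULL;
    for (i = old_n; i < new_n; i++)
        bigger[i] = (int)(i * i);
    return bigger;
}
```

A caller would typically write int *s = make_squares(4); then s = grow_squares(s, 4, 8); and finally free(s); once the array is no longer needed. Note that realloc may move the block, so only the returned pointer should be used afterwards.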


Note that the sizes of the text and data segments are fixed once the program has been compiled. The sizes of the stack and heap segments, on the other hand, can grow and shrink as the program runs.

	High address  +-----------+
	              |           |
	              |   stack   |  (grows downward)
	              |     |     |
	              |     v     |
	              +-----------+
	              |           |
	              |           |
	              |           |
	              +-----------+
	              |     ^     |
	              |     |     |
	              |   heap    |  (grows upward)
	              +-----------+
	              |   data    |
	              +-----------+
	              |   text    |
	Low address   +-----------+

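The layout can be observed from inside a running program by taking one sample address from each region. This is only a sketch: struct layout_sample, sample_layout and print_layout are hypothetical names, and the actual numbers (and even their relative order) depend on the operating system, the compiler, and address space randomization.

```c
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int g_init = 100;   /* initialized global: data segment */
int g_uninit;       /* uninitialized global: BSS */

struct layout_sample {
    uintptr_t text, data, bss, heap, stack;
};

/* Fills *out with one address per region.
   Returns 0 on success, -1 if malloc fails. */
int sample_layout(struct layout_sample *out)
{
    int local = 0;                            /* automatic variable: stack */
    int *dynamic = malloc(sizeof *dynamic);   /* dynamic allocation: heap */

    if (dynamic == NULL)
        return -1;

    out->text  = (uintptr_t)&sample_layout;   /* address of code */
    out->data  = (uintptr_t)(void *)&g_init;
    out->bss   = (uintptr_t)(void *)&g_uninit;
    out->heap  = (uintptr_t)(void *)dynamic;
    out->stack = (uintptr_t)(void *)&local;

    free(dynamic);
    return 0;
}

/* Prints the samples from the top of the diagram to the bottom. */
void print_layout(const struct layout_sample *s)
{
    printf("stack 0x%" PRIxPTR "\n", s->stack);
    printf("heap  0x%" PRIxPTR "\n", s->heap);
    printf("bss   0x%" PRIxPTR "\n", s->bss);
    printf("data  0x%" PRIxPTR "\n", s->data);
    printf("text  0x%" PRIxPTR "\n", s->text);
}
```

On many desktop systems the printed values decrease from stack down to text, as in the diagram, but this ordering is a convention of the platform and not something the C language promises.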
We can have as many .text and .data blocks in the source code as we want. The assembler will consolidate all the .text blocks into the text segment, and all the .data blocks into the data segment.

Each subprogram in a MAL program should have its own .text block and its own .data block.

	# Subprogram 1
	
		.data
		# Variables for subprogram 1
		
		.text
		# Subprogram body
		
		ret
	
	# Subprogram 2
	
		.data
		# Variables for subprogram 2
		
		.text
		# Subprogram body
		
		ret
	
	# Main program
	
		.data
		# Variables for main
		
		.text
		# Main body
		
		ret
	

Why use multiple .text and .data sections in a program?

Variables should be defined along with the subprogram that uses them, for the sake of readability.

From the assembler's point of view, all variables are global. The notion of a variable's scope in C or Java is enforced by the compiler, not the hardware. Hardware knows only about memory addresses, and the compiler must keep track of which addresses are used by each subprogram.
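The same point can be seen from C. In the sketch below (ticket_storage and next_ticket are hypothetical names), the variable hidden can only be named inside one function, yet once its address is handed out, any code can read or write it. Scope restricts the name at compile time; the memory itself is just an address.

```c
/* "hidden" is in scope only inside this function, but it has static
   storage, so its address stays valid for the whole run. */
int *ticket_storage(void)
{
    static int hidden = 0;
    return &hidden;
}

/* Increments the hidden counter through its address and returns it. */
int next_ticket(void)
{
    int *p = ticket_storage();
    *p += 1;
    return *p;
}
```

After *ticket_storage() = 41; the next call to next_ticket() returns 42, even though the assignment was made by code that cannot see the name hidden.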

In assembly language, it is the programmer's responsibility to ensure that each subprogram accesses only its own variables. Although the assembler will not prevent it, you must ensure that no subprogram accesses variables of another subprogram. If you do, your program is not modular, and your subprograms are not independent, portable modules as they should be.

A subprogram should never be considered part of the larger program it is used in. It should be considered an independent module that can be used within any other program without modification.


Example 1:

The size(1) command reports the sizes (in bytes) of the text, data, and bss segments. For more details, please refer to the man page of size(1).

1. Check the following simple C program

#include <stdio.h>
 
int main( void )
{
     return 0;
}
[narendra@CentOS]$ gcc memory-layout.c -o memory-layout
[narendra@CentOS]$ size memory-layout
   text    data     bss     dec     hex filename
    960     248       8    1216     4c0 memory-layout

2. Let us add one global variable to the program, then check the size of bss.

#include <stdio.h>
 
int global; /* Uninitialized variable stored in bss*/
 
int main( void )
{
     return 0;
}
[narendra@CentOS]$ gcc memory-layout.c -o memory-layout
[narendra@CentOS]$ size memory-layout
   text    data     bss     dec     hex filename
    960     248      12    1220     4c4 memory-layout

3. Let us add one static variable which is also stored in bss.

#include <stdio.h>
 
int global; /* Uninitialized variable stored in bss*/
 
int main( void )
{
     static int i; /* Uninitialized static variable stored in bss */
     return 0;
}
[narendra@CentOS]$ gcc memory-layout.c -o memory-layout
[narendra@CentOS]$ size memory-layout
   text    data     bss     dec     hex filename
    960     248      16    1224     4c8 memory-layout

4. Let us initialize the static variable which will then be stored in Data Segment (DS)

#include <stdio.h>
 
int global; /* Uninitialized variable stored in bss*/
 
int main( void )
{
     static int i = 100; /* Initialized static variable stored in DS*/
     return 0;
}
[narendra@CentOS]$ gcc memory-layout.c -o memory-layout
[narendra@CentOS]$ size memory-layout
   text    data     bss     dec     hex filename
    960     252      12    1224     4c8 memory-layout

5. Let us initialize the global variable which will then be stored in Data Segment (DS)

#include <stdio.h>
 
int global = 10; /* initialized global variable stored in DS*/
 
int main( void )
{
     static int i = 100; /* Initialized static variable stored in DS*/
     return 0;
}
[narendra@CentOS]$ gcc memory-layout.c -o memory-layout
[narendra@CentOS]$ size memory-layout
   text    data     bss     dec     hex filename
    960     256       8    1224     4c8 memory-layout


