The 5-minute Guide to C Pointers

http://denniskubes.com/2012/08/16/the-5-minute-guide-to-c-pointers/

Pointers, Addresses and Dereferencing

What is a pointer? What is a memory address? What does dereferencing a pointer mean?

A pointer is a variable that holds the location of, literally the address-of, a variable stored in computer memory. An address is the location of a variable in computer memory. Dereferencing a pointer means getting the value stored in memory at the address which the pointer “points” to.

The * is the value-at-address operator, also called the indirection operator. It is used both when declaring a pointer and when dereferencing a pointer. I will show why it is helpful to read it as the value-at-address operator in the code below. When we talk about getting the value-at-address of a pointer we are talking about getting the value stored in memory at the address which the pointer “points” to.

The & is the address-of operator and is used to reference the memory address of a variable. When we declare and initialize a variable, three things happen. One, computer memory is set aside for the variable. Two, the variable name is linked to that location in memory. And three, the value of the variable is placed into the memory that was set aside. By using the & operator in front of a variable name we can retrieve the memory address of that variable. It is best to read this operator as address-of and we will show why below.

Take the following code, which shows some common notations for the value-at-address (*) and address-of (&) operators.

1 // declare a variable ptr which holds the value-at-address of an int type
2 int *ptr;
3 // declare an int and assign it the literal value of 1
4 int val = 1;
5 // assign to ptr the address-of the val variable
6 ptr = &val;
7 // dereference and get the value-at-address stored in ptr
8 int deref = *ptr;
9 printf("%d\n", deref);

Line 2 reads “declare a variable that points to the value-at-address of an int type”. In other words, declare a variable that holds a memory address where the value at that memory address is an int. On line 4 we declare an int variable and assign it the literal value 1. Line 6 reads “assign to the ptr pointer variable the address-of the variable val”. This isn’t the value of the val variable (which is 1), this is the address of the val variable in computer memory. Line 8 reads “assign to the deref variable the value-at-address stored in the ptr pointer variable”. We are getting the int value at the address stored in the ptr pointer, which in this case is 1. And finally we print out the deref variable on line 9, which prints 1.

Reading the operators as value-at-address (*) and address-of (&) helps to keep track of what is happening in any given line of code. I have found that sounding out the operators address-of and value-at-address helps to build a clearer mental model of what the code is doing. Soon it becomes second nature.

Think about pointers, addresses and values like envelopes, mailing addresses and houses. A pointer is like an envelope, on it we can write a mailing address. An address is like the mailing address, it is the location. The dereferenced value is like the house at the address, it is the value. A location represents the house but isn’t the house itself. We could erase the address on the envelope and write a different one if we wanted but that doesn’t change the house.
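The envelope analogy can be sketched in code. This is only a minimal illustration, and the names house_a, house_b, envelope and readdress_demo are made up for this sketch. Re-addressing the pointer leaves both ints untouched, while writing through the pointer changes whichever int it currently points to.

```c
#include <stdio.h>

/* Hypothetical demo: returns 1 if every step behaves as the analogy predicts. */
int readdress_demo(void)
{
    int house_a = 10;
    int house_b = 20;
    int *envelope = &house_a;   /* write house_a's address on the envelope */

    int first = *envelope;      /* value-at-address of house_a */

    envelope = &house_b;        /* erase it and write a different address */
    int second = *envelope;     /* value-at-address of house_b */

    *envelope = 25;             /* writing through the pointer changes house_b */

    printf("first=%d second=%d house_a=%d house_b=%d\n",
           first, second, house_a, house_b);

    /* house_a never changed, even though the envelope once pointed at it */
    return first == 10 && second == 20 && house_a == 10 && house_b == 25;
}
```

Calling readdress_demo() from main should print the four values and return 1.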

Using Pointers and Using Addresses

One question that comes up is where do you use a pointer versus where do you use an address? Let’s answer this with some code.

1 int *ptr;
2 int val = 1;
3 ptr = &val;
4  
5 // print out dereferenced values
6 printf("dereference *ptr = %d\n", *ptr);
7 printf("dereference address of val *(&val) = %d\n", *(&val));

Run that code and we get:

1 dereference *ptr = 1
2 dereference address of val *(&val) = 1

We create an int pointer, an int variable val, and assign the address-of the val variable to our pointer. We print out the dereferenced value of our pointer. And finally we print out the dereferenced value of the address-of val. This notation may look a little strange, but remember to sound out the operators and it says to get the value-at-address for the address-of val.

The point of this piece of code is to show that an address can be used wherever a pointer value is expected. An address can even be treated as a pointer and dereferenced directly, with the proper parentheses notation.
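The most common place this shows up is in function calls. The sketch below assumes two hypothetical functions, increment and increment_twice, which are not part of the code above. The function takes an int pointer, and we can hand it either a pointer variable or an address-of expression.

```c
/* Adds one to the int stored at the address that p holds. */
void increment(int *p)
{
    *p = *p + 1;
}

/* Hypothetical helper: calls increment once with a pointer variable
   and once with an address-of expression, then returns the result. */
int increment_twice(int start)
{
    int val = start;
    int *ptr = &val;

    increment(ptr);     /* pass a pointer variable */
    increment(&val);    /* pass the address-of val directly */

    return val;
}
```

Both calls change the same val, so increment_twice(1) returns 3.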

Void Pointers, NULL Pointers and Uninitialized Pointers

A pointer can be declared to be a void type. A pointer can be set to NULL or 0. And a pointer variable that has been declared but not yet assigned a value is considered uninitialized.

1 int *uninit;// leave the int pointer uninitialized
2 int *nullptr = 0;// initialized to 0, could also be NULL
3 void *vptr;// declare as a void pointer type
4 int val = 1;
5 int *iptr;
6 int *backptr;
7  
8 // void type can hold any pointer type or reference
9 iptr = &val;
10 vptr = iptr;
11 printf("iptr=%p, vptr=%p\n", (void*)iptr, (void*)vptr);
12  
13 // assign void pointer back to an int pointer and dereference
14 backptr = vptr;
15 printf("*backptr=%d\n", *backptr);
16  
17 // print null and uninitialized pointers
18 printf("uninit=%p, nullptr=%p\n", (void*)uninit, (void*)nullptr);
19 // don't know what you will get back, random garbage?
20 // printf("*uninit=%d\n", *uninit);
21 // will cause a segmentation fault
22 // printf("*nullptr=%d\n", *nullptr);

Run the above code and you should get something like the output below although with different memory addresses.

1 iptr=0x7fff94b89c6c, vptr=0x7fff94b89c6c
2 *backptr=1
3 uninit=0x7fff94b89d50, nullptr=(nil)

On line 1 we leave our int pointer uninitialized. Local pointers, like other local variables, are uninitialized until they are assigned a value. Uninitialized pointers hold garbage values that look like random memory addresses. Pointers can be assigned the literal value 0, the address-of a variable, or the value of another pointer. On line 2 we assign 0 to our pointer. I use the terms NULL pointer and pointer assigned a literal value of 0 interchangeably. On line 3 we declare a void pointer. Then we declare an int value and a couple of int pointers on lines 4-6.

On lines 9-11 we assign the address-of val to one of our int pointers and then we assign that int pointer to our void pointer. Void pointers can hold any object pointer type. They are mostly used for generic purposes such as data structures. Notice on line 11 we print out the locations held by both the int pointer and the void pointer. They now point to the same memory address. All pointers just hold memory addresses. The type a pointer is declared with matters when dereferencing it or doing arithmetic on it, which is why a void pointer has to be converted back to a typed pointer before it can be dereferenced.
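To give a feel for the generic use of void pointers, here is a small sketch of a swap routine that works for any object type. The function name swap_bytes and the 64 byte limit are assumptions made to keep the example short, not an established API. It relies only on memcpy from string.h.

```c
#include <string.h>

/* Swaps two objects of the same size through void pointers.
   Sketch only: handles objects of up to 64 bytes.
   Returns 0 on success and -1 if size is too large. */
int swap_bytes(void *a, void *b, size_t size)
{
    unsigned char tmp[64];

    if (size > sizeof tmp) {
        return -1;
    }
    if (a == b) {
        return 0;               /* nothing to do, and memcpy must not overlap */
    }
    memcpy(tmp, a, size);       /* tmp <- a */
    memcpy(a, b, size);         /* a   <- b */
    memcpy(b, tmp, size);       /* b   <- tmp */
    return 0;
}
```

The same function swaps two ints with swap_bytes(&x, &y, sizeof x) or two doubles with swap_bytes(&d1, &d2, sizeof d1). The void pointers carry only the address, so the caller has to supply the size.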

On lines 14-15 we assign our void pointer back to our backptr int pointer. Then we dereference and print the value pointed to by backptr, which is 1.

Line 18 is interesting. Here we print out the values of our uninitialized and NULL pointers. Notice that our uninitialized pointer holds a memory location. This is a garbage location. Any local variable that is declared but not initialized holds a garbage value, and pointers are no exception. There is no telling what is at that location in memory. This is the reason why you should never try to dereference an uninitialized pointer. Best case you get garbage back and you have a fun time trying to debug your program. Worst case the program crashes badly.

A 0 value is the only integer literal that can be assigned to a pointer without generating a warning or using an explicit cast. The NULL macro, defined in stddef.h and also made available by headers such as stdlib.h and stdio.h, can be used in place of the literal 0. Assigning NULL or 0 is useful as a placeholder until you are ready to use the pointer. This prevents the pointer from holding a random memory location. NULL pointers should never be dereferenced, though, as doing so causes undefined behavior. On many systems it will cause a segmentation fault.
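One practical payoff of initializing pointers to NULL is that a NULL pointer can be tested for before it is used. The function below is a hypothetical helper, read_or_default, written for this illustration.

```c
#include <stddef.h>

/* Returns the int that p points to, or fallback when p is NULL. */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL) {
        return fallback;    /* never dereference a NULL pointer */
    }
    return *p;
}
```

Note that this check only works for pointers that were actually set to NULL. An uninitialized pointer holds garbage that will most likely pass the test, which is another reason to initialize pointers when you declare them.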

Pointers and Arrays

Arrays in C are contiguous blocks of memory that hold multiple objects of a specified type. Contrast that with a pointer, which holds a single memory location. Arrays and pointers are not the same thing and are not interchangeable. That being said, in most expressions an array variable is converted to the memory address of the first element of the array.

An array variable is constant. You can’t assign a pointer to an array variable, even if the pointer variable actually points to the same or a different array. You also cannot assign one array variable to another. You can assign an array variable to a pointer though and that is where things get confusing. When assigning the array to the pointer we are actually assigning the address of the first element in the array to the pointer.

1 int myarray[4] = {1,2,3,0};
2 int *ptr = myarray;
3 printf("*ptr=%d\n", *ptr);
4  
5 // cannot do this, array variables are constant
6 // myarray = ptr
7 // myarray = myarray2
8 // myarray = &myarray2[0]

On line 1 we initialize the array of ints. On line 2 we initialize the int pointer and assign the array variable to it. Since the array variable is converted to the memory address of the first element in the array, we have assigned the memory address of the first element in the array to the pointer. This is the same as doing int *ptr = &myarray[0], explicitly stating the address-of the first element in the array. Notice the pointer has to be the same type as the elements of the array, unless the pointer is a void pointer.
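Because an array is handed to a function as the address of its first element, the function sees only a pointer and has to be told the length separately. The sketch below uses two hypothetical names, sum_ints and ARRAY_LEN, that are not part of the code above.

```c
#include <stddef.h>

/* Sums n ints starting at the address that p holds.
   p[i] is defined as *(p + i), so either notation would work here. */
int sum_ints(const int *p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *(p + i);
    }
    return total;
}

/* Number of elements in an array. Only valid where the real array
   type is visible, not on a pointer that was assigned from an array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
```

With the array from the code above, sum_ints(myarray, ARRAY_LEN(myarray)) adds up all four elements. This is also one place where the difference between arrays and pointers is visible: sizeof myarray is the size of the whole array, while sizeof ptr is just the size of one pointer.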

Pointers and Structs

Much as an array variable gives the address of its first element, a pointer to a struct holds the memory address of the struct, which is also the address of its first member. And as with arrays, pointers to structs must be declared to point to the struct type or be of void type.

1 struct person {
2   int age;
3   char *name;
4 };
5 struct person first;
6 struct person *ptr;
7  
8 first.age = 21;
9 char *fullname = "full name";
10 first.name = fullname;
11 ptr = &first;
12  
13 printf("age=%d, name=%s\n", first.age, ptr->name);

On lines 1-6 we declare the struct person, a variable to hold a person struct, and a pointer to a person struct. On line 8 we assign a literal int to the age member. On lines 9-10 we declare a char pointer to a literal char array and then assign that to the struct name member. On line 11 we assign the address-of the first person struct to our struct pointer variable.

On line 13 we print out the age and name of our struct instance. Notice the two different notations, the . and the ->. With the age field we are accessing the struct instance directly and so we use the . notation. With the name field we are using our pointer to the struct instance and so we use the -> notation. This would be the same as doing (*ptr).name where we first dereference the pointer and then access the name field.
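Struct pointers matter most when a function needs to change the caller's struct. The sketch below reuses the struct person definition from above and adds two hypothetical functions, have_birthday and age_next_year, to contrast passing a pointer with passing a copy.

```c
struct person {
    int age;
    char *name;
};

/* Adds one year to the person that p points to; p must not be NULL. */
void have_birthday(struct person *p)
{
    p->age += 1;            /* same as (*p).age += 1 */
}

/* Takes the struct by value, so only the local copy is changed. */
int age_next_year(struct person p)
{
    p.age += 1;
    return p.age;
}
```

Calling age_next_year(first) returns 22 but leaves first.age at 21, while have_birthday(&first) changes first.age itself.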

Pointers to Pointers

A pointer holds the address-of a variable. That variable can be another pointer. Take the following code:

1 int val = 1;
2 int *ptr = 0;
3 // declare a variable ptr2ptr which holds the value-at-address of
4 // an *int type which in turn holds the value-at-address of an int type
5 int **ptr2ptr = 0;
6 ptr = &val;
7 ptr2ptr = &ptr;
8 printf("&ptr=%p, &val=%p\n", (void*)&ptr, (void*)&val);
9 printf("ptr2ptr=%p, *ptr2ptr=%p, **ptr2ptr=%d\n", (void*)ptr2ptr, (void*)*ptr2ptr, **ptr2ptr);

When run you should get output similar to this but with different memory addresses.

1 &ptr=0x7fff390fa6f8, &val=0x7fff390fa70c
2 ptr2ptr=0x7fff390fa6f8, *ptr2ptr=0x7fff390fa70c, **ptr2ptr=1

Lines 1-2 declare an int variable val and an int pointer variable ptr. Line 5 is new. Here we are saying that we have a variable ptr2ptr that holds the value-at-address of an *int type, which in turn holds the value-at-address of an int type. In other words we have an address that when we dereference it we get another address. When we dereference that address we get an int value. There is no limit to the levels of indirection this can go to, but usually after 2-3 indirections it becomes too hard to keep a clear picture of what the code is doing.

On line 6 we assign the ptr variable the address-of the val variable. We have seen this before. On line 7 we assign the ptr2ptr variable the address-of the ptr pointer variable. Double indirection. The ptr2ptr variable stores the address-of ptr, which in turn stores the address-of val. On line 8 we print out the address-of the ptr and val variables. On line 9 we print out the value stored in ptr2ptr, which is the same as &ptr. When we dereference that we get the address of val. When that is dereferenced we get the value 1. *ptr2ptr reads get the value-at-address stored in ptr2ptr, which is the value of ptr, that is, the address-of val. **ptr2ptr reads get the value-at-address stored in ptr2ptr, which is also an address, and then get the value-at-address of that address.
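A typical reason to use a pointer to a pointer is to let a function change where the caller's pointer points. The function point_at_larger below is a hypothetical example written for this purpose.

```c
/* Makes *target point at whichever of a or b holds the larger int.
   target is an int **, so writing to *target changes the caller's pointer. */
void point_at_larger(int **target, int *a, int *b)
{
    *target = (*a >= *b) ? a : b;
}
```

Given int x = 3, y = 9; and int *ptr = 0;, the call point_at_larger(&ptr, &x, &y) leaves ptr holding the address-of y, so *ptr is 9. Passing ptr itself would not work, because the function would only receive a copy of the address stored in ptr.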

Conclusion

Hope this brief overview helps with some of the different types of pointers you will see. In a later post I might go into another type of advanced pointer that is used, the function pointer.

Questions and comments are always welcome.

Update 1: Thanks to phaomauchter and jorgem from hacker news for improvement suggestions. Made some changes to make some explanations more clear.
Update 2: Fixed spelling errors. Fixed unclear wording. Tried to make NULL pointer explanations more clear. Thanks to everybody for the comments and suggestions.
Update 3: Added a pointers to pointers section. Made descriptions of operators more clear.
Update 4: Thanks to JoachimS and Dinah for catching more spelling errors. I need to take a class in proofreading. Thanks to Radu for helping to clarify the value-at-address operator.
Update 5: Added a section about using pointers versus using addresses.


