The Stack, Pointers, And Memory


Table Of Contents


Introduction

In response to the general lack of understanding and even worse, misunderstanding of how memory and pointers works inside of modern computers, I decided to write this document describing the inherant simplicity of it all.  Let's get to it.

 

Essentials

Memory, its properties and organization is a very simple idea to comprehend once people put away their often over-complexified ideas of what they believe it is.  One should keep in mind the simple idea that a computer is little more than a over-grown calculator with better I/O.

Basic Memory Format

Everthing inside of a computer is in binary.  Programs, text files, music, and everything else has been simplified to a linear set of bytes, with each byte consisting of eight bits.  A bit is of course a storage item capable of holding only two values: zero and one.  It is up to the running programs on the system to try & extract the meaning of specific bytes.  For example, CD quality music is simply a set of 16 bit digital samples taken 44 thousand times per second.  Executable programs are made up of a set of variable-length instructions for the processor.  Binary values for a few instructions (there are literally hundreds of instructions on many processors) are listed below:

 
 
Instruction
Binary Value
Divide
11110110
Multiply
0000111110101111
Subtract
00101000
Move Data Byte
10100100
2's Complement Negation
11110110
Compare String Data
10101110
Set Carry Flag
11111001
Figure 1: Program Instructions and their Binary Equivalents (Intel i486)

Basic Memory Organization

Memory is nothing more than a linear set of bytes.  The computer needs a method to access all the memory on the system and it does this through addressing: giving each byte in memory a name.  This name is basically a number.  The first byte of memory is 0.  The address of the last byte of memory depends on how much memory is present within the computer.  For example, if the computer has 100 bytes of memory, the address of the last byte in memory is 99.  There is a valid reason why computers count the first byte as byte 0 instead of byte 1, and the reason for that will become obvious later; for now, please accept that computers always start counting at 0.

 

Let me also note that these bytes can only be read and written.  One cannot 'move' memory; only copy it.  You can't move memory in the traditional sense of moving a piece of furniture across a room.  Instead, one has to copy the values to the new destination.  Usually, in a memory 'move' operation, the values are copied to the new location and then the memory at the old location is re-used for other purposes.

Pointers

Pointers are nothing more than a variable holding an address in memory.  For example,  the C declaration
char * foo;
declares a variable foo that is a pointer to a char.  The variable foo itself contains a memory address.   This declaration says that whatever is at the memory address foo contains should be interpreted as a char.

Note that the phrase "should be interpreted as," it underscores the earlier point that memory is just a set of bytes.  A char is simply a one byte value.  The name char is short for character, because one byte is the correct size to hold the ASCII value for one character.  The ASCII character set has a total of 256 values; the minimum value for a byte is 0 and the highest is 255.  Shabang, that's 256 possible values.  That is the only real correlation between a char and a letter of the alphabet.  Every letter of the alphabet has one of the 256 values in the ASCII table.  One byte can have any of 256 possible values.

The Intel 386 and its successors are capable of accessing up to 4,294,967,296 bytes of memory.  A 32 bit value has a minimum value of 0 and a maximum value of 4,294,967,295.  So, a 32 bit value will hold the addresses of any byte in memory.  On Intel machines, all pointers are 4 bytes == 32 bits long.

Program Segments

Memory within a computer typically is made of segments:  different portions of memory with specialized attributes.  Note that this is different from segmented memory, which I don't plan to cover here.

These segments are simply regions of memory with a starting and ending address, assigned attributes.  These attributes are usually access permissions, such as read, write, and execute permission.  Basically, a program will need access for a given operation upon a given segment.  A program cannot write to a segment that does not have the write permission attribute assigned to it.  The reason for this is safety: if a segement does not have write access enabled, an attempt to write to it is most probably a mistake by the program; which means it was a mistake by the programmer.
Common problems like "General Protection Faults," "Unrecoverable Application Errors," and the like are all cases of a program attempting to access a segment in a way it does not have permission to do.  Reasons for this will become obvious later.
 

Runtime Memory Organization

When a program runs in memory, it has the following memory organization:

 
 
CODE(RX)
DATA(RW)
BSS(RW)
HEAP(RW)
STACK(RW)
Figure 2: Loaded Program Segment Organization
 
Abbreviation
Permission
R
Read Access
W
Write Access
X
Execute Access
   These are all segments.  The letters in parenthesis list the access permissions of each segment.   Each segment is described in detail below.

 

The CODE Segment

This segment contains all the compiled executable code for the program.  Write permission to this segment has been disabled for two reasons:
  1. Code does not contain any sort of variables, and so the code has no practical reason to write over itself
  2. With the code segment read-only, it can be shared between different copies of the program executing at the same time
In the older days of computing, code use to modify itself for an increase of runtime speed, but today's modern processors are optimized for read-only code and any modification to code would only slow the processors down (the reason for this has to do with the processor's pipelining, but that will not be covered here).   So, one may safely assume that if a program attempts to modify its own code,  the attempt was unintentional.  Upon such an attempt, the operating system will usually alert the user that the program just shot itself in the foot and then kill the program.

The DATA and BSS Segments

These segments contains all the global data for the program.  Why they are seperate is beyond the scope of this document.  Obviously, a correct program would want to read and write to its global variables, and thus this segment had read and write access enabled.  One does not usually want to execute their global variables, and so execute permission is disabled.

 

The STACK Segment

In general, a stack is a single ended data structure with a first in, last out data ordering.  That means that if item A is placed into the stack ("Pushed"), and then item B is placed on the stack, then item B must be removed ("Popped") from the stack before item A may be retrieved.  This may seem kind of pointless, but the usefullness will be demonstrated soon.

The system implements a stack for all running programs.  The implementation of a stack is very simple: one variable kept inside the processor itself and a region of memory
 

Yes, the processor has a small, limited set of variables inside it.  These variables are called registers.  The number of registers is very limited, often times less than ten.  In most high-level languages, register access isn't permitted, but don't worry; the code the compiler generates uses registers heavily (and for yet more reasons not discussed here, it kind of has to).  But back to the topic at hand.

The register used for the stack is called the Stack Pointer, or SP for short.  The region of memory is the same very Stack segment we've been talking about.  Since a stack only has two operations, pushing (adding an item to the stack), and popping (removing an item from the stack), the easiest way to describe a stack is to just describe both of these operations.

When the program is loaded, the stack pointer is set to the highest address of the stack segment.  When an item is pushed onto the stack, two things occur.  First of all, the size of the item, in bytes, is subtracted from the stack pointer.  Then, all the bytes the item consists of are copied into the region of the stack segment that the stack pointer now points to.  When an item is popped from the stack, the size of the item is added to the stack pointer.  The copy of the item still resides on the stack, but will be overwritten when the next push occurs.
 

Why We Have A Stack
The stack is useful for storing context.  If a procedure simply pushes all its local variables onto the stack when it enters, and pops them off when its done, its complete context is nicely cleaned up afterwards.  What's nice is that when a procedure calls another procedure, the called procedure can do the same with its context; leaving the calling procedure's data completely alone.

 

Before we go any further into the stack, we have to define the Instruction Pointer, IP for short.  This is another register, like the Stack Pointer, which holds the address of the currently executing instruction in memory.  This register is maintained by the processor.  In fact, the processor runs by basically doing the following:

    1. Read instruction that IP points to
    2. Add length of instruction (in bytes) to IP
    3. Execute instruction
    4. Go to step1.
If the instruction run at step 3 modifies IP, then when the processor goes back to step 1, it will execute whatever instructions are at the new value of IP.  This type of instruction is typically called a jump instruction, because the processor jumps to a new location to find its next instruction.  This kind of instruction is very handy for procedure calls.

With that said, let's get back to the stack.

When a procedure is called, several other things are pushed onto the stack.  First of those is the address of the instruction immediately after the procedure call.  Then the parameters to the function are pushed onto the stack.  After the function has  finished, it will pop its own local variables off of the stack.  Then, it will pop its parameters off of the stack.  It will finally run a special instruction called a return.  This is a special processor instruction which pops off the top value of the stack and loads it into IP.  At this point, the stack will have the address of the next instruction of the calling procedure in it.  To make this a little clearer, a bit of code will help.

10     i=3;
11     foo(i);
12     i=4;

Basically, after line 10 is finished executing, the stack will have the address of the instructions composing line 12 pushed onto the stack.  Then, it will have the number 3 pushed onto the stack.  Then, the computer begins to execute the instructions comprising the procedure foo.   When it is done, foo will pop the value 3 off of the stack.  Then it will execute the return instruction, which will put the address of the first instruction of line 12 into IP.  And voilá, the state of the system, the Stack Pointer, the Instruction Pointer, etc. is all in the correct state for line 12 to execute.
 

The HEAP Segment

The heap is basically all the rest of memory the program has.  In that sense, it is often the largest segment in a program.  Programs use the heap to hold data that must exist after a function call return.  As shown in the previous section, when a function returns, all its local variables are re-used for other purposes soon after.  Now, the data segment could be used for holding this type of data, but that's really not its purpose.  The data segment is of fixed size and basically designed specifically to hold global data.  The heap segment is designed to hold values that have lifetimes greater than that of local variables but less than that of global variables.  To achieve this, the heap uses a technique called memory allocation.  In this technique, all the memory in the heap is managed by a pair of algorithms called the allocator and deallocator.  The allocator reserves a region of the heap for use.  The deallocator removes the reservation and allows the memory in the region to be re-used later for further reservations.

In the C language, the algorithms are implemented in a pair of functions called malloc and free.  C++ uses the operators new and delete. malloc and new implement allocation. free and delete implement deallocation.

Using Pointers

Pointers are extremely useful but can definitely be the most unsafe data type you'll ever use.Here are some basic examples.

Addresses

The addresses stored within pointers are easy to access and modify.Setting a pointer is as simple as setting an integer.One very useful technique to setting a pointer is to take the address of what you want to point to using the address-of operator, denoted with an ampersand (&).A programmer can do arithmetic on addresses just like any other variable:
void pointer_address_fun() {
     char c = `c';
     char *p;
     p = &c; // p now points to c.
     p--; // p now points to the address of the byte before c ((address of c)-1)
     p++; // p points to c again.
};

NULL Pointers

NULL pointers are simply pointers that point to address zero.Address zero is reserved to denote `this pointer points to nothing in particular.'Do not try to read or write from address zero.

 
// immediate crash program.
void dont_give_me_null(int *result) {
    int a = *result;  // read from NULL pointer (this will cause program to crash)
    *result = 4;      // write to NULL pointer  (this will also cause program to crash)
}
 
int main(int args, char ** argv) {
    dont_give_me_null(NULL);
    return 0;
}
This program crashes because the expression *result dereferences a NULL pointer.A safer implementation of dont_give_me_null is listed below:
void dont_give_me_null(int *result) {
    if (result) {  // legal use of NULL pointer - check if it's null
        int a = *result;
        *result = 4;
    }
}
The value of the pointer can be used in expressions, as listed above, as long as no attempt is made to look at the data stored at the address stored in the pointer.

Pointers and Arrays

Pointers and arrays implicitly convert between each other, like floats and doubles.This is an area of much confusion for many programmers, but in reality it's pretty simple.Arrays convert into pointers to their first element.Programmers can subscript pointers with indices as if they were arrays.Here are some examples:

void pointer_and_array_fun() {

    int array[10];

    int *pointer;

    pointer = array; // array to pointer conversion.address stored in pointer is that of the

    // first element in array. namely, pointer = &array[0];

    int foo = pointer[3]; // foo is equal to array[3].

    pointer++;

    foo = pointer[3]; // foo is equal to array[4]. why?

   }

Passing Pointers Between Functions

Pointers are a very useful way to share data between functions.One particularly good use is to have `output' variables.Pointers are also the only way to share array data between functions, making them very critical.
void set_variables( int * dest_int, int * dest_array, int array_length ) {
    *dest_int = 7;
    for(int I=0; I<array_length; I++) {
        dest_array[I]=I+1;
    }
}
int main(int args, char ** argv) {
    int a;
    int b[10];
    set_variables(&a, b, 10);  // notice how there's no &b - implicit array to pointer 
                               // conversion
}

Common Pitfalls

Often times, new programmers will make mistakes in their code which are free of syntax errors and logically make sense.  This section is designed to point out some of the more common ones whose errors are related to discussed topics in this document.

 

Returning A Pointer To A Local Variable

This is a common one.  The code for something like this in C++ is as follows:
char* MakeStringBuffer() {
      char buf[10];
      buf[0] = 'p';
      return buf;
}
 
int main() {
    char * val = MakeStringBuffer();
    strcpy(val, "Hello");
    printf(val);
    return 0;
}
This code will compile on most any C++ compiler with few or no problems.  This code could very well crash.  If it does not, and by some miracle the printf routine executes, garbage will most likely be printed to the screen.  On my machine, this code does crash immediately.  The problem here is that when MakeStringBuffer returns buf, buf no longer exists. buf is a local variable that is popped off of the stack upon return.  Hence, MakeStringBuffer returns a pointer which contains an address that is inside the stack segment.  In fact, the address contained there is the address of the first local variable for any function called by main directly.  So, when strcpy tries to copy a value into val, it will overwrite all of its local variables.  Obviously, strcpy will malfunction at this point, which on many machines (including mine) will result in the program crashing.  The correct implementation of this program follows:
char* MakeStringBuffer() {
      char * buf = new char[10];
      buf[0] = 'p';
      return buf;
}
 
int main() {
    char * val = MakeStringBuffer();
    strcpy(val, "Hello");
    printf(val);
    // not completely necessary in this specific case
    delete[] val;
    return 0;
}
Here, the address in the pointer returned by MakeStringBuffer is inside the heap, because that is where the new operator allocates memory from.  Since the returned address is no longer in the stack segment, and more importantly, is in the data segment where the memory region is still allocated, the strcpy to val will work fine. strcpy will not overwrite its own local variables.

 
Note the delete[] statement with the comment "// not completely necessary in this specific case".  Obviously, any memory that is allocated should be deallocated once its usefulness has expired; so that the memory may be reused later.  That is the reason for the delete[] statement.  However, the next line after the delete[] is the return  statement from main.  Whenmainreturns, the program exits.  When the program exits, all memory allocated to it by the operating system is deallocated.  Hence, the comment about the incomplete necessity of the delete[] statement; the operating system will do the exact same thing when the program exits.
Also, the reason that delete[] was used instead of delete is that C++ requires delete[]s use when deallocating arrays.

.. More Common Pitfalls to come as I see them...