diff --git a/docs/GetElementPtr.html b/docs/GetElementPtr.html index ac910887a6a..99319a49924 100644 --- a/docs/GetElementPtr.html +++ b/docs/GetElementPtr.html @@ -56,15 +56,92 @@ this leads to the following questions, all of which are answered in the following sections.
Quick answer: Because it's already present.
+Having understood the previous question, a new + question then arises:
+Why is it okay to index through the first pointer, but + subsequent pointers won't be dereferenced?+
The answer is simply because memory does not have to be accessed to + perform the computation. The first operand to the GEP instruction must be a + value of a pointer type. The value of the pointer is provided directly to + the GEP instruction without any need for accessing memory. It must, + therefore, be indexed like any other operand. Consider this example:
++ struct munger_struct { + int f1; + int f2; + }; + void munge(struct munger_struct *P) + { + P[0].f1 = P[1].f1 + P[2].f2; + } + ... + struct munger_struct Array[3]; + ... + munge(Array);+
In this "C" example, the front end compiler (llvm-gcc) will generate three + GEP instructions for the three indices through "P" in the assignment + statement. The function argument P will be the first operand of each + of these GEP instructions. The second operand indexes through that pointer. The third operand will be the field offset into + the struct munger_struct type, for either the f1 or + f2 field. So, in LLVM assembly the munge function looks + like:
++ void %munge(%struct.munger_struct* %P) { + entry: + %tmp = getelementptr %struct.munger_struct* %P, int 1, uint 0 + %tmp1 = load int* %tmp + %tmp6 = getelementptr %struct.munger_struct* %P, int 2, uint 1 + %tmp7 = load int* %tmp6 + %tmp8 = add int %tmp7, %tmp1 + %tmp9 = getelementptr %struct.munger_struct* %P, int 0, uint 0 + store int %tmp8, int* %tmp9 + ret void + }+
In each case the first operand is the pointer through which the GEP + instruction starts. The same is true whether the first operand is an + argument, allocated memory, or a global variable.
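The three address computations in munge can be mirrored in plain C as a rough sketch. The helper names munger_gep_offset and check_munge_offsets are hypothetical, and the fields are declared as int32_t on the assumption that int is four bytes and the struct has no padding. Nothing here reads or writes memory; it only does the arithmetic a GEP with a pointer index and a field index would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the example, assuming 4-byte ints. */
struct munger_struct {
    int32_t f1;
    int32_t f2;
};

/* Hypothetical helper: byte offset from P for "getelementptr P, i, field".
 * The first index steps over whole structs; the second selects a field. */
static size_t munger_gep_offset(size_t i, size_t field_offset) {
    return i * sizeof(struct munger_struct) + field_offset;
}

/* The three GEPs generated for P[1].f1, P[2].f2 and P[0].f1. */
static void check_munge_offsets(void) {
    assert(munger_gep_offset(1, offsetof(struct munger_struct, f1)) == 8);
    assert(munger_gep_offset(2, offsetof(struct munger_struct, f2)) == 20);
    assert(munger_gep_offset(0, offsetof(struct munger_struct, f1)) == 0);
}
```

The loads and the store in the LLVM assembly are the only places where memory is touched; the offsets above are fixed by the type alone.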
+To make this clear, let's consider a more obtuse example:
++ %MyVar = uninitialized global int + ... + %idx1 = getelementptr int* %MyVar, long 0 + %idx2 = getelementptr int* %MyVar, long 1 + %idx3 = getelementptr int* %MyVar, long 2+
These GEP instructions are simply making address computations from the + base address of MyVar. They compute, as follows (using C syntax): + idx1 = (char*) &MyVar + 0 + idx2 = (char*) &MyVar + 4 + idx3 = (char*) &MyVar + 8
+Since the type int is known to be four bytes long, the indices + 0, 1 and 2 translate into memory offsets of 0, 4, and 8, respectively. No + memory is accessed to make these computations because the address of + %MyVar is passed directly to the GEP instructions.
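As a minimal sketch of the arithmetic described here, the following C helpers compute the offset and the resulting address for a single-index GEP on an int pointer. The names gep_int_offset, gep_int_address and check_myvar_offsets are hypothetical, int32_t stands in for the four-byte int, and the base address 1000 is an arbitrary made-up value. The base is passed in as a plain number and is never dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes added to the base by "getelementptr int* p, long index". */
static ptrdiff_t gep_int_offset(ptrdiff_t index) {
    return index * (ptrdiff_t)sizeof(int32_t);
}

/* Hypothetical helper: the address such a GEP yields. Pure arithmetic on
 * the base value; no load or store takes place. */
static uintptr_t gep_int_address(uintptr_t base, ptrdiff_t index) {
    return base + (uintptr_t)gep_int_offset(index);
}

/* The three indices from the example: offsets 0, 4 and 8. */
static void check_myvar_offsets(void) {
    assert(gep_int_offset(0) == 0);
    assert(gep_int_offset(1) == 4);
    assert(gep_int_offset(2) == 8);
    /* With an arbitrary base of 1000, index 2 lands at 1008. */
    assert(gep_int_address(1000, 2) == 1008);
}
```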
+The obtuse part of this example is in the cases of %idx2 and + %idx3. They result in the computation of addresses that point to + memory past the end of the %MyVar global, which is only one + int long, not three ints long. While this is legal in LLVM, + it is inadvisable because any load or store with the pointer that results + from these GEP instructions would produce undefined results.
+The GEP above yields an int* by indexing the int typed field of the structure %MyStruct. When people first look at it, they wonder why the long 0 index is needed. However, a closer inspection - of how globals and GEPs work reveals the need. Becoming aware of the following + of how globals and GEPs work reveals the need. Becoming aware of the following facts will dispel the confusion:
Quick answer: nothing.
The GetElementPtr instruction dereferences nothing. That is, it doesn't - access memory in any way. That's what the Load instruction is for. GEP is - only involved in the computation of addresses. For example, consider this:
+ access memory in any way. That's what the Load and Store instructions are for. + GEP is only involved in the computation of addresses. For example, consider + this:%MyVar = uninitialized global { [40 x int ]* } ... @@ -137,45 +218,6 @@ array there.
Quick answer: Because it's already present.
-Having understood the previous question, a new - question then arises:
-Why is it okay to index through the first pointer, but - subsequent pointers won't be dereferenced?-
The answer is simply because - memory does not have to be accessed to perform the computation. The first - operand to the GEP instruction must be a value of a pointer type. The value - of the pointer is provided directly to the GEP instruction without any need - for accessing memory. It must, therefore be indexed like any other operand. - Consider this example:
-- %MyVar = unintialized global int - ... - %idx1 = getelementptr int* %MyVar, long 0 - %idx2 = getelementptr int* %MyVar, long 1 - %idx3 = getelementptr int* %MyVar, long 2-
These GEP instructions are simply making address computations from the - base address of MyVar. They compute, as follows (using C syntax):
-Since the type int is known to be four bytes long, the indices - 0, 1 and 2 translate into memory offsets of 0, 4, and 8, respectively. No - memory is accessed to make these computations because the address of - %MyVar is passed directly to the GEP instructions.
-Note that the cases of %idx2 and %idx3 are a bit silly. - They are computing addresses of something of unknown type (and thus - potentially breaking type safety) because %MyVar is only one - integer long.
-%MyVar = global { [10 x int ] } - %idx1 = getlementptr { [10 x int ] }* %MyVar, long 0, byte 0, long 1 + %idx1 = getelementptr { [10 x int ] }* %MyVar, long 0, ubyte 0, long 1 %idx2 = getelementptr { [10 x int ] }* %MyVar, long 1
In this example, idx1 computes the address of the second integer in the array that is in the structure in %MyVar, that is MyVar+4. The @@ -210,7 +252,7 @@ the type. Consider this example:
%MyVar = global { [10 x int ] } - %idx1 = getlementptr { [10 x int ] }* %MyVar, long 1, byte 0, long 0 + %idx1 = getelementptr { [10 x int ] }* %MyVar, long 1, ubyte 0, long 0 %idx2 = getelementptr { [10 x int ] }* %MyVar, long 1
In this example, the value of %idx1 is %MyVar+40 and its type is int*. The value of %idx2 is also