别名 (计算)
别名是指内存中的一个数据位置可以通过程序的多个符号名来访问。因此通过一个名字修改数据意味着所有别名关联的值也改变了。别名使理解、分析、优化程序变得困难。别名分析用于分析处理程序中的别名的信息。
例子
缓冲区溢出
For example, most implementations of the C programming language do not perform array bounds checking. One can then exploit the implementation of the programming language by the compiler and the computer architecture's assembly language conventions, to achieve aliasing effects by writing outside of the array (a type of buffer overflow). This invokes undefined behaviour according to the C language specification; however many implementations of C will show the aliasing effects described here.
If an array is created on the stack, with a variable laid out in memory directly beside that array, one could index outside the array and directly change the variable by changing the relevant array element. For example, if we have an int array of size 2 (for this example's sake, calling it arr), next to another int variable (call it i), arr[2] (i.e. the 3rd element) would be aliased to i if they are adjacent in memory.
# include <stdio.h>
int main()
{
int arr[2] = { 1, 2 };
int i=10;
/* Write beyond the end of arr. Undefined behaviour in standard C, will write to i in some implementations. */
arr[2] = 20;
printf("element 0: %d \t", arr[0]); // outputs 1
printf("element 1: %d \t", arr[1]); // outputs 2
printf("element 2: %d \t", arr[2]); // outputs 20, if aliasing occurred
printf("i: %d \t\t", i); // might also output 20, not 10, because of aliasing, but the compiler might have i stored in a register and print 10
/* arr size is still 2. */
printf("arr size: %d \n", (sizeof(arr) / sizeof(int)));
}
This is possible in some implementations of C because an array is a block of contiguous memory, and array elements are merely referenced by offsets off the address of the beginning of that block multiplied by the size of a single element. Since C has no bounds checking, indexing and addressing outside of the array is possible. Note that the aforementioned aliasing behaviour is undefined behaviour. Some implementations may leave space between arrays and variables on the stack, for instance, to align variables to memory locations that are a multiple of the architecture's native word size. The C standard does not generally specify how data is to be laid out in memory. (ISO/IEC 9899:1999, section 6.2.6.1).
It is not erroneous for a compiler to omit aliasing effects for accesses that fall outside the bounds of an array.
别名指针
Another variety of aliasing can occur in any language that can refer to one location in memory with more than one name (for example, with pointers). See the C example of the xor swap algorithm that is a function; it assumes the two pointers passed to it are distinct, but if they are in fact equal (or aliases of each other), the function fails. This is a common problem with functions that accept pointer arguments, and their tolerance (or the lack thereof) for aliasing must be carefully documented, particularly for functions that perform complex manipulations on memory areas passed to them.
Specified aliasing
Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that enabled by memory layout in C). It is common practice in Fortran. The Perl programming language specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,
my @array = (1, 2, 3);
foreach my $element (@array) {
# Increment $element, thus automatically
# modifying @array, since $element is ''aliased''
# to each of @array's elements in turn.
$element++;
}
print "@array \n";
will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.
优化时冲突
优化编译器在存在指针时往往对变量做出保守假设。如常量传播能否使用。代码重排序(code reordering)也受别名的影响,这可能会改善指令调度或允许更多的循环优化.
C语言的C99标准,提出了严格别名规则(strict aliasing rule)见section 6.5, paragraph 7。指出使用不同类型的指针访问同一内存位置是违规的。编译器因而可以假定不同类型的指针不会是别名,这可能带来性能的巨大提升。[1]一些著名项目,如Python 2违反了此规则。[2]Linux内核也解决了类似问题。[3] 使用gcc编译选项-fno-strict-aliasing
可关闭此规则。
C++11规定下述广义左值类型为严格别名规则的例外情形:
- 对象的动态类型
- cv量化版本
- signed或unsigned版本
- 聚合类型(如struct、class)或union类型,包含此前所指的类型作为它的元素,或非静态数据成员(包括递归嵌套类型)
- 动态类型的基类型
- char或unsigned
硬件别名
The term aliasing is also used to describe the situation where, due to either a hardware design choice or a hardware failure, one or more of the available address bits is not used in the memory selection process.[4] This may be a design decision if there are more address bits available than are necessary to support the installed memory device(s). In a failure, one or more address bits may be shorted together, or may be forced to ground (logic 0) or the supply voltage (logic 1).
- Example
For this example, we assume a memory design with 8 locations, requiring only 3 address lines (or bits) since 23 = 8). Address bits (named A2 through A0) are decoded to select unique memory locations as follows, in standard binary counter fashion:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
1 | 0 | 0 | 4 |
1 | 0 | 1 | 5 |
1 | 1 | 0 | 6 |
1 | 1 | 1 | 7 |
In the table above, each of the 8 unique combinations of address bits selects a different memory location. However, if one address bit (say A2) were to be shorted to ground, the table would be modified as follows:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
In this case, with A2 always being zero, the first four memory locations are duplicated and appear again as the second four. Memory locations 4 through 7 have become inaccessible.
If this change occurred to a different address bit, the decoding results would be different, but in general the effect would be the same: the loss of a single address bit cuts the available memory space in half, with resulting duplication (aliasing) of the remaining space.
参见
- 别名
- 指针别名
参考文献
- ^ Mike Acton. Understanding Strict Aliasing. 2006-06-01 [2017-11-20]. (原始内容存档于2013-05-08).
- ^ Neil Schemenauer. ANSI strict aliasing and Python. 2003-07-17 [2017-11-20]. (原始内容存档于2020-06-05).
- ^ Linus Torvalds. Re: Invalid compilation without -fno-strict-aliasing. 2003-02-26 [2017-11-20]. (原始内容存档于2020-11-12).
- ^ Michael Barr. Software Based Memory Testing. 2012-07-27 [2017-11-20]. (原始内容存档于2020-11-29).
外部链接
- Understanding Strict Aliasing(页面存档备份,存于互联网档案馆) – article by Mike Acton
- Aliasing, pointer casts and gcc 3.3 (页面存档备份,存于互联网档案馆) – informational article on NetBSD mailing list
- Type-based alias analysis in C++ (页面存档备份,存于互联网档案馆) – Informational article on type-based alias analysis in C++
- Understand C/C++ Strict Aliasing (页面存档备份,存于互联网档案馆) – article on strict aliasing originally from the boost developer's wiki