PHP“ foreach”实际上如何工作?

本文翻译自:How does PHP 'foreach' actually work?

Let me prefix this by saying that I know what foreach is, does and how to use it. 首先,我要说一下我知道foreach是什么,做什么以及如何使用它。 This question concerns how it works under the bonnet, and I don't want any answers along the lines of "this is how you loop an array with foreach ". 这个问题涉及它在引擎盖下的工作方式,我不希望“这就是使用foreach循环数组的方式”的答案。


For a long time I assumed that foreach worked with the array itself. 很长时间以来,我一直认为foreach与数组本身一起工作。 Then I found many references to the fact that it works with a copy of the array, and I have since assumed this to be the end of the story. 然后,我发现了很多关于它可以与数组副本一起使用的事实的引用,从那时起,我一直以为这是故事的结尾。 But I recently got into a discussion on the matter, and after a little experimentation found that this was not in fact 100% true. 但是我最近对此事进行了讨论,经过一番实验后发现这实际上并非100%正确。

Let me show what I mean. 让我表明我的意思。 For the following test cases, we will be working with the following array: 对于以下测试用例,我们将使用以下数组:

$array = array(1, 2, 3, 4, 5);

Test case 1 : 测试用例1

foreach ($array as $item) {
  echo "$item\n";
  $array[] = $item;
}
print_r($array);

/* Output in loop:    1 2 3 4 5
   $array after loop: 1 2 3 4 5 1 2 3 4 5 */

This clearly shows that we are not working directly with the source array - otherwise the loop would continue forever, since we are constantly pushing items onto the array during the loop. 这清楚地表明,我们不是直接使用源数组-否则循环将永远持续下去,因为在循环过程中我们不断将项目推入数组。 But just to be sure this is the case: 但是只是为了确保是这种情况:

Test case 2 : 测试用例2

foreach ($array as $key => $item) {
  $array[$key + 1] = $item + 2;
  echo "$item\n";
}

print_r($array);

/* Output in loop:    1 2 3 4 5
   $array after loop: 1 3 4 5 6 7 */

This backs up our initial conclusion, we are working with a copy of the source array during the loop, otherwise we would see the modified values during the loop. 这支持了我们的初始结论,我们在循环期间正在使用源数组的副本,否则将在循环期间看到修改后的值。 But... 但...

If we look in the manual , we find this statement: 如果查看手册 ,则会发现以下语句:

When foreach first starts executing, the internal array pointer is automatically reset to the first element of the array. 当foreach首先开始执行时,内部数组指针将自动重置为数组的第一个元素。

Right... this seems to suggest that foreach relies on the array pointer of the source array. 对...这似乎表明, foreach依赖于源数组的数组指针。 But we've just proved that we're not working with the source array , right? 但是我们刚刚证明我们没有使用源数组 ,对吗? Well, not entirely. 好吧,不完全是。

Test case 3 : 测试用例3

// Move the array pointer on one to make sure it doesn't affect the loop
var_dump(each($array));

foreach ($array as $item) {
  echo "$item\n";
}

var_dump(each($array));

/* Output
  array(4) {
    [1]=>
    int(1)
    ["value"]=>
    int(1)
    [0]=>
    int(0)
    ["key"]=>
    int(0)
  }
  1
  2
  3
  4
  5
  bool(false)
*/

So, despite the fact that we are not working directly with the source array, we are working directly with the source array pointer - the fact that the pointer is at the end of the array at the end of the loop shows this. 因此,尽管事实上我们并没有直接使用源数组,而是直接使用了源数组指针-指针在循环末尾位于数组的末尾这一事实表明了这一点。 Except this can't be true - if it was, then test case 1 would loop forever. 除非这不能成立-如果是,那么测试用例1将永远循环。

The PHP manual also states: PHP手册还指出:

As foreach relies on the internal array pointer changing it within the loop may lead to unexpected behavior. 由于foreach依赖内部数组指针,因此在循环内更改它可能导致意外行为。

Well, let's find out what that "unexpected behavior" is (technically, any behavior is unexpected since I no longer know what to expect). 好吧,让我们找出“意外行为”是什么(从技术上讲,任何行为都是意外的,因为我不再知道会发生什么)。

Test case 4 : 测试用例4

foreach ($array as $key => $item) {
  echo "$item\n";
  each($array);
}

/* Output: 1 2 3 4 5 */

Test case 5 : 测试用例5

foreach ($array as $key => $item) {
  echo "$item\n";
  reset($array);
}

/* Output: 1 2 3 4 5 */

...nothing that unexpected there, in fact it seems to support the "copy of source" theory. ...没有什么意外的,实际上似乎支持“源代码复制”理论。


The Question 问题

What is going on here? 这里发生了什么? My C-fu is not good enough for me to able to extract a proper conclusion simply by looking at the PHP source code, I would appreciate it if someone could translate it into English for me. 我的C-fu不足以让我仅通过查看PHP源代码就能得出正确的结论,如果有人可以为我翻译成英语,我将不胜感激。

It seems to me that foreach works with a copy of the array, but sets the array pointer of the source array to the end of the array after the loop. 在我看来, foreach使用数组的副本 ,但是将循环后的源数组的数组指针设置为数组的末尾。

  • Is this correct and the whole story? 这是正确的故事吗?
  • If not, what is it really doing? 如果没有,它到底在做什么?
  • Is there any situation where using functions that adjust the array pointer ( each() , reset() et al.) during a foreach could affect the outcome of the loop? 是否有任何情况在foreach期间使用调整数组指针的函数( each()reset()等)会影响循环的结果?

#1楼

参考:https://stackoom.com/question/gCSV/PHP-foreach-实际上如何工作


#2楼

In example 3 you don't modify the array. 在示例3中,您无需修改​​数组。 In all other examples you modify either the contents or the internal array pointer. 在所有其他示例中,您将修改内容或内部数组指针。 This is important when it comes to PHP arrays because of the semantics of the assignment operator. 对于PHP数组,这很重要,因为赋值运算符具有语义。

The assignment operator for the arrays in PHP works more like a lazy clone. PHP中数组的赋值运算符的工作方式更像是惰性克隆。 Assigning one variable to another that contains an array will clone the array, unlike most languages. 与大多数语言不同,将一个变量分配给另一个包含数组的变量将克隆该数组。 However, the actual cloning will not be done unless it is needed. 但是,除非需要,否则不会进行实际的克隆。 This means that the clone will take place only when either of the variables is modified (copy-on-write). 这意味着仅当修改了两个变量(写时复制)时,才会进行克隆。

Here is an example: 这是一个例子:

$a = array(1,2,3);
$b = $a;  // This is lazy cloning of $a. For the time
          // being $a and $b point to the same internal
          // data structure.

$a[] = 3; // Here $a changes, which triggers the actual
          // cloning. From now on, $a and $b are two
          // different data structures. The same would
          // happen if there were a change in $b.

Coming back to your test cases, you can easily imagine that foreach creates some kind of iterator with a reference to the array. 回到测试用例,您可以轻松地想象到foreach通过引用数组创建了某种迭代器。 This reference works exactly like the variable $b in my example. 此引用的工作方式与示例中的变量$b完全相同。 However, the iterator along with the reference live only during the loop and then, they are both discarded. 但是,迭代器和引用仅在循环期间存在,然后都被丢弃。 Now you can see that, in all cases but 3, the array is modified during the loop, while this extra reference is alive. 现在您可以看到,在除3之外的所有情况下,在循环期间都将修改数组,而该额外引用仍然有效。 This triggers a clone, and that explains what's going on here! 这会触发一个克隆,这说明了这里发生了什么!

Here is an excellent article for another side effect of this copy-on-write behaviour: The PHP Ternary Operator: Fast or not? 这是一篇出色的文章,说明了这种写时复制行为的另一个副作用: PHP三元运算符:速度快还是快?


#3楼

Some points to note when working with foreach() : 使用foreach()时需要注意的几点:

a) foreach works on the prospected copy of the original array. a) foreach在原始数组的预期副本上工作。 It means foreach() will have SHARED data storage until or unless a prospected copy is not created foreach Notes/User comments . 这意味着foreach()将具有共享的数据存储,直到或除非未为foreach Notes / User comments创建prospected copy

b) What triggers a prospected copy ? b)是什么触发了预期复制 A prospected copy is created based on the policy of copy-on-write , that is, whenever an array passed to foreach() is changed, a clone of the original array is created. 根据copy-on-writecopy-on-write的策略创建预期的副本,也就是说,每当更改传递给foreach()的数组时,都会创建原始数组的副本。

c) The original array and foreach() iterator will have DISTINCT SENTINEL VARIABLES , that is, one for the original array and other for foreach ; c)原始数组和foreach()迭代器将具有DISTINCT SENTINEL VARIABLES ,即,一个用于原始数组,另一个用于foreach see the test code below. 请参阅下面的测试代码。 SPL , Iterators , and Array Iterator . SPL迭代器数组迭代器

Stack Overflow question How to make sure the value is reset in a 'foreach' loop in PHP? 堆栈溢出问题如何确保在PHP的“ foreach”循环中重置该值? addresses the cases (3,4,5) of your question. 解决您的问题的情况(3,4,5)。

The following example shows that each() and reset() DOES NOT affect SENTINEL variables (for example, the current index variable) of the foreach() iterator. 下面的示例显示each()和reset()不会影响foreach()迭代器的SENTINEL变量(for example, the current index variable)

$array = array(1, 2, 3, 4, 5);

list($key2, $val2) = each($array);
echo "each() Original (outside): $key2 => $val2<br/>";

foreach($array as $key => $val){
    echo "foreach: $key => $val<br/>";

    list($key2,$val2) = each($array);
    echo "each() Original(inside): $key2 => $val2<br/>";

    echo "--------Iteration--------<br/>";
    if ($key == 3){
        echo "Resetting original array pointer<br/>";
        reset($array);
    }
}

list($key2, $val2) = each($array);
echo "each() Original (outside): $key2 => $val2<br/>";

Output: 输出:

each() Original (outside): 0 => 1
foreach: 0 => 1
each() Original(inside): 1 => 2
--------Iteration--------
foreach: 1 => 2
each() Original(inside): 2 => 3
--------Iteration--------
foreach: 2 => 3
each() Original(inside): 3 => 4
--------Iteration--------
foreach: 3 => 4
each() Original(inside): 4 => 5
--------Iteration--------
Resetting original array pointer
foreach: 4 => 5
each() Original(inside): 0=>1
--------Iteration--------
each() Original (outside): 1 => 2

#4楼

foreach supports iteration over three different kinds of values: foreach支持对三种不同类型的值进行迭代:

In the following, I will try to explain precisely how iteration works in different cases. 在下文中,我将尝试精确解释迭代在不同情况下如何工作。 By far the simplest case is Traversable objects, as for these foreach is essentially only syntax sugar for code along these lines: 到目前为止,最简单的情况是Traversable对象,因为这些foreach本质上只是这些行代码的语法糖:

foreach ($it as $k => $v) { /* ... */ }

/* translates to: */

if ($it instanceof IteratorAggregate) {
    $it = $it->getIterator();
}
for ($it->rewind(); $it->valid(); $it->next()) {
    $v = $it->current();
    $k = $it->key();
    /* ... */
}

For internal classes, actual method calls are avoided by using an internal API that essentially just mirrors the Iterator interface on the C level. 对于内部类,通过使用实质上只是在C级别上镜像Iterator接口的内部API,可以避免实际的方法调用。

Iteration of arrays and plain objects is significantly more complicated. 数组和普通对象的迭代要复杂得多。 First of all, it should be noted that in PHP "arrays" are really ordered dictionaries and they will be traversed according to this order (which matches the insertion order as long as you didn't use something like sort ). 首先,应该注意的是,在PHP中,“数组”实际上是有序的字典,它们将根据此顺序遍历(只要您不使用sort等内容,它们就与插入顺序匹配)。 This is opposed to iterating by the natural order of the keys (how lists in other languages often work) or having no defined order at all (how dictionaries in other languages often work). 这与通过键的自然顺序(其他语言的列表通常如何工作)或根本没有定义的顺序(其他语言的词典通常如何工作)进行迭代相反。

The same also applies to objects, as the object properties can be seen as another (ordered) dictionary mapping property names to their values, plus some visibility handling. 这同样适用于对象,因为对象属性可以看作是另一个(有序的)字典,将属性名称映射到其值,再加上一些可见性处理。 In the majority of cases, the object properties are not actually stored in this rather inefficient way. 在大多数情况下,对象属性实际上并不是以这种相当低效的方式存储的。 However, if you start iterating over an object, the packed representation that is normally used will be converted to a real dictionary. 但是,如果您开始遍历对象,则通常使用的打包表示形式将转换为实际字典。 At that point, iteration of plain objects becomes very similar to iteration of arrays (which is why I'm not discussing plain-object iteration much in here). 到那时,普通对象的迭代变得与数组的迭代非常相似(这就是为什么我在这里不讨论普通对象迭代的原因)。

So far, so good. 到现在为止还挺好。 Iterating over a dictionary can't be too hard, right? 遍历字典不会太难,对吧? The problems begin when you realize that an array/object can change during iteration. 当您意识到数组/对象可以在迭代过程中更改时,问题就开始了。 There are multiple ways this can happen: 发生这种情况的方式有多种:

  • If you iterate by reference using foreach ($arr as &$v) then $arr is turned into a reference and you can change it during iteration. 如果使用foreach ($arr as &$v)通过引用进行迭代,则$arr将变为引用,您可以在迭代期间进行更改。
  • In PHP 5 the same applies even if you iterate by value, but the array was a reference beforehand: $ref =& $arr; foreach ($ref as $v) 在PHP 5中,即使您按值进行迭代也是如此,但是该数组是预先引用的: $ref =& $arr; foreach ($ref as $v) $ref =& $arr; foreach ($ref as $v)
  • Objects have by-handle passing semantics, which for most practical purposes means that they behave like references. 对象具有按句柄传递的语义,对于大多数实际目的,这意味着它们的行为类似于引用。 So objects can always be changed during iteration. 因此,在迭代过程中始终可以更改对象。

The problem with allowing modifications during iteration is the case where the element you are currently on is removed. 迭代期间允许修改的问题是当前所在元素被删除的情况。 Say you use a pointer to keep track of which array element you are currently at. 假设您使用指针来跟踪当前所在的数组元素。 If this element is now freed, you are left with a dangling pointer (usually resulting in a segfault). 如果现在释放了此元素,则留下一个悬空的指针(通常会导致段错误)。

There are different ways of solving this issue. 有多种解决此问题的方法。 PHP 5 and PHP 7 differ significantly in this regard and I'll describe both behaviors in the following. PHP 5和PHP 7在这方面有很大不同,我将在下面描述这两种行为。 The summary is that PHP 5's approach was rather dumb and lead to all kinds of weird edge-case issues, while PHP 7's more involved approach results in more predictable and consistent behavior. 总结是,PHP 5的方法相当笨拙,并导致各种奇怪的极端情况问题,而PHP 7的方法更复杂,导致行为的可预测性和一致性更高。

As a last preliminary, it should be noted that PHP uses reference counting and copy-on-write to manage memory. 最后,应该注意的是PHP使用引用计数和写时复制来管理内存。 This means that if you "copy" a value, you actually just reuse the old value and increment its reference count (refcount). 这意味着,如果您“复制”一个值,则实际上只是重用旧值并增加其引用计数(refcount)。 Only once you perform some kind of modification a real copy (called a "duplication") will be done. 仅当您执行某种修改后,才会完成真实副本(称为“复制”)。 See You're being lied to for a more extensive introduction on this topic. 请参阅“ 您被骗了”以获取有关此主题的更广泛的介绍。

PHP 5 PHP 5

Internal array pointer and HashPointer 内部数组指针和HashPointer

Arrays in PHP 5 have one dedicated "internal array pointer" (IAP), which properly supports modifications: Whenever an element is removed, there will be a check whether the IAP points to this element. PHP 5中的数组具有一个专用的“内部数组指针”(IAP),该数组正确支持修改:每当删除元素时,都会检查IAP是否指向该元素。 If it does, it is advanced to the next element instead. 如果是这样,它将前进到下一个元素。

While foreach does make use of the IAP, there is an additional complication: There is only one IAP, but one array can be part of multiple foreach loops: 尽管foreach确实使用了IAP,但还有一个复杂之处:只有一个IAP,但是一个数组可以成为多个foreach循环的一部分:

// Using by-ref iteration here to make sure that it's really
// the same array in both loops and not a copy
foreach ($arr as &$v1) {
    foreach ($arr as &$v) {
        // ...
    }
}

To support two simultaneous loops with only one internal array pointer, foreach performs the following shenanigans: Before the loop body is executed, foreach will back up a pointer to the current element and its hash into a per-foreach HashPointer . 为了仅使用一个内部数组指针支持两个同时循环, foreach执行以下操作:在循环体执行之前, foreach将指向当前元素的指针及其哈希值HashPointer到每个HashPointer After the loop body runs, the IAP will be set back to this element if it still exists. 循环主体运行后,如果IAP仍然存在,则将其设置回该元素。 If however the element has been removed, we'll just use wherever the IAP is currently at. 但是,如果该元素已被删除,我们将仅使用IAP当前所在的位置。 This scheme mostly-kinda-sort of works, but there's a lot of weird behavior you can get out of it, some of which I'll demonstrate below. 这个方案主要是某种类型的作品,但是您可以摆脱很多奇怪的行为,下面将对其中的一些进行演示。

Array duplication 阵列复制

The IAP is a visible feature of an array (exposed through the current family of functions), as such changes to the IAP count as modifications under copy-on-write semantics. IAP是数组的一个可见功能(通过current的函数系列公开),因此对IAP计数的更改是写时复制语义下的修改。 This, unfortunately, means that foreach is in many cases forced to duplicate the array it is iterating over. 不幸的是,这意味着在许多情况下, foreach被迫复制正在迭代的数组。 The precise conditions are: 精确条件是:

  1. The array is not a reference (is_ref=0). 该数组不是引用(is_ref = 0)。 If it's a reference, then changes to it are supposed to propagate, so it should not be duplicated. 如果是参考,则应该传播对其的更改,因此不应重复。
  2. The array has refcount>1. 数组的引用计数> 1。 If refcount is 1, then the array is not shared and we're free to modify it directly. 如果refcount为1,则不共享该数组,我们可以自由地直接对其进行修改。

If the array is not duplicated (is_ref=0, refcount=1), then only its refcount will be incremented (*). 如果数组不重复(is_ref = 0,refcount = 1),则仅其refcount将递增(*)。 Additionally, if foreach by reference is used, then the (potentially duplicated) array will be turned into a reference. 此外,如果使用按引用的foreach ,则(可能重复的)数组将变为引用。

Consider this code as an example where duplication occurs: 将此代码视为发生重复的示例:

function iterate($arr) {
    foreach ($arr as $v) {}
}

$outerArr = [0, 1, 2, 3, 4];
iterate($outerArr);

Here, $arr will be duplicated to prevent IAP changes on $arr from leaking to $outerArr . 在这里, $arr将被复制,以防止$arr上的IAP更改泄漏到$outerArr In terms of the conditions above, the array is not a reference (is_ref=0) and is used in two places (refcount=2). 根据上述条件,该数组不是引用(is_ref = 0),并且在两个地方使用(refcount = 2)。 This requirement is unfortunate and an artifact of the suboptimal implementation (there is no concern of modification during iteration here, so we don't really need to use the IAP in the first place). 不幸的是,此要求是次佳实现的产物(这里没有考虑迭代过程中的修改,因此我们实际上并不需要首先使用IAP)。

(*) Incrementing the refcount here sounds innocuous, but violates copy-on-write (COW) semantics: This means that we are going to modify the IAP of a refcount=2 array, while COW dictates that modifications can only be performed on refcount=1 values. (*)在此处增加refcount听起来无害,但违反了写时复制(COW)语义:这意味着我们将修改refcount = 2数组的IAP,而COW指示只能在refcount上执行修改= 1个值。 This violation results in user-visible behavior change (while a COW is normally transparent) because the IAP change on the iterated array will be observable -- but only until the first non-IAP modification on the array. 违反行为会导致用户可见的行为更改(而COW通常是透明的),因为可以观察到迭代阵列上的IAP更改-但仅在阵列上第一次进行非IAP修改之前。 Instead, the three "valid" options would have been a) to always duplicate, b) do not increment the refcount and thus allowing the iterated array to be arbitrarily modified in the loop or c) don't use the IAP at all (the PHP 7 solution). 取而代之的是,三个“有效”选项是:a)始终重复,b)不增加refcount ,因此允许在循环中随意修改迭代数组,或者c)根本不使用IAP( PHP 7解决方案)。

Position advancement order 职位提升订单

There is one last implementation detail that you have to be aware of to properly understand the code samples below. 为了正确理解下面的代码示例,您必须知道最后一个实现细节。 The "normal" way of looping through some data structure would look something like this in pseudocode: 遍历某些数据结构的“正常”方式在伪代码中看起来像这样:

reset(arr);
while (get_current_data(arr, &data) == SUCCESS) {
    code();
    move_forward(arr);
}

However foreach , being a rather special snowflake, chooses to do things slightly differently: 但是, foreach是一个非常特殊的雪花,它选择做的事情略有不同:

reset(arr);
while (get_current_data(arr, &data) == SUCCESS) {
    move_forward(arr);
    code();
}

Namely, the array pointer is already moved forward before the loop body runs. 即,在循环体运行之前 ,数组指针已经向前移动。 This means that while the loop body is working on element $i , the IAP is already at element $i+1 . 这意味着,当循环体在元素$i上工作时,IAP已在元素$i+1 This is the reason why code samples showing modification during iteration will always unset the next element, rather than the current one. 这就是为什么在迭代过程中显示修改的代码示例将始终unset 下一个元素而不是当前元素的原因。

Examples: Your test cases 示例:您的测试用例

The three aspects described above should provide you with a mostly complete impression of the idiosyncrasies of the foreach implementation and we can move on to discuss some examples. 上面描述的三个方面应该为您大致了解一下foreach实现的特质,我们可以继续讨论一些示例。

The behavior of your test cases is simple to explain at this point: 此时,测试用例的行为很容易解释:

  • In test cases 1 and 2 $array starts off with refcount=1, so it will not be duplicated by foreach : Only the refcount is incremented. 在测试用例1和2中, $array从refcount = 1开始,因此它不会被foreach复制:仅增加refcount When the loop body subsequently modifies the array (which has refcount=2 at that point), the duplication will occur at that point. 当循环主体随后修改数组时(此时refcount = 2),复制将在该点进行。 Foreach will continue working on an unmodified copy of $array . Foreach将继续处理$array的未修改副本。

  • In test case 3, once again the array is not duplicated, thus foreach will be modifying the IAP of the $array variable. 在测试用例3中,数组不再重复,因此foreach将修改$array变量的IAP。 At the end of the iteration, the IAP is NULL (meaning iteration has done), which each indicates by returning false . 在迭代结束时,IAP为NULL(表示迭代已完成), each通过返回false指示。

  • In test cases 4 and 5 both each and reset are by-reference functions. 在测试用例4和5中, eachreset都是参考功能。 The $array has a refcount=2 when it is passed to them, so it has to be duplicated. $array传递给他们时具有refcount=2 ,因此必须重复。 As such foreach will be working on a separate array again. 这样, foreach将再次在单独的数组上工作。

Examples: Effects of current in foreach 示例:foreach中的current影响

A good way to show the various duplication behaviors is to observe the behavior of the current() function inside a foreach loop. 显示各种复制行为的一个好方法是观察foreach循环中current()函数的行为。 Consider this example: 考虑以下示例:

foreach ($array as $val) {
    var_dump(current($array));
}
/* Output: 2 2 2 2 2 */

Here you should know that current() is a by-ref function (actually: prefer-ref), even though it does not modify the array. 在这里,您应该知道current()是一个by-ref函数(实际上是:referr-ref),即使它不修改数组也是如此。 It has to be in order to play nice with all the other functions like next which are all by-ref. 它必须是为了与所有其他功能(如next ,它们都是by-ref。 By-reference passing implies that the array has to be separated and thus $array and the foreach-array will be different. 通过引用传递意味着必须将数组分开,因此$arrayforeach-array将不同。 The reason you get 2 instead of 1 is also mentioned above: foreach advances the array pointer before running the user code, not after. 上面还提到了2而不是1的原因: foreach 运行用户代码之前 (而不是之后)推进数组指针。 So even though the code is at the first element, foreach already advanced the pointer to the second. 因此,即使代码位于第一个元素, foreach已将指针前进到第二个元素。

Now lets try a small modification: 现在让我们尝试一个小的修改:

$ref = &$array;
foreach ($array as $val) {
    var_dump(current($array));
}
/* Output: 2 3 4 5 false */

Here we have the is_ref=1 case, so the array is not copied (just like above). 这里有is_ref = 1的情况,因此不复制数组(就像上面一样)。 But now that it is a reference, the array no longer has to be duplicated when passing to the by-ref current() function. 但是,既然它是一个引用,则在传递给by-ref current()函数时,不再需要复制该数组。 Thus current() and foreach work on the same array. 因此, current()foreach在同一数组上工作。 You still see the off-by-one behavior though, due to the way foreach advances the pointer. 但是,由于foreach前进指针的方式,您仍然会看到偏离行为。

You get the same behavior when doing by-ref iteration: 在进行by-ref迭代时,您会得到相同的行为:

foreach ($array as &$val) {
    var_dump(current($array));
}
/* Output: 2 3 4 5 false */

Here the important part is that foreach will make $array an is_ref=1 when it is iterated by reference, so basically you have the same situation as above. 在这里重要的是,在通过引用进行迭代时,foreach将使$array成为is_ref = 1,因此基本上您具有与上述相同的情况。

Another small variation, this time we'll assign the array to another variable: 另一个小的变化,这次我们将数组分配给另一个变量:

$foo = $array;
foreach ($array as $val) {
    var_dump(current($array));
}
/* Output: 1 1 1 1 1 */

Here the refcount of the $array is 2 when the loop is started, so for once we actually have to do the duplication upfront. 在开始循环时,这里$array的引用计数为2,因此实际上一次必须做一次复制。 Thus $array and the array used by foreach will be completely separate from the outset. 因此, $array和foreach所使用的数组将从一开始就完全分开。 That's why you get the position of the IAP wherever it was before the loop (in this case it was at the first position). 这就是为什么要获得IAP在循环之前的位置的原因(在这种情况下,它位于第一个位置)。

Examples: Modification during iteration 示例:迭代期间的修改

Trying to account for modifications during iteration is where all our foreach troubles originated, so it serves to consider some examples for this case. 尝试在迭代过程中考虑修改是我们所有foreach问题的起源,因此可以考虑这种情况下的一些示例。

Consider these nested loops over the same array (where by-ref iteration is used to make sure it really is the same one): 考虑在同一数组上的这些嵌套循环(使用by-ref迭代来确保它确实是相同的):

foreach ($array as &$v1) {
    foreach ($array as &$v2) {
        if ($v1 == 1 && $v2 == 1) {
            unset($array[1]);
        }
        echo "($v1, $v2)\n";
    }
}

// Output: (1, 1) (1, 3) (1, 4) (1, 5)

The expected part here is that (1, 2) is missing from the output because element 1 was removed. 此处的预期部分是因为元素1被删除,所以输出中缺少(1, 2) What's probably unexpected is that the outer loop stops after the first element. 出乎意料的是,外循环在第一个元素之后停止。 Why is that? 这是为什么?

The reason behind this is the nested-loop hack described above: Before the loop body runs, the current IAP position and hash is backed up into a HashPointer . 其背后的原因是上述的嵌套循环hack:在循环主体运行之前,将当前IAP位置和哈希值备份到HashPointer After the loop body it will be restored, but only if the element still exists, otherwise the current IAP position (whatever it may be) is used instead. 在循环体之后,它将被恢复,但是仅当元素仍然存在时才恢复,否则将使用当前IAP位置(无论可能是什么)代替。 In the example above this is exactly the case: The current element of the outer loop has been removed, so it will use the IAP, which has already been marked as finished by the inner loop! 在上面的示例中,情况确实如此:外循环的当前元素已被删除,因此它将使用IAP,该IAP已被内循环标记为完成!

Another consequence of the HashPointer backup+restore mechanism is that changes to the IAP through reset() etc. usually do not impact foreach . HashPointer备份+还原机制的另一个结果是,通过reset()等对IAP的更改通常不会影响foreach For example, the following code executes as if the reset() were not present at all: 例如,执行以下代码,就像根本不存在reset()一样:

$array = [1, 2, 3, 4, 5];
foreach ($array as &$value) {
    var_dump($value);
    reset($array);
}
// output: 1, 2, 3, 4, 5

The reason is that, while reset() temporarily modifies the IAP, it will be restored to the current foreach element after the loop body. 原因是,尽管reset()临时修改了IAP,但它将在循环体之后恢复为当前的foreach元素。 To force reset() to make an effect on the loop, you have to additionally remove the current element, so that the backup/restore mechanism fails: 要强制reset()对循环起作用,您必须另外删除当前元素,以使备份/还原机制失败:

$array = [1, 2, 3, 4, 5];
$ref =& $array;
foreach ($array as $value) {
    var_dump($value);
    unset($array[1]);
    reset($array);
}
// output: 1, 1, 3, 4, 5

But, those examples are still sane. 但是,这些例子仍然很理智。 The real fun starts if you remember that the HashPointer restore uses a pointer to the element and its hash to determine whether it still exists. 如果您还记得HashPointer还原使用指向该元素的指针及其哈希值以确定它是否仍然存在,那么真正的乐趣就开始了。 But: Hashes have collisions, and pointers can be reused! 但是:哈希有冲突,并且指针可以重用! This means that, with a careful choice of array keys, we can make foreach believe that an element that has been removed still exists, so it will jump directly to it. 这意味着,通过精心选择数组键,我们可以让foreach相信已删除的元素仍然存在,因此它将直接跳转到该元素。 An example: 一个例子:

$array = ['EzEz' => 1, 'EzFY' => 2, 'FYEz' => 3];
$ref =& $array;
foreach ($array as $value) {
    unset($array['EzFY']);
    $array['FYFY'] = 4;
    reset($array);
    var_dump($value);
}
// output: 1, 4

Here we should normally expect the output 1, 1, 3, 4 according to the previous rules. 在这里,我们应该通常期望的输出1, 1, 3, 4 ,根据以往的规则。 How what happens is that 'FYFY' has the same hash as the removed element 'EzFY' , and the allocator happens to reuse the same memory location to store the element. 发生什么情况是'FYFY'具有与删除的元素'EzFY'相同的哈希,并且分配器恰巧重用了相同的内存位置来存储该元素。 So foreach ends up directly jumping to the newly inserted element, thus short-cutting the loop. 因此,foreach最终直接跳转到新插入的元素,从而缩短了循环。

Substituting the iterated entity during the loop 在循环期间替换迭代的实体

One last odd case that I'd like to mention, it is that PHP allows you to substitute the iterated entity during the loop. 我要提到的最后一个奇怪的情况是,PHP允许您在循环期间替换迭代的实体。 So you can start iterating on one array and then replace it with another array halfway through. 因此,您可以在一个阵列上开始迭代,然后在中途将其替换为另一个阵列。 Or start iterating on an array and then replace it with an object: 或开始迭代数组,然后将其替换为对象:

$arr = [1, 2, 3, 4, 5];
$obj = (object) [6, 7, 8, 9, 10];

$ref =& $arr;
foreach ($ref as $val) {
    echo "$val\n";
    if ($val == 3) {
        $ref = $obj;
    }
}
/* Output: 1 2 3 6 7 8 9 10 */

As you can see in this case PHP will just start iterating the other entity from the start once the substitution has happened. 如您所见,在这种情况下,一旦替换发生,PHP就会从头开始迭代另一个实体。

PHP 7 PHP 7

Hashtable iterators 哈希表迭代器

If you still remember, the main problem with array iteration was how to handle removal of elements mid-iteration. 如果您还记得,数组迭代的主要问题是如何处理元素在迭代过程中的移除。 PHP 5 used a single internal array pointer (IAP) for this purpose, which was somewhat suboptimal, as one array pointer had to be stretched to support multiple simultaneous foreach loops and interaction with reset() etc. on top of that. 为此,PHP 5使用了一个内部数组指针(IAP),这有些次优,因为必须扩展一个数组指针以支持多个同时的foreach循环以及reset()等的交互。

PHP 7 uses a different approach, namely, it supports creating an arbitrary amount of external, safe hashtable iterators. PHP 7使用了不同的方法,即,它支持创建任意数量的外部安全哈希表迭代器。 These iterators have to be registered in the array, from which point on they have the same semantics as the IAP: If an array element is removed, all hashtable iterators pointing to that element will be advanced to the next element. 这些迭代器必须在数组中注册,从那时起,它们具有与IAP相同的语义:如果删除了数组元素,则指向该元素的所有哈希表迭代器都将前进到下一个元素。

This means that foreach will no longer use the IAP at all . 这意味着foreach将不再使用IAP The foreach loop will be absolutely no effect on the results of current() etc. and its own behavior will never be influenced by functions like reset() etc. foreach循环绝对不会影响current()等的结果,并且它自己的行为永远不会受到reset()等函数的影响。

Array duplication 阵列复制

Another important change between PHP 5 and PHP 7 relates to array duplication. PHP 5和PHP 7之间的另一个重要变化涉及数组复制。 Now that the IAP is no longer used, by-value array iteration will only do a refcount increment (instead of duplication the array) in all cases. 现在,IAP不再使用,按值数组迭代将在所有情况下仅增加refcount (而不是复制数组)。 If the array is modified during the foreach loop, at that point a duplication will occur (according to copy-on-write) and foreach will keep working on the old array. 如果在foreach循环中修改了数组,则将发生复制(根据写时复制),并且foreach将继续在旧数组上工作。

In most cases, this change is transparent and has no other effect than better performance. 在大多数情况下,此更改是透明的,除了提高性能外没有其他影响。 However, there is one occasion where it results in different behavior, namely the case where the array was a reference beforehand: 但是,在某些情况下它会导致不同的行为,即数组事先是引用的情况:

$array = [1, 2, 3, 4, 5];
$ref = &$array;
foreach ($array as $val) {
    var_dump($val);
    $array[2] = 0;
}
/* Old output: 1, 2, 0, 4, 5 */
/* New output: 1, 2, 3, 4, 5 */

Previously by-value iteration of reference-arrays was special cases. 以前,引用数组的按值迭代是特殊情况。 In this case, no duplication occurred, so all modifications of the array during iteration would be reflected by the loop. 在这种情况下,不会发生重复,因此循环过程中对数组的所有修改都会反映在循环中。 In PHP 7 this special case is gone: A by-value iteration of an array will always keep working on the original elements, disregarding any modifications during the loop. 在PHP 7中,这种特殊情况不复存在:数组的按值迭代将始终对原始元素进行处理,而无需考虑循环中的任何修改。

This, of course, does not apply to by-reference iteration. 当然,这不适用于按引用迭代。 If you iterate by-reference all modifications will be reflected by the loop. 如果按引用进行迭代,则所有修改都将在循环中反映出来。 Interestingly, the same is true for by-value iteration of plain objects: 有趣的是,对于普通对象的按值迭代也是如此:

$obj = new stdClass;
$obj->foo = 1;
$obj->bar = 2;
foreach ($obj as $val) {
    var_dump($val);
    $obj->bar = 42;
}
/* Old and new output: 1, 42 */

This reflects the by-handle semantics of objects (ie they behave reference-like even in by-value contexts). 这反映了对象的按句柄语义(即,即使在按值上下文中,它们的行为也像引用一样)。

Examples 例子

Let's consider a few examples, starting with your test cases: 让我们考虑一些示例,从测试用例开始:

  • Test cases 1 and 2 retain the same output: By-value array iteration always keep working on the original elements. 测试用例1和2保留相同的输出:按值数组迭代始终对原始元素起作用。 (In this case, even refcounting and duplication behavior is exactly the same between PHP 5 and PHP 7). (在这种情况下,甚至refcounting和复制行为在PHP 5和PHP 7之间也完全相同)。

  • Test case 3 changes: Foreach no longer uses the IAP, so each() is not affected by the loop. 测试用例3进行了更改: Foreach不再使用IAP,因此each()不受循环的影响。 It will have the same output before and after. 前后会有相同的输出。

  • Test cases 4 and 5 stay the same: each() and reset() will duplicate the array before changing the IAP, while foreach still uses the original array. 测试案例4和5保持不变: each()reset()将在更改IAP之前复制该数组,而foreach仍使用原始数组。 (Not that the IAP change would have mattered, even if the array was shared.) (即使阵列是共享的,IAP更改也不会很重要。)

The second set of examples was related to the behavior of current() under different reference/refcounting configurations. 第二组示例与在不同reference/refcounting配置下current()的行为有关。 This no longer makes sense, as current() is completely unaffected by the loop, so its return value always stays the same. 这不再有意义,因为current()完全不受循环的影响,因此其返回值始终保持不变。

However, we get some interesting changes when considering modifications during iteration. 但是,在迭代过程中考虑修改时,我们会得到一些有趣的变化。 I hope you will find the new behavior saner. 我希望您会发现新的行为更聪明。 The first example: 第一个例子:

$array = [1, 2, 3, 4, 5];
foreach ($array as &$v1) {
    foreach ($array as &$v2) {
        if ($v1 == 1 && $v2 == 1) {
            unset($array[1]);
        }
        echo "($v1, $v2)\n";
    }
}

// Old output: (1, 1) (1, 3) (1, 4) (1, 5)
// New output: (1, 1) (1, 3) (1, 4) (1, 5)
//             (3, 1) (3, 3) (3, 4) (3, 5)
//             (4, 1) (4, 3) (4, 4) (4, 5)
//             (5, 1) (5, 3) (5, 4) (5, 5) 

As you can see, the outer loop no longer aborts after the first iteration. 如您所见,外循环在第一次迭代后不再中止。 The reason is that both loops now have entirely separate hashtable iterators, and there is no longer any cross-contamination of both loops through a shared IAP. 原因是两个循环现在都具有完全独立的哈希表迭代器,并且不再通过共享的IAP交叉污染两个循环。

Another weird edge case that is fixed now, is the odd effect you get when you remove and add elements that happen to have the same hash: 现在已解决的另一个奇怪的边缘情况是,当删除和添加恰好具有相同哈希值的元素时,会得到奇怪的效果:

$array = ['EzEz' => 1, 'EzFY' => 2, 'FYEz' => 3];
foreach ($array as &$value) {
    unset($array['EzFY']);
    $array['FYFY'] = 4;
    var_dump($value);
}
// Old output: 1, 4
// New output: 1, 3, 4

Previously the HashPointer restore mechanism jumped right to the new element because it "looked" like it's the same as the removed element (due to colliding hash and pointer). 以前,HashPointer还原机制直接跳到新元素上,因为它“看起来”与删除的元素相同(由于哈希和指针冲突)。 As we no longer rely on the element hash for anything, this is no longer an issue. 由于我们不再依赖元素哈希进行任何操作,因此这不再是问题。


#5楼

NOTE FOR PHP 7 PHP 7的注意事项

To update on this answer as it has gained some popularity: This answer no longer applies as of PHP 7. As explained in the " Backward incompatible changes ", in PHP 7 foreach works on copy of the array, so any changes on the array itself are not reflected on foreach loop. 要更新该答案,因为它已经很流行:从PHP 7开始,此答案不再适用。如“ 向后不兼容的更改 ”中所述,在PHP 7中,foreach可以在数组副本上使用,因此数组本身的任何更改没有反映在foreach循环上。 More details at the link. 链接中有更多详细信息。

Explanation (quote from php.net ): 说明(引自php.net ):

The first form loops over the array given by array_expression. 第一种形式遍历array_expression给定的数组。 On each iteration, the value of the current element is assigned to $value and the internal array pointer is advanced by one (so on the next iteration, you'll be looking at the next element). 在每次迭代中,当前元素的值都分配给$ value,并且内部数组指针前进一个(因此,在下一次迭代中,您将查看下一个元素)。

So, in your first example you only have one element in the array, and when the pointer is moved the next element does not exist, so after you add new element foreach ends because it already "decided" that it it as the last element. 因此,在第一个示例中,数组中只有一个元素,并且当指针移动时,下一个元素不存在,因此在添加新元素后foreach结尾是因为它已经“决定”了它作为最后一个元素。

In your second example, you start with two elements, and foreach loop is not at the last element so it evaluates the array on the next iteration and thus realises that there is new element in the array. 在第二个示例中,您从两个元素开始,并且foreach循环不在最后一个元素处,因此它在下一次迭代时对数组进行求值,从而意识到数组中存在新元素。

I believe that this is all consequence of On each iteration part of the explanation in the documentation, which probably means that foreach does all logic before it calls the code in {} . 我相信,这都是文档中说明的“ 每次迭代”部分的全部结果,这可能意味着foreach在调用{}的代码之前先进行所有逻辑。

Test case 测试用例

If you run this: 如果运行此命令:

<?
    $array = Array(
        'foo' => 1,
        'bar' => 2
    );
    foreach($array as $k=>&$v) {
        $array['baz']=3;
        echo $v." ";
    }
    print_r($array);
?>

You will get this output: 您将获得以下输出:

1 2 3 Array
(
    [foo] => 1
    [bar] => 2
    [baz] => 3
)

Which means that it accepted the modification and went through it because it was modified "in time". 这意味着它接受了修改并经历了修改,因为修改是“及时的”。 But if you do this: 但是,如果您这样做:

<?
    $array = Array(
        'foo' => 1,
        'bar' => 2
    );
    foreach($array as $k=>&$v) {
        if ($k=='bar') {
            $array['baz']=3;
        }
        echo $v." ";
    }
    print_r($array);
?>

You will get: 你会得到:

1 2 Array
(
    [foo] => 1
    [bar] => 2
    [baz] => 3
)

Which means that array was modified, but since we modified it when the foreach already was at the last element of the array, it "decided" not to loop anymore, and even though we added new element, we added it "too late" and it was not looped through. 这意味着该数组已被修改,但是由于我们在foreach已位于数组的最后一个元素时对其进行了修改,因此它“决定”不再循环,即使我们添加了新元素,我们也将其添加为“太晚了”,它没有循环通过。

Detailed explanation can be read at How does PHP 'foreach' actually work? 可以在PHP'foreach'实际如何工作中阅读详细说明 which explains the internals behind this behaviour. 这解释了此行为的内部原因。


#6楼

As per the documentation provided by PHP manual. 根据PHP手册提供的文档。

On each iteration, the value of the current element is assigned to $v and the internal 在每次迭代中,当前元素的值分配给$ v,内部
array pointer is advanced by one (so on the next iteration, you'll be looking at the next element). 数组指针前进一个(因此,在下一次迭代中,您将查看下一个元素)。

So as per your first example: 因此,按照您的第一个示例:

$array = ['foo'=>1];
foreach($array as $k=>&$v)
{
   $array['bar']=2;
   echo($v);
}

$array have only single element, so as per the foreach execution, 1 assign to $v and it don't have any other element to move pointer $array仅具有单个元素,因此根据foreach执行,将1分配给$v ,并且没有其他元素可移动指针

But in your second example: 但是在第二个示例中:

$array = ['foo'=>1, 'bar'=>2];
foreach($array as $k=>&$v)
{
   $array['baz']=3;
   echo($v);
}

$array have two element, so now $array evaluate the zero indices and move the pointer by one. $array有两个元素,所以现在$ array计算零索引并将指针移一。 For first iteration of loop, added $array['baz']=3; 对于循环的第一次迭代,添加了$array['baz']=3; as pass by reference. 作为参考。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值