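Now let's turn our attention to fsbr. FSBR itself was covered earlier, but the code is full of fsbr-related variables and functions. Unless we sort them out, you will probably end up as confused and lost as I was. So let's light the Aladdin's lamp of the mind and walk through the fog of this code together.
struct uhci_hcd has these members: unsigned int fsbr_is_on, unsigned int fsbr_is_wanted, unsigned int fsbr_expiring, and struct timer_list fsbr_timer. All of them exist for fsbr, which shows how seriously the author of the code takes it.
Only two functions change fsbr_is_on: uhci_fsbr_on and uhci_fsbr_off. As the names suggest, the former sets fsbr_is_on to 1 and the latter sets it to 0.
Only two functions change fsbr_is_wanted as well: uhci_urbp_wants_fsbr and uhci_scan_schedule. Likewise, the former sets fsbr_is_wanted to 1 and the latter sets it to 0.
fsbr_expiring is changed in three places: uhci_urbp_wants_fsbr sets it to 0, uhci_fsbr_timeout also sets it to 0, and uhci_scan_schedule sets it to 1.
As for fsbr_timer, it is a timer initialized in uhci_start through setup_timer, which binds it to the function uhci_fsbr_timeout(). uhci_scan_schedule() calls mod_timer to arm this time bomb, uhci_urbp_wants_fsbr() calls del_timer to cancel it, and uhci_stop calls del_timer_sync to defuse it as well.
Let's first see what the function bound to the timer actually looks like. uhci_fsbr_timeout() comes from drivers/usb/host/uhci-q.c: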
static void uhci_fsbr_timeout(unsigned long _uhci)
{
    struct uhci_hcd *uhci = (struct uhci_hcd *) _uhci;
    unsigned long flags;

    spin_lock_irqsave(&uhci->lock, flags);
    if (uhci->fsbr_expiring) {
        uhci->fsbr_expiring = 0;
        uhci_fsbr_off(uhci);
    }
    spin_unlock_irqrestore(&uhci->lock, flags);
}
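As you can see, this function does nothing more than call uhci_fsbr_off(), apart from setting fsbr_expiring to 0. And uhci_fsbr_off() is only executed if fsbr_expiring is nonzero. So let's go to uhci_scan_schedule and look at the context in which mod_timer is called.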
1705 /*
1706  * Process events in the schedule, but only in one thread at a time
1707  */
1708 static void uhci_scan_schedule(struct uhci_hcd *uhci)
1709 {
1710     int i;
1711     struct uhci_qh *qh;
1712
1713     /* Don't allow re-entrant calls */
1714     if (uhci->scan_in_progress) {
1715         uhci->need_rescan = 1;
1716         return;
1717     }
1718     uhci->scan_in_progress = 1;
1719 rescan:
1720     uhci->need_rescan = 0;
1721     uhci->fsbr_is_wanted = 0;
1722
1723     uhci_clear_next_interrupt(uhci);
1724     uhci_get_current_frame_number(uhci);
1725     uhci->cur_iso_frame = uhci->frame_number;
1726
1727     /* Go through all the QH queues and process the URBs in each one */
1728     for (i = 0; i < UHCI_NUM_SKELQH - 1; ++i) {
1729         uhci->next_qh = list_entry(uhci->skelqh[i]->node.next,
1730                 struct uhci_qh, node);
1731         while ((qh = uhci->next_qh) != uhci->skelqh[i]) {
1732             uhci->next_qh = list_entry(qh->node.next,
1733                     struct uhci_qh, node);
1734
1735             if (uhci_advance_check(uhci, qh)) {
1736                 uhci_scan_qh(uhci, qh);
1737                 if (qh->state == QH_STATE_ACTIVE) {
1738                     uhci_urbp_wants_fsbr(uhci,
1739                         list_entry(qh->queue.next, struct urb_priv, node));
1740                 }
1741             }
1742         }
1743     }
1744
1745     uhci->last_iso_frame = uhci->cur_iso_frame;
1746     if (uhci->need_rescan)
1747         goto rescan;
1748     uhci->scan_in_progress = 0;
1749
1750     if (uhci->fsbr_is_on && !uhci->fsbr_is_wanted &&
1751             !uhci->fsbr_expiring) {
1752         uhci->fsbr_expiring = 1;
1753         mod_timer(&uhci->fsbr_timer, jiffies + FSBR_OFF_DELAY);
1754     }
1755
1756     if (list_empty(&uhci->skel_unlink_qh->node))
1757         uhci_clear_next_interrupt(uhci);
1758     else
1759         uhci_set_next_interrupt(uhci);
1760 }
As you can see, right before calling mod_timer we set fsbr_expiring to 1, and the delay passed to mod_timer is FSBR_OFF_DELAY. That macro is defined in drivers/usb/host/uhci-hcd.h:
/* When no queues need Full-Speed Bandwidth Reclamation,
 * delay this long before turning FSBR off */
#define FSBR_OFF_DELAY  msecs_to_jiffies(10)

/* If a queue hasn't advanced after this much time, assume it is stuck */
#define QH_WAIT_TIMEOUT msecs_to_jiffies(200)
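We see the two macros defined side by side, and a man's intuition says they must be related somehow. In fact struct uhci_qh has a member, unsigned int wait_expired; uhci_activate_qh sets it to 0, and uhci_advance_check sets it twice, once to 0 and once to 1. That variable is tied to the QH_WAIT_TIMEOUT macro.
But let's look at the first macro, FSBR_OFF_DELAY, first. By its definition it stands for 10 milliseconds. In Alan Stern's view, although FSBR is a mechanism for making full use of resources, it also adds to the system's load to some degree, so once it is no longer in use it should be disabled as soon as possible. In his experience, once URBs have gone without FSBR for 10 milliseconds, FSBR gets turned off; in theory 10 milliseconds is plenty of time for a driver to submit another URB.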
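To get a feel for what those two delays mean in timer ticks, here is a minimal user-space sketch. model_msecs_to_jiffies() is a hypothetical stand-in, not the kernel's msecs_to_jiffies(); it only assumes that the conversion rounds up to whole ticks, and the tick rates used to try it out (100, 250, 1000) are merely common configurations, not something the driver fixes.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the kernel's msecs_to_jiffies():
 * converts milliseconds to timer ticks for a tick rate of hz ticks per
 * second, rounding up so that a delay is never shorter than requested. */
static unsigned long model_msecs_to_jiffies(unsigned int ms, unsigned int hz)
{
    assert(hz > 0);
    return ((unsigned long)ms * hz + 999UL) / 1000UL;
}
```

With a 250-tick-per-second clock, the 10 ms of FSBR_OFF_DELAY becomes 3 ticks under this model (2.5 rounded up), and the 200 ms of QH_WAIT_TIMEOUT becomes 50 ticks.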
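In uhci_add_fsbr(), the check is this: if the urb's URB_NO_FSBR flag is not set, urbp->fsbr is set to 1. In practice hardly anybody bothers to set URB_NO_FSBR, so we can more or less assume that uhci_add_fsbr() always sets urbp->fsbr to 1. Only two functions call uhci_add_fsbr(): uhci_submit_bulk() and uhci_submit_control(). So if we look at the debugfs output, we find that when there is neither a Bulk transfer nor a control transfer, FSBR is 0, that is, fsbr_is_on is 0. When there is a Bulk transfer or a full-speed control transfer, FSBR should be 1. The following snapshot, for example, was taken while I was writing to a USB flash drive, at the moment I was copying a file of a few dozen megabytes onto it: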
localhost:~ # cat /sys/kernel/debug/uhci/0000\:00\:1d.0
Root-hub state: running FSBR: 1
HC status
usbcmd = 00c1 Maxp64 CF RS
usbstat = 0000
usbint = 000f
usbfrnum = (0)958
flbaseadd = 146eb958
sof = 40
stat1 = 0095 Enabled Connected
stat2 = 0080
Most recent frame: 31a41 (577) Last ISO frame: 31a41 (577)
Periodic load table
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Total: 0, #INT: 0, #ISO: 0
The "FSBR" we see here corresponds to uhci->fsbr_is_on.
But the function that sets fsbr_is_on is uhci_fsbr_on(), not uhci_add_fsbr(). uhci_fsbr_on() is called by uhci_urbp_wants_fsbr(), which first checks urbp->fsbr. As we just said, uhci_add_fsbr() has set urbp->fsbr to 1, and that is why uhci_fsbr_on gets executed here.
79 static void uhci_urbp_wants_fsbr(struct uhci_hcd *uhci, struct urb_priv *urbp)
80 {
81     if (urbp->fsbr) {
82         uhci->fsbr_is_wanted = 1;
83         if (!uhci->fsbr_is_on)
84             uhci_fsbr_on(uhci);
85         else if (uhci->fsbr_expiring) {
86             uhci->fsbr_expiring = 0;
87             del_timer(&uhci->fsbr_timer);
88         }
89     }
90 }
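Three functions call uhci_urbp_wants_fsbr(), and the first is naturally uhci_urb_enqueue(). As we saw at line 1435 of uhci_urb_enqueue(), once uhci_activate_qh has activated the qh, uhci_urbp_wants_fsbr can be called to turn fsbr on.
So after it has been turned on, when does FSBR go back to 0? In other words, when is uhci_fsbr_off, the function that sets fsbr_is_on to 0, called? There are two places: suspend_rh and uhci_fsbr_timeout. Let's look at the latter first; it is the very function bound to the timer we discussed above, and the mod_timer call that arms it lives in uhci_scan_schedule(). Calling mod_timer requires three conditions, at line 1750 of uhci_scan_schedule(): fsbr_is_on must be 1, fsbr_is_wanted must be 0, and fsbr_expiring must be 0. The first is easy to understand and goes without saying. The second and third are tied to the uhci_urbp_wants_fsbr() call on line 1738.
As for fsbr_is_wanted, line 1721 of uhci_scan_schedule() starts by setting it to 0, but if uhci_urbp_wants_fsbr runs, it sets fsbr_is_wanted back to 1. As for fsbr_expiring, its initial value is 0 and nobody has touched it, so whether or not uhci_urbp_wants_fsbr runs it is still 0, at least in the context we are looking at. The question, then, is whether uhci_urbp_wants_fsbr runs at all. That depends on the if on line 1737, that is, on whether qh->state equals QH_STATE_ACTIVE, and that in turn depends on uhci_scan_qh. The job of uhci_scan_qh is to see whether the qh's urb queue has emptied. If it has not, uhci_activate_qh is called again to set qh->state to QH_STATE_ACTIVE; if it has, uhci_make_qh_idle is called to set qh->state to QH_STATE_IDLE. In other words, if the condition on line 1737 holds, the qh still has urbs queued. Since it does, fsbr stays wanted, and fsbr_is_wanted is set to 1 once more.
Sooner or later, though, the qh's queue will become empty, because every transfer comes to an end. When that day comes, qh->state is QH_STATE_IDLE after uhci_scan_qh, so uhci_urbp_wants_fsbr is not called. Put differently, with no urbs left there is no need to keep using fsbr, so this time fsbr_is_wanted stays 0. Now lines 1752 and 1753 finally get their chance to run. First fsbr_expiring is set to 1; then, 10 milliseconds later, uhci_fsbr_timeout runs, which in turn runs uhci_fsbr_off, and fsbr finally stops. If we look at debugfs at that point, it should look like this: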
localhost:~ # cat /sys/kernel/debug/uhci/0000\:00\:1d.0
Root-hub state: running FSBR: 0
HC status
usbcmd = 00c1 Maxp64 CF RS
usbstat = 0000
usbint = 000f
usbfrnum = (0)b04
flbaseadd = 10b57b04
sof = 40
stat1 = 0095 Enabled Connected
stat2 = 0080
Most recent frame: 384216 (534) Last ISO frame: 384216 (534)
Periodic load table
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Total: 0, #INT: 0, #ISO: 0
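FSBR is back to 0.
But man proposes and heaven disposes. You think everything is under control, and then, within those 10 ms, some wicked fellow goes and submits another Bulk urb. What do you do?
Don't panic. Trust the Party, trust the government.
Suppose that fellow does submit a Bulk urb within those 10 ms. Then uhci_urb_enqueue is called, and so uhci_urbp_wants_fsbr is called again. Go back and look at uhci_urbp_wants_fsbr(): because fsbr_expiring was just set to 1, the else if on line 85 of that function is satisfied, so uhci->fsbr_expiring is set back to 0. More importantly, del_timer is called, which means the bomb that was about to go off is defused by the wise Party inside its 10 ms countdown. I trust that by now you, like me, deeply appreciate how the Party's light shines in all directions.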
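Both endings of the story, the timer going off and the timer being cancelled, can be replayed in a small user-space model. This is only a sketch under stated assumptions: struct fsbr_model, the model_* functions, the timer_armed and deadline_ms fields, and counting time in milliseconds instead of jiffies are all inventions for illustration; only the three flags and the branch structure are taken from the driver code quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FSBR_OFF_DELAY_MS 10UL   /* plays the role of FSBR_OFF_DELAY */

struct fsbr_model {
    bool is_on;                /* mirrors uhci->fsbr_is_on */
    bool is_wanted;            /* mirrors uhci->fsbr_is_wanted */
    bool expiring;             /* mirrors uhci->fsbr_expiring */
    bool timer_armed;          /* hypothetical: a pending fsbr_timer */
    unsigned long deadline_ms; /* hypothetical: when that timer fires */
};

/* Models uhci_urbp_wants_fsbr() for a URB whose urbp->fsbr is 1. */
static void model_wants_fsbr(struct fsbr_model *m)
{
    m->is_wanted = true;
    if (!m->is_on) {
        m->is_on = true;            /* stands in for uhci_fsbr_on() */
    } else if (m->expiring) {
        m->expiring = false;
        m->timer_armed = false;     /* stands in for del_timer() */
    }
}

/* Models the FSBR part of uhci_scan_schedule(). active_fsbr_qhs is the
 * number of queues still ACTIVE with an FSBR URB at their head. */
static void model_scan(struct fsbr_model *m, unsigned long now_ms,
                       int active_fsbr_qhs)
{
    m->is_wanted = false;
    for (int i = 0; i < active_fsbr_qhs; i++)
        model_wants_fsbr(m);
    if (m->is_on && !m->is_wanted && !m->expiring) {
        m->expiring = true;
        m->timer_armed = true;      /* stands in for mod_timer() */
        m->deadline_ms = now_ms + MODEL_FSBR_OFF_DELAY_MS;
    }
}

/* Models the timer: runs the uhci_fsbr_timeout() body once it is due. */
static void model_tick(struct fsbr_model *m, unsigned long now_ms)
{
    if (m->timer_armed && now_ms >= m->deadline_ms) {
        m->timer_armed = false;
        if (m->expiring) {
            m->expiring = false;
            m->is_on = false;       /* stands in for uhci_fsbr_off() */
        }
    }
}

/* Ending 1: the queue drains and nobody resubmits. Returns final is_on. */
static bool scenario_idle_expiry(void)
{
    struct fsbr_model m = {0};
    model_wants_fsbr(&m);           /* t=0: enqueue, FSBR goes on */
    model_scan(&m, 5, 0);           /* t=5: queue empty, timer set for t=15 */
    model_tick(&m, 14);             /* too early, nothing happens */
    assert(m.is_on && m.expiring);
    model_tick(&m, 15);             /* timer fires */
    return m.is_on;
}

/* Ending 2: a new Bulk urb arrives inside the 10 ms window. */
static bool scenario_resubmit(void)
{
    struct fsbr_model m = {0};
    model_wants_fsbr(&m);           /* t=0: enqueue, FSBR goes on */
    model_scan(&m, 5, 0);           /* t=5: timer set for t=15 */
    model_wants_fsbr(&m);           /* t=9: new urb cancels the timer */
    model_tick(&m, 15);             /* nothing left to fire */
    return m.is_on;
}
```

In this model scenario_idle_expiry() ends with FSBR off and scenario_resubmit() ends with FSBR still on, matching the two debugfs snapshots and the del_timer rescue described above.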
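We said earlier that there are two macros and we now understand one of them. What about the other, QH_WAIT_TIMEOUT? It has to do with uhci_advance_check(). We covered that function before, but the careful reader will have noticed that we skipped part of its code at the time. Now is the moment to read the part we skipped, so here is uhci_advance_check() once more.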
1626 /*
1627  * Check for queues that have made some forward progress.
1628  * Returns 0 if the queue is not Isochronous, is ACTIVE, and
1629  * has not advanced since last examined; 1 otherwise.
1630  *
1631  * Early Intel controllers have a bug which causes qh->element sometimes
1632  * not to advance when a TD completes successfully. The queue remains
1633  * stuck on the inactive completed TD. We detect such cases and advance
1634  * the element pointer by hand.
1635  */
1636 static int uhci_advance_check(struct uhci_hcd *uhci, struct uhci_qh *qh)
1637 {
1638     struct urb_priv *urbp = NULL;
1639     struct uhci_td *td;
1640     int ret = 1;
1641     unsigned status;
1642
1643     if (qh->type == USB_ENDPOINT_XFER_ISOC)
1644         goto done;
1645
1646     /* Treat an UNLINKING queue as though it hasn't advanced.
1647      * This is okay because reactivation will treat it as though
1648      * it has advanced, and if it is going to become IDLE then
1649      * this doesn't matter anyway. Furthermore it's possible
1650      * for an UNLINKING queue not to have any URBs at all, or
1651      * for its first URB not to have any TDs (if it was dequeued
1652      * just as it completed). So it's not easy in any case to
1653      * test whether such queues have advanced. */
1654     if (qh->state != QH_STATE_ACTIVE) {
1655         urbp = NULL;
1656         status = 0;
1657
1658     } else {
1659         urbp = list_entry(qh->queue.next, struct urb_priv, node);
1660         td = list_entry(urbp->td_list.next, struct uhci_td, list);
1661         status = td_status(td);
1662         if (!(status & TD_CTRL_ACTIVE)) {
1663
1664             /* We're okay, the queue has advanced */
1665             qh->wait_expired = 0;
1666             qh->advance_jiffies = jiffies;
1667             goto done;
1668         }
1669         ret = 0;
1670     }
1671
1672     /* The queue hasn't advanced; check for timeout */
1673     if (qh->wait_expired)
1674         goto done;
1675
1676     if (time_after(jiffies, qh->advance_jiffies + QH_WAIT_TIMEOUT)) {
1677
1678         /* Detect the Intel bug and work around it */
1679         if (qh->post_td && qh_element(qh) == LINK_TO_TD(qh->post_td)) {
1680             qh->element = qh->post_td->link;
1681             qh->advance_jiffies = jiffies;
1682             ret = 1;
1683             goto done;
1684         }
1685
1686         qh->wait_expired = 1;
1687
1688         /* If the current URB wants FSBR, unlink it temporarily
1689          * so that we can safely set the next TD to interrupt on
1690          * completion. That way we'll know as soon as the queue
1691          * starts moving again. */
1692         if (urbp && urbp->fsbr && !(status & TD_CTRL_IOC))
1693             uhci_unlink_qh(uhci, qh);
1694
1695     } else {
1696         /* Unmoving but not-yet-expired queues keep FSBR alive */
1697         if (urbp)
1698             uhci_urbp_wants_fsbr(uhci, urbp);
1699     }
1700
1701 done:
1702     return ret;
1703 }
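We have not discussed any of the code from line 1673 onward. First, neither wait_expired nor advance_jiffies is new to us: both are assigned in uhci_activate_qh, where qh->wait_expired is set to 0 and qh->advance_jiffies is set to the current time, and line 1666 refreshes advance_jiffies again whenever the queue is seen to have advanced. QH_WAIT_TIMEOUT is defined as 200 milliseconds. So when uhci_scan_schedule runs and calls uhci_advance_check, line 1676 asks whether more than 200 milliseconds have passed since the qh was activated or last made progress while the queue still has not advanced. Experience says that is not normal. It is like taking the bus to work: from Baiwanzhuang Street on the No. 319 to Tsinghua Science Park normally takes 40 minutes, so if one day I have been riding for two hours and am still not there, something has certainly gone wrong, either an accident or a really bad traffic jam.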
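The test on line 1676 goes through time_after() rather than a plain ">" because the tick counter eventually wraps around. Here is a minimal user-space sketch of why that matters. model_time_after() mimics the signed-difference trick that the kernel macro is generally understood to use; every name below is hypothetical, and the code assumes an ordinary two's-complement machine.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user-space equivalent of time_after(a, b): true when a is
 * later than b, even if the unsigned counter wrapped in between. */
static int model_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* True when a queue that last made progress at advance_jiffies has been
 * stuck for more than timeout ticks as of now. */
static int model_queue_timed_out(unsigned long now,
                                 unsigned long advance_jiffies,
                                 unsigned long timeout)
{
    /* Only meaningful for spans below half the counter range. */
    assert(timeout <= (unsigned long)LONG_MAX);
    return model_time_after(now, advance_jiffies + timeout);
}

/* Example near the wrap point: the last progress was seen 100 ticks
 * before the counter overflows; elapsed is the ticks gone by since. */
static int model_demo_near_wrap(unsigned long elapsed, unsigned long timeout)
{
    unsigned long advance = ULONG_MAX - 100UL;
    return model_queue_timed_out(advance + elapsed, advance, timeout);
}
```

With a timeout of 200 ticks, model_demo_near_wrap(50, 200) reports no timeout even though the deadline has numerically wrapped to a small value that a naive "now > deadline" comparison would consider long past; model_demo_near_wrap(201, 200) correctly reports the timeout.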
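So what is done about it here?
Lines 1679 to 1684: as the comment says, this is a bug in chips from my own house, Intel. On the principle that family shame stays in the family, we drift right past it.
Line 1686 sets wait_expired to 1.
Line 1692: the if again has three conditions. First, urbp is not NULL; second, urbp->fsbr is nonzero; third, TD_CTRL_IOC is not set. If all three hold, uhci_unlink_qh() is called. The comment states the reason plainly: if the current URB wants FSBR, the queue is unlinked temporarily so that the next TD can safely be set to interrupt on completion, and that way we find out as soon as the queue starts moving again.
If, on the other hand, the code in the else on line 1695 runs, it means that although the queue has not advanced, it has not timed out either; fewer than 200 milliseconds have passed. In that case, if the qh still has a urbp in its urb queue, uhci_urbp_wants_fsbr() is called to make sure fsbr_is_on stays 1. A netizen going by "做贼肾虚" can't help asking: we just saw that as long as the qh is not empty, the time bomb in uhci_scan_schedule will not be armed, so shouldn't fsbr_is_on stay at 1 anyway?
Not quite! As mentioned earlier, the function that sets fsbr_is_on to 0 is uhci_fsbr_off(), and besides uhci_fsbr_timeout() there is one more place that calls it: suspend_rh(). We have in fact already seen that the last line of suspend_rh() calls uhci_fsbr_off(). So after we wake from that slumber, we need to make sure fsbr_is_on is still 1.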
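The branches from line 1673 onward boil down to a small decision table, which the following sketch spells out for a queue that has not advanced. It is a simplification under stated assumptions: the Intel workaround on lines 1679 to 1684 is deliberately left out, and struct stuck_queue, enum stuck_action and model_stuck_action() are hypothetical names that do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>

enum stuck_action {
    ACT_NONE,          /* nothing further happens on this scan */
    ACT_KEEP_FSBR,     /* not timed out yet: uhci_urbp_wants_fsbr() */
    ACT_EXPIRE,        /* timed out: set wait_expired only */
    ACT_EXPIRE_UNLINK  /* timed out: set wait_expired, uhci_unlink_qh() */
};

struct stuck_queue {
    bool wait_expired; /* qh->wait_expired on entry */
    bool timed_out;    /* outcome of the time_after() test, line 1676 */
    bool has_urbp;     /* qh was ACTIVE, so urbp is non-NULL */
    bool urb_fsbr;     /* urbp->fsbr */
    bool td_ioc;       /* status & TD_CTRL_IOC */
};

/* What happens to a queue that has not advanced, ignoring the Intel
 * workaround path. */
static enum stuck_action model_stuck_action(const struct stuck_queue *q)
{
    assert(q);
    if (q->wait_expired)                      /* line 1673 */
        return ACT_NONE;
    if (!q->timed_out)                        /* else branch, line 1695 */
        return q->has_urbp ? ACT_KEEP_FSBR : ACT_NONE;
    if (q->has_urbp && q->urb_fsbr && !q->td_ioc)   /* line 1692 */
        return ACT_EXPIRE_UNLINK;
    return ACT_EXPIRE;                        /* line 1686 alone */
}
```

Reading the table this way makes one thing visible: once wait_expired is set, later scans no longer call uhci_urbp_wants_fsbr() for this queue from here, so a stuck queue stops keeping FSBR alive.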
It may look as though we have simply read uhci_advance_check() all over again. But one question remains: what exactly is the consequence of calling uhci_unlink_qh() on line 1692?