[Android 6.0] Data Service Retry Mechanism

Original post, 2016-12-26 10:45:12

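In the Android 6.0 framework telephony code, data connection errors are generally handled in three cases:

1. SETUP_DATA_CALL returns an error

2. The modem reports a DATA_CALL_LIST that contains an error code, or the connection is dropped

3. No downlink data (RX) arrives for a period of time while uplink data (TX) is still being sent


The handling of each case is described below.

1. SETUP_DATA_CALL failure

After DataConnection receives the SETUP_DATA_CALL result, it sends a Message to notify DcTracker, which handles it: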
protected void onDataSetupComplete(AsyncResult ar) {
    if (ar.exception == null) {
        // Connection succeeded
    } else {
        ...
        // Count down the permanent-failure counter; onDataSetupCompleteError checks it later
        if (isPermanentFail(cause)) apnContext.decWaitingApnsPermFailCount();
        apnContext.removeWaitingApn(apnContext.getApnSetting()); // remove the failed APN from the waiting list
        onDataSetupCompleteError(ar); // continue handling the error
        ...
    }
}

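Error-handling logic:
1. If every waiting APN in the apnContext has failed, and not all of them failed permanently (permanent fail), set a delay and restart the connection attempt from the beginning
2. If the apnContext still has waiting APNs that have not been tried, set a delay and try to connect with the next APN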
/**
     * Error has occurred during the SETUP {aka bringUP} request and the DCT
     * should either try the next waiting APN or start over from the
     * beginning if the list is empty. Between each SETUP request there will
     * be a delay defined by {@link #getApnDelay()}.
     */
    @Override
    protected void onDataSetupCompleteError(AsyncResult ar) {
        String reason = "";
        ApnContext apnContext = getValidApnContext(ar, "onDataSetupCompleteError");


        if (apnContext == null) return;


        // All APNs have been tried
        if (apnContext.getWaitingApns().isEmpty()) {
            apnContext.setState(DctConstants.State.FAILED); // mark the apnContext as FAILED
            mPhone.notifyDataConnection(Phone.REASON_APN_FAILED, apnContext.getApnType());

            // Clear the DataConnection
            apnContext.setDataConnectionAc(null);

            // If every APN had a permanent failure, do not retry
            if (apnContext.getWaitingApnsPermFailCount() == 0) {
                if (DBG) {
                    log("onDataSetupCompleteError: All APN's had permanent failures, stop retrying");
                }
            } else { // retry
                int delay = getApnDelay(Phone.REASON_APN_FAILED);
                if (DBG) {
                    log("onDataSetupCompleteError: Not all APN's had permanent failures delay="
                            + delay);
                }
                startAlarmForRestartTrySetup(delay, apnContext);
            }
        } else { // the waiting list still has untried APNs; try the next one
            if (DBG) log("onDataSetupCompleteError: Try next APN");
            apnContext.setState(DctConstants.State.SCANNING);
            // Wait a bit before trying the next APN, so that
            // we're not tying up the RIL command channel
            startAlarmForReconnect(getApnDelay(Phone.REASON_APN_FAILED), apnContext); // try the next APN
        }
    }
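
The two excerpts above boil down to one decision: try the next APN, start over, or give up. The following is a minimal sketch of that decision, not the framework code. ApnRetrySketch and NextStep are hypothetical names, and the sketch assumes the permanent-failure counter starts at the number of waiting APNs, which is what the `== 0` check in onDataSetupCompleteError suggests.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical, simplified stand-in for ApnContext; not the framework class. */
final class ApnRetrySketch {
    enum NextStep { TRY_NEXT_APN, RESTART_FROM_FIRST_APN, STOP_RETRYING }

    private final Deque<String> waitingApns = new ArrayDeque<>();
    // Assumption: starts at the number of APNs and is counted down on each permanent failure.
    private int permFailCountdown;

    ApnRetrySketch(String... apns) {
        for (String apn : apns) {
            waitingApns.add(apn);
        }
        permFailCountdown = apns.length;
    }

    /** Mirrors the failure branch of onDataSetupComplete plus onDataSetupCompleteError. */
    NextStep onSetupFailed(boolean permanentFail) {
        if (permanentFail) {
            permFailCountdown--;              // one more APN failed permanently
        }
        waitingApns.poll();                   // drop the APN that just failed
        if (!waitingApns.isEmpty()) {
            return NextStep.TRY_NEXT_APN;     // untried APNs remain
        }
        // List exhausted: retry only if at least one failure was not permanent.
        return permFailCountdown == 0
                ? NextStep.STOP_RETRYING
                : NextStep.RESTART_FROM_FIRST_APN;
    }

    public static void main(String[] args) {
        ApnRetrySketch ctx = new ApnRetrySketch("apn1", "apn2");
        System.out.println(ctx.onSetupFailed(true));   // TRY_NEXT_APN
        System.out.println(ctx.onSetupFailed(false));  // RESTART_FROM_FIRST_APN
    }
}
```

In the real code both "retry" branches also wait for getApnDelay() before acting; the sketch leaves the timing out.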

Appendix: all ApnContext states

    /**
     * IDLE: ready to start data connection setup, default state
     * CONNECTING: state of issued startPppd() but not finish yet
     * SCANNING: data connection fails with one apn but other apns are available
     *           ready to start data connection on other apns (before INITING)
     * CONNECTED: IP connection is setup
     * DISCONNECTING: Connection.disconnect() has been called, but PDP
     *                context is not yet deactivated
     * FAILED: data connection fail for all apns settings
     * RETRYING: data connection failed but we're going to retry.
     *
     * getDataConnectionState() maps State to DataState
     *      FAILED or IDLE : DISCONNECTED
     *      RETRYING or CONNECTING or SCANNING: CONNECTING
     *      CONNECTED : CONNECTED or DISCONNECTING
     */
    public enum State {
        IDLE,
        CONNECTING,
        SCANNING,
        CONNECTED,
        DISCONNECTING,
        FAILED,
        RETRYING
    }
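
The mapping described in the comment above can be written out as a small function. This is a sketch only: ApnStateMapping and toDataState are hypothetical names, the DataState enum here lists just the three values the comment mentions, and the last comment line is read as "State CONNECTED or DISCONNECTING maps to DataState CONNECTED".

```java
/** Hypothetical illustration of the State -> DataState mapping from the comment. */
final class ApnStateMapping {
    enum State { IDLE, CONNECTING, SCANNING, CONNECTED, DISCONNECTING, FAILED, RETRYING }

    // Assumption: only the three values named in the comment are modelled here.
    enum DataState { DISCONNECTED, CONNECTING, CONNECTED }

    static DataState toDataState(State state) {
        switch (state) {
            case FAILED:
            case IDLE:
                return DataState.DISCONNECTED;
            case RETRYING:
            case CONNECTING:
            case SCANNING:
                return DataState.CONNECTING;
            case CONNECTED:
            case DISCONNECTING:
                return DataState.CONNECTED;
            default:
                throw new IllegalArgumentException("unknown state: " + state);
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + toDataState(s));
        }
    }
}
```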


2. Connection dropped

DcController listens for RIL_UNSOL_DATA_CALL_LIST_CHANGED to receive updates for every data connection:

mPhone.mCi.registerForDataNetworkStateChanged(getHandler(),
                    DataConnection.EVENT_DATA_STATE_CHANGED, null);

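When the RIL reports DATA_CALL_LIST_CHANGED, it carries the modem's current data call list. DcController compares this list with the framework's active list:

1. Connections that are lost or disconnected are retried

2. Connections whose addresses or interface changed, or that hit a permanent error, are cleaned up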

private void onDataStateChanged(ArrayList<DataCallResponse> dcsList) {

            // Create hashmap of cid to DataCallResponse
            HashMap<Integer, DataCallResponse> dataCallResponseListByCid =
                    new HashMap<Integer, DataCallResponse>();
            for (DataCallResponse dcs : dcsList) {
                dataCallResponseListByCid.put(dcs.cid, dcs);
            }

            // If an active connection has no matching entry in the reported dcsList, treat it as lost and add it to the retry list
            ArrayList<DataConnection> dcsToRetry = new ArrayList<DataConnection>();
            for (DataConnection dc : mDcListActiveByCid.values()) {
                if (dataCallResponseListByCid.get(dc.mCid) == null) {
                    if (DBG) log("onDataStateChanged: add to retry dc=" + dc);
                    dcsToRetry.add(dc);
                }
            }
            // Find which connections have changed state and send a notification or cleanup
            // and any that are in active need to be retried.
            ArrayList<ApnContext> apnsToCleanup = new ArrayList<ApnContext>();

            boolean isAnyDataCallDormant = false;
            boolean isAnyDataCallActive = false;

            for (DataCallResponse newState : dcsList) {

                DataConnection dc = mDcListActiveByCid.get(newState.cid);
                // A connection missing from the active map has not been synced to the framework yet; it is handled elsewhere.
                if (dc == null) {
                    // UNSOL_DATA_CALL_LIST_CHANGED arrived before SETUP_DATA_CALL completed.
                    loge("onDataStateChanged: no associated DC yet, ignore");
                    continue;
                }
                
                if (dc.mApnContexts.size() == 0) {
                    if (DBG) loge("onDataStateChanged: no connected apns, ignore");
                } else {
                    // Determine if the connection/apnContext should be cleaned up
                    // or just a notification should be sent out.
                    if (newState.active == DATA_CONNECTION_ACTIVE_PH_LINK_INACTIVE) {
                        // Connection is INACTIVE; handle it according to the failure cause
                        DcFailCause failCause = DcFailCause.fromInt(newState.status);
                        if (failCause.isRestartRadioFail()) {
                            // Recovery requires a radio restart
                            mDct.sendRestartRadio();
                        } else if (mDct.isPermanentFail(failCause)) {
                            // Unrecoverable error; the connection must be cleaned up
                            apnsToCleanup.addAll(dc.mApnContexts.keySet());
                        } else {
                            for (ApnContext apnContext : dc.mApnContexts.keySet()) {
                                if (apnContext.isEnabled()) {
                                    // The APN is enabled: retry
                                    dcsToRetry.add(dc);
                                    break;
                                } else {
                                    // The APN is already disabled: clean it up
                                    apnsToCleanup.add(apnContext);
                                }
                            }
                        }
                    } else {
                        // Check whether the LinkProperties changed
                        UpdateLinkPropertyResult result = dc.updateLinkProperty(newState);
                        if (result.oldLp.equals(result.newLp)) {
                            if (DBG) log("onDataStateChanged: no change");
                        } else {
                            // Check whether the interface name is still the same
                            if (result.oldLp.isIdenticalInterfaceName(result.newLp)) {
                                if (! result.oldLp.isIdenticalDnses(result.newLp) ||
                                        ! result.oldLp.isIdenticalRoutes(result.newLp) ||
                                        ! result.oldLp.isIdenticalHttpProxy(result.newLp) ||
                                        ! result.oldLp.isIdenticalAddresses(result.newLp)) {
                                    // If the same address type was removed and
                                    // added we need to cleanup
                                    CompareResult<LinkAddress> car =
                                        result.oldLp.compareAddresses(result.newLp);
                                    if (DBG) {
                                        log("onDataStateChanged: oldLp=" + result.oldLp +
                                                " newLp=" + result.newLp + " car=" + car);
                                    }
                                    boolean needToClean = false;
                                    // If an address of the same type was removed and re-added, the old connection must be cleaned up
                                    for (LinkAddress added : car.added) {
                                        for (LinkAddress removed : car.removed) {
                                            if (NetworkUtils.addressTypeMatches(
                                                    removed.getAddress(),
                                                    added.getAddress())) {
                                                needToClean = true;
                                                break;
                                            }
                                        }
                                    }
                                    if (needToClean) {

                                        apnsToCleanup.addAll(dc.mApnContexts.keySet());
                                    } else {
                                        if (DBG) log("onDataStateChanged: simple change");
                                        // Any other LinkProperties change only triggers a notification
                                        for (ApnContext apnContext : dc.mApnContexts.keySet()) {
                                             mPhone.notifyDataConnection(
                                                 PhoneConstants.REASON_LINK_PROPERTIES_CHANGED,
                                                 apnContext.getApnType());
                                        }
                                    }
                                } else {
                                    if (DBG) {
                                        log("onDataStateChanged: no changes");
                                    }
                                }
                            } else {
                                // The interface changed; clean up the old connection
                                apnsToCleanup.addAll(dc.mApnContexts.keySet());
                                if (DBG) {
                                    log("onDataStateChanged: interface change, cleanup apns="
                                            + dc.mApnContexts);
                                }
                            }
                        }
                    }
                }

                ...
            }

            ...

            // Clean up connections
            for (ApnContext apnContext : apnsToCleanup) {
               mDct.sendCleanUpConnection(true, apnContext);
            }

            // Tell each DataConnection that its link was lost so it can reconnect
            for (DataConnection dc : dcsToRetry) {
                dc.sendMessage(DataConnection.EVENT_LOST_CONNECTION, dc.mTag);
            }
        }
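
The first step of onDataStateChanged is a plain set difference: anything the framework considers active but the modem no longer reports is treated as lost. Here is a minimal sketch of just that step, using bare cids instead of DataConnection and DataCallResponse objects. LostConnectionSketch and findLostCids are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of the "active but not reported" check. */
final class LostConnectionSketch {
    /** Returns the cids that are active in the framework but absent from the modem report. */
    static List<Integer> findLostCids(Collection<Integer> activeCids,
                                      Collection<Integer> reportedCids) {
        Set<Integer> reported = new HashSet<>(reportedCids);
        List<Integer> lost = new ArrayList<>();
        // Iterate in sorted order so the result is deterministic.
        for (Integer cid : new TreeSet<>(activeCids)) {
            if (!reported.contains(cid)) {
                lost.add(cid);                // candidate for EVENT_LOST_CONNECTION
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // Framework thinks cids 0, 1, 2 are active; the modem only reports cid 1.
        System.out.println(findLostCids(Arrays.asList(0, 1, 2), Arrays.asList(1))); // [0, 2]
    }
}
```

Reported cids that the framework does not know yet are simply ignored by this check, matching the "no associated DC yet, ignore" branch in the excerpt.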

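When the ActiveState of DataConnection (DcActiveState) receives EVENT_LOST_CONNECTION:

1. If the retry count has not reached its limit, it schedules a timed retry and transitions to RetryingState

2. If no retry is needed, it transitions to the inactive state and may notify DcTracker to handle the failure (onDataSetupCompleteError, see case 1)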

                case EVENT_LOST_CONNECTION: {
                    if (DBG) {
                        log("DcActiveState EVENT_LOST_CONNECTION dc=" + DataConnection.this);
                    }
                    if (mRetryManager.isRetryNeeded()) {
                        // We're going to retry
                        int delayMillis = mRetryManager.getRetryTimer();
                        // Schedule the retry
                        mDcRetryAlarmController.startRetryAlarm(EVENT_RETRY_CONNECTION, mTag,
                                delayMillis);
                        transitionTo(mRetryingState);
                    } else {
                        mInactiveState.setEnterNotificationParams(DcFailCause.LOST_CONNECTION);
                        transitionTo(mInactiveState);
                    }
                    retVal = HANDLED;
                    break;
                }


When RetryingState receives EVENT_RETRY_CONNECTION, it starts the connection and transitions to ActivatingState:

                case EVENT_RETRY_CONNECTION: {
                    if (msg.arg1 == mTag) {
                        mRetryManager.increaseRetryCount(); // count this retry

                        onConnect(mConnectionParams); // start connecting
                        transitionTo(mActivatingState); // switch to ActivatingState
                    } else {
                        if (DBG) {
                            log("DcRetryingState stale EVENT_RETRY_CONNECTION"
                                    + " tag:" + msg.arg1 + " != mTag:" + mTag);
                        }
                    }
                    retVal = HANDLED;
                    break;
                }
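
The `msg.arg1 == mTag` comparison is what lets a stale alarm be dropped. The sketch below shows the idea in isolation. TaggedAlarmSketch and its methods are hypothetical, and it assumes the tag is bumped whenever the connection state is reset, which the excerpt itself does not show.

```java
/** Hypothetical illustration of discarding stale retry alarms by tag. */
final class TaggedAlarmSketch {
    private int tag;            // current generation; alarms from older generations are stale
    private int retriesStarted;

    /** Schedules a retry and returns the tag the alarm will carry. */
    int scheduleRetry() {
        return tag;
    }

    /** Assumption: called whenever the connection is reset, invalidating pending alarms. */
    void invalidatePendingAlarms() {
        tag++;
    }

    /** Returns true if the alarm was current and a retry was started. */
    boolean onRetryAlarm(int alarmTag) {
        if (alarmTag != tag) {
            return false;       // stale alarm: ignore it, like the log-only branch above
        }
        retriesStarted++;
        return true;
    }

    int retriesStarted() {
        return retriesStarted;
    }

    public static void main(String[] args) {
        TaggedAlarmSketch dc = new TaggedAlarmSketch();
        int first = dc.scheduleRetry();
        dc.invalidatePendingAlarms();
        System.out.println(dc.onRetryAlarm(first));              // false, stale
        System.out.println(dc.onRetryAlarm(dc.scheduleRetry())); // true
    }
}
```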

RetryManager keeps the retry-related counters:

    public boolean isRetryNeeded() {
        boolean retVal = mRetryForever || (mRetryCount < mCurMaxRetryCount);
        if (DBG) log("isRetryNeeded: " + retVal);
        return retVal;
    }
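
To show how isRetryNeeded(), getRetryTimer() and increaseRetryCount() work together, here is a minimal sketch of a retry policy with a fixed delay table. RetryPolicySketch is a hypothetical class, not the framework RetryManager; in particular, taking the delays as constructor arguments is an assumption made for illustration.

```java
/** Hypothetical, simplified retry policy; not the framework RetryManager. */
final class RetryPolicySketch {
    private final int[] delaysMs;       // one delay per allowed retry
    private final boolean retryForever;
    private int retryCount;

    RetryPolicySketch(boolean retryForever, int... delaysMs) {
        if (delaysMs.length == 0) {
            throw new IllegalArgumentException("need at least one delay");
        }
        this.retryForever = retryForever;
        this.delaysMs = delaysMs.clone();
    }

    /** Same shape as the excerpt: forever, or count still below the maximum. */
    boolean isRetryNeeded() {
        return retryForever || retryCount < delaysMs.length;
    }

    /** Delay for the upcoming retry; the last delay is reused once the table is exhausted. */
    int getRetryTimer() {
        return delaysMs[Math.min(retryCount, delaysMs.length - 1)];
    }

    /** Counts one retry, capped at the table size so the counter cannot overflow. */
    void increaseRetryCount() {
        retryCount = Math.min(retryCount + 1, delaysMs.length);
    }

    public static void main(String[] args) {
        RetryPolicySketch policy = new RetryPolicySketch(false, 5000, 10000);
        while (policy.isRetryNeeded()) {
            System.out.println("retry in " + policy.getRetryTimer() + " ms");
            policy.increaseRetryCount();
        }
    }
}
```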

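3. No new packets received for a sustained period

After the data connection is established, DcTracker periodically checks the TX/RX counters. If packets keep being sent while RX does not advance, and the number of packets sent since the last receive reaches the configured threshold, a recovery action is triggered.



First look at onDataStallAlarm, which is triggered periodically by an alarm and performs these steps:

Update the TX/RX counters -> decide whether recovery is needed and run it -> re-arm the alarm for the next check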

protected void onDataStallAlarm(int tag) {
        if (mDataStallAlarmTag != tag) {
            if (DBG) {
                log("onDataStallAlarm: ignore, tag=" + tag + " expecting " + mDataStallAlarmTag);
            }
            return;
        }
        // Update mSentSinceLastRecv
        updateDataStallInfo();

        // The default is 10
        int hangWatchdogTrigger = Settings.Global.getInt(mResolver,
                Settings.Global.PDP_WATCHDOG_TRIGGER_PACKET_COUNT,
                NUMBER_SENT_PACKETS_OF_HANG);

        boolean suspectedStall = DATA_STALL_NOT_SUSPECTED;
        if (mSentSinceLastRecv >= hangWatchdogTrigger) {
            // No RX for a while and the sent count reached the watchdog threshold: recover
            suspectedStall = DATA_STALL_SUSPECTED;
            sendMessage(obtainMessage(DctConstants.EVENT_DO_RECOVERY));
        } else {
            if (VDBG_STALL) {
                log("onDataStallAlarm: tag=" + tag + " Sent " + String.valueOf(mSentSinceLastRecv) +
                    " pkts since last received, < watchdogTrigger=" + hangWatchdogTrigger);
            }
        }
        // Re-arm the alarm so that this method (onDataStallAlarm) runs again after an interval
        startDataStallAlarm(suspectedStall);
    }


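updateDataStallInfo() does the counting. It handles three main cases (when there is neither TX nor RX, nothing changes):

1. Both TX and RX -> normal; reset the counter and the recovery action (recovery actions are covered later)

2. TX but no RX -> abnormal; accumulate the TX packet count (only while the phone is idle; during a call the counter is reset)

3. No TX, only RX -> normal; reset the counter and the recovery action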

private void updateDataStallInfo() {
        long sent, received;
        
        TxRxSum preTxRxSum = new TxRxSum(mDataStallTxRxSum);
        mDataStallTxRxSum.updateTxRxSum();

        sent = mDataStallTxRxSum.txPkts - preTxRxSum.txPkts;
        received = mDataStallTxRxSum.rxPkts - preTxRxSum.rxPkts;

        // Traffic in both directions: reset the recovery action
        if ( sent > 0 && received > 0 ) {
            if (VDBG_STALL) log("updateDataStallInfo: IN/OUT");
            mSentSinceLastRecv = 0;
            putRecoveryAction(RecoveryAction.GET_DATA_CALL_LIST);
        } else if (sent > 0 && received == 0) {

            // No RX; accumulate this interval's sent packets unless a call is in progress
            if (isPhoneStateIdle()) {
                mSentSinceLastRecv += sent;
            } else {
                mSentSinceLastRecv = 0;
            }

        // Nothing sent but data received: reset the recovery action
        } else if (sent == 0 && received > 0) {
            if (VDBG_STALL) log("updateDataStallInfo: IN");
            mSentSinceLastRecv = 0;
            putRecoveryAction(RecoveryAction.GET_DATA_CALL_LIST);
        } else {
            if (VDBG_STALL) log("updateDataStallInfo: NONE");
        }
    }
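
The counting rules and the threshold check from onDataStallAlarm fit into one small class. This is a sketch under stated assumptions: StallCounterSketch and onSample are hypothetical names, the packet deltas are passed in instead of being read from TrafficStats, and the threshold is fixed at 10, the default mentioned above.

```java
/** Hypothetical illustration of the data-stall counter; not the DcTracker code. */
final class StallCounterSketch {
    static final int HANG_TRIGGER = 10;   // default watchdog threshold from the text

    private long sentSinceLastRecv;

    /**
     * Feeds one sampling interval into the counter.
     *
     * @return true if a stall is suspected and recovery should run
     */
    boolean onSample(long sent, long received, boolean phoneIdle) {
        if (received > 0) {
            // Cases 1 and 3: something was received, so the link is alive.
            sentSinceLastRecv = 0;
        } else if (sent > 0) {
            // Case 2: sent without receiving; only counted while no call is active.
            sentSinceLastRecv = phoneIdle ? sentSinceLastRecv + sent : 0;
        }
        // sent == 0 && received == 0: leave the counter unchanged.
        return sentSinceLastRecv >= HANG_TRIGGER;
    }

    long sentSinceLastRecv() {
        return sentSinceLastRecv;
    }

    public static void main(String[] args) {
        StallCounterSketch counter = new StallCounterSketch();
        System.out.println(counter.onSample(4, 0, true));  // false, 4 packets pending
        System.out.println(counter.onSample(6, 0, true));  // true, threshold of 10 reached
        System.out.println(counter.onSample(0, 3, true));  // false, counter reset
    }
}
```

The real method also resets the recovery action whenever data is received; that part is omitted here to keep the counter logic visible.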

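The TX/RX counts (TCP packets, as the method names indicate) come from static methods of TrafficStats; native code sums them over all mobile interfaces and returns the result: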

public void updateTxRxSum() {
            this.txPkts = TrafficStats.getMobileTcpTxPackets();
            this.rxPkts = TrafficStats.getMobileTcpRxPackets();
        }

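Finally, look at how doRecovery restores the data connection.

doRecovery has five recovery actions, each with its own handling:
1. Actively query the modem for the DATA CALL LIST
2. Clean up the existing data connections
3. Re-register on the network
4. Restart the radio
5. Deep radio restart (according to Qualcomm's comment, this depends on the RIL design)

If the connection is still broken after one action runs, the next action is tried. The order works like a circular linked list, until the connection recovers and updateDataStallInfo() resets the action: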

 protected void doRecovery() {
        if (getOverallState() == DctConstants.State.CONNECTED) {
            // Go through a series of recovery steps, each action transitions to the next action
            int recoveryAction = getRecoveryAction();
            switch (recoveryAction) {
            case RecoveryAction.GET_DATA_CALL_LIST:
                mPhone.mCi.getDataCallList(obtainMessage(DctConstants.EVENT_DATA_STATE_CHANGED));
                putRecoveryAction(RecoveryAction.CLEANUP);
                break;
            case RecoveryAction.CLEANUP:
                cleanUpAllConnections(Phone.REASON_PDP_RESET);
                putRecoveryAction(RecoveryAction.REREGISTER);
                break;
            case RecoveryAction.REREGISTER:
                mPhone.getServiceStateTracker().reRegisterNetwork(null);
                putRecoveryAction(RecoveryAction.RADIO_RESTART);
                break;
            case RecoveryAction.RADIO_RESTART:
                putRecoveryAction(RecoveryAction.RADIO_RESTART_WITH_PROP);
                restartRadio();
                break;
            case RecoveryAction.RADIO_RESTART_WITH_PROP:
                // This is in case radio restart has not recovered the data.
                // It will set an additional "gsm.radioreset" property to tell
                // RIL or system to take further action.
                // The implementation of hard reset recovery action is up to OEM product.
                // Once RADIO_RESET property is consumed, it is expected to set back
                // to false by RIL.
                EventLog.writeEvent(EventLogTags.DATA_STALL_RECOVERY_RADIO_RESTART_WITH_PROP, -1);
                if (DBG) log("restarting radio with gsm.radioreset to true");
                SystemProperties.set(RADIO_RESET_PROPERTY, "true");
                // give 1 sec so property change can be notified.
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {}
                restartRadio();
                putRecoveryAction(RecoveryAction.GET_DATA_CALL_LIST);
                break;
            default:
                throw new RuntimeException("doRecovery: Invalid recoveryAction=" +
                    recoveryAction);
            }
            mSentSinceLastRecv = 0;
        }
    }
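
The escalation order in doRecovery can be modelled as a cyclic enum. This is a minimal sketch, not the DcTracker implementation: RecoveryCycleSketch and its methods are hypothetical, and the actual recovery work is replaced by returning the action that would run.

```java
/** Hypothetical illustration of the cyclic recovery-action escalation. */
final class RecoveryCycleSketch {
    enum Action {
        GET_DATA_CALL_LIST, CLEANUP, REREGISTER, RADIO_RESTART, RADIO_RESTART_WITH_PROP;

        /** Next action in the cycle; the last one wraps around to the first. */
        Action next() {
            Action[] all = values();
            return all[(ordinal() + 1) % all.length];
        }
    }

    private Action current = Action.GET_DATA_CALL_LIST;

    /** Returns the action to run now and escalates for the next stall. */
    Action doRecovery() {
        Action toRun = current;
        current = current.next();
        return toRun;
    }

    /** Called when traffic is seen again, like the reset in updateDataStallInfo(). */
    void onTrafficRecovered() {
        current = Action.GET_DATA_CALL_LIST;
    }

    public static void main(String[] args) {
        RecoveryCycleSketch recovery = new RecoveryCycleSketch();
        for (int i = 0; i < 6; i++) {
            System.out.println(recovery.doRecovery());
        }
    }
}
```

Running the loop six times prints the five actions in order and then GET_DATA_CALL_LIST again, which is the wrap-around the text compares to a circular linked list.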

