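This article walks through part of the source code of Piwik-SDK on Android by following a single flow: tracking a screen (page view) event.
The Piwik-SDK repository is on GitHub: https://github.com/piwik/piwik-sdk-android
Here is a complete call that tracks a screen event: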
TrackHelper.track()
        .screen("/custom_vars")
        .title("Custom Vars")
        .variable(1, "first", "var")
        .variable(2, "second", "long value")
        .with(getTracker());
First, the call to track() returns a TrackHelper object:
public static TrackHelper track() {
    return new TrackHelper();
}
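TrackHelper exposes a series of tracking methods such as screen() and event(). It also contains inner classes such as BaseEvent, which is used to submit the data, so the class does quite a lot.
Next, screen(). It simply creates a Screen object (Screen is also an inner class of TrackHelper) and stores its argument in the constructor. The super(baseBuilder) call hands the base TrackHelper to BaseEvent, which later uses it to obtain a baseTrackMe object; we will not go into that here.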
Screen(TrackHelper baseBuilder, String path) {
    super(baseBuilder);
    mPath = path;
}
Likewise, Screen's title() method just stores its argument:
public Screen title(String title) {
    mTitle = title;
    return this;
}
Now let's see what is actually done with these parameters:
public TrackMe build() {
    if (mPath == null) return null;
    final TrackMe trackMe = new TrackMe(getBaseTrackMe())
            .set(QueryParams.URL_PATH, mPath)
            .set(QueryParams.ACTION_NAME, mTitle);
    if (mCustomVariables.size() > 0) {
        //noinspection deprecation
        trackMe.set(QueryParams.SCREEN_SCOPE_CUSTOM_VARIABLES, mCustomVariables.toString());
    }
    for (Map.Entry<Integer, String> entry : mCustomDimensions.entrySet()) {
        CustomDimension.setDimension(trackMe, entry.getKey(), entry.getValue());
    }
    return trackMe;
}
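In Screen.build(), the path and title are stored in the HashMap held by a TrackMe object. If any custom variables were added (mCustomVariables is non-empty), they are stored as well, and each custom dimension is applied through CustomDimension.setDimension(). build() returns that TrackMe object, or null when no path was set. So when is build() called? In with(), the last method in the chain: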
public void with(@NonNull Tracker tracker) {
    TrackMe trackMe = build();
    if (trackMe != null) tracker.track(trackMe);
}
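The chain screen() → title() → with() is a fluent builder: each setter returns this, and with() builds the parameter map and hands it over only if build() succeeded. The following is a minimal sketch of that shape, not the SDK code. ScreenSketch, Sink, and the map keys "path" and "title" are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for TrackHelper.Screen; it only mirrors the builder shape.
final class ScreenSketch {
    // Hypothetical stand-in for Tracker: anything that accepts the built parameters.
    interface Sink { void track(Map<String, String> params); }

    private final String path;
    private String title;

    private ScreenSketch(String path) { this.path = path; }

    static ScreenSketch screen(String path) { return new ScreenSketch(path); }

    // Returning this is what makes the calls chainable.
    ScreenSketch title(String title) { this.title = title; return this; }

    // Like build() in the article: no path means nothing to track.
    Map<String, String> build() {
        if (path == null) return null;
        Map<String, String> params = new LinkedHashMap<>();
        params.put("path", path);    // placeholder key, not the real query parameter
        params.put("title", title);  // placeholder key, not the real query parameter
        return params;
    }

    // Like with(): build, then forward only a non-null result.
    void with(Sink sink) {
        Map<String, String> params = build();
        if (params != null) sink.track(params);
    }

    public static void main(String[] args) {
        ScreenSketch.screen("/custom_vars").title("Custom Vars")
                .with(params -> System.out.println(params));
    }
}
```

Because nothing happens until with() is called, a chain that never reaches with() tracks nothing, which is also true of the real TrackHelper call shown at the top of the article.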
Each call to with() therefore packs the parameters into a TrackMe object and hands it to the tracker. Uploading comes next, so let's look at Tracker.track():
public Tracker track(TrackMe trackMe) {
    boolean newSession;
    synchronized (mSessionLock) {
        newSession = tryNewSession();
        if (newSession) mSessionStartLatch = new CountDownLatch(1);
    }
    if (newSession) {
        injectInitialParams(trackMe);
    } else {
        try {
            // Another thread might be creating a sessions first transmission.
            mSessionStartLatch.await(getDispatchTimeout(), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) { Timber.tag(TAG).e(e, null); }
    }
    injectBaseParams(trackMe);
    mLastEvent = trackMe;
    if (!mOptOut) {
        mDispatcher.submit(trackMe);
        Timber.tag(LOGGER_TAG).d("Event added to the queue: %s", trackMe);
    } else Timber.tag(LOGGER_TAG).d("Event omitted due to opt out: %s", trackMe);
    // we did a first transmission, let the other through.
    if (newSession) mSessionStartLatch.countDown();
    return this;
}
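The session handling at the top of track() can be tried in isolation. The sketch below is a simplified model, not the SDK code: SessionGateSketch, enter(), and firstEventQueued() are hypothetical names, and it assumes the session check compares the time elapsed since the session start against a timeout, as the summary at the end of this article describes. The clock is passed in as a parameter so the behaviour is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the "first event of a session goes first" handshake.
final class SessionGateSketch {
    private final Object sessionLock = new Object();
    private final long sessionTimeoutMs;
    private long sessionStartMs = -1;                                   // -1: no session yet
    private CountDownLatch sessionStartLatch = new CountDownLatch(0);   // open by default

    SessionGateSketch(long sessionTimeoutMs) { this.sessionTimeoutMs = sessionTimeoutMs; }

    // Returns true if this call opened a new session. Otherwise it waits (at most
    // maxWaitMs) until the thread that opened the session has queued its event.
    boolean enter(long nowMs, long maxWaitMs) {
        boolean newSession;
        CountDownLatch latch;
        synchronized (sessionLock) {
            newSession = sessionStartMs < 0 || nowMs - sessionStartMs > sessionTimeoutMs;
            if (newSession) {
                sessionStartMs = nowMs;
                sessionStartLatch = new CountDownLatch(1);
            }
            latch = sessionStartLatch;
        }
        if (!newSession) {
            try {
                latch.await(maxWaitMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            }
        }
        return newSession;
    }

    // Called by the session-opening thread once its event is queued.
    void firstEventQueued() {
        synchronized (sessionLock) { sessionStartLatch.countDown(); }
    }

    public static void main(String[] args) {
        SessionGateSketch gate = new SessionGateSketch(30 * 60 * 1000L);
        System.out.println(gate.enter(0, 100));      // opens the session
        gate.firstEventQueued();
        System.out.println(gate.enter(1_000, 100));  // same session, latch already open
    }
}
```

The bounded await matters: if the session-opening thread never reaches the countDown, later events are delayed by at most the timeout instead of being blocked forever.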
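track() starts with a synchronized block in which tryNewSession() decides whether this event opens a new tracking session. If it does, the event receives the initial session parameters through injectInitialParams(). Otherwise the calling thread waits on mSessionStartLatch, for at most the dispatch timeout, until the thread that is sending the session's first event has queued it. Tracker provides setDispatchTimeout() to set that timeout. The most important part is this snippet: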
if (!mOptOut) {
    mDispatcher.submit(trackMe);
    Timber.tag(LOGGER_TAG).d("Event added to the queue: %s", trackMe);
} else Timber.tag(LOGGER_TAG).d("Event omitted due to opt out: %s", trackMe);
mOptOut is a flag that disables tracking. When tracking is not disabled, the event is passed to mDispatcher.submit(), which queues it for upload:
public void submit(TrackMe trackMe) {
    mEventCache.add(new Event(trackMe.toMap()));
    if (mDispatchInterval != -1) launch();
}
submit() first caches the event in mEventCache and then, unless mDispatchInterval is -1, calls launch():
private boolean launch() {
    synchronized (mThreadControl) {
        if (!mRunning) {
            mRunning = true;
            Thread thread = new Thread(mLoop);
            thread.setPriority(Thread.MIN_PRIORITY);
            thread.start();
            return true;
        }
    }
    return false;
}
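We are getting close. launch() uses a synchronized block to make sure only one worker runs at a time, then starts a minimum-priority thread that executes mLoop. Note that mLoop is a Runnable, not a thread. What does it do? It is defined in the same Dispatcher class: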
private Runnable mLoop = new Runnable() {
    @Override
    public void run() {
        while (mRunning) {
            try {
                // Either we wait the interval or forceDispatch() granted us one free pass
                mSleepToken.tryAcquire(mDispatchInterval, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {Timber.tag(LOGGER_TAG).e(e); }
            if (mEventCache.updateState(isConnected())) {
                int count = 0;
                List<Event> drainedEvents = new ArrayList<>();
                mEventCache.drainTo(drainedEvents);
                Timber.tag(LOGGER_TAG).d("Drained %s events.", drainedEvents.size());
                for (Packet packet : mPacketFactory.buildPackets(drainedEvents)) {
                    boolean success = false;
                    try {
                        success = dispatch(packet);
                    } catch (IOException e) {
                        // While rapidly dispatching, it's possible that we are connected, but can't resolve hostnames yet
                        // java.net.UnknownHostException: Unable to resolve host "...": No address associated with hostname
                        Timber.tag(LOGGER_TAG).d(e);
                    }
                    if (success) {
                        count += packet.getEventCount();
                    } else {
                        Timber.tag(LOGGER_TAG).d("Unsuccesful assuming OFFLINE, requeuing events.");
                        mEventCache.updateState(false);
                        mEventCache.requeue(drainedEvents.subList(count, drainedEvents.size()));
                        break;
                    }
                }
                Timber.tag(LOGGER_TAG).d("Dispatched %d events.", count);
            }
            synchronized (mThreadControl) {
                // We may be done or this was a forced dispatch
                if (mEventCache.isEmpty() || mDispatchInterval < 0) {
                    mRunning = false;
                    break;
                }
            }
        }
    }
};
On each pass of the loop, the worker first tries to acquire a permit, waiting at most mDispatchInterval milliseconds:
mSleepToken.tryAcquire(mDispatchInterval, TimeUnit.MILLISECONDS);
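Using a semaphore as a "sleep token" gives the loop an interruptible sleep: it either waits out the whole interval or is woken early when someone releases a permit, which is what the code comment about forceDispatch() refers to. Here is a minimal sketch of that idea with java.util.concurrent.Semaphore. SleepTokenSketch and waitForNextRound() are hypothetical names, not SDK API.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a dispatch loop's "sleep until the interval elapses or we are woken".
final class SleepTokenSketch {
    private final Semaphore sleepToken = new Semaphore(0); // no permits: tryAcquire must wait

    // Returns true if woken early by forceDispatch(), false if the interval simply elapsed.
    boolean waitForNextRound(long intervalMs) {
        try {
            // With a timeout <= 0, tryAcquire does not wait at all.
            return sleepToken.tryAcquire(intervalMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Grants one "free pass": the next wait returns immediately.
    void forceDispatch() { sleepToken.release(); }

    public static void main(String[] args) {
        SleepTokenSketch token = new SleepTokenSketch();
        System.out.println(token.waitForNextRound(50));     // interval elapses
        token.forceDispatch();
        System.out.println(token.waitForNextRound(60_000)); // returns at once
    }
}
```

Compared with Thread.sleep(), the permit is not lost if it is released before the loop starts waiting, so a forced dispatch requested slightly too early still takes effect on the next pass.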
If the network is connected, the loop drains the cached events into drainedEvents and has the packet factory group them into Packet objects:
for (Packet packet : mPacketFactory.buildPackets(drainedEvents)) {}
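In the mLoop code above, count tracks how many events were delivered. On the first failed packet, everything from index count onward is put back into the cache with drainedEvents.subList(count, drainedEvents.size()). The sketch below isolates that bookkeeping. RequeueSketch and sendAll() are hypothetical names, the page size is a plain parameter, and the sender is a predicate standing in for dispatch().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of "send batches in order, requeue everything after the first failure".
final class RequeueSketch {
    // Returns the events that were NOT delivered and therefore need to be requeued.
    static <E> List<E> sendAll(List<E> drained, int pageSize, Predicate<List<E>> sender) {
        int delivered = 0;
        for (int i = 0; i < drained.size(); i += pageSize) {
            List<E> batch = drained.subList(i, Math.min(i + pageSize, drained.size()));
            if (sender.test(batch)) {
                delivered += batch.size();
            } else {
                break; // assume offline: stop sending, keep the rest
            }
        }
        // Copy so the result does not depend on the drained list staying unchanged.
        return new ArrayList<>(drained.subList(delivered, drained.size()));
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The second batch fails, so events 3, 4 and 5 come back for requeuing.
        List<Integer> left = sendAll(Arrays.asList(1, 2, 3, 4, 5), 2, batch -> ++calls[0] < 2);
        System.out.println(left);
    }
}
```

Because batches are sent in order and sending stops at the first failure, a single counter is enough to separate delivered events from pending ones.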
Stepping into the buildPackets() factory method shows how the packets are created:
public List<Packet> buildPackets(@NonNull final List<Event> events) {
    if (events.isEmpty()) return Collections.emptyList();
    if (events.size() == 1) {
        Packet p = buildPacketForGet(events.get(0));
        if (p == null) return Collections.emptyList();
        else return Collections.singletonList(p);
    }
    int packets = (int) Math.ceil(events.size() * 1.0 / PAGE_SIZE);
    List<Packet> freshPackets = new ArrayList<>(packets);
    for (int i = 0; i < events.size(); i += PAGE_SIZE) {
        List<Event> batch = events.subList(i, Math.min(i + PAGE_SIZE, events.size()));
        final Packet packet;
        if (batch.size() == 1) packet = buildPacketForGet(batch.get(0));
        else packet = buildPacketForPost(batch);
        if (packet != null) freshPackets.add(packet);
    }
    return freshPackets;
}
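As the code shows, the request method depends on how many events a packet carries, not on their size: a packet with a single event is sent with GET, while a batch of up to PAGE_SIZE events is sent with POST. The method always returns a List<Packet>, which may be an empty list, a singleton list, or an ArrayList.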
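To make the paging rule concrete, the following sketch computes only the plan, that is, which request type each page would use, for a given number of events. BatchingSketch and plan() are hypothetical names, and the page size is passed in as a parameter rather than taken from the SDK's PAGE_SIZE constant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the paging rule: one page per pageSize events,
// GET for a page holding a single event, POST for anything larger.
final class BatchingSketch {
    static List<String> plan(int eventCount, int pageSize) {
        List<String> pages = new ArrayList<>();
        for (int i = 0; i < eventCount; i += pageSize) {
            int size = Math.min(pageSize, eventCount - i);
            pages.add((size == 1 ? "GET x" : "POST x") + size);
        }
        return pages;
    }

    public static void main(String[] args) {
        // 41 events with pages of 20: two full POST pages and one trailing GET.
        System.out.println(plan(41, 20));
    }
}
```

The trailing page is the interesting case: when the event count leaves a remainder of one, the last packet falls back to GET even though the earlier packets in the same pass used POST.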
Back in mLoop, each Packet is then uploaded and the result of the upload is returned:
success = dispatch(packet);
dispatch() mainly uses HttpURLConnection to send the network request:
public boolean dispatch(@NonNull Packet packet) throws IOException {
    if (mDryRunTarget != null) {
        mDryRunTarget.add(packet);
        Timber.tag(LOGGER_TAG).d("DryRun, stored HttpRequest, now %s.", mDryRunTarget.size());
        return true;
    }
    HttpURLConnection urlConnection = null;
    try {
        urlConnection = (HttpURLConnection) packet.openConnection();
        urlConnection.setConnectTimeout(mTimeOut);
        urlConnection.setReadTimeout(mTimeOut);
        // IF there is json data we have to do a post
        if (packet.getPostData() != null) { // POST
            urlConnection.setDoOutput(true); // Forces post
            urlConnection.setRequestProperty("Content-Type", "application/json");
            urlConnection.setRequestProperty("charset", "utf-8");
            final String toPost = packet.getPostData().toString();
            if (mDispatchGzipped) {
                urlConnection.addRequestProperty("Content-Encoding", "gzip");
                ByteArrayOutputStream byteArrayOS = new ByteArrayOutputStream();
                GZIPOutputStream gzipStream = null;
                try {
                    gzipStream = new GZIPOutputStream(byteArrayOS);
                    gzipStream.write(toPost.getBytes(Charset.forName("UTF8")));
                } finally { if (gzipStream != null) gzipStream.close();}
                urlConnection.getOutputStream().write(byteArrayOS.toByteArray());
            } else {
                BufferedWriter writer = null;
                try {
                    writer = new BufferedWriter(new OutputStreamWriter(urlConnection.getOutputStream(), "UTF-8"));
                    writer.write(toPost);
                } finally { if (writer != null) writer.close(); }
            }
        } else { // GET
            urlConnection.setDoOutput(false); // Defaults to false, but for readability
        }
        int statusCode = urlConnection.getResponseCode();
        Timber.tag(LOGGER_TAG).d("status code %s", statusCode);
        return checkResponseCode(statusCode);
    } finally {
        if (urlConnection != null) urlConnection.disconnect();
    }
}
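The gzip branch of dispatch() can be tried without any network: compress the POST body into a byte array (closing the GZIPOutputStream before reading the bytes, as the original does in its finally block), then decompress it to confirm the round trip. The sketch below uses only java.util.zip. GzipBodySketch and the sample body are illustrative and not taken from the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper showing how a request body is gzip-compressed in memory.
final class GzipBodySketch {
    static byte[] gzip(String body) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer; only then is the array complete.
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(body.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) out.write(buffer, 0, read);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String body = "{\"example\":\"payload\"}"; // made-up body for the demo
        byte[] compressed = gzip(body);
        System.out.println(gunzip(compressed).equals(body));
    }
}
```

For very small bodies the gzip header and trailer can make the compressed form larger than the input, which is one reason compression is usually applied to batched POST bodies rather than to single short requests.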
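In the dispatch thread, the interval mDispatchInterval defaults to 120 s, and the Dispatcher class provides setDispatchInterval() to change it. Piwik wraps the TrackMe object described above in an Event and caches it in mEventCache. The dispatch thread then drains the cache, groups the events into Packet objects, and uploads the packets one by one with HttpURLConnection.
The connection timeout defaults to 5 s, and the Tracker class provides setDispatchTimeout() to change it. It is worth noting that the mTimeOut passed to urlConnection.setConnectTimeout(mTimeOut) holds the same value that setDispatchTimeout() sets, so that method controls the connection timeout as well (and, as the dispatch() code shows, the read timeout).
In the Tracker class, Piwik subtracts the session start time from the current system time to decide whether a new session has begun, and provides setSessionTimeout() to set the session timeout; mSessionTimeout defaults to 30 min. In short, Piwik uploads all pending events in one go at a fixed interval (mDispatchInterval), and the length of each session is governed by mSessionTimeout.
That is the rough flow of a tracking call: initialization, storing the data, and uploading it. There are many operations I have not looked into in depth; anyone interested is welcome to discuss.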