A Threshold Autoregressive Model for Software Aging


 

Xiu-E Chen, Quan Quan, Yun-Fei Jia and Kai-Yuan Cai

Department of Automatic Control

Beijing University of Aeronautics and Astronautics

Beijing 100083, China

xiuechen@asee.buaa.edu.cn  

xiuechen@gmail.com

Abstract

 

Long-running software systems, such as client-server systems, are known to experience software aging, in which errors accumulated during execution lead to performance degradation and eventually to failure. To study and counteract this phenomenon, we collect and log data on several resource usage and activity parameters of a web server. Based on the experimental results, we argue that the software aging process can be divided into four stages: a robust stage, a transition stage, a failure-probable stage and a failure stage. A non-linear threshold autoregressive (TAR) model is then proposed to model and forecast resource usage in the respective stages. Compared with a single AR model, the TAR model fits the measured data better and yields a lower total AIC.

 

1. Introduction

 

The phenomenon of “software aging” in which the state of the software degrades with time has been reported in recent literature [1, 2, 3].

Software research can be classified into theoretical, experimental, and engineering research [5]. Current studies of software aging fall into two classes: model-based and measurement-based. Measurement-based approaches primarily validate the existence of software aging by measuring various parameters of a computer system. Ref. [2] monitors various parameters of a server and then constructs an ARX model of the resource parameters to estimate the time to resource exhaustion due to software aging. In [3], fractal analysis is applied online to detect the trend of performance degradation.

Compared with previous work, the main contributions of this paper are as follows. First, we adopt and improve the experimental platform of [2]: three client machines instead of one are used to simulate concurrent real-world requests. Second, a non-linear TAR model is proposed to analyze and forecast resource usage in the different system stages. Compared with previously developed linear models, a non-linear model better captures the highly irregular and nonlinear nature of the monitored resource usage. The TAR model also describes resource usage in the respective stages of the software aging process more effectively.

 

2. Experiments

 

2.1. Experimental setup

 

The components and the system structure are illustrated in Figure 1.

 

Figure 1. Experimental setup

 

Unlike [2], the system uses three clients instead of one, both to simulate concurrent real-world requests and to accelerate the process of software aging.

Among all the parameters monitored by the top utility and httperf [4], we concentrate on six:

1. loadavg - the average CPU load on the server.

2. phymemfree - the amount of free physical memory.

3. buffers - the amount of memory used for buffers.

4. cache - the amount of memory used as page cache.

5. swapused - the amount of used swap space.

6. replytime - the interval between the time httperf sends out the first byte of a request and the time it receives the first byte of the reply.

 

2.2. Experiment I: variable workload          

 

In Experiment I, we imposed variable workload on the web server. The connection rate generated by httperf varied between 30 requests/sec and 960 requests/sec.

Figure 2. Connection rate in Experiment I

Figure 3. Test results in Experiment I, panels (a) and (b)

 

From the plots in Figure 3, one can conclude that resource usage does reflect the performance of the server. The coefficients of correlation between the performance index replytime and other parameters monitored are shown in Table 1.

Table 1. Coefficients of correlation between response time (replytime) and the other monitored parameters

Parameter      Correlation with replytime
loadavg         0.99
phymemfree     -0.08
buffers        -0.25
cache          -0.36
swapused        0.19
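The entries in Table 1 are correlation coefficients between replytime and each monitored parameter. The sketch below shows how such a coefficient could be computed, assuming the ordinary sample (Pearson) coefficient is meant. The function name `correlation` and the sample values are illustrative assumptions, not the measured data from Experiment I.

```python
import math

def correlation(x, y):
    """Sample (Pearson) correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(xc, yc))
    var_x = sum(a * a for a in xc)
    var_y = sum(b * b for b in yc)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical samples (NOT the measured data from Experiment I).
replytime = [1.0, 1.2, 2.5, 4.0, 7.5]
loadavg = [0.5, 0.7, 1.9, 3.2, 6.1]
r = correlation(replytime, loadavg)
```

A value of r close to 1, as reported for loadavg, means the two series rise and fall together almost linearly.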

 

2.3. Experiment II: long duration

 

In Experiment II, the web server was run without rejuvenation for a long time until it crashed. Each of the clients generated 270 requests per second to get an html file from the server.

Figure 4 shows that, after running reliably for a period of time (referred to as the base longevity interval in [1]), the server undergoes sudden performance degradation, with replytime increasing abruptly. Performance then degrades quickly and the server enters a failure-probable stage.

Other long duration experiments showed the same characteristic of abrupt performance degradation as in Experiment II.

Figure 4. Test results in Experiment II

 

Other runs of this experiment also validate the existence of software aging, in which the state of software degrades with time. Instead of being a gradual process, software aging observed in our experiments is an abrupt one.

 

3. Modeling and Data Analysis

 

Table 1 shows that the average CPU load (loadavg) correlates strongly with system performance, so the average CPU load on the server is our modeling target. The data collected in Experiment II are used for modeling.

 

3.1. Software aging process

 

Based on the experimental results, we conjecture that the process of software aging could be divided into four stages:

1. A highly robust stage Sr

2. A transition stage St

3. A failure probable stage Sp

4. A failure stage SF

The probabilistic stage transition diagram is shown in Figure 5, where Pij(t), a function of elapsed time t, denotes the transition probability from stage Si to stage Sj.

Figure 5. Probabilistic stage transition model for system performance

 

When the application starts, it stays in the highly robust stage Sr, and Prr(t), Ptr(t) and Ppt(t) are nearly 1. These probabilities decrease with time and Prt(t) increases correspondingly, which leads the system into the transition stage St. The experimental results show that St is a relatively short-lived stage. The system then enters Sp. As PpF(t) increases, the system finally crashes.

 

3.2. Modeling

 

Since loadavg correlates strongly with response time, the variable loadavg can be deemed as an indicator of system stage. To simplify the process of modeling, two constants A and B are used in the paper to partition different system stages: when loadavg is below A, the system is at stage Sr; when loadavg is above B, the system stage is Sp; when loadavg is between A and B, the system is at a transition stage St.
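The partition rule above can be sketched as a small function. The default thresholds are the values A = 5 and B = 16 reported later in this section. The function name, the handling of samples exactly equal to A or B, and the sample readings are assumptions made for illustration.

```python
def classify_stage(loadavg, a=5.0, b=16.0):
    """Map one loadavg sample to a stage label using thresholds A and B."""
    if loadavg < a:
        return "Sr"  # highly robust stage
    if loadavg > b:
        return "Sp"  # failure-probable stage
    return "St"      # transition stage (assumed to include loadavg == A or B)

# Hypothetical loadavg readings, not measured data.
samples = [0.8, 4.9, 5.0, 12.3, 16.0, 16.1, 40.0]
stages = [classify_stage(x) for x in samples]
```

The failure stage SF is not assigned here, since it corresponds to a crash rather than to a loadavg range.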

To describe resource usage in the different stages, a three-regime threshold model with thresholds A and B is constructed:

   (1)
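For reference, a generic three-regime self-exciting TAR model that is consistent with the surrounding definitions (thresholds A and B, delay d, submodel orders p1, p2, p3, and loadavg as the series x_t) can be written as below. The coefficient symbols a_i^{(j)}, the intercept terms, the noise terms and the treatment of the boundary cases are notational assumptions, not taken from the paper.

```latex
x_t =
\begin{cases}
a_0^{(1)} + \sum_{i=1}^{p_1} a_i^{(1)} x_{t-i} + \varepsilon_t^{(1)}, & x_{t-d} < A \\
a_0^{(2)} + \sum_{i=1}^{p_2} a_i^{(2)} x_{t-i} + \varepsilon_t^{(2)}, & A \le x_{t-d} \le B \\
a_0^{(3)} + \sum_{i=1}^{p_3} a_i^{(3)} x_{t-i} + \varepsilon_t^{(3)}, & x_{t-d} > B
\end{cases}
```

Each branch is an ordinary AR model, and the delayed value x_{t-d} decides which branch is active at time t.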

The delay parameter d is determined by investigating the Partial Autocorrelation Function (PACF) of loadavg. The lag at which PACF has the peak value is selected as d.
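One way to carry out this selection is sketched below. It estimates the PACF with the Durbin-Levinson recursion and picks the lag with the largest absolute value. The recursion, the use of the absolute value, the function names and the synthetic AR(1) test series are all assumptions made for illustration; the paper does not state how the PACF was computed.

```python
import random

def autocorr(x, max_lag):
    """Sample autocorrelations rho[0..max_lag] of a series."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = autocorr(x, max_lag)
    prev = []   # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    out = []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def select_delay(x, max_lag):
    """Return the lag in 1..max_lag whose PACF has the largest magnitude."""
    values = pacf(x, max_lag)
    return max(range(1, max_lag + 1), key=lambda k: abs(values[k - 1]))

# Synthetic AR(1) series standing in for loadavg (not measured data).
rng = random.Random(0)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + rng.gauss(0.0, 1.0))
d = select_delay(series, max_lag=5)
```

For a series dominated by first-order dependence, as in this synthetic example, the PACF peaks at lag 1, which matches the value d = 1 reported for Experiment II.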

The optimal values of A, B and pj are selected by varying A and B over a selected range. For each pair of A and B, the optimum order pj of the jth submodel is the value p that yields the minimum value of the Akaike Information Criterion (AIC) statistic AIC(pj) [6]. The total AIC is computed as AICtotal = AIC(p1) + AIC(p2) + AIC(p3). This process is repeated over all subregions for each pair of A and B. The optimal values of A, B and pj are those that yield the minimum AICtotal. The process resulted in d = 1, A = 5, B = 16 and p1 = p2 = p3 = 2.
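The search described above can be sketched as follows, assuming NumPy is available. The least-squares fit with an intercept, the AIC form n*ln(RSS/n) + 2(p+1), the minimum regime size, the function names and the synthetic series are assumptions for illustration and are not the authors' implementation.

```python
import numpy as np

def fit_ar(x, targets, p):
    """Least-squares AR(p) fit with intercept on the time indices in `targets`.
    Returns the residual sum of squares."""
    X = np.array([[1.0] + [x[t - i] for i in range(1, p + 1)] for t in targets])
    y = x[targets]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def aic(rss, n, p):
    # Assumed AIC form for a Gaussian AR(p) submodel fitted on n points.
    return float(n * np.log(rss / n) + 2 * (p + 1))

def fit_tar(x, a, b, d, max_p):
    """Total AIC and per-regime orders for thresholds a < b, or None if a
    regime has too few points to fit."""
    ts = np.arange(max(max_p, d), len(x))
    z = x[ts - d]  # delayed value that selects the regime
    regimes = [ts[z < a], ts[(z >= a) & (z <= b)], ts[z > b]]
    total, orders = 0.0, []
    for idx in regimes:
        if len(idx) < 2 * (max_p + 1):
            return None
        best_aic, best_p = min(
            (aic(fit_ar(x, idx, p), len(idx), p), p) for p in range(1, max_p + 1)
        )
        total += best_aic
        orders.append(best_p)
    return total, orders

def grid_search(x, a_values, b_values, d=1, max_p=3):
    """Return (AICtotal, A, B, [p1, p2, p3]) with the smallest AICtotal."""
    x = np.asarray(x, dtype=float)
    best = None
    for a in a_values:
        for b in b_values:
            if a >= b:
                continue
            result = fit_tar(x, a, b, d, max_p)
            if result is not None and (best is None or result[0] < best[0]):
                best = (result[0], a, b, result[1])
    return best

# Synthetic three-level series standing in for loadavg (not measured data).
rng = np.random.default_rng(0)
series = np.concatenate([
    rng.normal(2.0, 0.5, 100),
    np.linspace(5.0, 16.0, 100) + rng.normal(0.0, 0.3, 100),
    rng.normal(25.0, 2.0, 100),
])
best = grid_search(series, a_values=[4.0, 5.0, 6.0], b_values=[15.0, 16.0, 17.0])
```

Because each regime is fitted independently, the total AIC is simply the sum of the per-regime minima, so the inner order search does not need to be repeated jointly across regimes.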

Figure 6. Measured and modeled average CPU load in Experiment II

 

3.3. Comparison with AR model

 

Table 2 shows that the TAR model is superior to the AR model in every subregion. The total AIC of the TAR model is also lower than that of the AR model.

In the second submodel, TAR reduces the sum of squared residuals by about 67%. Since the second region consists mostly of rapidly changing points, this supports the claim that TAR models respond more quickly to sudden changes than AR models.

The advantage of the TAR model over the AR model is that fitting a separate AR model within each threshold region is more accurate than fitting a single AR model to the whole series.


Table 2. Comparison of AR and TAR models

Model   Model poles                            Square sum of residues   AIC
AR      [0.177, 0.979]                                                  11.28
TAR     [0.017, 0.845]
        [-0.170, 1.217]
        [0.534 - 0.1516i, 0.534 + 0.1516i]                              10.71


 


From the poles of TAR model, one can conclude that:

a) The stage Sr is a stationary and robust process with both poles inside the unit circle.

b) The stage St is an unstable process with one pole outside the unit circle.

c) The stage Sp is an oscillation process.

Therefore, the TAR model also offers a better explanation of the data: its poles reveal the dynamic properties of each stage of the software aging process.
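The reasoning from poles to stage behavior can be checked with a short sketch. It takes the pole pairs listed in Table 2, rebuilds the AR(2) coefficients they imply, recomputes the poles as roots of z^2 - a1*z - a2 = 0, and labels each submodel. The function names, the labels and the mapping of pole pairs to stages Sr, St and Sp (inferred from items a to c above) are illustrative assumptions.

```python
import cmath

def ar2_poles(a1, a2):
    """Poles of x_t = a1*x_{t-1} + a2*x_{t-2} + e_t, i.e. roots of z^2 - a1*z - a2."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return ((a1 + disc) / 2.0, (a1 - disc) / 2.0)

def describe(poles, tol=1e-9):
    """Label a submodel from the location of its poles."""
    if any(abs(z) > 1.0 for z in poles):
        return "unstable"            # at least one pole outside the unit circle
    if any(abs(z.imag) > tol for z in poles):
        return "stable, oscillatory"  # complex poles inside the unit circle
    return "stable, non-oscillatory"

# Pole pairs as listed in Table 2, one pair per TAR submodel.
table2_poles = {
    "Sr": (0.017, 0.845),
    "St": (-0.170, 1.217),
    "Sp": (0.534 - 0.1516j, 0.534 + 0.1516j),
}

labels = {}
for stage, (z1, z2) in table2_poles.items():
    a1 = (z1 + z2).real       # sum of the poles
    a2 = (-(z1 * z2)).real    # minus the product of the poles
    labels[stage] = describe(ar2_poles(a1, a2))
```

The labels agree with the three observations: Sr has both poles inside the unit circle, St has a pole at 1.217 outside it, and Sp has a complex-conjugate pair, which produces oscillation.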

 

4. Conclusions

 

In our experiments, we monitor system resource usage and activity parameters on a web server to study the phenomenon of software aging. We find that the server tends to undergo sudden performance degradation after running reliably for a period. We therefore argue that the aging process can be divided into four stages: a highly robust stage Sr, a transition stage St, a failure-probable stage Sp and a failure stage SF. A non-linear TAR model is used to estimate and forecast resource usage in the respective stages. The results show that the TAR model is superior to the AR model in fitting the resource usage curve in the respective stages, and that it also describes the dynamic properties of the software aging process better. The TAR modeling process still needs to be refined as new experimental results become available. However, to our knowledge, this paper is the first study that attempts to use the threshold autoregressive method to model and forecast resource usage in the respective stages. The concept of a threshold is in fact commonplace in computer systems; for example, saturated load is the threshold of server capability above which the server behaves entirely differently. We therefore believe that bringing thresholds into hierarchical resource usage modeling will prove effective in future work.

We also note that the work presented in this paper falls within the scope of empirical studies of software rather than model-based studies. In a wider sense, this paper contributes to the emerging area of experimental softwarics [5], which is intended to be the software counterpart of experimental physics.

 

References

 

[1] Y. Huang, C. Kintala, N. Kolettis, and N.D. Fulton, “Software Rejuvenation: Analysis, Module and Applications,” Proc. The 25th International Symposium on Fault-Tolerant Computing, 1995, pp. 381-390.

[2] L. Li, K. Vaidyanathan, and K.S. Trivedi, “An Approach to Estimation of Software Aging in a Web Server,” Proc. International Symposium on Empirical Software Engineering, 2002, pp. 91-100.

[3] M. Shereshevsky, J. Crowell, B. Cukic, V. Gandikota, and Y. Liu, “Software Aging and Multifractality of Memory Resources,” Proc. The 2003 International Conference on Dependable Systems and Networks, 2003, pp. 721-730.

[4] D. Mosberger and T. Jin, “httperf-A Tool for Measuring Web Server Performance,” Proc. First Workshop on Internet Server Performance, 1998, pp. 59-67.

[5] K.Y. Cai, “Software Reliability Experimentation and Control,” Journal of Computer Science and Technology, Vol. 21, No. 5, 2006.

[6] M.B. Priestley, Non-linear and Non-stationary Time Series Analysis. Academic Press, 1989, pp. 73-77.

 
