Why most large-scale Web sites are not written in Java

During the past few weeks I've had discussions with my colleague Geva Perry trying to answer the question Why most large-scale Web sites are not written in Java?

There is a lot of information in the blogosphere describing the architecture of many popular sites, such as Google, Amazon, eBay, LinkedIn, TypePad, WikiPedia and others.

The folks at Pingdom compiled some of this information, based on information from High-Scalability:

Scalablewebarchitecture_4

 

 

 

 

Looking at these architectures some observations come to mind: Most of these sites are using LAMP as the core runtime stack. Some have gone so far as to develop their own file system (Google, GFS). Some are using caching to solve the database bottleneck (memcached and the like). Many of them were forced to develop these solutions themselves, as at the time there was no ready-made alternative that could meet their requirements.

The application stack of these Web applications is very different from the stack that mission-critical applications in the financial world are built with. In the financial world, Java -- and to a lesser degree J2EE -- is used extensively. In recent years scalability requirements in capital markets led to a rapid shift in the middleware stack, introducing Compute Grid solutions for virtualization of CPU resources, enabling parallelization of batch applications. Data Grids were also introduced, enabling the virtualization of memory resources. Spring is becoming the common development framework in this world. At GigaSpaces, we're seeing more and more cases where Spring acts as a complete alternative to J2EE.

If we examine both worlds, we can see that both are facing similar challenges related to scalability. Not surprisingly, both ended up introducing similar solutions for addressing the scalability challenges:

On the Data Tier we see the following:

1. Adding a caching layer to take advantage of memory resources availability and reduce I/O overhead
2. Moving from a database-centric approach to partitioning, aka shards  

On the Business Logic Tier:

3. Adding parallelization semantics to the application tier (e.g., MapReduce)
4. Moving to scale-out application models to achieve linear scalability
5. Moving away from the classic two-phase commit and XA for transaction processing  (See: Lessons from Pat Helland: Life Beyond Distributed Transactions)

While there are many similar challenges, and to a certain degree, similar architectures, it seems that both worlds (Web and Financial) took different routes as it relates to the application stack.

Over at the High-Scalability site, someone posted the question: Why doesn't anyone use j2ee?
The answer given in that post can be summarized as follows:

1. LAMP provides a cost-effective solution (most of it relies on *free* open source stack).
2. Java is still used, but not as the primary language, i.e., it is used as one component either in the back-end or the front-end (e.g., servlets).

I have my own thoughts on this matter, but I'll be very interested to see if anyone has any reasonable explanation for it, before I jump in.

Thoughts?

 
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值