The Importance of Perl
by Tim O'Reilly, O'Reilly & Associates, Inc. and Ben Smith, Ronin House

Source: Foundation of PerlChina (FPC)
Translator: klaus
Original title: The Importance of Perl
Original URL: http://www.perl.com/pub/a/oreilly/perl/news/importance_0498.html
Please respect the author's copyright and the translator's work.
Despite all the press attention to Java and ActiveX, the real job of "activating the Internet" belongs to Perl, a language that is all but invisible to the world of professional technology analysts but looms large in the mind of anyone -- webmaster, system administrator or programmer -- whose daily work involves building custom web applications or gluing together programs for purposes their designers had not quite foreseen. As Hassan Schroeder, Sun's first webmaster, remarked: "Perl is the duct tape of the Internet."

Perl was originally developed by Larry Wall as a scripting language for UNIX, aiming to blend the ease of use of the UNIX shell with the power and flexibility of a system programming language like C. Perl quickly became the language of choice for UNIX system administrators.

With the advent of the World Wide Web, Perl usage exploded. The Common Gateway Interface (CGI) provided a simple mechanism for passing data from a web server to another program, and returning the result of that program interaction as a web page. Perl quickly became the dominant language for CGI programming.

With the development of a powerful Win32 port, Perl has also made significant inroads as a scripting language for NT, especially in the areas of system administration and web site management and programming.

For a while, the prevailing wisdom among analysts was that CGI programs -- and Perl along with them -- would soon be replaced by Java, ActiveX and other new technologies designed specifically for the Internet. Surprisingly, though, Perl has continued to gain ground, with frameworks such as Microsoft's Active Server Pages (ASP) and the Apache web server's mod_perl allowing Perl programs to be run directly from the server, and interfaces such as DBI, the Perl DataBase Interface, providing a stable API for integration of back-end databases.

This paper explores some of the reasons why Perl will become increasingly important, not just for the web but as a general purpose computer language. These reasons include:

  • fundamental differences in the tasks best performed by scripting languages like Perl versus traditional system programming languages like Java, C++ or C.
  • Perl's ability to "glue together" other programs, or transform the output of one program so it can be used as input to another.
  • Perl's unparalleled ability to process text, using powerful features like regular expressions. This is especially important because of the re-emergence via the web of text files (HTML) as a lingua franca across all applications and systems.
  • The ability of a distributed development community to keep up with rapidly changing demands, in an organic, evolutionary manner.

A good scripting language is a high-level software development language that allows for quick and easy development of trivial tools while having the process flow and data organization necessary to also develop complex applications. It must be fast while executing. It must be efficient when calling system resources such as file operations, interprocess communications, and process control. A great scripting language runs on every popular operating system, is tuned for information processing (free-form text) and yet is excellent at data processing (numbers and raw, binary data). It is embeddable, and extensible. Perl fits all of these criteria.
When and Why a Scripting Language?

As John Ousterhout has elegantly argued in his paper, Scripting: Higher Level Programming for the 21st Century, "Scripting languages such as Perl and Tcl represent a very different style of programming than system programming languages such as C or Java. Scripting languages are designed for 'gluing' applications; they use typeless approaches to achieve a higher level of programming and more rapid application development than system programming languages. Increases in computer speed and changes in the application mix are making scripting languages more and more important for applications of the future."

Ousterhout goes on: As we near the end of the 20th century a fundamental change is occurring in the way people write computer programs. The change is a transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl. Although many people are participating in the change, few people realize that it is occurring and even fewer people know why it is happening....

Scripting languages are designed for different tasks than system programming languages, and this leads to fundamental differences in the languages. System programming languages were designed for building data structures and algorithms from scratch, starting from the most primitive computer elements such as words of memory. In contrast, scripting languages are designed for gluing: they assume the existence of a set of powerful components and are intended primarily for connecting components together. System programming languages are strongly typed to help manage complexity, while scripting languages are typeless to simplify connections between components and provide rapid application development.

Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960s have provided both kinds of languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages. These trends will continue over the next decade, with scripting languages used for more and more applications and system programming languages used primarily for creating components.

System administrators were among the first to capitalize on the power of scripting languages. The problems are everywhere, on every operating system. They usually appear as the requirement to automate repetitive tasks. Even Macintosh operating systems need some user-definable automation. It might be as simple as an automated backup and recovery system, or as complex as a periodic inventory of all the files on a disk, or all the system configuration changes in the last 24 hours. Many times, there are existing utilities that do part of the work, but automation requires a more general framework for running programs, capturing or transforming their output, and coordinating the work of multiple applications.

Most systems have included some form of scripting language. VMS's DCL, MS-DOS's .BAT files, UNIX's shell scripts, IBM's Rexx, Windows' Visual Basic and Visual Basic for Applications, and AppleScript are good examples of scripting languages that are specific to a single operating system. Perl is fairly unique in that it has broken the tight association with a single operating system and become widely used as a scripting language on multiple platforms.

Some scripting languages, most notably Perl and Visual Basic, and to a lesser extent Tcl and Python, have gained wide use as general purpose programming languages. Successful scripting languages distinguish themselves by the ease with which they call and execute operating system utilities and services. To reach the next level, and function as general purpose languages, they must be robust enough that you can build entire complex application programs. The scripting language is used to prototype, model, and test. If the scripting language is robust and fast enough, the prototype evolves directly into the application.

So why not use a general purpose programming language like C, C++ or Java instead of a scripting language? The answer is simple: Cost. Development time is more expensive than fast hardware and memory. Scripting languages are easy to learn, and simple to use.

As Ousterhout points out, scripting languages typically lack data types. They don't distinguish between integer and floating point numbers. Variables are typeless. This is one of the ways that scripting languages speed up development. The concept is to "leave the details for later." Since scripting languages are generally good at calling system utilities to do the dirty work, for instance, copying files and building directories or file folders, the details can be handled by some small utility that, if it doesn't exist and is necessary, will be easy to write in a compiled language.
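As a concrete illustration of this typeless style, here is a minimal Perl sketch (the values are invented): the same scalar can be treated as a number or a string, and the operator -- not a declaration -- decides which.

```perl
use strict;
use warnings;

# Perl scalars carry no declared type: the same variable can hold a number
# or a string, and conversion happens automatically based on the operator.
my $n = "42";                 # looks like a string...
my $sum = $n + 8;             # ...but numeric addition just works: 50
my $label = $n . " files";    # string concatenation: "42 files"

print "$sum\n";
print "$label\n";
```

The `+` operator forces numeric context and `.` forces string context, which is why no cast or conversion call appears anywhere.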
What do those data types do for compiled languages? They make memory management easier for the system, but harder for the programmer. Think about this: How much did a programmer make an hour when FORTRAN was on the ascendant? How much did memory cost then? How about now? Times have changed. Memory is cheap; programmers are expensive!

System languages need to have everything spelled out. This makes compilation of complex data structures easier, but programming harder. Scripting languages make as many assumptions as they can. As little as possible needs to be spelled out. This makes the scripting language easier to learn and faster to write in. The price to be paid is difficulty in developing complex data structures and algorithms. Perl, however, is good at both complex data structures and algorithms, without sacrificing ease of use for simple applications.
Interpreted vs. Compiled Languages

Most scripting languages are interpreted languages, which contributes to the perception that they may be inappropriate for large scale programming projects. This perception needs to be addressed.

With the exception of language-specific hardware, it is true that interpreted programs are slower than compiled languages. The advantage of interpreted languages is that programs written in that language are portable to any system that the interpreter will run on. The system-specific details are handled by the interpreter, not by the application program. (There are always exceptions to this rule. For example, the application program may explicitly use a non-portable system resource.)

Operating system command interpreters such as MS-DOS's command.com and early versions of the UNIX C shell are good examples of how interpreters work: each command line is fed to the interpreter as it occurs in the script. The worst blow to efficiency is in any looping; each line in the loop is reinterpreted every time it is run. Some people think that all scripting languages work like this... slowly, inefficiently, a line at a time. This is not true.

However, there are middle languages, languages that are compiled to some intermediate code which is loaded and run by an interpreter at run time. Java is an example of this model; this is what will make Java a valuable cross-platform application language. All the Java interpreters on different hardware will be able to communicate and share data and process resources. This is perfect for embedded systems, where each device is actually a different kind of special purpose hardware. Java is not a scripting language, however. It requires data declarations. It is compiled ahead of time (unless you count Just-In-Time compilation -- really just code generation -- as part of the process).

Perl is also a middle language. Blocks of Perl are compiled as needed, but the executable image is held in memory instead of written to a file. The compilation only happens once for any block of the Perl script. The advantages of Perl's design make all this optimization work worthwhile. Perl maintains the portability of an interpreted language while achieving nearly the speed of a compiled language. Perl, nearly a decade old, with hundreds of thousands of developers, and now in its fifth incarnation, runs lean and fast. There is some amount of startup latency, as the script is initially compiled, but this is typically small relative to the overall performance of the script. In addition, techniques such as "fast CGI", which keeps the image of a frequently accessed CGI script in memory for repetitive re-execution, avoid this startup latency, except on the very first execution of a script.

In any event, Perl 5.005 will include a compiler, created by Malcolm Beattie of Oxford University. The compiler eliminates the startup latency of in-process compilation, and adds some other small speed-ups as well. It also addresses the psychological barrier programmers of commercial applications sometimes experience with respect to interpreted languages. (With a compiled language, the source code is no longer available for inspection by outside parties.)
Information Processing versus Data Processing

The World Wide Web is only one instance of a fundamental change in how we interact with computers. This change is visible in the very name we now give the industry. It used to be called "Data Processing," as in "I'll have to submit my job to the data processing center at 4 AM so that I can pick up my output before noon." Now we call it "Information Services," as in "the Director of Information Services is working with our planning committee." The interest and emphasis is now on "information," not "data." It is clear there is more interest in information, which typically includes a mix of text and numeric data, rather than just data. Perl excels at handling information.

An important part of Perl's information-handling power comes from a special syntax called regular expressions. Regular expressions give Perl enormous power to perform actions based on patterns that it recognizes in a body of free-form text. Other languages support regular expressions as well (there is even a freeware regular expression library for Java), but no other language integrates them as well as Perl.
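A small sketch of the kind of pattern matching described here; the log line and the tag-stripping example are invented for illustration:

```perl
use strict;
use warnings;

# A hypothetical log line (not any standard log format):
my $line = 'host=www1 status=404 path=/images/logo.gif';

# Match and capture in one expression: $1 and $2 hold the captured text.
if ($line =~ /status=(\d+)\s+path=(\S+)/) {
    print "status: $1, path: $2\n";
}

# Substitution is just as terse: strip HTML-like tags from a string.
my $html = '<b>Hello</b>, <i>world</i>';
(my $plain = $html) =~ s/<[^>]*>//g;
print "$plain\n";    # prints: Hello, world
```

Because the match, capture, and substitution operators are built into the language rather than bolted on through a library, this style stays readable even in one-liners.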
For many years, the trend was to embed text in specialized application file formats. Except for UNIX, which explicitly specified ASCII text as a universal file format for exchange between cooperating programs, most systems allowed incompatible formats to proliferate. This trend was reversed sharply by the World Wide Web, whose HTML data format consists of ASCII text with embedded markup tags. Because of the importance of the web, HTML -- and ASCII text with it -- is now center stage as an interchange format, exported by virtually all applications. There are even plans by Microsoft to provide an HTML view of the desktop. A successor to HTML, XML (the eXtensible Markup Language), is widely expected to become a standard way of exchanging data in a mixed environment.

The increasing prominence of HTML plays directly to Perl's strengths. It is an ideal language for validating user input in HTML forms, for manipulating the contents of large collections of HTML files, or for extracting and analyzing data from voluminous log files.

That is only one side of the text processing power of Perl. Perl not only gives you several ways to pick data apart, but also several ways to glue data back together. Perl is thus ideal for taking apart an information stream and reconfiguring it. This can be done on the fly as a way of transforming information into input to other programs or for analysis and reporting.
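A minimal sketch of this take-apart-and-reassemble style, using split and join on an invented colon-delimited record (the format echoes /etc/passwd, but the record itself is made up):

```perl
use strict;
use warnings;

# Take a record apart with split, reconfigure it, glue it back with join.
my $record = 'klaus:x:1001:100:Klaus:/home/klaus:/bin/sh';
my @fields = split /:/, $record;

# Keep only login, uid, and shell, comma-separated -- a stream ready to be
# piped into another program or a report.
my $out = join ',', @fields[0, 2, 6];
print "$out\n";    # prints: klaus,1001,/bin/sh
```

The array slice `@fields[0, 2, 6]` is the "reconfiguring" step; any transformation (reordering, filtering, reformatting) can sit between the split and the join.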
One can argue that the next generation of computer applications will not be traditional software applications but "information applications," in which text forms a large percentage of the user interface. Consider the classic "intranet" web application: a human resources system through which employees can choose which mutual funds in which to invest their retirement savings, track the performance of their account, and access information that helps them to make better investment decisions. The interface to such a system consists of a series of informational documents (typically presented as HTML), a few simple forms-based CGI scripts, and links to back-end systems (which may be outside services accessed via the Internet) for real-time stock quotes.

To build an application like this using traditional software techniques would be impractical. Each company's mix of available investments is unique; the application would not justify the amount of traditional programming required for such a localized application. Using the web as a front end, and Perl scripts as a link to back-end databases, you are essentially able to create a custom application in a matter of hours.

Or consider Amazon.com, perhaps the most visibly successful new web business. Amazon provides an information front-end to a back-end database and order-entry system, with -- you guessed it -- Perl as a major component tying the two together.

Perl access to databases is supported by a powerful set of database-independent interfaces called DBI. Perl + fast-cgi + DBI is probably the most widely used "database connector" on the web. ODBC modules are also available.
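The DBI idiom might look like the following sketch. The DBD::SQLite driver, the in-memory database, and the table are assumptions chosen for illustration; DBI's whole point is that any DBD::* driver exposes the same calls, so only the connect string would change for another database.

```perl
use strict;
use warnings;

# Guard the example so it degrades gracefully where the modules are absent.
my $have_dbi = eval { require DBI; require DBD::SQLite; 1 } ? 1 : 0;

my $count;
if ($have_dbi) {
    # The driver name lives only in the DSN; the rest is driver-independent.
    my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                           { RaiseError => 1 });
    $dbh->do('CREATE TABLE funds (name TEXT, nav REAL)');
    my $sth = $dbh->prepare('INSERT INTO funds VALUES (?, ?)');
    $sth->execute('Index 500', 92.41);   # placeholders avoid quoting bugs
    ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM funds');
    print "rows: $count\n";
    $dbh->disconnect;
} else {
    print "DBI/DBD::SQLite not installed; skipping\n";
}
```

The prepare/execute pair with `?` placeholders is the standard DBI pattern for feeding form input safely into SQL.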
Put together Perl's power to handle text on the front end, and connect to databases on the back end, and you begin to understand why it will play an increasingly important role in the new generation of information applications.

Other applications of Perl's ability to recognize and manipulate text patterns include biomedical research and data mining. Any large text database, from the gene sequences analyzed by the Human Genome Project to the log files collected by any large web site, can be studied and manipulated by Perl. Finally, Perl is increasingly being used for applications such as network-enabled research and specialized Internet search applications. Its strength with regular expressions and its facility with sockets, the communications building block of the Internet, have made it the language of choice for building web robots, those programs that search the Internet for information.
Perl for Application Development

Developers are increasingly coming to realize Perl's value as an application development language. Perl makes it possible to realistically propose projects that would be unaffordable in the traditional system programming languages. Not only is it fast to build applications with Perl, but they can be very complex, even incorporating the best attributes of object-oriented programming if necessary.

It is easier to build socket-based client-server applications with Perl than with C or C++. It is more efficient to build free-text parsing applications in Perl than in any other language. Perl has a sophisticated debugger (written in Perl), and many options for building secure applications. There are publicly available Perl modules for every sort of application. These can be dynamically loaded as needed.
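A sketch of how small a socket-based client-server exchange can be in Perl, using the core IO::Socket::INET module; the one-word echo "protocol" is invented, and for brevity the server and client run as parent and forked child of a single script:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Parent listens on an ephemeral loopback port; a forked child connects.
# In C this would be dozens of lines of socket()/bind()/listen()/accept()
# bookkeeping; IO::Socket wraps all of it in one constructor call.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,              # let the OS pick a free port
    Listen    => 1,
    Proto     => 'tcp',
) or die "listen: $!";
my $port = $server->sockport;

my $reply;
my $pid = fork;
die "fork: $!" unless defined $pid;

if ($pid == 0) {                 # child: the client
    my $client = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "connect: $!";
    print $client "ping\n";
    exit 0;
}

my $conn = $server->accept;      # parent: the server
chomp($reply = <$conn>);
waitpid $pid, 0;
print "got: $reply\n";
```

A real server would loop over accept and typically fork per connection; this sketch handles exactly one exchange so it runs to completion.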
Perl can be easily extended with compiled functions written in C/C++ or even Java. This means that it is easy to include system services and functions that may not already be native to Perl. This is particularly valuable when working on non-UNIX platforms, since the special attributes of that operating system can be included in the Perl language.

Perl can also be called from compiled applications, or embedded into applications written in other languages. Efforts are underway, for instance, to create a standard way to incorporate Perl into Java, such that Java classes could be created with Perl implementations. Currently, such applications must embed the Perl interpreter. A new compiler back-end, to be available in the fourth quarter of 1997 in O'Reilly & Associates' Perl Resource Kit, will remove this obstacle, allowing some Perl applications to be compiled to Java byte-code.
Graphical Interfaces

Because it was originally developed for the UNIX environment, where the ASCII terminal was the primary input/output device (and even windowing systems such as X preserved the terminal model within individual windows), Perl doesn't define a native GUI interface. (But in today's fragmented GUI world this can be construed as a feature.) Instead, there are Perl extension modules for creating applications with graphical interfaces. The most widely used is Tk, which was originally developed as a graphical toolkit for the Tcl scripting language, but which was soon ported to Perl. Tcl is still specific to the X Window System, though it is currently being ported to Microsoft Windows.

However, as noted earlier, the development of native windowing applications is becoming less important as the web becomes the standard GUI for many applications. The "webtop" is fast replacing the "desktop" as the universal cross-platform application target. Write one web interface and it works on UNIX, Mac, Windows/NT, Windows/95... anything that has a web browser.

In fact, an increasing number of sites use Perl and the web to create new, easier-to-use interfaces to legacy applications. For example, the Purdue University Network Computing Hub provides a web-based front-end to more than thirty different circuit simulation tools, using Perl to interpret user input into web forms and transform it into command sequences for programs connected to the hub.
Multithreading

Threads are an attractive way to handle parallelism, especially in programs that do two-way communication or are event-driven. A multithreading patch for Perl appeared in early 1997; with the release of Perl 5.005 in the fourth quarter of 1997, threading will be folded into the standard distribution.

The multitasking model Perl has always supported is "fork" and "wait," with the process as the unit of scheduling -- a model that suits UNIX well. Windows/NT's threading mechanism is quite different, which currently limits Perl's portability in this area. The problem can be solved by introducing an abstraction layer between process control and the rest of the application, and work to reconcile the process-control interfaces of the UNIX and Win32 ports of Perl is underway, to be completed in the fourth quarter of 1997.
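The fork-and-wait model described above can be sketched in a few lines of portable Perl; the "work" each child does here is a stand-in, with the exit status standing in for a real result:

```perl
use strict;
use warnings;

# The traditional UNIX model Perl has always supported: fork one child
# process per task, then wait to collect the results.
my @pids;
for my $task (1 .. 3) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exit $task;          # child: exit status stands in for real work
    }
    push @pids, $pid;        # parent: remember the child
}

my $total = 0;
for my $pid (@pids) {
    waitpid $pid, 0;         # block until this child finishes
    $total += $? >> 8;       # high byte of $? holds the exit status
}
print "children done, statuses sum to $total\n";   # prints 6
```

The process, not the thread, is the scheduling unit here, which is exactly why this code is portable across UNIX systems but maps awkwardly onto NT's native threading.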
Perl on Win32

In 1996, Microsoft commissioned ActiveWare Internet Corp. (now ActiveState) to create a port of Perl to Win32 for the NT Resource Kit. That port is now ubiquitous on the net; reportedly, close to half of all Perl source downloads are destined for Win32 platforms.

There are many reasons for Perl's spread to Win32 platforms such as NT. Despite Visual Basic and Visual Basic for Applications, scripting support on Win32 remains comparatively weak. VB may be interpreted, but it is still a typed language and correspondingly verbose, and it lacks Perl's powerful string handling. Administrators building large NT sites quickly run up against the limits of graphical user interfaces: for managing hundreds of machines, a scripting language is a necessity.

It is also common for experienced system administrators to be asked to manage sites that do not run UNIX; Perl is a good way to carry the strengths of UNIX over to other systems.

Nor should the influence of the web be underestimated. With thousands of CGI programs and site-management tools on the net written in Perl, Perl support is a requirement for any server platform -- for Microsoft's NT servers no less than for O'Reilly's and Netscape's. ActiveState's PerlScript lets Perl run as an active scripting engine on NT web servers that support ASP, such as Microsoft's IIS and O'Reilly's WebSite.

Beyond the core Perl language interpreter, the ActiveState Perl for Win32 port includes modules aimed specifically at the Win32 environment; for example, it provides full support for automation objects. As more Windows system resources and components gain Perl interfaces, the Win32 port of Perl will be able to reach more and more of the system's functionality.
Extending the Power of Perl

Unlike Microsoft's Visual Basic or Sun's Java, Perl has no large corporation behind it. Perl was originally developed by Larry Wall and released as free software. Larry's continued development of Perl is carried out with the help of roughly two hundred collaborators on a mailing list called perl5-porters. The list was originally created to support porting Perl to other platforms, but it has ended up as the gathering place for the contributors who develop the Perl core.

Perl 5 added an extension mechanism by which independently developed modules can be dynamically loaded into a Perl program. This has led to the development of hundreds of add-on modules, many of the most important of which have since become part of the standard Perl distribution. Add-on modules are available from the Comprehensive Perl Archive Network (CPAN). Perhaps the best interface to CPAN is www.perl.com, which also carries book reviews, articles, and other information of interest to Perl programmers and users.

Old prejudices against using free software have been shattered by the recognition that many of the most significant advances in computing over the past years have come out of the free software community. The Internet itself is in large measure a collaborative free software project, its evolution guided by self-organized, far-sighted developers. Likewise Apache, which holds a large share of the web server market, is a free software project created, extended, and managed by a large community of cooperating developers.

Beyond ongoing development, the Perl community provides active technical support through newsgroups and mailing lists, and countless consulting and paid support offerings exist as well. Numerous books provide excellent documentation, most famously Programming Perl by Larry Wall, Randal Schwartz, and Tom Christiansen. The Perl Journal and www.perl.com carry news of the latest developments.

In short, thanks to its huge developer community and the free software tradition of collaboration, Perl has development and support resources that rival those of the largest companies.
Real-World Case Studies

The sections that follow contain real-world user stories, ranging from the quick-and-dirty "Perl saved the day" anecdotes familiar to so many system administrators to larger production applications. Some of the stories are drawn from the first annual Perl Conference, held August 19-21, 1997 in San Jose, CA; descriptions taken from the conference proceedings carry their authors' names.
Case 1 - The Language that Saved Netscape Technical Support
Dav Amann (dove@netscape.com)

OK, here's the situation. Your brand-new Internet company has taken off. You've sold more browsers, servers, and web applications than you ever imagined, the company is growing by leaps and bounds, and the latest market research says that within a year you'll have more than 300,000 customers.

The one nagging problem is that those 300,000 people who bought your browser may well run into trouble. They may not know where the web they're trying to get onto actually is; they may want help; they may want *you* to give them technical support.

When that happens, you think: "Fine, I'll write some technical articles and put them on the web." But as soon as you start, you discover that you need a content management system, a publishing system, some log analysis, and a way to collect and report the feedback users leave on your site -- and that you needed all of it yesterday.

Fortunately you know Perl, and with it you get the whole thing done in three months, using nothing but spare moments from four very busy technical support engineers.
Case 2 - A Quick-and-Dirty Conversion at the BYTE Site

BYTE magazine was about to upgrade its in-house messaging and conferencing system, BIX, which editors and readers use to exchange all kinds of information. The conferencing system was quite unlike Usenet and rather more like a mailing list, but many BYTE editors were used to Usenet, having subscribed to it for years. So BYTE built a gateway that turned its internal discussion groups into a Usenet-style system. The language was Perl; the job took a few days and fewer than a hundred lines of code.
Case 3 - Routing Customer Requests to the Right Expert

The benchmarking group at a world-leading computer company wanted to automate the routing of customer requests. They hoped to solve the problem with their intranet, but had no budget for it. Two engineers with only a few weeks of Perl experience did the job. A Perl script matches keywords in a query and routes the requester to the page of the expert they are looking for. The CGI program not only points the customer to the right expert's page and e-mail address, but automatically forwards the request to that expert. The solution took just a few weeks and saved a great deal of money.
Case 4 - Collecting and Analyzing E-mail Survey Results

An Internet market research company that conducts its surveys by e-mail wanted to automate the analysis of the ten thousand responses it received. Perl again came in handy: a Perl script generated input for SPSS -- though Perl itself could have done the statistics, had the statistician known Perl.
Case 5 - A Cross-Platform Benchmarking Suite

When SPEC (the Standard Performance Evaluation Corporation), an industry consortium that benchmarks computer systems, upgraded its suite from SPEC92 to SPEC95, it overhauled the harness that drives the benchmarks. SPEC wanted to make it relatively painless to run the suite on platforms other than UNIX. The SPEC92 suite was managed by UNIX shell scripts: unportable and unextensible. The SPEC95 suite uses a portable, extensible management engine written in Perl, making full use of Perl's object-oriented features, its extensibility in C, and its dynamic module loading. Porting SPEC95 to Windows/NT was easy; the main difficulty in porting to VMS is that VMS lacks a user-level fork.
Case 6 - A Business Consultant Working in Perl

Although I worked in C for many years, I find no reason to continue with it. Most of my work over the past decade has been acquiring, managing, and transforming information -- not just data. The applications I have helped develop are really information acquisition, management, and transformation systems with graphical interfaces. Perl is now better at this work than any other language, scripting or system programming. Though I first used Perl only as a glue and prototyping language, I now use it for everything. It has replaced my C and UNIX shell programming. There may still be occasions when I need C, but I expect Java eventually to cover those needs.

Cross-platform GUIs now work well in HTML, whether delivered locally, on an intranet, or over the Internet.

Perl gives me convenient data structures and interface modules to commercial databases. It gives me system-level tools for process control, file management, and interprocess communication (wherever sockets exist). It lets me build programs out of libraries, modules, packages, and subroutines. It even lets me write programs that modify themselves -- strange as that looks, it is sometimes necessary.

Perl's greatest benefit to me is that I can finish a complex task in a fifth of the time it used to take. That appeals to managers and clients alike, but most of all to whoever is paying for the work.

Case 7 - Perl as a Rapid Prototyping Language for Flight Data Analysis
Phil Brown, Center for Advanced Aviation System Development (CAASD), The MITRE Corporation (philsie@crete.mitre.org)

Thanks to its robustness and flexibility, Perl has become the tool many programmers at CAASD use for rapid prototyping of concept models. The Traffic Flow Management Laboratory (T-Lab) already runs hundreds of Perl programs, from simple data interfaces and plotting to measuring the complexity of airspace sectors and computing the flight time of aircraft crossing them. These programs range from 10 to 1,200 lines. Because so many of them are I/O-intensive, Perl's rich parsing and searching features make it the natural choice for the task.
Case 8 - Professional Printing Online

iPrint, the discount printing and web stationery store (http://www.iprint.com), offers a WYSIWYG web desktop-publishing application wired directly to back-end printers and built on a sophisticated real-time, multi-function product and pricing database. Customers come to the site to create, proof, and order custom printed items online: business cards, stationery, labels, stamps, and more, especially advertising materials.

The iPrint system comprises a front end (the web site) and back-end processes that eliminate all the manual prepress work of driving the printers and feed the iPrint accounting system all the information it needs. Of the system's nearly 80,000 lines of code, 95% is written in Perl v5.003 running on Windows NT 4.0. iPrint relies heavily on an RDBMS (SQL Server), and all interaction with the server is done through Perl and ODBC. iPrint uses many CPAN modules, including MIME and Win32::ODBC.
Case 9 - Amazon.com's Editorial Production System

Amazon.com used Perl to develop a CGI-based editorial production system that integrates the entire workflow: writing (in Microsoft Word or Emacs), maintenance (CVS version control plus glimpse-based searching), and output (through standard SGML tools).

An author starts with a CGI program that creates an SGML document: fill in a small form, and a partially completed SGML document appears in the author's home directory, which can also be mounted under Microsoft Windows. The author then completes the document in the editor of their choice. Through CGI programs, authors can review their changes ('cvs diff'), preview their SGML as HTML before submitting it ('cvs commit'), search the SGML repository by keyword (via glimpse), and track revision history ('cvs log'). Editors use CGI programs to build schedules as well.

Amazon.com built a base SGML-distillation class, then subclassed it to distill different parts of the site in different modes (HTML with or without images; someday perhaps PointCast, XML, braille, and so on).

All of the code is written in Perl, using the CGI and HTML::Parser modules.
Case 10 - A Specialized Print Server for a New England Hospital

A New England hospital system runs twelve operating systems, from mainframes down to personal computers, with seven different network protocols in use. There are nearly 12,000 PCs, 2,000 printers of a single standard model, and 1,000 specialized printers. The network spans the metropolitan area over microwave, T1, T3, and fiber links. The job was to make printing work across this network. The specialized printers, used to print patient registration and account information, hung off proprietary networks connected to IBM mainframes; the goal was to print those documents on standard printers using standard protocols.

After a search for a suitable, extensible print service, MIT Project Athena's Palladium looked like a good base to build on -- but it is a standalone print server, and the hospital needed a distributed one. Two months went into trying to port Palladium to the hospital's platforms and adapt it before it became clear that this was uneconomical. In the end we built our own system, with Perl at its core and Tcl/Tk for the administrative GUI. Palladium is 30,000 lines of source; our more capable distributed server took only 5,000 lines of Perl and four person-months to reach its first release. The Perl version ran fast enough on a 60 MHz Pentium running UNIX that there was no need to rewrite any of it in C.
Case 11 - The Purdue University Network Computing Hub

In the future, computing may well be delivered as a network-based service, organized much like today's electric power grid or telephone system. That model requires an underlying mechanism for reaching software and hardware resources over the network. To provide it, we developed a network-based virtual laboratory ("The Hub") that lets users access and run software on our servers from a web browser such as Netscape.

The Hub is a web-accessible collection of simulation tools and related information -- a highly modular system of close to 12,000 lines of Perl. Its components (a) provide the web-based user interface, (b) handle access control (security and privacy) and job control (run, abort, and program-status functions), and (c) support the logical (virtual) organization and management of resources. Through the Hub, users can (a) upload and manipulate input files, (b) run programs, and (c) view and download output files, all from a web browser. Internally, the Hub is a distributed collection of specialized daemons (written in Perl 5) that control local and remote software and hardware resources. Hardware resources can be any platform; software resources are the programs on that platform. (The current version does not yet support interactive or GUI-based programs.)

The Hub allows tools to be grouped by domain and cross-referenced. Resources are added to the system incrementally, using a language designed specifically to describe the characteristics of tools and hardware. A new machine, for example, can be added simply by describing its model, run modes, operating system, and so on. Likewise, a new tool is integrated by "telling" the Hub where it lives, how it takes input (e.g., its command-line syntax), which machines it can run on (e.g., a SPARC 5), and how it fits into the Hub (e.g., as a circuit simulation program). This typically takes under half an hour.

To make this work, the Hub parses URLs differently from a standard document-oriented web server. The URL structure is decoupled from the underlying file system and interpreted in a context-sensitive way (based on detailed per-user state kept on the server), which is how virtual accounts and access control are implemented. Lab engines make their high-performance computing resources available to the Hub on demand. When a user asks to run a program, the lab engine examines the user's input files to decide (through an AI subsystem, also written in Perl) which resources to use, picks an appropriate platform (a workstation for a 2-D problem, a supercomputer for a 3-D one), ships the relevant input files to that platform, and starts the program through a remote daemon. When the computation finishes, the remote daemon notifies the lab engine, which retrieves the output files and delivers them to the user.

The first prototype, the Semiconductor Simulation Hub, contains thirteen semiconductor technology tools from four universities. In under a year, more than 250 users have run over 13,000 simulations. Hubs for VLSI design, computer architecture, and parallel computing have been added in recent months, currently hosting about fourteen programs. These Hubs are now used in several undergraduate and graduate courses at Purdue and to support collaborative research; regular users include Purdue students and researchers across the U.S. and Europe.
---------------------------------------------------------------------------------------------------------------------------------------------------原文:

   http://www.oreillynet.com/pub/a/oreilly/perl/news/importance_0498.html

      

The Importance of Perl

by Tim O'Reilly, O'Reilly & Associates, Inc. and Ben Smith, Ronin House

Despite all the press attention to Java and ActiveX, the real job of "activating the Internet" belongs to Perl, a language that is all but invisible to the world of professional technology analysts but looms large in the mind of anyone -- webmaster, system administrator or programmer -- whose daily work involves building custom web applications or gluing together programs for purposes their designers had not quite foreseen. As Hassan Schroeder, Sun's first webmaster, remarked: "Perl is the duct tape of the Internet."

Perl was originally developed by Larry Wall as a scripting language for UNIX, aiming to blend the ease of use of the UNIX shell with the power and flexibility of a system programming language like C. Perl quickly became the language of choice for UNIX system administrators.

With the advent of the World Wide Web, Perl usage exploded. The Common Gateway Interface (CGI) provided a simple mechanism for passing data from a web server to another program, and returning the result of that program interaction as a web page. Perl quickly became the dominant language for CGI programming.

With the development of a powerful Win32 port, Perl has also made significant inroads as a scripting language for NT, especially in the areas of system administration and web site management and programming.

For a while, the prevailing wisdom among analysts was that CGI programs--and Perl along with them--would soon be replaced by Java, ActiveX and other new technologies designed specifically for the Internet. Surprisingly, though, Perl has continued to gain ground, with frameworks such as Microsoft's Active Server Pages (ASP) and the Apache web server's mod_perl allowing Perl programs to be run directly from the server, and interfaces such as DBI, the Perl DataBase Interface, providing a stable API for integration of back-end databases.

This paper explores some of the reasons why Perl will become increasingly important, not just for the web but as a general purpose computer language. These reasons include:

  • fundamental differences in the tasks best performed by scripting languages like Perl versus traditional system programming languages like Java, C++ or C.
  • Perl's ability to "glue together" other programs, or transform the output of one program so it can be used as input to another.
  • Perl's unparalleled ability to process text, using powerful features like regular expressions. This is especially important because of the re-emergence via the web of text files (HTML) as a lingua-franca across all applications and systems.
  • The ability of a distributed development community to keep up with rapidly changing demands, in an organic, evolutionary manner.

A good scripting language is a high-level software development language that allows for quick and easy development of trivial tools while having the process flow and data organization necessary to also develop complex applications. It must be fast while executing. It must be efficient when calling system resources such as file operations, interprocess communications, and process control. A great scripting language runs on every popular operating system, is tuned for information processing (free form text) and yet is excellent at data processing (numbers and raw, binary data). It is embeddable, and extensible. Perl fits all of these criteria.

When and Why a Scripting Language?
As John Ousterhout has elegantly argued in his paper, Scripting: Higher Level Programming for the 21st Century, "Scripting languages such as Perl and Tcl represent a very different style of programming than system programming languages such as C or Java. Scripting languages are designed for 'gluing' applications; they use typeless approaches to achieve a higher level of programming and more rapid application development than system programming languages. Increases in computer speed and changes in the application mix are making scripting languages more and more important for applications of the future."

Ousterhout goes on:

As we near the end of the 20th century a fundamental change is occurring in the way people write computer programs. The change is a transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl. Although many people are participating in the change, few people realize that it is occurring and even fewer people know why it is happening....

Scripting languages are designed for different tasks than system programming languages, and this leads to fundamental differences in the languages. System programming languages were designed for building data structures and algorithms from scratch, starting from the most primitive computer elements such as words of memory. In contrast, scripting languages are designed for gluing: they assume the existence of a set of powerful components and are intended primarily for connecting components together. System programming languages are strongly typed to help manage complexity, while scripting languages are typeless to simplify connections between components and provide rapid application development.

Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960's have provided both kinds of languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages. These trends will continue over the next decade, with scripting languages used for more and more applications and system programming languages used primarily for creating components.

System administrators were among the first to capitalize on the power of scripting languages. The problems are everywhere, on every operating system. They usually appear as the requirement to automate repetitive tasks. Even Macintosh operating systems need some user definable automation. It might be as simple as an automated backup and recovery system, or as complex as a periodic inventory of all the files on a disk, or all the system configuration changes in the last 24 hours. Many times, there are existing utilities that do part of the work, but automation requires a more general framework for running programs, capturing or transforming their output, and coordinating the work of multiple applications.

Most systems have included some form of scripting language. VMS's DCL, MS-DOS's .BAT files, UNIX's shell scripts, IBM's Rexx, Windows' Visual Basic and Visual Basic for Applications, and Applescript are good examples of scripting languages that are specific to a single operating system. Perl is fairly unique in that it has broken the tight association with a single operating system and become widely used as a scripting language on multiple platforms.

Some scripting languages, most notably Perl and Visual Basic, and to a lesser extent Tcl and Python, have gained wide use as general purpose programming languages. Successful scripting languages distinguish themselves by the ease with which they call and execute operating system utilities and services. To reach the next level, and function as general purpose languages, they must be robust enough that you can build entire complex application programs. The scripting language is used to prototype, model, and test. If the scripting language is robust and fast enough, the prototype evolves directly into the application.

So why not use a general purpose programming language like C, C++ or Java instead of a scripting language? The answer is simple: Cost. Development time is more expensive than fast hardware and memory. Scripting languages are easy to learn, and simple to use.

As Ousterhout points out, scripting languages typically lack data types. They don't distinguish between integer and floating point numbers. Variables are typeless. This is one of the ways that scripting languages speed up development. The concept is to "leave the details for later." Since scripting languages are generally good at calling system utilities to do the dirty work, for instance, copying files and building directories or file folders, the details can be handled by some small utility that, if it doesn't exist and is necessary, will be easy to write in a compiled language.

What do those data types do for compiled languages? They make memory management easier for the system, but harder for the programmer. Think about this: How much did a programmer make an hour when FORTRAN was on the ascendant? How much did memory cost then? How about now? Times have changed. Memory is cheap; programmers are expensive!

System languages need to have everything spelled out. This makes compilation of complex data structures easier, but programming harder. Scripting languages make as many assumptions as they can. As little as possible needs to be spelled out. This makes the scripting language easier to learn and faster to write in. The price to be paid is difficulty in developing complex data structures and algorithms. Perl, however, is good at both complex data structures and algorithms, without sacrificing ease of use for simple applications.

Interpreted vs. Compiled Languages

Most scripting languages are interpreted languages, which contributes to the perception that they may be inappropriate for large scale programming projects. This perception needs to be addressed.

With the exception of language specific hardware, it is true that interpreted programs are slower than compiled languages. The advantage of interpreted languages is that programs written in that language are portable to any system that the interpreter will run on. The system-specific details are handled by the interpreter, not by the application program. (There are always exceptions to this rule. For example, the application program may explicitly use a non-portable system resource.)

Operating system command interpreters such as MS-DOS's command.com and early versions of the UNIX C shell are good examples of how interpreters work: each command line is fed to the interpreter as it occurs in the script. The worst blow to efficiency is in any looping; each line in the loop is reinterpreted every time it is run. Some people think that all scripting languages work like this... slowly, inefficiently, a line at a time. This is not true.

However, there are middle languages, languages that are compiled to some intermediate code which is loaded and run by an interpreter at run time. Java is an example of this model; this is what will make Java a valuable a cross platform application language. All the Java interpreters on different hardware will be able to communicate and share data and process resources. This is perfect for embedded systems, where each device is actually a different kind of special purpose hardware. Java is not a scripting language, however. It requires data declarations. It is compiled ahead of time (unless you count Just-In-Time compilation -- really just code generation -- as part of the process).

Perl is also a middle language. Blocks of perl are compiled as needed, but the executable image is held in memory instead of written to a file. The compilation only happens once for any block of the perl script. The advantages of Perl's design make all this optimization work worth while. Perl maintains the portability of an interpreted language while achieving nearly the speed of a compiled language. Perl, nearly a decade old, with hundreds of thousands of developers, and now in its fifth incarnation, runs lean and fast. There is some amount of startup latency, as the script is initially compiled, but this is typically small relative to the overall performance of the script. In addition, techniques such as "fast CGI", which keeps the image of a frequently accessed CGI script in memory for repetitive re-execution, avoids this startup latency, except on the very first execution of a script.

In any event, Perl 5.005 will include a compiler, created by Malcolm Beattie of Oxford University. The compiler eliminates the startup latency of in-process compilation, and adds some other small speed-ups as well. It also addresses the psychological barrier programmers of commercial applications sometimes experience with respect to interpreted languages. (With a compiled language, the source code is no longer available for inspection by outside parties.)

 

Information Processing versus Data Processing

The World Wide Web is only one instance of a fundamental change in how we interact with computers. This change is visible in the very name we now give the industry. It used to be called "Data Processing," as in "I'll have to submit my job to the data processing center at 4 AM so that I can pick up my output before noon." Now we call it "Information Services" as in "the Director of Information Services is working with our planning committee." The interest and emphasis is now on "information" not "data." It is clear there is more interest in information, which typically includes a mix of text and numeric data, rather than just data. Perl excels at handling information.

An important part of Perl's information-handling power comes from a special syntax called regular expressions. Regular expressions give Perl enormous power to perform actions based on patterns that it recognizes in a body of free form text. Other languages support regular expressions as well (there is even a freeware regular expression library for Java), but no other language integrates them as well as Perl.

For many years, the trend was to embed text in specialized application file formats. Except for UNIX, which explicitly specified ASCII text as a universal file format for exchange between cooperating programs, most systems allowed incompatible formats to proliferate. This trend was reversed sharply by the World Wide Web, whose HTML data format consists of ASCII text with embedded markup tags. Because of the importance of the web, HTML -- and ASCII text with it -- is now center stage as an interchange format, exported by virtually all applications. There are even plans by Microsoft to provide an HTML view of the desktop. A successor to HTML, XML (eXtensible Markup Language) is widely expected to become a standard way of exchanging data in a mixed environment.

The increasing prominence of HTML plays directly to Perl's strengths. It is an ideal language for validating user input in HTML forms, for manipulating the contents of large collections of HTML files, or for extracting and analyzing data from voluminous log files.

That is only one side of the text processing power of Perl. Perl not only gives you several ways to pick data apart, but also several ways to glue data back together. Perl is thus ideal for taking apart an information stream and reconfiguring it. This can be done on the fly as a way of transforming information into input to other programs or for analysis and reporting.

One can argue that the next generation of computer applications will not be traditional software applications but "information applications", in which text forms a large percentage of the user interface. Consider the classic "Intranet" web application: a human resources system through which employees can choose which mutual funds in which to invest their retirement savings, track the performance of their account, and access information that helps them to make better investment decisions. The interface to such a system consists of a series of informational documents (typically presented as HTML), a few simple forms-based CGI scripts, and links to back-end systems (which may be outside services accessed via the Internet) for real-time stock quotes.

To build an application like this using traditional software techniques would be impractical. Each company's mix of available investments is unique; the application would not justify the amount of traditional programming required for such a localized application. Using the web as a front end, and perl scripts as a link to back end databases, you are essentially able to create a custom application in a matter of hours.

Or consider Amazon.com, perhaps the most visibly successful new web business. Amazon provides an information front-end to a back-end database and order-entry system, with, you guessed it, Perl, as a major component tying the two together.

Perl access to databases is supported by a powerful set of database-independent interfaces called DBI. Perl + fast-cgi + DBI is probably the most widely used "database connector" on the web. ODBC modules are also available.

Put together Perl's power to handle text on the front end, and connect to databases on the back end, and you begin to understand why it will play an increasingly important role in the new generation of information applications.

Other applications of Perl's ability to recognize and manipulate text patterns include biomedical research and data mining. Any large text database, from the gene sequences analyzed by the Human Genome Project to the log files collected by any large web site, can be studied and manipulated by Perl. Finally, Perl is increasingly being used for applications such as network-enabled research and specialized Internet search applications. Its strength with regular expressions and facility with sockets, the communications building block of the Internet, have made the language of choice for building Web robots, those programs that search the Internet for information.

 

Perl for Application Development

Developers are increasingly coming to realize Perl's value as an application development language. Perl makes it possible to realistically propose projects that would be unaffordable in the traditional system programming languages. Not only is it fast to build applications with Perl, but they can be very complex, even incorporating the best attributes of object-oriented programming if necessary.

It is easier to build socket-based client-server applications with Perl than with C or C++. It more efficient to build free text parsing applications in Perl than any other language. Perl has a sophisticated debugger (written in Perl), and many options for building secure applications. There are publicly available Perl modules for every sort of application. These can be dynamicly loaded as needed.

Perl can be easily extended with compiled functions written in C/C++ or even Java. This means that it is easy to include system services and functions that may not already be native to Perl. This is particularly valuable when working on non-UNIX platforms since the special attributes of that operating system can be included in the Perl language.

Perl can also be called from compiled applications, or embedded into applications written in other languages. Efforts are underway, for instance, to create a standard way to incorporate Perl into Java, such that Java classes could be created with Perl implementations. Currently, such applications must embed the Perl interpreter. A new compiler back-end, to be available in fourth quarter 1997 in O'Reilly & Associates' Perl Resource Kit, will remove this obstacle, allowing some Perl applications to be compiled to Java byte-code.

 

Graphical Interfaces

Because it was originally developed for the UNIX environment, where the ASCII terminal was the primary input/output device (and even windowing systems such as X preserved the terminal model within individual windows), Perl doesn't define a native GUI interface. (But in today's fragmented GUI world this can be construed as a feature.) Instead, there are Perl extension modules for creating applications with graphical interfaces. The most widely used is Tk, which was originally developed as a graphical toolkit for the Tcl scripting language, but which was soon ported to Perl. Tk is still specific to the X Window System, though it is currently being ported to Microsoft Windows.

However, as noted earlier, the development of native windowing applications is becoming less important as the web becomes the standard GUI for many applications. The "webtop" is fast replacing the "desktop" as the universal cross-platform application target. Write one Web interface and it works on UNIX, Mac, Windows/NT, Windows/95...anything that has a Web browser.

In fact, an increasing number of sites use Perl and the Web to create new easier-to-use interfaces to legacy applications. For example, the Purdue University Network Computing Hub provides a web-based front-end to more than thirty different circuit simulation tools, using Perl to interpret user input into web forms and transform it into command sequences for programs connected to the hub.
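The heart of such a web front-end is decoding a submitted form. Real applications use the CGI module for this, but a hand-rolled sketch (with a made-up query string standing in for the CGI environment) shows the idea:

```perl
use strict;
use warnings;

# Hypothetical QUERY_STRING from a web form submission.
my $query = "tool=spice&nodes=42&mode=dc";

# Decode name=value pairs into a parameter hash, as CGI.pm does internally.
my %param;
for my $pair (split /&/, $query) {
    my ($k, $v) = split /=/, $pair, 2;
    $v =~ tr/+/ /;                              # '+' encodes a space
    $v =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %XX escapes
    $param{$k} = $v;
}

print "$param{tool} $param{nodes}\n";   # spice 42
```

From here a front-end like the Purdue Hub's would translate the parameters into command sequences for the back-end tools.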

 

Multithreading

Threads are a desirable abstraction for doing multiple and concurrent processing, particularly if you are programming for duplex communications or event-driven applications. A multi-threading "patch" to Perl has been available since early 1997; it will be integrated into the standard distribution as of Perl version 5.005, in the fourth quarter.

The multitasking model that Perl has historically supported is "fork" and "wait." The granularity is the process. The flavor is UNIX. Unfortunately, the Windows/NT equivalent isn't quite the same. This is where the portability of Perl breaks down, at least for now. By building cross-platform multi-process Perl applications with a layer of abstraction between the process control and the rest of the application, the problems can be avoided. Furthermore, work is underway, to be completed in the fourth quarter of 1997, to reconcile the process-control code in the UNIX and Win32 ports of Perl.
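The fork-and-wait model described above looks like this in its simplest form (the child's exit status here is arbitrary):

```perl
use strict;
use warnings;

# Classic UNIX multitasking: fork a child process, then wait for it.
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child process: do some work, then exit with a status code.
    exit 7;
}

# Parent process: block until the child finishes.
waitpid($pid, 0);
print $? >> 8, "\n";   # 7  (the child's exit status)
```

It is exactly this idiom that behaves differently on Windows/NT, which is why the abstraction layer suggested above pays off.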

 

Perl on Win32 Systems

In 1996, Microsoft commissioned ActiveWare Internet Corporation (now ActiveState Tool Corp) to create a port of Perl to Win32 for inclusion in the NT Resource Kit. That port has since become widely available on the net, and reportedly, nearly half of all downloads of the Perl source code are for the Win32 platform.

Perl has taken off on Win32 platforms such as NT for several reasons. Despite the presence of Visual Basic and Visual Basic for Applications, native scripting support on Win32 is relatively weak. While VB is an interpreted scripting language, it is still a typed language, which makes it somewhat more cumbersome to use. It also lacks the advanced string-handling capabilities that are so powerful in Perl. As efforts are underway to create larger-scale NT sites, the limitations of Graphical User Interfaces quickly become evident to administrators; scripting is essential for managing hundreds or thousands of machines.

It is not insignificant that many of the experienced administrators being called on to manage those sites cut their teeth on UNIX. Using Perl is a good way to bring the best of UNIX with you to other platforms.

Nor should you underestimate the drawing power of the web. With thousands of Perl-based CGI programs and site management tools now available, Perl support is essential for any web server platform, including the NT-based servers from Microsoft, O'Reilly and Netscape that are becoming a more important part of the web. In particular, ActiveState's PerlScript(tm) implementation allows Perl to be used as an active scripting engine on NT web servers such as Microsoft's IIS and O'Reilly's WebSite that support the Active Server Pages (ASP) technology.

In addition to the core Perl language interpreter, the ActiveState Perl for Win32(tm) port includes modules specifically targeted to the Win32 environment. For example, it provides full access to Automation objects. As more and more system resources and components support that interface under Windows, more aspects of the operating system become directly accessible to Perl for Win32.

 

Extending the Power of Perl

Unlike languages such as Microsoft's Visual Basic or Sun's Java, Perl does not have a large corporation behind it. Perl was originally developed by Larry Wall and made available as freeware. Larry is assisted in the further development of Perl by a group of about 200 regular contributors who collaborate via a mailing list called perl5-porters. The list was originally focussed on porting Perl to additional platforms, but gradually became the center for those adding to the core language.

In addition, Perl 5 includes an extension mechanism, by which independent modules can be dynamically loaded into a Perl program. This has led to the development of hundreds of add-in modules. Many of the most important modules have become part of the standard Perl distribution; additional modules are available via the Comprehensive Perl Archive Network (CPAN). The best entry point to the CPAN is probably the www.perl.com site, which also includes book reviews, articles, and other information of interest to Perl programmers and users.

While there has been a historical bias against using freeware for mission critical applications, this bias is crumbling rapidly, as it becomes widely recognized that many of the most significant computing advances of the past few decades have been developed by the freeware community. The Internet itself was largely developed as a collaborative freeware project, and its further development is still guided by a self-organizing group of visionary developers. Similarly, the leading web server platform in terms of market share, by a large margin, is Apache--again, a free software project created, extended and managed by a large collaborative developer community.

In addition to ongoing development, the Perl community provides active support via newsgroups and mailing lists. There are also numerous consultancies and paid support organizations. Excellent documentation is provided by numerous books, most notably Programming Perl, by Larry Wall, Randal Schwartz and Tom Christiansen. The Perl Journal and www.perl.com provide information about the latest developments.

In short, because of the large developer base and the cooperative history of the freeware community, Perl has access to development and support resources matching those available to the largest corporations.

 

Application Stories

The following section includes a selection of user application stories, ranging from the quick and dirty "Perl saves the day" applications familiar to so many system administrators, to larger custom applications. Some of these application stories are taken from presentations at the first annual Perl Conference, held in San Jose, CA from August 19-21, 1997. The application descriptions from the conference proceedings are labeled with the names of their authors.

Case 1 - The Programming Language that Saved Netscape Technical Support
Dav Amann (dove@netscape.com)

Ok, so here's the situation. Your brand new exciting Internet company has taken off and you're selling more browsers, servers, and web applications than you ever hoped for; your company is growing by leaps and bounds; and the latest market information says that your customer base has just passed the 30 million mark in less than a year.

And the only downside is that these 30 million folks might have a few problems with their browser; they might not know exactly what the Internet is; they might want to call someone for support. They might want to call *you* for technical support.

So, when this happens, you might think, "That's ok, I'll just put some technical articles out on the web." But when you first look at the project, you realize that you're going to need some sort of Content Management System, some sort of Distribution System, some log analysis, and a way to gather and report feedback from your customers on your site. And you're going to want it yesterday.

Lucky for you, you know Perl. And with Perl you're able to get all of this built in 3 months in the spare time of 4 very busy technical support engineers.

Case 2 - A Quick and Dirty Conversion at BYTE

BYTE Magazine used to maintain its own information network and conferencing system, BIX, that both editors and readers used for exchanging ideas. The conferencing model was quite different from Usenet, somewhat closer to a mail-list. Since several of the BYTE editors were regular Usenet subscribers and preferred that model, BYTE built a gateway that translated and maintained the BIX editorial discussion groups as a private Usenet news group. The language was Perl. It took little more than a hundred lines of code and a few days of work.

Case 3 - Routing customer inquiries to appropriate experts

The performance testing group at one of the world's leading computer companies needed to automate query routing. They were directed to use their world-wide corporate Intranet, but were not given any budget for the project. Two engineers with only a few weeks of Perl experience created a solution. The Perl scripts routed each query by matching its key elements against the people with the relevant expertise. The CGI programs not only pointed the client to the experts' Web pages and E-mail addresses, but also forwarded the query to all appropriate experts by E-mail. The solution took no more than a few man-weeks and so could be absorbed into other budgets.

Case 4 - Collection and analysis of email survey data

An Internet market research firm that does its research using an E-mail survey wanted to automate and generalize the handling of the anticipated ten thousand responses. Perl was used to automate the process. The Perl script generated input for SPSS, but would have been capable of doing the statistical analysis itself if the statistician had known Perl.

Case 5 - A Cross-Platform Harness for Running Benchmarks

SPEC (the Standard Performance Evaluation Corporation), an industry consortium for benchmarking computer systems, radically changed its governing program when the SPEC92 benchmarks evolved into SPEC95. SPEC wanted to make it possible for its benchmarks to run on operating systems other than UNIX without a major effort. The SPEC92 benchmarks were managed by UNIX shell scripts, which were unportable and inflexible. The SPEC95 benchmarks are managed by a portable, extensible engine written in Perl. The scripts take advantage of Perl's object-oriented capabilities, Perl's extensibility with C, and Perl's dynamic module loading. Porting SPEC95 to Windows/NT was simple. The major problem with porting to VMS is its lack of user-level forks.

Case 6 - Consultant working with Perl

Despite the years that I have spent developing in C, I have found little reason to continue to do so. Most of my work in the last ten years has been developing code that retrieves, manages, and converts information, not just data. The application programs I am involved in are merely graphical controls front-ending information retrieval, management, and conversion engines. Perl now fills the need for this kind of development better than any other language--scripting or system programming language. Even though I started using Perl merely as a glue scripting language and prototyping language, I now use it for everything. It has replaced both C and my UNIX shell programs. There will be times, I am sure, that I will have to write, or at least patch, a program in C. I expect that Java will eventually fill those requirements for me.

Cross-platform GUI interfaces are now done in HTML and run locally, in an Intranet, or as part of the Web.

Perl provides me with fast indexing to simple data structures and modules for talking to commercial databases. It provides me with system level tools for process management, file management, and interprocess communications wherever sockets are understood. It allows me to design my applications using libraries, modules, packages, and subroutines. It allows me to write applications that modify themselves; scary as that may seem, it is sometimes necessary.

The greatest benefit of Perl to me is that I can build solutions to complex problems in a fifth the time. This appeals to managers and clients, but particularly to the people paying the bills.

Case 7 - Perl as a Rapid-Prototyping Language for Flight Data Analysis
Phil Brown, Mitre Corporation Center for Advanced Aviation System Development (CAASD) (philsie@crete.mitre.org)

Because of its robustness and flexibility, Perl has become the language of choice for many programmers in CAASD for developing rapid prototypes of concepts being explored. The Traffic Flow Management Lab (T-Lab) has implemented hundreds of Perl programs that range from simple data parsing and plot generation to measuring the complexity of regions of airspace and calculating the transit times of aircraft over those regions. The size of these applications ranges from about 10 lines to over 1200. Because many of the applications are very I/O intensive, Perl, with its many parsing and searching features, became the natural choice.

Case 8 - Online Specialty Printing
Dave Hodson (dave@iprint.com)

The iPrint Discount Printing & CyberStationery Shop (http://www.iPrint.com) is powered by a WYSIWYG desktop publishing application on the Internet, connected directly to a back-end printer and sitting on top of a sophisticated, real-time, multi-attribute product and pricing database. Customers come to our site to create, proof, and order customized printed items online: business cards, stationery, labels, stamps, specialty advertising items, and more.

The iPrint system includes both a front-end (the website) and a back-end process that eliminates nearly all of the manual pre-flight work that printers perform, and it also provides all pertinent information to iPrint's accounting system. 95% of the approximately 80,000 lines of code that perform this work is Perl v5.003 running on the Windows NT 4.0 OS. iPrint relies heavily on an RDBMS (SQL Server), with all database interaction performed by Perl and ODBC. iPrint uses many modules from the CPAN archives, including MIME and Win32::ODBC.

Case 9 - The Amazon.com Editorial Production System
Chris Mealy (mookie@amazon.com)

Amazon.com used Perl to develop a CGI-based editorial production system that integrates authoring (with Microsoft Word or Emacs), maintenance (version control with CVS and searching with glimpse), and output (with in-house SGML tools).

Writers use the CGI application to start an SGML document. They fill out a short form, and the application generates a partially completed SGML document in the user's home directory, which may be mounted on their Microsoft Windows PC. The writer then uses their favorite editor to finish the document. With the CGI application, users see changes ('cvs diff') and their SGML rendered as HTML before submitting their document ('cvs commit'). Writers can do keyword searches of the SGML repository (by way of glimpse) and track changes ('cvs log'). Editors can also schedule content with the CGI application.

Amazon.com created a base SGML renderer class that is sub-classed to render different sections of the web site in different modes (html with graphics and html without graphics, and in the future, PointCast, XML, braille, etc).

All of the code is in Perl. It uses the CGI and HTML::Parser modules.

Case 10 - Specialty Print Servers at a New England Hospital

A major New England hospital uses twelve operating systems, from mainframes to desktop PCs, and seven different network protocols. There are roughly twenty thousand PC workstations, two thousand printers of one standard type, and one thousand specialty printers. The network is spread over an entire city using microwave, T1, T3, and private optical fiber. The problem is network printing. Specialty printers are required because the patient registration and billing system runs on IBM and Digital mainframes, with the output going through their proprietary networks. The goal is to have all of the operating systems able to print to a standard printer through a standard protocol.

A search for appropriate scalable print servers uncovered MIT Project Athena's Palladium as a good starting point. However, its model of standalone print servers didn't fit; the hospital needed a distributed server model. When a two-month effort to port Palladium to the hospital platform, so that we could make the changes, proved uneconomical, we decided to build exactly what we wanted in fast prototyping languages: Perl for the core application and Tcl/Tk for the GUI administrative interface. Palladium represents 30,000 lines of C. The more complex distributed server model required only 5,000 lines of Perl and only four man-months to reach a first release. Perl proved sufficiently fast on a 60MHz Pentium running a UNIX variant that no code required rewriting in C.

Case 11 - The Purdue University Network-Computing Hub
(Nirav H. Kapadia, Mark S. Lundstrom, Jose' A. B. Fortes)

In the future, computing may operate on a network-based and service-oriented model much like today's electricity and telecommunications infrastructures. This vision requires an underlying infrastructure capable of accessing and using network-accessible software and hardware resources as and when required. To address this need, we have developed a network-based virtual laboratory ("The Hub") that allows users to access and run existing software tools via standard world-wide web (WWW) browsers such as Netscape.

The Hub, a WWW-accessible collection of simulation tools and related information, is a highly modular software system that consists of approximately 12,000 lines of Perl5 code. It has been designed to: a) have a universally-accessible user-interface (via WWW browsers), b) provide access-control (security and privacy) and job-control (run, abort, and program status functions), and c) support logical (virtual) resource-organization and management. The Hub allows users to: a) upload and manipulate input-files, b) run programs, and c) view and download output - all via standard WWW browsers. The infrastructure is a distributed entity that consists of a set of specialized servers (written in Perl5) which access and control local and remote hardware and software resources. Hardware resources include arbitrary platforms, and software resources include any program (the current implementation does not support interactive and GUI-based programs).

The Hub allows tools to be organized and cross-referenced according to their domain. Resources can be added incrementally using a resource-description language specifically designed to facilitate the specification of tool and machine characteristics. For example, a new machine can be incorporated into the Hub simply by specifying its architecture (make, model, operating system, etc.) and starting a server on the machine. Similarly, a new tool can be added by "telling" the Hub the tool's location, its input behavior (e.g., command-line arguments), what kinds of machines it can run on (e.g., Sparc5), and how it fits into the logical organization of the Hub (e.g., circuit simulation tool). Each of these tasks is typically accomplished in less than thirty minutes.

To facilitate this functionality, the Hub interprets the URLs differently from the standard document-oriented web servers. The structure of the URL is decoupled from that of the underlying filesystem and interpreted in a context-sensitive manner (based on user-specific state stored by the server), thus allowing virtual accounting and arbitrary access-control. The lab-engine provides the Hub with its on-demand high-performance computing capabilities. When a user requests the execution of a program, the lab-engine uses information in the user-specified input file to predict (via an artificial intelligence sub-system - also written in Perl5) the resources required for the run, selects an appropriate platform (e.g., workstation for a 2-D problem, supercomputer for a 3-D problem), transfers relevant input files to the selected machine, and initiates the program (via the remote server). When the run is completed, the remote server notifies the lab-engine, which retrieves the output files and informs the user.

The initial prototype, the Semiconductor Simulation Hub, currently contains thirteen semiconductor technology tools from four universities. In less than one year, over 250 users have performed more than 13,000 simulations. New Hubs for VLSI design, computer architectures, and parallel programming have been added in recent months; they currently contain a modest complement of fourteen tools. These Hubs are currently being used in several undergraduate and graduate courses at Purdue as well as to facilitate collaborative research. Regular users include students at Purdue University and researchers at several locations in the U.S. and Europe.

 

 


 

 
