Java as a Scientific Programming Language

Reposted on August 20, 2004, 21:22:00

Java as a Scientific Programming Language (Part 1): More Issues for Scientific Programming in Java
By Ken Ritley

Java was designed as a modern, object-oriented programming language.  Its features such as platform-independence, ability to manage libraries, threads, etc. are important for modern scientific programs.  These are good reasons for a scientist to choose Java.

But Java was developed without the specific needs of scientists in mind.  This means that in some respects scientists must exercise a bit of creativity and patience.

For the scientist thinking about migrating to Java, there are many important issues to consider. What follows is a useful list of some of them.

Complex Numbers

Java, like C before the C99 standard, does not support an intrinsic complex number type.

For those who may not know, a complex number (say, c) is simply an ordered pair of two numbers (say, a and b), which obeys some very simple rules of arithmetic.

The fundamental equations which describe almost everything in the universe — from weather systems to black holes to the way in which tiger populations depend on how many rabbits they eat — are based on complex numbers, not ordinary numbers.  Scientific programs have to be able to handle complex numbers efficiently, and scientific programmers have to be able to code complex formulas easily.

It's actually straightforward to "patch" Java's lack of a complex number data type.  One can use a class (let's call it Complex) with two member variables (say, realC and imagC), and separate public methods for all the needed arithmetic operations (addition, subtraction, etc.).
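As a rough illustration, such a "patch" class might look like the sketch below. The class name Complex and the fields realC and imagC come from the text above; the method names (add, subtract, multiply) are assumptions made for this example, not part of any standard Java API.

```java
// Minimal sketch of a complex number "patch" class.
// Method names are illustrative assumptions, not a standard API.
final class Complex {
    final double realC; // real part
    final double imagC; // imaginary part

    Complex(double realC, double imagC) {
        this.realC = realC;
        this.imagC = imagC;
    }

    // (a + bi) + (c + di) = (a + c) + (b + d)i
    Complex add(Complex other) {
        return new Complex(realC + other.realC, imagC + other.imagC);
    }

    // (a + bi) - (c + di) = (a - c) + (b - d)i
    Complex subtract(Complex other) {
        return new Complex(realC - other.realC, imagC - other.imagC);
    }

    // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    Complex multiply(Complex other) {
        return new Complex(realC * other.realC - imagC * other.imagC,
                           realC * other.imagC + imagC * other.realC);
    }

    @Override
    public String toString() {
        return realC + (imagC < 0 ? " - " : " + ") + Math.abs(imagC) + "i";
    }

    public static void main(String[] args) {
        Complex c1 = new Complex(1.0, 2.0);
        Complex c2 = new Complex(3.0, -1.0);
        Complex c3 = c1.multiply(c2); // stands in for c3 = c1*c2
        System.out.println(c3);       // prints "5.0 + 5.0i"
    }
}
```

Making the fields final keeps each value immutable, which is one common design choice; the price is that every arithmetic operation returns a new object.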

But there are a few problems with this.  One problem is that such programs may be less efficient, because the program is forced to create a new object for each instance of a complex number.  For loops with hundreds of thousands of iterations, use of an intrinsic complex data type is obviously much faster.
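One possible workaround for the allocation cost, sketched here as an assumption rather than an established standard, is to keep the real and imaginary parts of many complex values in two parallel arrays of primitive doubles. The class name ComplexArrays and its multiply method are hypothetical names chosen for this example. Modern JVMs can sometimes remove short-lived allocations on their own, so any actual speed difference would need to be measured.

```java
// Sketch: element-wise complex multiplication on parallel primitive arrays,
// so that the inner loop creates no objects at all.
// ComplexArrays and multiply are hypothetical names for illustration.
final class ComplexArrays {

    // Computes out[i] = a[i] * b[i] for every index i.
    // Real and imaginary parts live in separate double[] arrays.
    static void multiply(double[] aRe, double[] aIm,
                         double[] bRe, double[] bIm,
                         double[] outRe, double[] outIm) {
        for (int i = 0; i < aRe.length; i++) {
            // (a + bi)(c + di) = (ac - bd) + (ad + bc)i
            double re = aRe[i] * bRe[i] - aIm[i] * bIm[i];
            double im = aRe[i] * bIm[i] + aIm[i] * bRe[i];
            outRe[i] = re;
            outIm[i] = im;
        }
    }

    public static void main(String[] args) {
        int n = 100_000;
        double[] aRe = new double[n], aIm = new double[n];
        double[] bRe = new double[n], bIm = new double[n];
        double[] outRe = new double[n], outIm = new double[n];
        java.util.Arrays.fill(aRe, 1.0);
        java.util.Arrays.fill(aIm, 2.0);
        java.util.Arrays.fill(bRe, 3.0);
        java.util.Arrays.fill(bIm, -1.0);

        multiply(aRe, aIm, bRe, bIm, outRe, outIm);
        System.out.println(outRe[0] + " + " + outIm[0] + "i"); // prints "5.0 + 5.0i"
    }
}
```

The trade-off is readability: the formulas are now spread over index arithmetic, which is exactly the kind of notation burden discussed in this section.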

Further, instead of being able to code a simple complex number equation such as c3 = c1*c2, Java forces the scientist to code it as, for example, c3 = complexMultiply(c1,c2).  In addition, the complex number "patch" class must be distributed with each Java program, and depending on how these classes are implemented and defined, Java library subroutines may not easily integrate with one another.

This also breaks the Golden Rule of Programming, that well-written programs should be easily understandable by people.  As we'll discuss in Part 2, in a scientific program it's perfectly acceptable and even desirable to use variable names such as e, m, and c, because those are exactly how scientists write E=mc²!  But forcing nice equations like c3 = c1*c2 to be written with notation like c3 = complexMultiply(c1,c2) means that they become harder to debug and more prone to errors.

Just for fun, here's a snapshot of a scientific programmer's nightmare (see the sidebar "Scientific Programs Should be Understandable by Scientists").

Scientific Programs Should be Understandable by Scientists


Imagine a scientific program that has DOZENS of formulas similar to the example below.

In Fortran, the equation looks like this:

vp = CSQRT( (1-v**2/c**2)/(1-v/c) )

But in Java, it might look like this:

vp = complexSqrt( complexDivide( complexSubtract(1, (complexDivide( complexPow(v,2), complexPow(c,2)))),
complexSubtract(1,complex( complexDivide(v,c)))));

However, the Java version won't give the right results, because there's (intentionally) a typing error!

Can you find it?  Do you want to debug a scientific program with DOZENS of these formulas?

— K.R.
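One way to soften the notation problem from the sidebar is to give the complex class chainable methods, so that a formula reads left to right. The sketch below evaluates the sidebar's formula, vp = sqrt((1 - v²/c²)/(1 - v/c)), this way. The class name Cx and all of its method names are assumptions invented for this example; the square root uses the standard principal-value formula.

```java
// Sketch: a small complex type with chainable methods.
// Cx and its method names are hypothetical, chosen only for illustration.
final class Cx {
    final double re, im;

    Cx(double re, double im) { this.re = re; this.im = im; }

    static Cx of(double x) { return new Cx(x, 0.0); }

    Cx minus(Cx o) { return new Cx(re - o.re, im - o.im); }

    Cx times(Cx o) { return new Cx(re * o.re - im * o.im, re * o.im + im * o.re); }

    // (a + bi)/(c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2)
    Cx div(Cx o) {
        double d = o.re * o.re + o.im * o.im;
        return new Cx((re * o.re + im * o.im) / d, (im * o.re - re * o.im) / d);
    }

    // Principal square root: sqrt((r + a)/2) + sign(b) * sqrt((r - a)/2) i
    Cx sqrt() {
        double r = Math.hypot(re, im);
        double a = Math.sqrt(Math.max(0.0, (r + re) / 2));
        double b = Math.sqrt(Math.max(0.0, (r - re) / 2));
        return new Cx(a, im < 0 ? -b : b);
    }

    // vp = sqrt( (1 - v^2/c^2) / (1 - v/c) )
    static Cx vp(Cx v, Cx c) {
        Cx one = of(1.0);
        Cx numerator = one.minus(v.times(v).div(c.times(c)));
        Cx denominator = one.minus(v.div(c));
        return numerator.div(denominator).sqrt();
    }

    public static void main(String[] args) {
        Cx result = vp(of(0.6), of(1.0));
        System.out.println(result.re + " " + result.im);
    }
}
```

Splitting the formula into a named numerator and denominator is a deliberate choice: each line can be checked against the written equation on its own, which makes a typing error easier to spot than in one deeply nested call.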

But not to worry: C also lacked an intrinsic complex data type until the C99 standard, and this never stopped scientists from writing programs in that language! There are some standards for managing complex numbers, and there's even hope that the makers of Java will see fit to include complex numbers as a future extension of the language.

The Precision Problem

The same Java program will give the same results on all systems.  But ... will they be the right results?

For business applications, Java's arithmetic and mathematics certainly offer enough accuracy.

But for some applications, the IEEE 754 standard is actually a constraint.  For example, if a hardware platform offers more than the required precision and Java insists that numbers be rounded, then not only accuracy but also execution speed may be sacrificed.  For those who are interested, here are some details of how Java handles numerics, more details, and even more details!

The prospective scientific Java programmer should rest easy on this issue: for the majority of the scientific community, IEEE 754 is enough and these "constraints" are likely to be unimportant.
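To make the precision question concrete, the sketch below shows the rounding that 64-bit IEEE 754 doubles introduce in a simple sum, next to java.math.BigDecimal from the standard library as one option when exact decimal arithmetic is wanted. The class name PrecisionDemo and the helper sumTenths are hypothetical names for this example; whether BigDecimal is appropriate for a given scientific code is a judgment call, since it is much slower than primitive arithmetic.

```java
import java.math.BigDecimal;

// Sketch: rounding behavior of IEEE 754 doubles versus exact decimals.
// PrecisionDemo and sumTenths are hypothetical names for illustration.
final class PrecisionDemo {

    // Adds 0.1 to itself n times using ordinary double arithmetic.
    static double sumTenths(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 0.1 has no exact binary representation, so small errors accumulate.
        System.out.println(0.1 + 0.2);     // prints 0.30000000000000004
        System.out.println(sumTenths(10)); // prints 0.9999999999999999

        // The error is tiny: well within a tolerance of 1e-15.
        System.out.println(Math.abs(sumTenths(10) - 1.0) < 1e-15); // prints true

        // BigDecimal keeps decimal values exact, at a cost in speed.
        BigDecimal exact = new BigDecimal("0.1").multiply(BigDecimal.TEN);
        System.out.println(exact.compareTo(BigDecimal.ONE) == 0);  // prints true
    }
}
```

The usual practice with doubles is to compare against a tolerance, as in the third line of output, rather than test for exact equality.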

The IDEs of March

Wander the halls of any scientific research institution and you'll see the same problem played out in many offices: To write and run a Fortran or C program, a scientist will TELNET to a remote workstation (even a very slow machine!). That's where command-line compilers and ASCII text editors live, and those are the only programming tools a scientist may know. The program output is then FTP'd back to the PC, where it is plotted and analyzed, frequently not without trouble because of linefeed/carriage-return differences. The whole procedure is then repeated until the scientist's fingers start to bleed.

It's a very sad loss of productivity since, say, the late 1980s, when a good scientific laboratory might have run a single VMS-based DEC system, loaded with all the tools a scientist needed.

The problem, of course, is not that productivity has dropped in absolute terms, merely in relative terms. Many scientists trained before about 1990 have never taken the time to explore the new tools available to programmers. Today, new scientists grow up with PCs in the home and they learn about these new scientific programming tools in college. In fact, scientific courses at universities are changing as a result of these tools.

So ... older scientists take note: Modern programs are written by modern programmers using modern IDEs. An IDE (integrated development environment) is a shell which wraps the editor, compiler, debugger, and output window into one user-friendly package (please see "Debugging in Java: Techniques for Bug Eradication"). "I don't have the time to learn a specific new gadget" is a common excuse among some scientists.  But really, it's no excuse, because IDEs are as universal as CD players. There's a box to put the source code in, buttons to click to compile, start, and stop it, and a little output window that shows what's playing.

Nevertheless, a scientist needs to exercise caution. IDEs bring with them a new set of problems (compatibility problems), because they may combine highly portable source code with nonportable library functions or files.  For Java the situation is not so bad.  Because all aspects of the Java language are standardized, graphics included, very little effort is required to write IDE-independent source code. We'll discuss this in more detail in Part 2.

The User-Interface Issue

Be they theoretical results from a numerical simulation, images from a microscope, or experimental data points with error bars, scientific data need to be seen to be understood. There are excellent Fortran and C compilers for every type of machine, and these languages are almost completely portable, except for graphics.

Because graphics and GUIs are an intrinsic part of Java, they also enjoy "write-once, run-anywhere" status. This includes basic tools for setting up GUIs with windows and boxes and buttons, as well as advanced tools for image processing (such as the Java Advanced Imaging JAI classes). And because of Java's popularity, the scientist new to Java will find the Internet full of free tools and source code for visualizing data, plotting equations, analyzing images, etc.
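As a small illustration of the built-in graphics classes, the sketch below draws data points with vertical error bars into an off-screen image using java.awt, which ships with the standard Java platform. The class name ErrorBarPlot, the plot method, and the choice of pixel coordinates as data are all assumptions made to keep the example short; showing the image in a window or saving it to a file is left out.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Sketch: data points with error bars drawn into an off-screen image.
// ErrorBarPlot and plot are hypothetical names; data are in pixel units.
final class ErrorBarPlot {

    static BufferedImage plot(int[] x, int[] y, int[] err, int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();

        // White background.
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);

        for (int i = 0; i < x.length; i++) {
            // Image rows count downward, so flip y to make it grow upward.
            int py = height - 1 - y[i];

            // Vertical error bar of +/- err[i] pixels.
            g.setColor(Color.GRAY);
            g.drawLine(x[i], py - err[i], x[i], py + err[i]);

            // 3x3 pixel marker centered on the data point.
            g.setColor(Color.BLUE);
            g.fillRect(x[i] - 1, py - 1, 3, 3);
        }
        g.dispose();
        return img;
    }

    public static void main(String[] args) {
        int[] x = {20, 40, 60, 80};
        int[] y = {30, 35, 50, 45};
        int[] err = {5, 4, 6, 3};
        BufferedImage img = plot(x, y, err, 100, 80);
        System.out.println(img.getWidth() + "x" + img.getHeight()); // prints "100x80"
    }
}
```

Because the drawing goes to a BufferedImage rather than directly to the screen, the same plot method can feed a window, a file, or a test.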

The Hardware Problem in the Scientific Laboratory


Scientists working at the European Synchrotron Radiation Facility (ESRF) in Grenoble, France, know the problems that mixed-hardware environments create.

The ESRF is one of over 30 multi-billion-dollar laboratories worldwide that provide scientists with access to an intense beam of x-rays. It's here where molecular biologists unravel the atomic structure of proteins, physicists explore the basic properties of matter, materials scientists synthesize new materials, and physicians design new methods to combat cancer.

Scientists from around the world travel here to perform experiments at the ESRF, which exclusively uses Unix-based workstations and software to control the experiments and collect the data. But upon returning home to analyze their data, many of these scientists either prefer or need to use tools only available on PCs. There are some solutions to help bridge the hardware gap, but until now scientists have always been forced to choose one platform for writing new software, then "damage-control" the consequences.

— K.R.

Hardware Independence

Scientific laboratories are filled with both high-performance Unix-based workstations and PCs. The workstations offer speed and stability. PCs offer a more user-friendly work environment with plenty of tools for data analysis, as well as tools like spreadsheets, word processors, and presentation managers that scientists need to write reports and present their findings. As shown in our example (see the sidebar "The Hardware Problem in the Scientific Laboratory"), until now scientists have always been forced to choose one platform for writing new software, then "damage-control" the consequences.

The numerical standardization of Java provides the ultimate scientific solution to this hardware problem: from loop iteration (a problem in Fortran), to the choice of fdlibm mathematical functions, right down to the 64-bit IEEE 754 implementation of floating point arithmetic. This means "write-once, run-anywhere" scientific programs in Java will crunch numbers in exactly the same way on the workstation in the lab as the data are being collected, on the PC in the office, or even on a notebook computer during flights to scientific conferences!
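The reproducibility described above can be inspected directly. The sketch below uses java.lang.StrictMath, whose documentation ties its results to the fdlibm algorithms, and prints the raw 64-bit pattern of each result so that outputs from two machines can be compared bit for bit. The class name ReproducibleMath and the helper bits are hypothetical names for this example.

```java
// Sketch: printing the exact bit patterns of floating-point results,
// so runs on different machines can be compared bit for bit.
// ReproducibleMath and bits are hypothetical names for illustration.
final class ReproducibleMath {

    // Returns the raw IEEE 754 bit pattern of a double as hexadecimal.
    static String bits(double x) {
        return Long.toHexString(Double.doubleToLongBits(x));
    }

    public static void main(String[] args) {
        // A Java double is always 64 bits wide, on every platform.
        System.out.println(Double.SIZE); // prints 64

        // Square root is correctly rounded, so this pattern is fixed.
        System.out.println(bits(StrictMath.sqrt(2.0))); // prints 3ff6a09e667f3bcd

        // StrictMath functions are specified to give the same bits everywhere;
        // compare these lines across machines to check.
        System.out.println(bits(StrictMath.sin(0.5)));
        System.out.println(bits(StrictMath.exp(1.0)));
    }
}
```

Comparing hexadecimal bit patterns rather than printed decimal values avoids any doubt about whether two results differ only in how they were formatted.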

So ... while Java may not have been designed with the scientist in mind, its powerful features make it an important if somewhat less-than-ideal platform for modern scientific programming.  And with some creativity — and perhaps a bit of patience — any scientific programmer can take full advantage of what Java has to offer.


