What Compsci textbooks don't tell you: Real world code sucks

By Dave Mandl

Posted in Developer, 21st December 2012 10:19 GMT

There’s a kind of cognitive dissonance in most people who’ve moved from the academic study of computer science to a job as a real-world software developer. The conflict lies in the fact that, whereas nearly every sample program in every textbook is a perfect and well-thought-out specimen, virtually no software out in the wild is, and this is rarely acknowledged.

To be precise: a tremendous amount of source code written for real applications is not merely less perfect than the simple examples seen in school — it’s outright terrible by any number of measures.

Due to bad design, sloppy or opaque coding practices, non-scalability, and layers of ugly “temporary” patches, it’s often difficult to maintain, harder still to modify or upgrade, painful or impossible for a new person joining the dev team to understand, or (a different kind of problem) slow and inefficient. In short, a mess.

Of course there are many exceptions, but they’re just that: exceptions. In my experience, software is, almost as a rule, bad in one way or another. And lest I be accused of over-generalising: in more than 20 years I’ve done work for maybe a dozen companies, almost all of them in the banking industry and many of them household names.

The technology people employed at these companies are considered to be the very best, if only because the pay tends to be so good. I’ll play it safe and stick to my actual experience in the financial sector even though I'm convinced this state of affairs is not limited to that one industry.

Getting back to the cognitive-dissonance problem: in casual discussion, developers and tech managers will talk about all the wonderful things their system does, the stellar technical skills of their team, and how much their users love them — and all that may be true.

But talk privately, colleague-to-colleague, to one of these developers about the quality of the code base, all the daily headaches, the quick hacks and patches, the laughable mistakes made by the original author of the system (who left the firm a couple of years ago), or the fear that the person who “knows the system” will leave for another job, and you’ll hear a different story: “Of course there are problems. Everyone knows that. Things are always this way — it’s barely even necessary to mention it.”

Very few coworkers with whom I’ve broached this subject have seen things differently, and I’ve often heard stories of costly screw-ups that would shock the most jaded techie. But on a daily basis, in all but the worst cases, it’s easier for developers to talk about things like what their system does, or how elegant its user interface is, than to dwell on any horrors lurking inside.

It also may be that, after years of working on a system with serious maintainability flaws, people simply become accustomed to the strange procedures they have to go through regularly to keep things running.

Complex systems + borked code = beelllions down the drain

In the financial business there have been several software-related blowups in the last few years that were big enough to make it onto the evening news. To name just three: the Nasdaq failure that wreaked havoc with Facebook’s IPO; a trading fiasco at Knight Capital in August that led to widespread market disruption and a $400m drop in Knight’s market value; and the “flash crash” of May 2010, which caused market losses of at least $1 trillion in a matter of minutes.

System glitches and bugs this visible and this costly are relatively rare, but for every one of them there are a hundred smaller ones that only a handful of people ever hear about. A Reuters article this summer with the title “Morgan Stanley Smith Barney Rainmakers Consider Exit” said this: “Several dozen Morgan Stanley Smith Barney advisers who manage tens of billions of dollars of client money are considering leaving the firm, saying that widespread technology problems have made it very difficult for them to do their jobs.” (Italics mine.)

These are all outright failures in highly complex systems, but poorly written code can crop up in applications of any size, and it may not lead to a direct, quantifiable loss. It will, however, require untold extra hours of work for routine support, make even minor upgrades painful, or force systems to be retired prematurely (in some cases, before they’ve even gone live). How does this happen?

Crappy software: Is it bad programming, or is it 'too good'?

The most common reason for the existence of bad software is bad programmers. Good software, misleadingly, is usually easy to read, but it’s not easy to write. There are an awful lot of developers out there who never learned the correct way to do things. Maybe they’re so enamored of a particular technology or coding technique that they insist on using it whether it’s appropriate or not (“If the only tool you have is a hammer…”). Maybe they’re in over their head on a project with a huge number of moving parts. Maybe they’ve been forced to pick up an unfamiliar language at a moment’s notice. Or maybe their thought processes just don’t translate into logical, supportable code. The best technologists will usually seek out the least boring work, or the highest compensation, but even the most exciting project may be staffed with bad programmers merely because budget constraints, or stinginess, prevented the firm from shelling out for more talented ones.

At the other end of the spectrum, many projects are sabotaged by developers who are “too good” — that is, people who insist on coding everything in the most complicated and impenetrable way possible. This may be because they feel the constant need to show how much they know, or because doing things the simple way is just not interesting enough. As one friend of mine, a heavyweight who has had to rewrite many terrible applications, once said to me: “They think that if they’re not writing 80 lines of code to add two numbers, they’re not using their education.”

In my experience, these people can cause more harm than anyone else. I’ve seen developers use the most tangled object-oriented techniques to do things that could have been accomplished much more easily with a trivial 10-line function. In C++, an everything-but-the-kitchen-sink language used heavily on Wall St, templates (to give just one example) enable this kind of behavior by allowing you to create the most esoteric generic classes imaginable.
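To make the contrast concrete, here is a minimal sketch in C++. It is not taken from any system described here; the names `AdditionPolicy`, `BinaryOperation`, and `add` are hypothetical, invented purely for illustration. Both halves compute the same sum, but only one of them can be read at a glance.

```cpp
// --- The "too good" version: generic machinery for a trivial job. ---
// All names here are hypothetical, made up for this illustration.

// A policy class that knows how to combine two values of type T.
template <typename T>
struct AdditionPolicy {
    static T apply(const T& a, const T& b) { return a + b; }
};

// A generic binary operation, parameterised on the value type and on a
// policy template that supplies the actual arithmetic.
template <typename T, template <typename> class Policy = AdditionPolicy>
class BinaryOperation {
public:
    BinaryOperation(T lhs, T rhs) : lhs_(lhs), rhs_(rhs) {}

    // Delegates the real work to the policy.
    T evaluate() const { return Policy<T>::apply(lhs_, rhs_); }

private:
    T lhs_;
    T rhs_;
};

// --- The simple version: a plain function that does the same thing. ---
int add(int a, int b) { return a + b; }
```

Calling `BinaryOperation<int>(2, 3).evaluate()` and calling `add(2, 3)` give the same result. The template version is not wrong, and machinery like this can earn its keep when many operations and types really do need to be swapped in. The trouble starts when it is used for a job that a one-line function would do.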

In one case where I had to take over development from a C++ guru who felt the need to do everything in the most opaque, “sophisticated” way possible, his components simply had to be scrapped and rewritten from scratch. I couldn’t begin to understand the code, and neither could a colleague who was one of the best C++ developers I’d ever worked with. Four solid months of work in the trash bin.

If the original developer had stayed with the firm and finished the project, that would only have deferred the day of reckoning, since no one could ever have taken over support of this monster. (The joke name on Wall St for this kind of situation is “job security”: The sole expert on this system could never be sacked.) But even if the code had been marginally comprehensible, support would still have been a nightmare for anyone but the original developer, and it’s likely that a new person would have broken things by trying to make changes to delicate classes that he didn’t fully grasp.

Time-savers and face-savers

Another source of bad code is laziness. For programmers, there’s “good” laziness, which drives them to build tools that will relieve themselves and others of unnecessary drudge work — that’s what we’re here for, after all. And then there’s “bad” laziness, the kind that leads programmers to cut corners or do things in the quickest possible way, rather than taking the three extra hours to do them right. This always comes back to haunt someone—possibly the person who takes over six months later and doesn’t know that this tiny block of exceptional code exists, or why. Patches, almost by definition, are changes made without thought to the long-term consequences, and often sloppily, because they’re usually considered “temporary.”

For the record, however, I don’t think I’ve ever seen anyone go back and clean up a quick-and-dirty fix made two years previously just because it was the right thing to do. If the system is working, almost no manager will pay just to have you recode a piece of it “the right way,” without adding any new functionality. There’s always something more important that needs to be done—until that quick-and-dirty fix blows up and (because it’s urgent) gets replaced by another quick-and-dirty fix. To some lazy programmers, it must be said, none of this matters: They take the easy way out precisely because they know they won’t be around when their time bomb explodes.

There are languages that by their very nature make it easier to write bad code. As much as I love APL, a powerful language I once worked in that makes heavy use of Greek letters and other cryptic symbols, it’s easily abused, and I’ve seen some horrific APL systems written by people who hadn’t been trained properly.

(Unfortunately I had to support one such system early in my career. I prayed every day for a quick, painless death.)

[Figure: Conway's famous Game of Life in Dyalog.com's one line of APL code]

If, as an exercise, you wanted to write a program that no one in the world could make heads or tails of, the K language would make that a breeze: I once worked in a group that had a large codebase in K (which as it turns out is a distant, ugly relative of APL), and it never took me less than a half hour to decipher any one line of it.

As mentioned above, C++, despite its superficial similarities to Java, is infinitely easier than Java to write impenetrable code in. And one language I’ve been warned about, though I’ve never had the opportunity to use it, is Haskell, an offshoot of ML. According to a friend in academia who’s studied it, it’s “the Taliban version of ML,” in which it’s all but impossible to write readable code.

[Figure: a Haskell line that prints all the powers of 2, as explained on Stack Overflow]

Ultimately, the greatest enemy of good programming practices is time. One of the reasons the code in your textbook is perfect and the code where you work isn’t is that the author of the book was allowed, or forced, to do things right.

In the real world, tight budgets, shortsighted managers, and unreasonable expectations from non-techies almost always conspire to make developers do things too quickly. The final product may be good enough now, and be perfectly understandable to the people who’ve just written it, but all that will change in a year, when there are new requirements and a new set of developers grappling with the hastily-thrown-together code. Additionally, the codebase in even a small production system can be orders of magnitude bigger than in most textbook examples, and large systems are far from easy to build. Despite protestations to the contrary, projects greater than a certain size and complexity (see the Reuters article cited above) are almost guaranteed to fail in some way without sufficient time for planning, design, testing, and adult supervision.

All the above aside, there’s one simple and completely painless way to prevent future generations from cursing you when they look at your code: Include some comments!
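As a rough sketch of what that can look like for a quick-and-dirty patch of the kind described earlier, consider the following C++ fragment. Everything in it is an assumption made up for illustration: the function `settlementFee`, the rates, and the special case for one currency. The point is only that the comment records what the exception is, why it exists, and when it can go.

```cpp
#include <string>

// Hypothetical example: returns the fee for settling a trade.
// Both the notional and the fee are in minor currency units (e.g. cents),
// and rates are expressed in basis points (1 bp = 0.01%).
long long settlementFee(const std::string& currency, long long notional) {
    const long long standardRateBp = 10;  // illustrative standard rate
    const long long specialRateBp = 15;   // illustrative exception rate

    // TEMPORARY PATCH (invented for this example):
    // What:  CHF trades are charged 15 bp instead of the standard 10 bp.
    // Why:   the rate table cannot yet hold per-currency rates.
    // Until: per-currency rates are supported; then delete this branch.
    if (currency == "CHF") {
        return notional * specialRateBp / 10000;
    }

    return notional * standardRateBp / 10000;
}
```

Without the comment, the person who inherits this function six months later sees an unexplained special case and has no way of knowing whether it is safe to touch.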

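Real-world code is always awful: what the textbooks didn't tell you

Moving from formal study to a real job, developers always run into a sharp clash of ideas. The conflict arises because the programs in textbooks are taken to be flawless, while real-world software can rarely be described that way.

To be precise, programs in the real world are not merely imperfect; they are dreadful, from whatever angle you look at them.

Bad design, sloppy practice, non-scalability, and convoluted temporary patches all make programs hard to maintain and upgrade, and even hard for new team members to understand.

Of course there are exceptions, but they are only exceptions. In my experience, most software is bad; it is just bad in different ways. Over the past 20-plus years I have worked for many companies, nearly all of them in banking and many of them household names. The technical staff at these firms are considered the best in the business, because they are so well paid. I will play it safe and stick to my own experience in finance, although I am convinced this state of affairs is not limited to that one industry.

Back to the cognitive-dissonance problem: in casual conversation, developers and technical managers talk about the strengths of their system, the skills of the development team, and how highly users rate their product, and all of that may be true.

But talk frankly about the headaches in the system, the messy patches, the laughable mistakes, and the damage left behind when the original team members departed, and you will hear a completely different story: "Of course there are problems. Everyone knows that. Things are always this way, so there is hardly any need to mention it."

Very few people have disagreed with me on this point, and I have often heard of screw-ups that would shock even seasoned technologists. Yet people would rather talk about what the system does or how attractive its interface is than about the problems lurking inside.

And once people get used to working on a problem-ridden system, they also get used to the strange procedures needed to keep it running.

In recent years the financial sector has seen several software-related incidents big enough to make the news: the Nasdaq failure that disrupted Facebook's IPO, the trading fiasco at Knight Capital that cut the firm's market value, and the flash crash that caused losses of $1 trillion within minutes.

Failures this visible are relatively rare, but the many smaller ones are seldom noticed by anyone. A Reuters article reported that a large number of one firm's advisers were considering leaving because widespread technology problems made it hard for them to do their jobs.

Complex systems will always have failures, and bad code can cause a program to crash suddenly, with consequences that are hard to estimate.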
