








Samples for the 2-dimensional projection of kinetic trajectories are shown in Figure 7. The coil states are loosely gathered while the native states can form a black cluster with extremely high density in the 2-dimensional projection plane.

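Here the information does not flow from the first sentence to the second: the reader cannot tell where “The coil states” comes from. Readers find the revised version below easier to follow.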

Kinetic trajectories are projected onto xx and yy variables in Figure 7. This figure shows two populated states. One corresponds to loosely gathered coil states while the other is the native state with a higher density.

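In this new paragraph, the newly inserted second sentence lets every sentence start from old information and end with new information. The first and second sentences are linked by “Figure”, and the second and third by “two states”. The new information, “coil states”, does not appear until the third sentence. The paragraph is tightly interlocked and reads as a single unit. Here is another example: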

The accuracy of the model structures is given by TM-score. In case of a perfect match to the experimental structure, TM-score would be 1.

In the second sentence, the old information “TM-score” is buried in the middle, interrupted by the new information “a perfect match to the experimental structure”. A suggested revision:

The accuracy of the model structures is measured by TM-score, which is equal to 1 if there is a perfect match to the experimental structure.



The smallest URFs (URFA6L), a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene.


The smallest of the URFs is URFA6L, a 207-nucleotide (nt) reading frame overlapping out of phase the NH2-terminal portion of the adenosinetriphosphatase (ATPase) subunit 6 gene; it has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene.


URFA6L has been identified as the animal equivalent of the recently discovered yeast H+-ATPase subunit 8 gene.

The recently discovered yeast H+-ATPase subunit 8 gene has a corresponding animal equivalent gene, URFA6L.


The enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC) has been determined by direct measurement.

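This sentence reads as if it were emphasizing “direct measurement”, which is probably not what the authors intended. Reversing the order makes the sentence better balanced.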

We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC).



The enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC) has been determined by direct measurement. dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups to obtain solubility of the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. From isoperibolic titration measurements, the enthalpy of dC:dG base pair formation is -6.65±0.32 kcal/mol.


We have directly measured the enthalpy of hydrogen bond formation between the nucleoside bases 2'-deoxyguanosine (dG) and 2'-deoxycytidine (dC). dG and dC were derivatized at the 5' and 3' hydroxyls with triisopropylsilyl groups; these groups serve both to solubilize the nucleosides in non-aqueous solvents and to prevent the ribose hydroxyls from forming hydrogen bonds. The enthalpy of dC:dG base pair formation is -6.65±0.32 kcal/mol according to isoperibolic titration measurements.


Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates at which tectonic plates move and accumulate strain at their boundaries are approximately uniform. Therefore, in first approximation, one may expect that large ruptures of the same fault segment will occur at approximately constant time intervals. If subsequent main shocks have different amounts of slip across the fault, then the recurrence time may vary, and the basic idea of periodic main shocks must be modified.

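In this example, the first two sentences together establish the rate of strain accumulation. However, the old information from the first sentence is not placed at the beginning of the second. By the third sentence, readers usually no longer know what the paragraph is about. A clearer version would be: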

Large earthquakes along a given fault segment do not occur at random intervals because it takes time to accumulate the strain energy for the rupture. The rates of strain accumulation at the boundaries of tectonic plates are approximately uniform. Therefore, nearly constant time intervals (at first approximation) would be expected between large ruptures of the same fault segment. [However?], the recurrence time may vary; the basic idea of periodic main shocks may need to be modified if subsequent main shocks have different amounts of slip across the fault.






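  1. Present only one central thesis. A paper with too many points is harder to write, more prone to problems, and harder for readers to remember.
  2. Building on this central thesis, use an attractive (but never exaggerated) title to catch the reviewers' interest. Because reviewing is unpaid, reviewers only take on papers that interest them; if you cannot interest a reviewer, you had better not publish that paper. Editors are sometimes frustrated because they cannot find an interested reviewer.
  3. Justify every parameter and every step. Reviewers have no time to work out the details; justifying procedures and parameters shows that you know what you are doing rather than fitting numbers. If there is no justification, find one; if there is one, emphasize it.
  4. Ask yourself whether you have provided all the details needed to repeat your work. The easier it is for a reviewer (or reader) to reproduce your work, the more likely he or she is to accept your paper. Of course, reviewers will not actually reproduce your work, but your description must convince them that they could.
  5. Be convincing! Do thorough work rather than half-finished work. Test your central thesis in multiple ways. Write the paper the way a lawyer argues for an acquittal, answering every possible question in advance.
  6. Cite all important work, especially the classics. Do a full literature search again while writing.

To achieve these goals, a scientific paper must be written within a certain structural framework.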




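        If the paper is about a new method, technique, or algorithm, describe what is novel about it in great detail, and do so in a logical, reasonable way; this helps readers grasp the essence of the new method. If the method uses parameters, justify each parameter (or its value): it has been used before, it can be derived from physics or mathematics, or it was obtained through extensive testing and optimization. If you cannot guarantee that a value is justified, you must describe the effect of changing it (the actual results belong in the Results or Discussion section; the Methods section only describes the effect). If you have not tested the parameters, explain why (too expensive? too time-consuming? deferred to future work?).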


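        Once you understand the results better, you need to decide on the selling point: what is the single most significant point of this paper? After settling on the central thesis, organize every paragraph to prove and support it with data (adding more data if necessary), and rule out other possibilities. Drop data unrelated to the central thesis, even if they were hard-won.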


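        Once you have the central thesis, it is time to decide on the title. A title can advertise your method, your results, or the implications of your results. It summarizes your paper in one sentence, so put the most important and most attractive information into it. For example, the title “Steric restrictions in protein folding: an alpha-helix cannot be followed by a contiguous beta-strand” highlights the result. The title “Interpreting the folding kinetics of helical proteins”, on the other hand, highlights the implication of the results. The title “Native proteins are surface-molten solids: Application of the Lindemann criterion for the solid versus liquid state” highlights both the method and the implication of the results. Note that “Native proteins are surface-molten solids” is an interpretation of the result, not the result itself. Use a title that is both broad and specific so as to attract more readers.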



Assessing secondary structure assignments of protein structures by using pairwise sequence-alignment benchmarks
        The secondary structure of a protein refers to the local conformation of its polypeptide backbone. Knowing secondary structures of proteins is essential for their structure classification, understanding folding dynamics and mechanisms, and discovering conserved structural/functional motifs. Secondary structure information is also useful for sequence and multiple sequence alignment, structure alignment, and sequence to structure alignment (or threading). As a result, predicting secondary structures from protein sequences continues to be an active field of research fifty six years after Pauling and Corey first predicted that the most common regular patterns of protein backbones are the α-helix and the β-sheet. Prediction and application of protein secondary structures rely on prior assignment of the secondary-structure elements from a given protein structure by human or computational methods.
        Many computational methods have been developed to automate the assignment of secondary structures. Examples are DSSP, STRIDE, DEFINE, P-SEA, KAKSI, P-CURVE, XTLSSTR, SECSTR, SEGNO, and VoTAP. These methods are based on either the hydrogen-bond pattern, geometric features, expert knowledge or their combinations. However, they often disagree on their assignments. For example, disagreement among DSSP, P-CURVE, and DEFINE can be as large as 25%. More beta sheet is assigned by XTLSSTR and more pi-helix by SECSTR than by DSSP. The discrepancy among different methods is caused by non-ideal configurations of helices and sheets. As a result, defining the boundaries between helix, sheet, and coil is problematical and a significant source of discrepancies between different methods.
        Inconsistent assignment of secondary structures by different methods highlights the need for a criterion or a benchmark of “standard” assignments that could be used to assess and compare assignment methods. One possibility is to use the secondary structures assigned by the authors who solved the protein structures. STRIDE, in fact, has been optimized to achieve the highest agreement with the authors’ annotations. However, it is not clear what is the criterion used for manual or automatic assignment of secondary structures by different authors. Another possibility is to treat the consensus prediction by several methods as the gold standard. However, there is no obvious reason why each method should weight equally in assigning secondary structures and which method should be used in consensus. Other used criteria include helix-capping propensity, the deviation from ideal helical and sheet configurations, and structural accuracy produced by sequence-to-structure alignment guided by secondary structure assignment.
        In this paper, we propose to use sequence-alignment benchmarks for assessing secondary structure assignments. These benchmarks are produced by 3D-structure alignment of structurally homologous proteins. Instead of assessing the accuracy of secondary-structure assignment directly, which is not yet feasible, we compare the two assignments of secondary structures in structurally aligned positions. We assume that the best method should assign the same secondary-structure element to the highest fraction of structurally aligned positions. Certainly, structurally aligned positions do not always have the same secondary structures. Moreover, different structure-alignment methods do not always produce the same result. Nevertheless, this criterion provides a means to locate a secondary-structure assignment method that is most consistent with tertiary structure alignment. We suggest that this approach provides an objective evaluation of secondary structure assignment methods.

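In this example, the title proposes a method for assessing secondary-structure assignments of proteins. The first paragraph opens with the definition of secondary structure (linking to the title) and goes on to describe why secondary structure matters. Its last sentence makes the transition to computational methods for assigning secondary structure (the topic of the next paragraph). Note that “computational methods” is placed at the end of the sentence both for emphasis and to connect to the start of the second paragraph. The second paragraph focuses on the problems of the computational methods: the old information “computational methods” gradually gives way to “their disagreement”. The first sentence of the third paragraph shifts the topic from “disagreement” (old information) to “a way of assessing” (new information), and then existing work in the field is introduced. The fourth paragraph introduces the new method and discusses its advantages. The fifth paragraph (not shown here) would briefly discuss the results. Every introduction should cover the field and its significance, the specific reason for doing the work, the results, and their implications. In general, after reading the introduction, the reader should have a clear picture of the whole story of the paper.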



One question about the complex homopolymer phase diagram presented here is whether it is caused by the discontinuous feature of the square-well potential. We cannot give a direct answer because the DMD simulation is required to obtain well-converged results for the thermodynamics. However, the critical phenomena predicted for a fluid composed of particles interacting with a square-well potential are as realistic as those predicted for a fluid composed of particles interacting with a LJ potential. Also an analogous complex phase diagram is found in simulations of LJ clusters. The present results for square-well homopolymers may well be found in more realistic homopolymer models and even in real polymers.



How to make an objective assignment of secondary structures based on a protein structure is an unsolved problem. Defining the boundaries between helix, sheet, and coil structures is arbitrary, and commonly accepted standard assignments do not exist. Here, we propose a criterion that assesses secondary-structure assignment based on the similarity of the secondary structures assigned to structurally aligned residues in sequence-alignment benchmarks. This criterion is used to rank six secondary-structure assignment methods: STRIDE, DSSP, SECSTR, KAKSI, P-SEA, and SEGNO with three established sequence-alignment benchmarks (PREFAB, SABmark and SALIGN). STRIDE and KAKSI achieve comparable success rates in assigning the same secondary structure elements to structurally aligned residues in the three benchmarks. Their success rates are between 1-4% higher than those of the other four methods. The consensus of STRIDE, KAKSI, SECSTR, and P-SEA, called SKSP, improves assignments over the best single method in each benchmark by an additional 1%. These results support the usefulness of the sequence alignment benchmarks as the benchmarks for secondary structure assignment.

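The first two sentences state the problem. The third proposes a solution. These are followed by the results, and the abstract closes with a summary. Note that the bulk of the abstract is the results and their significance and impact.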


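  1. Take writing seriously and put as much time into it as you can. Writing is an essential part of scientific research. A poorly written paper that nobody reads or uses might as well not have been published.
  2. Do not publish unless the study is thorough and you have tried every method that could support your conclusions.
  3. Rethink and justify: why was this work done, what was done, and what is the most important finding? Why this method? Why these parameters? What has been done before (update your literature search)? How is your work different?
  4. Look at your work critically and think about how others would find fault with it. Only then can you find the weak points and improve. Many of my papers were heavily revised through repeated discussion, and many calculations had to be redone. Only when the results have been sorted out and understood does a paper become more meaningful.
  5. Be able to answer every reasonable question. If you yourself have doubts, resolve them; otherwise why would anyone else believe you? Do not readily believe a revolutionary finding.
  6. Write to a high standard. Do not compare yourself with bad papers; strive to build your own brand. Do not hide any facts, do not fabricate, and do not underestimate the intelligence of other researchers. Make your research reproducible.
  7. From beginning (title) to end (conclusions), move from old information to new information. Never introduce new information at the beginning of a sentence. Never use a term before it has been defined.
  8. Copying sentences from other people's papers is unethical. It shows that the author is unwilling to think and only takes shortcuts, which is not how a real researcher behaves. Copied sentences also tend to break the paper's own flow of information and make it harder for readers to understand. If you must use someone else's sentence verbatim, put it in quotation marks and cite the source.
  9. Begin each paragraph with a sentence that states its topic, and end it with a transition to the next paragraph. Everything from the title to the conclusions should cohere, sentence linked to sentence and paragraph to paragraph, so that the paper is a whole rather than a disorderly pile of sentences. Readers will then enjoy reading your paper.
  10. Write, rewrite, and rewrite again. Nobody gets it right the first time; good writing takes time and effort. I usually revise my papers more than ten times.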




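        Some of the examples in this article are taken from “The Science of Scientific Writing” by G. D. Gopen and J. A. Swan, American Scientist, 78, 550-558, 1990. I benefited greatly from Professor Gopen's annual short course at Duke University in 1995. I especially thank my mentors Martin Karplus (Harvard University), George Stell (State University of New York at Stony Brook), Harold L. Friedman (State University of New York at Stony Brook), and Carol Hall (North Carolina State University) for their encouragement and guidance. Without them, I would not have had so many opportunities to practice writing in English. Finally, I thank my students and postdocs. Their contributions to science allow me to keep writing papers, grant proposals, and reviews. Some of the examples in this article come from papers written with them. The first draft of this article was written in English. Because I type Chinese too slowly, I am especially grateful to Xu Beisi (徐贝思) for translating it into a first Chinese draft. Any shortcomings are my own, and corrections are welcome. Since this article appeared online, it has received considerable attention. Special thanks go to Professor Zhao Liping (赵立平) for suggestions, and to the many alumni and online readers for their corrections and encouragement.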



