Arm Instruction Condition Codes

Every practical general-purpose computing architecture has a mechanism of conditionally executing some code. Such mechanisms are used to implement the if construct in C, for example, in addition to several other cases that are less obvious.

 

ARM, like many other architectures, implements conditional execution using a set of flags which store state information about a previous operation. I intend, in this post, to shed some light on the operation of these flags. Of course, the Architecture Reference Manual is the definitive source of information, so if you need to know about a specific corner-case that I do not cover here, that is where you need to look.

 

A Realistic Example

 

Consider a simple fragment of C code:

 

  for (i = 10; i != 0; i--) {
    do_something();
  }

 

A compiler might implement that structure as follows:

 

    mov     r4, #10
  loop_label:
    bl      do_something
    sub     r4, r4, #1
    cmp     r4, #0
    bne     loop_label

 

The last two instructions are of particular interest. The cmp (compare) instruction compares r4 with 0, and the bne instruction is simply a b (branch) instruction that executes if the result of the cmp instruction was "not equal". The code works because cmp sets some global flags indicating various properties of the operation. The bne instruction, which is really just a b (branch) with an ne condition code suffix, reads these flags to determine whether or not to branch [1].

 

The following code implements a more efficient solution:

 

    mov     r4, #10
  loop_label:
    bl      do_something
    subs    r4, r4, #1
    bne     loop_label

 

Adding the s suffix to sub causes it to update the flags itself, based on the result of the operation. This suffix can be added to many (but not all) arithmetic and logical operations [2].
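For readers who think more easily in C, the shape of the subs/bne loop can be sketched as a do-while loop in which the decrement and the test are a single step. This is only an illustration: the call counter and the run_countdown helper are assumptions made for the sketch, not something a compiler would emit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the loop body; it only counts its calls. */
static unsigned call_count = 0;
static void do_something(void) { call_count++; }

/* Mirrors the subs/bne loop: the body runs first, then the counter is
 * decremented and tested against zero in one step. */
static unsigned run_countdown(uint32_t start)
{
    uint32_t i = start;          /* mov  r4, #10 (when start == 10) */
    call_count = 0;
    do {
        do_something();          /* bl   do_something */
    } while (--i != 0);          /* subs r4, r4, #1 ; bne loop_label */
    return call_count;
}
```

Calling run_countdown(10) runs the body ten times. As with the assembly version, the test comes after the body, so a starting value of 0 would wrap around rather than skip the loop.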

 

In the rest of the article, I will explain what the condition flags are, where they are stored, and how to test them using condition codes.

 

Condition-Code Analysis Tool

 

If you have an ARM platform (or emulator) handy, the attached ccdemo application can be used to experiment with the operations discussed in the article. The application allows you to pick an operation and two operands, and shows the resulting flags and a list of which condition codes will match. When writing assembly code, it can also be a rather useful development tool.

 

The Flags

 

The simplest way to set the condition flags is to use a comparison operation, such as cmp. This mechanism is common to many processor architectures, and the semantics (if not the details) of cmp will likely be familiar. In addition, we have already seen that many instructions (such as sub in the example) can be modified to update the condition flags based on the result by adding an s suffix. That's all well and good, but what information is stored, and how can we access it?

 

The additional information is stored in four condition flag bits in the APSR (Application Processor Status Register), or the CPSR (Current Processor Status Register) if you are used to pre-ARMv7 terminology [3][4]. The flags indicate simple properties such as whether or not the result was negative, and are used in various combinations to detect higher-level relationships such as "greater than" and suchlike. Once I have described the flags, I will explain how they map onto condition codes (such as ne in the previous example).

 

N: Negative

 

The N flag is set by an instruction if the result is negative. In practice, N is set to the two's complement sign bit of the result (bit 31).

 

Z: Zero

 

The Z flag is set if the result of the flag-setting instruction is zero.

 

C: Carry (or Unsigned Overflow)

 

The C flag is set if the result of an unsigned operation overflows the 32-bit result register. This bit can be used to implement 64-bit unsigned arithmetic, for example.
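As a rough illustration of the 64-bit case, the C sketch below adds two 64-bit values held as pairs of 32-bit halves, in the way an adds followed by an adc would. The u64_pair type and the add64 name are assumptions made for this example.

```c
#include <stdint.h>

/* A 64-bit unsigned value held as two 32-bit halves, as it would be in a
 * pair of 32-bit registers. The type name is an assumption for this sketch. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

static u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;                  /* like adds: wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);      /* C flag: 1 if the low add overflowed */
    r.hi = a.hi + b.hi + carry;          /* like adc: adds the carry in */
    return r;
}
```

Adding 0x00000000ffffffff and 0x0000000000000001 this way gives a low half of 0 and a high half of 1, because the carry out of the low word is fed into the high word.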

 

V: (Signed) Overflow

 

The V flag works the same as the C flag, but for signed operations. For example, 0x7fffffff is the largest positive two's complement integer that can be represented in 32 bits, so 0x7fffffff + 0x7fffffff triggers a signed overflow, but not an unsigned overflow (or carry): the result, 0xfffffffe, is correct if interpreted as an unsigned quantity, but represents a negative value (-2) if interpreted as a signed quantity.

 

Flag-Setting Example

 

Consider the following example:

 

  ldr     r1, =0xffffffff
  ldr     r2, =0x00000001
  adds    r0, r1, r2

 

The result of the operation would be 0x100000000, but the top bit is lost because it does not fit into the 32-bit destination register and so the real result is 0x00000000. In this case, the flags will be set as follows:

 

Flag    Explanation
N = 0   The result is 0, which is considered positive, and so the N (negative) bit is set to 0.
Z = 1   The result is 0, so the Z (zero) bit is set to 1.
C = 1   We lost some data because the result did not fit into 32 bits, so the processor indicates this by setting C (carry) to 1.
V = 0   From a two's complement signed-arithmetic viewpoint, 0xffffffff really means -1, so the operation we did was really (-1) + 1 = 0. That operation clearly does not overflow, so V (overflow) is set to 0.
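The flag values in the table above can be reproduced with a small C model of a 32-bit flag-setting add. This is a sketch under the assumption that only the plain adds case matters; the nzcv_flags type and the flags_for_adds name are invented for the example and are not part of any ARM tool.

```c
#include <stdint.h>
#include <stdbool.h>

/* The four condition flags. The struct name is an assumption for this sketch. */
typedef struct {
    bool n, z, c, v;
} nzcv_flags;

/* Model of a 32-bit flag-setting add (adds). */
static nzcv_flags flags_for_adds(uint32_t a, uint32_t b, uint32_t *result)
{
    uint32_t r = a + b;                       /* wraps modulo 2^32 */
    nzcv_flags f;
    f.n = (r >> 31) != 0;                     /* sign bit of the result */
    f.z = (r == 0);
    f.c = (r < a);                            /* unsigned overflow (carry out) */
    f.v = (((a ^ r) & (b ^ r)) >> 31) != 0;   /* operands agree in sign, result differs */
    if (result) *result = r;
    return f;
}
```

For 0xffffffff + 0x00000001 the model gives N=0, Z=1, C=1, V=0, and for 0x7fffffff + 0x7fffffff it gives N=1, Z=0, C=0, V=1, which matches the signed-overflow example in the description of the V flag.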

 

If you fancy it, you can check this with the ccdemo application. The output looks like this:

 

  $ ./ccdemo adds 0xffffffff 0x1
  The results (in various formats):
         Signed:         -1 adds          1 =          0
       Unsigned: 4294967295 adds          1 =          0
    Hexadecimal: 0xffffffff adds 0x00000001 = 0x00000000
  Flags:
    N (negative): 0
    Z (zero)    : 1
    C (carry)   : 1
    V (overflow): 0
  Condition Codes:
    EQ: 1    NE: 0
    CS: 1    CC: 0
    MI: 0    PL: 1
    VS: 0    VC: 1
    HI: 0    LS: 1
    GE: 1    LT: 0
    GT: 0    LE: 1

 

Reading the Flags

 

We have worked out how to set the flags, but how does that result in the ability to conditionally execute some code? Being able to set the flags is pointless if you cannot then react to them.

 

The most common method of testing the flags is to use conditional execution codes. This mechanism is similar to mechanisms used in other architectures, so if you are familiar with other machines you might recognize the following pattern, which maps cleanly onto C's if/else construct:

 

    cmp     r0, #20
    bhi     do_something_else
  do_something:
    @ This code runs if (r0 <= 20).
    b       continue    @ Prevent do_something_else from executing.
  do_something_else:
    @ This code runs if (r0 > 20).
  continue:
    @ Other code.

 

In effect, attaching one of the condition codes to an instruction causes it to execute if the condition is true. Otherwise, it does nothing, and is essentially a nop.

 

The following table lists the available condition codes, their meanings (where the flags were set by a cmp or subs instruction), and the flags that are tested:

 

Code             Meaning (for cmp or subs)                               Flags Tested
eq               Equal.                                                  Z==1
ne               Not equal.                                              Z==0
cs or hs         Unsigned higher or same (or carry set).                 C==1
cc or lo         Unsigned lower (or carry clear).                        C==0
mi               Negative. The mnemonic stands for "minus".              N==1
pl               Positive or zero. The mnemonic stands for "plus".       N==0
vs               Signed overflow. The mnemonic stands for "V set".       V==1
vc               No signed overflow. The mnemonic stands for "V clear".  V==0
hi               Unsigned higher.                                        (C==1) && (Z==0)
ls               Unsigned lower or same.                                 (C==0) || (Z==1)
ge               Signed greater than or equal.                           N==V
lt               Signed less than.                                       N!=V
gt               Signed greater than.                                    (Z==0) && (N==V)
le               Signed less than or equal.                              (Z==1) || (N!=V)
al (or omitted)  Always executed.                                        None tested.

 

It is fairly obvious how the first few work because they test individual flags, but the others rely on specific combinations of flags. In practice, you very rarely need to know exactly what is happening; the mnemonics hide the complexity of the comparisons.
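To make the flag combinations concrete, here is a minimal C sketch that evaluates each condition code from a set of flags, following the "Flags Tested" column of the table. The nzcv_flags type and the condition_passes helper are assumptions made for this illustration; treating an unknown name as failing is also just a choice made here.

```c
#include <stdbool.h>
#include <string.h>

/* The four condition flags. The struct name is an assumption for this sketch. */
typedef struct { bool n, z, c, v; } nzcv_flags;

/* Returns true if the named condition code passes for the given flags. */
static bool condition_passes(const char *code, nzcv_flags f)
{
    if (!strcmp(code, "eq")) return f.z;
    if (!strcmp(code, "ne")) return !f.z;
    if (!strcmp(code, "cs") || !strcmp(code, "hs")) return f.c;
    if (!strcmp(code, "cc") || !strcmp(code, "lo")) return !f.c;
    if (!strcmp(code, "mi")) return f.n;
    if (!strcmp(code, "pl")) return !f.n;
    if (!strcmp(code, "vs")) return f.v;
    if (!strcmp(code, "vc")) return !f.v;
    if (!strcmp(code, "hi")) return f.c && !f.z;
    if (!strcmp(code, "ls")) return !f.c || f.z;
    if (!strcmp(code, "ge")) return f.n == f.v;
    if (!strcmp(code, "lt")) return f.n != f.v;
    if (!strcmp(code, "gt")) return !f.z && (f.n == f.v);
    if (!strcmp(code, "le")) return f.z || (f.n != f.v);
    if (!strcmp(code, "al")) return true;
    return false;   /* unknown name: treated as failing in this sketch */
}
```

With the flags from the earlier adds example (N=0, Z=1, C=1, V=0), the helper reports eq, cs, pl, vc, ls, ge and le as passing, which agrees with the ccdemo output shown above.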

 

Here, once again, is the example for-loop code I gave earlier:

 

    mov     r4, #10
  loop_label:
    bl      do_something
    subs    r4, r4, #1
    bne     loop_label

 

It should now be easy enough to work out exactly what is happening here:

 

  • The subs instruction sets the flags based on the result of r4-1. In particular, the Z flag will be set if the result is 0, and it will be clear if the result is anything else.
  • The bne instruction only executes if condition ne is true. That condition is true if Z is clear, so the bne iterates the loop until Z is set (and therefore r4 is 0).

 

Dedicated Comparison Instructions

 

The cmp instruction (that we saw in the first example) can be thought of as a subs instruction that doesn't store its result: if the two operands are equal, the result of the subtraction will be zero, hence the mapping between eq and the Z flag. Of course, we could just use a subs instruction with a dummy destination register, but you can only do that if you have a register to spare. Dedicated comparison instructions are therefore quite commonly used.

 

There are actually four dedicated comparison instructions available, and they perform operations as described in the following table:

 

Instruction  Description
cmp          Works like subs, but does not store the result.
cmn          Works like adds, but does not store the result.
tst          Works like ands, but does not store the result.
teq          Works like eors, but does not store the result.

 

Note that the dedicated comparison operations do not require the s suffix; they only update the flags, so the suffix would be redundant.
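The "subtract and throw the result away" view of cmp can be sketched in C as follows. The nzcv_flags type and the flags_for_cmp name are assumptions for the example. Note that for a subtraction the C flag is set when no borrow occurs, which is why hs ("unsigned higher or same") tests C==1.

```c
#include <stdint.h>
#include <stdbool.h>

/* The four condition flags. The struct name is an assumption for this sketch. */
typedef struct { bool n, z, c, v; } nzcv_flags;

/* cmp a, b: the flags of (a - b); the difference itself is discarded. */
static nzcv_flags flags_for_cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;                       /* wraps modulo 2^32 */
    nzcv_flags f;
    f.n = (r >> 31) != 0;                     /* sign bit of the difference */
    f.z = (r == 0);                           /* operands were equal */
    f.c = (a >= b);                           /* C set means "no borrow" */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;   /* signs differ, result sign differs from a */
    return f;
}
```

Comparing 5 with 5 sets Z and C. Comparing 3 with 5 sets N and clears C, so both lt and lo pass. Comparing 0x80000000 with 1 sets C and V with N clear, so hi passes (unsigned higher) and lt also passes (the first operand is negative when read as signed).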

 

End Note

 

Whilst the condition flag mechanism is fairly simple in principle, there are a lot of details to take in, and seeing some real examples will probably be useful! I will make a point of presenting some examples of realistic usage in a future blog post.

 


 

[1] Technically, most instructions can be executed conditionally, not just branches. However, I will discuss such conditional execution in more detail in another article.

[2] The Instruction Set Quick Reference Card summarises the flag-setting abilities of each instruction. The Architecture Reference Manual contains detailed information about exactly how the flags are updated for each instruction.

[3] The APSR and CPSR are actually the same on ARMv7, despite having separate names, but only the condition codes and one or two other bits are defined for the APSR. The other bits should not really be accessed directly anyway, so the renaming is essentially a clean-up of the old mixed-access CPSR. Note, however, that GCC (4.3.3 at least) does not accept APSR, so you have to use CPSR in your assembly source if you want to access it.

[4] In general, you will very rarely need to directly access the APSR because the condition codes give you the functionality you usually need from them anyway. However, if you really want to see what is in there, you can access it using the msr and mrs instructions. Indeed, this is the method that the ccdemo application uses to give information about the specified operation.


 

In my previous post (Condition Codes 1), I explained that some instructions can set some global condition codes, and that these codes can be used to conditionally execute code. I gave some examples of usage. One such example was an assembly implementation of C's if/else construct:

 

    cmp     r0, #20
    bhi     do_something_else
  do_something:
    @ This code runs if (r0 <= 20).
    b       continue    @ Prevent do_something_else from executing.
  do_something_else:
    @ This code runs if (r0 > 20).
  continue:
    @ Other code.

 

The example is valid, and will work on any ARM core. However, is this an efficient solution if you only need to execute one or two instructions in each case? Consider the following C code:

 

  if (a >= 10) {
    a = 10;
  } else {
    a = a + 1;
  }

 

It should be clear that the code increments a unless it has hit or exceeded a limit of 10, in which case it is set to 10. Mapping this onto our if/else example, this might be implemented in assembly as follows:

 

    cmp     r0, #10
    blo     r0_is_small
  r0_is_big:
    mov     r0, #10
    b       continue
  r0_is_small:
    add     r0, r0, #1
  continue:
    @ Other code.

 

The above code executes one of two instructions, either the mov or the add. However, it uses two branch instructions to achieve this. Without branch prediction, these branches can take several cycles to execute. Even with branch prediction, the pattern may not be easily predicted. Finally, even with perfect branch prediction, each branch instruction takes four bytes of instruction memory, so code size may become a problem.

 

An Improved Example

 

One of the features of the ARM instruction set is that almost every instruction encoding includes a 4-bit field that represents a condition code. If the condition attached to an instruction passes, the instruction executes. Otherwise, it has no effect, as if you had used a nop instruction. Using this knowledge, we can implement the previous example more efficiently as follows:

 

    cmp     r0, #10
    movhs   r0, #10
    addlo   r0, r0, #1
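In C terms, the conditionally-executed pair behaves like a single select on one comparison. The sketch below assumes an unsigned value, because hs and lo are the unsigned conditions; the function name step_towards_limit is made up for the example.

```c
#include <stdint.h>

/* Select form of the if/else: one comparison decides which result is kept. */
static uint32_t step_towards_limit(uint32_t a)
{
    /* cmp r0, #10 ; movhs r0, #10 ; addlo r0, r0, #1 */
    return (a >= 10u) ? 10u : a + 1u;
}
```

For inputs 0, 9, 10 and 200 this yields 1, 10, 10 and 10 respectively. A compiler targeting ARM may or may not choose conditional instructions for such code; that decision depends on the compiler and target.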

 

Unconditionally-Executed Instructions

 

In the ARM instruction set, the condition code is encoded using a 4-bit field in the instruction. The encoding includes 3 bits to identify an operation, and a fourth bit to invert the condition. The eq condition, for example, is the exact opposite of the ne condition. It may interest authors of JIT compilers to know that the least significant bit of the condition code can be inverted to obtain the opposite condition code. For example, eq (equal) is encoded as '0000' and ne (not equal) as '0001'. This works for every condition code with the exception of the al (always) condition, encoded as '1110'. Its opposite, '1111', would mean "never", and it would be wasteful to dedicate one sixteenth of the instruction set to instructions that can never execute. Instead, this portion of the instruction set is used for the few instructions which cannot be executed conditionally.
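The bit-flip trick is easy to express in C. In the sketch below, only the eq, ne and al encodings come from the text above; the remaining values are taken from the architecture's condition field table, and the enum and function names are assumptions made for the example.

```c
/* 4-bit condition field encodings. Names are assumptions for this sketch. */
enum cond_code {
    COND_EQ = 0x0, COND_NE = 0x1,
    COND_HS = 0x2, COND_LO = 0x3,
    COND_MI = 0x4, COND_PL = 0x5,
    COND_VS = 0x6, COND_VC = 0x7,
    COND_HI = 0x8, COND_LS = 0x9,
    COND_GE = 0xA, COND_LT = 0xB,
    COND_GT = 0xC, COND_LE = 0xD,
    COND_AL = 0xE
};

/* Returns the opposite condition, or -1 for al, which has no usable opposite. */
static int invert_condition(enum cond_code cond)
{
    if (cond == COND_AL) {
        return -1;               /* 0xF is not a "never" condition */
    }
    return (int)cond ^ 1;        /* flip the least significant bit */
}
```

A JIT that has emitted a branch with condition ge, for example, could obtain lt for the fall-through path with invert_condition(COND_GE) instead of keeping a lookup table.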

 

Here are a few examples of instructions which will always execute unconditionally in the ARM instruction set:

 

  • blx <label> cannot be conditionally executed, but blx <register> (and all other branch instructions) can.
  • Most NEON instructions. For example, SIMD (NEON) variants of vadd cannot be conditionally executed, though the scalar (VFP) variants can.
  • Hint instructions, such as pld (preload data).
  • Barriers, such as dmb (data memory barrier), dsb (data synchronization barrier), isb (instruction synchronization barrier).

 

As always, the ARMv7-AR Architecture Reference Manual contains the most complete and accurate information, as does the Instruction Set Quick Reference Card.

 

Conditional Execution and High-Performance Processors

 

In the time when few processors had branch prediction and when code size was very constrained, conditional execution was an excellent way to save code space whilst also improving performance in many programs. This is still true for today's real-time processors and micro-controllers. However, ARM's application-class processors include branch predictors which often make the branch-based if/else construction more attractive than conditional instructions. A predicted branch may be very cheap, or even free in some cases. In addition, conditional execution can, in some cases, prevent out-of-order execution as it adds additional instruction stream dependencies.

 

In some cases, it can be difficult to know whether to use conditional execution or traditional conditional branches for a particular application. However, as a general rule-of-thumb, it's probably best to use conditional instructions for sequences of three instructions or fewer, and branches for longer sequences. The best-performing solution varies between processors as they have different pipeline and branch predictor designs, and it also varies depending on the specific instruction sequence you are using. Also note that the fastest solution is not necessarily the smallest.

 

Thumb

 

In the original 16-bit Thumb instruction set, only branches could be conditional. In Thumb-2, the it instruction was added to provide functionality and behaviour similar to conditional instructions in ARM. Thumb-2's it instruction can also conditionally execute some instructions which are normally unconditionally executed in ARM state. I won't say more about it now, though it will be covered in detail in my next post in this series.



Thumb-2 can make use of the same conditional execution features that the ARM instruction set provides. For conditionally executing one or two instructions, this mechanism can provide code-size and performance benefits over the (more conventional) conditional branching mechanism.

 

I noted at the end of the last post in this series that this mechanism is not directly available to Thumb. Instead, Thumb-2 has an instruction, it, which can provide the same functionality as ARM conditional execution. In this article, I will describe the it instruction, and I will also explain a few caveats of condition-setting instructions in Thumb-2. Note that the it instruction is only available to Thumb-2, and so most of this article will not be relevant to the old Thumb instruction set.

 

The it Instruction

 

With the exception of simple conditional branches, Thumb-2 instructions do not have the 4-bit condition code field that most ARM instructions have. Instead, Thumb-2 has the it instruction, which conditionally executes up to four subsequent instructions. The instructions affected by an it instruction are said to be in an it block.

 

The mnemonic it represents an if-then construct. If the condition code (given as an argument to the instruction) evaluates to true, then the next instruction is executed. Up to three additional t (then) or e (else) codes can be added to control the execution of the subsequent instructions. For example, read ite as if-then-else, and ittee as if-then-then-else-else. The following code either increments r0, or resets it to 0 if it is greater than or equal to 10:

 

  .syntax unified   @ Remember this!
  .thumb
  [...]
  cmp     r0, #10
  ite     lo        @ if r0 is lower than 10 ...
  addlo   r0, #1    @ ... then r0 = r0 + 1
  movhs   r0, #0    @ ... else r0 = 0

 

Note that the conditionally-executed instructions inside the it block must still be given condition codes, as they would in ARM assembly. Assemblers will check that the condition you gave to it is consistent with those on the individual instructions. The then conditions must match the condition code, and any else conditions must be the opposite condition. In the example, the else condition was hs (higher or same), the opposite of lo (lower). The table below shows the condition codes and their opposites:

 

Condition Code                                             Opposite
Code             Description                               Code             Description
eq               Equal.                                    ne               Not equal.
hs (or cs)       Unsigned higher or same (or carry set).   lo (or cc)       Unsigned lower (or carry clear).
mi               Negative.                                 pl               Positive or zero.
vs               Signed overflow.                          vc               No signed overflow.
hi               Unsigned higher.                          ls               Unsigned lower or same.
ge               Signed greater than or equal.             lt               Signed less than.
gt               Signed greater than.                      le               Signed less than or equal.
al (or omitted)  Always executed.                          There is no opposite to al.

 

Whilst it is valid to give condition code al to the it, it has no opposite as there is no never code. It is not valid to specify the al condition code in an it instruction that uses an else clause.
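The consistency rule that assemblers apply can be sketched in C: given the condition on the it instruction and its t/e suffix, work out the condition each instruction in the block must carry. The sketch relies on the fact that opposite conditions differ only in the least significant bit of their encoding. The function name, the enum, and the choice of -1 as the error value are assumptions made for this illustration, and the code does not model the instruction's binary encoding.

```c
#include <stddef.h>
#include <string.h>

/* A few 4-bit condition encodings, enough for the examples in the text. */
enum { COND_EQ = 0x0, COND_NE = 0x1, COND_HS = 0x2, COND_LO = 0x3, COND_AL = 0xE };

/* Fills out[] with the condition each instruction in the block must carry.
 * suffix is the text after "it", e.g. "e" for ite or "tee" for ittee.
 * Returns the block length (1 to 4), or -1 if the combination is not allowed. */
static int it_block_conditions(int first_cond, const char *suffix, int out[4])
{
    size_t extra = strlen(suffix);
    if (extra > 3) {
        return -1;                                 /* at most four instructions */
    }
    out[0] = first_cond;                           /* the first slot is always "then" */
    for (size_t i = 0; i < extra; i++) {
        if (suffix[i] == 't') {
            out[i + 1] = first_cond;               /* then: same condition */
        } else if (suffix[i] == 'e') {
            if (first_cond == COND_AL) {
                return -1;                         /* al has no opposite */
            }
            out[i + 1] = first_cond ^ 1;           /* else: opposite condition */
        } else {
            return -1;                             /* not a t or an e */
        }
    }
    return (int)extra + 1;
}
```

For ite lo the sketch yields lo followed by hs, matching the example above, and it rejects an al condition combined with any else slot.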

 

Branches

 

Just like other instructions, Thumb-2's branches can be conditionally executed using it. Indeed, some branches cannot be conditionally executed without using an it block. However, any branches that exist in an it block must be the last instruction in the block. The following, for example, is unpredictable:

 

  ite     eq
  blxeq   some_label  @ UNPREDICTABLE during an IT block.
  movne   r0, #0

 

The correct way to implement the above would be to put the mov before the blx, as follows:

 

  ite     ne
  movne   r0, #0
  blxeq   some_label  @ Ok at the end of an IT block.

 

Compatibility with ARM Assembly

 

The it instruction is valid in ARM assembly, though it will not generate any code. This is done for compatibility with Thumb-2 assembly, and allows most assembly sequences to be assembled for both ARM and Thumb-2.

 

Simple Conditional Branches

 

Just like ARM code, a simple Thumb b instruction can be made conditional by adding a suitable condition code suffix. Indeed, the if/else example provided in my last post will assemble for Thumb just as it will for ARM.

 

Interesting Optimization Possibilities

 

Condition Code al

 

16-bit forms of Thumb arithmetic instructions usually set the condition flags. When inside an it block, however, the 16-bit forms do not set the flags. This property can be useful in combination with condition code al. Consider the following code sequence:

 

                          @ Instruction Size
  add     r0, r0, #1      @   4 bytes
  add     r1, r1, #1      @   4 bytes
  add     r2, r2, #1      @   4 bytes
  add     r3, r3, #1      @   4 bytes
                    @ Total: 16 bytes

 

Writing an equivalent code sequence using an it block can result in smaller code size:

 

                          @ Instruction Size
  itttt   al              @   2 bytes
  addal   r0, r0, #1      @   2 bytes
  addal   r1, r1, #1      @   2 bytes
  addal   r2, r2, #1      @   2 bytes
  addal   r3, r3, #1      @   2 bytes
                    @ Total: 10 bytes

 

It should be noted that the 16-bit forms have additional limitations, so the it trick used above may not always be applicable. The restrictions vary between instructions, but the 16-bit instruction forms can typically only access r0-r7 and have a very restricted range of immediate constants. For details, refer to the Architecture Reference Manual.

 

Flag Setting

 

Because (outside of it blocks) most arithmetic instructions that set the flags have 16-bit forms, code size can be dramatically improved by setting the flags even when not necessary. This will provide the best (smallest) code size possible. However, depending on your target processor, this technique may have a small negative performance impact. It is perhaps advisable to use the al condition trick or 32-bit instructions in performance-critical code.

 

You can force the assembler to produce 16-bit instructions by adding a .n suffix. Assemblers will choose the 16-bit form anyway where one is available, but if your instruction cannot be encoded using a 16-bit form and you specify .n, the assembler will give an error message.

 

  [...]                   @ Not in an IT block.
  adds.n  r1, r2, r3      @ Generates a 16-bit instruction.
  add.n   r1, r2, r3      @ Error: No 16-bit form for this.

 

Refer to the Architecture Reference Manual for details of each instruction, and information about the constraints of the 16-bit forms. There are many exceptions and special cases so I won't describe them here in detail.

 

Floating-point comparisons in the ARM architecture use the same mechanism as integer comparisons. However, there are some unavoidable caveats because the range of supported relationships is different for floating-point values. There are two problems to consider here: setting the flags from a VFP comparison, and interpreting the flags with condition codes.

 

This post is applicable to all processors with VFP. The mechanisms I will describe do not differ between VFP variants. Similarly, the mechanisms are equally available in ARM and Thumb-2 modes. I described conditional execution in Thumb-2 in my last article.

 

Setting the Flags with VFP

 

As I described at the start of this series, the integer cmp instruction performs an integer comparison and updates the APSR (Application Processor Status Register) with information about the result of the comparison. The APSR holds the condition flags used by the processor for conditional execution. When VFP is used to perform a floating-point comparison, the vcmp instruction is used to update the FPSCR (Floating-Point System Control Register). This isn't usually useful by itself, however, as the processor cannot directly use the FPSCR for conditional execution. The vmrs instruction must be used to transfer the flags to the APSR [1].

 

  .syntax unified             @ Remember this!
  [...]
  vcmp    d0, d1
  vmrs    APSR_nzcv, FPSCR    @ Get the flags into APSR.
  [...]                       @ Do something with the condition flags.

 

Note that some versions of the GNU assembler do not accept all of the new instruction variants (with the "v" prefix). In this case, use fcmp in place of vcmp, and fmstat (with no arguments) in place of vmrs.

 

Flag Meanings

 

The integer comparison flags support comparisons which are not applicable to floating-point numbers. For example, floating-point values are always signed, so there is no need for unsigned comparisons. On the other hand, floating-point comparisons can result in the unordered result (meaning that one or both operands was NaN, or "not a number"). IEEE-754 defines four testable relationships between two floating-point values, and they map onto the ARM condition codes as follows:

 

        

IEEE-754 Relationship                        ARM APSR Flags
                                             N  Z  C  V
Equal                                        0  1  1  0
Less Than                                    1  0  0  0
Greater Than                                 0  0  1  0
Unordered (At least one argument was NaN.)   0  0  1  1
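The mapping in the table above can be modelled in C with ordinary floating-point comparisons, all of which are false when either operand is NaN. This is a sketch of the flag values only, not of how VFP hardware computes them; the nzcv_flags type and the flags_for_vcmp name are assumptions for the example.

```c
#include <stdbool.h>

/* The four condition flags. The struct name is an assumption for this sketch. */
typedef struct { bool n, z, c, v; } nzcv_flags;

/* Flags that end up in the APSR after vcmp a, b followed by vmrs. */
static nzcv_flags flags_for_vcmp(double a, double b)
{
    if (a == b) {
        return (nzcv_flags){ false, true,  true,  false };   /* equal:        0110 */
    }
    if (a < b) {
        return (nzcv_flags){ true,  false, false, false };   /* less than:    1000 */
    }
    if (a > b) {
        return (nzcv_flags){ false, false, true,  false };   /* greater than: 0010 */
    }
    /* None of ==, < or > held, so at least one operand is NaN. */
    return (nzcv_flags){ false, false, true,  true  };       /* unordered:    0011 */
}
```

Because every ordered comparison with NaN is false, the final return is reached exactly in the unordered case.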

 

Compare with Zero

 

Unlike the integer instructions, most VFP (and NEON) instructions can operate only on registers, and cannot accept immediate values encoded in the instruction stream. The vcmp instruction is a notable exception in that it has a special-case variant that allows quick and easy comparison with zero.

 

Interpreting the Flags

 

Once the flags are in the APSR, they may be used almost as if an integer comparison had set the flags. However, floating-point comparisons support different relationships, so the integer condition codes do not always make sense. The following table is equivalent to the condition code table from the first post in this series, but it describes floating-point comparisons as well as integer comparisons:

 

Code             Meaning (when set by vcmp)                    Meaning (when set by cmp)             Flags Tested
eq               Equal to.                                     Equal to.                             Z==1
ne               Unordered, or not equal to.                   Not equal to.                         Z==0
cs or hs         Greater than, equal to, or unordered.         Greater than or equal to (unsigned).  C==1
cc or lo         Less than.                                    Less than (unsigned).                 C==0
mi               Less than.                                    Negative.                             N==1
pl               Greater than, equal to, or unordered.         Positive or zero.                     N==0
vs               Unordered. (At least one argument was NaN.)   Signed overflow.                      V==1
vc               Not unordered. (No argument was NaN.)         No signed overflow.                   V==0
hi               Greater than or unordered.                    Greater than (unsigned).              (C==1) && (Z==0)
ls               Less than or equal to.                        Less than or equal to (unsigned).     (C==0) || (Z==1)
ge               Greater than or equal to.                     Greater than or equal to (signed).    N==V
lt               Less than or unordered.                       Less than (signed).                   N!=V
gt               Greater than.                                 Greater than (signed).                (Z==0) && (N==V)
le               Less than, equal to or unordered.             Less than or equal to (signed).       (Z==1) || (N!=V)
al (or omitted)  Always executed.                              Always executed.                      None tested.

 

It should be obvious that the condition code is attached to the instruction reading the flags, and the source of the flags makes no difference to the flags that are tested. It is the meaning of the flags that differs when you perform a vcmp rather than a cmp. Similarly, it is clear that the opposite conditions still hold. (For example, hs is still the opposite of lo.)

 

The flags when set by cmp generally have analogous meanings when set by vcmp. For example, gt still means "greater than". However, the unordered condition and the removal of the signed conditions can confuse matters. Often, for example, it is desirable to use lo (normally an unsigned "less than" check) in place of lt, because it does not match in the unordered case.
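The difference between lt and lo after a floating-point comparison can be checked directly against the flag patterns that vcmp produces. The C sketch below hard-codes those four patterns (N, Z, C, V as 0110, 1000, 0010 and 0011) and evaluates the two conditions; all names in it are assumptions made for the illustration.

```c
#include <stdbool.h>

/* The four condition flags. The struct name is an assumption for this sketch. */
typedef struct { bool n, z, c, v; } nzcv_flags;

/* Flag patterns produced by a floating-point compare, in N, Z, C, V order. */
static const nzcv_flags VCMP_EQUAL     = { false, true,  true,  false };
static const nzcv_flags VCMP_LESS      = { true,  false, false, false };
static const nzcv_flags VCMP_GREATER   = { false, false, true,  false };
static const nzcv_flags VCMP_UNORDERED = { false, false, true,  true  };

static bool cond_lt(nzcv_flags f) { return f.n != f.v; }   /* lt tests N!=V */
static bool cond_lo(nzcv_flags f) { return !f.c; }         /* lo tests C==0 */
```

Both conditions pass for VCMP_LESS and fail for VCMP_EQUAL and VCMP_GREATER. They differ only for VCMP_UNORDERED, where lt passes and lo does not, which is why lo is the safer "less than" when a NaN operand must not take the branch.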

 

Performance Considerations

 

Be aware that vmrs effectively implements a data transfer between VFP and the integer core, and this operation can be relatively expensive on some cores. In addition, there is clearly a data dependency between vcmp and vmrs and another between vmrs and the conditional instruction. It is advisable to structure your code such that the flags are set and transferred many instructions before they are actually read. This is also true of integer comparisons, though the effect is likely to be more significant when using VFP.

 

Some instruction timing information and latency information is available for the Cortex-A8 and Cortex-A9 processors.

 

Examples

 

VFP Version of ccdemo

 

In my first post in this series, I provided an example program ("ccdemo") to show how the flags and condition codes interact. A VFP version (using vcmp) is attached to this article.

 

Complex Number Addition with Special NaN Handler

 

  @ Add complex numbers (or two-element vectors) in s3:s2 and s5:s4, storing
  @ the result in s1:s0. If either element of the result is NaN, jump to a
  @ special handler.
  vadd    s0, s2, s4
  vadd    s1, s3, s5
  vcmp    s0, s1
  vmrs    APSR_nzcv, FPSCR
  bvs     nan_handler

 

Loop Condition

 

    @ This implements a loop that calculates d0=d0-(1/d0) until d0 is negative.
    vmov    d0, #10.0     @ Some starting value.
    vmov    d2, #1.0      @ We need the constant 1.0 in the loop.

  1:  [...]               @ Do something interesting with d0.

    vdiv    d1, d2, d0    @ d1=(1/d0)
    vsub    d0, d0, d1    @ d0=d0-(1/d0)
    vcmp    d0, #0        @ Special case of vcmp for compare-with-zero.
    vmrs    APSR_nzcv, FPSCR

    bge     1b

 

Implementation of fmax

 

    @ A typical implementation of "fmax".
    @ Put into d0 the greatest of d1 and d2.
    @  - If one argument is NaN, the result is the other argument.
    @  - If both arguments are NaN, the result is NaN.
    @ I have used "it" blocks here so the sequence can be assembled as either
    @ ARM or Thumb-2 code.
    vcmp    d1, d2
    vmrs    APSR_nzcv, FPSCR
    it      vs        @ Code "vs" means "unordered".
    bvs     1f        @ Jump to the NaN handler.

    @ Normal-case (not-NaN) handler.
    ite     ge
    vmovge  d0, d1    @ Select d1 if it is the greatest (or equal).
    vmovlt  d0, d2    @ Select d2 if it is the greatest.
    b       2f        @ Jump over the NaN handler.

  1:
    @ NaN handler. We know that at least one argument was NaN.
    vcmp    d1, #0
    vmrs    APSR_nzcv, FPSCR
    ite     vc        @ Code "vc" means "not unordered".
    vmovvc  d0, d1    @ d1 wasn't NaN, so make it the result.
    vmovvs  d0, d2    @ d1 was NaN, so choose d2. (This might be NaN too.)

  2:
    @ Done. The result is in d0.
    [...]
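The selection logic of the fmax sequence can be followed more easily in C. The sketch below mirrors the two paths of the assembly (ordered and unordered); it is a reading aid rather than a replacement for the C library's fmax, and the name fmax_sketch is an assumption made for the example.

```c
#include <stdbool.h>

/* Mirrors the assembly: d1 and d2 are the inputs, the return value is d0. */
static double fmax_sketch(double d1, double d2)
{
    /* vcmp d1, d2 followed by "vs": true if either operand is NaN. */
    bool unordered = (d1 != d1) || (d2 != d2);

    if (unordered) {
        /* NaN handler: vcmp d1, #0 is unordered only if d1 itself is NaN. */
        return (d1 == d1) ? d1 : d2;    /* vmovvc d0, d1 ; vmovvs d0, d2 */
    }

    /* Normal case: ite ge ; vmovge d0, d1 ; vmovlt d0, d2 */
    return (d1 >= d2) ? d1 : d2;
}
```

With one NaN argument the other argument is returned, and with two NaN arguments the result is NaN, as the comments in the assembly describe.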
[1] The vmrs instruction can also transfer the flags (along with the rest of the FPSCR) to an arbitrary general-purpose integer register, but this is usually only useful for accessing fields in the FPSCR other than the condition flags.


http://community.arm.com/groups/processors/blog/2010/07/16/condition-codes-1-condition-flags-and-codes

