Differences between AT&T and Intel assembly syntax

AT&T assembly syntax is broadly similar to Intel's; see the gas manual for the details.

The main differences are listed below (excerpted from the gas manual):

AT&T Syntax versus Intel Syntax

-------------------------------


Original:

In order to maintain compatibility with the output of gcc, as supports AT&T System V/386 assembler syntax. 

This is quite different from Intel syntax. 

We mention these differences because almost all 80386 documents used only Intel syntax. 

Notable differences between the two syntaxes are: 

Note: in the passage above, as is the GNU assembler (gas), and the gcc output it refers to is the assembly listing produced by gcc -S source_file.

1> Immediate operands

AT&T immediate operands are preceded by `$';

Intel immediate operands are undelimited (Intel `push 4' is AT&T `pushl $4').


2> Register operands

AT&T register operands are preceded by `%';

Intel register operands are undelimited.


3> Absolute jumps/calls

AT&T absolute (as opposed to PC relative) jump/call operands are prefixed by `*';

they are undelimited in Intel syntax.


4> Source and destination order

AT&T and Intel syntax use the opposite order for source and destination operands.

Intel `add eax, 4' is `addl $4, %eax'. That is, Intel writes `opcode destination, source', while AT&T writes `opcode source, destination'.

The `source, dest' convention is maintained for compatibility with previous Unix assemblers.


5> Size of memory operands (b, w, l)

In AT&T syntax the size of memory operands is determined from the last character of the opcode name.

Opcode suffixes of `b', `w', and `l' specify byte (8-bit), word (16-bit), and long (32-bit) memory references.

Intel syntax accomplishes this by prefixing memory operands (not the opcodes themselves) with `byte ptr', `word ptr', and `dword ptr'.

Thus, Intel `mov al, byte ptr foo' is `movb foo, %al' in AT&T syntax.
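
As a rough illustration of points 1, 2, 4 and 5 together, the C sketch below embeds the AT&T form of Intel's `add eax, 4' in GCC inline assembly. The function name add_four is made up for this example, the %0 placeholder and the constraint strings belong to GCC's extended asm (described in the supplement further down), and the plain C branch is only a fallback for non-x86 targets.

```c
/* Hypothetical helper: adds 4 to x using AT&T-syntax inline assembly. */
static int add_four(int x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T: addl $4, %reg      (Intel: add reg, 4)
     *   l  suffix -> 32-bit operand size
     *   $4        -> immediate operand
     *   %0        -> replaced by a register GCC picks, e.g. %eax
     * The source comes first and the destination last. */
    __asm__ ("addl $4, %0" : "+r" (x) : : "cc");
#else
    x += 4;   /* portable fallback for non-x86 targets */
#endif
    return x;
}
```

Under these assumptions, add_four(38) returns 42.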


6> Long jumps/calls and long returns

Immediate form long jumps and calls are `lcall/ljmp $section, $offset' in AT&T syntax; the Intel syntax is `call/jmp far section:offset'.

Also, the far return instruction is `lret $stack-adjust' in AT&T syntax; Intel syntax is `ret far stack-adjust'.


7> Other: multiple sections

The AT&T assembler does not provide support for multiple section programs.

Unix style systems expect all programs to be single sections.


8> References

 <1> Sun's x86 assembly manual: http://oldlinux.org/download/805-4693.pdf


Supplement:

Brennan's Guide to Inline Assembly

by Brennan "Bas" Underwood

Document version 1.1.2.2

Ok. This is meant to be an introduction to inline assembly under DJGPP. DJGPP is based on GCC, so it uses the AT&T/UNIX syntax and has a somewhat unique method of inline assembly. I spent many hours figuring some of this stuff out and told Info that I hate it, many times.

Hopefully if you already know Intel syntax, the examples will be helpful to you. I've put variable names, register names and other literals in bold type.

The Syntax

So, DJGPP uses the AT&T assembly syntax. What does that mean to you?
  • Register naming:
    Register names are prefixed with "%". To reference eax:
    AT&T:  %eax
    Intel: eax
    
  • Source/Destination Ordering:
    In AT&T syntax (which is the UNIX standard, BTW) the source is always on the left, and the destination is always on the right.
    So let's load ebx with the value in eax:
    AT&T:  movl %eax, %ebx
    Intel: mov ebx, eax
    
  • Constant value/immediate value format:
    You must prefix all constant/immediate values with "$".
    Let's load eax with the address of the "C" variable booga, which is static.
    AT&T:  movl $_booga, %eax
    Intel: mov eax, _booga
    
    Now let's load ebx with 0xd00d:
    AT&T:  movl $0xd00d, %ebx
    Intel: mov ebx, d00dh
    
  • Operator size specification:
    You must suffix the instruction with one of b, w, or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (GNU assembler) will attempt to guess. You don't want GAS to guess, and guess wrong! Don't forget it.
    AT&T:  movw %ax, %bx
    Intel: mov bx, ax
    
    The equivalent forms for Intel are byte ptr, word ptr, and dword ptr, but that is for when you are...
  • Referencing memory:
    DJGPP uses 386-protected mode, so you can forget all that real-mode addressing junk, including the restrictions on which register has what default segment, which registers can be base or index pointers. Now, we just get 6 general purpose registers. (7 if you use ebp, but be sure to restore it yourself or compile with -fomit-frame-pointer.)
    Here is the canonical format for 32-bit addressing:
    AT&T:  immed32(basepointer,indexpointer,indexscale)
    Intel: [basepointer + indexpointer*indexscale + immed32]
    
    You could think of the formula to calculate the address as:
      immed32 + basepointer + indexpointer * indexscale
    
    You don't have to use all those fields, but you do have to have at least 1 of immed32, basepointer and you MUST add the size suffix to the operator!
    Let's see some simple forms of memory addressing:

    • Addressing a particular C variable:
      AT&T:  _booga
      Intel: [_booga]
      
      Note: the underscore ("_") is how you get at static (global) C variables from assembler. This only works with global variables. Otherwise, you can use extended asm to have variables preloaded into registers for you. I address that farther down.

    • Addressing what a register points to:
      AT&T:  (%eax)
      Intel: [eax]
      

    • Addressing a variable offset by a value in a register:
      AT&T: _variable(%eax)
      Intel: [eax + _variable]
      

    • Addressing a value in an array of integers (scaling up by 4):
      AT&T:  _array(,%eax,4)
      Intel: [eax*4 + _array]
      

    • You can also do offsets with the immediate value:
      C code: *(p+1) where p is a char *
      AT&T:  1(%eax) where eax has the value of p
      Intel: [eax + 1]
      

    • You can do some simple math on the immediate value:
      AT&T: _struct_pointer+8
      
      I assume you can do that with Intel format as well.

    • Addressing a particular char in an array of 8-character records:
      eax holds the number of the record desired. ebx has the wanted char's offset within the record.
      AT&T:  _array(%ebx,%eax,8)
      Intel: [ebx + eax*8 + _array]
      
    Whew. Hopefully that covers all the addressing you'll need to do. As a note, you can put esp into the address, but only as the base register.
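
To make the address formula above concrete, here is a small C model of it. The function name effective_address is invented for this sketch; the parameter names mirror the fields in the canonical format, and unsigned 32-bit arithmetic stands in for the wraparound of 32-bit addressing.

```c
#include <stdint.h>

/* Models immed32 + basepointer + indexpointer * indexscale.
 * On real hardware indexscale is limited to 1, 2, 4 or 8; this
 * model does not enforce that. Unsigned arithmetic wraps at 2^32,
 * like a 32-bit address computation. */
static uint32_t effective_address(uint32_t immed32, uint32_t basepointer,
                                  uint32_t indexpointer, uint32_t indexscale)
{
    return immed32 + basepointer + indexpointer * indexscale;
}
```

For the last example in the list, _array(%ebx,%eax,8): assuming _array sits at 0x1000, ebx is 3 and eax is 2, effective_address(0x1000, 3, 2, 8) gives 0x1013, that is, character 3 of record 2.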

Basic inline assembly

The format for basic inline assembly is very simple, and much like Borland's method.
asm ("statements");
Pretty simple, no? So
asm ("nop");
will do nothing of course, and
asm ("cli");
will stop interrupts, with
asm ("sti");
of course enabling them. You can use  __asm__  instead of  asm  if the keyword  asm  conflicts with something in your program.

When it comes to simple stuff like this, basic inline assembly is fine. You can even push your registers onto the stack, use them, and put them back.

asm ("pushl %eax\n\t"
     "movl $0, %eax\n\t"
     "popl %eax");
(The \n's and \t's are there so the  .s  file that GCC generates and hands to GAS comes out right when you've got multiple statements per  asm .)
It's really meant for issuing instructions for which there is no equivalent in C and don't touch the registers.

But if you do touch the registers, and don't fix things at the end of your asm statement, like so:

asm ("movl %eax, %ebx");
asm ("xorl %ebx, %edx");
asm ("movl $0, _booga");
then your program will probably blow things to hell. This is because GCC hasn't been told that your  asm  statement clobbered  ebx  and  edx  and  booga , which it might have been keeping in a register, and might plan on using later. For that, you need:

Extended inline assembly

The basic format of the inline assembly stays much the same, but now gets Watcom-like extensions to allow input arguments and output arguments.

Here is the basic format:

asm ( "statements" : output_registers : input_registers : clobbered_registers);
Let's just jump straight to a nifty example, which I'll then explain:
asm ("cld\n\t"
     "rep\n\t"
     "stosl"
     : /* no output registers */
     : "c" (count), "a" (fill_value), "D" (dest)
     : "%ecx", "%edi" );
The above stores the value in fill_value, count times, to the pointer dest.

Let's look at this bit by bit.

asm ("cld\n\t"
We are clearing the direction bit of the  flags  register. You never know what this is going to be left at, and it costs you all of 1 or 2 cycles.
     "rep\n\t"
     "stosl"
Notice that GAS requires the rep prefix to occupy a line of its own. Notice also that stos has the l suffix to make it move longwords.
     : /* no output registers */
Well, there aren't any in this function.
     : "c" (count), "a" (fill_value), "D" (dest)
Here we load ecx with count, eax with fill_value, and edi with dest. Why make GCC do it instead of doing it ourselves? Because GCC, in its register allocating, might be able to arrange for, say, fill_value to already be in eax. If this is in a loop, it might be able to preserve eax thru the loop, and save a movl once per loop.
     : "%ecx", "%edi" );
And here's where we specify to GCC, "you can no longer count on the values you loaded into  ecx  or  edi  to be valid." This doesn't mean they will be reloaded for certain. This is the clobberlist.

Seem funky? Well, it really helps when optimizing, when GCC can know exactly what you're doing with the registers before and after. It folds your assembly code into the code it generates (whose rules for generation look remarkably like the above) and then optimizes. It's even smart enough to know that if you tell it to put (x+1) in a register, then if you don't clobber it, and later C code refers to (x+1), and it was able to keep that register free, it will reuse the computation. Whew.
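
Compilers newer than the one this guide was written for generally do not accept a clobber list that names a register also used as an input operand. A minimal sketch of the same fill for a recent GCC or Clang on x86 therefore marks the registers that rep stosl modifies as read-write operands ("+D", "+c") and adds "memory" because the statement writes through dest. The function name fill_longs is hypothetical, the rep prefix is written on the same line as stosl (which current assemblers accept), and the loop in the #else branch is only a fallback for other targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store fill_value into dest[0] .. dest[count-1]. */
static void fill_longs(uint32_t *dest, uint32_t fill_value, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__ (
        "cld\n\t"
        "rep stosl"
        : "+D" (dest), "+c" (count)  /* edi and ecx are read and modified */
        : "a" (fill_value)           /* eax holds the value to store */
        : "memory", "cc");           /* writes memory; "cc" is conservative */
#else
    while (count--)                  /* portable fallback */
        *dest++ = fill_value;
#endif
}
```

Because dest and count are passed by value, the caller's variables are unaffected even though the registers change inside the statement.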

Here's the list of register loading codes that you'll be likely to use:

a        eax
b        ebx
c        ecx
d        edx
S        esi
D        edi
I        constant value (0 to 31)
q,r      dynamically allocated register (see below)
g        eax, ebx, ecx, edx or variable in memory
A        eax and edx combined into a 64-bit integer (use long longs)
Note that you can't directly refer to the byte registers (ah, al, etc.) or the word registers (ax, bx, etc.) when you're loading this way. Once you've got it in there, though, you can specify ax or whatever all you like.

The codes have to be in quotes, and the expressions to load in have to be in parentheses.

When you do the clobber list, you specify the registers as above with the %. If you write to a variable, you must include "memory" as one of The Clobbered. This is in case you wrote to a variable that GCC thought it had in a register. This is the same as clobbering all registers. While I've never run into a problem with it, you might also want to add "cc" as a clobber if you change the condition codes (the bits in the flags register the jnz, je, etc. operators look at.)
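
As a small sketch of the "memory" clobber, the function below writes to a variable through a pointer held in a register, so GCC cannot see the write from the operand list alone. The name store_int is made up for this example, an x86 target with GCC or Clang is assumed, and the #else branch is a plain C fallback.

```c
/* Hypothetical helper: performs *p = value with one movl. */
static void store_int(int *p, int value)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__ (
        "movl %1, (%0)"
        : /* no output operands: the write goes through the pointer */
        : "r" (p), "r" (value)
        : "memory");  /* memory GCC may have cached in registers has changed */
#else
    *p = value;       /* portable fallback */
#endif
}
```

Without "memory" in the clobber list, the compiler would be free to keep using a stale copy of *p that it had already loaded into a register.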

Now, that's all fine and good for loading specific registers. But what if you specify, say, ebx, and ecx, and GCC can't arrange for the values to be in those registers without having to stash the previous values. It's possible to let GCC pick the register(s). You do this:

asm ("leal (%1,%1,4), %0"
     : "=r" (x)
     : "0" (x) );
The above example multiplies x by 5 really quickly (1 cycle on the Pentium). Now, we could have specified, say eax. But unless we really need a specific register (like when using rep movsl or rep stosl, which are hardcoded to use ecx, edi, and esi), why not let GCC pick an available one? So when GCC generates the output code for GAS, %0 will be replaced by the register it picked.

And where did "q" and "r" come from? Well, "q" causes GCC to allocate from eax, ebx, ecx, and edx. "r" lets GCC also consider esi and edi. So make sure, if you use "r", that it would be possible to use esi or edi in that instruction. If not, use "q".

Now, you might wonder, how to determine how the %n tokens get allocated to the arguments. It's a straightforward first-come-first-served, left-to-right thing, mapping to the "q"'s and "r"'s. But if you want to reuse a register allocated with a "q" or "r", you use "0", "1", "2"... etc.

You don't need to put a GCC-allocated register on the clobberlist as GCC knows that you're messing with it.

Now for output registers.

asm ("leal (%1,%1,4), %0"
     : "=r" (x_times_5)
     : "r" (x) );
Note the use of  =  to specify an output register. You just have to do it that way. If you want 1 variable to stay in 1 register for both in and out, you have to respecify the register allocated to it on the way in with the  "0"  type codes as mentioned above.
asm ("leal (%0,%0,4), %0"
     : "=r" (x)
     : "0" (x) );
This also works, by the way:
asm ("leal (%%ebx,%%ebx,4), %%ebx"
     : "=b" (x)
     : "b" (x) );
2 things here:
  • Note that we don't have to put ebx on the clobberlist, GCC knows it goes into x. Therefore, since it can know the value of ebx, it isn't considered clobbered.
  • Notice that in extended asm, you must prefix registers with %% instead of just %. Why, you ask? Because as GCC parses along for %0's and %1's and so on, it would interpret %edx as a %e parameter, see that that's non-existent, and ignore it. Then it would bitch about finding a symbol named dx, which isn't valid because it's not prefixed with % and it's not the one you meant anyway.
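
The output-register form shown earlier, with separate "=r" and "r" operands, can be wrapped in an ordinary C function. This is only a sketch: the name mul5 is hypothetical, an x86 target with GCC or Clang is assumed, and the #else branch falls back to plain C elsewhere.

```c
/* Hypothetical helper: returns x * 5 using a single lea. */
static int mul5(int x)
{
    int result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* %0 is the output register and %1 the input register; GCC picks both.
     * lea computes %1 + %1*4 without touching the flags. */
    __asm__ ("leal (%1,%1,4), %0" : "=r" (result) : "r" (x));
#else
    result = x * 5;   /* portable fallback */
#endif
    return result;
}
```

With these assumptions mul5(7) evaluates to 35. No volatile is used, since the statement has no effect other than computing its output.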
Important note:  If your assembly statement  must  execute where you put it, (i.e. must not be moved out of a loop as an optimization), put the keyword  volatile  after  asm  and before the ()'s. To be ultra-careful, use
__asm__ __volatile__ (...whatever...);
However, I would like to point out that if your assembly's only purpose is to calculate the output registers, with no other side effects, you should leave off the  volatile keyword so your statement will be processed into GCC's common subexpression elimination optimization.

Some useful examples

#define disable() __asm__ __volatile__ ("cli");

#define enable() __asm__ __volatile__ ("sti");
Of course,  libc  has these defined too.
#define times3(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,2),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times5(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,4),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times9(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,8),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );
These multiply arg1 by 3, 5, or 9 and put them in arg2. You should be ok to do:
times5(x,x);
as well.
#define rep_movsl(src, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "movsl" \
  : : "S" (src), "D" (dest), "c" (numwords) \
  : "%ecx", "%esi", "%edi" )
Helpful Hint: If you say  memcpy()  with a constant length parameter, GCC will inline it to a  rep movsl  like above. But if you need a variable length version that inlines and you're always moving dwords, there ya go.
#define rep_stosl(value, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "stosl" \
  : : "a" (value), "D" (dest), "c" (numwords) \
  : "%ecx", "%edi" )
Same as above but for  memset() , which doesn't get inlined no matter what (for now.)

#define RDTSC(llptr) ({ \
__asm__ __volatile__ ( \
        ".byte 0x0f; .byte 0x31" \
        : "=A" (llptr) \
        : : "eax", "edx"); })
Reads the TimeStampCounter on the Pentium and puts the 64 bit result into llptr.
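
On current compilers the macro above may need adjusting: a clobber list is not supposed to name registers that are also outputs, and on x86-64 the "A" constraint does not describe the edx:eax pair the way it does on 32-bit x86. A sketch that avoids both issues reads the two halves into separate "=a" and "=d" outputs and combines them in C. The names combine_halves and read_tsc are invented for this example, the rdtsc mnemonic is assumed to be known to the assembler, and non-x86 targets simply get 0.

```c
#include <stdint.h>

/* Combine the edx (high) and eax (low) halves into one 64-bit value. */
static uint64_t combine_halves(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: read the time stamp counter (0 on non-x86 targets). */
static uint64_t read_tsc(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t low, high;
    /* rdtsc leaves the low 32 bits in eax and the high 32 bits in edx. */
    __asm__ __volatile__ ("rdtsc" : "=a" (low), "=d" (high));
    return combine_halves(high, low);
#else
    return 0;
#endif
}
```

The volatile keyword matters here: two reads of the counter must not be merged into one by common subexpression elimination.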

The End

"The End"?! Yah, I guess so.

If you're wondering, I personally am a big fan of AT&T/UNIX syntax now. (It might have helped that I cut my teeth on SPARC assembly. Of course, that machine actually had a decent number of general registers.) It might seem weird to you at first, but it's really more logical than Intel format, and has no ambiguities.

If I still haven't answered a question of yours, look in the Info pages for more information, particularly on the input/output registers. You can do some funky stuff like use "A" to allocate two registers at once for 64-bit math or "m" for static memory locations, and a bunch more that aren't really used as much as "q" and "r".

Alternately, mail me, and I'll see what I can do. (If you find any errors in the above, please, e-mail me and tell me about it! It's frustrating enough to learn without buggy docs!) Or heck, mail me to say "boogabooga."

It's the least you can do.



Thanks to Eric J. Korpela <korpela@ssl.Berkeley.EDU> for some corrections.
Have you seen the DJGPP2+Games Page? Probably.
Page written and provided by Brennan Underwood.
Copyright © 1996 Brennan Underwood. Share and enjoy!
Page created with  vi , God's own editor. 


Original link for the text above: http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html

