ARM Architecture Support in Windows Mobile

This topic covers support for ARM architectures in Windows Embedded CE 6.0.

On ARMv4 and ARMv5 processors, cache is organized as a virtual-indexed, virtual-tagged (VIVT) cache in which both the index and the tag are based on the virtual address. The main advantage of this method is that cache lookups are faster because the translation look-aside buffer (TLB) is not involved in matching cache lines for a virtual address. However, this caching method does require more frequent cache flushing because of cache aliasing, in which the same physical address can be mapped to multiple virtual addresses.

Note:
Throughout this topic, the term flush is used for writing back and invalidating cache lines.

On ARMv6 and ARMv7 processors, cache is organized as a virtual-indexed, physical-tagged (VIPT) cache. The cache line index is derived from the virtual address. However, the tag is specified by using the physical address. The main advantage is that cache aliasing is not an issue because every physical address has a unique tag in the cache. However, a cache entry cannot be determined to be valid until the TLB has translated the virtual address to a physical address that matches the tag. Generally, the TLB lookup cost offsets the performance gain achieved by avoiding cache aliasing.
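
The difference between the two schemes can be sketched as a small model of how a lookup key is formed. This is an illustration only: the cache geometry (32-byte lines, 128 sets), the 4 KB page size, and all type and function names are assumptions made for the example, not values taken from CE 6.0 or from any particular ARM core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry: 32-byte lines and 128 sets, so the index
   bits (5..11) lie entirely inside the 4 KB page offset. */
#define LINE_BITS 5u
#define SET_BITS  7u
#define PAGE_MASK 0xFFFu

typedef struct {
    uint32_t index;  /* set that is searched */
    uint32_t tag;    /* value compared with the stored tag */
} cache_key;

static uint32_t set_index(uint32_t va)
{
    return (va >> LINE_BITS) & ((1u << SET_BITS) - 1u);
}

/* VIVT: index and tag both come from the virtual address. */
static cache_key vivt_key(uint32_t va)
{
    cache_key k = { set_index(va), va >> (LINE_BITS + SET_BITS) };
    return k;
}

/* VIPT: index from the virtual address, tag from the physical
   address that the TLB returns for it. */
static cache_key vipt_key(uint32_t va, uint32_t pa)
{
    /* Translation never changes the offset within a page. */
    assert((va & PAGE_MASK) == (pa & PAGE_MASK));
    cache_key k = { set_index(va), pa >> (LINE_BITS + SET_BITS) };
    return k;
}

/* Two keys refer to the same cache line when index and tag match. */
static int same_line(cache_key a, cache_key b)
{
    return a.index == b.index && a.tag == b.tag;
}
```

With two illustrative virtual addresses, 0x00012340 and 0x40056340, both mapped to physical address 0x80001340, vivt_key produces different tags, so the same data could occupy two lines. vipt_key produces the same index and tag for both, so they resolve to one line.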

CE 6.0 supports both VIVT and VIPT caching by automatically detecting the architecture ID and using that information to control cache flushing.

Cache flushes are categorized as one of the following:

  • TLB flush
  • Instruction cache (I-cache) flush
  • Data cache (D-cache) flush

Cache flushing is generally done in the following ways:

  • User-initiated cache flush, by using the CacheRangeFlush function
  • Turning the device off and back on
  • Page acquisition or release, by using internal OEM functions to get a page and then free it
  • Process deletion
  • Uncaching a page
  • API call return to a server other than the current server
  • Thread switching to a process other than the current active process

For ARMv6 and ARMv7 processors, cache flushing in thread switching to a process other than the current active process is limited to the following instances:

  • The hardware does not support VIPT I-cache: In ARMv6 and ARMv7, it is optional for I-cache to be VIPT. Data cache is VIPT or physically-indexed and physically-tagged (PIPT) in MPCore systems. If the hardware does not support VIPT I-cache, the OS flushes the I-cache.
  • The system has run out of address-space identifiers (ASIDs) to assign: In this case, the OS flushes the whole TLB.

This means that, whereas on ARMv4 and ARMv5 processors everything (I-cache, D-cache, and TLB) is flushed on every thread switch to a different process, on ARMv6 and ARMv7 processors the D-cache is never flushed on thread switch. The I-cache is flushed only if the processor does not support VIPT I-cache. The TLB is flushed only if all 255 supported ASIDs have been used. This reduction of cache flushes should improve overall system performance.
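
The rules above can be written down as a small decision function. This is a sketch of the policy as this topic describes it, not kernel code: the flag names and the function flushes_on_process_switch are hypothetical and exist only for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags naming the three kinds of flush. */
enum {
    FLUSH_NONE   = 0,
    FLUSH_ICACHE = 1 << 0,
    FLUSH_DCACHE = 1 << 1,
    FLUSH_TLB    = 1 << 2
};

/* Returns which flushes a thread switch to a different process
   needs, following the rules stated in the text. */
static unsigned flushes_on_process_switch(int arm_version,
                                          bool icache_is_vipt,
                                          bool asids_exhausted)
{
    unsigned flushes = FLUSH_NONE;

    assert(arm_version >= 4 && arm_version <= 7);

    /* ARMv4/ARMv5 (VIVT): everything is flushed. */
    if (arm_version <= 5) {
        return FLUSH_ICACHE | FLUSH_DCACHE | FLUSH_TLB;
    }

    /* ARMv6/ARMv7: the D-cache is never flushed here. */
    if (!icache_is_vipt) {
        flushes |= FLUSH_ICACHE;   /* I-cache is not VIPT */
    }
    if (asids_exhausted) {
        flushes |= FLUSH_TLB;      /* all 255 ASIDs are in use */
    }
    return flushes;
}
```

For example, an ARMv7 part with a VIPT I-cache and free ASIDs yields FLUSH_NONE, while any ARMv5 part yields all three flags.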

In addition, moving to VIPT has performance advantages for the following OS features in CE 6.0:

  • Memory-mapped files: On an ARMv4 or ARMv5 system, all read/write views are marked as uncached to prevent aliasing. Marking the views as uncached affects overall system performance. However, in VIVT, you must prevent aliasing. On ARMv6 and ARMv7 systems, views are marked as cached.
  • VirtualAllocCopyEx: In CE 6.0, if a kernel mode driver creates an explicit alias in which two virtual addresses map to the same physical address by using VirtualAllocCopyEx, the OS marks both the source and destination addresses as uncached to avoid cache aliasing on ARMv4 and ARMv5 systems. On ARMv6 and ARMv7 systems, source and destination addresses are marked as cached. Even though this function can be called only from kernel mode, this affects both kernel-mode and user-mode drivers. Device Manager copies the data only for user-mode drivers.

In ARMv7 processors, L2 cache is considered part of the architecture. For earlier versions, it was optional. In Windows Embedded CE, support for L2 cache is limited to enabling L2 cache write-back and discard on L1 cache write-back and discard. Hardware support for L2 cache is dictated by the OEM-specific code. Enabling L2 cache support in the OEM adaptation layer (OAL) resembles other cache operations in the OEMCacheRangeFlush function implementation. Typically, if a device supports L2 cache, OAL code is updated to honor L2 cache flush flags passed to OEMCacheRangeFlush and discard or write back L2 cache accordingly.
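
As a rough illustration of what honoring the L2 flags means, the following sketch models a flush routine that performs L2 operations only when the platform has an L2 cache and the caller asked for them. The SK_ flag names and values are placeholders invented for this example and are not the constants from the CE headers. sketch_cache_range_flush is likewise hypothetical and only reports which operations it would perform. A real OEMCacheRangeFlush implementation would call platform-specific cache maintenance code instead.

```c
#include <assert.h>

/* Placeholder flags for this sketch only (NOT the CE header values). */
enum {
    SK_L1_WRITEBACK = 1u << 0,
    SK_L1_DISCARD   = 1u << 1,
    SK_L2_WRITEBACK = 1u << 2,
    SK_L2_DISCARD   = 1u << 3,
    SK_ALL          = 0xFu
};

/* Returns the set of operations that would be carried out.
   has_l2 models whether the OEM hardware provides an L2 cache. */
static unsigned sketch_cache_range_flush(unsigned flags, int has_l2)
{
    unsigned done = 0;

    assert((flags & ~(unsigned)SK_ALL) == 0);

    /* L1 requests are always honored. */
    if (flags & SK_L1_WRITEBACK) done |= SK_L1_WRITEBACK;
    if (flags & SK_L1_DISCARD)   done |= SK_L1_DISCARD;

    /* L2 requests are honored only when an L2 cache exists. */
    if (has_l2) {
        if (flags & SK_L2_WRITEBACK) done |= SK_L2_WRITEBACK;
        if (flags & SK_L2_DISCARD)   done |= SK_L2_DISCARD;
    }
    return done;
}
```

The sketch deliberately says nothing about the order in which L1 and L2 operations are issued, because that depends on the cache controller.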

All virtual addresses are translated to physical addresses using page table entries. CE 6.0 introduced the following changes for ARMv6 and ARMv7 page table entries:

  • Non-global (NG) bit: This is marked on the page table entry for a non-kernel mode address. In Windows Embedded CE, this is for any address that has a value less than the shared heap address (0x70000000).
  • Separate page directories for global and non-global addresses: ARMv6 and ARMv7 have different page directory entries for global and non-global addresses. On ARMv6 and ARMv7 processors, this reduces the number of predefined pages needed to store non-global mappings to two pages, instead of four.
  • Execute-never (XN) bit: This is set for page table entries referring to non-execute pages. In CE 6.0, this is set only for the function trap area, the range starting with 0xf0000000. Any attempt to execute code from these pages causes an error that the OS uses to trap and route the function call appropriately. Windows Embedded CE only uses this bit to trap addresses. Potentially, in future releases of Windows Embedded CE, this bit could be used to prevent code execution from stack, heap, and any other non-code pages.
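
The NG and XN rules above can be sketched as a function that picks the attribute bits for a page from its virtual address. The two address constants come from this topic. The bit positions are those of a second-level small-page descriptor in the ARMv6/ARMv7 short-descriptor format. The function name is hypothetical, and treating every address from 0xf0000000 upward as trap area is a simplification, because the text gives only the start of that range.

```c
#include <assert.h>
#include <stdint.h>

#define SHARED_HEAP_BASE 0x70000000u  /* below this: non-global      */
#define TRAP_AREA_BASE   0xF0000000u  /* start of function trap area */

/* Small-page descriptor bits (short-descriptor format). */
#define PTE_XN         (1u << 0)   /* execute-never        */
#define PTE_SMALL_PAGE (1u << 1)   /* descriptor type bit  */
#define PTE_NG         (1u << 11)  /* non-global           */

/* Returns only the type, XN and NG bits for the page at va.
   Access permissions, memory type and the physical base address
   are left out of this sketch. */
static uint32_t sketch_pte_attr_bits(uint32_t va)
{
    uint32_t bits = PTE_SMALL_PAGE;

    assert((va & 0xFFFu) == 0u);   /* expects a page-aligned address */

    if (va < SHARED_HEAP_BASE) {
        bits |= PTE_NG;            /* non-kernel address, tagged by ASID */
    }
    if (va >= TRAP_AREA_BASE) {
        bits |= PTE_XN;            /* executing here must fault */
    }
    return bits;
}
```

Under these assumptions an address such as 0x00010000 gets NG but not XN, 0x70000000 gets neither, and 0xF0000000 gets XN only.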

In ARMv6 and ARMv7, a unique ASID is used to tag TLB. CE 6.0 supports a maximum of 255 ASIDs. Therefore, it does not have to flush the TLB on thread context switching unless all 255 ASIDs have been used.
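
A minimal model of this behavior is an allocator that hands out ASIDs until all 255 are used and only then records a whole-TLB flush. The type and function names are hypothetical, and numbering ASIDs from 1 to 255 and restarting at 1 after a flush is an assumption of the sketch. How the CE kernel actually assigns and recycles ASIDs is not described in this topic.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ASID 255u   /* limit stated in the text */

typedef struct {
    uint32_t next;         /* next ASID to hand out, 1..MAX_ASID   */
    uint32_t tlb_flushes;  /* whole-TLB flushes performed so far   */
} asid_pool;

static void asid_pool_init(asid_pool *p)
{
    p->next = 1u;
    p->tlb_flushes = 0u;
}

/* Hands out the next ASID. When the pool is exhausted, models a
   whole-TLB flush and starts numbering again from 1. */
static uint32_t asid_alloc(asid_pool *p)
{
    assert(p->next >= 1u && p->next <= MAX_ASID + 1u);

    if (p->next > MAX_ASID) {
        p->tlb_flushes++;
        p->next = 1u;
    }
    return p->next++;
}

/* Usage helper: how many whole-TLB flushes occur while assigning
   ASIDs to process_count address spaces in a row. */
static uint32_t flushes_after(uint32_t process_count)
{
    asid_pool p;
    uint32_t i;

    asid_pool_init(&p);
    for (i = 0; i < process_count; i++) {
        (void)asid_alloc(&p);
    }
    return p.tlb_flushes;
}
```

In this model the first 255 assignments need no TLB flush at all, and the first flush happens on the 256th assignment.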

The rules for invalidating TLB entries are affected in the following ways:

  • If the ASID of the current process does not match the ASID of the TLB entry to be invalidated, the OS invalidates the whole TLB. This is because the TLB is tagged with an ASID.
  • This condition applies only to user-mode addresses on which the NG bit is set, because ASID is used to tag only non-global TLB entries. For global addresses, invalidating the TLB invalidates only the specified address range.

In CE 6.0, access to unaligned data is currently supported only on ARMv6 and ARMv7 processors. Individual applications can enable or disable unaligned access. If the underlying design does not support unaligned access, an error is returned. The IOCTL_KLIB_UNALIGNENABLE I/O control enables and disables unaligned access.
