Tilera Cache Control

Support for moving blocks of memory in and out of a core's cache.

The Tile Processor supports both coherent and incoherent memory models. Coherent shared memory provides the model familiar to most programmers working in pthreads environments: loads and stores behave as if all cores are accessing one global memory scratchpad. The incoherent memory model allows each core to keep its own copy of a memory location, so that writes to that address might never be visible to other cores.

Working with Coherent Memory

Most parallel algorithms are written to work with coherent shared memory. When writing such algorithms, remember that the Tile Processor implements a relaxed memory model. To guarantee that a store to a coherent memory address is visible to other cores, the core that issued the store instruction must perform a "memory fence". The coherent memory fence operation, provided by tmc_mem_fence(), blocks the processor from issuing any other instructions until all previous stores are visible to all other cores.

The memory fence operation is particularly important when implementing shared memory synchronization algorithms. Suppose core A wants to write a data structure to coherent shared memory and then set a flag telling core B that the data is ready to be consumed. A memory fence is required between the data structure store instructions and the flag store instruction; otherwise the relaxed memory model might allow core B to see the "data is ready" flag while stores to the data structure are still in flight.
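
The data-then-flag handoff described above can be sketched in portable C11 as follows. This is only an illustration under stated assumptions: the mailbox structure and both function names are hypothetical, and C11 atomic_thread_fence() stands in for tmc_mem_fence() so that the sketch compiles off-target. On Tile hardware the producer-side fence is where tmc_mem_fence() would be called.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailbox shared by a producer (core A) and a consumer (core B). */
struct mailbox {
    int payload[2];     /* the "data structure" */
    atomic_int ready;   /* the "data is ready" flag */
};

/* Core A: write the data, fence, then raise the flag. */
void mailbox_publish(struct mailbox *m, int a, int b)
{
    assert(m != NULL);
    m->payload[0] = a;
    m->payload[1] = b;
    /* Stand-in for tmc_mem_fence(): the payload stores above must not be
     * reordered after the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Core B: read the payload only after observing the flag.
 * Returns false, leaving out[] untouched, if the data is not ready yet. */
bool mailbox_try_read(struct mailbox *m, int out[2])
{
    assert(m != NULL);
    if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
        return false;
    /* Pairs with the producer's fence: the payload loads below must not be
     * reordered before the flag load above. */
    atomic_thread_fence(memory_order_acquire);
    out[0] = m->payload[0];
    out[1] = m->payload[1];
    return true;
}
```

Without the producer-side fence, the flag store could become visible before the payload stores, which is exactly the failure described above.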

In general, we recommend that application developers avoid this kind of low-level shared memory algorithm development. The MDE provides the standard pthreads synchronization mechanisms as well as some TMC extensions. These provided primitives should be adequate for many applications.
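
As a rough illustration of that recommendation, the same handoff can be written with a standard pthreads mutex, which leaves the ordering details to the library. The structure and function names are hypothetical, and the sketch uses only standard POSIX calls, not any TMC extension.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailbox whose fields are only touched while holding the lock. */
struct locked_mailbox {
    pthread_mutex_t lock;
    bool ready;
    int payload[2];
};

/* Returns 0 on success, or the error code from pthread_mutex_init(). */
int locked_mailbox_init(struct locked_mailbox *m)
{
    assert(m != NULL);
    m->ready = false;
    m->payload[0] = 0;
    m->payload[1] = 0;
    return pthread_mutex_init(&m->lock, NULL);
}

/* Producer: the data and the flag are updated under the same lock,
 * so no explicit fence appears in application code. */
void locked_mailbox_publish(struct locked_mailbox *m, int a, int b)
{
    pthread_mutex_lock(&m->lock);
    m->payload[0] = a;
    m->payload[1] = b;
    m->ready = true;
    pthread_mutex_unlock(&m->lock);
}

/* Consumer: returns false, leaving out[] untouched, if nothing is published. */
bool locked_mailbox_try_read(struct locked_mailbox *m, int out[2])
{
    bool ok;
    pthread_mutex_lock(&m->lock);
    ok = m->ready;
    if (ok) {
        out[0] = m->payload[0];
        out[1] = m->payload[1];
    }
    pthread_mutex_unlock(&m->lock);
    return ok;
}
```

A consumer that should block rather than poll would typically pair the mutex with a pthread condition variable.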

Working with Incoherent Memory

The Tile Processor also allows applications to allocate incoherent memory. Incoherent memory allows each core to keep its own locally cached version of a memory address without automatically synchronizing that copy with any other core. Thus, a store by core A to an incoherent address is not guaranteed to be visible to core B unless core A flushes the new value out to DRAM and core B then reloads its copy from DRAM. Working with this memory model presents more of a challenge than using coherent memory.
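
The flush-then-reload requirement can be made concrete with a small toy model. This is a simulation only, written for illustration: it models a single word of DRAM with a private cached copy per core, and none of the names below are part of the TMC API. The real cache-control calls are not named in this text.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_CORES = 2 };

/* Toy model: one word of DRAM plus a private, write-back cached copy per core. */
struct toy_word {
    int dram;
    int cached[NUM_CORES];
    int valid[NUM_CORES];   /* 1 if the core currently holds a cached copy */
    int dirty[NUM_CORES];   /* 1 if that copy has not been written to DRAM */
};

void toy_init(struct toy_word *w, int value)
{
    assert(w != NULL);
    w->dram = value;
    for (int c = 0; c < NUM_CORES; c++) {
        w->cached[c] = 0;
        w->valid[c] = 0;
        w->dirty[c] = 0;
    }
}

/* A load hits the private copy if present, otherwise fills it from DRAM. */
int toy_load(struct toy_word *w, int core)
{
    if (!w->valid[core]) {
        w->cached[core] = w->dram;
        w->valid[core] = 1;
    }
    return w->cached[core];
}

/* A store updates only the storing core's private copy. */
void toy_store(struct toy_word *w, int core, int value)
{
    w->cached[core] = value;
    w->valid[core] = 1;
    w->dirty[core] = 1;
}

/* Flush: write the core's modified copy back to DRAM. */
void toy_flush(struct toy_word *w, int core)
{
    if (w->valid[core] && w->dirty[core]) {
        w->dram = w->cached[core];
        w->dirty[core] = 0;
    }
}

/* Invalidate: discard the core's copy so the next load refills from DRAM. */
void toy_invalidate(struct toy_word *w, int core)
{
    w->valid[core] = 0;
    w->dirty[core] = 0;
}
```

In this model, a store by core 0 stays invisible to core 1 until core 0 flushes and core 1 invalidates its stale copy; either step alone is not enough.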

Incoherent memory accesses are most frequently used when interacting with I/O devices. On TILE64, the I/O shims can only read memory values from DRAM, so applications must flush I/O data to memory before posting it to egress. Similarly, an application must invalidate any locally cached copies of a memory address before receiving an ingress packet. See the NetIO API Reference (UG212) for more information on working with I/O devices and incoherent memory. The TILEPro I/O shims support direct-to-cache memory accesses, so applications developed for TILEPro are not required to deal with incoherent memory at all.
