HiFi 3
- parallel: 2 24-bit/32-bit or 4 16-bit
- multiplication: 2 32x32-bit
- float: 2 IEEE-754 floating-point MACs per cycle
HiFi 4
- SIMD support
- 2 32-bit or 4 16-bit in parallel
- 4 32-bit multiplications
- 16 x 64-bit AE_DR registers
- 8 x 8-bit AE_EP registers
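To make the "2 32-bit or 4 16-bit" lane layout concrete, here is a minimal portable C sketch that models one 64-bit register as either four 16-bit lanes or two 32-bit lanes. The type and function names (`vec4x16`, `vec2x32`, `add4x16`, `add2x32`) are hypothetical and are not HiFi intrinsics, and the choice of saturating addition is an assumption for illustration only. On real hardware each of these loops would be a single SIMD operation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit register, viewed two ways. */
typedef struct { int16_t lane[4]; } vec4x16;  /* 4 x 16-bit lanes */
typedef struct { int32_t lane[2]; } vec2x32;  /* 2 x 32-bit lanes */

/* Both views occupy exactly 64 bits. */
static_assert(sizeof(vec4x16) == 8, "4 x 16-bit must fill 64 bits");
static_assert(sizeof(vec2x32) == 8, "2 x 32-bit must fill 64 bits");

/* Clamp a wider intermediate result to the 16-bit range. */
static int16_t sat16(int32_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Clamp a wider intermediate result to the 32-bit range. */
static int32_t sat32(int64_t x) {
    if (x > INT32_MAX) return INT32_MAX;
    if (x < INT32_MIN) return INT32_MIN;
    return (int32_t)x;
}

/* Lane-wise saturating add on four 16-bit values. */
static vec4x16 add4x16(vec4x16 a, vec4x16 b) {
    vec4x16 r;
    for (int i = 0; i < 4; i++) {
        r.lane[i] = sat16((int32_t)a.lane[i] + (int32_t)b.lane[i]);
    }
    return r;
}

/* Lane-wise saturating add on two 32-bit values. */
static vec2x32 add2x32(vec2x32 a, vec2x32 b) {
    vec2x32 r;
    for (int i = 0; i < 2; i++) {
        r.lane[i] = sat32((int64_t)a.lane[i] + (int64_t)b.lane[i]);
    }
    return r;
}
```

The point of the sketch is the trade-off: the same 64 bits carry twice as many samples at 16-bit precision as at 32-bit precision.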
VLIW:
HiFi 4 can issue up to four operations in a single 88-bit instruction bundle or two operations
in a 48-bit bundle using Xtensa LX FLIX (VLIW) technology.
Understanding the slotting is important when optimizing code for HiFi 4. Often a loop is limited by operations that can only go in one slot or another. For example, it is never possible to issue more than one (possibly SIMD) store per cycle. If a loop is limited by the operations in one slot, there is no point in trying to optimize the operations in another slot.
- the first slot supports loads and stores and core operations
- the second slot supports loads and core operations
- the last two slots support ALU and multiply operations
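The slot rules above imply a simple lower bound on cycles per loop iteration. The following is a rough sketch of that reasoning in C, not a model of the real scheduler: the struct `loop_ops` and the function `min_cycles` are hypothetical names, and the model assumes every operation takes one issue slot, ignores latencies and dependencies, and assumes the four-slot 88-bit bundle is always usable.

```c
#include <assert.h>

/* Hypothetical per-iteration operation counts for one loop body. */
typedef struct {
    int loads;    /* loads: first or second slot */
    int stores;   /* stores: first slot only */
    int core;     /* core operations: first or second slot */
    int alu_mul;  /* ALU / multiply operations: last two slots */
} loop_ops;

/* Integer ceiling division for non-negative a and positive b. */
static int ceil_div(int a, int b) {
    return (a + b - 1) / b;
}

/* Lower bound on cycles per iteration under the simplified slot model. */
static int min_cycles(loop_ops ops) {
    assert(ops.loads >= 0 && ops.stores >= 0);
    assert(ops.core >= 0 && ops.alu_mul >= 0);

    /* Only the first slot can store: at most one store per cycle. */
    int bound = ops.stores;

    /* Loads, stores and core operations share the first two slots. */
    int front = ceil_div(ops.loads + ops.stores + ops.core, 2);
    if (front > bound) bound = front;

    /* ALU and multiply operations share the last two slots. */
    int back = ceil_div(ops.alu_mul, 2);
    if (back > bound) bound = back;

    return bound;
}
```

For example, a loop body with 4 loads, 2 stores and 2 multiplies is bounded at 3 cycles by the first two slots. Removing a multiply would not help, which is the situation the paragraph above warns about.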
Profile
The first option is the Xtensa instruction set simulator (ISS), which can directly generate the profile data. This is the easiest, most accurate, and most flexible option.
The second option is to profile your program running on a hardware implementation of your system. Hardware profiling requires certain Xtensa processor features, and it uses statistical sampling, which makes the results less accurate. Other profiling tools may be available from third-party operating system vendors.
TIE (Tensilica Instruction Extension): a language resembling a subset of Verilog
Benefits of floating point
- speeds up time to market
- eliminates the scaling operations needed for fixed point:
scaling operations need more MIPS
- streamlines data flow:
homogeneous register set instead of accumulators, which require data movement
- higher quality of audio signal
But floating-point hardware has 10 times the area and power consumption of fixed-point hardware
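As an illustration of the scaling operations that fixed point needs and floating point avoids, here is a minimal portable C sketch comparing a Q15 multiply with a float multiply. The helper names (`q15_from_float`, `q15_mul`, `float_mul`) are hypothetical, and Q15 is only one assumed fixed-point format, not necessarily the one a given HiFi codec uses.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a float in [-1.0, 1.0) to Q15 (1 sign bit, 15 fraction bits). */
static int16_t q15_from_float(float x) {
    assert(x >= -1.0f && x < 1.0f);   /* Q15 cannot represent anything else */
    return (int16_t)(x * 32768.0f);
}

/* Fixed-point multiply: needs an explicit rescale and a saturation step. */
static int16_t q15_mul(int16_t a, int16_t b) {
    int32_t p = (int32_t)a * (int32_t)b;  /* Q15 * Q15 gives a Q30 product */
    /* Round and shift back to Q15. Right-shifting a negative value is
       implementation-defined in C; this assumes an arithmetic shift. */
    p = (p + (1 << 14)) >> 15;
    if (p > INT16_MAX) p = INT16_MAX;     /* only -1.0 * -1.0 overflows */
    return (int16_t)p;
}

/* Floating-point multiply: the hardware tracks the exponent, no rescale. */
static float float_mul(float a, float b) {
    return a * b;
}
```

In the fixed-point version, the rounding, shift and saturation are extra work on every multiply, which is where the additional MIPS go. The float version is a single operation, at the hardware cost noted above.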