Synergistic Processing Elements (SPE)
Each SPE is composed of a "Synergistic Processing Unit", SPU, and a "Memory Flow Controller", MFC (DMA, MMU, and bus interface).[31] An SPE is a RISC processor with 128-bit SIMD organization[27][32][33] for single and double precision instructions. With the current generation of the Cell, each SPE contains a 256 KiB embedded SRAM for instruction and data, called "Local Storage" (not to be mistaken for "Local Memory" in Sony's documents that refer to the VRAM) which is visible to the PPE and can be addressed directly by software. Each SPE can support up to 4 GiB of local store memory. The local store does not operate like a conventional CPU cache since it is neither transparent to software nor does it contain hardware structures that predict which data to load. The SPEs contain a 128-bit, 128 entry register file and measures 14.5 mm² on a 90 nm process. An SPE can operate on 16 8-bit integers, 8 16-bit integers, 4 32-bit integers or single precision floating-point numbers in a single clock cycle, as well as a memory operation. Note that the SPU cannot directly access system memory; the 64-bit virtual memory addresses formed by the SPU must be passed from the SPU to the SPE memory flow controller (MFC) to set up a DMA operation within the system address space.
ff大神说下spu调用虚拟内存是多少的神奇吧。