Virtual-Reality Maze
FMM 2010.6
N-body FMM FMM
#863 2009AA01220108dz501600/
#CT MRI SURF CPU SURF GPGPU SURF CUDA-SURF CPU GPU SURF
#N-body N-body N-body
1
#N-body
PP (Particle to Particle): O(N^2)
PM (Particle Mesh Method): O(N log N)
TM (Tree Method), BH (Barnes-Hut): O(N log N), 1%
FMM (Fast Multipole Method), BH: O(N)
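The O(N^2) cost of the PP method can be seen in a minimal sketch of direct summation. This is only an illustration under stated assumptions: an inverse-square force with G = 1, particles confined to a plane, and a hypothetical function name `direct_accelerations`; none of this is taken from the slides.

```python
def direct_accelerations(positions, masses, softening=0.0):
    """PP (particle-to-particle) summation: every particle visits every other one.

    Assumptions (not from the slides): inverse-square attraction, G = 1,
    positions given as (x, y) pairs.
    Returns the accelerations and the number of pair evaluations performed.
    """
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    pair_evaluations = 0
    for i in range(n):
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue  # a particle does not act on itself
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            r2 = dx * dx + dy * dy + softening * softening
            inv_r3 = r2 ** -1.5
            acc[i][0] += masses[j] * dx * inv_r3
            acc[i][1] += masses[j] * dy * inv_r3
            pair_evaluations += 1
    return acc, pair_evaluations
```

The inner loop runs N - 1 times for each of the N particles, so the pair count is N(N - 1). That quadratic growth is what PM, tree, and FMM methods are designed to avoid.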
#FMM FMM
#FMM FMM 1
Kfar
23
4
#FMM
Multipole Expansion (ME)
Local Expansion (LE)
4 ME LE ME LE O(N)
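To make the idea of a Multipole Expansion concrete, here is a minimal sketch for the classical 2D logarithmic potential, written with complex coordinates. The choice of kernel is an assumption made for illustration and is not necessarily the kernel used in the slides; the function names `p2m`, `eval_multipole`, and `direct_potential` are hypothetical.

```python
import cmath


def p2m(charges, points, center, p):
    """Build a p-term multipole expansion (ME) about `center`.

    Assumed kernel: phi(z) = Re sum_i q_i * log(z - z_i).
    Returns [Q, a_1, ..., a_p] with a_k = -sum_i q_i (z_i - center)^k / k.
    """
    coeffs = [complex(sum(charges))]
    for k in range(1, p + 1):
        coeffs.append(-sum(q * (z - center) ** k for q, z in zip(charges, points)) / k)
    return coeffs


def eval_multipole(coeffs, center, z):
    """Evaluate the ME at a far point z; only the real part is the potential."""
    d = z - center
    phi = coeffs[0] * cmath.log(d)
    for k in range(1, len(coeffs)):
        phi += coeffs[k] / d ** k
    return phi.real


def direct_potential(charges, points, z):
    """Reference value by direct summation over all sources."""
    return sum(q * cmath.log(z - zi) for q, zi in zip(charges, points)).real
```

For sources clustered near the center and an evaluation point well outside the cluster, the truncated series converges geometrically, so a modest p already matches the direct sum to many digits. Evaluating one expansion instead of summing over every source is what lets far-field work be shared across a whole box.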
#l d 2^(ld) ME LE
231
#a b b b b b
i b i -1 i -1 I b
#11 Kernel
P2M (Particle to Multipole Expansion)
M2M (Multipole Expansion to Multipole Expansion)
M2L (Multipole Expansion to Local Expansion)
L2L (Local Expansion to Local Expansion)
L2P (Local Expansion to Particle)
Multipole Expansion (ME), Local Expansion (LE): P2M, M2M, M2L, L2L, L2P
# FMM PetFMM
PetFMM
#PetFMM Vortex Particle Method [11] N-body PetFMM [12] C++ PETSc Library CPU
#PetFMM
p l N NP NIL NLL NB

        P2M      M2M           M2L           L2L           L2P      P2P
*       8p-8     4p^2+8p-8     10p^2+8p-4    4p^2+8p-8     8p-8     9
+       6p-4     3p^2+5p-4     7p^2+4p-2     3p^2+5p-4     6p-4     5
/       -        -             2p^2          -             -        2
exp     -        -             -             -             -        1
Total   14p-12   7p^2+13p-12   19p^2+12p-6   7p^2+13p-12   14p-12   17
Count   N        NB            NIL NB        NB            N        NLL NP N
#PetFMM Bytes
Bytes
p l N NP NIL NLL NB

ME: 16pNB   LE: 16pNB   16NB   27*4NB   9*4NB   28N

        P2M       M2M          M2L              L2L          L2P           P2P
Read    28N       (16p+16)NB   (16p+16)NB NIL   (16p+16)NB   4^l(16p+16)   8 NLL NP N
Write   16p 4^l   16pNB        16pNB (M2L and L2L)           16N (L2P and P2P)
#PetFMM N >= 10^10
p = 16, 18; l = 14, 16; N = 10^10, 10^11
L2P, P2P: 0.5
Bytes / Ops

N       l    p    P2M     M2M     M2L+L2L   L2P+P2P
10^10   14   16   0.164   0.262   0.057     0.458
10^10   14   18   0.149   0.235   0.050     0.456
10^10   16   16   0.651   0.262   0.057     0.529
10^10   16   18   0.632   0.235   0.050     0.527
10^11   14   16   0.135   0.262   0.057     0.469
10^11   14   18   0.120   0.235   0.050     0.469
10^11   16   16   0.184   0.262   0.057     0.451
10^11   16   18   0.168   0.235   0.050     0.449
# SIMD PE 45/65 nm 512 DDR2/DDR3
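The Bytes/Ops figures of the PetFMM analysis can be re-derived from the per-kernel operation counts and memory-traffic formulas. The sketch below rests on several assumptions that are inferred rather than stated in the slides: NIL = 27 and NLL = 9 (taken from the 27*4NB and 9*4NB storage terms), 4^l leaf boxes, NP = N / 4^l particles per leaf, and the grouping of M2L with L2L and of L2P with P2P so that each group writes its result once. The function names are hypothetical.

```python
NIL = 27  # assumed interaction-list size per box
NLL = 9   # assumed neighbour-list size per box (including the box itself)


def kernel_ops(p):
    """Total floating-point operations per kernel invocation for p terms."""
    return {
        "P2M": 14 * p - 12,
        "M2M": 7 * p * p + 13 * p - 12,
        "M2L": 19 * p * p + 12 * p - 6,
        "L2L": 7 * p * p + 13 * p - 12,
        "L2P": 14 * p - 12,
        "P2P": 17,
    }


def bytes_per_op(n, l, p):
    """Bytes moved per operation for each kernel group (assumptions above)."""
    ops = kernel_ops(p)
    leaves = 4 ** l                 # assumed number of leaf boxes
    per_leaf = n / leaves           # assumed NP, particles per leaf box
    box = 16 * p + 16               # bytes read per box expansion
    pairs = NLL * per_leaf          # P2P partners per particle
    return {
        # read 28N particle bytes, write 16p bytes per leaf box
        "P2M": (28 * n + 16 * p * leaves) / (ops["P2M"] * n),
        # per box: read one expansion, write one expansion
        "M2M": (box + 16 * p) / ops["M2M"],
        # per box: NIL reads for M2L, one read for L2L, one shared write
        "M2L+L2L": (NIL * box + box + 16 * p) / (NIL * ops["M2L"] + ops["L2L"]),
        # per particle: share of the leaf expansion, 8 bytes per pair, 16 written
        "L2P+P2P": (box / per_leaf + 8 * pairs + 16) / (ops["L2P"] + ops["P2P"] * pairs),
    }
```

Under these assumptions, `bytes_per_op(10**10, 14, 16)` gives roughly 0.164 for P2M, 0.057 for M2L+L2L and 0.458 for L2P+P2P, in line with the tabulated values. The M2M entry comes out near 0.266, slightly above the tabulated 0.262, so the slides presumably count M2M traffic a little differently from this sketch.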
# FMA Kernel SFU (Special Function Unit) 10.667 GB/s (DDR3-1333) 0.5 500 MHz 512 10 Kernel FMM
#N-body FMM O(N) FMM Kernel SIMD FMM Kernel
#FFT LINPACK
FMM
CPU GPGPU
#Any questions?
Thank you. The End