[Graph Applications]
- TDGraph: A Topology-Driven Accelerator for High-Performance Streaming Graph Processing
[Processing in Memory]
- To PIM or Not for Emerging General Purpose Processing in DDR Memory Systems
[Industry Session]
- Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
- Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training
[Persistent Memory]
- Sibyl: Adaptive and Extensible Data Placement in Hybrid Storage Systems using Online Reinforcement Learning
[Learning]
- Cascading Structured Pruning: Enabling High Data Reuse for Sparse DNN Accelerators
- Anticipating and Eliminating Redundant Computations in Accelerated Sparse Training
[Learning II and Security II]
- Themis: A Network Bandwidth-Aware Collective Scheduling Policy for Distributed Training of DL Models
[Novel Architectures]
- BioHD: An Efficient Genome Sequence Search Platform Using HyperDimensional Memorization
- EDAM: Edit Distance Tolerant Approximate Matching Content Addressable Memory
[Learning III]
- Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards
- Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
- Accelerating Attention through Gradient-Based Learned Runtime Pruning
[Graph Applications and Algorithms]
- Graphite: Optimizing Graph Neural Networks on CPUs Through Cooperative Software-Hardware Techniques
- SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures
- Hyperscale FPGA-As-A-Service Architecture for Large-Scale Distributed Graph Neural Network
- Crescent: Taming Memory Irregularities for Accelerating Deep Point Cloud Analytics