[Survey] Binary Neural Networks

Source: https://github.com/htqin/awesome-model-quantization

Survey_Papers

Survey_of_Binarization

Our survey paper Binary Neural Networks: A Survey (Pattern Recognition) is a comprehensive survey of recent progress in binary neural networks. For details, please refer to:

Binary Neural Networks: A Survey [Paper] [Blog]

Haotong Qin, Ruihao Gong, Xianglong Liu*, Xiao Bai, Jingkuan Song, and Nicu Sebe.

 

Survey_of_Quantization

The survey paper A Survey of Quantization Methods for Efficient Neural Network Inference (ArXiv) is a comprehensive survey of recent progress in quantization. For details, please refer to:

A Survey of Quantization Methods for Efficient Neural Network Inference [Paper]

Amir Gholami*, Sehoon Kim*, Zhen Dong*, Zhewei Yao*, Michael W. Mahoney, Kurt Keutzer. (* Equal contribution)

 

Benchmark

MQBench

The paper MQBench: Towards Reproducible and Deployable Model Quantization Benchmark (NeurIPS 2021) is a benchmark and framework for evaluating quantization algorithms under real-world hardware deployments. For details, please refer to:

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark [Paper] [Project]

Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Qi Zhang, Ruihao Gong, Fengwei Yu, Junjie Yan.

Papers

Keywords: qnn: quantized neural networks | bnn: binarized neural networks | hardware: hardware deployment | snn: spiking neural networks | other

Statistics: 🔥 highly cited | ⭐ code is available and has more than 50 stars

2022

  • [IJCV] Distribution-sensitive Information Retention for Accurate Binary Neural Network. [bnn]
  • [IJCAI] BiFSMN: Binary Neural Network for Keyword Spotting. [bnn] [code]
  • [ICLR] BiBERT: Accurate Fully Binarized BERT. [bnn][code]
  • [CVPR] Implicit Feature Decoupling with Depthwise Quantization. [qnn]
  • [CVPR] Learnable Lookup Table for Neural Network Quantization. [qnn]
  • [CVPR] Mr.BiQ: Post-Training Non-Uniform Quantization based on Minimizing the Reconstruction Error. [qnn]
  • [CVPR] Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. [qnn]
  • [CVPR] Data-Free Network Compression via Parametric Non-uniform Mixed Precision Quantization. [qnn]
  • [CVPR] Instance-Aware Dynamic Neural Network Quantization. [qnn]
  • [IJCAI] RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization. [qnn]
  • [IJCAI] MultiQuant: Training Once for Multi-bit Quantization of Neural Networks. [qnn]
  • [NeurIPS] Leveraging Inter-Layer Dependency for Post-Training Quantization. [qnn]
  • [NeurIPS] Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques. [qnn]
  • [NeurIPS] Entropy-Driven Mixed-Precision Quantization for Deep Network Design. [qnn]
  • [NeurIPS] Redistribution of Weights and Activations for AdderNet Quantization. [qnn]
  • [NeurIPS] FP8 Quantization: The Power of the Exponent. [qnn]
  • [NeurIPS] Towards Efficient Post-training Quantization of Pre-trained Language Models. [qnn]
  • [NeurIPS] Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning. [qnn]
  • [NeurIPS] ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers. [qnn]
  • [NeurIPS] ClimbQ: Class Imbalanced Quantization Enabling Robustness on Efficient Inferences. [qnn]
  • [IJCAI] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer. [qnn] [code] [71⭐]
  • [ICLR] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization. [qnn]
  • [ICLR] 8-bit Optimizers via Block-wise Quantization. [qnn]
  • [ICLR] Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization. [qnn]
  • [ICLR] Information Bottleneck: Exact Analysis of (Quantized) Neural Networks. [qnn]
  • [ICLR] QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization. [qnn]
  • [ICLR] SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation. [qnn][code]
  • [ICLR] Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks. [snn]
  • [ICLR] VC dimension of partially quantized neural networks in the overparametrized regime. [qnn]
  • [arxiv] Q-ViT: Fully Differentiable Quantization for Vision Transformer. [qnn]

2021

  • [ICLR] BiPointNet: Binary Neural Network for Point Clouds. [bnn] [torch]
  • [CVPR] Diversifying Sample Generation for Accurate Data-Free Quantization. [qnn]
  • [ACM MM] VQMG: Hierarchical Vector Quantised and Multi-hops Graph Reasoning for Explicit Representation Learning. [other]
  • [ACM MM] Fully Quantized Image Super-Resolution Networks. [qnn]
  • [NeurIPS] Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples. [qnn]
  • [NeurIPS] Post-Training Quantization for Vision Transformer. [mixed]
  • [NeurIPS] Post-Training Sparsity-Aware Quantization. [qnn]
  • [NeurIPS] Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals.
  • [NeurIPS] VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization. [other]
  • [NeurIPS] Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes.
  • [NeurIPS] A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness. [bnn] [torch]
  • [CVPR] Learnable Companding Quantization for Accurate Low-bit Neural Networks. [qnn]
  • [CVPR] Zero-shot Adversarial Quantization. [qnn] [torch]
  • [CVPR] Binary Graph Neural Networks. [bnn] [torch]
  • [CVPR] Network Quantization with Element-wise Gradient Scaling. [qnn] [torch]
  • [CVPR] PokeBNN: A Binary Pursuit of Lightweight Accuracy. [bnn] [tf]
  • [ICLR] Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks. [bnn]
  • [ICLR] High-Capacity Expert Binary Networks. [bnn]
  • [ICLR] Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network. [bnn]
  • [ICLR] BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction. [qnn] [torch]
  • [ICLR] Neural gradients are near-lognormal: improved quantized and sparse training. [qnn]
  • [ICLR] Training with Quantization Noise for Extreme Model Compression. [qnn]
  • [ICLR] Incremental few-shot learning via vector quantization in deep embedded space. [qnn]
  • [ICLR] Degree-Quant: Quantization-Aware Training for Graph Neural Networks. [qnn]
  • [ICLR] BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization. [qnn]
  • [ICLR] Simple Augmentation Goes a Long Way: ADRL for DNN Quantization. [qnn]
  • [ICLR] Sparse Quantized Spectral Clustering. [qnn]
  • [ICLR] WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic. [qnn]
  • [ECCV] PAMS: Quantized Super-Resolution via Parameterized Max Scale. [qnn]
  • [AAAI] Distribution Adaptive INT8 Quantization for Training CNNs. [qnn]
  • [AAAI] Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks. [qnn]
  • [AAAI] Optimizing Information Theory Based Bitwise Bottlenecks for Efficient Mixed-Precision Activation Quantization. [qnn]
  • [AAAI] OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization. [qnn]
  • [AAAI] Scalable Verification of Quantized Neural Networks. [qnn]
  • [AAAI] Uncertainty Quantification in CNN through the Bootstrap of Convex Neural Networks. [qnn]
  • [AAAI] FracBits: Mixed Precision Quantization via Fractional Bit-Widths. [qnn]
  • [AAAI] Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision. [qnn]
  • [AAAI] Vector Quantized Bayesian Neural Network Inference for Data Streams. [qnn]
  • [AAAI] TRQ: Ternary Neural Networks with Residual Quantization. [qnn]
  • [AAAI] Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Coefficients. [bnn]
  • [AAAI] Compressing Deep Convolutional Neural Networks by Stacking Low-Dimensional Binary Convolution Filters. [bnn]
  • [AAAI] Training Binary Neural Network without Batch Normalization for Image Super-Resolution. [bnn]
  • [AAAI] SA-BNN: State-Aware Binary Neural Network. [bnn]
  • [ACL] On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers. [qnn]
  • [arxiv] Any-Precision Deep Neural Networks. [mixed] [torch]
  • [arxiv] ReCU: Reviving the Dead Weights in Binary Neural Networks. [bnn] [torch]
  • [arxiv] Post-Training Quantization for Vision Transformer. [qnn]
  • [arxiv] A Survey of Quantization Methods for Efficient Neural Network Inference.
  • [arxiv] PTQ4ViT: Post-Training Quantization Framework for Vision Transformers. [qnn]

2020

  • [CVPR] Forward and Backward Information Retention for Accurate Binary Neural Networks. [bnn] [torch] [105⭐]
  • [ACL] End to End Binarized Neural Networks for Text Classification. [bnn]
  • [AAAI] HLHLp: Quantized Neural Networks Training for Reaching Flat Minima in Loss Surface. [qnn]
  • [AAAI] [72🔥] Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT. [qnn]
  • [AAAI] Sparsity-Inducing Binarized Neural Networks. [bnn]
  • [AAAI] Towards Accurate Low Bit-Width Quantization with Multiple Phase Adaptations.
  • [COOL CHIPS] A Novel In-DRAM Accelerator Architecture for Binary Neural Network. [hardware]
  • [CoRR] Training Binary Neural Networks using the Bayesian Learning Rule. [bnn]
  • [CVPR] [47🔥] GhostNet: More Features from Cheap Operations. [qnn] [tensorflow & torch] [1.2k⭐]
  • [CVPR] APQ: Joint Search for Network Architecture, Pruning and Quantization Policy. [qnn] [torch] [76⭐]
  • [CVPR] Rotation Consistent Margin Loss for Efficient Low-Bit Face Recognition. [qnn]
  • [CVPR] BiDet: An Efficient Binarized Object Detector. [qnn] [torch] [112⭐]
  • [CVPR] Fixed-Point Back-Propagation Training. [video] [qnn]
  • [CVPR] Low-Bit Quantization Needs Good Distribution. [qnn]
  • [DATE] BNNsplit: Binarized Neural Networks for embedded distributed FPGA-based computing systems. [bnn]
  • [DATE] PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones. [bnn] [hardware]
  • [DATE] OrthrusPE: Runtime Reconfigurable Processing Elements for Binary Neural Networks. [bnn]
  • [ECCV] Learning Architectures for Binary Networks. [bnn] [torch]
  • [ECCV] PROFIT: A Novel Training Method for sub-4-bit MobileNet Models. [qnn]
  • [ECCV] ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices. [bnn]
  • [ECCV] ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions. [bnn] [torch] [108⭐]
  • [ECCV] Differentiable Joint Pruning and Quantization for Hardware Efficiency. [hardware]
  • [EMNLP] TernaryBERT: Distillation-aware Ultra-low Bit BERT. [qnn]
  • [EMNLP] Fully Quantized Transformer for Machine Translation. [qnn]
  • [ICET] An Energy-Efficient Bagged Binary Neural Network Accelerator. [bnn] [hardware]
  • [ICASSP] Balanced Binary Neural Networks with Gated Residual. [bnn]
  • [ICML] Training Binary Neural Networks through Learning with Noisy Supervision. [bnn]
  • [ICLR] DMS: Differentiable Dimension Search for Binary Neural Networks. [bnn]
  • [ICLR] [19🔥] Training Binary Neural Networks with Real-to-Binary Convolutions. [bnn] [code is coming] [re-implement]
  • [ICLR] BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations. [bnn] [torch]
  • [ICLR] Mixed Precision DNNs: All You Need is a Good Parametrization. [mixed] [code] [73⭐]
  • [IJCV] Binarized Neural Architecture Search for Efficient Object Recognition. [bnn]
  • [IJCAI] CP-NAS: Child-Parent Neural Architecture Search for Binary Neural Networks. [bnn]
  • [IJCAI] Towards Fully 8-bit Integer Inference for the Transformer Model. [qnn] [nlp]
  • [IJCAI] Soft Threshold Ternary Networks. [qnn]
  • [IJCAI] Overflow Aware Quantization: Accelerating Neural Network Inference by Low-bit Multiply-Accumulate Operations. [qnn]
  • [IJCAI] Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks. [qnn]
  • [IJCAI] Fully Nested Neural Network for Adaptive Compression and Quantization. [qnn]
  • [ISCAS] MuBiNN: Multi-Level Binarized Recurrent Neural Network for EEG Signal Classification. [bnn]
  • [ISQED] BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency. [bnn] [torch]
  • [MICRO] GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference. [qnn] [nlp]
  • [MLST] Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML. [hardware] [qnn]
  • [NeurIPS] Rotated Binary Neural Network. [bnn] [torch]
  • [NeurIPS] Searching for Low-Bit Weights in Quantized Neural Networks. [qnn] [torch]
  • [NeurIPS] Universally Quantized Neural Compression. [qnn]
  • [NeurIPS] Efficient Exact Verification of Binarized Neural Networks. [bnn] [torch]
  • [NeurIPS] Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks. [bnn] [code]
  • [NeurIPS] HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks. [qnn]
  • [NeurIPS] Bayesian Bits: Unifying Quantization and Pruning. [qnn]
  • [NeurIPS] Robust Quantization: One Model to Rule Them All. [qnn]
  • [NeurIPS] Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow. [qnn] [torch]
  • [NeurIPS] Adaptive Gradient Quantization for Data-Parallel SGD. [qnn] [torch]
  • [NeurIPS] FleXOR: Trainable Fractional Quantization. [qnn]
  • [NeurIPS] Position-based Scaled Gradient for Model Quantization and Pruning. [qnn] [torch]
  • [NN] Training high-performance and large-scale deep neural networks with full 8-bit integers. [qnn]
  • [Neurocomputing] Eye localization based on weight binarization cascade convolution neural network. [bnn]
  • [PR] [23🔥] Binary neural networks: A survey. [bnn]
  • [PR Letters] Controlling information capacity of binary neural network. [bnn]
  • [SysML] Riptide: Fast End-to-End Binarized Neural Networks. [qnn] [tensorflow] [129⭐]
  • [TPAMI] Hierarchical Binary CNNs for Landmark Localization with Limited Resources. [bnn] [homepage] [code]
  • [TPAMI] Deep Neural Network Compression by In-Parallel Pruning-Quantization.
  • [TPAMI] Towards Efficient U-Nets: A Coupled and Quantized Approach.
  • [TVLSI] Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks. [qnn]
  • [WACV] MoBiNet: A Mobile Binary Network for Image Classification. [bnn]
  • [IEEE Access] An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network. [bnn] [hardware]
  • [IEEE Trans. Magn] SIMBA: A Skyrmionic In-Memory Binary Neural Network Accelerator. [bnn]
  • [IEEE TCS.II] A Resource-Efficient Inference Accelerator for Binary Convolutional Neural Networks. [hardware]
  • [IEEE TCS.I] IMAC: In-Memory Multi-Bit Multiplication and ACcumulation in 6T SRAM Array. [qnn]
  • [IEEE Trans. Electron Devices] Design of High Robustness BNN Inference Accelerator Based on Binary Memristors. [bnn] [hardware]
  • [arxiv] Training with Quantization Noise for Extreme Model Compression. [qnn] [torch]
  • [arxiv] Binarized Graph Neural Network. [bnn]
  • [arxiv] How Does Batch Normalization Help Binary Training? [bnn]
  • [arxiv] Distillation Guided Residual Learning for Binary Convolutional Neural Networks. [bnn]
  • [arxiv] Accelerating Binarized Neural Networks via Bit-Tensor-Cores in Turing GPUs. [bnn] [code]
  • [arxiv] MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? [bnn] [code] [192⭐]
  • [arxiv] RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks. [bnn] [qnn]
  • [paper] Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation. [bnn]
  • [arxiv] Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck. [bnn]
  • [arxiv] BinaryBERT: Pushing the Limit of BERT Quantization. [bnn] [nlp]
  • [ECCV] BATS: Binary ArchitecTure Search. [bnn]

2019

  • [AAAI] Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations. [qnn]
  • [AAAI] [31🔥] Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation. [bnn]
  • [APCCAS] Using Neuroevolved Binary Neural Networks to solve reinforcement learning environments. [bnn] [code]
  • [BMVC] [32🔥] XNOR-Net++: Improved Binary Neural Networks. [bnn]
  • [BMVC] Accurate and Compact Convolutional Neural Networks with Trained Binarization. [bnn]
  • [CoRR] RBCN: Rectified Binary Convolutional Networks for Enhancing the Performance of 1-bit DCNNs. [bnn]
  • [CoRR] TentacleNet: A Pseudo-Ensemble Template for Accurate Binary Convolutional Neural Networks. [bnn]
  • [CoRR] Improved training of binary networks for human pose estimation and image recognition. [bnn]
  • [CoRR] Binarized Neural Architecture Search. [bnn]
  • [CoRR] Matrix and tensor decompositions for training binary neural networks. [bnn]
  • [CoRR] Back to Simplicity: How to Train Accurate BNNs from Scratch? [bnn] [code] [193⭐]
  • [CVPR] [53🔥] Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation. [bnn]
  • [CVPR] SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization. [qnn]
  • [CVPR] [218🔥] HAQ: Hardware-Aware Automated Quantization with Mixed Precision. [qnn] [hardware] [torch] [233⭐]
  • [CVPR] [48🔥] Quantization Networks. [bnn] [torch] [82⭐]
  • [CVPR] Fully Quantized Network for Object Detection. [qnn]
  • [CVPR] Learning Channel-Wise Interactions for Binary Convolutional Neural Networks. [bnn]
  • [CVPR] [31🔥] Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation. [bnn]
  • [CVPR] [36🔥] Regularizing Activation Distribution for Training Binarized Deep Networks. [bnn]
  • [CVPR] A Main/Subsidiary Network Framework for Simplifying Binary Neural Network. [bnn]
  • [CVPR] Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? [bnn]
  • [FPGA] Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA. [bnn] [hardware]
  • [GLSVLSI] Binarized Depthwise Separable Neural Network for Object Tracking in FPGA. [bnn] [hardware]
  • [ICCV] [55🔥] Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. [qnn]
  • [ICCV] Bayesian Optimized 1-bit CNNs. [bnn]
  • [ICCV] Searching for Accurate Binary Neural Architectures. [bnn]
  • [ICCV] Data-Free Quantization Through Weight Equalization and Bias Correction. [qnn] [hardware] [torch]
  • [ICML] Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model. [qnn] [nlp]
  • [ICLR] [37🔥] ProxQuant: Quantized Neural Networks via Proximal Operators. [qnn] [torch]
  • [ICLR] An Empirical study of Binary Neural Networks' Optimisation. [bnn]
  • [ICIP] Training Accurate Binary Neural Networks from Scratch. [bnn] [code] [192⭐]
  • [ICUS] Balanced Circulant Binary Convolutional Networks. [bnn]
  • [IJCAI] Binarized Neural Networks for Resource-Efficient Hashing with Minimizing Quantization Loss. [bnn]
  • [IJCAI] Binarized Collaborative Filtering with Distilling Graph Convolutional Network. [bnn]
  • [ISOCC] Dual Path Binary Neural Network. [bnn]
  • [IEEE J. Emerg. Sel. Topics Circuits Syst.] Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine. [hardware]
  • [IEEE JETC] [128🔥] Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices. [hardware]
  • [IEEE J. Solid-State Circuits] An Energy-Efficient Reconfigurable Processor for Binary- and Ternary-Weight Neural Networks With Flexible Data Bit Width. [qnn]
  • [MDPI Electronics] A Review of Binarized Neural Networks. [bnn]
  • [NeurIPS] MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization. [qnn] [torch]
  • [NeurIPS] Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization. [bnn] [tensorflow]
  • [NeurIPS] [43🔥] Regularized Binary Network Training. [bnn]
  • [NeurIPS] [44🔥] Q8BERT: Quantized 8Bit BERT. [qnn] [nlp]
  • [NeurIPS] Fully Quantized Transformer for Improved Translation. [qnn] [nlp]
  • [NeurIPS] Normalization Helps Training of Quantized LSTM. [qnn] [bnn]
  • [RoEduNet] PXNOR: Perturbative Binary Neural Network. [bnn] [code]
  • [SiPS] Knowledge distillation for optimization of quantized deep neural networks. [qnn]
  • [TMM] [45🔥] Deep Binary Reconstruction for Cross-Modal Hashing. [bnn]
  • [TMM] Compact Hash Code Learning With Binary Deep Neural Network. [bnn]
  • [IEEE TCS.I] Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays. [hardware]
  • [IEEE TCS.I] Recursive Binary Neural Network Training Model for Efficient Usage of On-Chip Memory. [bnn]
  • [VLSI-SoC] A Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories. [bnn] [hardware]
  • [paper] [43🔥] BNN+: Improved Binary Network Training. [bnn]
  • [arxiv] Self-Binarizing Networks. [bnn]
  • [arxiv] Towards Unified INT8 Training for Convolutional Neural Network. [qnn]
  • [arxiv] daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices. [bnn] [hardware] [code]
  • [arxiv] QKD: Quantization-aware Knowledge Distillation. [qnn]
  • [arxiv] [59🔥] Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search. [qnn]

 

2018

  • [AAAI] From Hashing to CNNs: Training Binary Weight Networks via Hashing. [bnn]
  • [AAAI] [136🔥] Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM. [qnn] [homepage]
  • [CAAI] Fast object detection based on binary deep convolution neural networks. [bnn]
  • [CoRR] LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks. [bnn]
  • [CoRR] BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights. [bnn]
  • [CVPR] [63🔥] Two-Step Quantization for Low-bit Neural Networks. [qnn]
  • [CVPR] Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations. [qnn]
  • [CVPR] [97🔥] Towards Effective Low-bitwidth Convolutional Neural Networks. [qnn]
  • [CVPR] Modulated convolutional networks. [bnn]
  • [CVPR] [67🔥] SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks. [qnn] [code]
  • [CVPR] [630🔥] Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. [qnn]
  • [ECCV] Training Binary Weight Networks via Semi-Binary Decomposition. [bnn]
  • [ECCV] [47🔥] TBN: Convolutional Neural Network with Ternary Inputs and Binary Weights. [bnn] [qnn] [torch]
  • [ECCV] [202🔥] LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks. [qnn] [tensorflow] [188⭐]
  • [ECCV] [145🔥] Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm. [bnn] [torch] [120⭐]
  • [FCCM] ReBNet: Residual Binarized Neural Network. [bnn] [tensorflow]
  • [FPL] FBNA: A Fully Binarized Neural Network Accelerator. [hardware]
  • [ICLR] [65🔥] Loss-aware Weight Quantization of Deep Networks. [qnn] [code]
  • [ICLR] [230🔥] Model compression via distillation and quantization. [qnn] [torch] [284⭐]
  • [ICLR] [201🔥] PACT: Parameterized Clipping Activation for Quantized Neural Networks. [qnn]
  • [ICLR] [168🔥] WRPN: Wide Reduced-Precision Networks. [qnn]
  • [ICLR] Analysis of Quantized Models. [qnn]
  • [ICLR] [141🔥] Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy. [qnn]
  • [IJCAI] Deterministic Binary Filters for Convolutional Neural Networks. [bnn]
  • [IJCAI] Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models. [bnn]
  • [IJCNN] Analysis and Implementation of Simple Dynamic Binary Neural Networks. [bnn]
  • [IPDPS] BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU. [bnn]
  • [IEEE J. Solid-State Circuits] [66🔥] BRein Memory: A Single-Chip Binary/Ternary Reconfigurable in-Memory Deep Neural Network Accelerator Achieving 1.4 TOPS at 0.6 W. [hardware] [qnn]
  • [NCA] [88🔥] A survey of FPGA-based accelerators for convolutional neural networks. [hardware]
  • [NeurIPS] [150🔥] Training Deep Neural Networks with 8-bit Floating Point Numbers. [qnn]
  • [NeurIPS] [91🔥] Scalable methods for 8-bit training of neural networks. [qnn] [torch]
  • [MM] BitStream: Efficient Computing Architecture for Real-Time Low-Power Inference of Binary Neural Networks on CPUs. [bnn]
  • [Res Math Sci] Blended coarse gradient descent for full quantization of deep neural networks. [qnn] [bnn]
  • [TCAD] XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference. [hardware]
  • [TRETS] [50🔥] FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks. [qnn]
  • [TVLSI] An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks. [bnn]
  • [arxiv] Training Competitive Binary Neural Networks from Scratch. [bnn] [code] [192⭐]
  • [arxiv] Joint Neural Architecture Search and Quantization. [qnn] [torch]
  • [CVPR] Explicit loss-error-aware quantization for low-bit deep neural networks. [qnn]

2017

  • [CoRR] BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet. [bnn] [code]
  • [CVPR] [251🔥] Deep Learning with Low Precision by Half-wave Gaussian Quantization. [qnn] [code] [118⭐]
  • [CVPR] [156🔥] Local Binary Convolutional Neural Networks. [bnn] [torch] [94⭐]
  • [FPGA] [463🔥] FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. [hardware] [bnn]
  • [ICASSP] Fixed-point optimization of deep neural networks with adaptive step size retraining. [qnn]
  • [ICCV] [130🔥] Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources. [bnn] [homepage] [torch] [207⭐]
  • [ICCV] [55🔥] Performance Guaranteed Network Acceleration via High-Order Residual Quantization. [qnn]
  • [ICLR] [554🔥] Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights. [qnn] [torch] [144⭐]
  • [ICLR] [119🔥] Loss-aware Binarization of Deep Networks. [bnn] [code]
  • [ICLR] [222🔥] Soft Weight-Sharing for Neural Network Compression. [other]
  • [ICLR] [637🔥] Trained Ternary Quantization. [qnn] [torch] [90⭐]
  • [InterSpeech] Binary Deep Neural Networks for Speech Recognition. [bnn]
  • [IPDPSW] On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA. [hardware]
  • [JETC] A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks. [hardware] [bnn]
  • [NeurIPS] [293🔥] Towards Accurate Binary Convolutional Neural Network. [bnn] [tensorflow]
  • [Neurocomputing] [126🔥] FP-BNN: Binarized neural network on FPGA. [hardware]
  • [MWSCAS] Deep learning binary neural network on an FPGA. [hardware] [bnn]
  • [arxiv] [71🔥] Ternary Neural Networks with Fine-Grained Quantization. [qnn]
  • [arxiv] ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks. [qnn] [code] [53⭐]

2016

  • [CoRR] [1k🔥] DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. [qnn] [code] [5.8k⭐]
  • [ECCV] [2.7k🔥] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. [bnn] [torch] [787⭐]
  • [ICASSP] Fixed-point Performance Analysis of Recurrent Neural Networks. [qnn]
  • [NeurIPS] [572🔥] Ternary weight networks. [qnn] [code] [61⭐]
  • [NeurIPS] [1.7k🔥] Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. [bnn] [torch] [239⭐]
  • [CVPR] [270🔥] Quantized convolutional neural networks for mobile devices. [code]

 

2015

  • [ICML] [191🔥] Bitwise Neural Networks. [bnn]
  • [NeurIPS] [1.8k🔥] BinaryConnect: Training Deep Neural Networks with binary weights during propagations. [bnn] [code] [330⭐]
  • [arxiv] Resiliency of Deep Neural Networks under quantizations. [qnn]

Codes_and_Docs

  • [Doc] ZF-Net: An Open Source FPGA CNN Library.

  • [Doc] Accelerating CNN inference on FPGAs: A Survey.

  • [Chinese] An Overview of Deep Compression Approaches.

  • [Chinese] Neural Network Binarization for Embedded Deep Learning: FPGA Implementation

 
