'Network Compression' Outline

(Curtis Kim) #1

Network Compression

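The Network Compression category reviews papers on accelerating deep learning models or reducing their memory footprint.

  • A research area that shrinks networks or speeds them up by removing redundant network weights, or by applying techniques such as factorization and weight quantization.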
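
As a rough illustration of two of the techniques named above, the sketch below prunes small-magnitude weights and then snaps weights onto a few evenly spaced levels. It is a minimal toy under stated assumptions, not the method of any paper listed here: the function names, the threshold, and the uniform level spacing are all hypothetical choices made for this example.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out weights whose absolute value is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_uniform(weights, num_levels):
    """Snap each weight to the nearest of num_levels evenly spaced values
    between min(weights) and max(weights)."""
    lo, hi = min(weights), max(weights)
    if hi == lo or num_levels < 2:
        return list(weights)
    step = (hi - lo) / (num_levels - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


weights = [-1.0, -0.4, 0.02, 0.3, 1.0]
pruned = prune_by_magnitude(weights, threshold=0.1)
quantized = quantize_uniform(weights, num_levels=3)

print(pruned)     # the 0.02 entry becomes 0.0, the rest are unchanged
print(quantized)  # every entry is one of three levels: -1.0, 0.0, 1.0
```

Pruning reduces the number of non-zero weights to store, while quantization reduces the number of bits needed per weight; the two can be combined.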

Quantize / Weight Compression

Weight Compression

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size.

Quantized / Low-Precision Neural Networks

Improving the speed of neural networks on CPUs

Binarynet: Training deep neural networks with weights and activations constrained to +1 or -1.

XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

Bitwise neural networks
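
When weights and activations are constrained to +1 or -1, as in the BinaryNet title above, a dot product can be computed with bitwise operations instead of multiplications. The following is a minimal sketch of that idea, assuming a sign-based binarization and a bit layout chosen only for this example; the helper names are hypothetical and do not come from any of the listed papers.

```python
def binarize(values):
    """Map each real value to +1 (non-negative) or -1 (negative)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer: bit i is set when signs[i] == +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so the dot product is
    2 * agreements - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    return 2 * bin(agree).count("1") - n


a = binarize([0.7, -1.2, 0.1, -0.3])
b = binarize([-0.5, -0.9, 2.0, 0.4])

reference = sum(x * y for x, y in zip(a, b))   # ordinary dot product
fast = xnor_dot(pack_bits(a), pack_bits(b), len(a))
print(reference, fast)  # both values are equal
```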

Pruning

Deep compression: Compressing deep neural network with pruning, trained quantization and Huffman coding.

Learning both weights and connections for efficient neural network.


Sparse Convolution : Factorizing / Compressing Convolutional Layers

Efforts to reduce the cost of the convolution operation, which is regarded as the most time-consuming part of a network.

Learning structured sparsity in deep neural networks.

Quantized convolutional neural networks for mobile devices

LCNN: Lookup-based Convolutional Neural Network

  • 2016.11
  • arxiv: https://arxiv.org/abs/1611.06473
  • #cvpr #cvpr2017
  • “Our fastest LCNN offers 37.6x speed up over AlexNet while maintaining 44.3% top-1 accuracy.”

  • The concept is to reconstruct convolution filters from a limited number of dictionary entries.
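
The dictionary idea can be sketched as follows: instead of storing every filter vector in full, store a small shared dictionary plus, per vector, a few indices into it and matching coefficients. This is a simplified illustration and not the paper's exact formulation; `reconstruct_vector`, the dictionary contents, and the vector length are assumptions made for the example.

```python
def reconstruct_vector(dictionary, indices, coeffs):
    """Rebuild one filter vector as a weighted sum of a few dictionary rows.

    dictionary: list of equal-length vectors shared by all filters
    indices:    which dictionary rows this filter vector uses
    coeffs:     one coefficient per selected row
    """
    length = len(dictionary[0])
    out = [0.0] * length
    for idx, c in zip(indices, coeffs):
        row = dictionary[idx]
        for j in range(length):
            out[j] += c * row[j]
    return out


# A toy dictionary with three entries of length 3.
dictionary = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
]

# This vector is stored as two (index, coefficient) pairs only.
vec = reconstruct_vector(dictionary, indices=[0, 2], coeffs=[0.5, 2.0])
print(vec)  # 0.5 * row 0 + 2.0 * row 2
```

With many filters sharing one small dictionary, the storage per filter shrinks to its indices and coefficients.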

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

  • 2017.4
  • arxiv : https://arxiv.org/abs/1704.04861
  • #google
  • “We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks.”

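  • The concept is to factorize a convolution filter into two convolutions (a depthwise convolution followed by a 1x1 pointwise convolution) to reduce the number of operations.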
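
The saving from this factorization can be checked with simple arithmetic. The sketch below counts multiply-adds for a standard convolution and for its depthwise separable counterpart on a square feature map with stride 1 and "same" padding; the layer sizes (3x3 kernel, 512 input and output channels, 14x14 feature map) are example values chosen here, not figures taken from this outline.

```python
def standard_conv_mult_adds(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a standard kernel x kernel convolution."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def depthwise_separable_mult_adds(kernel, in_ch, out_ch, fmap):
    """Multiply-adds of a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * in_ch * fmap * fmap   # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap            # 1x1 conv mixes the channels
    return depthwise + pointwise


std = standard_conv_mult_adds(kernel=3, in_ch=512, out_ch=512, fmap=14)
sep = depthwise_separable_mult_adds(kernel=3, in_ch=512, out_ch=512, fmap=14)
ratio = sep / std

print(std, sep)
print(ratio)  # equals 1/out_ch + 1/kernel**2
```

For a 3x3 kernel the ratio is dominated by the 1/9 term, so the factorized layer needs roughly 8 to 9 times fewer multiply-adds.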

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

  • 2017.7
  • Channel Shuffle: “The summary of this paper is that keeping the number of channels as high as possible while making the connections themselves sparse gives better performance.”

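  • It uses the same depthwise separable convolution as MobileNet, but additionally introduces the concept of channel shuffle.
  • The smaller the model, the larger ShuffleNet's performance advantage over MobileNet becomes, and the paper shows that it runs very fast on ARM-based processors.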
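
Channel shuffle is commonly described as a reshape, transpose, and flatten over the channel axis, so that the next grouped convolution sees channels from every previous group. The sketch below applies that permutation to a plain list of channel labels; it is an illustration only, and the function name and the list-based representation are assumptions made for this example.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Conceptually: view the channels as a (groups, per_group) grid,
    transpose it to (per_group, groups), then flatten.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels in three groups: (0, 1), (2, 3), (4, 5).
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=3)
print(shuffled)  # [0, 2, 4, 1, 3, 5]
```

After the shuffle, each consecutive block of channels contains one channel from every original group, which is what lets information flow between groups.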

[MobilenetV2] Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation

  • 2018.1
  • The MobileNetV2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers opposite to traditional residual models which use expanded representations in the input and output.
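
The contrast described in the quote can be made concrete by tracing channel widths through each kind of block. This is a minimal sketch under stated assumptions: the expansion factor of 6 and the reduction factor of 4 are illustrative defaults not given in this outline, and both function names are hypothetical.

```python
def inverted_residual_widths(bottleneck_ch, expansion=6):
    """Channel widths through an inverted residual block:
    thin input -> 1x1 expand -> 3x3 depthwise -> 1x1 linear projection (thin)."""
    expanded = bottleneck_ch * expansion
    return [bottleneck_ch, expanded, expanded, bottleneck_ch]


def classic_bottleneck_widths(wide_ch, reduction=4):
    """Channel widths through a traditional bottleneck residual block:
    wide input -> 1x1 reduce -> 3x3 conv -> 1x1 restore (wide)."""
    narrow = wide_ch // reduction
    return [wide_ch, narrow, narrow, wide_ch]


inverted = inverted_residual_widths(24)
classic = classic_bottleneck_widths(256)

print(inverted)  # [24, 144, 144, 24]: thin at both ends, wide in the middle
print(classic)   # [256, 64, 64, 256]: wide at both ends, thin in the middle
```

In both cases the first and last widths match, which is what allows the residual (shortcut) connection to add the block input to its output.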


Pelee: A Real-Time Object Detection System on Mobile Devices

Accelerating / Fast Algorithm

Code Optimization

Reference

[1] https://handong1587.github.io/deep_learning/2015/10/09/acceleration-model-compression.html


'Image Classification' Outline