By Chockalingam Muthian

EfficientNet

Improving Accuracy and Efficiency through AutoML and Model Scaling


Convolutional neural networks (CNNs) are commonly developed at a fixed resource cost, and then scaled up in order to achieve better accuracy when more resources are made available. For example, ResNet can be scaled up from ResNet-18 to ResNet-200 by increasing the number of layers, and recently, GPipe achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline CNN by a factor of four. The conventional practice for model scaling is to arbitrarily increase the CNN depth or width, or to use larger input image resolution for training and evaluation. While these methods do improve accuracy, they usually require tedious manual tuning, and still often yield suboptimal performance. What if, instead, we could find a more principled method to scale up a CNN to obtain better accuracy and efficiency?

There is a novel model scaling method that uses a simple yet highly effective compound coefficient to scale up CNNs in a more structured manner. Unlike conventional approaches that arbitrarily scale network dimensions such as width, depth, and resolution, this method uniformly scales each dimension with a fixed set of scaling coefficients. Powered by this scaling method and recent progress in AutoML, there is a family of models called EfficientNets, which surpass state-of-the-art accuracy with up to 10x better efficiency (smaller and faster).

Compound Model Scaling: A Better Way to Scale Up CNNs


While scaling any individual dimension improves model performance, the best overall performance comes from balancing all dimensions of the network, namely width, depth, and image resolution, against the available resources.

The first step in the compound scaling method is to perform a grid search to find the relationship between the different scaling dimensions of the baseline network under a fixed resource constraint (e.g., 2x more FLOPS). This determines the appropriate scaling coefficient for each of the dimensions mentioned above. Those coefficients are then applied to scale up the baseline network to the desired target model size or computational budget.
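To make the scaling step concrete, here is a minimal sketch of applying a compound coefficient. The per-dimension coefficients alpha=1.2, beta=1.1, gamma=1.15 are the values reported in the EfficientNet paper; the baseline dimensions and the helper function are illustrative assumptions, not the official implementation.

```python
# Minimal sketch of EfficientNet-style compound scaling.
# alpha, beta, gamma come from the grid search on the baseline network
# (1.2 / 1.1 / 1.15 are the values reported in the EfficientNet paper);
# phi is the compound coefficient controlling how much extra compute to spend.

def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for a given phi."""
    depth_mult = alpha ** phi    # more layers
    width_mult = beta ** phi     # more channels per layer
    res_mult = gamma ** phi      # larger input images
    return depth_mult, width_mult, res_mult

# Illustrative baseline (hypothetical numbers, not the EfficientNet-B0 spec).
base_layers, base_channels, base_resolution = 18, 64, 224

for phi in range(1, 4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: layers={round(base_layers * d)}, "
          f"channels={round(base_channels * w)}, "
          f"resolution={round(base_resolution * r)}")
```

Because the coefficients are chosen so that alpha * beta^2 * gamma^2 is approximately 2, each increment of phi roughly doubles the FLOPS of the scaled network.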


This compound scaling method consistently improves model accuracy and efficiency when scaling up existing models such as MobileNet (+1.4% ImageNet accuracy) and ResNet (+0.7%), compared to conventional scaling methods.

EfficientNet Architecture

EfficientNet performance compared with other CNNs on ImageNet


In general, the EfficientNet models achieve both higher accuracy and better efficiency than existing CNNs, reducing parameter size and FLOPS by an order of magnitude. For example, in the high-accuracy regime, EfficientNet-B7 reaches state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on CPU inference than the previous GPipe. Compared with the widely used ResNet-50, EfficientNet-B4 uses similar FLOPS while improving the top-1 accuracy from 76.3% for ResNet-50 to 82.6% (+6.3%).

Model Size vs. Accuracy Comparison

By providing significant improvements to model efficiency, EfficientNets could potentially serve as a new foundation for future computer vision tasks. The EfficientNet models are now open source, which we hope can benefit the larger machine learning community.
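As a pointer to getting started, here is a minimal usage sketch, assuming TensorFlow 2.3 or later, whose tf.keras.applications module ships pretrained EfficientNet-B0 through B7 weights; the random input below is only a placeholder for a real image.

```python
import numpy as np
import tensorflow as tf

# Load EfficientNet-B0 with ImageNet-pretrained weights (224x224 inputs).
model = tf.keras.applications.EfficientNetB0(weights="imagenet")

# Placeholder input; replace with a real image resized to 224x224.
image = np.random.randint(0, 255, size=(1, 224, 224, 3)).astype("float32")

# Keras EfficientNet expects raw pixel values; preprocess_input is a pass-through
# kept for API consistency with the other tf.keras.applications models.
inputs = tf.keras.applications.efficientnet.preprocess_input(image)
preds = model.predict(inputs)
print(tf.keras.applications.efficientnet.decode_predictions(preds, top=3))
```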
