Week 6

ref: https://dl.acm.org/doi/pdf/10.1145/3065386

Article reviewed: ImageNet Classification with Deep Convolutional Neural Networks

Basic:

Summary:
Given immense computing power (GPUs), we can train a large CNN (a mix of convolutional and fully connected layers) on labeled data to improve image classification. This is supervised learning. To train a large, deep CNN faster, the authors use non-saturating neurons and a very efficient GPU implementation. To reduce overfitting and the error rate, they use dropout. Classification performance depends on the learned weights of the connections.

?😇 Questions:

  • Are randomly assigned weights OK?
  • What are labeled data?
  • Top-1 and top-5 error rates
  • Local Response Normalization
  • It works for static data; what about videos?
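
On the top-1 / top-5 question: the top-k error rate is the fraction of test images whose correct label is not among the k classes the model scores highest. A minimal sketch, assuming the model outputs one score per class. The function name `top_k_error` and the score values are made up for illustration.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 3 images over 4 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: hit at k=1
    [0.5, 0.1, 0.3, 0.1],  # true class 2: miss at k=1, hit at k=2
    [0.4, 0.3, 0.2, 0.1],  # true class 3: miss at k=1 and k=2
]
labels = [1, 2, 3]

print(top_k_error(scores, labels, 1))
print(top_k_error(scores, labels, 2))
```

With k=5 and 1000 classes this is the top-5 error; the error can only go down as k grows.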

Keywords: labeled data, GPU, ImageNet, 2D conv

Five C’s:
Category: Deep Learning / ImageNet

Context: image classification and supervised learning

Correctness: ⭐️⭐️⭐️⭐️⭐️

Contributions: Fast training and high accuracy for supervised learning; large labeled datasets used to train the NN

Clarity: ⭐️⭐️⭐️

Outline:

1. Introduction:

Labeled datasets have been relatively small. On the MNIST digit-recognition task, the best error rate (<0.3%) approaches human performance.

To get better training results, the model needs a lot of prior knowledge to make up for the data we do not have, and it needs to make correct assumptions about images.

Dataset: ImageNet has over 15 million labeled images in roughly 22,000 categories. Experiments use the ILSVRC-2010 subset.

2. Architecture

Network Structure: 8 learned layers: 5 conv + 3 fully connected

Local Response Normalization:
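
A minimal sketch of local response normalization at a single spatial position, to make the idea concrete: each channel's activation is divided by a term that grows with the squared activations of its neighboring channels. The function name `local_response_norm` is hypothetical; the default constants (k=2, n=5, alpha=1e-4, beta=0.75) follow the values reported in the paper.

```python
def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize the activations `a` (one value per channel) at one spatial position."""
    num_channels = len(a)
    half = n // 2  # number of neighboring channels on each side
    out = []
    for i in range(num_channels):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        # Sum of squared activations over the neighboring channels (including i).
        sq_sum = sum(a[j] ** 2 for j in range(lo, hi + 1))
        out.append(a[i] / (k + alpha * sq_sum) ** beta)
    return out


# A strongly active neighbor suppresses the response of channel 0.
print(local_response_norm([1.0, 0.0])[0])
print(local_response_norm([1.0, 100.0])[0])
```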

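Overlapping Pooling: pooling units are spaced s pixels apart, and each one summarizes a z×z neighborhood centered on it. If s = z, we get the traditional pooling used in CNNs. If s < z, the pooling regions overlap.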
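
To see the difference between s = z and s < z, here is a small 1D sketch (the paper pools in 2D and uses s = 2, z = 3). The helper `max_pool_1d` is a made-up name, and for simplicity each window starts at the stride position rather than being centered on it.

```python
def max_pool_1d(values, z, s):
    """Max-pool a 1D list with window size z and stride s (no padding)."""
    return [max(values[i:i + z]) for i in range(0, len(values) - z + 1, s)]


row = [1, 3, 2, 5, 4, 0, 6]

# s = z: windows [1,3], [2,5], [4,0] do not share any element.
print(max_pool_1d(row, z=2, s=2))  # [3, 5, 4]

# s < z: windows [1,3,2], [2,5,4], [4,0,6] share their edge elements.
print(max_pool_1d(row, z=3, s=2))  # [3, 5, 6]
```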

Objective: multinomial logistic regression

Dropout: probability 0.5, applied to the first two fully connected layers to reduce overfitting
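
A rough sketch of dropout as the paper describes it: during training each hidden neuron's output is set to zero with probability 0.5, and at test time all neurons are used but their outputs are multiplied by 0.5. The function names and the `rng` parameter are assumptions for illustration only.

```python
import random


def dropout_train(activations, p=0.5, rng=random):
    """Training: zero each activation independently with probability p."""
    return [0.0 if rng.random() < p else a for a in activations]


def dropout_test(activations, p=0.5):
    """Test: keep every neuron, but scale outputs by the keep probability (1 - p)."""
    return [a * (1.0 - p) for a in activations]


rng = random.Random(0)  # seeded only so the run is repeatable
hidden = [1.0] * 10
print(dropout_train(hidden, rng=rng))  # roughly half of the entries become 0.0
print(dropout_test(hidden))            # every entry becomes 0.5
```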

Batch size: 128, momentum: 0.9, weight decay: 0.0005

📍 Equal learning rate for all layers
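
How these hyperparameters fit together, as a sketch of the paper's update rule for a single weight: v ← 0.9·v − 0.0005·lr·w − lr·grad, then w ← w + v. The function name `sgd_step` is hypothetical, `grad` stands for the gradient averaged over a batch of 128, and the learning rate and starting values are only for illustration.

```python
def sgd_step(w, v, grad, lr, momentum=0.9, weight_decay=0.0005):
    """One SGD update with momentum and weight decay for a single weight."""
    v_next = momentum * v - weight_decay * lr * w - lr * grad
    w_next = w + v_next
    return w_next, v_next


w, v = 1.0, 0.0  # illustrative starting weight and velocity
for step in range(3):
    # Same learning rate for every layer; the constant gradient is made up.
    w, v = sgd_step(w, v, grad=2.0, lr=0.01)
    print(step, w, v)
```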