Thesis title (Chinese): 加速式卷積神經網路模型
Thesis title (English): An Accelerative Convolution Neural Network Model
Institution: National Taipei University of Technology (臺北科技大學)
College: College of Mechanical and Electrical Engineering (機電學院)
Department: Graduate Institute of Automation Technology (自動化科技研究所)
Academic year of graduation: 106 (ROC calendar)
Semester of graduation: First semester
Year of publication: 106
Author (Chinese): 鄭仲勝
Author (English): Chung-Sheng Cheng
Student ID: 104618014
Degree: Master's
Language: Chinese
Oral defense date: 2018/01/29
Number of pages: 52
Advisor (Chinese): 林顯易
Advisor (English): Hsien-I Lin
Committee members (Chinese): 李俊賢; 林惠勇
Keywords (Chinese): 機器學習; 加速網路模型; 深度學習
Keywords (English): Machine learning; Accelerative neural network model; Deep learning
Abstract (Chinese, translated): With the advent of Industry 4.0, more and more factories are moving toward automation and machine learning. Automated factories commonly center on robots, which mainly perform repetitive and labor-intensive work. In recent years, robotic arms that combine sensors with machine learning have gradually been developed to adapt to a wider variety of tasks. Machine learning is a technology that lets computers learn rules from massive amounts of data and correct their own mistakes; compared with early systems based on hand-crafted rules, it shows many advantages. However, traditional machine learning requires feature extraction from the raw data, and its ability to approximate complex functions is limited when samples are finite; shallow learning models alone are far from sufficient if machines are to respond like the human brain. In recent years, deep learning has emerged. Compared with traditional shallow learning algorithms, it can better approximate complex functions and automatically extract features that represent the characteristics of the entire data set, without complex pre-processing, giving it excellent generalization and versatility. Its drawback is an enormous computational load: automatically learning data features and approximating complex functions requires a multi-layer network structure. Usually, the more layers a network has, the better its prediction and classification performance, but this comes with millions or tens of millions of parameters. An excessively large number of parameters easily causes over-fitting and heavy use of computer memory. We therefore greatly reduce the number of parameters between the feature layers and the fully connected layers of a convolutional neural network, accelerating the network model and shortening computation time at the cost of a slight drop in recognition accuracy. To verify the generality of the proposed accelerated convolutional network model, we apply it to two convolutional neural networks, AlexNet and VGGNet, on the THUR15K [36], Caltech-101 [37], Caltech-256 [38], and GHIM10k [39] databases. The experimental results show that the proposed accelerated model sacrifices a small amount of accuracy (at most 1.34%) while greatly reducing the number of parameters (by up to about 76%). Applying this method to factory component recognition, accuracy drops by 0.05%, but more than half of the parameters are removed (about 63%), and recognition time is shortened from 120 ms to 65 ms.
Abstract (English): Machine learning is a technology that allows computers to learn rules from vast amounts of data and correct their own mistakes. It shows clear superiority over conventional hand-crafted rule-based methods. However, in shallow learning, the capability of modelling complex functions is limited in the case of finite samples. Thus, shallow learning models are not enough to simulate human brains in solving difficult problems. Recently, deep learning was proposed to model complex functions that shallow learning cannot achieve and to extract data features automatically. Deep learning generalizes well even without data pre-processing. Its disadvantage, however, is computational cost: in order to learn the characteristics of data automatically and to model complex functions, a multi-layered network is required. Usually, the more layers a network has, the better its prediction performance, but this comes with the downside that the network has millions of weight parameters. Because a great number of parameters causes over-fitting and massive use of computer memory, we propose a method to reduce the parameters of a convolutional neural network. The goal is to reduce the number of network parameters as much as possible while only slightly degrading accuracy. To validate the proposed method, the THUR15K [36], Caltech-101 [37], Caltech-256 [38], and GHIM10k [39] databases are used. The experimental results show that the number of parameters is greatly reduced with only a slight drop in accuracy (at most 1.34%).
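The abstracts target the junction between the last feature maps and the fully connected layers because that is where most CNN parameters live. A back-of-the-envelope count for a plain single-stream AlexNet (layer shapes from Krizhevsky et al. [7], ignoring the original two-GPU grouping) illustrates why; this is only a hypothetical sketch of the motivation, not the thesis's actual reduction method:

```python
# Parameter count per layer: weights + biases.
conv_layers = [  # (in_channels, out_channels, kernel_size)
    (3, 96, 11), (96, 256, 5), (256, 384, 3), (384, 384, 3), (384, 256, 3),
]
fc_layers = [  # (in_features, out_features)
    (256 * 6 * 6, 4096),  # fc6: flattened 6x6x256 feature maps -> 4096
    (4096, 4096),         # fc7
    (4096, 1000),         # fc8: 1000 ImageNet classes
]

conv_params = sum(cin * cout * k * k + cout for cin, cout, k in conv_layers)
fc_params = sum(fin * fout + fout for fin, fout in fc_layers)
total = conv_params + fc_params

print(f"conv: {conv_params:,}  fc: {fc_params:,}  fc share: {fc_params/total:.0%}")
# prints: conv: 3,747,200  fc: 58,631,144  fc share: 94%
```

Since fc6 alone holds roughly 38 million of the 62 million parameters, shrinking the feature-to-fully-connected connections can plausibly remove the reported ~63–76% of parameters while leaving the convolutional feature extractor, and hence most of the accuracy, intact.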
Table of contents:
Abstract (Chinese) i
Abstract (English) iii
Acknowledgements v
Table of Contents vii
List of Tables viii
List of Figures ix
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 2
1.3 Research Outline 3
1.4 Research Contributions 3
Chapter 2 Literature Review 4
2.1 Deep Learning 4
2.1.1 Techniques for Accelerating Deep Learning 5
Chapter 3 Related Theory 8
3.1 Artificial Neural Networks 8
3.2 Convolutional Neural Networks 10
3.2.1 Local Receptive Fields 11
3.2.2 Weight Sharing 12
3.2.3 Convolutional Layers 13
3.2.4 Pooling Layers 13
3.2.5 Rectified Linear Units 14
3.2.6 Dropout 15
Chapter 4 Accelerated Deep Learning Network Model 17
4.1 Accelerated Neural Network Model 17
Chapter 5 Vision System 28
5.1 Hardware and Software 29
5.1.1 Industrial Camera 29
5.1.2 Small Conveyor Belt 30
5.1.3 Caffe Deep Learning Framework 31
5.2 Image Detection System 32
5.2.1 Gaussian Background Model Subtraction 32
5.2.2 Region-of-Interest Extraction 34
5.2.3 Rotation Pose 35
5.3 Convolutional Network Learning of Ethernet Connector Faces 35
5.3.1 Data Collection and Augmentation 35
5.3.2 Model Training 41
Chapter 6 Experiments 43
Chapter 7 Conclusions and Future Work 47
References 48
References:
[1] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural computation, vol. 18, no. 7, pp. 1527–1554, 2006.
[2] G. Hinton, L. Deng, D. Yu, G. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, et al., “Deep neural networks for acoustic modeling,”.
[3] J. Huang and B. Kingsbury, “Audio-visual deep learning for noise robust speech recognition,” in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pp. 7596–7599, IEEE, 2013.
[4] W. Ouyang and X. Wang, “Joint deep learning for pedestrian detection,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2056–2063, 2013.
[5] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, “Learning hierarchical features for scene labeling,” IEEE transactions on pattern analysis and machine intelligence, vol. 35, no. 8, pp. 1915–1929, 2013.
[6] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in neural information processing systems, pp. 1097–1105, 2012.
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 248–255, IEEE, 2009.
[9] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315–323, 2011.
[10] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” arXiv preprint arXiv:1207.0580, 2012.
[11] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning, pp. 448–456, 2015.
[12] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9, 2015.
[13] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[14] S. Srisuk and S. Ongkittikul, “Robust face recognition based on weighted deepface,” in Electrical Engineering Congress (iEECON), 2017 International, pp. 1–4, IEEE, 2017.
[15] S. Contreras and F. De La Rosa, “Using deep learning for exploration and recognition of objects based on images,” in Robotics Symposium and IV Brazilian Robotics Symposium (LARS/SBR), 2016 XIII Latin American, pp. 1–6, IEEE, 2016.
[16] G.-S. Hsu, A. Ambikapathi, S.-L. Chung, and C.-P. Su, “Robust license plate detection in the wild,” in Advanced Video and Signal Based Surveillance (AVSS), 2017 14th IEEE International Conference on, pp. 1–6, IEEE, 2017.
[17] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al., “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[18] Y. Cheng, D. Wang, P. Zhou, and T. Zhang, “A survey of model compression and acceleration for deep neural networks,” arXiv preprint arXiv:1710.09282, 2017.
[19] M. Lin, Q. Chen, and S. Yan, “Network in network,” arXiv preprint arXiv:1312.4400, 2013.
[20] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size,” arXiv preprint arXiv:1602.07360, 2016.
[21] S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding,” arXiv preprint arXiv:1510.00149, 2015.
[22] S. Srinivas and R. V. Babu, “Data-free parameter pruning for deep neural networks,” arXiv preprint arXiv:1507.06149, 2015.
[23] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, “Pruning filters for efficient ConvNets,” arXiv preprint arXiv:1608.08710, 2016.
[24] J. Ba and R. Caruana, “Do deep nets really need to be deep?,” in Advances in neural information processing systems, pp. 2654–2662, 2014.
[25] T. Chen, I. Goodfellow, and J. Shlens, “Net2net: Accelerating learning via knowledge transfer,” arXiv preprint arXiv:1511.05641, 2015.
[26] S. Han, J. Pool, S. Narang, H. Mao, E. Gong, S. Tang, E. Elsen, P. Vajda, M. Paluri, J. Tran, et al., “DSD: Dense-sparse-dense training for deep neural networks,” 2016.
[27] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.
[28] P. Luo, Z. Zhu, Z. Liu, X. Wang, X. Tang, et al., “Face model compression by distilling knowledge from neurons,” in AAAI, pp. 3560–3566, 2016.
[29] S. Han, J. Pool, J. Tran, and W. Dally, “Learning both weights and connections for efficient neural network,” in Advances in Neural Information Processing Systems, pp. 1135–1143, 2015.
[30] F. Mamalet and C. Garcia, “Simplifying ConvNets for fast learning,” Artificial Neural Networks and Machine Learning–ICANN 2012, pp. 58–65, 2012.
[31] M. Jaderberg, A. Vedaldi, and A. Zisserman, “Speeding up convolutional neural networks with low rank expansions,” arXiv preprint arXiv:1405.3866, 2014.
[32] A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural network acoustic models,” in Proc. ICML, vol. 30, 2013.
[33] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM international conference on Multimedia, pp. 675–678, ACM, 2014.
[34] Z. Zivkovic, “Improved adaptive Gaussian mixture model for background subtraction,” in Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, vol. 2, pp. 28–31, IEEE, 2004.
[35] R. C. Gonzalez, R. E. Woods, et al., “Digital image processing,” 1992.
[36] M.-M. Cheng, N. J. Mitra, X. Huang, and S.-M. Hu, “Salientshape: Group saliency in image collections,” The Visual Computer, vol. 30, no. 4, pp. 443–453, 2014.
[37] L. Fei-Fei, R. Fergus, and P. Perona, “Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories,” Computer Vision and Image Understanding, vol. 106, no. 1, pp. 59–70, 2007.
[38] G. Griffin, A. Holub, and P. Perona, “Caltech-256 object category dataset,” 2007.
[39] G.-H. Liu, J.-Y. Yang, and Z. Li, “Content-based image retrieval using computational visual attention model,” Pattern Recognition, vol. 48, no. 8, pp. 2554–2566, 2015.
Full-text access rights: Authorized for public release from 2021-02-13.