Thesis title (Chinese): 使用Kinect體感攝影機藉由人體骨架進行人類動作識別
Thesis title (English): Using Human Skeleton to Recognize Human Motion by Kinect's Camera
Institution: 臺北科技大學 (National Taipei University of Technology)
College: 電資學院 (College of Electrical Engineering and Computer Science)
Department: 資訊工程系研究所 (Graduate Institute of Computer Science and Information Engineering)
Academic year of graduation: 99 (ROC calendar)
Year of publication: 100 (ROC calendar; 2011)
Name (Chinese): 吳貴崗
Name (English): Kuei-Kang Wu
Student ID: 97598058
Degree: Master's
Language: Chinese
Oral defense date: 2011-07-25
Number of pages: 65
Advisor (Chinese): 張厥煒
Advisor (English): Chueh-Wei Chang
Committee members: 楊士萱; 奚正寧
Chinese keywords: 人體骨架; 人類動作; 動作識別; Kinect; Rule-Based; Example-Based
English keywords: Human Skeleton; Human Motion; Motion Recognition; Kinect; Rule-Based; Example-Based
Abstract (Chinese): The analysis and recognition of human motion is one of the important topics in current computer vision and pattern recognition research. A human motion is composed of a sequence of static postures. Because motion carries highly complex, high-dimensional information in both space and time, and because self-occlusion can occur, accurately analyzing motion with an ordinary 2D camera still faces bottlenecks. For representing the whole human body, a skeleton is intuitively more direct than a body silhouette. This thesis therefore uses the Kinect motion-sensing camera for human tracking and motion capture, extracting the three-dimensional Cartesian coordinates of the skeleton joints for human motion recognition.
This thesis builds a view-invariant motion matching model that combines a Rule-Based approach with an Example-Based approach. The three-dimensional Cartesian coordinates (x, y, z) of the 15 skeleton joints produced by Kinect with OpenNI are normalized to obtain an object coordinate system whose origin is the torso. Taking baseball as an example, the relationships between skeleton joints are used to estimate postures and to build rules for overhand pitching. These rules serve as a filter before sample matching, screening out motions that cannot be overhand pitches. The live motion is then compared against the best five key-pose samples computed from the training samples by the K-Means clustering algorithm: the Euclidean distances between the 15 joints of the live motion and those of each key pose are summed to give the motion difference, and the smaller the difference, the higher the similarity.
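The coordinate normalization described above, translating every joint so the torso becomes the origin of the object coordinate system, can be sketched as follows. This is a minimal illustration, not the thesis implementation: the joint ordering and the torso index are assumptions.

```python
def normalize_skeleton(joints, torso=0):
    """Translate every joint so the torso joint becomes the origin.

    joints: list of 15 (x, y, z) tuples in camera coordinates.
    torso:  index of the torso joint (assumed here to be 0).
    Returns a new list in the torso-centered object coordinate system.
    """
    tx, ty, tz = joints[torso]
    return [(x - tx, y - ty, z - tz) for (x, y, z) in joints]
```

Because every joint is expressed relative to the torso, the representation no longer depends on where the person stands in the camera's field of view.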
With an average recognition similarity of 86.10% and a processing speed of 25 FPS, this matching model performs well and can be extended to recognizing other motions.
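The K-Means selection of key-pose samples mentioned in the abstract can be sketched as below. This is a pure-Python illustration under stated assumptions (pose vectors flattened to 45 floats, 15 joints x (x, y, z); function name and parameters are hypothetical); a production system would use a vetted K-Means implementation.

```python
import math
import random

def kmeans_key_poses(poses, k=5, iters=50, seed=0):
    """Cluster flattened pose vectors with K-Means and return, for each
    cluster, the training pose closest to its centroid (a 'key pose').

    poses: list of pose vectors (each a flat list of floats, e.g.
    45 values for 15 joints x (x, y, z)).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(poses, k)]
    for _ in range(iters):
        # Assignment step: each pose joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in poses:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(v) / len(members) for v in zip(*members)]
    # A key pose is an actual training sample, not the centroid itself:
    # pick the sample nearest each final centroid.
    return [min(poses, key=lambda p: math.dist(p, centroids[j]))
            for j in range(k)]
```

Returning real training samples rather than centroids keeps each key pose physically plausible, since an averaged skeleton need not correspond to any achievable posture.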
Abstract (English): This thesis establishes a view-invariant motion matching model, combining a Rule-Based approach with an Example-Based approach. Kinect with OpenNI is used to generate the three-dimensional Cartesian coordinates (x, y, z) of the human skeleton joints, which are then normalized by geometric translation, yielding an object coordinate system whose origin is the center of the torso.
This thesis uses baseball as an example. To create rules for overhand pitching, the relationships between the joint positions of the human skeleton are used to estimate motion postures. Before motion matching, these rules act as a filter that screens out postures that cannot belong to an overhand pitch. The live motion is then matched against the 3D coordinates of the 15 joints of the best key-pose samples, which are computed and selected from the training samples by the K-Means clustering algorithm. The 3D distances between the 15 joints of the live skeleton and those of the best samples are summed; the smaller the distance, the higher the similarity.
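The distance-based matching step can be sketched as follows. This is an illustrative sketch, not the thesis code: skeletons are assumed to be lists of 15 (x, y, z) tuples, and the function names are hypothetical.

```python
import math

def motion_difference(live, key_pose):
    """Sum of per-joint Euclidean distances between the live skeleton
    and one key-pose template (both lists of (x, y, z) tuples)."""
    return sum(math.dist(a, b) for a, b in zip(live, key_pose))

def best_match(live, key_poses):
    """Return (index, difference) of the most similar key pose:
    the smaller the summed distance, the higher the similarity."""
    diffs = [motion_difference(live, kp) for kp in key_poses]
    i = min(range(len(diffs)), key=diffs.__getitem__)
    return i, diffs[i]
```

Because both skeletons are already in the torso-centered coordinate system, the summed joint distance compares posture shape rather than position in the room.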
This view-invariant motion matching model achieves good results, with an average recognition similarity of 86.10% and a processing speed of 25 FPS, and can be extended to recognize other athletic motions.
Table of contents: Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Tables vi
List of Figures vii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Objectives 2
1.3 Scope and Limitations 4
1.4 Thesis Organization 5
Chapter 2 Related Work and Techniques 6
2.1 Human Motion Recognition Literature 6
2.2 Human Skeleton Literature 11
2.3 Natural User Interface 12
2.4 The Kinect Motion-Sensing Camera 14
2.5 Kinect Skeletonization and Skeleton Detection and Tracking 16
2.5.1 Human Detection 17
2.5.2 Skeleton Tracking Algorithm 19
2.5.3 Skeleton Tracking System 22
Chapter 3 System Architecture 27
3.1 Overview of the Human Motion Recognition System 27
3.2 System Architecture and Workflow 27
Chapter 4 Human Motion Features and the Motion Rule Filter 32
4.1 Human Motion Features 32
4.1.1 3D Coordinates of Skeleton Joints 33
4.1.2 Normalization of 3D Coordinates 35
4.2 Motion Rule Filter 37
4.2.1 Analysis of the Overhand Pitching Motion (Right-Handed Example) 37
4.2.2 Building the Overhand Pitching Rules 40
Chapter 5 Motion Sample Training and Motion Recognition 45
5.1 Motion Sample Training 45
5.1.1 Motion Energy 45
5.1.2 Key Pose Sequences 46
5.1.3 K-Means Clustering Algorithm 47
5.1.4 The Best Five Key-Pose Samples 49
5.2 Motion Recognition 52
Chapter 6 Experimental Results 54
6.1 Experimental Method 54
6.2 Results and Discussion 57
6.2.1 Correct Recognition Cases 57
6.2.2 Failed Recognition Cases 60
Chapter 7 Conclusion and Future Work 62
7.1 Conclusion 62
7.2 Future Work 62
References 63
References:
[1] Microsoft Kinect, http://www.xbox.com/zh-TW/kinect
[2] John Underkoffler, "John Underkoffler points to the future of UI", Feb. 2010
http://www.ted.com/talks/lang/eng/john_underkoffler_drive_3d_data_with_a_gesture.html
[3] PrimeSense, http://www.primesense.com/
[4] Low-Cost Depth Cameras, http://www.hizook.com/blog/2010/03/28/low-cost-depth-cameras-aka-ranging-cameras-or-rgb-d-cameras-emerge-2010
[5] 洪春暉,http://www.ctimes.com.tw/News/ShowNews.asp?O=HJV4MAXZJA2SA-0ME1
[6] Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake, Microsoft Research Cambridge & Xbox Incubation, "Real-Time Human Pose Recognition in Parts from Single Depth Images", IEEE CVPR, June 2011
[7] Kinect In Depth, http://www.tomshardware.com/reviews/game-developers-conference-gdc-2011-world-of-warcraft,2882-2.html
[8] PrimeSense, "Prime Sensor NITE 1.3 Algorithms notes", http://www.primesense.com/
[9] Xiaofei Ji, Honghai Liu, "Advances in View-Invariant Human Motion Analysis: A Review", IEEE Transactions On Systems, Man, And Cybernetics—Part C: Applications And Reviews, Vol. 40, No. 1, January 2010
[10] Daniel Weinland, Remi Ronfard, Edmond Boyer, "A Survey of Vision-Based Methods for Action Representation, Segmentation and Recognition", INRIA, Feb. 2010
[11] Ronald Poppe, "Vision-based human motion analysis: An overview", Computer Vision and Image Understanding 108, 2007, pp.4–18
[12] Fengjun Lv, Ramakant Nevatia, Institute for Robotics and Intelligent Systems, University of Southern California, "Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching", IEEE Conference on Computer Vision and Pattern Recognition, June 2007
[13] Y.C. Wu, H.S. Chen, W.J. Tsai, S.Y. Lee, J.Y. Yu, "Human Action Recognition Based on Layered-HMM", in Proc. IEEE International Conference on Multimedia and Expo (ICME), 2008
[14] Aaron F. Bobick, James W. Davis, "The Recognition of Human Movement Using Temporal Templates", IEEE Transactions On Pattern Analysis and Machine Intelligence, Vol. 23, No. 3, March 2001
[15] Daniel Weinland, Remi Ronfard, Edmond Boyer, "Free Viewpoint Action Recognition using Motion History Volumes", Computer Vision and Image Understanding, Volume 104, Issues 2-3, November 2006, pp.249-257
[16] M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri, "Actions as space-time shapes", in Proc. IEEE International Conference on Computer Vision, 2005, vol. 2, pp. 1395-1402.
[17] F. Lv and R. Nevatia, "Recognition and segmentation of 3D human action using HMM and multi-class AdaBoost", in Proc. Eur. Conf. Comput. Vis., 2006, vol. 4, pp. 359–372.
[18] Hironobu Fujiyoshi, Alan J. Lipton, "Real-time Human Motion Analysis by Image Skeletonization", in Proc. IEEE Workshop on Application Computer Vision, pp.15-21, 1998.
[19] Hironobu Fujiyoshi, Alan J. Lipton, "Real-time Human Motion Analysis by Image Skeletonization", Workshop on Applications of Computer Vision – WACV, 2004.
[20] Hsuan-Sheng Chen, Hua-Tsung Chen, Yi-Wen Chen and Suh-Yin Lee, "Human Action Recognition Using Star Skeleton", VSSN'06, October 27, 2006
[21] Duan-Yu Chena, Hong-Yuan Mark Liao, Sheng-Wen Shih, "Continuous Human Action Segmentation and Recognition Using A Spatio-Temporal Probabilistic Framework", IEEE International Symposium on Multimedia, 2006
[22] User interface, http://en.wikipedia.org/wiki/User_Interface
[23] Natural user interface, http://en.wikipedia.org/wiki/Natural_User_Interface
[24] SixthSense, http://www.pranavmistry.com/projects/sixthsense/
[25] Time-of-flight, http://en.wikipedia.org/wiki/Time-of-flight
[26] 身體就是控制器,微軟Kinect是怎麼做到的? (Your body is the controller: how does Microsoft Kinect do it?), http://www.techbang.com.tw/posts/2936-get-to-know-how-it-works-kinect
[27] How Motion Detection Works in Xbox Kinect, http://gizmodo.com/5681078/how-motion-detection-works-in-xbox-kinect
[28] OpenNI, "OpenNI UserGuide", http://OpenNI.org
[29] Ron Forbes, Arjun Dayal, "How You Become the Controller", http://www.xbox.com/en-US/Live/EngineeringBlog/122910-HowYouBecometheController
[30] 台灣棒球維基館 (Taiwan Baseball Wiki), http://twbsball.dils.tku.edu.tw/wiki/index.php/%E9%A6%96%E9%A0%81
[31] 江明宏, 棒球基本練習法 (Basic Baseball Practice Methods), Tainan: 大坤書局, 2005, pp. 14-51
[32] Euclidean distance, http://en.wikipedia.org/wiki/Euclidean_distance
[33] K-Means Clustering, http://en.wikipedia.org/wiki/K-means_clustering
[34] K-Means, http://neural.cs.nthu.edu.tw/jang/books/dcpr/kMeans.asp?title=3-3+K-means+%E5%88%86%E7%BE%A4%E6%B3%95
[35] OpenNI GoogleGroups, http://groups.google.com/group/openni-dev
[36] Heresy, "OpenNI / Kinect", http://kheresy.wordpress.com/index_of_openni_and_kinect/
Full-text access: authorized for public release from 2016-08-08