Thesis title (Chinese): MapReduce基礎之多重支持度關聯分析模型-以醫療檢驗資料為例
Thesis title (English): Association Analysis Model with Multiple Support Based on MapReduce - A Case of Radioimmunoassay Data
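Institution: National Taipei University of Technology (臺北科技大學)
College: College of Management
Department: Master's Program, Department of Information and Finance Management
Academic year of graduation: 104 (ROC calendar)
Semester of graduation: Second semester
Year of publication: 105 (ROC calendar)
Name (Chinese): 何信俞
Name (English): Xin-Yu He
Student ID: 103AB8021
Degree: Master's
Language: Chinese
Oral defense date: 2016/06/06
Advisor (Chinese): 王貞淑
Advisor (English): Chen-Shu Wang
Committee members (Chinese): 丁一賢;蕭文龍
Committee members (English): I-Hsine Ting; Wen-Lung Shiau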
Keywords (Chinese): 多重支持度關聯分析; 放射免疫分析; MapReduce; 資料比對
Keywords (English): Multiple Support Association Analysis; Radioimmunoassay; MapReduce; Data Matching
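Abstract (translated from the Chinese): With the rapid development of networks and technology, almost all domestic medical institutions have adopted powerful, highly integrated hospital information systems (HIS). Because an HIS comprises many types of subsystems that frequently need to pass data to one another, data content can become inconsistent across subsystems. This study applies appropriate processing and analysis methods to improve data quality and thereby strengthen the integrity of the hospital information system.
In the analysis model framework proposed in this study, the data preprocessing stage uses automated matching, which reduces the errors, time, and labor that manual matching tends to incur and improves data accuracy. Experimental results show that the automated matching mechanism can effectively repair some of the data errors. The association analysis component of the proposed model also incorporates the concept of multiple supports. To keep computation fast as the data volume grows, the model is built on the MapReduce framework for parallel computation, so that it can process large volumes of data and discover associations among RIA (radioimmunoassay) test items.
Finally, the experimental results show that the proposed framework effectively shortens computation time and finds fifteen meaningful association rules. Cross-validation with physicians from the laboratory department confirmed that the association patterns discovered by the model can support future recommendations on related medical practice, thereby improving the quality of care and patient safety during visits.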
Abstract (English): With advances in networking and technology, almost all domestic medical institutions have introduced feature-rich, highly integrated hospital information systems (HIS). Because a hospital information system covers numerous types of subsystems, and these subsystems often need to pass data to one another, the data held by different subsystems may become inconsistent. This research applies appropriate processing and analysis methods to improve data quality and further enhance the integrity of hospital information systems.
In the proposed model architecture, the data preprocessing stage uses an automated matching method. This reduces the errors, time, and labor that manual matching often entails, and improves data accuracy. The experimental results show that the automated matching mechanism can effectively repair some of the data errors. The association analysis model within the proposed analysis model also incorporates the concept of multiple supports. Considering how computation speed is affected as the amount of data increases, this research builds the analysis model on the MapReduce framework for parallel computing, so that it can handle large amounts of data and discover associations among RIA (radioimmunoassay) items.
Finally, the experimental results show that the proposed model architecture can shorten the computation time and find fifteen meaningful association rules. Cross-validation with laboratory physicians confirmed that the association patterns found by the model can help provide recommendations on related medical practice, improving the quality of care and patient safety during visits.
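As an illustration of the two ideas the abstract combines, the following is a minimal single-machine sketch in Python, not the thesis's actual implementation. It assumes the multiple-minimum-support convention of Liu, Hsu, and Ma (reference [38]), in which each item has its own minimum item support (MIS) and an itemset must reach the lowest MIS among its items. The map and reduce steps are plain functions rather than a real MapReduce cluster job, and the item names (`TSH`, `FT4`, `AFP`), the transactions, and the MIS values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, max_size=2):
    """Map step: emit (itemset, 1) for every itemset of up to max_size items."""
    items = sorted(set(transaction))
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each itemset key."""
    counts = defaultdict(int)
    for itemset, n in pairs:
        counts[itemset] += n
    return dict(counts)


def frequent_itemsets(transactions, mis, max_size=2):
    """Keep itemsets whose support reaches the lowest MIS among their items."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    total = len(transactions)
    result = {}
    for itemset, count in counts.items():
        support = count / total
        threshold = min(mis[item] for item in itemset)
        if support >= threshold:
            result[itemset] = support
    return result


# Hypothetical test orders; each inner list is one patient's set of test items.
transactions = [
    ["TSH", "FT4"],
    ["TSH", "FT4"],
    ["TSH", "AFP"],
    ["TSH"],
    ["FT4", "AFP"],
]
# Hypothetical per-item minimum supports: the rarely ordered item gets a lower bar.
mis = {"TSH": 0.6, "FT4": 0.5, "AFP": 0.15}

frequent = frequent_itemsets(transactions, mis)
for itemset, support in sorted(frequent.items()):
    print(itemset, round(support, 2))
```

With a single uniform threshold of 0.5, every itemset containing the rarely ordered `AFP` would be discarded. Per-item thresholds let such itemsets survive while still filtering pairs of common items that co-occur too seldom.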
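Table of contents: Abstract (Chinese) i
ABSTRACT iii
Acknowledgments v
Contents vi
List of tables viii
List of figures ix
Chapter 1 Introduction 1
1.1 Research background and motivation 1
1.2 Research objectives 3
1.3 Research structure 4
1.4 Research scope and limitations 6
Chapter 2 Literature review 7
2.1 Laboratory information system (LIS) 7
2.2 Hospital information system (HIS) 9
2.3 Interface relationships among LIS, HIS, and DRS, and data quality issues 11
2.4 Data matching methods 13
2.5 Data mining 14
2.6 Association rules with multiple supports 17
2.7 Principles of MapReduce 19
Chapter 3 Research method 21
3.1 Data collection 22
3.2 Research model 23
3.3 Multiple-support association analysis based on MapReduce 26
Chapter 4 Data analysis and results 28
4.1 Descriptive statistics of the data 28
4.2 Data preprocessing 30
4.3 Multiple-support analysis of the data 37
4.3.1 Experimental environment 37
4.3.2 Experimental data 38
4.3.3 Experimental procedure and results 39
Chapter 5 Conclusions and future research directions 44
5.1 Conclusions 44
5.2 Future research directions 45
References 46
Appendix 51
Introduction to the HIS 52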
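Thesis references: [1] 吕雪骥、李龙澍, "Research on a MapReduce implementation of the FP-Growth algorithm," 計算機技術與發展, vol. 22, no. 11, 2002, pp. 123-126.
[2] 周賢昭, 1998, A study on developing clinical pathways with data mining techniques, master's thesis, Graduate Institute of Information Management, National Sun Yat-sen University, Kaohsiung.
[3] 邱彗株, 2011, Applying an automated data matching method to the analysis and inspection of radioimmunoassay data at a medical center, master's thesis, E-Learning Master's Program in Global Chinese Business Management, Tamkang University, Taipei.
[4] 姜秀卿、陳文欽、劉立、洪仲箴, A safety-oriented long-term care information system, 長期照護雜誌, 10(1), 2006, pp. 1-9.
[5] 高智雄, A discussion of the technical aspects of laboratory information systems, 中華民國醫檢會報, 20(1), 2005, pp. 74-92.
[6] 梁水金, 2002, Building a Web-based data mining system for querying drug interaction information, master's thesis, Graduate Institute of Information Engineering, Feng Chia University, Taichung.
[7] 莊莉瑩, 2000, Application of data mining mechanisms to clinical pathways, master's thesis, Graduate Institute of Industrial Engineering, Tunghai University, Taichung.
[8] 陳世源, 1998, A study of data mining techniques for the association between medical cases and drugs, master's thesis, Graduate Institute of Information Management, National Sun Yat-sen University, Kaohsiung.
[9] 陳玉豐, 2003, Data mining in evidence-based medicine: the cases of appendectomy, hernia, diabetes, and gastric bleeding, master's thesis, Graduate Institute of Health Care Administration, China Medical College, Taichung.
[10] 陳迪祥, 2003, Discovering hidden relationships among diseases with data mining techniques, master's thesis, Graduate Institute of Information Management, National Chi Nan University, Nantou.
[11] 曾元顯, The effect of classification inconsistency on automatic document classification, 大學圖書館, 9(1), 2005, pp. 2-19.
[12] 黃仁鵬、蔡季嵐, GSA: a stage-search algorithm for mining association rules, 電子商務學報, vol. 9, no. 4, 2007, pp. 823-845.
[13] 楊振銘、何維翰, "Weighted multiple-support association rule mining in large databases," 14th International Conference on Information Management, 2003.
[14] 蔡懇鐸、陳金德、陳中明、王正一, 1994, Applications of computers in clinical laboratory testing, 內科學誌, 5(2), June 1994, pp. 145-160.
[15] 藍中賢, 2000, A data mining technique combining fuzzy set theory and Bayesian classification, applied to medical expense review at the Bureau of National Health Insurance, master's thesis, Graduate Institute of Information Management, Yuan Ze University, Taoyuan.
[16] Department of Health, Executive Yuan, Hospital Information System Specification 2.0, Taipei: Executive Yuan, 2009, pp. 1-335.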
[17] Ammenwerth, E., Gräber, S., Herrmann, G., Bürkle, T., & König, J. (2003). Evaluation of health information systems-problems and challenges. International journal of medical informatics, 71(2), 125-135.
[18] Ballou, D. P., & Pazer, H. L. (1985). Modeling data and process quality in multi-input, multi-output information systems. Management science, 31(2), 150-162.
[19] Berry, M. J., & Linoff, G. (1997). Data mining techniques: for marketing, sales, and customer support. John Wiley & Sons, Inc.
[20] Brauer, B., “Data Quality: Spinning Straw Into Gold,” Available [Online] at: http://www2.sas.com/proceedings/sugi26/p117-26.pdf, 2000.
[21] Chengalur-Smith, I. N., Ballou, D. P., & Pazer, H. L. (1999). The impact of data quality information on decision making: an exploratory analysis. Knowledge and Data Engineering, IEEE Transactions on, 11(6), 853-864.
[22] Collen, M. F. (1991). A brief historical overview of hospital information system (HIS) evolution in the United States. International journal of bio-medical computing, 29(3), 169-189.
[23] Curt, H. (1995). The devil’s in the detail: techniques, tools, and applications for database mining and knowledge discovery-part 1. Intelligent Software Strategies, 6(9), 1-15.
[24] Cykana, P., Paul, A., & Stern, M. (1996). DoD Guidelines on Data Quality Management. In IQ (pp. 154-171).
[25] Da Silva, R., Stasiu, R., Orengo, V. M., & Heuser, C. A. (2007). Measuring quality of similarity functions in approximate data matching. Journal of Informetrics, 1(1), 35-46.
[26] Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
[27] DeLone, W. H., & McLean, E. R. (1992). Information systems success: The quest for the dependent variable. Information systems research, 3(1), 60-95.
[28] Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI magazine, 17(3), 37.
[29] Frawley, W. J., Piatetsky-Shapiro, G., & Matheus, C. J. (1992). Knowledge discovery in databases: An overview. AI magazine, 13(3), 57.
[30] Gardyn, E. (1997, October). A Data Quality Handbook for a Data Warehouse. In IQ (pp. 267-290).
[31] Goodhue, D. L. (1995). Understanding user evaluations of information systems. Management science, 41(12), 1827-1844.
[32] Grupe, F. H., & Mehdi Owrang, M. (1995). Data base mining discovering new knowledge and competitive advantage. Information System Management, 12(4), 26-31.
[33] Jarke, M., & Vassiliou, Y. (1997, May). Data Warehouse Quality: A Review of the DWQ Project. In IQ (pp. 299-313).
[34] Kleissner, C. (1998, January). Data mining for the enterprise. In System Sciences, 1998., Proceedings of the Thirty-First Hawaii International Conference on (Vol. 7, pp. 295-304). IEEE.
[35] Kovac, R., Lee, Y. W., & Pipino, L. (1997, October). Total Data Quality Management: The Case of IRI. In IQ (pp. 63-79).
[36] Lee, Y. W., Strong, D. M., Kahn, B. K., & Wang, R. Y. (2002). AIMQ: a methodology for information quality assessment. Information & management, 40(2), 133-146.
[37] Li, H., Wang, Y., Zhang, D., Zhang, M., & Chang, E. Y. (2008, October). Pfp: parallel fp-growth for query recommendation. In Proceedings of the 2008 ACM conference on Recommender systems (pp. 107-114). ACM.
[38] Liu, B., Hsu, W., & Ma, Y. (1999, August). Mining association rules with multiple minimum supports. In Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 337-341). ACM.
[39] Liu, B., Hsu, W., Chen, S., & Ma, Y. (2000). Analyzing the subjective interestingness of association rules. Intelligent Systems and their Applications, IEEE, 15(5), 47-55.
[40] Mandke, V. V., & Nayar, M. K. (1997, October). Information Integrity: A Structure for its Definition. In IQ (pp. 314-338).
[41] Matsumura, A., & Shouraboura, N. (1996). Competing with Quality Information. In IQ (pp. 72-86).
[42] McDowall, R. D. (1995). A matrix for a LIMS with a strategic focus. Laboratory Automation & Information Management, 31(1), 57-64.
[43] Meyen, D., & Willshire, M. J. (1997). A Data Quality Engineering Framework. In IQ (pp. 95-116).
[44] Nicholson, S. (2006). The basis for bibliomining: Frameworks for bringing together usage-based data mining and bibliometrics through data warehousing in digital library services. Information processing & management, 42(3), 785-804.
[45] Redman, T. C. (1992). Data quality: management and technology. Bantam Books, Inc.
[46] Reichertz, P. L. (2006). Hospital information systems-Past, present, future. International Journal of Medical Informatics, 75(3), 282-299.
[47] Smith Jr, J. W., & Svirbely, J. R. (1990, June). Laboratory information systems. In Medical informatics: computer applications in health care (pp. 273-279). Addison-Wesley Longman Publishing Co., Inc.
[48] Wand, Y., & Wang, R. Y. (1996). Anchoring data quality dimensions in ontological foundations. Communications of the ACM, 39(11), 86-95.
[49] Zmud, R. W. (1978). An empirical investigation of the dimensionality of the concept of information. Decision sciences, 9(2), 187-195.
[50] House Call, MD - Health Carefully Explained, http://www.myhousecallmd.com/the-thyroid-demystified-time-to-have-it-checked/
Full-text access: authorization not granted