現在位置首頁 > 博碩士論文 > 詳目
論文中文名稱:基於MapReduce於分散式基因演算法之應用與研究 [以論文名稱查詢館藏系統]
論文英文名稱:The Study of distributed Genetic Algorithm based on MapReduce [以論文名稱查詢館藏系統]
院校名稱:臺北科技大學
學院名稱:管理學院
系所名稱:資訊與財金管理系碩士班
畢業學年度:104
畢業學期:第二學期
中文姓名:楊世強
英文姓名:Shih-Ciang Yang
研究生學號:103AB8018
學位類別:碩士
指導教授中文名:王貞淑
指導教授英文名:Chen-Shu Wang
口試委員中文名:丁一賢;蕭文龍
中文關鍵詞:MapReduce基因演算法旅行銷售員問題背包問題
英文關鍵詞:MapReduceGenetic AlgorithmTravelling Salesman ProblemKnapsack Problem
論文中文摘要:近年來網際網路的快速發展,帶動了社會的變遷,行動上網、巨量資料、雲端運算、Hadoop…等技術的推行,資料量變得非常的龐大,除了改變資料的儲存方式之外,也為傳統的運算能力帶來很大的影響。在過去資料量不大的情況,單機環境下的運算能力還算足夠,但是資料量變大之後,所需要的時間及資源也會隨之變多,造成單一台機器不堪負荷。以基因演算法為例,它是一種迭代型之演算法,傳統實作於單機的環境之下,基因需要複製、交配、突變,以及演化至下一世代,因為不停地重複運算,雖然比較容易找到全域最佳解,但過程中會耗費非常巨大的資源以及時間,這是基因演算法的缺點。
本研究提出一個以MapReduce架構為基礎的分散式基因演算法,希望能透過MapReduce的分散式運算能力來改善傳統基因算法的效率,並維持它的求解正確率,透過本研究所提出的架構來解決兩個重要的組合優化之問題,即旅行銷售員問題與背包問題,前者唯一NP-Hard問題、後者為NP-Complete問題,經由實驗來證明本研究所提出的分散式架構用於基因演算法的可行性。
透過實驗來解決NP問題,驗證分散式架構的執行效率,並且與三種不同的執行方法來做比較,分別是單機的基因演算法、單機多執行緒的Parallel GA以及透過Virtual Machine 執行基因演算法等三種方式,計算執行效率的方法為本研究之架構所花費時間除以其他方法所使用的時間。
根據本研究的兩個實驗結果發現,本研究所提出的分散式基因演算法架構用於解決問題與單機基因演算法相比,能提升執行效率40%以上的時間,與Parallel GA相比能夠減少20%以上的時間,與VM機器上相同MapReduce架構相比,也能快上12-18%的時間,並且在正確度方面也都能找到全域最佳解,從這些結果中可以發現本研究提出的架構能用來解決傳統基因演算法無法解決更大型的旅行銷售員問題與背包問題也能夠在更短的時間內找到更好的答案。
論文英文摘要:In recent years, the rapid development of Internet, led the social changes, the implementation of technology about mobile Internet, big data, cloud computing, Hadoop. The amount of data becomes very large, except for changing the way data is stored, also have a big impact for the traditional computing. In the past, the amount of data computing is not the case of a single environment fairly enough, but after a large amount data, time and resources required will also increases, caused a single machine overloaded. Genetic algorithm, for example, which is a type of iterative algorithms for implementation under the traditional single environment,Gene needs to be selection, crossover, mutation, and the evolution to the next generation, because repeated the operation, although easier to find best solution, but the process will take very large resources and time, which is a shortcoming of genetic algorithm .
This study proposes distributed Genetic Algorithm based on MapReduce, hoping to improve distributed computing efficiency of traditional genetic algorithm through the MapReduce and maintain the correct rate. In this study, we propose a MapReduce-based Genetic Algorithm (MRGA) to solve two important combinatorial optimization problems, namely travelling salesman problem (TSP) and knapsack problem (KP), the former is a NP-Hard problem , the latter is NP-Complete problem, through experiments to prove our proposed distributed architecture for the feasibility of genetic algorithm..
Through experiments to solve NP problems, verify the efficiency of distributed architecture. For both problems, three other algorithms are also coded and applied to solve their instances: Genetic algorithm for single machine, Multi-threaded Parallel GA on a single machine (PGA) and MRGA on virtual machine (MPGA-VM). The method of calculating the execution time is the method takes time divided by the time other methods used .
According to the results of two experiments in this study found that our proposed architecture for distributed genetic algorithm to solve the problem compared with the single genetic algorithm, can improve the efficiency of more than 40% of the time, can be reduced as compared with the Parallel GA 20 % or more of the time, compared with the MapReduce architecture on the VM machine can be faster on 12-18% of the time, and the accuracy can also find the best answer. From these results can be found in architecture proposed in this study can be used to solve the traditional genetic algorithm can’t solve larger the traveling salesman problem and knapsack problem, it is possible to find a better answer in a shorter period of time.
論文目次:摘 要 i
英文摘要 iii
誌 謝 v
目 錄 vi
表目錄 ix
圖目錄 x
第一章 緒論 1
1.1 研究背景與動機 1
1.2 研究目的 2
1.3 論文架構 4
第二章 文獻探討 5
2.1 巨量資料 5
2.2 Non-Deterministic Polynomial problems 7
2.3 MapReduce 8
2.3.1 MapReduce架構 8
2.3.2 MapReduce運算流程 9
2.4 基因演算法 11
2.4.1 問題編碼及解碼 12
2.4.2 計算適應函數 12
2.4.3 選擇/複製 13
2.4.4 交配 14
2.4.5 突變 16
2.5 平行基因演算法(Parallel Genetic Algorithm) 18
2.5.1 主從式平行基因演算法 18
2.5.2 島嶼式平行基因演算法 19
2.6 MapReduce應用於基因演算法之研究 21
第三章 研究方法 26
3.1 研究架構 26
3.1.1 主程式設計方法 27
3.1.2 Mapper方法設計 29
3.1.3 Reducer方法設計 30
3.2 基於MapReduce的基因演算法 31
第四章 實驗設計 33
4.1 背包問題 33
4.1.1 問題描述與數學模式 33
4.1.2 基因演算法之設計 34
4.2 旅行銷售員問題 37
4.2.1 問題描述 37
4.2.2 數學模式 37
4.2.3 基因演算法之設計 38
第五章 研究結果 43
5.1 實驗環境 43
5.2 實驗資料 44
5.3 基因演算法參數設定 44
5.4 實驗結果 45
5.4.1 實驗一 背包問題之執行效率探討 45
5.4.2 實驗二 旅行銷售員問題之執行效率探討 49
5.4.3 實驗整理總表 53
第六章 結論與未來展望 55
6.1 結論 55
6.2 未來展望 57
參考文獻 58
論文參考文獻:[ 1 ] 王子夏. (2008). 基因演算法與粒子群演算法之比較-以派課問題為例. 崑山科技大學資訊管理研究所學位論文, 1-60.
[ 2 ] 王裕廷. (2010). 基因演算法應用於具時窗限制之多天旅遊行程規劃. 長榮大學資訊管理研究所學位論文, 1-61.
[ 3 ] 邱元泰 & 紀美秀. (2002). 遺傳演算法在排課問題之應用. 國立中正大學數學研究所碩士論文, 民 91 年.
[ 4 ] 高秀婷. (2013).基於 Hadoop雲端運算架構之平行基因演算最佳化學習障礙診斷輔助系統. 國立彰化師範大學數位內容科技與管理研究所學位論文.1-95.
[ 5 ] 張育彰. (2003). 應用基因演算法於台鐵列車駕駛員排班與輪班整合問題之研究. 成功大學交通管理科學系學位論文, 1-131.
[ 6 ] 莊凱翔. (2001). 求解護理人員排班最佳化之研究─ 以遺傳演算法求解. 國立成功大學工業管理學系碩士論文, 民國 90 年.
[ 7 ] 黃朝魁. (1996). 分散式遺傳演算法於最佳化設計之研究. 國立中山大學機械工程研究所碩士論文, 民國 85 年.
[ 8 ] 楊浩. (2014).基於MapReduce的基因演算法於旅遊行程規劃之研究.國立臺北科技大學資訊管理研究所學位論文.1-56。
[ 9 ] 廖丞宇. (2013).植基於Hadoop雲端運算架構之平行基因演算法與粒子群演算法的應用.崑山科技大學資訊管理研究所學位論文.1-78.
[ 10 ] 蔡碧展. (2010).基於Hadoop平台的雲端基因架構. 國立高雄第一科技大學資訊管理研究所學位論文.1-68.
[ 11 ] 謝欣宏. (2002). 台鐵司機員排班與輪班問題之研究-以基因演算法求解, 國立成功大學交通管理科學研究所碩士論文. In International Conference on Logic Programming (Lisbon, Portugal June), MIT Press, Cambridge, MA (pp. 165-180).
[ 12 ] Abramson, D., & Abela, J. (1991). A parallel genetic algorithm for solving the school timetabling problem. Division of Information Technology, CSIRO.
[ 13 ] Alba, E., & Troya, J. M. (2000). Influence of the migration policy in parallel distributed GAs with structured and panmictic populations. Applied Intelligence, 12(3), 163-181.
[ 14 ] Beasley, D., Martin, R. R., & Bull, D. R. (1993). An overview of genetic algorithms: Part 1. Fundamentals. University computing, 15, 58-58.
[ 15 ] Belding, T. C. (1995). The distributed genetic algorithm revisited. arXiv preprint adap-org/9504007.
[ 16 ] Bevilacqua, A., Campanini, R., & Lanconelli, N. (2001). A distributed genetic algorithm for parameters optimization to detect microcalcifications in digital mammograms. In Applications of Evolutionary Computing (pp. 278-287). Springer Berlin Heidelberg.
[ 17 ] Bianchini, R., & Brown, C. M. (1993). Parallel genetic algorithms on distributed-memory architectures. Transputer Research and Applications, 67-67.
[ 18 ] Cantu-Paz, E., & Goldberg, D. E. (2000). Efficient parallel genetic algorithms: theory and practice. Computer methods in applied mechanics and engineering, 186(2), 221-238.
[ 19 ] Carter, A. E., & Ragsdale, C. T. (2006). A new approach to solving the multiple traveling salesperson problem using genetic algorithms. European journal of operational research, 175(1), 246-257.
[ 20 ] Cohoon, J. P., Hegde, S. U., Martin, W. N., & Richards, D. (1987). Punctuated equilibria: a parallel genetic algorithm. In Genetic algorithms and their applications: proceedings of the second International Conference on Genetic Algorithms: July 28-31, 1987 at the Massachusetts Institute of Technology, Cambridge, MA. Hillsdale, NJ: L. Erlhaum Associates, 1987.
[ 21 ] Colorni, A., Dorigo, M., & Maniezzo, V. (1991). Genetic algorithms and highly constrained problems: The time-table case. In Parallel problem solving from nature (pp. 55-59). Springer Berlin Heidelberg.
[ 22 ] Cook, S. A. (1971, May). The complexity of theorem-proving procedures. In Proceedings of the third annual ACM symposium on Theory of computing (pp. 151-158). ACM.
[ 23 ] Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
[ 24 ] Ferrucci, F., Salza, P., Kechadi, M. T., & Sarro, F. (2015). A Parallel Genetic Algorithms Framework based on Hadoop MapReduce.
[ 25 ] Garey, M. R., & Johnson, D. S. (2002). Computers and intractability (Vol. 29). New York: wh freeman.
[ 26 ] Geronimo, L. D., Ferrucci, F., Murolo, A., & Sarro, F. (2012, April). A parallel genetic algorithm based on hadoop mapreduce for the automatic generation of junit test suites. In Software Testing, Verification and Validation (ICST), 2012 IEEE Fifth International Conference on (pp. 785-793). IEEE.
[ 27 ] Golberg, D. E. (1989). Genetic algorithms in search, optimization, and machine learning. Addion wesley, 1989.
[ 28 ] Hans, N., Mahajan, S., & Omkar, S. N. Big Data Clustering Using Genetic Algorithm On Hadoop Mapreduce.
[ 29 ] John Henry Holland. (1992). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press.
[ 30 ] Hu, X. B., & Paolo, E. D. (2007, September). An efficient genetic algorithm with uniform crossover for the multi-objective airport gate assignment problem. In Evolutionary Computation, 2007. CEC 2007. IEEE Congress on (pp. 55-62). IEEE.
[ 31 ] Jin, C., Vecchiola, C., & Buyya, R. (2008, December). Mrpga: an extension of mapreduce for parallelizing genetic algorithms. In eScience, 2008. eScience'08. IEEE Fourth International Conference on (pp. 214-221). IEEE.
[ 32 ] Karp, R. M. (1972). Reducibility among combinatorial problems (pp. 85-103). springer US.
[ 33 ] Kaur, D., & Murugappan, M. M. (2008, May). Performance enhancement in solving traveling salesman problem using hybrid genetic algorithm. In Fuzzy Information Processing Society, 2008. NAFIPS 2008. Annual Meeting of the North American (pp. 1-6). IEEE.
[ 34 ] Kondekar, R., Gupta, A., Saluja, G., Maru, R., Rokde, A., & Deshpande, P. (2012, June). A MapReduce based hybrid genetic algorithm using island approach for solving time dependent vehicle routing problem. In Computer & Information Science (ICCIS), 2012 International Conference on (Vol. 1, pp. 263-269). IEEE.
[ 35 ] Keco, D., & Subasi, A. (2012). Parallelization of genetic algorithms using Hadoop Map/Reduce.
[ 36 ] Liu, Q. I. T. A. O., Bridges, S. M., & Banicescu, I. O. A. N. A. (2001, November). Parallel genetic algorithms for tuning a fuzzy data mining system. In Proceedings of the Artificial Neural Networks in Engineering Conference (ANNIE 2001) (Vol. 5).
[ 37 ] Ono, I., Yamamura, M., & Kobayashi, S. (1996, May). A genetic algorithm for job-shop scheduling problems using job-based order crossover. In Evolutionary Computation, 1996., Proceedings of IEEE International Conference on (pp. 547-552). IEEE.
[ 38 ] Pereira, C. M., & Lapa, C. M. (2003). Parallel island genetic algorithm applied to a nuclear power plant auxiliary feedwater system surveillance tests policy optimization. Annals of Nuclear Energy, 30(16), 1665-1675.
[ 39 ] Sena, G. A., Megherbi, D., & Isern, G. (2001). Implementation of a parallel genetic algorithm on a cluster of workstations: traveling salesman problem, a case study. Future Generation Computer Systems, 17(4), 477-488.
[ 40 ] Shafer, J., Rixner, S., & Cox, A. L. (2010, March). The hadoop distributed filesystem: Balancing portability and performance. In Performance Analysis of Systems & Software (ISPASS), 2010 IEEE International Symposium on (pp. 122-133). IEEE.
[ 41 ] Singh, A., & Baghel, A. S. (2009). A new grouping genetic algorithm approach to the multiple traveling salesperson problem. Soft Computing, 13(1), 95-101.
[ 42 ] Su, S., & Zhan, D. (2006, October). New genetic algorithm for the fixed charge transportation problem. In Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on (Vol. 2, pp. 7039-7043). IEEE.
[ 43 ] Vega-Rodríguez, M., Gutierrez-Gil, R., Avila-Roman, J. M., Sanchez-Perez, J. M., & Gómez-Pulido, J. (2005, June). Genetic algorithms using parallelism and FPGAs: the TSP as case study. In Parallel Processing, 2005. ICPP 2005 Workshops. International Conference Workshops on (pp. 573-579). IEEE.
[ 44 ] Verma, A., Llora, X., Goldberg, D. E., & Campbell, R. H. (2009, November). Scaling genetic algorithms using mapreduce. In Intelligent Systems Design and Applications, 2009. ISDA'09. Ninth International Conference on (pp. 13-18). IEEE.
[ 45 ] Wyld, D. C. (2010). The Cloudy future of government IT: Cloud computing and the public sector around the world. International Journal of Web & Semantic Technology, 1(1), 1-20.
[ 46 ] Zec, A., Konjicija, S., & Nosovic, N. (2012, October). Hybrid approach in design of GA implementation for MapReduce. In Telecommunications (BIHTEL), 2012 IX International Symposium on (pp. 1-6). IEEE.
論文全文使用權限:不同意授權