Optimized Processing of Massive Small Files in a Hadoop Cluster
Abstract

With the development of the Mobile Internet and the Internet of Things, the amount of data on the Internet is growing exponentially, and traditional technical architectures have become inadequate for processing massive amounts of data. Hadoop, a framework that can process massive data efficiently, has received increasing attention from industry. Hadoop uses a master-slave architecture and consists of the HDFS file system and the MapReduce computing framework. The single-NameNode design of HDFS simplifies file-system management, but it also leads to low efficiency when processing small files. Based on a study of how industry and academia process massive numbers of small files, and on the technical details of Hadoop and its ecosystem, this thesis observes that current solutions neither take the diversity and repeatability of file types into consideration nor thoroughly solve the single-point-of-failure problem of a Hadoop cluster. This thesis therefore puts forward a plan to optimize the Hadoop cluster, using related Hadoop components to improve the performance of processing massive numbers of small files.

In this thesis, the MD5 algorithm is used to determine whether two files are duplicates by comparing their content. In this way, the number of written files, and hence disk consumption, is reduced. MapFile is used to merge small files, and files are stored differently according to their size: a small file is placed into a multi-level merge queue according to its file type, and when the queue threshold is reached, the small files in the queue are merged and written into HDFS. In this way, the number of files is reduced to a certain degree. HBase is used to persist index information. This not only ensures an effective level of data reading and writing, but also provides stable external service by using a cache to store the index and by providing consistency protection between the data in the cache and the indexer. This thesis also presents a "mark-delete-compress" method for deleting files: when a deletion request is received, the flag of the small file is modified in the cache; after the small file is deleted, the cluster compresses the large file in which the small file is located. By this method, on the one hand, the deletion r
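The deduplication and merging steps summarized above can be illustrated with a minimal, Hadoop-free sketch. This is not the thesis's actual implementation: the class and parameter names are assumptions, the flush callback stands in for the MapFile merge-and-write into HDFS, and queueing is keyed by file extension as a stand-in for the file-type classification the abstract describes.

```python
import hashlib
from collections import defaultdict

class SmallFileIngestor:
    """Illustrative sketch of the abstract's pipeline:
    (1) MD5-based deduplication of file content, and
    (2) per-type merge queues that are flushed (merged into one
        large write) once a configurable size threshold is reached.
    All names here are hypothetical, not the thesis's API."""

    def __init__(self, threshold_bytes, flush):
        self.threshold = threshold_bytes
        self.flush = flush                  # called with (file_type, [(name, data), ...])
        self.seen = set()                   # MD5 digests of content already stored
        self.queues = defaultdict(list)     # file_type -> pending small files
        self.sizes = defaultdict(int)       # file_type -> bytes currently queued

    def put(self, name, data):
        # MD5 deduplication: identical content is written only once.
        digest = hashlib.md5(data).hexdigest()
        if digest in self.seen:
            return False                    # duplicate, skip the disk write
        self.seen.add(digest)

        # Queue by file type (extension used as a simple proxy).
        ftype = name.rsplit(".", 1)[-1]
        self.queues[ftype].append((name, data))
        self.sizes[ftype] += len(data)

        # Threshold reached: merge the queued small files into one
        # large write (a MapFile in the thesis) and reset the queue.
        if self.sizes[ftype] >= self.threshold:
            self.flush(ftype, self.queues.pop(ftype))
            self.sizes.pop(ftype)
        return True
```

A short usage example: with a 10-byte threshold, two writes of identical content trigger one store, and the queue for the `txt` type is merged once its accumulated size crosses the threshold.

```python
merged = []
ing = SmallFileIngestor(10, lambda t, files: merged.append((t, [n for n, _ in files])))
ing.put("a.txt", b"hello")   # stored, queued (5 bytes)
ing.put("b.txt", b"hello")   # duplicate content, skipped
ing.put("c.txt", b"world!")  # queued; 11 bytes >= 10, queue is flushed
```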