Method for writing a plurality of small files of 2 MB or less to HDFS including a data merge module and an HBase cache module based on Hadoop
The invention discloses a Hadoop-based method for writing massive numbers of small files, suitable for an HDFS system that has a data merging module and an HBase cache module. The method includes: a step of receiving a small-file write command input by a user; a step of querying the HBase cache module by user ID and small-file name and, if first file content is found, uploading the first file content written to the small file and updating the HBase cache module with the first file content; and a step of, if the first file content is not found, querying a database of the HDFS system and, if second file content is found, uploading the second file content written to the small file and updating the database with the second file content, or otherwise calling an API of the Hadoop archive tool to access the corresponding HAR file, uploading the HAR file written to the small file, and updating the database with the HAR file. According to the invention, the writing method can improve the reading efficiency of small files.
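The three-tier lookup described above can be sketched as follows. This is only one possible reading of the abstract, not the patented implementation: plain Python dictionaries stand in for the HBase cache module, the HDFS system's database, and the HAR archive, and the names `SmallFileStore` and `write_small_file` are hypothetical. No real HBase or Hadoop API is called.

```python
from dataclasses import dataclass, field


@dataclass
class SmallFileStore:
    """In-memory stand-ins (assumptions) for the tiers named in the abstract."""
    hbase_cache: dict = field(default_factory=dict)  # (user_id, file_name) -> bytes
    database: dict = field(default_factory=dict)     # (user_id, file_name) -> bytes
    har_archive: dict = field(default_factory=dict)  # (user_id, file_name) -> bytes
    uploads: list = field(default_factory=list)      # log of (tier, key) uploads


def write_small_file(store, user_id, file_name, data):
    """Apply a write and return the name of the tier that served it."""
    key = (user_id, file_name)
    if key in store.hbase_cache:
        # Step 1: cache hit, so update the cached content.
        store.hbase_cache[key] = data
        tier = "hbase_cache"
    elif key in store.database:
        # Step 2: cache miss but database hit, so update the database.
        store.database[key] = data
        tier = "database"
    else:
        # Step 3: fall back to the archive, then record the result
        # in the database (assumed reading of "updating the database").
        store.har_archive[key] = data
        store.database[key] = data
        tier = "har"
    store.uploads.append((tier, key))  # stand-in for the upload to HDFS
    return tier
```

In this sketch a file first written through the archive path is found in the database on the next write, so repeated writes to the same file avoid the archive lookup.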