Hadoop archive is a facility which packs up small files into one compact HDFS block to avoid memory wastage of namenode. Hadoop Archives (HAR) can be used to address the namespace limitations associated with storing many small files. The extension of archive file created in Hadoop is *. har extension. A Hadoop archive directory contains metadata (in the form of _index and _masterindex) and data (part-*) files.