Using Hadoop Sequence Files, you can store the collection of images in HDFS.
Those are map files that inherently can be read by map reduce applications – there is an input format especially for sequence files – and are splitable by map reduce, so we can have one huge file that will be the input of many map tasks. By using those sequence files we are letting hadoop use its advantages. It can split the work into chunks so the processing is parallel, but the chunks are big enough that the process stays efficient.
Since the sequence file are map file the desired format will be that the key will be text and hold the HDFS filename and the value will be BytesWritable and will contain the image content of the file.
Hope this helps!