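Preface
Exchangeable image file format, usually abbreviated EXIF, was designed specifically for digital camera photos and records a photo's attribute information and shooting data. (It can also be used for audio data. Editor's note.) This article only studies the use of EXIF in JPEG images.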
http://zh.wikipedia.org/wiki/EXIF
Android currently provides an EXIF interface that can write EXIF information into multimedia files:
http://developer.android.com/reference/android/media/ExifInterface.html
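Although Android provides an EXIF interface, it has the following shortcomings:
1. The interface can only operate on JPEG files. If EXIF information has to be written after a photo is taken, two more IO operations are needed (in burst shooting this hurts post-processing efficiency).
2. The TAGs provided by ExifInterface are limited.
So we need to be able to encode and write EXIF ourselves.
This article is based on the following two links.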
http://www.exif.org/
Provides documentation, sample images, and analysis tools.
http://www.media.mit.edu/pia/Research/deepview/exif.html
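A practical walkthrough. Provides documentation, sample images, and analysis tools. Unfortunately, the author did not upload the file that was actually analyzed.
TAGs provided by ExifInterface
It is best to check the official site for updates. This summary is current as of 2013-1-31.
The TAGs provided by ExifInterface are listed here. As can be seen, they cover Orientation, FNumber, DateTime, ExposureTime, Flash, FocalLength, the GPS-related tags, ImageLength, ImageWidth, ISOSpeedRatings, Make, Model, and WhiteBalance. An interface for retrieving the thumbnail is also provided: getThumbnail().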
ORIENTATION_FLIP_HORIZONTAL
ORIENTATION_FLIP_VERTICAL
ORIENTATION_NORMAL
ORIENTATION_ROTATE_180
ORIENTATION_ROTATE_270
ORIENTATION_ROTATE_90
ORIENTATION_TRANSPOSE
ORIENTATION_TRANSVERSE
ORIENTATION_UNDEFINED
TAG_APERTURE Constant Value: "FNumber"
TAG_DATETIME Constant Value: "DateTime"
TAG_EXPOSURE_TIME Constant Value: "ExposureTime"
TAG_FLASH Constant Value: "Flash"
TAG_FOCAL_LENGTH Constant Value: "FocalLength"
TAG_GPS_ALTITUDE Constant Value: "GPSAltitude"
TAG_GPS_ALTITUDE_REF Constant Value: "GPSAltitudeRef"
TAG_GPS_DATESTAMP Constant Value: "GPSDateStamp"
TAG_GPS_LATITUDE Constant Value: "GPSLatitude"
TAG_GPS_LATITUDE_REF Constant Value: "GPSLatitudeRef"
TAG_GPS_LONGITUDE Constant Value: "GPSLongitude"
TAG_GPS_LONGITUDE_REF Constant Value: "GPSLongitudeRef"
TAG_GPS_PROCESSING_METHOD Constant Value: "GPSProcessingMethod"
TAG_GPS_TIMESTAMP Constant Value: "GPSTimeStamp"
TAG_IMAGE_LENGTH Constant Value: "ImageLength"
TAG_IMAGE_WIDTH Constant Value: "ImageWidth"
TAG_ISO Constant Value: "ISOSpeedRatings"
TAG_MAKE Constant Value: "Make"
TAG_MODEL Constant Value: "Model"
TAG_ORIENTATION Constant Value: "Orientation"
TAG_WHITE_BALANCE Constant Value: "WhiteBalance"
WHITEBALANCE_AUTO Constant Value: 0 (0x00000000)
WHITEBALANCE_MANUAL Constant Value: 1 (0x00000001)
Easily confused concepts
jfif
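One of the image storage formats, derived from the JPEG format, with the file extension ".jfif". JPEG itself only describes how to convert an image into a stream of bytes; it does not say how those bytes are to be encapsulated on any particular storage medium. An additional standard created by the Independent JPEG Group, called JFIF (JPEG File Interchange Format), specifies how to produce, from a JPEG stream, a file suitable for computer storage and transmission (for example over the Internet). When someone refers to a "JPEG file", they generally mean a JFIF file, or sometimes an Exif JPEG file. There are, however, other JPEG-based file formats, such as JNG. JPEG/JFIF is the most common format for storing and transmitting pictures on the World Wide Web. It is not well suited to line drawings and other textual or iconic graphics, because its compression method gives poor results on these kinds of images (PNG and GIF are usually used for such graphics; GIF has only 8 bits per pixel and is not well suited to color photographs, while PNG can store photographs losslessly, but the resulting files are too large to be practical for photos on web pages). The MIME media type for JFIF is image/jpeg (defined in RFC 1341).
For more details, see: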
http://baike.baidu.com/view/1326314.htm
http://www.jpeg.org/public/jfif.pdf
http://en.wikipedia.org/wiki/JPEG_File_Interchange_Format
tiff
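Tagged Image File Format (TIFF) is a file format used mainly for storing images, including photographs and artwork. It was originally developed by Aldus together with Microsoft for PostScript printing. Along with JPEG and PNG, TIFF became a popular high-color-depth image format. TIFF is widely supported in the industry: image-processing applications such as Adobe Photoshop, the GIMP Team's GIMP, Ulead PhotoImpact and Paint Shop Pro, desktop publishing and page-layout applications such as QuarkXPress and Adobe InDesign, as well as scanning, faxing, word processing, optical character recognition and other applications all support it. Adobe, which acquired the PageMaker publishing application from Aldus, now controls the TIFF specification.
For more details, see: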
http://zh.wikipedia.org/wiki/TIFF
http://baike.baidu.com/view/66014.htm
EXIF2-1.pdf (2.5.1. Basic Structure of Primary Image Data) states that for compressed data, APP1 must use TIFF: "For compressed data, the attribute information required by the DSC application shall be recorded in APP1. Data writing in APP1 shall be compatible with TIFF."
Studying the Exif2-1.pdf document
EXIF can be used with uncompressed RGB data, uncompressed YCbCr data, and JPEG compressed data. Since the pictures taken by our phone cameras are JPEG data, only EXIF in JPEG files is studied here.
File structure
For an Exif file with a compressed thumbnail, the format is analyzed through the following example.
The Canon image provided on www.exif.org is analyzed: http://www.exif.org/samples.html. The tools provided on that site, or JPEGsnoop, can help with the analysis.
Open the file with a hex viewer:
FFD8:SOI
FFE1:APP1
1BFE: APP1 length (7166 dec)
457869660000: Exif Header (EXIF..)
49492A00 08000000: TIFF Header
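The first 20 bytes read off above can also be checked with a few lines of Python. This is only a minimal sketch: it assumes, as in this sample, that the APP1 segment immediately follows SOI, and the function name parse_app1_prefix is made up for illustration.

```python
import struct

def parse_app1_prefix(data: bytes):
    """Return (app1_length, tiff_header_offset) for a JPEG whose first
    segment after SOI is an Exif APP1 segment (an assumption of this sketch)."""
    if data[0:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    if data[2:4] != b"\xff\xe1":
        raise ValueError("first segment is not APP1")
    # JPEG segment lengths are big-endian, regardless of the TIFF byte order.
    (app1_length,) = struct.unpack(">H", data[4:6])
    if data[6:12] != b"Exif\x00\x00":
        raise ValueError("APP1 does not start with the Exif header")
    # SOI (2) + APP1 marker (2) + length (2) + Exif header (6) = 12
    return app1_length, 12

# The bytes listed above, concatenated.
prefix = bytes.fromhex("FFD8" "FFE1" "1BFE" "457869660000" "49492A0008000000")
app1_length, tiff_offset = parse_app1_prefix(prefix)
print(app1_length, hex(tiff_offset))  # prints: 7166 0xc
```

The returned offset 12 (0xC) is the position of the TIFF Header in the file, which is the base for every offset used in the rest of the analysis.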
Analyzing the TIFF Header
(Per TIFF6.pdf, Section 2: TIFF Structure, Image File Header) The specification says that every TIFF file starts with an 8-byte image file header: "A TIFF file begins with an 8-byte image file header that points to an image file directory (IFD)."
Bytes 0-1: The byte order used within the file. Legal values are:
“II” (4949.H)
“MM” (4D4D.H)
Bytes 2-3: An arbitrary but carefully chosen number (42) that further identifies the file as a TIFF file.
The byte order depends on the value of Bytes 0-1.
Bytes 4-7: The offset (in bytes) of the first IFD. The directory may be at any location in the file after the header but must begin on a word boundary.
In detail: 4949 = "II" means "Intel" byte order (little-endian). If it is 0x4d4d = "MM", it means "Motorola" byte order (big-endian).
2A00: the chosen number (42). With Intel byte order the next 2 bytes are "2A 00"; with Motorola byte order they are "00 2A".
08000000: The offset (in bytes) of the first IFD is 0x08.
Note: 0x08 is only the first IFD's offset relative to the TIFF Header.
TIFF Header's offset in the file is 0xC
The first IFD's offset relative to the file is 0xC + 0x8 = 0x14
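The header rules above translate almost directly into code. The following is a rough sketch, not a full TIFF reader; parse_tiff_header and TIFF_HEADER_FILE_OFFSET are names invented for this example, and the 0xC base is the value found for this particular sample file.

```python
import struct

def parse_tiff_header(tiff: bytes):
    """Return (endian_prefix, first_ifd_offset) from an 8-byte TIFF header."""
    order = tiff[0:2]
    if order == b"II":
        endian = "<"  # "Intel", little-endian
    elif order == b"MM":
        endian = ">"  # "Motorola", big-endian
    else:
        raise ValueError("unknown byte order mark")
    # Bytes 2-3: the number 42; bytes 4-7: offset of the first IFD.
    magic, first_ifd = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header")
    return endian, first_ifd

TIFF_HEADER_FILE_OFFSET = 0xC  # position of the TIFF Header in this sample

endian, first_ifd = parse_tiff_header(bytes.fromhex("49492A0008000000"))
first_ifd_in_file = TIFF_HEADER_FILE_OFFSET + first_ifd
print(endian, hex(first_ifd), hex(first_ifd_in_file))  # prints: < 0x8 0x14
```

Every offset stored inside the TIFF data is relative to the TIFF Header, so the same "+ 0xC" step shows up whenever an offset has to be turned into a file address.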
Analyzing the Image File Directory (IFD)
Image File Directory: An Image File Directory (IFD) consists of a 2-byte count of the number of directory entries (i.e., the number of fields), followed by a sequence of 12-byte field entries, followed by a 4-byte offset of the next IFD (or 0 if none). (Do not forget to write the 4 bytes of 0 after the last IFD.) There must be at least 1 IFD in a TIFF file and each IFD must have at least one entry.
Regarding big endian format, the document notes: "For example, in big endian format, if the type is SHORT and the value is 1, it is recorded as 00010000.H."
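To see what that quote means for the 4-byte value field of an entry, here is a small sketch that writes a SHORT value into the field in both byte orders. The helper name pack_short_value_field is hypothetical; the point is only that the 2-byte value sits at the start of the field and the remaining 2 bytes are padding.

```python
import struct

def pack_short_value_field(value: int, endian: str) -> bytes:
    """Place one SHORT at the start of a 4-byte value field, zero-padded."""
    return struct.pack(endian + "H", value) + b"\x00\x00"

print(pack_short_value_field(1, ">").hex())  # big-endian: 00010000
print(pack_short_value_field(1, "<").hex())  # little-endian: 01000000
```

The sample file is little-endian ("II"), so a SHORT with value 1 shows up in it as 0100 0000.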
With this structure in mind, let's see how this file lays it out:
0900: 9 Directory Entries.
Directory Entry 0: 0F01 0200 0600 0000 7A00 0000
  0F01: The Tag. Value: 10F.H: Make
  0200: Type: 2 = ASCII.
  0600 0000: number of values: 6.
  7A00 0000: Offset: 0x0000007A. The first IFD is at 0x14 in the file; the value position is 0xC + 0x7A = 0x86.
  At address 0x86 the value is "Canon": 6 characters, counting the terminating NUL.
Directory Entry 1: 1001 0200 1400 0000 8000 0000
  1001: The Tag. Value: 110.H: Model
  0200: 0x0002, Type: 2 = ASCII.
  1400 0000: number of values: 0x14.
  8000 0000: Offset: 0x00000080. The actual position in the file is 0xC + 0x80 = 0x8C, right after the value of Directory Entry 0.
Directory Entry 2: 1201 0300 0100 0000 0100 0000
  1201: 0x0112, Tag: Orientation
  0300: 0x0003, Type: 3 = SHORT
  0100 0000: number of values: 1
  0100 0000: Value: 0x00000001
Directory Entry 3: 1A01 0500 0100 0000 9400 0000
  1A01: 0x011A XResolution
Directory Entry 4: 1B01 0500 0100 0000 9C00 0000
  1B01: 0x011B YResolution
  0500: 5 = RATIONAL, two LONGs.
  0100 0000: count 1
  0xC + 0x9C = 0xA8; the bytes at 0xA8 are B4 00 00 00 01 00 00 00 (180/1)
Directory Entry 5: 2801 0300 0100 0000 0200 0000
  2801: 0x0128 ResolutionUnit
Directory Entry 6: 3201 0200 1400 0000 A400 0000
  3201: 0x0132 DateTime
  Offset: 0xC + 0xA4 = 0xB0
Directory Entry 7: 1302 0300 0100 0000 0100 0000
  1302: 0x0213 YCbCrPositioning
Directory Entry 8: 6987 0400 0100 0000 B800 0000
  0x8769: Exif-specific IFD
  0400: 4 = LONG, a 32-bit (4-byte) unsigned integer
  0100 0000: count 0x01
  B800 0000: the Exif SubIFD starts at offset 0xB8; 0xC + 0xB8 = 0xC4
[Figure: tool analysis result]
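Independently of the tool output, the nine IFD0 entries walked through above can be decoded with a short script. This is a sketch under stated assumptions: the hex string is simply the entry bytes quoted in this article (the trailing next-IFD offset is left out), the TYPE_NAMES table covers only the types that appear here, and parse_ifd_entries is an illustrative name.

```python
import struct

# Only the TIFF/Exif types that show up in this walkthrough.
TYPE_NAMES = {1: "BYTE", 2: "ASCII", 3: "SHORT", 4: "LONG",
              5: "RATIONAL", 7: "UNDEFINED"}

def parse_ifd_entries(buf: bytes, endian: str = "<"):
    """Parse a 2-byte entry count followed by that many 12-byte entries.
    Each entry is returned as (tag, type, count, raw 4-byte value/offset)."""
    (count,) = struct.unpack_from(endian + "H", buf, 0)
    entries = []
    for i in range(count):
        entries.append(struct.unpack_from(endian + "HHI4s", buf, 2 + 12 * i))
    return entries

IFD0_HEX = """
0900
0F01 0200 0600 0000 7A00 0000
1001 0200 1400 0000 8000 0000
1201 0300 0100 0000 0100 0000
1A01 0500 0100 0000 9400 0000
1B01 0500 0100 0000 9C00 0000
2801 0300 0100 0000 0200 0000
3201 0200 1400 0000 A400 0000
1302 0300 0100 0000 0100 0000
6987 0400 0100 0000 B800 0000
"""
# Strip all whitespace before decoding the hex digits.
entries = parse_ifd_entries(bytes.fromhex("".join(IFD0_HEX.split())))
for tag, type_id, n, raw in entries:
    print(f"tag=0x{tag:04X} type={TYPE_NAMES.get(type_id, type_id)} "
          f"count={n} value/offset={raw.hex()}")
```

The last field is kept as raw bytes on purpose: whether it holds the value itself or an offset depends on whether the data fits into 4 bytes, so the caller decides how to interpret it.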
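Analyzing the SubIFD
At address 0xC4: 1B00, value 0x1B (decimal 27), which is the number of entries. The entries start at 0xC6 and end at 0xC6 + 0x1B * 0xC = 0x20A.
There are too many entries here, so to determine the extent of the data we work from the end backward and look for the entry with the largest value offset: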
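The address arithmetic for the SubIFD can be spelled out as a tiny calculation. The variable names below are made up for this sketch; the numbers are the ones read from the sample file.

```python
ENTRY_SIZE = 12          # every directory entry is 12 bytes
subifd_start = 0xC4      # file address of the SubIFD (0xC + 0xB8)
entry_count = 0x1B       # the 2-byte count read at 0xC4

entries_start = subifd_start + 2                       # skip the count
entries_end = entries_start + entry_count * ENTRY_SIZE

print(hex(entries_start), hex(entries_end))  # prints: 0xc6 0x20a
```

The 4-byte offset of the next IFD follows directly after the last entry, i.e. at the address computed as entries_end.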
Directory Entry 26: 00A3 0700 0100 0000 0300 0000
  00A3: 0xA300 FileSource
  0700: 7 = UNDEFINED
Directory Entry 23: 0FA2 0500 0100 0000 3804 0000
  0x0438 + 0xC = 0x444
  The bytes at that address: 0053 0700 9B00 0000
  0x00075300 = 480000, 0x9B = 155
Directory Entry 21: 05A0 0400 0100 0000 4004 0000
  C. Interoperability IFD: Interoperability IFD Pointer, Tag = A005.H
  0xC + 0x440 = 0x44C
------
7C92 0700 3601 0000 FA02 0000
  7C92: 0x927C MakerNote
  0700: UNDEFINED (how should this "undefined" type be understood?)
  3601 0000: count 0x136
  0xC + 0x2FA = 0x306
=================
"The Interoperability structure of Interoperability IFD is same as TIFF defined IFD structure but does not contain"
So it is analyzed in the same way as a TIFF IFD:
0400
0100 0200 0400 0000 5239 3800
0200 0700 0400 0000 3031 3030
0110 0300 0100 0000 8002 0000
0210 0300 0100 0000 E001 0000
0000 0000
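Two kinds of values appear in this part of the dump: a RATIONAL stored at an offset, and small values stored directly in the 4-byte field of the Interoperability IFD entries. The sketch below decodes both from the bytes quoted above; read_rational is a hypothetical helper, and little-endian order is assumed because the file starts with "II".

```python
import struct

def read_rational(buf: bytes, endian: str = "<"):
    """A RATIONAL is two LONGs: numerator first, then denominator."""
    return struct.unpack(endian + "II", buf[:8])

# The 8 bytes found at 0x444 (Directory Entry 23).
num, den = read_rational(bytes.fromhex("00530700" "9B000000"))
print(num, den)  # prints: 480000 155

# Value fields of the four Interoperability IFD entries.
interop_index = bytes.fromhex("52393800").rstrip(b"\x00").decode("ascii")
interop_version = bytes.fromhex("30313030").decode("ascii")
(related_width,) = struct.unpack("<H", bytes.fromhex("8002"))
(related_height,) = struct.unpack("<H", bytes.fromhex("E001"))
print(interop_index, interop_version, related_width, related_height)
# prints: R98 0100 640 480
```

All four Interoperability values fit into 4 bytes, which is why none of these entries needs a separate data area.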
[Figure: tool analysis result]
As can be seen, the Exif version used by this file is 0210.
Analyzing IFD 1
0x476 + 0xC = 0x482; let's look at IFD1 there:
0600: 0x0006: 6 entries.
Directory Entry 0: 0301 0300 0100 0000 0600 0000
  0301: 0x0103 Compression
  0300: 0x0003, Type: 3 = SHORT
  0100 0000: count 1
  0600 0000: Shows compression method. '1' means no compression, '6' means JPEG compression.
Directory Entry 1: 1A01 0500 0100 0000 C404 0000
  0500: 5 = RATIONAL
  C404 0000: 0x04C4; 0xC + 0x04C4 = 0x4D0
Directory Entry 2: 1B01 0500 0100 0000 CC04 0000
  0xC + 0x04CC = 0x4D8
Directory Entry 3: 2801 0300 0100 0000 0200 0000
Directory Entry 4: 0102 0400 F405 0000
  Tag = 513 (201.H)
  From the specification: "This Field indicates whether a JPEG interchange format bitstream is present in the TIFF file. If a JPEG interchange format bitstream is present, then this Field points to the Start of Image (SOI) marker code."
  0xC + 0x05F4 = 0x600
Directory Entry 5: 0202 0400 0100 0000 DE14 0000
[Figure: tool analysis result]
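Entries 4 and 5 of IFD1 are enough to locate the compressed thumbnail. The following is a rough sketch of that step: it treats the value of entry 5 (0x14DE) as the thumbnail length in bytes, which is how tag 0x0202 is defined, and the function names thumbnail_span and extract_thumbnail are invented for this example.

```python
def thumbnail_span(jpeg_offset: int, jpeg_length: int, tiff_base: int = 0xC):
    """Turn the TIFF-relative thumbnail offset into a (start, end) file range."""
    start = tiff_base + jpeg_offset
    return start, start + jpeg_length

def extract_thumbnail(data: bytes, jpeg_offset: int, jpeg_length: int,
                      tiff_base: int = 0xC) -> bytes:
    """Slice the thumbnail out of the file bytes and check that it starts with SOI."""
    start, end = thumbnail_span(jpeg_offset, jpeg_length, tiff_base)
    thumb = data[start:end]
    if thumb[:2] != b"\xff\xd8":
        raise ValueError("thumbnail does not start with SOI")
    return thumb

start, end = thumbnail_span(0x05F4, 0x14DE)
print(hex(start), hex(end))  # prints: 0x600 0x1ade
```

With the bytes of the sample file loaded into a variable such as data, extract_thumbnail(data, 0x05F4, 0x14DE) would return a standalone JPEG that can be written to disk and opened on its own.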