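1 Data Streams
Data streams read and write strings, integers, floating-point numbers, and other data at a higher level of abstraction than raw bytes.
Classes:
java.io.DataInputStream and java.io.DataOutputStream
They read and write Java primitive types and strings.
Because the byte format is fixed regardless of the host machine, they are well suited to moving data between platforms.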
2 The Data Stream Classes
public class DataInputStream extends FilterInputStream implements DataInput
public class DataOutputStream extends FilterOutputStream implements DataOutput
3 The Input and Output Interfaces
Input: 15 methods, declared by the DataInput interface
public boolean readBoolean( ) throws IOException
public byte readByte( ) throws IOException
public int readUnsignedByte( ) throws IOException
public short readShort( ) throws IOException
public int readUnsignedShort( ) throws IOException
public char readChar( ) throws IOException
public int readInt( ) throws IOException
public long readLong( ) throws IOException
public float readFloat( ) throws IOException
public double readDouble( ) throws IOException
public String readLine( ) throws IOException
public String readUTF( ) throws IOException
public void readFully(byte[] data) throws IOException
public void readFully(byte[] data, int offset, int length) throws IOException
public int skipBytes(int n) throws IOException
Output: 14 methods, declared by the DataOutput interface
public void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
public void writeBoolean(boolean v) throws IOException
public void writeByte(int b) throws IOException
public void writeShort(int s) throws IOException
public void writeChar(int c) throws IOException
public void writeInt(int i) throws IOException
public void writeLong(long l) throws IOException
public void writeFloat(float f) throws IOException
public void writeDouble(double d) throws IOException
public void writeBytes(String s) throws IOException
public void writeChars(String s) throws IOException
public void writeUTF(String s) throws IOException
4 Data Stream Summary
Type | Written by | Read by | Format |
---|---|---|---|
boolean | writeBoolean(boolean b) | readBoolean( ) | One byte, 0 if false, 1 if true |
byte | writeByte(int b) | readByte( ) | One byte, two's complement |
byte array | write(byte[] data) write(byte[] data, int offset, int length) | readFully(byte[] data) readFully(byte[] data, int offset, int length) | The bytes in the order they appear in the array or subarray |
short | writeShort(int s) | readShort( ) | Two bytes, two's complement, big-endian |
char | writeChar(int c) | readChar( ) | Two bytes, unsigned, big-endian |
int | writeInt(int i) | readInt( ) | Four bytes, two's complement, big-endian |
long | writeLong(long l) | readLong( ) | Eight bytes, two's complement, big-endian |
float | writeFloat(float f) | readFloat( ) | Four bytes, IEEE 754, big-endian |
double | writeDouble(double d) | readDouble( ) | Eight bytes, IEEE 754, big-endian |
unsigned byte | N/A | readUnsignedByte( ) | One unsigned byte |
unsigned short | N/A | readUnsignedShort( ) | Two bytes, big-endian, unsigned |
String | writeBytes(String s) | N/A | The low-order byte of each char in the string from first to last |
String | writeChars(String s) | N/A | Both bytes of each char in the string from first to last |
String | writeUTF(String s) | readUTF( ) | An unsigned short (two bytes, big-endian) giving the number of bytes in the encoded string, followed by a modified UTF-8 encoding of the string |
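5 Data Stream Constructors
Because they are filter streams, data streams are always chained to another stream that serves as the underlying stream.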
public DataInputStream(InputStream in)
public DataOutputStream(OutputStream out)
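As an illustration of chaining, here is a minimal sketch that wraps in-memory byte array streams with the two constructors above. The class name DataStreamRoundTrip and its helper methods are made up for this example; only the java.io classes are real. The values must be read back in the same order they were written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, used only for illustration.
final class DataStreamRoundTrip {

    // Writes an int, a double and a boolean through a DataOutputStream
    // chained to an in-memory ByteArrayOutputStream.
    static byte[] encode(int i, double d, boolean b) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(i);      // 4 bytes
            out.writeDouble(d);   // 8 bytes
            out.writeBoolean(b);  // 1 byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the three values back in the order they were written.
    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int i = in.readInt();
            double d = in.readDouble();
            boolean b = in.readBoolean();
            return i + " " + d + " " + b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = encode(42, 2.5, true);
        System.out.println(data.length + " bytes: " + decode(data));
    }
}
```

The same pattern applies to any other underlying stream, for example a FileOutputStream or a socket's output stream.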
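6 Integers (characters are similar)
Three questions to consider:
width, byte order (big-endian or little-endian), and signedness
Java uses 32-bit ints and big-endian byte order. Unsigned values can be read (readUnsignedByte( ), readUnsignedShort( )), but there are no dedicated methods for writing them. Two's complement makes this workable (-n == ~n + 1): writeByte( ) and writeShort( ) write only the low-order 8 or 16 bits of their int argument, so the same bytes serve for both the signed and the unsigned interpretation.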
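The following sketch shows the signed and unsigned readings of the same written bytes. The class UnsignedDemo and its methods are hypothetical names chosen for this example; each method returns a two-element array holding the signed value and then the unsigned value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, used only for illustration.
final class UnsignedDemo {

    // Writes the low-order 8 bits of value, then reads them back
    // once as a signed byte and once as an unsigned byte.
    static int[] byteRoundTrip(int value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new DataOutputStream(buffer).writeByte(value);
            byte[] data = buffer.toByteArray();
            int signed = new DataInputStream(new ByteArrayInputStream(data)).readByte();
            int unsigned = new DataInputStream(new ByteArrayInputStream(data)).readUnsignedByte();
            return new int[] {signed, unsigned};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same idea for the low-order 16 bits, written big-endian.
    static int[] shortRoundTrip(int value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new DataOutputStream(buffer).writeShort(value);
            byte[] data = buffer.toByteArray();
            int signed = new DataInputStream(new ByteArrayInputStream(data)).readShort();
            int unsigned = new DataInputStream(new ByteArrayInputStream(data)).readUnsignedShort();
            return new int[] {signed, unsigned};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] b = byteRoundTrip(200);     // bit pattern 0xC8
        int[] s = shortRoundTrip(50000);  // bit pattern 0xC350
        System.out.println(b[0] + " / " + b[1]);
        System.out.println(s[0] + " / " + s[1]);
    }
}
```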
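7 Floating Point
Floating-point values follow IEEE 754; otherwise they behave like the integers above (fixed width, big-endian).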
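A small sketch of what this means in practice: the bytes produced by writeFloat( ) are the IEEE 754 bit pattern of the value, written like an int. The class name FloatBits is an assumption made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class, used only for illustration.
final class FloatBits {

    // The four bytes produced by writeFloat.
    static byte[] viaWriteFloat(float f) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeFloat(f);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The four bytes produced by writing the IEEE 754 bit pattern as an int.
    static byte[] viaWriteInt(float f) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(Float.floatToIntBits(f));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // 1.0f has the bit pattern 0x3F800000.
        System.out.println(Arrays.toString(viaWriteFloat(1.0f)));
        System.out.println(Arrays.equals(viaWriteFloat(1.0f), viaWriteInt(1.0f)));
    }
}
```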
8 Booleans
true is written as the single byte 0x01 and false as 0x00. When reading, readBoolean( ) treats any non-zero byte as true.
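A brief sketch of both directions, using a hypothetical BooleanBytes class: what writeBoolean( ) emits, and how readBoolean( ) interprets an arbitrary byte.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, used only for illustration.
final class BooleanBytes {

    // Returns the single byte that writeBoolean emits for v.
    static byte written(boolean v) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBoolean(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray()[0];
    }

    // Interprets one raw byte with readBoolean.
    static boolean read(int rawByte) {
        byte[] data = {(byte) rawByte};
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(written(true) + " " + written(false));
        System.out.println(read(2) + " " + read(0));
    }
}
```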
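9 Writing Text
writeChar( ) writes a single Java char. It does not use UTF-8; it simply writes the char's two bytes, high byte first (that is, one UTF-16 code unit). writeChars( ) does the same for every char in a string. writeBytes( ) writes only the low-order byte of each char, so the information in the high-order byte is lost; in other words, it is only safe for strings whose characters all lie in the range 0~255. Neither writeBytes( ) nor writeChars( ) records the length of the string, which is why they have no matching read method. writeUTF( ), by contrast, preserves the high-order information and also records the length: it first writes the number of encoded bytes as a 2-byte unsigned value (0~65535), then writes the string encoded in modified UTF-8. The encoded form may not exceed 65535 bytes; otherwise a UTFDataFormatException is thrown.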
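These classes let byte-oriented and character-oriented content be mixed in one stream, which helps in some scenarios that involve text content, such as XML documents. Keep in mind that writeUTF( ) output carries a length prefix and uses modified UTF-8, so it is not a plain UTF-8 text file.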
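To make the differences concrete, the sketch below writes the same two-character string, "A" followed by U+4E2D, with each of the three string methods and captures the resulting bytes. The class TextWriteSizes and its nested Step interface are hypothetical names for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, used only for illustration.
final class TextWriteSizes {

    // A write action that may throw IOException.
    private interface Step {
        void apply(DataOutputStream out) throws IOException;
    }

    // Runs one write action against an in-memory stream and returns the bytes.
    private static byte[] capture(Step step) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            step.apply(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Low-order byte of each char only.
    static byte[] viaWriteBytes(String s) {
        return capture(out -> out.writeBytes(s));
    }

    // Both bytes of each char, high byte first.
    static byte[] viaWriteChars(String s) {
        return capture(out -> out.writeChars(s));
    }

    // Two-byte length prefix followed by modified UTF-8.
    static byte[] viaWriteUTF(String s) {
        return capture(out -> out.writeUTF(s));
    }

    public static void main(String[] args) {
        String s = "A\u4E2D";
        System.out.println(viaWriteBytes(s).length);  // high byte of U+4E2D is dropped
        System.out.println(viaWriteChars(s).length);
        System.out.println(viaWriteUTF(s).length);
    }
}
```

For this string, writeBytes( ) produces 2 bytes (0x41, 0x2D), writeChars( ) produces 4 bytes, and writeUTF( ) produces 6 bytes: a 2-byte length of 4, one byte for "A", and three bytes for U+4E2D.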
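10 Reading Text
Use readUTF( ) to read a string that was written with writeUTF( ) (length prefix plus modified UTF-8).
readLine( ) is deprecated because it does not convert bytes to characters properly; BufferedReader.readLine( ) is the preferred way to read lines of text.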
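A minimal sketch of both reading paths: a writeUTF( )/readUTF( ) round trip, and line reading through a BufferedReader with an explicit charset in place of the deprecated readLine( ). The class TextReadDemo and its method names are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only for illustration.
final class TextReadDemo {

    // Writes s with writeUTF and reads it back with readUTF.
    static String utfRoundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new DataOutputStream(buffer).writeUTF(s);
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes text as plain UTF-8 and reads its first line with a
    // BufferedReader, the replacement for DataInputStream.readLine.
    static String firstLine(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(utfRoundTrip("caf\u00E9 \u4E2D"));
        System.out.println(firstLine("caf\u00E9\nsecond line"));
    }
}
```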
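11 Little-Endian Writing and Reading
LittleEndianOutputStream and LittleEndianInputStream are custom filter stream classes (they are not part of java.io).
Two of their methods serve as examples: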
public void writeLong(long l) throws IOException {
    // Emit the eight bytes from least significant to most significant.
    out.write((int) l & 0xFF);
    out.write((int) (l >>> 8) & 0xFF);
    out.write((int) (l >>> 16) & 0xFF);
    out.write((int) (l >>> 24) & 0xFF);
    out.write((int) (l >>> 32) & 0xFF);
    out.write((int) (l >>> 40) & 0xFF);
    out.write((int) (l >>> 48) & 0xFF);
    out.write((int) (l >>> 56) & 0xFF);
    written += 8;  // running count of bytes written so far
}
In effect, the low-order byte is written first and the high-order byte last.
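The byte order can be checked with a self-contained sketch. It is not the LittleEndianOutputStream class itself; the class LittleEndianLong and its methods are hypothetical, and java.nio.ByteBuffer with ByteOrder.LITTLE_ENDIAN is used only as an independent cross-check.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper class, used only for illustration.
final class LittleEndianLong {

    // Splits a long into eight bytes, least significant byte first.
    static byte[] toLittleEndian(long l) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (l >>> (8 * i));
        }
        return out;
    }

    // Reassembles a long from eight little-endian bytes.
    static long fromLittleEndian(byte[] b) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result |= (b[i] & 0xFFL) << (8 * i);
        }
        return result;
    }

    // Compares the manual encoding with ByteBuffer's little-endian encoding.
    static boolean matchesByteBuffer(long l) {
        byte[] expected =
                ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(l).array();
        return Arrays.equals(toLittleEndian(l), expected);
    }

    public static void main(String[] args) {
        long value = 0x0102030405060708L;
        System.out.println(Arrays.toString(toLittleEndian(value)));
        System.out.println(fromLittleEndian(toLittleEndian(value)) == value);
        System.out.println(matchesByteBuffer(value));
    }
}
```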
The other example is writing UTF:
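public void writeUTF(String s) throws IOException {
    int numchars = s.length();
    int numbytes = 0;
    // First pass: count how many bytes the encoded string will occupy.
    for (int i = 0; i < numchars; i++) {
        int c = s.charAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) numbytes++;
        else if (c > 0x07FF) numbytes += 3;
        else numbytes += 2;  // 0x0080~0x07FF, and also the null char 0x0000
    }
    if (numbytes > 65535) throw new UTFDataFormatException();
    // Two-byte length prefix: high byte first, then low byte.
    out.write((numbytes >>> 8) & 0xFF);
    out.write(numbytes & 0xFF);
    // Second pass: write each char in its 1-, 2- or 3-byte form.
    for (int i = 0; i < numchars; i++) {
        int c = s.charAt(i);
        if ((c >= 0x0001) && (c <= 0x007F)) {
            out.write(c);
        }
        else if (c > 0x07FF) {
            out.write(0xE0 | ((c >> 12) & 0x0F));
            out.write(0x80 | ((c >> 6) & 0x3F));
            out.write(0x80 | (c & 0x3F));
            written += 2;  // two bytes beyond the one counted per char below
        }
        else {
            out.write(0xC0 | ((c >> 6) & 0x1F));
            out.write(0x80 | (c & 0x3F));
            written += 1;  // one byte beyond the one counted per char below
        }
    }
    // One byte per char plus the two-byte length prefix.
    written += numchars + 2;
}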
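Because UTF-8 is a variable-length encoding with the forms shown below, reversing its byte order would have to take the encoded length of each character into account.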
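Unicode/UCS-4 range (hex) | Bits | UTF-8 byte pattern | Bytes | Notes |
---|---|---|---|---|
0000 ~ 007F | 0~7 | 0XXXXXXX | 1 | |
0080 ~ 07FF | 8~11 | 110XXXXX 10XXXXXX | 2 | |
0800 ~ FFFF | 12~16 | 1110XXXX 10XXXXXX 10XXXXXX | 3 | Basic definition range: 0 ~ FFFF |
1 0000 ~ 1F FFFF | 17~21 | 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX | 4 | Range defined by Unicode 6.1: 0 ~ 10 FFFF |
20 0000 ~ 3FF FFFF | 22~26 | 111110XX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX | 5 | Not a Unicode range; belongs to UCS-4 only |
400 0000 ~ 7FFF FFFF | 27~31 | 1111110X 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX 10XXXXXX | 6 | Not a Unicode range; belongs to UCS-4 only |
Note on the 5- and 6-byte forms: early UTF-8 specifications allowed sequences of up to 6 bytes, covering 31 bits (the original limit of the Universal Character Set). In November 2003, however, RFC 3629 redefined UTF-8 so that it covers only the range defined by Unicode, U+0000 to U+10FFFF. Under that specification, these byte values cannot appear in a valid UTF-8 sequence.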
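The length column of the table can be checked against the JDK with a short sketch. The class Utf8Length and its methods are hypothetical names for this example. It also shows where the modified UTF-8 used by writeUTF( ) differs from standard UTF-8: U+0000 takes two bytes, and a supplementary character is written as two surrogate chars of three bytes each. Lone surrogate code points (D800 ~ DFFF) are not covered here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, used only for illustration.
final class Utf8Length {

    // Encoded length according to the table (standard UTF-8, up to 10FFFF).
    static int byTable(int codePoint) {
        if (codePoint <= 0x7F) return 1;
        if (codePoint <= 0x7FF) return 2;
        if (codePoint <= 0xFFFF) return 3;
        return 4;
    }

    // Encoded length as produced by the JDK's standard UTF-8 encoder.
    static int byJdk(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encoded length as produced by writeUTF, excluding the 2-byte length prefix.
    static int byWriteUTF(int codePoint) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(new String(Character.toChars(codePoint)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size() - 2;
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x4E2D, 0x1F600};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + ": table=" + byTable(cp)
                    + " jdk=" + byJdk(cp) + " writeUTF=" + byWriteUTF(cp));
        }
    }
}
```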
12 Thread Safety
Consider the following code:
public int readInt() throws IOException {
    // Four separate reads from the underlying stream, low-order byte first.
    int byte1 = in.read();
    int byte2 = in.read();
    int byte3 = in.read();
    int byte4 = in.read();
    if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
        throw new EOFException();
    }
    return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
}
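If several threads execute this at the same time, there is a problem: another thread's reads can be interleaved with these four, so nothing guarantees that byte1 through byte4 are four consecutive bytes of the stream.
So you might need to do this instead: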
public int readInt() throws IOException {
    int byte1, byte2, byte3, byte4;
    // Hold this stream's lock so the four reads happen as one unit.
    synchronized (this) {
        byte1 = in.read();
        byte2 = in.read();
        byte3 = in.read();
        byte4 = in.read();
    }
    if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
        throw new EOFException();
    }
    return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
}
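Even this only excludes threads that synchronize on the same object; a thread that reads the underlying stream directly is not stopped. So the advice is: do not share a stream among multiple threads.
This is especially important for filter streams, and it is good practice for ordinary streams too.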
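One way to follow this advice is thread confinement: a single thread owns the stream, and other threads receive only the decoded values. The sketch below is one possible arrangement under that assumption, not the only one. The class ConfinedReader and its methods are hypothetical; the executor classes come from java.util.concurrent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, used only for illustration.
final class ConfinedReader {

    // Builds sample input: each value written as a 4-byte big-endian int.
    static byte[] encode(int... values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decodes every int on one dedicated thread. The DataInputStream is
    // created, used and closed on that thread only; the caller sees just
    // the resulting list.
    static List<Integer> readInts(byte[] data) {
        ExecutorService owner = Executors.newSingleThreadExecutor();
        try {
            Future<List<Integer>> result = owner.submit(() -> {
                List<Integer> ints = new ArrayList<>();
                try (DataInputStream in =
                        new DataInputStream(new ByteArrayInputStream(data))) {
                    for (int i = 0; i < data.length / 4; i++) {
                        ints.add(in.readInt());
                    }
                }
                return ints;
            });
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            owner.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(readInts(encode(1, 2, 3)));
    }
}
```

Because no second thread ever touches the stream, the reads inside readInt( ) cannot be interleaved, and no locking is needed.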