Java I/O Summary (8): Filter Streams and Data Streams


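1 Data Streams

Data streams read and write strings, integers, floating-point numbers, and other data at a higher level of abstraction than raw bytes.

Classes:

java.io.DataInputStream and java.io.DataOutputStream

They read and write Java's primitive types and strings.

Because every type has a fixed byte format, they work very well for moving data between platforms.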


2 The Data Stream Classes

public class DataInputStream extends FilterInputStream implements DataInput
public class DataOutputStream extends FilterOutputStream
                              implements DataOutput


3 Input and Output Interfaces

Input: 15 methods, declared by the DataInput interface:

public boolean readBoolean( ) throws IOException
public byte    readByte( ) throws IOException
public int     readUnsignedByte( ) throws IOException
public short   readShort( ) throws IOException
public int     readUnsignedShort( ) throws IOException
public char    readChar( ) throws IOException
public int     readInt( ) throws IOException
public long    readLong( ) throws IOException
public float   readFloat( ) throws IOException
public double  readDouble( ) throws IOException
public String  readLine( ) throws IOException
public String  readUTF( ) throws IOException
public void    readFully(byte[] data) throws IOException
public void    readFully(byte[] data, int offset, int length) throws IOException
public int     skipBytes(int n) throws IOException


Output: 14 methods, declared by the DataOutput interface:

public void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length)
                     throws IOException
public void writeBoolean(boolean v) throws IOException
public void writeByte(int b) throws IOException
public void writeShort(int s) throws IOException
public void writeChar(int c) throws IOException
public void writeInt(int i) throws IOException
public void writeLong(long l) throws IOException
public void writeFloat(float f) throws IOException
public void writeDouble(double d) throws IOException
public void writeBytes(String s) throws IOException
public void writeChars(String s) throws IOException
public void writeUTF(String s) throws IOException


4 Data Stream Summary

Table 8-1. Formats used by DataInput and DataOutput

| Type | Written by | Read by | Format |
|---|---|---|---|
| boolean | writeBoolean(boolean b) | readBoolean( ) | One byte, 0 if false, 1 if true |
| byte | writeByte(int b) | readByte( ) | One byte, two's complement |
| byte array | write(byte[] data); write(byte[] data, int offset, int length) | readFully(byte[] data); readFully(byte[] data, int offset, int length) | The bytes in the order they appear in the array or subarray |
| short | writeShort(int s) | readShort( ) | Two bytes, two's complement, big-endian |
| char | writeChar(int c) | readChar( ) | Two bytes, unsigned, big-endian |
| int | writeInt(int i) | readInt( ) | Four bytes, two's complement, big-endian |
| long | writeLong(long l) | readLong( ) | Eight bytes, two's complement, big-endian |
| float | writeFloat(float f) | readFloat( ) | Four bytes, IEEE 754, big-endian |
| double | writeDouble(double d) | readDouble( ) | Eight bytes, IEEE 754, big-endian |
| unsigned byte | N/A | readUnsignedByte( ) | One unsigned byte |
| unsigned short | N/A | readUnsignedShort( ) | Two bytes, big-endian, unsigned |
| String | writeBytes(String s) | N/A | The low-order byte of each char in the string from first to last |
| String | writeChars(String s) | N/A | Both bytes of each char in the string from first to last |
| String | writeUTF(String s) | readUTF( ) | An unsigned short giving the number of bytes in the encoded string, followed by a modified UTF-8 encoding of the string |


5 Data Stream Constructors

Because they are filter streams, data streams wrap another stream that serves as the underlying stream:

public DataInputStream(InputStream in)
public DataOutputStream(OutputStream out)
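
As a minimal sketch of how these constructors are typically used, the example below wraps in-memory byte array streams, writes a few values, and reads them back in the same order. The class name DataStreamRoundTrip and the sample field names are made up for illustration; only the java.io classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamRoundTrip {
    // Writes an int, a double, a boolean and a string to an in-memory buffer.
    static byte[] writeRecord(int id, double price, boolean active, String name) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(id);
            out.writeDouble(price);
            out.writeBoolean(active);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return buffer.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static String readRecord(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            boolean active = in.readBoolean();
            String name = in.readUTF();
            return id + " " + price + " " + active + " " + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = writeRecord(42, 9.5, true, "widget");
        System.out.println(data.length);      // 4 + 8 + 1 + (2 + 6) = 21 bytes
        System.out.println(readRecord(data)); // 42 9.5 true widget
    }
}
```

The stream carries no type information, so the reader has to know the order and types of the fields in advance.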


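6 Integers (chars are similar)

Three things to consider:

Length, endianness, and sign.

Java uses 32-bit ints and big-endian byte order. Unsigned values can be read (readUnsignedByte( ), readUnsignedShort( )), but there are no matching write methods. Because integers are stored in two's complement (-n == ~n + 1), the signed methods still work for writing: writeByte( ) and writeShort( ) write the low-order 8 or 16 bits, which is the same bit pattern as the unsigned value.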
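
The following sketch illustrates that point under the assumption that the value fits in the target width: the same bytes are written with writeByte( ) or writeShort( ) and then read back once as signed and once as unsigned. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class UnsignedRoundTrip {
    // Writes the low 8 bits of value twice, then reads them as signed and unsigned.
    static int[] roundTripByte(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(value);
            out.writeByte(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return new int[] { in.readByte(), in.readUnsignedByte() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same idea for the low 16 bits.
    static int[] roundTripShort(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(value);
            out.writeShort(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return new int[] { in.readShort(), in.readUnsignedShort() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] b = roundTripByte(200);
        System.out.println(b[0] + " " + b[1]); // -56 200
        int[] s = roundTripShort(60000);
        System.out.println(s[0] + " " + s[1]); // -5536 60000
        int n = 56;
        System.out.println(-n == ~n + 1);      // true
    }
}
```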


7 Floating Point

Floating-point values follow IEEE 754; otherwise they behave like the integers above.
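
A small sketch of what that means on the wire: the four bytes produced by writeFloat( ) are the big-endian form of the IEEE 754 bit pattern that Float.floatToIntBits( ) returns. The helper names hex and writeFloat below are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FloatBits {
    // Formats bytes as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Returns the four bytes that DataOutputStream.writeFloat produces.
    static byte[] writeFloat(float f) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeFloat(f);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(hex(writeFloat(1.0f)));                        // 3F800000
        System.out.println(Integer.toHexString(Float.floatToIntBits(1.0f))); // 3f800000
    }
}
```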


8 Booleans

In practice, true is written as 0x01 and false as 0x00.
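
The sketch below checks the written byte and also shows the reading side: readBoolean( ) treats any nonzero byte as true, not just 0x01. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class BooleanBytes {
    // Returns the single byte that writeBoolean produces.
    static byte writeBoolean(boolean value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBoolean(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray()[0];
    }

    // Interprets one raw byte the way readBoolean does.
    static boolean readBoolean(int rawByte) {
        byte[] data = { (byte) rawByte };
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeBoolean(true));  // 1
        System.out.println(writeBoolean(false)); // 0
        System.out.println(readBoolean(2));      // true
    }
}
```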


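9 Writing Text

writeChar( ) writes one Java char. It does not use UTF-8; it simply writes the char's two bytes, high byte first (UTF-16 style). writeChars( ) does the same for every char in a string. writeBytes( ), by contrast, writes only the low-order byte of each char, so the high-order byte is discarded; in other words, it is only safe for text whose chars all fall in the range 0~255. writeUTF( ) keeps the high-order information and also records the length: it first writes a 2-byte unsigned value (0~65535) giving the number of encoded bytes, then writes the string in modified UTF-8. The encoded form cannot exceed 65535 bytes; otherwise a UTFDataFormatException is thrown.

These classes mix byte content and character content in one stream, which helps in some scenarios that involve text, such as XML documents.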
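
To make the difference between the three string methods concrete, here is a small sketch that encodes the same two-character string each way and compares the output. The class TextWriteSizes and its encode helper (including the string used to select the method) are invented for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TextWriteSizes {
    // Encodes s with the named DataOutputStream method and returns the bytes.
    static byte[] encode(String s, String method) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            switch (method) {
                case "writeBytes": out.writeBytes(s); break; // low byte of each char
                case "writeChars": out.writeChars(s); break; // both bytes of each char
                default:           out.writeUTF(s);          // length + modified UTF-8
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String s = "A\u4E2D"; // 'A' followed by U+4E2D
        System.out.println(encode(s, "writeBytes").length); // 2 (U+4E2D is cut down to 0x2D)
        System.out.println(encode(s, "writeChars").length); // 4
        System.out.println(encode(s, "writeUTF").length);   // 6 = 2 length bytes + 1 + 3
    }
}
```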


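10 Reading Text

Use readUTF( ) to read a string that was written by writeUTF( ): a length prefix followed by modified UTF-8.

The readLine( ) method is deprecated because it does not convert bytes to characters correctly.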
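
A brief sketch of both reading paths: readUTF( ) for strings produced by writeUTF( ), and BufferedReader.readLine( ), which the JDK documentation recommends in place of DataInputStream.readLine( ). The class and method names, and the choice of UTF-8 for the line-based text, are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadTextDemo {
    // Writes s with writeUTF and reads it back with readUTF.
    static String utfRoundTrip(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the first line of UTF-8 encoded text with a BufferedReader.
    static String firstLine(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(utfRoundTrip("h\u00E9llo \u4E2D"));
        System.out.println(firstLine("caf\u00E9\nsecond line"));
    }
}
```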


11 Little-Endian Writing and Reading

LittleEndianOutputStream and LittleEndianInputStream (custom classes, not part of java.io)


Two of their methods as examples:

public void writeLong(long l) throws IOException {
    out.write((int) l & 0xFF);
    out.write((int) (l >>> 8) & 0xFF);
    out.write((int) (l >>> 16) & 0xFF);
    out.write((int) (l >>> 24) & 0xFF);
    out.write((int) (l >>> 32) & 0xFF);
    out.write((int) (l >>> 40) & 0xFF);
    out.write((int) (l >>> 48) & 0xFF);
    out.write((int) (l >>> 56) & 0xFF);
    written += 8;
  }

In effect, the low-order byte is written first and the high-order byte last.
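
As a self-contained check of that byte order, the sketch below produces the same eight bytes with a shift loop and compares them with java.nio.ByteBuffer configured for ByteOrder.LITTLE_ENDIAN. The class LittleEndianLong is hypothetical and is not the LittleEndianOutputStream class itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class LittleEndianLong {
    // Byte i holds bits 8*i .. 8*i+7, so the lowest-order byte comes first.
    static byte[] toLittleEndian(long value) {
        byte[] bytes = new byte[8];
        for (int i = 0; i < 8; i++) {
            bytes[i] = (byte) (value >>> (8 * i));
        }
        return bytes;
    }

    // The same conversion done by the JDK, used here as a reference.
    static byte[] viaByteBuffer(long value) {
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(value).array();
    }

    public static void main(String[] args) {
        long value = 0x0102030405060708L;
        System.out.println(Arrays.toString(toLittleEndian(value))); // [8, 7, 6, 5, 4, 3, 2, 1]
        System.out.println(Arrays.equals(toLittleEndian(value), viaByteBuffer(value))); // true
    }
}
```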


The other example is writeUTF( ):

public void writeUTF(String s) throws IOException {
    int numchars = s.length( );
    int numbytes = 0;
    for (int i = 0 ; i < numchars ; i++) {
      int c = s.charAt(i);
      if ((c >= 0x0001) && (c <= 0x007F)) numbytes++;
      else if (c > 0x07FF) numbytes += 3;
      else numbytes += 2;
    }
    if (numbytes > 65535) throw new UTFDataFormatException( );
    out.write((numbytes >>> 8) & 0xFF);
    out.write(numbytes & 0xFF);
    for (int i = 0 ; i < numchars ; i++) {
      int c = s.charAt(i);
      if ((c >= 0x0001) && (c <= 0x007F)) {
        out.write(c);
      }
      else if (c > 0x07FF) {
        out.write(0xE0 | ((c >> 12) & 0x0F));
        out.write(0x80 | ((c >>  6) & 0x3F));
        out.write(0x80 | (c & 0x3F));
        written += 2;
      }
      else {
        out.write(0xC0 | ((c >>  6) & 0x1F));
        out.write(0x80 | (c & 0x3F));
        written += 1;
      }
    }
    written += numchars + 2;
  }

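Because UTF-8 is a variable-length encoding with the forms shown below, the bytes cannot simply be reversed; the encoded length of each character has to be taken into account.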

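| Unicode/UCS-4 range (hex) | Bits | UTF-8 byte pattern | Bytes | Notes |
|---|---|---|---|---|
| 0000 ~ 007F | 0~7 | 0XXX XXXX | 1 | |
| 0080 ~ 07FF | 8~11 | 110X XXXX 10XX XXXX | 2 | |
| 0800 ~ FFFF | 12~16 | 1110 XXXX 10XX XXXX 10XX XXXX | 3 | Basic defined range: 0 ~ FFFF |
| 1 0000 ~ 1F FFFF | 17~21 | 1111 0XXX 10XX XXXX 10XX XXXX 10XX XXXX | 4 | Range defined by Unicode 6.1: 0 ~ 10 FFFF |
| 20 0000 ~ 3FF FFFF | 22~26 | 1111 10XX 10XX XXXX 10XX XXXX 10XX XXXX 10XX XXXX | 5 | Outside the Unicode range; part of UCS-4 |
| 400 0000 ~ 7FFF FFFF | 27~31 | 1111 110X 10XX XXXX 10XX XXXX 10XX XXXX 10XX XXXX 10XX XXXX | 6 | Outside the Unicode range; part of UCS-4 |

Note on the 5-byte and 6-byte rows: the early UTF-8 specification allowed sequences of up to 6 bytes, covering 31 bits (the original limit of the Universal Character Set). In November 2003, RFC 3629 redefined UTF-8 so that it can only encode the range defined by Unicode, U+0000 to U+10FFFF. Under that specification, the lead bytes of these longer sequences cannot appear in a legal UTF-8 sequence.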
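
The byte-counting loop from writeUTF( ) above can be checked against the JDK. This sketch reuses the same three-way test per char and compares the result with what DataOutputStream.writeUTF( ) actually writes and with standard UTF-8 from String.getBytes( ). The class and method names are made up for illustration. The one visible difference is U+0000, which modified UTF-8 encodes in two bytes and standard UTF-8 in one.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ModifiedUtf8Length {
    // Counts encoded bytes per char with the same rules as the writeUTF loop.
    static int encodedLength(String s) {
        int numbytes = 0;
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) numbytes += 1;
            else if (c > 0x07FF) numbytes += 3;
            else numbytes += 2; // includes U+0000
        }
        return numbytes;
    }

    // Total bytes written by the JDK's writeUTF, including the 2-byte length prefix.
    static int writtenLength(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        String s = "A\u00E9\u4E2D\u0000";
        System.out.println(encodedLength(s));                            // 8 = 1 + 2 + 3 + 2
        System.out.println(writtenLength(s));                            // 10 = 8 + 2
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);   // 7 = 1 + 2 + 3 + 1
    }
}
```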

12 Thread Safety

Consider the following code:

public int readInt( ) throws IOException {
    int byte1 = in.read( );
    int byte2 = in.read( );
    int byte3 = in.read( );
    int byte4 = in.read( );
    if (byte4 == -1  || byte3 == -1 || byte2 == -1 || byte1 == -1) {
      throw new EOFException( );
    }
    return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
  }


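If several threads call this method at the same time, there is a problem: nothing guarantees that byte1 through byte4 are four consecutive bytes of the stream, because another thread's reads can be interleaved between them.

So you might need to do this: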

public int readInt( ) throws IOException {
  int byte1, byte2, byte3, byte4;
  synchronized (this) {
    byte1 = in.read( );
    byte2 = in.read( );
    byte3 = in.read( );
    byte4 = in.read( );
  }
  if (byte4 == -1  || byte3 == -1 || byte2 == -1 || byte1 == -1) {
    throw new EOFException( );
  }
  return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
}


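This is still not perfect: it only blocks other threads that go through this filter object, not threads that read the underlying stream directly. A better approach is to synchronize on the in object itself.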
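
One way this could look is sketched below. The class LittleEndianIntReader is a hypothetical stand-in for the article's input stream class, and the lock only helps if every other user of the underlying stream synchronizes on the same object.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class LittleEndianIntReader extends FilterInputStream {
    LittleEndianIntReader(InputStream in) {
        super(in);
    }

    public int readInt() throws IOException {
        InputStream source = in; // the wrapped stream held by FilterInputStream
        int byte1, byte2, byte3, byte4;
        // Lock the underlying stream so the four reads stay together.
        synchronized (source) {
            byte1 = source.read();
            byte2 = source.read();
            byte3 = source.read();
            byte4 = source.read();
        }
        if (byte4 == -1 || byte3 == -1 || byte2 == -1 || byte1 == -1) {
            throw new EOFException();
        }
        return (byte4 << 24) + (byte3 << 16) + (byte2 << 8) + byte1;
    }

    // Reads one little-endian int from a fixed in-memory buffer.
    static int demo() {
        byte[] data = { 0x78, 0x56, 0x34, 0x12 };
        try (LittleEndianIntReader reader =
                new LittleEndianIntReader(new ByteArrayInputStream(data))) {
            return reader.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(demo())); // 12345678
    }
}
```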


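So the advice is: do not share a stream between threads.

This matters most for filter streams, but it is good practice for ordinary streams as well.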
