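The first 5 bytes of a MARC (MAchine-Readable Cataloging) record hold the length of that record, so to read a whole record you first need to know how long it is. Because the length is stored as a string, it has to be converted to an integer. After looking through the documentation I did not find a suitable conversion function in Elixir (I may simply have missed one; the standard library does have `String.to_integer/1` and `Integer.parse/1`), so I wrote one by hand: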
# Conversion module with two public functions:
#   chr2num: converts a digit character (?0..?9) to its numeric value
#   str2int: converts a binary or a charlist to an integer
defmodule Converter do
  def chr2num(char) when char in ?0..?9, do: char - ?0
  def chr2num(_), do: raise("argument must be a character code in #{?0}..#{?9}!")

  # Binaries are converted to charlists first; a leading sign is handled here.
  def str2int(string) when is_binary(string), do: str2int(to_charlist(string))
  def str2int([?- | tail]), do: -str2int(tail)
  def str2int([?+ | tail]), do: str2int(tail)
  def str2int(tail) when is_list(tail), do: _str2int(tail, 0)

  # Accumulate the value digit by digit, most significant digit first.
  defp _str2int([], value), do: value

  defp _str2int([digit | tail], value) when digit in ?0..?9,
    do: _str2int(tail, value * 10 + chr2num(digit))

  defp _str2int([non_digit | _], _), do: raise("invalid digit: '#{[non_digit]}'!")
end
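For comparison, the same conversion can most likely be done with the standard library alone. This is a minimal sketch, assuming the length field always consists of ASCII digits; the value "01511" is the one from the sample record in this post, the other inputs are made up for illustration.

```elixir
# String.to_integer/1 returns the integer and raises ArgumentError
# if the string is not a valid integer representation.
1511 = String.to_integer("01511")

# Integer.parse/1 does not raise: it returns {integer, rest_of_string}
# or :error, which makes it easier to reject malformed input.
{1511, ""} = Integer.parse("01511")
{-42, "xyz"} = Integer.parse("-42xyz")
:error = Integer.parse("abc")
```

`Integer.parse/1` is probably the better fit when the input comes from a file that may be damaged, since a bad length field can then be handled without rescuing an exception.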
Running the Converter module in iex gives:
iex(32)> {:ok, file} = File.open("marc.iso")
{:ok, #PID<0.111.0>}
iex(33)> marc = IO.read(file, :line)
<<48, 49, 53, 49, 49, 110, 97, 109, 48, 32, 50, 50, 48, 48, 50, 55, 55, 32, 32,
32, 52, 53, 48, 32, 48, 48, 53, 48, 48, 49, 55, 48, 48, 48, 48, 48, 48, 49,
48, 48, 48, 51, 50, 48, 48, 48, 49, 55, 49, 48, ...>>
iex(34)> <<length::binary-size(5), _::binary>> = marc
<<48, 49, 53, 49, 49, 110, 97, 109, 48, 32, 50, 50, 48, 48, 50, 55, 55, 32, 32,
32, 52, 53, 48, 32, 48, 48, 53, 48, 48, 49, 55, 48, 48, 48, 48, 48, 48, 49,
48, 48, 48, 51, 50, 48, 48, 48, 49, 55, 49, 48, ...>>
iex(35)> length
"01511"
iex(36)> Converter.str2int(length)
1511
iex(37)>
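Once the length is known, the rest of the record can be read in one step. The sketch below is one way this might look; `MarcReader` and `read_record/1` are hypothetical names that do not appear in the code above, and it assumes that the 5-digit length counts the whole record including the length field itself, and that the device was opened in the default binary mode.

```elixir
defmodule MarcReader do
  # Hypothetical helper, not from the original post.
  # Reads one record from `device`: first the 5-byte length field,
  # then the remaining (length - 5) bytes.
  def read_record(device) do
    case IO.binread(device, 5) do
      <<len_field::binary-size(5)>> -> read_rest(device, len_field)
      :eof -> :eof
      # Short read or I/O error while reading the length field.
      _ -> {:error, :truncated}
    end
  end

  defp read_rest(device, len_field) do
    case Integer.parse(len_field) do
      # Assumption: the length includes the 5 bytes already read.
      {total, ""} when total > 5 ->
        remaining = total - 5

        case IO.binread(device, remaining) do
          rest when is_binary(rest) and byte_size(rest) == remaining ->
            {:ok, len_field <> rest}

          # Fewer bytes than announced, end of file, or I/O error.
          _ ->
            {:error, :truncated}
        end

      _ ->
        {:error, :invalid_length}
    end
  end
end

# Usage, assuming "marc.iso" is the file from the session above:
# {:ok, file} = File.open("marc.iso")
# {:ok, record} = MarcReader.read_record(file)
```

Reading by byte count rather than with `IO.read(file, :line)` avoids depending on where line breaks happen to fall in the file. `Converter.str2int/1` could be used in place of `Integer.parse/1` here, at the cost of raising on a malformed length field.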