How to read Avro files in Python 3.5.2

I am trying to read Avro files using Python.

I installed Apache Avro by following the instructions here, and I believe the installation succeeded because I am able to "import avro" in the Python shell:

https://avro.apache.org/docs/1.8.1/gettingstartedpython.html

However, when I try to read Avro files using the code from those instructions, I keep getting errors when importing the Avro modules.

>>> import avro.schema
Traceback (most recent call last):
  File "", line 1, in <module>
    import avro.schema
  File "", line 969, in _find_and_load
  File "", line 954, in _find_and_load_unlocked
  File "", line 896, in _find_spec
  File "", line 1139, in find_spec
  File "", line 1115, in _get_spec
  File "", line 1096, in _legacy_get_spec
  File "", line 444, in spec_from_loader
  File "", line 533, in spec_from_file_location
  File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\schema.py", line 340
    except Exception, e:
                    ^
SyntaxError: invalid syntax

>>> from avro.datafile import DataFileReader, DataFileWriter
Traceback (most recent call last):
  File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\datafile.py", line 21, in <module>
    from cStringIO import StringIO
ImportError: No module named 'cStringIO'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in <module>
    from avro.datafile import DataFileReader, DataFileWriter
  File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\datafile.py", line 23, in <module>
    from StringIO import StringIO
ImportError: No module named 'StringIO'

>>> from avro.io import DatumReader, DatumWriter
Traceback (most recent call last):
  File "", line 1, in <module>
    from avro.io import DatumReader, DatumWriter
  File "", line 969, in _find_and_load
  File "", line 954, in _find_and_load_unlocked
  File "", line 896, in _find_spec
  File "", line 1139, in find_spec
  File "", line 1115, in _get_spec
  File "", line 1096, in _legacy_get_spec
  File "", line 444, in spec_from_loader
  File "", line 533, in spec_from_file_location
  File "I:\Program Files\lib\site-packages\avro-_avro_version_-py3.5.egg\avro\io.py", line 200
    bits = (((ord(self.read(1)) & 0xffL)) |
                                      ^
SyntaxError: invalid syntax

So, did I install Avro successfully? Why am I getting these errors? I am using Python 3.5.2 on Windows 7.

Edited

I fixed the import errors by following the suggestion from Stephane Martin. I then tried to read Avro files in Python. I have a number of Avro files in a directory, which I have already set as the working path in Python. Here is my code:

import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

reader = DataFileReader(open("part-00000-of-01733.avro", "r"), DatumReader())
for user in reader:
    print(user)
reader.close()

It fails with the following error:

Traceback (most recent call last):
  File "I:\DJ data\read avro.py", line 5, in <module>
    reader = DataFileReader(open("part-00000-of-01733.avro", "r"), DatumReader())
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\datafile.py", line 349, in __init__
    self._read_header()
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\datafile.py", line 459, in _read_header
    META_SCHEMA, META_SCHEMA, self.raw_decoder)
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 525, in read_data
    return self.read_record(writer_schema, reader_schema, decoder)
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 725, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 515, in read_data
    return self.read_fixed(writer_schema, reader_schema, decoder)
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 568, in read_fixed
    return decoder.read(writer_schema.size)
  File "I:\Program Files\lib\site-packages\avro_python3-1.8.1-py3.5.egg\avro\io.py", line 170, in read
    input_bytes = self.reader.read(n)
  File "I:\Program Files\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 863: character maps to <undefined>
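
The last frames of this traceback are in cp1252.py, which suggests the file was opened in text mode ("r"), so Python tried to decode binary Avro data as Windows-1252 text. The following is a minimal sketch of that effect using only the standard library; the payload and temporary file are made-up stand-ins, not a real Avro file.

```python
import os
import tempfile

# Hypothetical stand-in for an Avro file: the 4-byte Avro magic header
# followed by bytes that include 0x90, which cp1252 cannot decode.
payload = b"Obj\x01" + b"\x90\x00\xff"

with tempfile.NamedTemporaryFile(delete=False, suffix=".avro") as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Text mode with the cp1252 codec reproduces the decoding failure.
    try:
        with open(path, "r", encoding="cp1252") as f:
            f.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode returns the raw bytes without any decoding step.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

print(text_mode_failed)
print(raw == payload)
```

Under that assumption, the likely change to the script above is to pass open("part-00000-of-01733.avro", "rb") to DataFileReader instead of opening the file with "r".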

I am aware that the example in the instructions creates a schema first. But what is an .avsc file? How should I create it, and the corresponding schema, in my case?
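
For context, an .avsc file is an Avro schema written as JSON. The sketch below builds one with the standard json module; the record is modeled on the "User" example from the getting-started guide and is purely illustrative, not the schema of the files in this question.

```python
import json

# Illustrative schema only, modeled on the "User" record in the Avro
# getting-started guide. It is an assumption, not the asker's real schema.
user_schema = {
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
        {"name": "favorite_color", "type": ["string", "null"]},
    ],
}

# The text that would be saved as a file such as user.avsc.
avsc_text = json.dumps(user_schema, indent=2)

# Reading it back is ordinary JSON parsing.
parsed = json.loads(avsc_text)
field_names = [field["name"] for field in parsed["fields"]]
print(field_names)
```

An Avro data file stores the writer's schema in its header, so a separate .avsc file is mainly needed when writing new files, not when reading existing ones.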

Solution

Use the Avro distribution for Python 3 (avro-python3), not the one for Python 2.
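
The three import errors in the question all point to Python 2 source being run under Python 3. As a rough illustration, the snippet below shows the Python 3 spelling of each construct that failed; it is a standalone sketch and not code taken from the Avro library.

```python
import io

# Python 2: `except Exception, e:`  ->  Python 3: `except Exception as e:`
try:
    int("not a number")
except Exception as e:
    error_name = type(e).__name__

# Python 2: `from cStringIO import StringIO`  ->  Python 3: the io module
# (io.BytesIO for binary data, io.StringIO for text).
buffer = io.BytesIO(b"\x90")

# Python 2: `0xffL` (long literal)  ->  Python 3: plain `0xff`,
# because int and long were merged into a single int type.
bits = ord(buffer.read(1)) & 0xff

print(error_name, bits)
```

Because these differences are syntax errors under Python 3, the Python 2 package fails at import time, before any of its code runs.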
