python 括号 数组,将带括号的字符串转换为numpy数组

Description of the problem:

I have an array-like structure in a dataframe column as a string (I read the dataframe from a csv file).

One string element of this column looks like this:

In [1]: df.iloc[0]['points']

Out [2]: '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013), (...)]'

so it's really an array-like structure, which looks 'ready for numpy' to me.

A simple numpy.array() on the string itself, if I copy and paste it in the array() function is returning me a numpy array.

But if I fill the array() function with the variable containing the string like that: np.array(df.iloc[0]['points']) it does not work, giving me a ValueError: could not convert string to float

The question:

Is there any function to do that in a simple way (without replacing or regex-ing the brackets)?

解决方案

You can use ast.literal_eval before passing to numpy.array:

from ast import literal_eval

import numpy as np

x = '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013)]'

res = np.array(literal_eval(x))

print(res)

array([[-0.0426, -0.7231, -0.4207],

[ 0.2116, -0.1733, -0.1013]])

You can do the equivalent with strings in a Pandas series, but it's not clear if you need to aggregate across rows. If this is the case, you can combine a list of NumPy arrays derived using the above logic.

The docs explain types acceptable to literal_eval:

Safely evaluate an expression node or a string containing a Python

literal or container display. The string or node provided may only

consist of the following Python literal structures: strings, bytes,

numbers, tuples, lists, dicts, sets, booleans, and None.

So we are effectively converting a string to a list of tuples, which np.array can then convert to a NumPy array.

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值