Python: Extracting Data for Fixed Column Names - Extracting (Column Name, Data) Arrays from a Pandas DataFrame

Latest recommended article published on 2022-04-10 22:16:31

weixin_39761655

Article tags: python, extracting data for fixed column names

This is my first question on Stack Overflow.

I have a pandas DataFrame like this:

       a  b  c  d
one    0  1  2  3
two    4  5  6  7
three  8  9  0  1
four   2  1  1  5
five   1  1  8  9
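To reproduce the example, the frame can be rebuilt roughly as follows. This is a minimal sketch: the variable name `df` is borrowed from the answer's code, and the integer dtype is an assumption based on the values shown.

```python
import pandas as pd

# Rebuild the example frame from the question.
# The name `df` matches the one used in the answer; integer values are assumed.
df = pd.DataFrame(
    [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 0, 1],
        [2, 1, 1, 5],
        [1, 1, 8, 9],
    ],
    index=["one", "two", "three", "four", "five"],
    columns=["a", "b", "c", "d"],
)

print(df)
```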

I want to extract the (column name, value) pairs whose value is 1, with the pairs for each index label (row) collected into a separate list:

[ [(b,1.0)], [(d,1.0)], [(b,1.0),(c,1.0)], [(a,1.0),(b,1.0)] ]

I want to use gensim, a Python library, which requires the corpus in this form.

Is there a smart way to do this, or a way to use gensim directly on pandas data?

Solution

Many gensim functions accept numpy arrays, so there may be a better way...

In [11]: is_one = np.where(df == 1)

In [12]: is_one
Out[12]: (array([0, 2, 3, 3, 4, 4]), array([1, 3, 1, 2, 0, 1]))

In [13]: df.index[is_one[0]], df.columns[is_one[1]]
Out[13]:
(Index([u'one', u'three', u'four', u'four', u'five', u'five'], dtype='object'),
 Index([u'b', u'd', u'b', u'c', u'a', u'b'], dtype='object'))
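The session above can be reproduced as a standalone script. The sketch below assumes the frame from the question and pairs each matching row label with its column label; the names `rows`, `cols` and `hits` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 1], [2, 1, 1, 5], [1, 1, 8, 9]],
    index=["one", "two", "three", "four", "five"],
    columns=["a", "b", "c", "d"],
)

# On a 2-D boolean mask, np.where returns (row_positions, column_positions).
rows, cols = np.where(df.to_numpy() == 1)

# Translate the integer positions back into index and column labels.
hits = list(zip(df.index[rows], df.columns[cols]))

print(rows, cols)
print(hits)
```

Each entry of `hits` is a (row label, column label) pair for one cell equal to 1, in row-major order; grouping them by row is still needed to get the corpus shape asked for in the question.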

To group the pairs by row, you could use iterrows:

from itertools import repeat

In [21]: [list(zip(df.columns[np.where(row == 1)], repeat(1.0)))
          for label, row in df.iterrows()
          if 1 in row.values]  # if you don't want empty [] for rows without 1
Out[21]:
[[('b', 1.0)],
 [('d', 1.0)],
 [('b', 1.0), ('c', 1.0)],
 [('a', 1.0), ('b', 1.0)]]

In Python 2 the list() call is not required, since zip already returns a list.
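For Python 3, the list comprehension above can be wrapped in a small helper. This is only a sketch: the function name `pairs_equal_to` and its parameters are hypothetical, and it uses a boolean mask on the row instead of `np.where`. The last step, which maps column names to integer ids, rests on the assumption that the gensim model in use expects bag-of-words documents as (integer id, weight) pairs rather than string keys.

```python
from itertools import repeat

import pandas as pd

df = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 1], [2, 1, 1, 5], [1, 1, 8, 9]],
    index=["one", "two", "three", "four", "five"],
    columns=["a", "b", "c", "d"],
)


def pairs_equal_to(frame, value=1, weight=1.0, keep_empty=False):
    """Return one list of (column, weight) pairs per row whose cells equal `value`.

    Hypothetical helper: rows without a match are skipped unless keep_empty is True.
    """
    corpus = []
    for _, row in frame.iterrows():
        # row.index holds the column labels; the boolean mask selects the matches.
        matched = row.index[(row == value).to_numpy()]
        if len(matched) or keep_empty:
            corpus.append(list(zip(matched, repeat(weight))))
    return corpus


corpus = pairs_equal_to(df)
print(corpus)

# Assumption: the downstream model wants integer ids, so map column names to positions.
col_ids = {name: i for i, name in enumerate(df.columns)}
id_corpus = [[(col_ids[name], w) for name, w in doc] for doc in corpus]
print(id_corpus)
```

Passing `keep_empty=True` keeps an empty list for row `two`, which may matter if the documents have to stay aligned with the original index.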
