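This article has two parts: 1. exporting each row of an Excel sheet to a txt file; 2. converting files to UTF-8 with enca.
Exporting each Excel row to a txt file
Background: I have recently been working on a think-tank project and crawled information on a large number of university teachers. After preprocessing, the data was saved as an Excel file. Each teacher's information needs to go through jieba word segmentation and stop-word removal, so the first step is to turn every row of the Excel sheet into its own txt file. I did this with Excel VBA, following a Baidu Jingyan guide.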
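Since the downstream steps (jieba segmentation, stop-word removal) run in Python, the same row-to-file idea can also be sketched there. This is only a minimal illustration, not the method used in this article: the function name `export_rows`, the file-naming scheme, and the assumption that the first row is a header and the first cell identifies the teacher are all hypothetical.

```python
import os
import re


def _cell_text(cell):
    """Render a cell as text; empty cells (None) become an empty string."""
    return "" if cell is None else str(cell)


def export_rows(rows, out_dir):
    """Write each data row to its own UTF-8 .txt file and return the paths.

    Assumptions (not from the original article): rows[0] is the header row,
    and the first cell of every data row can serve as part of the file name.
    """
    header, *data = rows
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, row in enumerate(data, start=1):
        # Replace characters that are unsafe in file names with underscores.
        stem = re.sub(r'[\\/:*?"<>|\s]+', "_", _cell_text(row[0])).strip("_") or "row"
        # Prefix the row index so duplicate names do not overwrite each other.
        path = os.path.join(out_dir, f"{index}_{stem}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(",".join(_cell_text(c) for c in header) + "\n")
            handle.write(",".join(_cell_text(c) for c in row) + "\n")
        paths.append(path)
    return paths
```

The `rows` argument is any list of rows, for example the result of `list(ws.iter_rows(values_only=True))` if the workbook is read with openpyxl. Writing the files as UTF-8 from the start may also reduce how much encoding conversion is needed afterwards.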
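- Open the source file.
- Press Alt+F11 (Option+Fn+F11 on a Mac) to open the VBA editor, find the sheet you want to export in the left pane, and double-click it to open its code window.
- Paste in the following code: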
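Sub txt()
' Export worksheet rows to text files saved in the workbook's folder.
Dim i, j, arr(), brr(), myRow, myCol
' Load the used range of Sheet1 into a 2-D array.
arr = Sheet1.UsedRange.Value
myRow = UBound(arr, 1)
myCol = UBound(arr, 2)
For i = 1 To myRow
' One file per row, named after the first cell; PathSeparator also works on a Mac.
Open ThisWorkbook.Path & Application.PathSeparator & arr(i, 1) & "1.txt" For Output As #1
' Write row 1 of the sheet (the header) as a comma-separated line.
Print #1, Join(Application.Index(arr, 1), ",")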
Print #1, Join(Application.Index(arr