Chapter 8. Files and Data Formatting

Article link: https://blog.csdn.net/2501_90874211/article/details/146331977

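File handling and data formatting are foundational skills for working with data in Python. The key points are summarized below, with examples: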

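**I. File Operations**
1. **Opening a file**
Use the `open()` function together with a `with` statement (recommended, because the file is closed automatically when the block exits):
```python
with open('file.txt', 'r', encoding='utf-8') as f:
    content = f.read()
```
- **Modes**: `r` (read), `w` (write, truncating any existing content), `a` (append), `b` (binary, combined with another mode as in `rb` or `wb`).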
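The difference between the modes is easiest to see by running them against the same file. The following is a minimal sketch; the file name `demo.txt` and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched (illustrative setup).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

# 'w' creates the file, or truncates it if it already exists.
with open(path, 'w', encoding='utf-8') as f:
    f.write('first\n')

# Opening with 'w' a second time discards the previous content.
with open(path, 'w', encoding='utf-8') as f:
    f.write('second\n')

# 'a' keeps the existing content and writes at the end.
with open(path, 'a', encoding='utf-8') as f:
    f.write('third\n')

# 'r' returns decoded text (str); 'rb' returns the raw bytes.
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()
with open(path, 'rb') as f:
    raw = f.read()

print(repr(text))          # 'second\nthird\n'
print(type(raw).__name__)  # bytes
```

Note that `'first\n'` is gone: only append mode preserves what was already in the file.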

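2. **Reading a file**
- `f.read()`: reads the entire content as one string.
- `f.readline()`: reads a single line.
- `f.readlines()`: returns a list of all lines.
- **Efficient line-by-line reading** (the file object is iterated lazily, so only one line is held in memory at a time):
```python
with open('file.txt', 'r', encoding='utf-8') as f:
    for line in f:
        print(line.strip())
```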
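To compare what each read method returns, here is a small sketch that first creates a three-line sample file. The file name `sample.txt` and its contents are assumptions for illustration only.

```python
import os
import tempfile

# Create a small sample file in a temporary directory (illustrative setup).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
with open(path, 'w', encoding='utf-8') as f:
    f.write('alpha\nbeta\ngamma\n')

# read(): the whole file as a single string.
with open(path, 'r', encoding='utf-8') as f:
    whole = f.read()

# readline(): one line per call, including its trailing newline.
with open(path, 'r', encoding='utf-8') as f:
    first = f.readline()
    second = f.readline()

# readlines(): every line at once, as a list of strings.
with open(path, 'r', encoding='utf-8') as f:
    lines = f.readlines()

print(repr(whole))   # 'alpha\nbeta\ngamma\n'
print(repr(first))   # 'alpha\n'
print(lines)         # ['alpha\n', 'beta\n', 'gamma\n']
```

Each line keeps its trailing `\n`, which is why the line-by-line example above calls `strip()` before printing.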

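3. **Writing a file**
- `f.write("text")`: writes a string.
- `f.writelines(list_of_strings)`: writes a list of strings (it does not add line breaks, so each string must end with `\n` itself).
```python
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write("Hello, World!\n")
    f.writelines(["Line 1\n", "Line 2\n"])
```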
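When the strings to be written do not already end in a newline, one possible approach is to append it while writing. This sketch assumes a hypothetical list named `items` and writes to a temporary file.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'output.txt')  # temporary location, for illustration

items = ['Line 1', 'Line 2', 'Line 3']  # hypothetical data without trailing newlines

with open(path, 'w', encoding='utf-8') as f:
    # writelines() accepts any iterable of strings, so a generator
    # expression can add the newline to each item on the fly.
    f.writelines(f'{item}\n' for item in items)

with open(path, 'r', encoding='utf-8') as f:
    written = f.read()

print(repr(written))  # 'Line 1\nLine 2\nLine 3\n'
```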

4. **Exception handling**
Handle errors such as a missing file:
```python
try:
    with open('file.txt', 'r', encoding='utf-8') as f:
        print(f.read())
except FileNotFoundError:
    print("File not found!")
```
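A missing file is not the only thing that can go wrong. As a rough sketch, the same pattern can be extended to other common failures and wrapped in a helper; the function name `read_text_or_default` and its `default` parameter are hypothetical, not part of any library.

```python
import os
import tempfile


def read_text_or_default(path, default=''):
    """Return the file's text, or `default` if it cannot be read (hypothetical helper)."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        print(f'File not found: {path}')
    except PermissionError:
        print(f'No permission to read: {path}')
    except UnicodeDecodeError:
        print(f'File is not valid UTF-8: {path}')
    return default


# Point at a path that does not exist to trigger the first handler.
missing = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')
result = read_text_or_default(missing, default='(empty)')
print(result)  # (empty)
```

Catching specific exception types keeps unrelated bugs visible instead of hiding them behind a bare `except`.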

---

**II. Data Formatting**
1. **String formatting**
- **f-strings (recommended)**:
```python
name = "Alice"
age = 30
print(f"{name} is {age} years old.")
```
- **Format control**:
```python
price = 99.987
print(f"Price: {price:.2f}")  # Output: Price: 99.99
```
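The part after the colon is a format specification, and `.2f` is only one of many options. A few other commonly used ones are sketched below; the variable names and values are made up for illustration.

```python
price = 1234.5
ratio = 0.256
name = 'Alice'

with_commas = f'{price:,.2f}'   # thousands separator plus two decimals
as_percent = f'{ratio:.1%}'     # multiply by 100 and append a percent sign
right_aligned = f'{name:>10}'   # pad on the left to a width of 10
zero_padded = f'{42:05d}'       # integer padded with zeros to width 5

print(with_commas)          # 1,234.50
print(as_percent)           # 25.6%
print(repr(right_aligned))  # '     Alice'
print(zero_padded)          # 00042
```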

2. **Working with JSON**
Use the `json` module:
```python
import json

# Read a JSON file
with open('data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Write a JSON file
data['new_key'] = "value"
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=4)  # indent pretty-prints the output
```
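Data containing non-ASCII text, such as Chinese, is escaped by default when serialized. The sketch below uses the in-memory counterparts `json.dumps` and `json.loads` to show the effect of `ensure_ascii=False`; the `record` dictionary is invented for illustration.

```python
import json

record = {'name': '价格', 'price': 99.99, 'tags': ['a', 'b']}  # hypothetical data

# By default, non-ASCII characters are written as \uXXXX escape sequences.
default_text = json.dumps(record)

# ensure_ascii=False keeps the characters readable in the output.
readable_text = json.dumps(record, ensure_ascii=False)

# Either form parses back to the same Python object.
restored = json.loads(readable_text)

print(default_text)
print(readable_text)
print(restored == record)  # True
```

When writing such output to a file with `json.dump`, opening the file with `encoding='utf-8'` avoids depending on the platform's default encoding.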

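3. **Working with CSV**
Use the `csv` module:
```python
import csv

# Read a CSV file (newline='' lets the csv module handle line endings itself)
with open('data.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['name'], row['age'])

# Write a CSV file
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Age"])
    writer.writerow(["Alice", 30])
```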
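When rows are already dictionaries, `csv.DictWriter` mirrors `csv.DictReader` on the writing side. The following is a minimal round-trip sketch; the file name `people.csv` and the sample rows are assumptions.

```python
import csv
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'people.csv')  # hypothetical file name

rows = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
]

# fieldnames fixes the column order; writeheader() emits the header row.
with open(path, 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'age'])
    writer.writeheader()
    writer.writerows(rows)

# Reading back: every value comes back as a string.
with open(path, 'r', newline='', encoding='utf-8') as f:
    loaded = list(csv.DictReader(f))

print(loaded[0]['name'], loaded[0]['age'])  # Alice 30
```

CSV has no type information, so `age` is read back as the string `'30'` and needs an explicit `int()` conversion if a number is required.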

---

**III. Example Project**
**Counting word frequencies in a text file**:
```python
from collections import defaultdict
import csv
import re

word_count = defaultdict(int)
with open('text.txt', 'r', encoding='utf-8') as f:
    for line in f:
        words = re.findall(r'\b\w+\b', line.lower())  # lowercase the line, then extract the words
        for word in words:
            word_count[word] += 1

# Write the results to a file
with open('word_counts.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(["Word", "Count"])
    for word, count in word_count.items():
        writer.writerow([word, count])
```
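As an alternative to `defaultdict(int)`, the standard library's `collections.Counter` does the tallying itself and can also rank the results. This is a sketch on an in-memory string rather than a file; the sample text is made up.

```python
from collections import Counter
import re

text = 'The cat and the hat. The end.'  # hypothetical sample text

# Same tokenization as the example above: lowercase, then extract the words.
words = re.findall(r'\b\w+\b', text.lower())

# Counter tallies the occurrences of each item in an iterable.
counts = Counter(words)

# most_common(n) returns the n most frequent (word, count) pairs.
top_two = counts.most_common(2)

print(counts['the'])  # 3
print(top_two[0])     # ('the', 3)
```

Sorting the output by frequency this way makes the resulting CSV easier to scan than the insertion order used above.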

---

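**IV. Things to Keep in Mind**
1. **Encoding**: always specify the file encoding (for example `encoding='utf-8'`) rather than relying on the platform default.
2. **Paths**: use `pathlib` or `os.path` so that paths work across operating systems.
3. **Large files**: avoid reading everything at once; process the file line by line or in chunks.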
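The last two points can be sketched together: build the path with `pathlib`, then read the file in fixed-size chunks. The helper name `count_bytes`, the 4096-byte chunk size, and the file `big.bin` are all assumptions chosen for illustration.

```python
from pathlib import Path
import tempfile

# The / operator joins path components with the right separator for the OS.
base_dir = Path(tempfile.mkdtemp())
data_path = base_dir / 'data' / 'big.bin'  # hypothetical file
data_path.parent.mkdir(parents=True, exist_ok=True)
data_path.write_bytes(b'x' * 10_000)       # stand-in for a large file


def count_bytes(path, chunk_size=4096):
    """Read a file in fixed-size chunks and return its total size (hypothetical helper)."""
    total = 0
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size bytes per call
            if not chunk:               # an empty result means end of file
                break
            total += len(chunk)
    return total


size = count_bytes(data_path)
print(size)  # 10000
```

Only one chunk is in memory at a time, so the same loop works whether the file is ten kilobytes or ten gigabytes.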

Practicing these file operations and formatting techniques makes it possible to handle a wide range of data tasks efficiently.