Converting Between .py and .ipynb Files

ToTensor

Article link: https://blog.csdn.net/qq_44193969/article/details/119699291

Contents

Preface
1. Direct read and write
2. A closer look
3. The correct approach
4. Result

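Preface

To run Python code in Jupyter Notebook, converting the files by hand takes a lot of time. Since the amount of data will only grow, an automated conversion script is necessary.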

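1. Direct read and write

Like me, many people probably think at first: isn't this just a matter of changing the file extension? How complicated can it be? Just read the file and write it back out under the new name.

Contents of test.py: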

# %%
'''
# This is a test
'''

# %%
def twoSum(nums, target):
    cache = {}
    i = 0
    while i < len(nums):
        right = target-nums[i]
        if cache.get(right) is not None:
            return [cache[right], i]
        else:
            cache[nums[i]] = i
        i += 1
    return []

So I tried the following:

with open('test.py', 'r', encoding='utf8') as f:
    code = f.read()
with open('test1.ipynb', 'w', encoding='utf8') as f:
    f.write(code)

So easy?
I then opened test1.ipynb.
[Image]
To my surprise, it reported an error.
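
The failure can be reproduced without opening Jupyter: a notebook file has to parse as JSON, and plain Python source does not. The following is a minimal sketch. The helper name `looks_like_notebook` is made up for illustration, and it only checks for a top-level `cells` list rather than validating the full notebook schema.

```python
import json


def looks_like_notebook(text):
    """Return True if text parses as JSON and has a top-level 'cells' list.

    Hypothetical helper for illustration; this is not a full schema check.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Plain Python source ends up here: it is not valid JSON.
        return False
    return isinstance(data, dict) and isinstance(data.get('cells'), list)


py_source = "# %%\ndef twoSum(nums, target):\n    return []\n"
print(looks_like_notebook(py_source))

minimal_nb = '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 1}'
print(looks_like_notebook(minimal_nb))
```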

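2. A closer look

So I picked an arbitrary working .ipynb file.
Opening it showed that it is a JSON file. Viewed with the JSON tool in the CSDN browser plugin, it looks like this:
[Image]
The cells field is collapsed in the screenshot. Expanded, it looks like this: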

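This shows that in an .ipynb file the code is stored as a list of lines in the source field of each entry under the cells field, so the conversion is far more than a change of extension.
What we need to do is construct JSON in this format. Only then is a .py file correctly converted to .ipynb.
Conversely, to convert .ipynb to .py, concatenate the cells, writing cells with cell_type=code as code and cells with cell_type=markdown as comments.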
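
To make the structure concrete, here is a minimal sketch of a notebook built as a Python dict and serialized with `json.dumps`. The cell contents are illustrative assumptions in the nbformat 4 layout. A notebook saved by Jupyter itself usually carries more metadata, such as `kernelspec` and `language_info`.

```python
import json

# A hand-written two-cell notebook in the nbformat 4 layout.
notebook = {
    'cells': [
        {
            'cell_type': 'markdown',
            'metadata': {},
            'source': ['# This is a test'],
        },
        {
            'cell_type': 'code',
            'metadata': {},
            'execution_count': None,
            'outputs': [],
            # Source is a list of lines; every line but the last keeps its '\n'.
            'source': ['x = 1\n', 'print(x)'],
        },
    ],
    'metadata': {},
    'nbformat': 4,
    'nbformat_minor': 1,
}

text = json.dumps(notebook, indent=2)

# Reading it back: join each cell's source list to recover the original text.
for cell in json.loads(text)['cells']:
    print(cell['cell_type'], repr(''.join(cell['source'])))
```

Joining the `source` list is the same step the .ipynb to .py direction relies on.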

3. The correct approach

With the approach clear, here is the code:

import json
from os import path

header_comment = '# %%\n'


def nb2py(notebook):
    """Convert a notebook dict to Python source, one '# %%' section per cell."""
    result = []
    cells = notebook['cells']

    for cell in cells:
        cell_type = cell['cell_type']

        if cell_type == 'markdown':
            result.append("%s'''\n%s\n'''" %
                          (header_comment, ''.join(cell['source'])))

        if cell_type == 'code':
            result.append("%s%s" % (header_comment, ''.join(cell['source'])))

    return '\n\n'.join(result)


def py2nb(py_str):
    """Convert '# %%'-delimited Python source to a notebook dict (nbformat 4)."""
    # remove leading header comment
    if py_str.startswith(header_comment):
        py_str = py_str[len(header_comment):]

    cells = []
    chunks = py_str.split('\n\n%s' % header_comment)

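    for chunk in chunks:
        cell_type = 'code'
        # A chunk wrapped in triple single quotes is treated as a markdown cell.
        if chunk.startswith("'''"):
            # Note: strip() removes every leading/trailing quote and newline,
            # not just the ''' delimiters.
            chunk = chunk.strip("'\n")
            cell_type = 'markdown'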

        cell = {
            'cell_type': cell_type,
            'metadata': {},
            'source': chunk.splitlines(True),
        }

        if cell_type == 'code':
            cell.update({'outputs': [], 'execution_count': None})

        cells.append(cell)

    notebook = {
        'cells': cells,
        'metadata': {
            'anaconda-cloud': {},
            'kernelspec': {
                'display_name': 'Python 3',
                'language': 'python',
                'name': 'python3'},
            'language_info': {
                'codemirror_mode': {'name': 'ipython', 'version': 3},
                'file_extension': '.py',
                'mimetype': 'text/x-python',
                'name': 'python',
                'nbconvert_exporter': 'python',
                'pygments_lexer': 'ipython3',
                'version': '3.8.8'}},
        'nbformat': 4,
        'nbformat_minor': 1
    }

    return notebook


def convert(in_file, out_file):
    _, in_ext = path.splitext(in_file)
    _, out_ext = path.splitext(out_file)

    if in_ext == '.ipynb' and out_ext == '.py':
        # Read and write as UTF-8 so non-ASCII text survives on every platform.
        with open(in_file, 'r', encoding='utf8') as f:
            notebook = json.load(f)
        py_str = nb2py(notebook)
        with open(out_file, 'w', encoding='utf8') as f:
            f.write(py_str)

    elif in_ext == '.py' and out_ext == '.ipynb':
        with open(in_file, 'r', encoding='utf8') as f:
            py_str = f.read()
        notebook = py2nb(py_str)
        with open(out_file, 'w', encoding='utf8') as f:
            json.dump(notebook, f, indent=2)

    else:
        raise ValueError('Extensions must be .ipynb and .py or vice versa')


if __name__ == '__main__':
    src_python_path = 'test.py'
    dst_notebook_path = 'test.ipynb'

    convert(in_file=src_python_path, out_file=dst_notebook_path)
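
One limitation of `py2nb` is that it splits on the exact string `'\n\n# %%\n'`. Cells separated by only a single newline are therefore not split, and a marker line with trailing text such as `# %% [markdown]` is not recognized. A possible alternative, offered as a sketch rather than a drop-in replacement, is to split on any line that starts with `# %%` using a regular expression. The names `CELL_MARKER` and `split_cells` are hypothetical.

```python
import re

# Matches a whole marker line: '# %%' plus anything up to the end of the line.
CELL_MARKER = re.compile(r'^# %%.*$\n?', flags=re.MULTILINE)


def split_cells(py_str):
    """Split source text into cell bodies on lines starting with '# %%'."""
    chunks = CELL_MARKER.split(py_str)
    # Trim surrounding newlines and skip empty chunks,
    # such as the empty text before the first marker.
    return [chunk.strip('\n') for chunk in chunks if chunk.strip()]


sample = "# %%\n'''\n# This is a test\n'''\n# %%\nx = 1\n\n\n# %%\nprint(x)\n"
for body in split_cells(sample):
    print(repr(body))
```

This sketch would still split incorrectly if a line inside a string literal begins with `# %%`, so it trades one edge case for another.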

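4. Result

[Image]
Change the language selector in the lower-right corner of the first cell from python to markdown, run the cell, and you will see the expected result.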
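
The result can also be checked without opening the notebook. As a small sketch, the hypothetical helper `cell_types` below lists the type of every cell in a notebook file. Calling it on the converted `test.ipynb` would show whether the first cell was written as markdown. The demo writes a throwaway notebook to a temporary directory so that the snippet runs on its own.

```python
import json
import os
import tempfile


def cell_types(nb_path):
    """Return the cell_type of each cell in the notebook file at nb_path."""
    with open(nb_path, 'r', encoding='utf8') as f:
        notebook = json.load(f)
    return [cell['cell_type'] for cell in notebook['cells']]


# Throwaway notebook used only for this demo (contents are illustrative).
demo = {
    'cells': [
        {'cell_type': 'markdown', 'metadata': {}, 'source': ['# This is a test']},
        {'cell_type': 'code', 'metadata': {}, 'execution_count': None,
         'outputs': [], 'source': ['x = 1']},
    ],
    'metadata': {},
    'nbformat': 4,
    'nbformat_minor': 1,
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.ipynb')
    with open(demo_path, 'w', encoding='utf8') as f:
        json.dump(demo, f)
    types = cell_types(demo_path)

print(types)
```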
