freemap Beginner's Tutorial
The Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting data science projects. A notebook integrates code and its output into a single document that combines visualisations, narrative text, mathematical equations, and other rich media. The intuitive workflow promotes iterative and rapid development, making notebooks an increasingly popular choice at the heart of contemporary data science, analysis, and increasingly science at large. Best of all, as part of the open source Project Jupyter, they are completely free.
The Jupyter project is the successor to the earlier IPython Notebook, which was first published as a prototype in 2010. Although it is possible to use many different programming languages within Jupyter Notebooks, this article will focus on Python as it is the most common use case.
To get the most out of this tutorial you should be familiar with programming, specifically Python and pandas. That said, if you have experience with another language, the Python in this article shouldn’t be too cryptic and pandas should be interpretable. Jupyter Notebooks can also act as a flexible platform for getting to grips with pandas and even Python, as will become apparent in this article.
We will:
- Cover the basics of installing Jupyter and creating your first notebook
- Delve deeper and learn all the important terminology
- Explore how easily notebooks can be shared and published online. Indeed, this article is a Jupyter Notebook! Everything here was written in the Jupyter Notebook environment and you are viewing it in a read-only form.
Example data analysis in a Jupyter Notebook
We will walk through a sample analysis, to answer a real-life question, so you can see how the flow of a notebook makes the task intuitive to work through ourselves, as well as for others to understand when we share it with them.
So, let’s say you’re a data analyst and you’ve been tasked with finding out how the profits of the largest companies in the US changed historically. You find a data set of Fortune 500 companies spanning over 50 years since the list’s first publication in 1955, put together from Fortune’s public archive.
As we shall demonstrate, Jupyter Notebooks are perfectly suited for this investigation. First, let’s go ahead and install Jupyter.
Installation
The easiest way for a beginner to get started with Jupyter Notebooks is by installing Anaconda. Anaconda is the most widely used Python distribution for data science and comes pre-loaded with all the most popular libraries and tools. As well as Jupyter, some of the biggest Python libraries wrapped up in Anaconda include NumPy, pandas and Matplotlib, though the full 1000+ list is exhaustive. This lets you hit the ground running in your own fully stocked data science workshop without the hassle of managing countless installations or worrying about dependencies and OS-specific (read: Windows-specific) installation issues.
To get Anaconda, simply:
- Download the latest version of Anaconda for Python 3 (ignore Python 2.7).
- Install Anaconda by following the instructions on the download page and/or in the executable.
If you are a more advanced user with Python already installed and prefer to manage your packages manually, you can just use pip:
pip3 install jupyter
Creating Your First Notebook
In this section, we’re going to see how to run and save notebooks, familiarise ourselves with their structure, and understand the interface. We’ll become intimate with some core terminology that will steer you towards a practical understanding of how to use Jupyter Notebooks by yourself and set us up for the next section, which steps through an example data analysis and brings everything we learn here to life.
Running Jupyter
On Windows, you can run Jupyter via the shortcut Anaconda adds to your start menu, which will open a new tab in your default web browser that should look something like the following screenshot.
This isn’t a notebook just yet, but don’t panic! There’s not much to it. This is the Notebook Dashboard, specifically designed for managing your Jupyter Notebooks. Think of it as the launchpad for exploring, editing and creating your notebooks.
Be aware that the dashboard will give you access only to the files and sub-folders contained within Jupyter’s start-up directory; however, the start-up directory can be changed. It is also possible to start the dashboard on any system via the command prompt (or terminal on Unix systems) by entering the command jupyter notebook; in this case, the current working directory will be the start-up directory.
The astute reader may have noticed that the URL for the dashboard is something like http://localhost:8888/tree. Localhost is not a website, but indicates that the content is being served from your local machine: your own computer. Jupyter’s Notebooks and dashboard are web apps, and Jupyter starts up a local Python server to serve these apps to your web browser, making it essentially platform independent and opening the door to easier sharing on the web.
The dashboard’s interface is mostly self-explanatory — though we will come back to it briefly later. So what are we waiting for? Browse to the folder in which you would like to create your first notebook, click the “New” drop-down button in the top-right and select “Python 3” (or the version of your choice).
Hey presto, here we are! Your first Jupyter Notebook will open in a new tab; each notebook uses its own tab because you can open multiple notebooks simultaneously. If you switch back to the dashboard, you will see the new file Untitled.ipynb and you should see some green text that tells you your notebook is running.
What is an ipynb File?
It will be useful to understand what this file really is. Each .ipynb file is a text file that describes the contents of your notebook in a format called JSON. Each cell and its contents, including image attachments that have been converted into strings of text, is listed therein along with some metadata. You can edit this yourself (if you know what you are doing!) by selecting “Edit > Edit Notebook Metadata” from the menu bar in the notebook.
You can also view the contents of your notebook files by selecting “Edit” from the controls on the dashboard, but the keyword here is “can”; there’s no reason other than curiosity to do so unless you really know what you are doing.
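To make the JSON structure concrete, here is a minimal sketch that parses a hand-written, stripped-down notebook document with Python's standard json module. The top-level keys shown (cells, metadata, nbformat, nbformat_minor) follow version 4 of the notebook format; the cell contents are made up for illustration, and a real file saved by Jupyter will carry more metadata.

```python
import json

# A hand-written, minimal notebook document (illustrative only).
raw_notebook = """
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": ["# My first notebook"]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {},
      "outputs": [],
      "source": ["print('Hello World!')"]
    }
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 5
}
"""

notebook = json.loads(raw_notebook)

# Each cell records its type and its source text.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
for cell in notebook["cells"]:
    print(cell["cell_type"], "->", "".join(cell["source"]))
```

Because the file is ordinary JSON, any JSON-aware tool can inspect it, although editing it by hand is rarely necessary.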
The notebook interface
Now that you have an open notebook in front of you, its interface will hopefully not look entirely alien; after all, Jupyter is essentially just an advanced word processor. Why not take a look around? Check out the menus to get a feel for it, and take a few moments to scroll down the list of commands in the command palette, which opens from the small button with the keyboard icon (or Ctrl + Shift + P).
There are two fairly prominent terms that you should notice, which are probably new to you: cells and kernels are key both to understanding Jupyter and to what makes it more than just a word processor. Fortunately, these concepts are not difficult to understand.
- A kernel is a “computational engine” that executes the code contained in a notebook document.
- A cell is a container for text to be displayed in the notebook or code to be executed by the notebook’s kernel.
Cells
We’ll return to kernels a little later, but first let’s come to grips with cells. Cells form the body of a notebook. In the screenshot of a new notebook in the section above, that box with the green outline is an empty cell. There are two main cell types that we will cover:
- A code cell contains code to be executed in the kernel and displays its output below.
- A Markdown cell contains text formatted using Markdown and displays its output in-place when it is run.
The first cell in a new notebook is always a code cell. Let’s test it out with a classic hello world example. Type print('Hello World!') into the cell and click the run button or press Ctrl + Enter. The result should look like this:
Hello World!
When you ran the cell, its output will have been displayed below and the label to its left will have changed from In [ ] to In [1]. The output of a code cell also forms part of the document, which is why you can see it in this article. You can always tell the difference between code and Markdown cells because code cells have that label on the left and Markdown cells do not. The “In” part of the label is simply short for “Input,” while the label number indicates when the cell was executed on the kernel; in this case the cell was executed first. Run the cell again and the label will change to In [2] because now the cell was the second to be run on the kernel. It will become clearer why this is so useful later on when we take a closer look at kernels.
From the menu bar, click Insert and select Insert Cell Below to create a new code cell underneath your first and try out the following code to see what happens. Do you notice anything different?
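The cell itself is not reproduced in this copy of the article. Judging by the description that follows it, a minimal cell with the same behaviour (an assumption, not necessarily the original code) would be:

```python
import time

# Pause for three seconds; nothing is printed, so the cell shows no output.
time.sleep(3)
```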
This cell doesn’t produce any output, but it does take three seconds to execute. Notice how Jupyter signifies that the cell is currently running by changing its label to In [*].
In general, the output of a cell comes from any text data specifically printed during the cell’s execution, as well as the value of the last line in the cell, be it a lone variable, a function call, or something else. For example:
def say_hello(recipient):
    return 'Hello, {}!'.format(recipient)

say_hello('Tim')
You’ll find yourself using this almost constantly in your own projects, and we’ll see more of it later on.
Keyboard shortcuts
One final thing you may have observed when running your cells is that their border turned blue, whereas it was green while you were editing. There is always one “active” cell highlighted with a border whose colour denotes its current mode, where green means “edit mode” and blue is “command mode.”
So far we have seen how to run a cell with Ctrl + Enter, but there are plenty more shortcuts. Keyboard shortcuts are a very popular aspect of the Jupyter environment because they facilitate a speedy cell-based workflow. Many of these are actions you can carry out on the active cell when it’s in command mode.
Below, you’ll find a list of some of Jupyter’s keyboard shortcuts. You’re not expected to pick them up immediately, but the list should give you a good idea of what’s possible.
- Toggle between edit and command mode with Esc and Enter, respectively.
- Once in command mode:
  - Scroll up and down your cells with your Up and Down keys.
  - Press A or B to insert a new cell above or below the active cell.
  - M will transform the active cell to a Markdown cell.
  - Y will set the active cell to a code cell.
  - D + D (D twice) will delete the active cell.
  - Z will undo cell deletion.
  - Hold Shift and press Up or Down to select multiple cells at once.
    - With multiple cells selected, Shift + M will merge your selection.
- Ctrl + Shift + -, in edit mode, will split the active cell at the cursor.
- You can also click and Shift + Click in the margin to the left of your cells to select them.
Go ahead and try these out in your own notebook. Once you’ve had a play, create a new Markdown cell and we’ll learn how to format the text in our notebooks.
Markdown
Markdown is a lightweight, easy to learn markup language for formatting plain text. Its syntax corresponds closely to HTML tags, so some prior knowledge here would be helpful but is definitely not a prerequisite. Remember that this article was written in a Jupyter notebook, so all of the narrative text and images you have seen so far were achieved in Markdown. Let’s cover the basics with a quick example.
# This is a level 1 heading

## This is a level 2 heading

This is some plain text that forms a paragraph. Add emphasis via **bold** and __bold__, or *italic* and _italic_.

Paragraphs must be separated by an empty line.

* Sometimes we want to include lists.
    * Which can be indented.

1. Lists can also be numbered.
2. For ordered lists.

[It is possible to include hyperlinks](https://www.example.com)

Inline code uses single backticks: `foo()`, and code blocks use triple backticks:
```
bar()
```
Or can be indented by 4 spaces:

    foo()

And finally, adding images is easy: ![Alt text](https://www.example.com/image.jpg)
When attaching images, you have three options:
- Use a URL to an image on the web.
- Use a local URL to an image that you will be keeping alongside your notebook, such as in the same git repo.
- Add an attachment via “Edit > Insert Image”; this will convert the image into a string and store it inside your notebook’s .ipynb file.
  - Note that this will make your .ipynb file much larger!
There is plenty more detail to Markdown, especially around hyperlinking, and it’s also possible to simply include plain HTML. Once you find yourself pushing the limits of the basics above, you can refer to the official guide from the creator, John Gruber, on his website.
Kernels
Behind every notebook runs a kernel. When you run a code cell, that code is executed within the kernel and any output is returned to the cell to be displayed. The kernel’s state persists over time and between cells; it pertains to the document as a whole and not individual cells.
For example, if you import libraries or declare variables in one cell, they will be available in another. In this way, you can think of a notebook document as being somewhat comparable to a script file, except that it is multimedia. Let’s try this out to get a feel for it. First, we’ll import a Python package and define a function.
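The cell is missing from this copy of the article. A sketch consistent with the names used afterwards (np and square) might look like this; the body of square is an assumption:

```python
import numpy as np

def square(x):
    """Return x multiplied by itself."""
    return x * x
```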
Once we’ve executed the cell above, we can reference np and square in any other cell.
x = np.random.randint(1, 10)
y = square(x)
print('%d squared is %d' % (x, y))
Once a cell has been run, the names it defines are available regardless of the order of the cells in your notebook. You can try it yourself; let’s print out our variables again.
print('Is %d squared is %d?' % (x, y))
No surprises here! But now let’s change y.
y = 10
What do you think will happen if we run the cell containing our print statement again? With x equal to 4 in this run, we will get the output: Is 4 squared is 10?
Most of the time, the flow in your notebook will be top-to-bottom, but it’s common to go back to make changes. In this case, the order of execution stated to the left of each cell, such as In [6], will let you know whether any of your cells have stale output. And if you ever wish to reset things, there are several incredibly useful options from the Kernel menu:
- Restart: restarts the kernel, thus clearing all the variables etc that were defined.
- Restart & Clear Output: same as above but will also wipe the output displayed below your code cells.
- Restart & Run All: same as above but will also run all your cells in order from first to last.
If your kernel is ever stuck on a computation and you wish to stop it, you can choose the Interrupt option.
Choosing a kernel
You may have noticed that Jupyter gives you the option to change kernel, and in fact there are many different options to choose from. Back when you created a new notebook from the dashboard by selecting a Python version, you were actually choosing which kernel to use.
Not only are there kernels for different versions of Python, but also for over 100 languages including Java, C, and even Fortran. Data scientists may be particularly interested in the kernels for R and Julia, as well as both imatlab and the Calysto MATLAB Kernel for MATLAB. The SoS kernel provides multi-language support within a single notebook. Each kernel has its own installation instructions, but will likely require you to run some commands on your computer.
Example analysis
Now we’ve looked at what a Jupyter Notebook is, it’s time to look at how they’re used in practice, which should give you a clearer understanding of why they are so popular. It’s finally time to get started with that Fortune 500 data set mentioned earlier. Remember, our goal is to find out how the profits of the largest companies in the US changed historically.
It’s worth noting that everyone will develop their own preferences and style, but the general principles still apply, and you can follow along with this section in your own notebook if you wish, which gives you the scope to play around.
Naming your notebooks
Before you start writing your project, you’ll probably want to give it a meaningful name. Perhaps somewhat confusingly, you cannot name or rename your notebooks from the notebook app itself, but must use either the dashboard or your file browser to rename the .ipynb file. We’ll head back to the dashboard to rename the file you created earlier, which will have the default notebook file name Untitled.ipynb.
You cannot rename a notebook while it is running, so you’ve first got to shut it down. The easiest way to do this is to select “File > Close and Halt” from the notebook menu. However, you can also shut down the kernel either by going to “Kernel > Shutdown” from within the notebook app or by selecting the notebook in the dashboard and clicking “Shutdown” (see image below).
You can then select your notebook and click “Rename” in the dashboard controls.
Note that closing the notebook tab in your browser will not “close” your notebook in the way closing a document in a traditional application will. The notebook’s kernel will continue to run in the background and needs to be shut down before it is truly “closed” — though this is pretty handy if you accidentally close your tab or browser! If the kernel is shut down, you can close the tab without worrying about whether it is still running or not.
Once you’ve named your notebook, open it back up and we’ll get going.
Setup
It’s common to start off with a code cell specifically for imports and setup, so that if you choose to add or change anything, you can simply edit and re-run the cell without causing any side-effects.
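The setup cell is not shown in this copy of the article. The sketch below matches the description that follows; the aliases pd, plt and sns are the conventional ones, and the seaborn style is an assumption. The line magic is written as a comment so that the snippet also runs as a plain Python script; inside a notebook, uncomment it.

```python
# %matplotlib inline  (line magic: uncomment when running inside a notebook)
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Apply seaborn's dark-grid theme to all subsequent Matplotlib charts.
sns.set(style="darkgrid")
```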
We import pandas to work with our data, Matplotlib to plot charts, and Seaborn to make our charts prettier. It’s also common to import NumPy but in this case, although we use it via pandas, we don’t need to import it explicitly. And that first line isn’t a Python command, but uses something called a line magic to instruct Jupyter to capture Matplotlib plots and render them in the cell output; this is one of a range of advanced features that are out of the scope of this article.
Let’s go ahead and load our data.
df = pd.read_csv('fortune500.csv')
It’s sensible to also do this in a single cell in case we need to reload it at any point.
Save and Checkpoint
Now we’ve got started, it’s best practice to save regularly. Pressing Ctrl + S will save your notebook by calling the “Save and Checkpoint” command, but what is this checkpoint thing?
Every time you create a new notebook, a checkpoint file is created as well as your notebook file; it will be located within a hidden subdirectory of your save location called .ipynb_checkpoints and is also a .ipynb file. By default, Jupyter will autosave your notebook every 120 seconds to this checkpoint file without altering your primary notebook file. When you “Save and Checkpoint,” both the notebook and checkpoint files are updated. Hence, the checkpoint enables you to recover your unsaved work in the event of an unexpected issue. You can revert to the checkpoint from the menu via “File > Revert to Checkpoint.”
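As a rough illustration of where checkpoints live, the sketch below lists the files in the .ipynb_checkpoints subdirectory of a given folder using only the standard library. The helper name list_checkpoints is hypothetical, and the demo creates a throwaway folder containing an empty placeholder file (its name is an assumption about how checkpoints are named), so it does not touch any real notebooks.

```python
import tempfile
from pathlib import Path

def list_checkpoints(notebook_dir):
    """Return the names of checkpoint files stored under notebook_dir."""
    checkpoint_dir = Path(notebook_dir) / ".ipynb_checkpoints"
    if not checkpoint_dir.is_dir():
        return []
    return sorted(p.name for p in checkpoint_dir.glob("*.ipynb"))

# Demo on a temporary folder that is deleted afterwards.
with tempfile.TemporaryDirectory() as tmp:
    hidden = Path(tmp) / ".ipynb_checkpoints"
    hidden.mkdir()
    (hidden / "Untitled-checkpoint.ipynb").write_text("{}")
    found = list_checkpoints(tmp)

print(found)
```

Pointing list_checkpoints at the folder that holds your own notebooks shows which of them currently have a checkpoint.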
Investigating our data set
Now we’re really rolling! Our notebook is safely saved and we’ve loaded our data set df into the most-used pandas data structure, which is called a DataFrame and basically looks like a table. What does ours look like?
df.head()
| | Year | Rank | Company | Revenue (in millions) | Profit (in millions) |
|---|---|---|---|---|---|
| 0 | 1955 | 1 | General Motors | 9823.5 | 806 |
| 1 | 1955 | 2 | Exxon Mobil | 5661.4 | 584.8 |
| 2 | 1955 | 3 | U.S. Steel | 3250.4 | 195.4 |
| 3 | 1955 | 4 | General Electric | 2959.1 | 212.6 |
| 4 | 1955 | 5 | Esmark | 2510.8 | 19.1 |
df.tail()
| | Year | Rank | Company | Revenue (in millions) | Profit (in millions) |
|---|---|---|---|---|---|
| 25495 | 2005 | 496 | Wm. Wrigley Jr. | 3648.6 | 493 |
| 25496 | 2005 | 497 | Peabody Energy | 3631.6 | 175.4 |
| 25497 | 2005 | 498 | Wendy’s International | 3630.4 | 57.8 |
| 25498 | 2005 | 499 | Kindred Healthcare | 3616.6 | 70.6 |
| 25499 | 2005 | 500 | Cincinnati Financial | 3614.0 | 584 |
Looking good. We have the columns we need, and each row corresponds to a single company in a single year.
Let’s just rename those columns so we can refer to them later.
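The renaming cell is missing from this copy. Given the lower-case column names that appear in the later output (year, rank, company, revenue, profit), one plausible sketch is to assign a new list to df.columns. The one-row frame below merely stands in for the real data so that the snippet runs on its own:

```python
import pandas as pd

# Stand-in for the loaded data, using the original column headings.
df = pd.DataFrame(
    [[1955, 1, "General Motors", 9823.5, "806"]],
    columns=["Year", "Rank", "Company",
             "Revenue (in millions)", "Profit (in millions)"],
)

# Short, lower-case names are easier to type and allow access such as df.profit.
df.columns = ["year", "rank", "company", "revenue", "profit"]
```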
Next, we need to explore our data set. Is it complete? Did pandas read it as expected? Are any values missing?
len(df)
25500
Okay, that looks good: that’s 500 rows for each of the 51 years from 1955 to 2005, inclusive (500 × 51 = 25,500).
Let’s check whether our data set has been imported as we would expect. A simple check is to see if the data types (or dtypes) have been correctly interpreted.
df.dtypes
Uh oh. It looks like there’s something wrong with the profits column: we would expect it to be a float64 like the revenue column. This indicates that it probably contains some non-numeric values, so let’s take a look.
non_numberic_profits = df.profit.str.contains('[^0-9.-]')
df.loc[non_numberic_profits].head()
| | year | rank | company | revenue | profit |
|---|---|---|---|---|---|
| 228 | 1955 | 229 | Norton | 135.0 | N.A. |
| 290 | 1955 | 291 | Schlitz Brewing | 100.0 | N.A. |
| 294 | 1955 | 295 | Pacific Vegetable Oil | 97.9 | N.A. |
| 296 | 1955 | 297 | Liebmann Breweries | 96.0 | N.A. |
| 352 | 1955 | 353 | Minneapolis-Moline | 77.4 | N.A. |
Just as we suspected! Some of the values are strings, which have been used to indicate missing data. Are there any other values that have crept in?
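The cell for this check is not shown in this copy. One way to collect the distinct offending strings is to wrap the filtered column in set; the sketch below does this on a toy profit column (illustrative values, not the Fortune 500 data), reusing the same regular expression, and also counts the matches with len:

```python
import pandas as pd

# Toy column: three numeric strings and two placeholders for missing data.
toy = pd.DataFrame({"profit": ["806", "N.A.", "-19.1", "N.A.", "584.8"]})

# True wherever the string contains a character other than a digit, '.' or '-'.
mask = toy.profit.str.contains("[^0-9.-]")

distinct = set(toy.profit[mask])   # which non-numeric strings occur
count = len(toy.profit[mask])      # how many rows are affected
```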
{'N.A.'}
That makes it easy to interpret, but what should we do? Well, that depends how many values are missing.
369
It’s a small fraction of our data set, though not completely inconsequential as it is still around 1.5% (369 of 25,500 rows). If rows containing N.A. are, roughly, uniformly distributed over the years, the easiest solution would just be to remove them. So let’s have a quick look at the distribution.
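The plotting cell for this step is not included in this copy. As a non-graphical stand-in, the sketch below counts the missing entries per year with value_counts; the toy frame and its values are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "year":   [1955, 1955, 1956, 1956, 1957],
    "profit": ["N.A.", "806", "N.A.", "N.A.", "19.1"],
})
missing = toy.profit.str.contains("[^0-9.-]")

# Number of non-numeric profit entries in each year, in chronological order.
missing_per_year = toy.year[missing].value_counts().sort_index()
```

Years with no missing entries simply do not appear in the result, which is worth remembering when comparing counts across years.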
At a glance, we can see that the most invalid values in a single year is fewer than 25, and as there are 500 data points per year, removing these values would account for less than 5% of the data for the worst years. Indeed, other than a surge around the 90s, most years have fewer than half the missing values of the peak. For our purposes, let’s say this is acceptable and go ahead and remove these rows.
# Keep only rows with numeric profits; copy so the assignment below does not act on a view.
df = df.loc[~non_numberic_profits].copy()
df.profit = pd.to_numeric(df.profit)
We should check that worked.
25131
year         int64
rank         int64
company     object
revenue    float64
profit     float64
dtype: object
Great! We have finished our data set setup.
If you were going to present your notebook as a report, you could get rid of the investigatory cells we created, which are included here as a demonstration of the flow of working with notebooks, and merge relevant cells (see the Advanced Functionality section below for more on this) to create a single data set setup cell. This would mean that if we ever mess up our data set elsewhere, we can just rerun the setup cell to restore it.
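As a rough sketch of what a merged setup cell could look like, the function below loads the data, renames the columns, drops the non-numeric profit rows, and converts the column in one place. The CSV text, the header names, and the function name `load_clean_data` are assumptions made up for this illustration; only the five column names come from the output shown above.

```python
import io
import pandas as pd

# Stand-in for the CSV file on disk (made-up rows for illustration).
RAW_CSV = io.StringIO(
    'Year,Rank,Company,Revenue (in millions),Profit (in millions)\n'
    '1955,1,Alpha Corp,9823.5,806\n'
    '1955,2,Beta Inc,5661.4,N.A.\n'
    '1956,1,Alpha Corp,12443.3,1189.5\n'
)

def load_clean_data(source):
    """Load the data set and return it with a numeric profit column."""
    df = pd.read_csv(source)
    df.columns = ['year', 'rank', 'company', 'revenue', 'profit']
    # Drop rows whose profit is not a number, then convert the column.
    non_numberic_profits = df.profit.astype(str).str.contains('[^0-9.-]')
    df = df.loc[~non_numberic_profits].copy()
    df['profit'] = pd.to_numeric(df['profit'])
    return df

df = load_clean_data(RAW_CSV)
print(len(df))
print(df.dtypes)
```

If `source` were a file path rather than an in-memory buffer, rerunning this one cell would rebuild `df` from scratch whenever it gets mangled elsewhere in the notebook.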
Plotting with matplotlib
Next, we can get to addressing the question at hand by plotting the average profit by year. We might as well plot the revenue too, so first we can define some variables and a helper function to reduce our code.
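The cell with those definitions is not reproduced here. The following is a sketch of what they might look like, inferred from how `x`, `y1`, `y2`, `group_by_year`, and `plot` are used in the cells that follow. The toy DataFrame and the body of `plot` are assumptions, not the original code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-in for the cleaned data set.
df = pd.DataFrame({
    'year': [1955, 1955, 1956, 1956],
    'revenue': [100.0, 200.0, 150.0, 250.0],
    'profit': [10.0, 30.0, 20.0, 40.0],
})

# Yearly means of revenue and profit.
group_by_year = df.loc[:, ['year', 'revenue', 'profit']].groupby('year')
avgs = group_by_year.mean()
x = avgs.index
y1 = avgs.profit
y2 = avgs.revenue

def plot(x, y, ax, title, y_label):
    """Draw one line chart on the given axes with a title and y label."""
    ax.set_title(title)
    ax.set_ylabel(y_label)
    ax.plot(x, y)
    ax.margins(x=0, y=0)
```

Passing the axes in explicitly keeps the helper reusable for both single plots and side-by-side subplots.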
Now let’s plot!
```python
fig, ax = plt.subplots()
plot(x, y1, ax, 'Increase in mean Fortune 500 company profits from 1955 to 2005', 'Profit (millions)')
```
Wow, that looks like an exponential, but it’s got some huge dips. They must correspond to the early 1990s recession and the dot-com bubble. It’s pretty interesting to see that in the data. But how come profits recovered to even higher levels after each recession?
Maybe the revenues can tell us more.
That adds another side to the story. Revenues were nowhere near as badly hit; that’s some great accounting work from the finance departments.
With a little help from Stack Overflow, we can superimpose these plots with +/- their standard deviations.
```python
def plot_with_std(x, y, stds, ax, title, y_label):
    # Shade one standard deviation either side of the mean, then draw the line.
    ax.fill_between(x, y - stds, y + stds, alpha=0.2)
    plot(x, y, ax, title, y_label)

fig, (ax1, ax2) = plt.subplots(ncols=2)
title = 'Increase in mean and std Fortune 500 company %s from 1955 to 2005'
# Series.as_matrix() was removed in pandas 1.0; to_numpy() replaces it.
stds1 = group_by_year.std().profit.to_numpy()
stds2 = group_by_year.std().revenue.to_numpy()
plot_with_std(x, y1.to_numpy(), stds1, ax1, title % 'profits', 'Profit (millions)')
plot_with_std(x, y2.to_numpy(), stds2, ax2, title % 'revenues', 'Revenue (millions)')
fig.set_size_inches(14, 4)
fig.tight_layout()
```
That’s staggering: the standard deviations are huge. Some Fortune 500 companies make billions while others lose billions, and the risk has increased along with rising profits over the years. Perhaps some companies perform better than others; are the profits of the top 10% more or less volatile than the bottom 10%?
There are plenty of questions that we could look into next, and it’s easy to see how the flow of working in a notebook matches one’s own thought process, so now it’s time to draw this example to a close. This flow helped us to easily investigate our data set in one place without context switching between applications, and our work is immediately sharable and reproducible. If we wished to create a more concise report for a particular audience, we could quickly refactor our work by merging cells and removing intermediary code.
Sharing your notebooks
When people talk of sharing their notebooks, there are generally two paradigms they may be considering. Most often, individuals share the end result of their work, much like this article itself, which means sharing non-interactive, pre-rendered versions of their notebooks; however, it is also possible to collaborate on notebooks with the aid of version control systems such as Git.
That said, there are some nascent companies popping up on the web offering the ability to run interactive Jupyter Notebooks in the cloud.
Before you share
A shared notebook will appear exactly in the state it was in when you export or save it, including the output of any code cells. Therefore, to ensure that your notebook is share-ready, so to speak, there are a few steps you should take before sharing:
- Click “Cell > All Output > Clear”
- Click “Kernel > Restart & Run All”
- Wait for your code cells to finish executing and check they did so as expected
This will ensure that your notebooks don’t contain intermediary output, don’t have a stale state, and were executed in order at the time of sharing.
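Because an .ipynb file is plain JSON, the same two properties can also be checked or reset from a script. The sketch below is only a complement to the menu steps above: it does not run any code. It operates on a tiny made-up notebook in the nbformat 4 layout, and the helper names `executed_in_order` and `clear_outputs` are hypothetical.

```python
import json

# A tiny notebook in the nbformat 4 JSON layout (made up for illustration).
notebook = {
    'nbformat': 4,
    'nbformat_minor': 2,
    'metadata': {},
    'cells': [
        {'cell_type': 'markdown', 'metadata': {}, 'source': ['# Title']},
        {'cell_type': 'code', 'metadata': {}, 'execution_count': 3,
         'source': ['x = 1'], 'outputs': []},
        {'cell_type': 'code', 'metadata': {}, 'execution_count': 2,
         'source': ['x'], 'outputs': [
             {'output_type': 'execute_result', 'execution_count': 2,
              'data': {'text/plain': ['1']}, 'metadata': {}}]},
    ],
}

def executed_in_order(nb):
    """Return True if the code cells were run top to bottom as 1, 2, 3, ..."""
    counts = [cell.get('execution_count') for cell in nb['cells']
              if cell['cell_type'] == 'code']
    return counts == list(range(1, len(counts) + 1))

def clear_outputs(nb):
    """Remove stored outputs and execution counts from every code cell."""
    for cell in nb['cells']:
        if cell['cell_type'] == 'code':
            cell['outputs'] = []
            cell['execution_count'] = None
    return nb

in_order_before = executed_in_order(notebook)  # cells were run as 3, then 2
clear_outputs(notebook)
serialized = json.dumps(notebook, indent=1)    # text that could be written back to disk
```

A check like `executed_in_order` is a cheap way to spot a notebook whose cells were run out of sequence before it goes out.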
Exporting your notebooks
Jupyter has built-in support for exporting to HTML and PDF as well as several other formats, which you can find from the menu under “File > Download As.” If you wish to share your notebooks with a small private group, this functionality may well be all you need. Indeed, as many researchers in academic institutions are given some public or internal webspace, and because you can export a notebook to an HTML file, Jupyter Notebooks can be an especially convenient way for them to share their results with their peers.
But if sharing exported files doesn’t cut it for you, there are also some immensely popular methods of sharing .ipynb files more directly on the web.
GitHub
With the number of public notebooks on GitHub exceeding 1.8 million by early 2018, it is surely the most popular independent platform for sharing Jupyter projects with the world. GitHub has integrated support for rendering .ipynb files directly, both in repositories and in gists on its website. If you aren’t already aware, GitHub is a code hosting platform for version control and collaboration for repositories created with Git. You’ll need an account to use their services, but standard accounts are free.
Once you have a GitHub account, the easiest way to share a notebook on GitHub doesn’t actually require Git at all. Since 2008, GitHub has provided its Gist service for hosting and sharing code snippets, which each get their own repository. To share a notebook using Gists:
- Sign in and browse to gist.github.com.
- Open your .ipynb file in a text editor, select all and copy the JSON inside.
- Paste the notebook JSON into the gist.
- Give your Gist a filename, remembering to add .ipynb or this will not work.
- Click either “Create secret gist” or “Create public gist.”
This should look something like the following:
If you created a public Gist, you will now be able to share its URL with anyone, and others will be able to fork and clone your work.
Creating your own Git repository and sharing this on GitHub is beyond the scope of this tutorial, but GitHub provides plenty of guides for you to get started on your own.
An extra tip for those using Git is to add an entry to your .gitignore for the hidden .ipynb_checkpoints directories Jupyter creates, so as not to commit checkpoint files unnecessarily to your repo.
NBViewer
Having grown to render hundreds of thousands of notebooks every week by 2015, NBViewer is the most popular notebook renderer on the web. If you already have somewhere to host your Jupyter Notebooks online, be it GitHub or elsewhere, NBViewer will render your notebook and provide a shareable URL along with it. Provided as a free service as part of Project Jupyter, it is available at nbviewer.jupyter.org.
Initially developed before GitHub’s Jupyter Notebook integration, NBViewer allows anyone to enter a URL, Gist ID, or GitHub username/repo/file and it will render the notebook as a webpage. A Gist’s ID is the unique identifier at the end of its URL; for example, the string of characters after the last forward slash in https://gist.github.com/username/50896401c23e0bf417e89cd57e89e1de. If you enter a GitHub username or username/repo, you will see a minimal file browser that lets you explore a user’s repos and their contents.
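To make the Gist ID rule concrete, here is a small sketch that pulls the ID out of a Gist URL and builds an NBViewer address from it. The function names are hypothetical, and the `/gist/<id>` address pattern is an assumption about how NBViewer lays out its URLs, so check the result in a browser before relying on it.

```python
from urllib.parse import urlparse

NBVIEWER_BASE = 'https://nbviewer.jupyter.org'

def gist_id_from_url(gist_url):
    """Return the last path segment of a Gist URL, which is its ID."""
    path = urlparse(gist_url).path
    return path.rstrip('/').rsplit('/', 1)[-1]

def nbviewer_gist_url(gist_url):
    """Build an NBViewer address for a Gist (URL pattern is an assumption)."""
    return f'{NBVIEWER_BASE}/gist/{gist_id_from_url(gist_url)}'

gist = 'https://gist.github.com/username/50896401c23e0bf417e89cd57e89e1de'
print(gist_id_from_url(gist))
print(nbviewer_gist_url(gist))
```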
The URL NBViewer shows when displaying a notebook is constant and based on the URL of the notebook it is rendering, so you can share it with anyone and it will work as long as the original files remain online. NBViewer doesn’t cache files for very long.
Final Thoughts
Starting with the basics, we have come to grips with the natural workflow of Jupyter Notebooks, delved into IPython’s more advanced features, and finally learned how to share our work with friends, colleagues, and the world. And we accomplished all this from a notebook itself!
It should be clear how notebooks promote a productive working experience by reducing context switching and emulating a natural development of thoughts during a project. The power of Jupyter Notebooks should also be evident, and we covered plenty of leads to get you started exploring more advanced features in your own projects.
If you’d like further inspiration for your own Notebooks, Jupyter has put together a gallery of interesting Jupyter Notebooks that you may find helpful and the Nbviewer homepage links to some really fancy examples of quality notebooks. Also check out our list of Jupyter Notebook tips.
Want to learn more about Jupyter Notebooks? We have a guided project you may be interested in.
Translated from: https://www.pybloggers.com/2018/04/jupyter-notebook-for-beginners-a-tutorial/