Machine Learning in Python, From a Julia Perspective

I recently went through some machine learning training with Python, and wanted to reflect a bit on how that compared with my experience of machine learning in Julia.

First a disclaimer: I am not an expert in this field. I am a learner, so this is written from the perspective of a beginner, not an expert. I am also far more competent with Julia, but will try to give a fair assessment of Python.

Benefits of Using Jupyter Notebook

I am an old school developer and I have never been a big fan of notebooks such as Jupyter. I have tended to prefer using a good text editor such as TextMate, Atom, Kakoune or VSCode together with a terminal for the Julia or Python REPL.

[Figure: Jupyter Notebook showing a mathematical equation and plots produced by Python code embedded in text.]

For a while I tried to go through the material we covered using my normal approach. However, what I noticed is that when you are learning a complicated topic with a lot of math, keeping notes of what is going on in a regular text editor does not work very well.

This is where notebooks really shone. Being able to see past results of previously run code, both as numbers and as graphs, was invaluable. You may forget something and have to scroll back for reference. That ability is lost when you have a plain text file of code that has to be re-run each time you want to look at a particular result.

It also helps to be able to look at complex equations beautifully rendered. Your ability to see the bigger picture is severely diminished if you are staring at raw LaTeX code for an equation.

Key Benefits of Python over Julia

While I am a huge Julia fan and will argue Julia has a ton of advantages over Python, there are a number of areas where Python clearly has the edge.

When working with notebooks during training, it is important to be able to keep up. Your trainer may switch to another notebook to show you something, and you want to be able to quickly try the same thing yourself.

Python is an interpreted language, so code starts running right away. The throughput may not be high, but the latency is very low. This means that with Python you get plots and results up really fast when switching notebooks.

Julia has a problem here because it is just-in-time (JIT) compiled. Julia may be several orders of magnitude faster than Python, but it has a latency problem. In particular, getting the first plot up can be quite slow. While you are waiting for your plot, your trainer may have moved on to another subject.

But it gets worse. Not infrequently something goes wrong: your whole notebook dies and the kernel has to be restarted. That is really painful when using a JIT compiler.

Some other benefits are perhaps too obvious to mention. Supplemental literature we may want to use, say on neural networks, will typically favor Python. It has become a sort of industry standard.

Julia Notebook Advantages

While latency worked in Python’s favor, I don’t think the total notebook experience in Python was necessarily better. One problem I observed is that in Python not all code consists of expressions, and not all objects have sensible default visualizations. You often have to create those yourself.

Here I feel Julia really shines. Everything is an expression, and almost every object displays itself in a pretty manner. Often special forms of visualization are tailored to the system you work in; e.g. an object may look different in a notebook compared to a REPL.
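To illustrate the mechanism (a sketch with a made-up Point type, not code from the course), a Julia type can define separate show methods for different display targets, so the very same value renders as plain text in the REPL and as rich HTML in a notebook, and the last expression in a cell displays itself without any print call:

struct Point
    x::Float64
    y::Float64
end

# Plain-text rendering used by the REPL
Base.show(io::IO, ::MIME"text/plain", p::Point) = print(io, "Point($(p.x), $(p.y))")

# Rich HTML rendering picked up by notebook front ends
Base.show(io::IO, ::MIME"text/html", p::Point) =
    print(io, "<b>Point</b>(x = $(p.x), y = $(p.y))")

Point(1.0, 2.0)  # as the last expression, this displays itself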

The concrete outcome of this was that I found my Python teachers constantly had to write print statements with formatting to show the results of code snippets. That is something I rarely find myself doing in Julia. You almost always get a sensible visualization of a computation.

The topic matter we covered was highly mathematical, meaning we usually had equations shown that would later be implemented in code.

Here, in my opinion, Julia has a clear benefit in making it very easy to type the unicode characters used in mathematics. Julia uses LaTeX notation, which scientific programmers will already know: hitting tab simply completes something written in LaTeX to its unicode equivalent. Here Julia benefits from being a modern language created after unicode became widely used. All Julia source code must be UTF-8.
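For example (a small sketch; the <TAB> completions happen interactively in the REPL or a notebook cell, and the sigmoid definition here is just for illustration):

# Typing \sigma<TAB> gives σ, and y\hat<TAB> gives ŷ.
σ(x) = 1 / (1 + exp(-x))   # sigmoid written with its usual mathematical symbol
σ(0.0)                     # 0.5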

ŷ is usually used to indicate a prediction. So a loss function definition in Julia may look like this:

function loss(x, y)
    ŷ = predict(x)
    sum((y .- ŷ).^2)
end

However, in Python one would write y_hat instead, which creates a mismatch between the equation you are looking at and the source code implementing that equation.

Here is another example of defining a simple neural network in Julia’s Flux ML library.

layers = [Dense(10, 5, σ), Dense(5, 2), softmax]

Notice how the sigmoid function σ uses the same unicode symbol that the mathematical equation would use. Here Dense(10, 5, σ) creates a neural network layer with 10 inputs, 5 outputs and a sigmoid activation function.
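As a quick sanity check (a sketch assuming the Flux package is installed, using the same constructor as above), applying such a layer to a 10-element input produces a 5-element output, with every entry squashed into (0, 1) by the sigmoid:

using Flux

layer = Dense(10, 5, σ)   # 10 inputs, 5 outputs, sigmoid activation
x = rand(Float32, 10)
y = layer(x)              # 5-element vector
length(y)                 # 5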

Functional vs Object-Oriented Programming

Julia and Python are languages following entirely different paradigms. Object-oriented thinking is deeply embedded in the DNA of Python. This is natural, as it was developed in a period when the popularity and hype around object-oriented programming was perhaps at its peak.

Julia, in contrast, is a language that was released at the tail end of the object-oriented hype, when functional programming began gaining ground. This difference manifests itself in a profoundly different way of approaching the construction of programs.

Setting Up a Model

This is particularly noticeable when comparing the machine learning libraries PyTorch and Julia’s Flux. With Python you see a preference for defining a neural network model in this fashion, taken from a common PyTorch tutorial:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

PyTorch does, however, contain higher-level abstractions that let you build models in a way that reminds me of the approach used by Flux. Here is an example from the PyTorch documentation. First we set up some inputs:

import torch

N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

This part is very similar to how a model is set up in Flux.

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

y_pred = model(x)

Here is an example of setting up a model in Julia’s Flux. Please note these are not the same models; I am just doing a quick comparison based on the examples provided in the official docs.

x = rand(10)
model = Chain(
    Dense(10, 5, σ),
    Dense(5, 2),
    softmax)

y_pred = model(x)

While these may look quite different, it is worth noting that things like torch.nn.ReLU() are not normal Python functions but special nodes in a PyTorch graph. Contrast this with σ and softmax in the Julia example, which are just normal functions. You can use them outside of Flux. In fact they are not part of the Flux library but part of NNlib, and could be used in any other machine learning library.

Training

Let us compare how training is done. In the PyTorch example we start by defining a loss function and an optimizer (the training algorithm).

The loss function is a measure of how big the difference is between what your model predicts given some input and what the expected output is. Machine learning is basically about minimizing the value of the loss function.

The optimizer, in contrast, is the algorithm for how we adjust our model to reduce the loss on each iteration.
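For intuition, the simplest optimizer is plain gradient descent, which nudges every parameter a small step against its gradient (a minimal hand-rolled sketch, not the Adam optimizer that the examples below actually use):

# One gradient-descent step: move the parameters w against the gradient
# of the loss, scaled by the learning rate.
function sgd_step!(w, grad, learning_rate)
    w .-= learning_rate .* grad
    return w
end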

loss = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

You can see that Flux looks superficially similar. But a key difference is that the Julia loss function is actually just a normal Julia function, while in PyTorch it is some kind of object.

The reduction and agg arguments specify how we reduce or aggregate the values in each batch of data processed. Notice how in PyTorch it is specified as the string 'sum', while in Julia it is simply the normal sum function from the Julia standard library.

loss(x, y) = Flux.Losses.mse(model(x), y, agg=sum)

learning_rate = 1e-4
optimizer = ADAM(learning_rate)
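To underline that point, the aggregator really is just an ordinary Julia function you can swap out (a small sketch reusing Flux.Losses.mse from above; mean comes from the Statistics standard library and is the default aggregator):

using Flux
using Statistics: mean

ŷ = [1.0, 2.0, 3.0]
y = [1.5, 2.0, 2.5]

Flux.Losses.mse(ŷ, y, agg = sum)   # sum of squared errors: 0.5
Flux.Losses.mse(ŷ, y, agg = mean)  # the default: mean squared error ≈ 0.167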

Finally we have a pretty standard loop for performing training in PyTorch.

On each iteration we calculate a prediction. With the target output y we calculate the training loss.

Then we compute the gradient with back-propagation, before we use the optimizer to update the model.

for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    training_loss = loss(y_pred, y)
    if t % 100 == 99:
        print(t, training_loss.item())

    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    training_loss.backward()

    optimizer.step()

There are some similarities with Flux. Often we use the built-in training functions, but here we write our own custom training function to make it easier to compare what is going on.

Here we can see more clearly the difference between the more object-oriented approach of PyTorch and the more functional approach of Flux.

Flux follows more of a pattern of taking inputs and producing outputs, which are then fed into another function. E.g. we calculate the gradient explicitly and store it in gs. In PyTorch we never actually see the gradient; it is all hidden in the training_loss.backward() call.

Next, in Julia, we update the model parameters ps by calling update!. Again, we are very explicit about what we are doing: we tell it what optimizer we are using, what the model parameters are, and what the gradient gs is.

function custom_train!(loss, model_params, data, optimizer)
    local training_loss
    ps = Params(model_params)
    for d in data
        gs = gradient(ps) do
            training_loss = loss(d...)
            return training_loss
        end

        update!(optimizer, ps, gs)
    end
end

custom_train!(loss, params(model), dataset, optimizer)

The Advantage of Going Functional

Personally, what I see as the major advantage of the functional approach is that we limit object mutation, and we can more clearly see what data goes in and what data comes out at each step.

This makes it harder to screw up the sequence of operations. For instance, in the PyTorch example there is nothing stopping me from calling optimizer.step() as the first thing in the loop.

In the Julia example the equivalent is not possible. You cannot call update!(optimizer, ps, gs) until you have produced the gradient gs.

And producing the gradient requires you to compute the training_loss. In effect, the Flux API forces you to do things in the right order.

In PyTorch you must simply remember to do things in the right order. This is something that has frustrated me about object-oriented programming for many years. I would frequently see code like this:

obj.foo()
obj.bar()
obj.qux()

Please note that foo, bar and qux are just the usual nonsense placeholder names for variables and functions, beloved by some programmers and hated by others.

The point is that exactly which functions we are calling is not important. The key is that we keep mutating the object obj, but we cannot actually tell whether we are calling these methods in the right order.

Contrast this with the functional approach:

Object obj = ...

Thingy t = foo(obj)
Doodad d = bar(obj)
Result r = qux(t, d)

I have written this to look a bit like C++. I wanted to show it with static-language syntax, as that makes my point a bit clearer.

In the first case we could have moved obj.qux() to the first line and nobody could tell that this was obviously wrong.

However, in the functional example it is impossible to call qux on the first line, because we don’t have input values of type Doodad and Thingy. And by looking up the docs we can discover that we need to call foo and bar to get those kinds of values.

This works quite well in a dynamic language like Julia. The code fails early if you are not supplying arguments of the right type. But perhaps more importantly, you can easily look up in the documentation what inputs and outputs the functions have.
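Here is a hedged sketch of that idea, using the same made-up names as above: calling qux with the wrong argument types fails immediately with a MethodError, and methods(qux) tells you what it actually accepts:

struct Thingy end
struct Doodad end

foo(obj) = Thingy()
bar(obj) = Doodad()
qux(t::Thingy, d::Doodad) = "result"

qux(foo(0), bar(0))   # works: "result"
# qux(0, 0)           # fails early: MethodError: no method matching qux(::Int64, ::Int64)
methods(qux)          # lists the one method and its argument types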

With the object-oriented approach it is all opaque. You have no idea what objects obj.foo() may produce internally, which can then later be used by another function.

This is something I observed repeatedly while going through Python code examples. A lot of code mutates objects, and you have no idea what is really going on. You cannot clearly see how data flows through the system, or what must be done first and last.

In my mind this hampers understanding and maintainability.

Python Verbosity

Compared to C++, Java and certainly Objective-C, Python is succinct. But to a Julia developer like me, used to very brief code, the Python example code I look at often looks noisy.

I find looking at stuff like this distracting. From a usability point of view, you should not have identical text at the beginning of each line, because it hurts your ability to scan code downwards.

This is a standard recommendation when, for example, writing bulleted lists in presentations: you should begin each line with unique, distinguishing words. Yet here we have to read torch.nn. repeatedly on each line before actually getting to the meat.

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

Remember in the Julia case we simply wrote:

model = Chain(
    Dense(10, 5, σ),
    Dense(5, 2),
    softmax)

Meaning there is far less distracting text. In the Python example, most of the characters on each line are actually repetition.

I know Julia developers and Python developers will disagree somewhat on this. We simply read code in very different ways, I have noticed.

Python developers scan source code for stuff like torch.nn because they want to know where the function or type came from.

This would often be highly misleading in a language like Julia, because one of the key features of Julia is that multiple packages can extend existing functions to handle new data types and arguments. Where something comes from should be abstracted away, much as you don’t annotate every method call with the package it belongs to. Selecting what code to run using dynamic dispatch is kind of the whole point of methods.

Instead, as a Julia developer, my approach to thinking about code is very functional: what flows in and what flows out. If you know that, you can reason about what goes on in between. Thus, looking at the type information of function inputs gives valuable clues when reading code.

Here is a random example from some of my own code. It normalizes the numbers in each column of a table (DataFrame).

function _normalize(normalizer::Function, df::DataFrame)
    result = DataFrame()
    for colname in names(df)
        column = df[:, colname]
        if eltype(column) <: Real
            result[:, colname] = normalizer(column)
        else
            result[:, colname] = column
        end
    end
    result
end

Nothing here really says what package anything is from. But when reading this code I would start with the inputs. Looking at the type annotations ::Function and ::DataFrame immediately gives me a clear idea of what sort of objects this function is operating on.

This helps to make sense of the rest of the code. If I am not sure what these types are, I can look them up in the help documentation.

With this mindset I find that I struggle to read Python code. I look at a function definition but have no idea what is really going into the function. Often you can guess from the argument name, but for more unusual types it can be hard.

Julia developers also don’t always annotate inputs. Often you deal with common types like strings or numbers, where it is not that important to convey what the type is.

But a Python developer will often struggle to read Julia code because they are scanning for packages: which package do the functions used come from?

To me, though, this does not always make sense, since a lot of Python code is object-oriented:

optimizer.step()

What package does the step() method belong to? And what type is optimizer and what package does this type come from? That is impossible to tell.

Python Islands of Functionality vs Julia Composability

As I venture into the Python machine learning landscape, I notice it is made up of a number of separate fiefdoms or islands. You have TensorFlow/Keras, PyTorch, Chainer and others.

They are separate tribes with their own followers, and they don’t talk to each other. You don’t take a piece of functionality, such as an optimizer or a loss function, from PyTorch and put it inside your Keras neural network. These libraries are fundamentally incompatible with each other and live in different realities.

Not only are they isolated from each other, they are also isolated from other Python libraries such as NumPy.

In the Julia world, libraries such as Flux don’t represent islands. A lot of machine learning functionality is shared between different libraries.

The activation functions used in Flux could be used in another machine learning library if you wanted to. They are just normal Julia functions. They are not even part of Flux, but of NNlib.
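For instance (a small sketch assuming the NNlib package is installed; the σ used throughout this article is NNlib’s sigmoid), the activations can be called on plain numbers and arrays with no Flux in sight:

using NNlib

NNlib.sigmoid(0.0)              # 0.5
NNlib.relu(-2.0)                # 0.0
NNlib.softmax([1.0, 2.0, 3.0])  # 3-element vector that sums to 1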

Both TensorFlow and PyTorch use special matrix classes, which they refer to as tensors, unique to each library. They are different from each other and from NumPy matrices. With Flux you use normal Julia matrices. Hence any Julia library built to process standard matrices can be used together with Flux.
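A quick sketch of that (assuming Flux; the numbers are random): an ordinary Julia Matrix goes straight through a Flux layer, one sample per column, and an ordinary Matrix comes back out, ready for any standard-library function:

using Flux
using Statistics: mean
using LinearAlgebra: norm

layer = Dense(10, 5, σ)
X = rand(Float32, 10, 8)   # a plain Julia Matrix: 10 features × 8 samples
Y = layer(X)               # also a plain Matrix, so any matrix library applies
mean(Y), norm(Y)           # standard-library functions, no special tensor type needed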

One example of this benefit is running Flux on a GPU. Consider how this is done in PyTorch:

device = torch.device("cuda:0")
net.to(device)
inputs, labels = data[0].to(device), data[1].to(device)

You can see that the GPU functionality relies on methods implemented on objects provided by PyTorch. In other words, PyTorch was specifically built to support this.

While Flux contains some convenience functions to help you work with the GPU, you would actually have been able to do this even if the creators of Flux had no knowledge of the CUDA package. In this example, none of the Flux functions know they are dealing with arrays on a GPU:

using CUDA

W = cu(rand(2, 5)) # a 2×5 CuArray
b = cu(rand(2))

predict(x) = W*x .+ b
loss(x, y) = sum((predict(x) .- y).^2)

x, y = cu(rand(5)), cu(rand(2)) # Dummy data
loss(x, y) # ~ 3

The cu function is used to convert regular Julia arrays into CUDA arrays living in GPU memory. However, the code for the predict and loss functions is the same, and the rest of Flux doesn’t know or care that the arrays are on a GPU. Flux is not made to work with any specific array type, only with a particular array interface, and CUDA arrays have the same interface as regular Julia arrays.
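Another way to see this genericity (a sketch assuming the CUDA.jl package and a CUDA-capable GPU): the same generic code runs unchanged on a plain Matrix and on its CuArray counterpart:

using CUDA

A = rand(Float32, 4, 4)   # an ordinary Julia Matrix
Agpu = cu(A)              # the same data as a CuArray in GPU memory

sum(abs2, A)              # runs on the CPU
sum(abs2, Agpu)           # same generic code, runs on the GPU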

One benefit I see with this is that there is less to learn in the Julia ecosystem, because you reuse your knowledge more. A lot of the same packages are reused in many different contexts. Switch to another machine learning library and you can still use regular Julia arrays, CUDA, NNlib and so on.

Remark

Many of you may be interested in a more in-depth comparison of PyTorch and Flux, and I will try to do that in a future story. This was primarily a story about what it felt like to work with Python machine learning as a learner.

Translated from: https://medium.com/@Jernfrost/python-experience-in-machine-learning-from-julia-perspective-fe24e42eee4a
