When thinking of data science or machine learning, Python immediately comes to mind. No other production-ready programming language can match its extensive set of libraries (pandas, NumPy, scikit-learn, etc.) paired with proven experimentation tools (Jupyter, Plotly Dash, etc.).
Other ecosystems are trying to catch up in terms of libraries, but when it comes to producing analysis and insight in a short timeframe, almost none has the tooling that still makes Python the champion of productivity.
On the other hand, there is one area where Python lags behind other languages: performance. Even though most of the supporting libraries for data science and machine learning used by Python are written in native languages, it will never have the same performance as pure native code.
In a world where milliseconds (or even microseconds in some fields) matter when delivering information, some projects have to go through the following steps:
- Data scientists write a proof of concept in Python/Jupyter.
- When going to production:
  * either software engineers translate the logic to another language,
  * or data scientists have to expose the logic as a microservice (and therefore need knowledge of API authoring).
Thankfully, .NET now brings the best of both worlds:
- Safety: by default, the code runs in a managed, memory-safe runtime.
- Productive languages: the .NET ecosystem supports dozens of languages that you can choose from (including Python!) and that can talk to each other.
- High performance: thanks to recent changes in .NET Core, C# can outperform Java and come close to the performance of native C++.
- An extensive set of libraries, including dataframes, bindings to NumPy and TensorFlow, and charting libraries.
- A Jupyter-like interactive notebook for F# with support for charts and custom formatters.
In this article, I will show you how to install and use the interactive console notebook for F#. In future articles, I will write about technical implementation details, performance and libraries.
Installing the F# notebook for VS Code
To try out a notebook in Visual Studio Code:
- Install the ionide-fsharp extension.
- Install .NET 5: the extension uses F# 5 syntax and is therefore only compatible with .NET 5.
- Install the F# notebook extension for Visual Studio Code itself.
- Edit the VS Code settings.json as specified in the extension documentation.
- Open the notebook panel with the command Ctrl+Alt+P > “F# Notebook+DataScience: Open Panel”.
- Open an *.fsx file and start coding!
Tip: Alt+Enter executes the current line.
Simple examples
The extension works exactly like the F# Interactive interpreter (FSI), but with an additional panel that displays formatted data.
When one of the Notebook.* helpers is called, a cell is added to the panel. The extension has multiple built-in formatters.
Primitives and markdown
// Ctrl+Alt+P : F# Notebook: Open Panel
Notebook.Text (1+1)
Notebook.Text "Hello world"
Notebook.Markdown """
# Hello, Markdown!
"""
Charts
open XPlot.Plotly
// Ctrl+Alt+P : F# Notebook: Open Panel
let chart =
    Chart.Line
        [ 1, 1
          2, 2 ]
    |> Chart.WithWidth 400
    |> Chart.WithHeight 300
    |> Chart.WithLayout(Layout(title = "my title"))

Notebook.Plotly chart
Maps
// Ctrl+Alt+P : F# Notebook: Open Panel
open XPlot.Plotly
open FSharp.Data

let marginWidth = 50.0
let margin = Margin(l = marginWidth, r = marginWidth, t = marginWidth, b = marginWidth)

// the CSV type provider infers the column names and types from the sample file
type AlcoholConsumption = CsvProvider<"https://raw.githubusercontent.com/plotly/datasets/master/2010_alcohol_consumption_by_country.csv">

let consumption = AlcoholConsumption.Load("https://raw.githubusercontent.com/plotly/datasets/master/2010_alcohol_consumption_by_country.csv")
let locations = consumption.Rows |> Seq.map (fun r -> r.Location)
let z = consumption.Rows |> Seq.map (fun r -> r.Alcohol)

let map =
    Chart.Plot([ Choropleth(locations = locations, locationmode = "country names", z = z, autocolorscale = true) ])
    |> Chart.WithLayout(Layout(title = "Alcohol consumption", width = 700.0, margin = margin, geo = Geo(projection = Projection(``type`` = "mercator"))))

// display chart
Notebook.Plotly map
Dataframes
// Ctrl+Alt+P : F# Notebook: Open Panel
#r "nuget: Microsoft.Data.Analysis"
open Microsoft.Data.Analysis
// reuses `consumption` loaded in the Maps example
let locations, alcohol =
    consumption.Rows
    |> Seq.map (fun row -> row.Location, row.Alcohol)
    |> List.ofSeq
    |> List.unzip

let df =
    new DataFrame(
        new StringDataFrameColumn("location", locations),
        new PrimitiveDataFrameColumn<decimal>("consumption", alcohol)
    )

Notebook.DataFrame df
LaTeX expressions
Notebook.Markdown @"This is cool $$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$ isn't it"
Custom printers
You can also add your own printers that will display the data using a customized format.
open Notebook

fsi.AddPrinter(fun (data : YourType) ->
    ... // Format to string
    |> HTML // or SVG or Markdown or Text
    |> printerNotebook
)

let x = new YourType() // this will automatically print x in the notebook panel
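For comparison, here is a minimal sketch of the same mechanism in plain FSI, without the notebook panel. `fsi.AddPrinter` is the standard F# Interactive hook that takes a function from your type to a string. The `Temperature` type and its formatting are made up for illustration and are not part of the extension.

```fsharp
// Hypothetical type, used only for this illustration
type Temperature = { Celsius : float }

// fsi.AddPrinter registers a 'T -> string function that FSI calls
// whenever it echoes a value of that type
fsi.AddPrinter(fun (t : Temperature) -> sprintf "%.1f °C" t.Celsius)

// FSI echoes this value using the printer registered above
let today = { Celsius = 21.5 }
```

You can paste this into `dotnet fsi` or send it from an *.fsx file with Alt+Enter. The notebook printer shown earlier follows the same pattern, with the extra step of wrapping the result (HTML, SVG, Markdown or Text) and piping it to printerNotebook.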
Conclusion
This extension is not the only one that offers an interactive environment for F#. Other projects offer similar functionality, notably the Jupyter kernel for C#/F#.
But none of them offers the same ease of installation and level of integration with Visual Studio Code (code completion, code lenses, integration with other F# extensions for formatting or code quality, etc.).
That's all for this article. In the next article, I will take a quick tour of machine learning with F#. If you have any questions or just want to chat with me, feel free to leave a comment below or contact me on social media.
Note: at the time of writing, F# 5 is still in preview and has a very nasty bug that freezes autocompletion.
Translated from: https://medium.com/@andriniaina/data-science-in-f-net-with-vscode-6084a04b1521