10 Awesome Programming Projects for Data Scientists

Practice is an essential part of learning. But in my experience learning programming, finding useful tasks and projects to reinforce your skills is tough. This is especially true of programming for data analysis. Finding meaningful and interesting data is really hard if you don’t know where to look.

Luckily, there are plenty of great datasets and projects out there. When searching for your next project, it’s worth considering a couple of things:

  • Not every project has to be complicated. I’ve learned a lot from trying tasks that can be summed up in a single sentence.
  • Projects that aren’t related to ‘data’ in a traditional sense can still help you become a better data scientist/analyst.

With this in mind, I like to explore a mix of quick tasks, longer projects, and collections of datasets and problems. Here are 10 of my favourites.

1. eBay Analysis

Description: Data scraping and analysis project using Python

As a very popular used-item market, eBay has a huge amount of data about the ever-changing prices of all its items and listings. Getting at that data and analysing it is really useful if you want an advantage when buying or selling.

There are a few ways of getting eBay data. I’ve used web scraping libraries such as Python’s BeautifulSoup and, in a separate Python script, the eBay API. Whatever approach you take, here are some things you can do with the data:

  • Analyse price fluctuations of certain items over time
  • Determine whether a given listing can be bought and resold at a profit
  • Predict the sale price of an item based on the content of its title or description

I’ve had a go at a couple of these, and I’ve found it useful when gearing up to buy and sell on eBay. Try it, and perhaps you’ll make a bit of extra money too.
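
As a minimal sketch of the resale idea above, the function below estimates the profit from buying a listing and reselling it at the median of recent sold prices. The sold prices, the fee rate and the shipping cost are all made-up assumptions for illustration, not real eBay figures, and the step that collects the prices (scraper or API) is left out.

```python
from statistics import median


def estimate_resale_profit(buy_price, recent_sold_prices, fee_rate=0.13, shipping_cost=0.0):
    """Estimate the profit from buying at buy_price and reselling at the median sold price.

    fee_rate and shipping_cost are illustrative assumptions, not real eBay figures.
    """
    expected_sale_price = median(recent_sold_prices)
    selling_fees = expected_sale_price * fee_rate
    return expected_sale_price - selling_fees - shipping_cost - buy_price


# Hypothetical sold prices for comparable items, e.g. collected by a scraper.
sold_prices = [100.0, 120.0, 110.0, 130.0, 90.0]
profit = estimate_resale_profit(80.0, sold_prices, fee_rate=0.10, shipping_cost=5.0)
print(f"Estimated profit: {profit:.2f}")
```

The median is used rather than the mean so that a single unusually high or low sale does not skew the estimate.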

2. To-Do List Automation

Description: Life automation project using Python and the Todoist API

If you have a digital to-do list, you can create a script to add and manipulate tasks programmatically. This has been a huge timesaver for me in the past and has helped me keep on top of tasks that don’t repeat regularly.

To get started with this project, I’d suggest using Todoist as your digital to-do list. Todoist offers a straightforward API that’s easy to use with Python, and some helpful documentation. You can then determine the conditions under which you want to add and manipulate to-do list items. You could:

  • Set dynamic reminders to water your plants based on the weather
  • Create an urgent reminder to clean your computer when internal temperatures get too high
  • Remove a task if it’s been overdue for more than a set amount of time (and add another reminder to set yourself more realistic goals)

Once you set your to-do list automation script to run regularly, you should hopefully experience a surge in productivity. At the very least, you’ll get some more experience under your belt.
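
The decision logic behind the third idea can be sketched without touching any API. The example below assumes each task is a plain dictionary with hypothetical "content" and "due" fields, and the 14-day threshold is an arbitrary choice. Fetching tasks from Todoist and deleting the stale ones is deliberately left out.

```python
from datetime import date, timedelta


def split_stale_tasks(tasks, today, max_overdue_days=14):
    """Split tasks into (keep, stale); stale tasks are overdue by more than max_overdue_days."""
    cutoff = today - timedelta(days=max_overdue_days)
    keep, stale = [], []
    for task in tasks:
        if task["due"] < cutoff:
            stale.append(task)
        else:
            keep.append(task)
    return keep, stale


# Hypothetical tasks; a real script would load these from the to-do list service.
tasks = [
    {"content": "Renew passport", "due": date(2024, 1, 1)},
    {"content": "Water plants", "due": date(2024, 1, 25)},
    {"content": "Book dentist", "due": date(2024, 2, 5)},
]
keep, stale = split_stale_tasks(tasks, today=date(2024, 1, 31))
for task in stale:
    print(f"Would remove: {task['content']}")
```

Keeping the rule in a pure function makes it easy to test before wiring it up to a scheduled script.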

3. FizzBuzz

Description: Quick fundamental coding task

FizzBuzz is a very simple problem. But I’ve highlighted it as a standalone project because I think every programmer should try it out and learn how to solve it. The problem is as follows:

Print the integers from 1 to N, but print “Fizz” instead of any integer divisible by 3, “Buzz” instead of any integer divisible by 5, and “FizzBuzz” instead of any integer divisible by both 3 and 5.

While it’s a simple task, it still takes some thought for a novice programmer. There are different ways to solve it too, some of which are of interest to more experienced users. Learning good solutions to this task will help improve your problem-solving and your understanding of what makes good code.
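
One possible solution, shown here as a sketch rather than the definitive answer, builds each line by concatenation so that the "divisible by both" case needs no separate branch. The function name and the choice to return a list are illustrative.

```python
def fizzbuzz(n):
    """Return the FizzBuzz lines for the integers 1 to n as a list of strings."""
    lines = []
    for i in range(1, n + 1):
        word = ""
        if i % 3 == 0:
            word += "Fizz"
        if i % 5 == 0:
            word += "Buzz"
        # Fall back to the number itself when neither rule applies.
        lines.append(word or str(i))
    return lines


for line in fizzbuzz(15):
    print(line)
```

Returning the lines instead of printing inside the loop keeps the function easy to test and reuse.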

4. Advent of Code

Description: Collection of quick programming problems

In December of each year, the Advent of Code initiative uploads a new programming problem every day until Christmas. Can’t wait until next Christmas? No worries — all the problems from each year are archived and available all year round.

I like Advent of Code for a few reasons. The problems get progressively more difficult as the days go on, meaning there’s something for everyone. They cover a broad range of tasks and skills too, exposing you to a variety of unusual tasks and data. If you’re a self-taught programmer, some of them may also push you to improve your computer science knowledge.

5. Project Euler

Description: Collection of mathematical coding problems

Want to put your coding skills to work on interesting recreational maths problems? Project Euler is for you.

Project Euler is a huge collection of mathematical problems aimed at programmers who want to sharpen their skills. These problems range in difficulty, but many don’t require an expert understanding of maths. While Project Euler problems aren’t data-focused, tackling them will no doubt reinforce maths knowledge and problem-solving approaches that will help with handling data.

I recommend starting with some early problems from the archive.
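
To give a flavour of those early problems, the first one in the archive asks for the sum of all multiples of 3 or 5 below a limit. The sketch below is a brute-force approach with an illustrative function name, run on the small worked case from the problem statement (limit 10) rather than the full problem.

```python
def sum_of_multiples(limit, divisors=(3, 5)):
    """Sum every positive integer below limit that is divisible by at least one divisor."""
    return sum(n for n in range(1, limit) if any(n % d == 0 for d in divisors))


print(sum_of_multiples(10))  # 3 + 5 + 6 + 9 = 23
```

Brute force is fine at this scale; later problems reward looking for a closed-form or more efficient approach.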

6. Tidy Tuesdays

Description: Collection of interesting datasets from the R community

In the R community, Tuesdays are an event! Every week on this day, a new dataset is released in the Tidy Tuesdays GitHub repository. These datasets vary widely in subject and are fun to explore and analyse no matter your preferred programming language.

The social aspect of Tidy Tuesdays is also unique. Users are encouraged to share their visualisations and analyses each week with the Twitter hashtag #TidyTuesday. This creates a compilation of interesting takes on the same data, which are often a great inspiration for your own efforts.

7. Kaggle Competitions

Description: Collection of larger projects, with prizes

By now, Kaggle competitions are a mainstay in the data science community. Kaggle is a platform that hosts an online Jupyter notebook environment, as well as tutorials and thousands of free, real datasets. Kaggle also holds competitions where teams attempt to answer questions about a dataset using data science.

The introductory competitions are well documented, but new ones appear on a rolling basis, sometimes rewarding the winning team with a large cash prize. It’s worth taking a look at Kaggle in case you find a project that aligns with your skills.

8. Personal Recommendation Dashboard

Description: Decision automation program using anything from basic logic to machine learning

It’s always satisfying to take the conclusions of your analyses and apply them to something useful. A recommendation engine can do exactly this. Based on some input and a few heuristics, you can make your life easier by getting a program to automate some of your everyday choices.

I’ve previously built a dashboard app with R Shiny that gave me suggestions of what to wear, given factors like temperature and formality. Although I used basic logic to do this, using external data and/or a statistical model is a natural next step. Whatever the approach or recommendation problem you choose, this project could save you time in the long run. Your frontend, backend and design skills will all get a workout too.
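
The "basic logic" version of such a recommender can be as small as a few if/else rules. The sketch below is in Python rather than R Shiny and is not the dashboard described above. The temperature thresholds and outfit names are invented for illustration.

```python
def recommend_outfit(temperature_c, formal):
    """Suggest an outfit from temperature and formality using plain if/else rules."""
    # Hypothetical thresholds in degrees Celsius.
    if temperature_c < 10:
        outer = "coat"
    elif temperature_c < 18:
        outer = "jumper"
    else:
        outer = "no extra layer"
    base = "shirt and trousers" if formal else "t-shirt and jeans"
    return f"{base}, {outer}"


print(recommend_outfit(5, formal=True))
print(recommend_outfit(25, formal=False))
```

Swapping the hard-coded thresholds for a weather feed or a fitted model would be the natural next step mentioned above.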

9. Personal Finance Analysis

Description: Gaining insights into your spending and saving with code

If you track your spending or have a budget, you’ve immediately unlocked a great project. Why not use your analytical knowledge to break down your spending, predict how much you’ll save in the next year, or calculate forecasted returns based on compound interest?
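
As a rough sketch of the compound interest idea, the function below projects a savings balance forward month by month. It assumes interest compounds monthly at one twelfth of the annual rate and that a fixed amount is deposited at the end of each month. The figures in the example are made up.

```python
def forecast_balance(start_balance, monthly_saving, annual_rate, years):
    """Project a balance forward with monthly compounding and end-of-month deposits."""
    balance = start_balance
    monthly_rate = annual_rate / 12
    for _ in range(years * 12):
        # Apply one month of interest, then add that month's saving.
        balance = balance * (1 + monthly_rate) + monthly_saving
    return balance


# Hypothetical figures: 1,000 saved already, 100 added per month, 3% annual interest.
print(f"Balance after 5 years: {forecast_balance(1000, 100, 0.03, 5):.2f}")
```

A loop is used instead of the closed-form annuity formula so that the monthly saving or rate can later be varied from real spending data.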

While some banking apps have budgeting functionality, I find they’re never comprehensive or customisable enough for my liking. Your own financial analyses can go deeper by comparison, and be a useful project. The scope for saving money is huge when you finally quantify how much you spend at the pub.

10. Batch Image Manipulation

Description: Image editing using R packages

Got a bunch of images that you want to edit at the same time? Perhaps to resize them for use on a webpage, or adjust their brightness because they were all a bit overexposed? Automate your image editing process with code!

While editing images doesn’t sound like data analysis, it’s worth remembering that images are still data. They contain a lot of rich information and can be analysed programmatically for a variety of uses. Learning to work with them is therefore pretty valuable. It’s also easier than you might think. I wrote a short tutorial about using the magick package in R for batch image editing, and the whole code is about 10 lines long!

Final Thoughts

The best way to learn anything is to apply it to something you care about. Many of these projects have made my life easier in some way, and all have engaged me as I gained understanding. So open up a new project in your favourite language.

It doesn’t have to be a big project, and it doesn’t have to make you money. It only has to be something new and fun.

Who knows where it might take you.

Translated from: https://towardsdatascience.com/10-awesome-programming-projects-for-data-scientists-d2bf64f72ee4
