Projects Update: April 2016

Over the past year or so I’ve made a conscious effort to put out more public-facing work through this site, some side projects and GitHub. Now that a few of them have started to mature and take shape, I wanted to do the first of what will become an occasionally recurring kind of post: project updates.

The idea is to touch on all of the things I’m working on in one place once every quarter or so. I’ll try to go through what each project is, maybe a bit about its intended purpose, and what my plans for it are. If you’re a reader of this blog and there’s something you’d like to contribute to, reach out to me here or on GitHub.

Side Projects

First off, side projects.

I’ve posted a couple of times recently about www.pedalwrencher.com. That project first launched about this time last year, and more or less ran unattended for 10 of those 12 months. In recent weeks I’ve been rebooting the styling and usability, to maybe grow it into something more useful.

Open Source Projects

I’ve managed to publish a pretty decent collection of projects over the past year or two, some of which are much more actively developed than others, and most of which have been touched on here at one point or another.  In no particular order:

  • git-pandas: this is my most starred repo, and is probably one of the most actively developed as well. I’ve posted quite a bit about it, so I’ll be brief: it is a library for analyzing git-based data using the popular dataframe package, pandas. There are some open issues I’d like to solve here, notably support for glob syntax for ignoring or selecting particular portions of a codebase.
  • categorical_encoding: this library was originally just a script backing a blog post, but eventually grew into a standalone pip-installable package that is being used in production in at least a handful of places that I know about. I’d like to improve testing and performance to help solidify its usefulness in that regard, and maybe one day reduce some of the external requirements (patsy, pandas, statsmodels, etc.) so it can integrate with scikit-learn without so much baggage.
  • pypi-publisher: ppp is a little command line interface for publishing things to PyPI, which I use personally and professionally nearly daily, so maintenance will for sure continue there. I’d like to continue work on integrating support for publishing documentation with gh-pages and sphinx, as well as on verification of deploys.
  • gitnoc: gitnoc was originally a really hacky flask app for playing with git-pandas. Over time it has evolved into a somewhat more polished but still pretty hacky flask app for playing with git-pandas. It has, however, gotten to be pretty capable, and its needs now tend to drive the eventual requirements of git-pandas. I use it personally very regularly, particularly the risk identification pages, so expect continued development.
  • cookiecutter-flask: I haven’t touched this in quite a while, but people seem to continue using it, which is great. Many of my side projects use this, though it’s often a bit heavyweight for what I’m doing.
  • pygeohash: pygeohash is in use in production in a number of places and is stable, but at this point I’m not sure there is anything useful to add to it, so it will be in maintenance mode unless someone suggests something for it.
  • DummyRDD: this particular project is an interesting one because as-is it meets my personal needs but is clearly a long, long, long way from mocking out the whole Spark API. I’d be very interested to get some feedback or help from any contributors; it would be a great way for a python dev to learn the internals of how Apache Spark works.
  • pyculiarity: this is a pure python implementation of Twitter’s time series anomaly detection algorithm. Like a few other projects listed here it’s complete and in production, and will be maintained, but there are no new features planned for it at this stage.
  • petersburg: this is totally a pet project of mine. I’ve posted about it a few times here, and I doubt anyone is using it, but I have some big plans that may make it actually really useful for some practical data science problems.
  • flink-python-examples: this is woefully out of date since the v1.0.0 release of Apache Flink a few weeks ago, and in need of an update. The python API for Flink is not super well supported and barely documented, so it’s a pretty decent time commitment, but I plan to work on that in the coming weeks.
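
To make the glob idea from the git-pandas item a bit more concrete, here is a minimal sketch of what selecting or ignoring parts of a codebase by glob could look like. This is not the git-pandas API: the function name `filter_paths`, its parameters, and the example file paths are all hypothetical, and it only uses `fnmatch` from the Python standard library.

```python
from fnmatch import fnmatchcase


def filter_paths(paths, include_globs=None, ignore_globs=None):
    """Hypothetical helper (not part of git-pandas).

    Keeps a path if it matches at least one include glob (when any are
    given) and matches none of the ignore globs.
    """
    selected = []
    for path in paths:
        if include_globs and not any(fnmatchcase(path, g) for g in include_globs):
            continue
        if ignore_globs and any(fnmatchcase(path, g) for g in ignore_globs):
            continue
        selected.append(path)
    return selected


# Made-up file listing, for illustration only.
paths = [
    "gitpandas/repository.py",
    "gitpandas/utils/plotting.py",
    "tests/test_repository.py",
    "docs/index.rst",
    "setup.py",
]

# Select python files, but leave the test suite out of the analysis.
python_sources = filter_paths(paths, include_globs=["*.py"], ignore_globs=["tests/*"])
print(python_sources)
```

Note that with `fnmatch`, `*` also matches path separators, so `*.py` picks up files in nested directories. Whether that is the right behavior for a real implementation is an open design question.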

Miscellanea

Aside from those above, there are a few projects that I’ve mentioned here, or committed to once or twice, that are probably one-offs or failed experiments. Unless something changes with these, they won’t be in the next update.

Translated from: https://www.pybloggers.com/2016/04/projects-update-april-2016/
