Repost: notes from a cloud computing conference

Fall Forecast 2008, Computing Among the Clouds

Just a quick note on the one-day, cloud-computing-focused conference I just attended, Computing Among the Clouds. Of particular interest to this blog was a presentation by Joe Gregorio on Google App Engine (GAE).


The talk basically covered the GAE tutorial, so I personally didn’t get anything out of it, but I serendipitously sat next to him during the previous talk and had a chance to talk to him during the intermission about a hot-button topic: bulk data loading.


A little context: I recently went looking for a way to set a key_name value using the packaged bulk-loading tools. By default, the bulk-load tools turn CSV rows into Datastore entities using a sequential numerical ID and not a key_name, even when there is a “key_name” column in the type descriptor. I wanted to create entities with varchar keys without having to create a new entity in a custom handler, in an effort to minimize CPU usage during uploads, which is a known problem. In the package google.appengine.ext.bulkload there is a pair of methods that set a key_name if one is defined in the uploaded data, but these are tied to a cryptic mention of a “version-1” format.
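To make the difference concrete, here is a minimal sketch in plain Python. It does not use the GAE SDK or the google.appengine.ext.bulkload API; the function `rows_to_entities`, the `key_column` parameter, and the dict layout standing in for an entity are all hypothetical names made up for illustration. It only shows the two keying behaviors: a sequential numeric ID per row versus a string key taken from a CSV column.

```python
import csv
import io


def rows_to_entities(csv_text, key_column=None):
    """Turn CSV rows into plain dicts that stand in for Datastore entities.

    Hypothetical helper, not part of any GAE API. With key_column=None each
    row gets a sequential numeric ID; otherwise the named column supplies a
    string key and is removed from the entity's properties.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    entities = []
    for next_id, row in enumerate(reader, start=1):
        props = dict(row)
        if key_column and props.get(key_column):
            # Named key: the column value becomes the entity's string key.
            key = ("key_name", props.pop(key_column))
        else:
            # Default behavior: the column is just another property.
            key = ("id", next_id)
        entities.append({"key": key, "properties": props})
    return entities


SAMPLE = "key_name,title\nalpha,First\nbeta,Second\n"

for entity in rows_to_entities(SAMPLE, key_column="key_name"):
    print(entity["key"], entity["properties"])
```

Calling `rows_to_entities(SAMPLE)` without `key_column` mirrors the default described above: the “key_name” column ends up as an ordinary property and the key is a number.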


I asked Joe whether he knew what these methods were about, whether Google was working on better tools for data upload / download / sync, or at least whether he knew what “version 1” format data was or might refer to, and he pleaded ignorance on all counts. Reflecting on this conversation, I think I have to call bullshit here, at the risk of going against my “no negative vibes” mantra. I think he knew exactly what I was talking about and, for whatever reason, was not at liberty to disclose details. That would have been a fine answer by me, frankly.


Why am I calling Joe out about this? Well, mainly because I found the methods only the night before the conference, while researching a project, and he was the GAE representative at the conference. Sorry, but them’s the breaks.


Why do I think he could have given me a more reasonable answer than pleading complete ignorance? Two reasons: (1) Protocol Buffers and (2) the protocol buffer version 2 code for the memcached API that was released on the groups list. Version 1, I think, refers to Protocol Buffers version 1, which has just been upgraded to version 2, and GAE has already announced that the V2 specs are going through QA. My thinking is that this is either someone’s 20% project, or that protocol buffer clients and servers are used internally at Google to load data (or both), and somehow these methods have ended up in the HEAD branch by mistake. There is certainly no released client that talks to these server methods, and nothing elsewhere in the code base, the official API reference, or the articles hints at how the PB loads would work or what is required to make them work.


Now, this is all perfectly understandable, since PB V2 is coming soon to all parts of GAE, and it would be confusing, to say the least, to release some uber-complicated stream protocol that is soon to be replaced. But don’t plead complete ignorance; that’s just insulting.

