MIT 6.00.1x Introduction to Computer Science and Programming Using Python, Problem Set 7

Part I: Data Structure Design

Thanks to fellow student glhezjnucn for the excellent translation.
First, let’s talk about one specific RSS feed: Google News. The URL for the Google News feed is:
http://news.google.com/?output=rss
If you try to load this URL in your browser, you’ll probably see your browser’s interpretation of the XML code generated by the feed. You can view the XML source with your browser’s “View Page Source” function, though it probably will not make much sense to you. Abstractly, whenever you connect to the Google News RSS feed, you receive a list of items. Each entry in this list represents a single news item. In a Google News feed, every entry has the following fields:

  • guid : A globally unique identifier for this news story.
  • title : The news story’s headline.
  • subject : A subject tag for this story (e.g. ‘Top Stories’, or ‘Sports’).
  • summary : A paragraph or so summarizing the news story.
  • link : A link to a web-site with the entire story.
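
To make the field list concrete, here is a minimal sketch of pulling those five fields out of one RSS item with Python's standard xml.etree.ElementTree module. The sample XML is invented for illustration, and mapping subject to a <category> tag and summary to a <description> tag is an assumption about how a feed might name them. The parsing code provided with the problem set may work differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed snippet, made up for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <item>
      <guid>story-001</guid>
      <title>Example headline</title>
      <category>Top Stories</category>
      <description>A short summary of the story.</description>
      <link>http://www.example.com/story-001</link>
    </item>
  </channel>
</rss>"""


def extract_fields(item):
    """Return the five fields of one <item> element as a dict."""
    return {
        'guid': item.findtext('guid'),
        'title': item.findtext('title'),
        'subject': item.findtext('category'),     # assumed tag name
        'summary': item.findtext('description'),  # assumed tag name
        'link': item.findtext('link'),
    }


root = ET.fromstring(SAMPLE_FEED)
# One dict per <item> element found anywhere in the document.
entries = [extract_fields(item) for item in root.iter('item')]
```

Each element of entries is a plain dict keyed by the field names used in the list above. Any tag that is missing from an item comes back as None.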

Generalizing the Problem

This is a little trickier than we’d like it to be, because each of these RSS feeds is structured a little bit differently from the others. So, our goal in Part I is to come up with a unified, standard representation that we’ll use to store a news story.
Why do we want this? When all is said and done, we want an application that aggregates several RSS feeds from various sources and can act on all of them in the exact same way: we should be able to read news stories from various RSS feeds all in one place. If you’ve ever used an RSS feed reader, be assured that it has had to solve the exact problem we’re going to tackle in this pset!

Problem 1

Parsing is the process of turning a data stream into a structured format that is more convenient to work with. We have provided you with code that will retrieve and parse the Google and Yahoo news feeds.
Parsing all of the information that feeds from Google, Yahoo, the New York Times, and others give us is no small feat. So, let’s tackle an easy part of the problem first: pretend that someone has already done the specific parsing, and has left you with variables that contain the following information for a news story:

  • globally unique identifier (GUID) - a string that serves as a unique name for this entry
  • title - a string
  • subject - a string
  • summary - a string
  • link to more content - a string

We want to store this information in an object that we can then pass around in the rest of our program. Your task, in this problem, is to write a class, NewsStory, starting with a constructor that takes (guid, title, subject, summary, link) as arguments and stores them appropriately. NewsStory also needs to contain the following methods:

  • getGuid(self)
  • getTitle(self)
  • getSubject(self)
  • getSummary(self)
  • getLink(self)

Each method should return the appropriate element of an instance. For example, if we have implemented the class and call
test = NewsStory('foo', 'myTitle', 'mySubject', 'some long summary', 'www.example.com')
then test.getGuid() will return 'foo'.
The solution to this problem should be relatively short and very straightforward (please review what get methods should do if you find yourself writing multiple lines of code for each). Once you have implemented NewsStory, all the NewsStory test cases should pass.
To test your class definition, we have provided a test suite in ps7_test.py. You can test your code by loading and running this file. You should see an “OK” for the NewsStory tests if your code is correct. Because ps7.py contains code to run the full RSS scraping system, we suggest you do not try to run ps7.py directly to test your implementation. Instead, in IDLE, you can do the following:

>>> from ps7 import *
>>> test = NewsStory('foo', 'myTitle', 'mySubject', 'some long summary', 'www.example.com')

to load your class definitions and then run your own tests on them.

# Enter your code for NewsStory in this box
class NewsStory(object):
    """A single news story, stored in a feed-independent format."""

    def __init__(self, guid, title, subject, summary, link):
        # All five arguments are strings and are stored unchanged.
        self.guid = guid
        self.title = title
        self.subject = subject
        self.summary = summary
        self.link = link

    # Each getter simply returns the matching stored attribute.
    def getGuid(self):
        return self.guid

    def getTitle(self):
        return self.title

    def getSubject(self):
        return self.subject

    def getSummary(self):
        return self.summary

    def getLink(self):
        return self.link
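
The tests in ps7_test.py are not shown here, so the following is only a sketch of the kind of check such a suite could make, written with Python's standard unittest module. The test class name and the compact copy of NewsStory are assumptions added so the example runs on its own.

```python
import unittest


class NewsStory(object):
    # Compact copy of the class above so this sketch is self-contained.
    def __init__(self, guid, title, subject, summary, link):
        self.guid, self.title, self.subject = guid, title, subject
        self.summary, self.link = summary, link

    def getGuid(self):
        return self.guid

    def getTitle(self):
        return self.title

    def getSubject(self):
        return self.subject

    def getSummary(self):
        return self.summary

    def getLink(self):
        return self.link


class NewsStoryTest(unittest.TestCase):  # hypothetical test class
    def setUp(self):
        self.story = NewsStory('foo', 'myTitle', 'mySubject',
                               'some long summary', 'www.example.com')

    def test_getters_return_constructor_arguments(self):
        self.assertEqual(self.story.getGuid(), 'foo')
        self.assertEqual(self.story.getTitle(), 'myTitle')
        self.assertEqual(self.story.getSubject(), 'mySubject')
        self.assertEqual(self.story.getSummary(), 'some long summary')
        self.assertEqual(self.story.getLink(), 'www.example.com')


# Build and run the suite explicitly instead of calling unittest.main(),
# so the sketch also works when pasted into an interactive session.
suite = unittest.TestLoader().loadTestsFromTestCase(NewsStoryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If every getter returns the value passed to the constructor, result.wasSuccessful() is True.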

Part II: Word Triggers

Given a set of news stories, your program will generate alerts for a subset of those stories. Stories with alerts will be displayed to the user, and the other stories will be silently discarded. We will represent alerting rules as triggers. A trigger is a rule that is evaluated over a single news story and may fire to generate an alert. For example, a simple trigger could fire for every news story whose title contained the word “Microsoft”. Another trigger may be set up to fire for all news stories where the summary contained the word “Boston”. Finally, a more specific trigger could be set up to fire only when a news story contained both the words “Microsoft” and “Boston” in the summary.
In order to simplify our code, we will use object polymorphism. We will define a trigger interface and then write a number of different classes that implement that interface in different ways.
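
As a rough illustration of this polymorphic design, the sketch below defines a base Trigger with an evaluate method and three subclasses that mirror the three examples above. The subclass names (TitleContainsTrigger, SummaryContainsTrigger, AllOfTrigger) are hypothetical and are not the classes the problem set asks for. The case-insensitive substring check is a simplification of "contains the word", and the compact NewsStory is only a stand-in so the example runs on its own.

```python
class NewsStory(object):
    # Compact stand-in for the Part I class so this sketch is self-contained.
    def __init__(self, guid, title, subject, summary, link):
        self.guid, self.title, self.subject = guid, title, subject
        self.summary, self.link = summary, link

    def getTitle(self):
        return self.title

    def getSummary(self):
        return self.summary


class Trigger(object):
    def evaluate(self, story):
        """Return True if an alert should be generated for story."""
        raise NotImplementedError


class TitleContainsTrigger(Trigger):  # hypothetical name
    def __init__(self, word):
        self.word = word.lower()

    def evaluate(self, story):
        # Simplification: substring match, not whole-word match.
        return self.word in story.getTitle().lower()


class SummaryContainsTrigger(Trigger):  # hypothetical name
    def __init__(self, word):
        self.word = word.lower()

    def evaluate(self, story):
        return self.word in story.getSummary().lower()


class AllOfTrigger(Trigger):  # hypothetical name
    def __init__(self, *triggers):
        self.triggers = triggers

    def evaluate(self, story):
        # Fires only when every wrapped trigger fires.
        return all(t.evaluate(story) for t in self.triggers)


stories = [
    NewsStory('guid1', 'Microsoft opens new office', 'Business',
              'The Boston office of Microsoft opens today.',
              'http://www.example.com/1'),
    NewsStory('guid2', 'Weather update', 'Local',
              'Rain expected in Boston.', 'http://www.example.com/2'),
]

both = AllOfTrigger(SummaryContainsTrigger('Microsoft'),
                    SummaryContainsTrigger('Boston'))
# Keep only the stories for which the combined trigger fires.
alerts = [s for s in stories if both.evaluate(s)]
```

Because every trigger exposes the same evaluate method, the filtering line at the end does not need to know which kind of trigger it was given.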

Trigger interface

Each trigger class you define should implement the following interface, either directly or transitively. It must implement the evaluate method that takes a news item (NewsStory object) as an input and returns True if an alert should be generated for that item. We will not directly use the implementation of the Trigger
