DiveIntoPython (4): Objects and Object-Orientation

English book URL:
http://diveintopython.org/toc/index.html

Chapter 5. Objects and Object-Orientation
5.1. Diving In
The example program fileinfo.py reads the metadata of the MP3 files on your own disk. I use a Chinese Windows system, so I changed the end of the .py file like this:
if __name__ == "__main__":
    for info in listDirectory("E:/music/code/", [".mp3"]):
        print "\n".join(["%s=%s" % (k, v) for k, v in info.items()])
        print
The console output looks like this:
album=Last Name
comment=
name=E:/music/code/last name.mp3
title=Last Name
artist=Carrie Underwood
year=
genre=12

album=Љݣɽ֧˓ԭʹո
comment=
name=E:/music/code/Ƞ؉ԣ - Ջʺһۻ.mp3
title=Ջʺһۻ
artist=Ƞû؉ԣ
year=2009
genre=13

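The tags come out garbled when the MP3 metadata is in Chinese, presumably because the raw tag bytes are printed without being decoded from their original encoding.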
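One plausible cause, which is an assumption and was not verified against these files, is that the tag bytes were written in a Chinese encoding such as GBK and then displayed as if they used a different encoding. The sketch below uses Python 3 syntax and a made-up title to show how the same bytes decode with the wrong codec and with the right one.

```python
# Hypothetical tag value; the real contents of the files are not known here.
original_title = u"\u4e2d\u6587\u6b4c\u66f2"   # four Chinese characters

# Assumption: the tag was stored as GBK bytes (two bytes per character).
raw_bytes = original_title.encode("gbk")

# Decoding with an unrelated single-byte codec produces garbled text.
garbled = raw_bytes.decode("latin-1")

# Decoding with the codec used for writing restores the original text.
restored = raw_bytes.decode("gbk")

print(garbled == original_title)
print(restored == original_title)
```

If the assumption holds, decoding the tag values with the correct codec before printing would be the fix; which codec is correct depends on how the files were tagged.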

5.2. Importing Modules Using from module import
Python has two ways of importing modules. Both are useful, and you should know when to use each. One way, import module, you've already seen in Section 2.4, “Everything Is an Object”. The other way accomplishes the same thing, but it has subtle and important differences.
Here is the basic from module import syntax:
from UserDict import UserDict

Example: import module vs. from module import
>>> import types
>>> types.FunctionType
<type 'function'>
>>> FunctionType
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
NameError: name 'FunctionType' is not defined
>>> from types import FunctionType
>>> FunctionType
<type 'function'>

The types module contains no methods; it just has attributes for each Python object type. Note that the attribute, FunctionType, must be qualified by the module name, types.

The from types import FunctionType syntax imports the attribute FunctionType from the types module directly into the local namespace. Now FunctionType can be accessed directly, without reference to types.

If you will be accessing attributes and methods often and don't want to type the module name over and over, use from module import.

If you want to selectively import some attributes and methods but not others, use from module import.

If the module contains attributes or functions with the same name as ones in your module, you must use import module to avoid name conflicts.
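A minimal sketch of the name-conflict case, assuming a local function that deliberately shares its name with math.sqrt (the local sqrt is hypothetical and exists only for illustration). Because the module is imported with import module, both names stay usable.

```python
import math

def sqrt(x):
    """Local function that deliberately shares its name with math.sqrt."""
    return "local sqrt called with %r" % (x,)

# The module-qualified name and the local name do not collide.
local_result = sqrt(16)
module_result = math.sqrt(16)

print(local_result)
print(module_result)
```

Had the code used from math import sqrt after the local definition, the local function would have been replaced by the imported one.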

5.3. Defining Classes
Python is fully object-oriented: you can define your own classes, inherit from your own or built-in classes, and instantiate the classes you've defined.

Example 5.3. The Simplest Python Class
>>> class Loaf:
...     pass
...

The name of this class is Loaf, and it doesn't inherit from any other class. Class names are usually capitalized, EachWordLikeThis, but this is only a convention, not a requirement.

This class doesn't define any methods or attributes, but syntactically, there needs to be something in the definition, so you use pass. This is a Python reserved word that just means “move along, nothing to see here”. It's a statement that does nothing, and it's a good placeholder when you're stubbing out functions or classes.

You probably guessed this, but everything in a class is indented, just like the code within a function, if statement, for loop, and so forth. The first thing not indented is not in the class.

The pass statement in Python is like an empty set of braces ({}) in Java or C.

Example 5.4. Defining the FileInfo Class
>>> from UserDict import UserDict
>>> class FileInfo(UserDict):
...     pass
...

In Python, the ancestor of a class is simply listed in parentheses immediately after the class name. So the FileInfo class is inherited from the UserDict class.

Python supports multiple inheritance. In the parentheses following the class name, you can list as many ancestor classes as you like, separated by commas.
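A short sketch of multiple inheritance. The class names Named, Sized, and FileRecord are hypothetical and do not come from fileinfo.py; they only show two ancestors listed in the parentheses.

```python
class Named(object):
    def describe(self):
        return "name=%s" % self.name

class Sized(object):
    def size_label(self):
        return "size=%d" % self.size

# Two ancestor classes, separated by a comma.
class FileRecord(Named, Sized):
    def __init__(self, name, size):
        self.name = name
        self.size = size

record = FileRecord("test.mp3", 1024)
summary = record.describe() + ", " + record.size_label()
print(summary)
```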

5.3.1. Initializing and Coding Classes
Example 5.5. Initializing the FileInfo Class
>>> class FileInfo(UserDict):
...     "store file metadata"
...     def __init__(self, filename=None):

Classes can (and should) have doc strings too, just like modules and functions.

__init__ is called immediately after an instance of the class is created. It would be tempting but incorrect to call this the constructor of the class. It's tempting, because it looks like a constructor (by convention, __init__ is the first method defined for the class), acts like one (it's the first piece of code executed in a newly created instance of the class), and even sounds like one (“init” certainly suggests a constructor-ish nature). Incorrect, because the object has already been constructed by the time __init__ is called, and you already have a valid reference to the new instance of the class. But __init__ is the closest thing you're going to get to a constructor in Python, and it fills much the same role.

The first argument of every class method, including __init__, is always a reference to the current instance of the class. By convention, this argument is always named self. In the __init__ method, self refers to the newly created object; in other class methods, it refers to the instance whose method was called. Although you need to specify self explicitly when defining the method, you do not specify it when calling the method; Python will add it for you automatically.

By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
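To make the role of self concrete, here is a small sketch with a hypothetical Greeter class. It calls the same method twice: once through the instance, where Python supplies self, and once through the class, where the instance is passed by hand.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        # self is the instance whose method was called
        return "Hello, " + self.name + punctuation

g = Greeter("world")

# Called through the instance: Python passes g as self automatically.
bound_result = g.greet("!")

# Called through the class: the instance must be given explicitly.
explicit_result = Greeter.greet(g, "!")

print(bound_result)
print(explicit_result)
```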

Example 5.6. Coding the FileInfo Class
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename

Note that the __init__ method never returns a value.

5.3.2. Knowing When to Use self and __init__
When defining your class methods, you must explicitly list self as the first argument for each method, including __init__. When you call a method of an ancestor class from within your class, you must include the self argument. But when you call your class method from outside, you do not specify anything for the self argument; you skip it entirely, and Python automatically adds the instance reference for you. I am aware that this is confusing at first; it's not really inconsistent, but it may appear inconsistent because it relies on a distinction (between bound and unbound methods) that you don't know about yet.

I realize that's a lot to absorb, but you'll get the hang of it. All Python classes work the same way, so once you learn one, you've learned them all. If you forget everything else, remember this one thing, because I promise it will trip you up:

__init__ methods are optional, but when you define one, you must remember to explicitly call the ancestor's __init__ method (if it defines one). This is more generally true: whenever a descendant wants to extend the behavior of the ancestor, the descendant method must explicitly call the ancestor method at the proper time, with the proper arguments.
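The rule can be sketched with two hypothetical classes, Base and Child. The descendant calls the ancestor's __init__ and describe explicitly, passing self by hand; the log list is only there to record which __init__ methods ran.

```python
class Base(object):
    def __init__(self):
        self.log = ["Base.__init__"]

    def describe(self):
        return "base"

class Child(Base):
    def __init__(self, label):
        # Explicit call to the ancestor; Python will not do this for you.
        Base.__init__(self)
        self.log.append("Child.__init__")
        self.label = label

    def describe(self):
        # Extend the ancestor's behavior instead of replacing it.
        return Base.describe(self) + "+" + self.label

c = Child("extra")
print(c.log)
print(c.describe())
```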

5.4. Instantiating Classes
Instantiating classes in Python is straightforward. To instantiate a class, simply call the class as if it were a function, passing the arguments that the __init__ method defines. The return value will be the newly created object.

Example 5.7. Creating a FileInfo Instance
>>> import fileinfo
>>> f = fileinfo.FileInfo("/test/test.mp3")
>>> f.__class__
<class fileinfo.FileInfo at 0x0137E180>
>>> f.__doc__
'store file metadata'
>>> f
{'name': '/test/test.mp3'}

You are creating an instance of the FileInfo class (defined in the fileinfo module) and assigning the newly created instance to the variable f.

Every class instance has a built-in attribute, __class__, which is the object's class. Java programmers may be familiar with the Class class, which contains methods like getName and getSuperclass to get metadata information about an object. In Python, this kind of metadata is available directly on the object itself through attributes like __class__, __name__, and __bases__.

You can access the instance's doc string just as with a function or a module. All instances of a class share the same doc string.
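A small sketch of these built-in attributes, using the dict-based variant of FileInfo so that it runs without the fileinfo module (that substitution is an assumption made for the sake of a self-contained example).

```python
class FileInfo(dict):
    "store file metadata"
    def __init__(self, filename=None):
        self["name"] = filename

f = FileInfo("/test/test.mp3")

cls = f.__class__          # the object's class
class_name = cls.__name__  # the class name as a string
bases = cls.__bases__      # tuple of direct ancestor classes
doc = f.__doc__            # doc string, shared by all instances

print(class_name)
print(bases)
print(doc)
```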

5.4.1. Garbage Collection
In general, there is no need to explicitly free instances, because they are freed automatically when the variables assigned to them go out of scope. Memory leaks are rare in Python.

Example 5.8. Trying to Implement a Memory Leak
>>> def leakmem():
...     f = fileinfo.FileInfo('test/test.mp3')
...
>>> for i in range(100):
...     leakmem()
...

No matter how many times you call the leakmem function, it will never leak memory, because every time, Python will destroy the newly created FileInfo instance when leakmem returns and its local variable f goes out of scope.
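One way to observe this is with weak references, which do not keep an object alive. The sketch below assumes CPython, where reference counting frees an object as soon as its last reference disappears; the stand-in FileInfo class and the observed list are hypothetical helpers.

```python
import weakref

class FileInfo(object):
    "minimal stand-in for fileinfo.FileInfo"
    def __init__(self, filename=None):
        self.name = filename

observed = []   # weak references only; they do not keep instances alive

def leakmem():
    f = FileInfo("/test/test.mp3")
    observed.append(weakref.ref(f))
    # f goes out of scope here, so the instance can be freed

for i in range(100):
    leakmem()

# A dead weak reference returns None when called.
still_alive = [ref for ref in observed if ref() is not None]
print(len(still_alive))
```

On other Python implementations the instances may be reclaimed later rather than immediately.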

5.5. Exploring UserDict: A Wrapper Class
As you've seen, FileInfo is a class that acts like a dictionary. To explore this further, let's look at the UserDict class in the UserDict module, which is the ancestor of the FileInfo class. This is nothing special; the class is written in Python and stored in a .py file, just like any other Python code. In particular, it's stored in the lib directory in your Python installation.

In the ActivePython IDE on Windows, you can quickly open any module in your library path by selecting File->Locate... (Ctrl-L).

Example 5.9. Defining the UserDict Class
>>> class UserDict:
...     def __init__(self, dict=None):
...         self.data = {}
...         if dict is not None: self.update(dict)
...

This is the __init__ method that you overrode in the FileInfo class. Note that the argument list in this ancestor class is different than the descendant. That's okay; each subclass can have its own set of arguments, as long as it calls the ancestor with the correct arguments. Here the ancestor class has a way to define initial values (by passing a dictionary in the dict argument) which the FileInfo does not use.

Data attributes are pieces of data held by a specific instance of a class. In this case, each instance of UserDict will have a data attribute data. To reference this attribute from code outside the class, you qualify it with the instance name, instance.data, in the same way that you qualify a function with its module name. To reference a data attribute from within the class, you use self as the qualifier. By convention, all data attributes are initialized to reasonable values in the __init__ method. However, this is not required, since data attributes, like local variables, spring into existence when they are first assigned a value.
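A brief sketch of data attributes with a hypothetical Track class: title is initialized in __init__, while rating springs into existence on first assignment and belongs only to the instance it was assigned on.

```python
class Track(object):
    def __init__(self, title):
        self.title = title      # data attribute initialized in __init__

t = Track("Last Name")
had_rating_before = hasattr(t, "rating")

t.rating = 5                    # created on first assignment, outside __init__
had_rating_after = hasattr(t, "rating")

other = Track("Another")        # a different instance has no rating
print(had_rating_before, had_rating_after, hasattr(other, "rating"))
```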

The update method is a dictionary duplicator: it copies all the keys and values from one dictionary to another. This does not clear the target dictionary first; if the target dictionary already has some keys, the ones from the source dictionary will be overwritten, but others will be left untouched. Think of update as a merge function, not a copy function.

examples:
>>> a
{'1': 'haha', '100': 'heihei'}
>>> b
{'1': 'testb', '3': 'test3', '2': 'test2'}
>>> a.update(b)
>>> a
{'1': 'testb', '100': 'heihei', '2': 'test2', '3': 'test3'}

Example 5.10. UserDict Normal Methods
def clear(self): self.data.clear()
def copy(self):
    if self.__class__ is UserDict:
        return UserDict(self.data)
    import copy
    return copy.copy(self)
def keys(self): return self.data.keys()
def items(self): return self.data.items()
def values(self): return self.data.values()

In Python, you can inherit directly from the dict built-in datatype, as shown in this example. There are three differences here compared to the UserDict version.

Example 5.11. Inheriting Directly from Built-In Datatype dict
>>> class FileInfo(dict):
...     "store file metadata"
...     def __init__(self, filename=None):
...         self["name"] = filename
...

The first difference is that you don't need to import the UserDict module, since dict is a built-in datatype and is always available. The second is that you are inheriting from dict directly, instead of from UserDict.UserDict.

The third difference is subtle but important. Because of the way UserDict works internally, it requires you to manually call its __init__ method to properly initialize its internal data structures. dict does not work like this; it is not a wrapper, and it requires no explicit initialization.
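The third difference can be seen by leaving out the ancestor call on purpose. This sketch uses Python 3 syntax, where UserDict lives in the collections module; BrokenInfo and DictInfo are hypothetical names.

```python
from collections import UserDict   # Python 2: from UserDict import UserDict

class BrokenInfo(UserDict):
    def __init__(self, filename=None):
        # UserDict.__init__(self) is deliberately missing,
        # so the internal self.data dictionary is never created.
        self["name"] = filename

class DictInfo(dict):
    def __init__(self, filename=None):
        # dict needs no explicit initialization.
        self["name"] = filename

try:
    BrokenInfo("/music/test.mp3")
    broken_error = None
except AttributeError as exc:
    broken_error = exc

d = DictInfo("/music/test.mp3")
print(type(broken_error).__name__)
print(d)
```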

5.6. Special Class Methods
5.6.1. Getting and Setting Items
Example 5.12. The __getitem__ Special Method
def __getitem__(self, key): return self.data[key]
>>> f = fileinfo.FileInfo("/music/test.mp3")
>>> f
{'name': '/music/test.mp3'}
>>> f.__getitem__("name")
'/music/test.mp3'
>>> f["name"]
'/music/test.mp3'

The __getitem__ special method looks simple enough. Like the normal methods clear, keys, and values, it just redirects to the dictionary to return its value. But how does it get called? Well, you can call __getitem__ directly, but in practice you wouldn't actually do that.

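The right way to use it is f["name"], which looks just like the syntax for getting a dictionary value; under the covers, Python converts it to the method call f.__getitem__("name"). That's why __getitem__ is a special class method; not only can you call it yourself, you can get Python to call it for you by using the right syntax.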

Of course, Python has a __setitem__ special method to go along with __getitem__, as shown in the next example.

Example 5.13. The __setitem__ Special Method
def __setitem__(self, key, item): self.data[key] = item
>>> f
{'name': '/music/test.mp3'}
>>> f.__setitem__("genre",31)
>>> f
{'genre': 31, 'name': '/music/test.mp3'}
>>> f["genre"] = 28
>>> f
{'genre': 28, 'name': '/music/test.mp3'}

Like the __getitem__ method, __setitem__ simply redirects to the real dictionary self.data to do its work. And like __getitem__, you wouldn't ordinarily call it directly like this; Python calls __setitem__ for you when you use the right syntax.

The assignment f["genre"] = 28 looks like regular dictionary syntax, except of course that f is really a class instance that's trying very hard to masquerade as a dictionary, and __setitem__ is an essential part of that masquerade. This line of code actually calls f.__setitem__("genre", 28) under the covers.

__setitem__ is a special class method because it gets called for you, but it's still a class method. Just as easily as the __setitem__ method was defined in UserDict, you can redefine it in the descendant class to override the ancestor method. This allows you to define classes that act like dictionaries in some ways but define their own behavior above and beyond the built-in dictionary.

Example 5.14. Overriding __setitem__ in MP3FileInfo
def __setitem__(self, key, item):
    if key == "name" and item:
        self.__parse(item)
    FileInfo.__setitem__(self, key, item)

Notice that this __setitem__ method is defined exactly the same way as the ancestor method. This is important, since Python will be calling the method for you, and it expects it to be defined with a certain number of arguments. (Technically speaking, the names of the arguments don't matter; only the number of arguments is important.)

Here's the crux of the entire MP3FileInfo class: if you're assigning a value to the name key, you want to do something extra.

Calling self.__parse will look for a class method defined within the class. This isn't anything new; you reference data attributes the same way.

After doing this extra processing, you want to call the ancestor method. Remember that this is never done for you in Python; you must do it manually. Note that you're calling the immediate ancestor, FileInfo, even though it doesn't have a __setitem__ method. That's okay, because Python will walk up the ancestor tree until it finds a class with the method you're calling, so this line of code will eventually find and call the __setitem__ defined in UserDict.
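The same pattern can be tried without any MP3 files. In this sketch, ExtFileInfo is a hypothetical descendant whose __parse merely records the file extension; it stands in for the real tag parsing, which is not reproduced here. Python 3 syntax is assumed, so UserDict comes from collections.

```python
import os
from collections import UserDict   # Python 2: from UserDict import UserDict

class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename

class ExtFileInfo(FileInfo):
    "hypothetical descendant that also records the file extension"
    def __parse(self, filename):
        # Stand-in for real parsing: store only the extension.
        self["ext"] = os.path.splitext(filename)[1]

    def __setitem__(self, key, item):
        if key == "name" and item:
            self.__parse(item)
        # FileInfo has no __setitem__, so this resolves to UserDict's.
        FileInfo.__setitem__(self, key, item)

info = ExtFileInfo()
before = dict(info)
info["name"] = "d:/data/LastName.mp3"
after = dict(info)
print(before)
print(after)
```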

When accessing data attributes within a class, you need to qualify the attribute name: self.attribute. When calling other methods within a class, you need to qualify the method name: self.method.

Example 5.15. Setting an MP3FileInfo's name
>>> import fileinfo
>>> mp3file = fileinfo.MP3FileInfo()
>>> mp3file
{'name': None}
>>> mp3file["name"] = "d:/data/LastName.mp3"
>>> mp3file
{'album': 'Last Name', 'comment': '', 'name': 'd:/data/LastName.mp3', 'title': 'Last Name', 'artist': 'Carrie Underwood', 'year': '', 'genre': 12}

First, you create an instance of MP3FileInfo, without passing it a filename. (You can get away with this because the filename argument of the __init__ method is optional.) Since MP3FileInfo has no __init__ method of its own, Python walks up the ancestor tree and finds the __init__ method of FileInfo. This __init__ method manually calls the __init__ method of UserDict and then sets the name key to filename, which is None, since you didn't pass a filename. Thus, mp3file initially looks like a dictionary with one key, name, whose value is None.

Now the real fun begins. Setting the name key of mp3file triggers the __setitem__ method on MP3FileInfo (not UserDict), which notices that you're setting the name key with a real value and calls self.__parse. Although you haven't traced through the __parse method yet, you can see from the output that it sets several other keys: album, artist, genre, title, year, and comment.

5.7. Advanced Special Class Methods
Example 5.16. More Special Methods in UserDict
def __repr__(self): return repr(self.data)
def __cmp__(self, dict):
    if isinstance(dict, UserDict):
        return cmp(self.data, dict.data)
    else:
        return cmp(self.data, dict)
def __len__(self): return len(self.data)
def __delitem__(self, key): del self.data[key]

__repr__ is a special method that is called when you call repr(instance). The repr function is a built-in function that returns a string representation of an object. It works on any object, not just class instances. You're already intimately familiar with repr and you don't even know it. In the interactive window, when you type just a variable name and press the ENTER key, Python uses repr to display the variable's value. Go create a dictionary d with some data and then print repr(d) to see for yourself.

__cmp__ is called when you compare class instances. In general, you can compare any two Python objects, not just class instances, by using ==. There are rules that define when built-in datatypes are considered equal; for instance, dictionaries are equal when they have all the same keys and values, and strings are equal when they are the same length and contain the same sequence of characters. For class instances, you can define the __cmp__ method and code the comparison logic yourself, and then you can use == to compare instances of your class and Python will call your __cmp__ special method for you.

__len__ is called when you call len(instance). The len function is a built-in function that returns the length of an object. It works on any object that could reasonably be thought of as having a length. The len of a string is its number of characters; the len of a dictionary is its number of keys; the len of a list or tuple is its number of elements. For class instances, define the __len__ method and code the length calculation yourself, and then call len(instance) and Python will call your __len__ special method for you.

__delitem__ is called when you call del instance[key], which you may remember as the way to delete individual items from a dictionary. When you use del on a class instance, Python calls the __delitem__ special method for you.
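These special methods can be tried on a small wrapper class. The Playlist class below is hypothetical, and the sketch is written for Python 3, where __cmp__ and cmp no longer exist; it therefore defines __eq__ for the == comparison instead.

```python
class Playlist(object):
    "hypothetical wrapper around a plain dictionary"
    def __init__(self):
        self.data = {}

    def __setitem__(self, key, item): self.data[key] = item
    def __repr__(self): return repr(self.data)
    def __len__(self): return len(self.data)
    def __delitem__(self, key): del self.data[key]
    def __eq__(self, other):
        # Python 3 replacement for the == part of __cmp__
        if isinstance(other, Playlist):
            return self.data == other.data
        return self.data == other

p = Playlist()
p["a"] = 1
p["b"] = 2

size_before = len(p)             # calls p.__len__()
del p["a"]                       # calls p.__delitem__("a")
size_after = len(p)
text = repr(p)                   # calls p.__repr__()
equal_to_dict = (p == {"b": 2})  # calls p.__eq__({"b": 2})
print(size_before, size_after, text, equal_to_dict)
```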

In Java, you determine whether two string variables reference the same physical memory location by using str1 == str2. This is called object identity, and it is written in Python as str1 is str2. To compare string values in Java, you would use str1.equals(str2); in Python, you would use str1 == str2.
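The difference between equality and identity shows up clearly with two separately built lists (lists are used here instead of strings to keep the example independent of how an interpreter may reuse string objects).

```python
first = ["Last Name", 2009]
second = ["Last Name", 2009]   # equal contents, separate object
alias = first                  # second name for the same object

same_value = (first == second)    # compares contents
same_object = (first is second)   # compares identity
alias_is_same = (alias is first)

print(same_value, same_object, alias_is_same)
```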

5.8. Introducing Class Attributes
Example 5.17. Introducing Class Attributes
class MP3FileInfo(FileInfo):
    "store ID3v1.0 MP3 tags"
    tagDataMap = {"title"   : (  3,  33, stripnulls),
                  "artist"  : ( 33,  63, stripnulls),
                  "album"   : ( 63,  93, stripnulls),
                  "year"    : ( 93,  97, stripnulls),
                  "comment" : ( 97, 126, stripnulls),
                  "genre"   : (127, 128, ord)}

>>> import fileinfo
>>> fileinfo.MP3FileInfo
<class fileinfo.MP3FileInfo at 0x0137B1B0>
>>> fileinfo.MP3FileInfo.tagDataMap
{'album': (63, 93, <function stripnulls at 0x0135D570>), 'comment': (97, 126, <function stripnulls at 0x0135D570>), 'artist': (33, 63, <function stripnulls at 0x0135D570>), 'title': (3, 33, <function stripnulls at 0x0135D570>), 'year': (93, 97, <function stripnulls at 0x0135D570>), 'genre': (127, 128, <built-in function ord>)}
>>> m = fileinfo.MP3FileInfo()
>>> m.tagDataMap
{'album': (63, 93, <function stripnulls at 0x0135D570>), 'comment': (97, 126, <function stripnulls at 0x0135D570>), 'artist': (33, 63, <function stripnulls at 0x0135D570>), 'title': (3, 33, <function stripnulls at 0x0135D570>), 'year': (93, 97, <function stripnulls at 0x0135D570>), 'genre': (127, 128, <built-in function ord>)}

tagDataMap is a class attribute: literally, an attribute of the class. It is available before creating any instances of the class.

In Java, both static variables (called class attributes in Python) and instance variables (called data attributes in Python) are defined immediately after the class definition (one with the static keyword, one without). In Python, only class attributes can be defined here; data attributes are defined in the __init__ method.

Example 5.18. Modifying Class Attributes
>>> class counter:
...     count = 0
...     def __init__(self):
...         self.__class__.count += 1
...
...
>>> counter
<class __main__.counter at 0x0137B660>
>>> counter.count
0
>>> c = counter()
>>> c.count
1
>>> counter.count
1
>>> d = counter()
>>> d.count
2
>>> c.count
2
>>> counter.count
2

count is a class attribute of the counter class.

Because count is a class attribute, it is available through direct reference to the class, before you have created any instances of the class.

Creating an instance of the class calls the __init__ method, which increments the class attribute count by 1. This affects the class itself, not just the newly created instance.

Creating a second instance will increment the class attribute count again. Notice how the class attribute is shared by the class and all instances of the class.
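A related pitfall is worth a short sketch: incrementing through self instead of self.__class__ creates a data attribute on the instance and leaves the class attribute untouched. The ShadowingCounter class is a hypothetical counterexample.

```python
class Counter(object):
    count = 0
    def __init__(self):
        self.__class__.count += 1   # modifies the shared class attribute

class ShadowingCounter(object):
    count = 0
    def __init__(self):
        # Reads the class attribute (0), then assigns a new
        # data attribute named count on this instance only.
        self.count += 1

a = Counter()
b = Counter()
x = ShadowingCounter()
y = ShadowingCounter()

print(Counter.count)            # shared by the class and all instances
print(ShadowingCounter.count)   # never incremented
print(x.count, y.count)
```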

5.9. Private Functions
Like most languages, Python has the concept of private elements:
Private functions, which can't be called from outside their module
Private class methods, which can't be called from outside their class
Private attributes, which can't be accessed from outside their class

If the name of a Python function, class method, or attribute starts with (but doesn't end with) two underscores, it's private; everything else is public. Python has no concept of protected class methods (accessible only in their own class and descendant classes). Class methods are either private (accessible only in their own class) or public (accessible from anywhere).

In Python, all special methods (like __setitem__) and built-in attributes (like __doc__) follow a standard naming convention: they both start with and end with two underscores. Don't name your own methods and attributes this way, because it will only confuse you (and others) later.

Example 5.19. Trying to Call a Private Method
>>> import fileinfo
>>> m = fileinfo.MP3FileInfo()
>>> m.__parse("d:/data/LastName.mp3")
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AttributeError: MP3FileInfo instance has no attribute '__parse'

If you try to call a private method, Python will raise a slightly misleading exception, saying that the method does not exist. Of course it does exist, but it's private, so it's not accessible outside the class. Strictly speaking, private methods are accessible outside their class, just not easily accessible. Nothing in Python is truly private; internally, the names of private methods and attributes are mangled and unmangled on the fly to make them seem inaccessible by their given names. You can access the __parse method of the MP3FileInfo class by the name _MP3FileInfo__parse. Acknowledge that this is interesting, but promise to never, ever do it in real code. Private methods are private for a reason, but like many other things in Python, their privateness is ultimately a matter of convention, not force.
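For illustration only, name mangling can be observed with a hypothetical Parser class whose private method simply upper-cases its argument.

```python
class Parser(object):
    def __parse(self, text):
        # Stored on the class under the mangled name _Parser__parse.
        return text.upper()

    def run(self, text):
        # Inside the class, self.__parse is mangled the same way, so it resolves.
        return self.__parse(text)

p = Parser()
inside = p.run("abc")

try:
    p.__parse("abc")        # outside the class the name is not mangled
    direct_error = None
except AttributeError as exc:
    direct_error = exc

mangled = p._Parser__parse("abc")   # works, but don't do this in real code
print(inside, type(direct_error).__name__, mangled)
```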

That's it for the hard-core object trickery. You'll see a real-world application of special class methods in Chapter 12, which uses getattr to create a proxy to a remote web service.