DiveIntoPython(三)

DiveIntoPython(三)

英文书地址:
http://diveintopython.org/toc/index.html

Chapter 4. The Power Of Introspection

4.1 Diving In
examples:
def info(object, spacing=10, collapse=1):
"""Print methods and doc strings.

Takes module, class, list, dictionary, or string."""
methodList = [e for e in dir(object) if callable(getattr(object, e))]
processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)
print "\n".join(["%s %s" %
(method.ljust(spacing),
processFunc(str(getattr(object, method).__doc__)))
for method in methodList])

if __name__ == "__main__":
print help.__doc__


The info function has a multi-line doc string that succinctly describes the function's purpose. Note that no return value is mentioned; this function will be used solely for its effects, rather than its value.

It takes any object that has functions or methods (like a module, which has functions, or a list, which has methods) and prints out the functions and their doc strings.

examples----Sample Usage of apihelper.py:
>>> from apihelper import info
>>> li = []
>>> info(li)
__add__ x.__add__(y) <==> x+y
__class__ list() -> new list list(sequence) -> new list initialized from sequence's items
__contains__ x.__contains__(y) <==> y in x
__delattr__ x.__delattr__('name') <==> del x.name
__delitem__ x.__delitem__(y) <==> del x[y]
__delslice__ x.__delslice__(i, j) <==> del x[i:j] Use of negative indices is not supported.
__eq__ x.__eq__(y) <==> x==y
__format__ default object formatter
__ge__ x.__ge__(y) <==> x>=y
__getattribute__ x.__getattribute__('name') <==> x.name
__getitem__ x.__getitem__(y) <==> x[y]
__getslice__ x.__getslice__(i, j) <==> x[i:j] Use of negative indices is not supported.
__gt__ x.__gt__(y) <==> x>y
__iadd__ x.__iadd__(y) <==> x+=y
__imul__ x.__imul__(y) <==> x*=y
__init__ x.__init__(...) initializes x; see x.__class__.__doc__ for signature
__iter__ x.__iter__() <==> iter(x)
__le__ x.__le__(y) <==> x<=y
__len__ x.__len__() <==> len(x)
__lt__ x.__lt__(y) <==> x<y
__mul__ x.__mul__(n) <==> x*n
__ne__ x.__ne__(y) <==> x!=y
__new__ T.__new__(S, ...) -> a new object with type S, a subtype of T
__reduce__ helper for pickle
__reduce_ex__ helper for pickle
__repr__ x.__repr__() <==> repr(x)
__reversed__ L.__reversed__() -- return a reverse iterator over the list
__rmul__ x.__rmul__(n) <==> n*x
__setattr__ x.__setattr__('name', value) <==> x.name = value
__setitem__ x.__setitem__(i, y) <==> x[i]=y
__setslice__ x.__setslice__(i, j, y) <==> x[i:j]=y Use of negative indices is not supported.
__sizeof__ L.__sizeof__() -- size of L in memory, in bytes
__str__ x.__str__() <==> str(x)
__subclasshook__ Abstract classes can override this to customize issubclass(). This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
append L.append(object) -- append object to end
count L.count(value) -> integer -- return number of occurrences of value
extend L.extend(iterable) -- extend list by appending elements from the iterable
index L.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.
insert L.insert(index, object) -- insert object before index
pop L.pop([index]) -> item -- remove and return item at index (default last). Raises IndexError if list is empty or index is out of range.
remove L.remove(value) -- remove first occurrence of value. Raises ValueError if the value is not present.
reverse L.reverse() -- reverse *IN PLACE*
sort L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*; cmp(x, y) -> -1, 0, 1

By default the output is formatted to be easy to read. Multi-line doc strings are collapsed into a single long line, but this option can be changed by specifying 0 for the collapse argument. If the function names are longer than 10 characters, you can specify a larger value for the spacing argument to make the output easier to read.

examples----Advanced Usage of apihelper.py:
>>> import odbchelper
>>> info(odbchelper)
buildConnectionString Build a connection string from a dictionary Returns string.
>>> info(odbchelper,30)
buildConnectionString Build a connection string from a dictionary Returns string.
>>> info(odbchelper,30,0)
buildConnectionString Build a connection string from a dictionary

Returns string.

4.2 Using Optional and Named Arguments
Python allows function arguments to have default values; if the function is called without the argument, the argument gets its default value. Futhermore, arguments can be specified in any order by using named arguments.

examples:
def info(object, spacing=10, collapse=1):

spacing and collapse are optional, because they have default values defined. object is required, because it has no default value. If info is called with only one argument, spacing defaults to 10 and collapse defaults to 1. If info is called with two arguments, collapse still defaults to 1.

Say you want to specify a value for collapse but want to accept the default value for spacing. In most languages, you would be out of luck, because you would need to call the function with three arguments. But in Python, arguments can be specified by name, in any order.

examples:
>>> info(odbchelper)
buildConnectionString Build a connection string from a dictionary Returns string.
>>> info(odbchelper,12)
buildConnectionString Build a connection string from a dictionary Returns string.
>>> info(odbchelper,collapse=0)
buildConnectionString Build a connection string from a dictionary

Returns string.

>>> info(spacing=16,object=odbchelper)
buildConnectionString Build a connection string from a dictionary Returns string.

Even required arguments (like object, which has no default value) can be named, and named arguments can appear in any order.

This looks totally whacked until you realize that arguments are simply a dictionary.

4.3. Using type,str,dir,and Other Built-In Functions
Python has a small set of extremely useful built-in functions. All other functions are partitioned off into modules. This was actually a conscious design decision, to keep the core language from getting bloated like other scripting languages

4.3.1 The type Function
The type function returns the datatype of any arbitrary object. The possible types are listed in the types module. This is useful for helper functions that can handle several types of data.

examples---Introducing type:
>>> type(info)
<type 'function'>
>>> type(1)
<type 'int'>
>>> li = []
>>> type(li)
<type 'list'>
>>> import odbchelper
>>> type(odbchelper)
<type 'module'>
>>> import types
>>> type(odbchelper) == types.ModuleType
True
>>> type(types.ModuleType)
<type 'type'>

type takes anything -- and I mean anything -- and returns its datatype. Integers, strings, lists, dictionaries, tuples, functions, classes, modules, even types are acceptable.

4.3.2 The str Function
The str coerces data into a string. Every datatype can be coerced into a string.

examples -----Introducing str:
>>> str(1)
'1'
>>> horsemen = ['war','pestilence','famous']
>>> horsemen
['war', 'pestilence', 'famous']
>>> str(horsemen)
"['war', 'pestilence', 'famous']"
>>> str(odbchelper)
"<module 'odbchelper' from 'odbchelper.pyc'>"
>>> str(None)
'None'

At the heart of the info function is the powerful dir function. dir returns a list of the attributes and methods of any object: modules, functions, strings, lists, dictionaries... pretty much anything.

examples ------Introducing dir:
>>> li = []
>>> dir(li)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> d = {}
>>> dir(d)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
>>> import odbchelper
>>> dir(odbchelper)
['__author__', '__builtins__', '__copyright__', '__date__', '__doc__', '__file__', '__license__', '__name__', '__package__', '__version__', 'buildConnectionString']

This is where it really gets interesting. odbchelper is a module, so dir(odbchelper) returns a list of all kinds of stuff defined in the module, including built-in attributes, like __name__, __doc__, and whatever other attributes and methods you define. In this case, odbchelper has only one user-defined method, the buildConnectionString function.

Finally, the callable function takes any object and returns True if the object can be called, or False otherwise. Callable objects include functions, class methods, even classes themselves.

examples -----Introducing callable:
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> string.join
<function join at 0x00BB9330>
>>> callable(string.punctuation)
False
>>> callable(string.join)
True
>>> print string.join.__doc__
join(list [,sep]) -> string

Return a string composed of the words in list, with
intervening occurrences of sep. The default separator is a
single space.

(joinfields and join are synonymous)

The functions in the string module are deprecated (although many people still use the join function), but the module contains a lot of useful constants like this string.punctuation, which contains all the standard punctuation characters.

string.punctuation is not callable; it is a string. (A string does have callable methods, but the string itself is not callable.)

Any callable object may have a doc string. By using the callable function on each of an object's attributes, you can determine which attributes you care about (methods, functions, classes) and which you want to ignore (constants and so on) without knowing anything about the object ahead of time.

4.3.3. Built-In Functions

examples ---Built-in Attributes and Functions:
>>> from apihelper import info
>>> import __builtin__
>>> info(__builtin__,20)
ArithmeticError Base class for arithmetic errors.
AssertionError Assertion failed.
AttributeError Attribute not found.
BaseException Common base class for all exceptions
BufferError Buffer error.
...snip...

4.4.Getting Object References With getattr
You already know that Python functions are objects. What you don't know is that you can get a reference to a function without knowing its name until run-time, by using the getattr function.

examples Introducing getattr:
>>> li = ["larry","curly"]
>>> li.pop
<built-in method pop of list object at 0x013882B0>
>>> getattr(li,"pop")
<built-in method pop of list object at 0x013882B0>
>>> getattr(li,"append")("moe")
>>> li
['larry', 'curly', 'moe']
>>> getattr({},"clear")
<built-in method clear of dict object at 0x01386C00>
>>> getattr((),"pop")
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'pop'

This gets a reference to the pop method of the list. Note that this is not calling the pop method; that would be li.pop(). This is the method itself.

This also returns a reference to the pop method, but this time, the method name is specified as a string argument to the getattr function. getattr is an incredibly useful built-in function that returns any attribute of any object. In this case, the object is a list, and the attribute is the pop method.

In case it hasn't sunk in just how incredibly useful this is, try this: the return value of getattr is the method, which you can then call just as if you had said li.append("Moe") directly. But you didn't call the function directly; you specified the function name as a string instead.

In theory, getattr would work on tuples, except that tuples have no methods, so getattr will raise an exception no matter what attribute name you give.

4.4.1 getattr with Modules
getattr isn't just for built-in datatypes. It also works on modules.

examples ----- The getattr Function in apihelper.py:
>>> import odbchelper
>>> odbchelper.buildConnectionString
<function buildConnectionString at 0x0137F8F0>
>>> getattr(odbchelper,"buildConnectionString")
<function buildConnectionString at 0x0137F8F0>
>>> object = odbchelper
>>> method = "buildConnectionString"
>>> getattr(object,method)
<function buildConnectionString at 0x0137F8F0>
>>> type(getattr(object,method))
<type 'function'>
>>> import types
>>> type(getattr(object,method)) == types.FunctionType
True
>>> callable(getattr(object,method))
True

Using getattr, you can get the same reference to the same function. In general, getattr(object, "attribute") is equivalent to object.attribute. If object is a module, then attribute can be anything defined in the module: a function, class, or global variable.

4.4.2. getattr As a Dispatcher
A common usage pattern of getattr is as a dispatcher. For example, if you had a program that could output data in a variety of different formats, you could define separate functions for each output format and use a single dispatch function to call the right one.

For example, let's imagine a program that prints site statistics in HTML, XML, and plain text formats. The choice of output format could be specified on the command line, or stored in a configuration file. A statsout module defines three functions, output_html, output_xml, and output_text. Then the main program defines a single output function, like this:

examples ------ Creating a Dispatcher with getattr:
>>> import statsout
>>> def output(data,format="text"):
... output_function = getattr(statsout,"output_%s" % format)
... return output_function(data)
...
>>>

You concatenate the format argument with "output_" to produce a function name, and then go get that function from the statsout module. This allows you to easily extend the program later to support other output formats, without changing this dispatch function. Just add another function to statsout named, for instance, output_pdf, and pass "pdf" as the format into the output function.

Did you see the bug in the previous example? This is a very loose coupling of strings and functions, and there is no error checking. What happens if the user passes in a format that doesn't have a corresponding function defined in statsout? Well, getattr will return None, which will be assigned to output_function instead of a valid function, and the next line that attempts to call that function will crash and raise an exception. That's bad.

Luckily, getattr takes an optional third argument, a default value.

examples---- getattr Default Values:
>>> import statsout
>>> def output(data,format="text"):
... output_function = getattr(statsout,"output_%s" % format,statsout.statsout_text)
... return output_function(data)
...
>>>

This function call is guaranteed to work, because you added a third argument to the call to getattr. The third argument is a default value that is returned if the attribute or method specified by the second argument wasn't found.

4.5. Filtering Lists
As you know, Python has powerful capabilities for mapping lists into other lists, via list comprehensions (Section 3.6, “Mapping Lists”). This can be combined with a filtering mechanism, where some elements in the list are mapped while others are skipped entirely.

Here is the list filtering syntax:
[mapping-expression for element in source-list if filter-expression]

A filter expression can be any expression that evaluates true or false (which in Python can be almost anything). Any element for which the filter expression evaluates true will be included in the mapping. All other elements are ignored, so they are never put through the mapping expression and are not included in the output list.

examples Introducing List Filtering:
>>> li = ['a','mpilgrim','foo','b','c','b','d','d']
>>> [elem for elem in li if len(elem) > 1]
['mpilgrim', 'foo']
>>> [elem for elem in li if elem != 'b']
['a', 'mpilgrim', 'foo', 'c', 'd', 'd']
>>> [elem for elem in li if li.count(elem) == 1]
['a', 'mpilgrim', 'foo', 'c']

count is a list method that returns the number of times a value occurs in a list. You might think that this filter would eliminate duplicates from a list, returning a list containing only one copy of each value in the original list. But it doesn't, because values that appear twice in the original list (in this case, b and d) are excluded completely. There are ways of eliminating duplicates from a list, but filtering is not the solution.

examples:
methodList = [method for method in dir(object) if callable(getattr(object, method))]

This looks complicated, and it is complicated, but the basic structure is the same. The whole filter expression returns a list, which is assigned to the methodList variable. The first half of the expression is the list mapping part. The mapping expression is an identity expression, which it returns the value of each element. dir(object) returns a list of object's attributes and methods -- that's the list you're mapping. So the only new part is the filter expression after the if.

The filter expression looks scary, but it's not. You already know about callable, getattr, and in. As you saw in the previous section, the expression getattr(object, method) returns a function object if object is a module and method is the name of a function in that module.

You do the weeding out by taking the name of each attribute/method/function and getting a reference to the real thing, via the getattr function. Then you check to see if that object is callable, which will be any methods and functions, both built-in (like the pop method of a list) and user-defined (like the buildConnectionString function of the odbchelper module). You don't care about other attributes, like the __name__ attribute that's built in to every module.

4.6. The Peculiar Nature of and and or
In Python, and and or perform boolean logic as you would expect, but they do not return boolean values; instead, they return one of the actual values they are comparing.

examples ---- Introducing and:
>>> 'a' and 'b'
'b'
>>> '' and 'b'
''
>>> 'a' and 'b' and 'c'
'c'

When using and, values are evaluated in a boolean context from left to right. 0, '', [], (), {}, and None are false in a boolean context; everything else is true. Well, almost everything. By default, instances of classes are true in a boolean context, but you can define special methods in your class to make an instance evaluate to false. If all values are true in a boolean context, and returns the last value. In this case, and evaluates 'a', which is true, then 'b', which is true, and returns 'b'.

If any value is false in a boolean context, and returns the first false value. In this case, '' is the first false value.

All values are true, so and returns the last value, 'c'.

examples----Introducing or:
>>> 'a' or 'b'
'a'
>>> '' or 'b'
'b'
>>> '' or [] or {}
{}
>>> def sidefx():
... print "in sidefx()"
... return 1
...
>>> 'a' or sidefx()
'a'
>>> def sidefx2():
... print "in sidefx2()"
... return 0
...
>>> 'a' or sidefx()
'a'

When using or, values are evaluated in a boolean context from left to right, just like and. If any value is true, or returns that value immediately. In this case, 'a' is the first true value.

or evaluates '', which is false, then 'b', which is true, and returns 'b'.

If all values are false, or returns the last value. or evaluates '', which is false, then [], which is false, then {}, which is false, and returns {}.

Note that or evaluates values only until it finds one that is true in a boolean context, and then it ignores the rest. This distinction is important if some values can have side effects. Here, the function sidefx is never called, because or evaluates 'a', which is true, and returns 'a' immediately.

If you're a C hacker, you are certainly familiar with the bool ? a : b expression, which evaluates to a if bool is true, and b otherwise. Because of the way and and or work in Python, you can accomplish the same thing.

4.6.1 Using the and-or Trick
examples----Introducing the and-or Trick
>>> a = ''
>>> b = 'second'
>>> 1 and a
''
>>> 1 and a or b
'second'
>>> '' or b
'second'

Since a is an empty string, which Python considers false in a boolean context, 1 and '' evalutes to '', and then '' or 'second' evalutes to 'second'.

4.7.Using lambda Functions
Python supports an interesting syntax that lets you define one-line mini-functions on the fly.

examples -----Introducing lambda Functions:
>>> def f(x):
... return x*2
...
>>> f(3)
6
>>> g = lambda x:x*2
>>> g(3)
6
>>> (lambda x:x*2)(3)
6

This is a lambda function that accomplishes the same thing as the normal function above it. Note the abbreviated syntax here: there are no parentheses around the argument list, and the return keyword is missing (it is implied, since the entire function can only be one expression). Also, the function has no name, but it can be called through the variable it is assigned to.

To generalize, a lambda function is a function that takes any number of arguments (including optional arguments) and returns the value of a single expression.

4.7.1. Real-World lambda Functions
examples:
processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)

Notice that this uses the simple form of the and-or trick, which is okay, because a lambda function is always true in a boolean context.

Also notice that you're using the split function with no arguments. You've already seen it used with one or two arguments, but without any arguments it splits on whitespace.

examples---- split With No Arguments:
>>> s = "this is \na\ttest"
>>> print s
this is
a test
>>> print s.split()
['this', 'is', 'a', 'test']
>>> print "".join(s.split())
thisisatest
>>> print " ".join(s.split())
this is a test

This is a multiline string, defined by escape characters instead of triple quotes. \n is a carriage return, and \t is a tab character.

split without any arguments splits on whitespace. So three spaces, a carriage return, and a tab character are all the same.

examples:
processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)

processFunc is now a function, but which function it is depends on the value of the collapse variable. If collapse is true, processFunc(string) will collapse whitespace; otherwise, processFunc(string) will return its argument unchanged.

4.8.Putting It All Together
examples:
print "\n".join(["%s %s" %
(method.ljust(spacing),
processFunc(str(getattr(object, method).__doc__)))
for method in methodList])

Note that this is one command, split over multiple lines, but it doesn't use the line continuation character (\). Remember when I said that some expressions can be split into multiple lines without using a backslash? A list comprehension is one of those expressions, since the entire expression is contained in square brackets.

examples Getting a doc string Dynamically:
>>> import odbchelper
>>> object = odbchelper
>>> method = 'buildConnectionString'
>>> getattr(object,method)
<function buildConnectionString at 0x0138C230>
>>> print getattr(object,method).__doc__
Build a connection string from a dictionary

Returns string.

>>>

The next piece of the puzzle is the use of str around the doc string. As you may recall, str is a built-in function that coerces data into a string. But a doc string is always a string, so why bother with the str function? The answer is that not every function has a doc string, and if it doesn't, its __doc__ attribute is None.

examples---Why Use str On a doc string:
>>> def foo():print 2
...
>>> foo()
2
>>> foo.__doc__
>>> foo.__doc__ == None
True
>>> str(foo.__doc__)
'None'

Now that you are guaranteed to have a string, you can pass the string to processFunc, which you have already defined as a function that either does or doesn't collapse whitespace. Now you see why it was important to use str to convert a None value into a string representation. processFunc is assuming a string argument and calling its split method, which would crash if you passed it None because None doesn't have a split method.

examples----Introducing ljust:
>>> s = 'buildConnectionString'
>>> s.ljust(30)
'buildConnectionString '
>>> s.ljust(20)
'buildConnectionString'

ljust pads the string with spaces to the given length.

If the given length is smaller than the length of the string, ljust will simply return the string unchanged. It never truncates the string.
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值