How to write case-insensitive statements in Python: a case-insensitive string class in Python

Latest recommended article published on 2024-05-09 10:55:19

weixin_39922868

Article tags: how to write case-insensitive statements in Python

I need to perform case-insensitive string comparisons in Python for set elements and dictionary keys. Creating set and dict subclasses that are case insensitive proves surprisingly tricky (see: Case insensitive dictionary for ideas; note that they all use lower, and there is even a rejected PEP, albeit with a somewhat broader scope). So I went with creating a case-insensitive string class (leveraging this answer by @AlexMartelli):

    class CIstr(unicode):
        """Case insensitive with respect to hashes and comparisons string class"""

        #--Hash/Compare
        def __hash__(self):
            return hash(self.lower())
        def __eq__(self, other):
            if isinstance(other, basestring):
                return self.lower() == other.lower()
            return NotImplemented
        def __ne__(self, other): return not (self == other)
        def __lt__(self, other):
            if isinstance(other, basestring):
                return self.lower() < other.lower()
            return NotImplemented
        def __ge__(self, other): return not (self < other)
        def __gt__(self, other):
            if isinstance(other, basestring):
                return self.lower() > other.lower()
            return NotImplemented
        def __le__(self, other): return not (self > other)

I am fully aware that lower is not really enough to cover all cases of string comparison in Unicode, but I am refactoring existing code that used a much clunkier class for string comparisons (memory- and speed-wise), which used lower() anyway, so I can amend this at a later stage. Also, I am on Python 2 (as seen by unicode). My questions are:

Did I get the operators right?

Is this class enough for my purposes, given that I take care to construct dict keys and set elements as CIstr instances? My purposes are checking equality, containment, set differences and similar operations in a case-insensitive way. Or am I missing something?

Is it worth caching the lower-case version of the string (as seen, for instance, in this ancient Python recipe: Case Insensitive Strings)? This comment suggests not, and I want construction to be as fast as possible and instances as small as possible, but people seem to include it anyway.

Python 3 compatibility tips are appreciated!

Tiny demo:

    d = {CIstr('A'): 1, CIstr('B'): 2}
    print 'a' in d # True
    s = set(d)
    print {'a'} - s # set([])
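On the Python 3 point raised above, the class might be ported roughly as follows. This is only a sketch under some assumptions that are not in the original post: it subclasses `str` (Python 3 has no `unicode` or `basestring`), it swaps `lower()` for `str.casefold()` to address the Unicode concern, and it adds an empty `__slots__` to keep instances small. Each operator does its own type check, so `NotImplemented` is never passed through `not`.

```python
class CIstr(str):
    """Case-insensitive string for hashing and comparisons (Python 3 sketch)."""

    __slots__ = ()  # assumption: avoid a per-instance __dict__ to save memory

    def __hash__(self):
        # Equal-ignoring-case strings must hash alike, so hash the folded form.
        return hash(self.casefold())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.casefold() == other.casefold()
        return NotImplemented

    def __ne__(self, other):
        if isinstance(other, str):
            return self.casefold() != other.casefold()
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, str):
            return self.casefold() < other.casefold()
        return NotImplemented

    def __le__(self, other):
        if isinstance(other, str):
            return self.casefold() <= other.casefold()
        return NotImplemented

    def __gt__(self, other):
        if isinstance(other, str):
            return self.casefold() > other.casefold()
        return NotImplemented

    def __ge__(self, other):
        if isinstance(other, str):
            return self.casefold() >= other.casefold()
        return NotImplemented


# The tiny demo from the question, in Python 3 syntax.
d = {CIstr('A'): 1, CIstr('B'): 2}
print('a' in d)                      # True
print({'a'} - set(d))                # set()
print(CIstr('Straße') == 'STRASSE')  # True with casefold(); lower() would give False
```

`functools.total_ordering` does not help here, because `str` already defines every ordering method and the decorator only fills in the ones that are missing.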

Solution

In your demo you use 'a' to look things up in your set. It would not work if you tried 'A', because 'A' has a different hash. Likewise, 'A' in d.keys() would be true (in Python 2, where keys() returns a list), but 'A' in d would be false. You have essentially created a type that violates the hash contract (objects that compare equal must have the same hash) by claiming to be equal to objects that have different hashes.
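The mismatch described above can be reproduced with a small script. This is a sketch in Python 3 rather than the question's Python 2, using a cut-down stand-in for `CIstr` that only defines `__hash__` and `__eq__`; `list(d)` plays the role that `d.keys()` plays in Python 2.

```python
class CIstr(str):
    """Cut-down Python 3 stand-in for the question's class (hash and == only)."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented


d = {CIstr('A'): 1}

print('a' in d)        # True: hash('a') matches the stored key's hash
print('A' in d)        # False: hash('A') differs, so == is never consulted
print('A' in list(d))  # True: list membership relies on == alone

# Equal objects with unequal hashes: this is the broken contract.
print('A' == CIstr('A'))              # True
print(hash('A') == hash(CIstr('A')))  # False
```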

You could combine this answer with the answers about creating specialised dicts, and have a dict that converts any possible key into CIstr before trying to look it up. Then all your CIstr conversions could be hidden away inside the dictionary class.

E.g.

    class CaseInsensitiveDict(dict):
        def __setitem__(self, key, value):
            super(CaseInsensitiveDict, self).__setitem__(convert_to_cistr(key), value)
        def __getitem__(self, key):
            return super(CaseInsensitiveDict, self).__getitem__(convert_to_cistr(key))
        # __init__, __contains__ etc.
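One way the elided methods might be filled in is sketched below, in Python 3. The answer does not define `convert_to_cistr`, so the version here is a hypothetical helper, and the choice of which methods to override (`__init__`, `__delitem__`, `__contains__`, `get`, `update`) is an assumption about what "etc." covers. Other `dict` methods such as `pop` and `setdefault` would need the same treatment.

```python
class CIstr(str):
    """Cut-down Python 3 stand-in for the question's class (hash and == only)."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented


def convert_to_cistr(key):
    """Hypothetical helper: wrap plain strings, leave any other key untouched."""
    if isinstance(key, str) and not isinstance(key, CIstr):
        return CIstr(key)
    return key


class CaseInsensitiveDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(convert_to_cistr(key), value)

    def __getitem__(self, key):
        return super().__getitem__(convert_to_cistr(key))

    def __delitem__(self, key):
        super().__delitem__(convert_to_cistr(key))

    def __contains__(self, key):
        return super().__contains__(convert_to_cistr(key))

    def get(self, key, default=None):
        return super().get(convert_to_cistr(key), default)

    def update(self, *args, **kwargs):
        # dict.update() does not call __setitem__, so route each pair through it.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = CaseInsensitiveDict({'A': 1}, b=2)
print(d['a'], 'B' in d)  # 1 True
d['a'] = 10              # overwrites the existing 'A' entry
print(len(d), d['A'])    # 2 10
```

Because every key is converted on the way in and on the way out, plain strings never meet the stored keys with a mismatched hash.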