How to write case-insensitive statements in Python: a case-insensitive string class in Python

I need to perform case-insensitive string comparisons in Python, in sets and dictionary keys. Creating set and dict subclasses that are case insensitive proves surprisingly tricky (see "Case insensitive dictionary" for ideas; note that they all use lower, and there is even a rejected PEP, although its scope is a bit broader). So I went with creating a case-insensitive string class (leveraging this answer by @AlexMartelli):

class CIstr(unicode):
    """Case insensitive with respect to hashes and comparisons string class"""

    #--Hash/Compare
    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, basestring):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other): return not (self == other)

    def __lt__(self, other):
        if isinstance(other, basestring):
            return self.lower() < other.lower()
        return NotImplemented

    def __ge__(self, other): return not (self < other)

    def __gt__(self, other):
        if isinstance(other, basestring):
            return self.lower() > other.lower()
        return NotImplemented

    def __le__(self, other): return not (self > other)

I am fully aware that lower is not really enough to cover all cases of string comparison in Unicode, but I am refactoring existing code that used a much clunkier class for string comparisons (memory- and speed-wise), which used lower() anyway, so I can amend this at a later stage. Also, I am on Python 2 (as seen by unicode). My questions are:

Did I get the operators right?

Is this class enough for my purposes, given that I take care to construct dict keys and set elements as CIstr instances? My purposes are checking equality, containment, set differences and similar operations in a case-insensitive way. Or am I missing something?

Is it worth caching the lowercase version of the string (as seen, for instance, in this ancient Python recipe: Case Insensitive Strings)? This comment suggests not, and I want construction to be as fast as possible and size as small as possible, but people seem to include this.

Python 3 compatibility tips are appreciated!

Tiny demo:

d = {CIstr('A'): 1, CIstr('B'): 2}
print 'a' in d # True
s = set(d)
print {'a'} - s # set([])

Solution

In your demo you are using 'a' to look things up in your dict and set. It wouldn't work if you tried to use 'A', because 'A' has a different hash. Also, 'A' in d.keys() would be true, but 'A' in d would be false. You've essentially created a type that violates the normal contract of hashing, which requires objects that compare equal to have the same hash, by claiming to be equal to objects that have different hashes.
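The mismatch can be reproduced with a small sketch. It assumes Python 3 (where str takes the place of unicode and basestring) and a reduced CIstr that keeps only hashing and equality; the variable names are illustrative and not from the original post.

```python
class CIstr(str):
    """Reduced Python 3 port of the question's class (hash and equality only)."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other):
        # str defines its own __ne__, so it has to be overridden explicitly.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


d = {CIstr('A'): 1}

# A plain 'a' hashes like CIstr('A') (both hash the text 'a'), so the dict
# finds the right entry and then confirms the match with ==.
lower_found = 'a' in d

# A plain 'A' hashes as 'A', so the stored key is never found by hash,
# even though the two objects compare equal.
upper_found = 'A' in d
equal_anyway = ('A' == CIstr('A'))

# A list has no hashing step; membership relies on == alone.
upper_in_list = 'A' in list(d)

print(lower_found, upper_found, equal_anyway, upper_in_list)
```

The two objects compare equal yet hash differently, which is exactly the situation hash-based containers are not designed to handle.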

You could combine this answer with the answers about creating specialised dicts, and have a dict that converts any possible key into a CIstr before trying to look it up. Then all your CIstr conversions could be hidden away inside the dictionary class.

E.g.

class CaseInsensitiveDict(dict):
    def __setitem__(self, key, value):
        super(CaseInsensitiveDict, self).__setitem__(convert_to_cistr(key), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveDict, self).__getitem__(convert_to_cistr(key))

    # __init__, __contains__ etc.
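One way the elided methods might be filled in is sketched here, assuming Python 3. The convert_to_cistr helper is not defined in the answer, so the version shown is a hypothetical implementation, and the CIstr class is a reduced port of the one in the question. Treat it as an illustration rather than a complete dict replacement (methods such as pop and setdefault are not covered).

```python
class CIstr(str):
    """Reduced Python 3 port of the question's class (hash and equality only)."""

    def __hash__(self):
        return hash(self.lower())

    def __eq__(self, other):
        if isinstance(other, str):
            return self.lower() == other.lower()
        return NotImplemented

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


def convert_to_cistr(key):
    """Hypothetical helper: wrap plain strings, pass other keys through."""
    if isinstance(key, str) and not isinstance(key, CIstr):
        return CIstr(key)
    return key


class CaseInsensitiveDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route initial items through __setitem__ so keys get converted.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(convert_to_cistr(key), value)

    def __getitem__(self, key):
        return super().__getitem__(convert_to_cistr(key))

    def __delitem__(self, key):
        super().__delitem__(convert_to_cistr(key))

    def __contains__(self, key):
        return super().__contains__(convert_to_cistr(key))

    def get(self, key, default=None):
        return super().get(convert_to_cistr(key), default)

    def update(self, *args, **kwargs):
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


cid = CaseInsensitiveDict({'A': 1})
cid['b'] = 2
cid['a'] = 3          # same key as 'A', so the value is overwritten
print(cid['A'], 'B' in cid, cid.get('B'), len(cid))
```

Because every key is converted on the way in and on the way out, callers can pass plain strings in any case and never need to build CIstr instances themselves.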
