I can't believe it's that difficult to treat a variable as a raw string! I have searched and found similar questions, but no proper answer.
I have a variable holding a domain-qualified name, e.g. 'domain\user', and I need to extract only the username using re. The problem is that Python gives me hex values for special character combinations, for example when the string contains \b.
I just need to get the literal string from the variable, and nothing else.
author = list[0]  # list[0] contains 'domain\blah'
author = re.sub('.*\\\\(.+)$', r'\1', author)
I'd expect 'blah', but I'm getting 'domain\x08lah'!
Saving the string as a raw string at the start is not an option, because I'm getting the string from other regex operations.
Any ideas?
EDIT:
I was mistaken in assuming the variable contained a single backslash. In fact, when the value came from another operation, the backslash had already been escaped. So I created the problem myself when trying to build a test scenario.
Solution
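A raw string literal is only a notation for creating string values in source code; it skips (most of) the escape-sequence processing that a regular string literal applies. There is no separate "raw string" type, so an existing value cannot be converted into one.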
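To make the distinction concrete, here is a minimal sketch comparing three ways of writing the same-looking literal. The variable names are only illustrative and do not come from the question.

```python
# Only the first literal contains a backspace; the other two hold a real backslash.
plain = 'domain\blah'      # \b is processed as the backspace character (0x08)
escaped = 'domain\\blah'   # \\ produces one literal backslash
raw = r'domain\blah'       # raw literal: the backslash is kept as written

print(len(plain), len(escaped), len(raw))  # 10 11 11
print(escaped == raw)                      # True
print(repr(plain))                         # 'domain\x08lah'
```

The escaped and raw forms produce the same value; they differ only in how the source code is written.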
Your string contains the \x08 (backspace) character; it never contained a backslash followed by a b character. If you defined the value in list[0] with a string literal, you forgot to escape the backslash. If the data came from somewhere else, you are looking at a single raw byte with the hex value 08:
>>> list_0 = 'domain\x08lah'
>>> list_0[6]
'\x08'
>>> len(list_0[6])
1
>>> ord(list_0[6])
8
If this byte was meant to be two characters instead, you could repair the data with string replacement:
>>> list_0.replace('\b', '\\b')
'domain\\blah'
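Once the value holds a real backslash, extracting the username works as expected. The following is a rough sketch under that assumption; the helper name `username` is hypothetical, and the `rpartition` variant is just one possible alternative to the regex.

```python
import re

def username(value):
    """Return the text after the last backslash, or the value unchanged if there is none."""
    return re.sub(r'.*\\(.+)$', r'\1', value)

author = 'domain\\blah'              # a real backslash followed by 'blah'
print(username(author))              # blah
print(author.rpartition('\\')[2])    # blah, without a regex

# A value that was damaged by an unescaped \b can be repaired first.
damaged = 'domain\x08lah'
print(username(damaged.replace('\b', '\\b')))  # blah
```

Writing the pattern as a raw literal means the regex needs only two backslashes to match one, instead of the four required in a regular literal.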