I'm trying to use the following regex in Java, which is supposed to match any lang="2-char-lang-name":
String lang = "lang=\"" + L.detectLang(inputText) +"\"";
shovel.replaceFirst("lang=\"[..]\"", lang);
I know that a single backslash would be interpreted by the regex engine as a backslash and not as an escape character (so my code doesn't work), but if I escape the backslash, the " is no longer escaped and I get a syntax error.
In other words, how can I include a " in the regex? "lang=\\"[..]\\"" won't compile. I've also tried three backslashes, and that didn't produce any matches either.
I am also aware of the general rule that you don't use regex to parse XML/HTML (and shovel is an XML document). However, all I'm doing is looking for a lang attribute within the first 30 characters of the XML and replacing it. Is it really a bad idea to use regex in this case? I don't think using DOM would be any better or more efficient.
Solution
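Three backslashes would be correct: in a Java string literal, \\ becomes \ and \" becomes ", so the regex engine sees \". (Update: it turns out that isn't even necessary. A single backslash also works, because " has no special meaning in a regex, so the plain " produced by \" in the string literal matches a quote directly.) The real problem is your use of [..]. Square brackets define a character class, meaning "exactly one of the characters listed in here", and inside a class the dot loses its wildcard meaning. So [..] matches a single literal period, not two arbitrary characters, which is why lang="[..]" never matches a two-letter language code.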
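To make the difference concrete, here is a minimal sketch comparing the bracketed and unbracketed patterns. The class name CharClassDemo and the helper found are made up for this illustration; only java.util.regex.Pattern comes from the standard library.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "lang=\"AB\"";

        // [..] is a character class containing a literal dot,
        // so it needs exactly one '.' between the quotes.
        System.out.println(found("lang=\"[..]\"", input));        // false
        System.out.println(found("lang=\"[..]\"", "lang=\".\"")); // true

        // Outside brackets, each dot matches any single character.
        System.out.println(found("lang=\"..\"", input));          // true

        // Escaping the quote for the regex engine is harmless but not required.
        System.out.println(found("lang=\\\"..\\\"", input));      // true
    }
}
```

The single-backslash and triple-backslash spellings behave the same here because the regex engine treats \" as a literal quote.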
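Drop the [] and the pattern matches what you want. Also note that replaceFirst returns a new string rather than modifying the original, so you need to assign the result: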
String ab = "foo=\"bar\" lang=\"AB\"";
String regex = "lang=\\\"..\\\"";
String cd = ab.replaceFirst(regex, "lang=\"CD\"");
System.out.println(cd);
Output:
foo="bar" lang="CD"
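Since .. accepts any two characters, a slightly stricter variant may be safer for the original use case. The following is only a sketch under two assumptions that are not in the question: that language codes are exactly two ASCII letters, and that a helper named replaceLang (a made-up name) is acceptable. It wraps the replacement in Matcher.quoteReplacement so that a stray $ or \ in the detected language string is not treated as a group reference.

```java
import java.util.regex.Matcher;

public class LangReplaceDemo {
    // Hypothetical helper: replaces the first lang="xx" attribute,
    // assuming the code is exactly two ASCII letters.
    static String replaceLang(String xml, String newLang) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        String replacement = Matcher.quoteReplacement("lang=\"" + newLang + "\"");
        return xml.replaceFirst("lang=\"[A-Za-z]{2}\"", replacement);
    }

    public static void main(String[] args) {
        String shovel = "<root lang=\"en\"><item lang=\"fr\"/></root>";
        // Strings are immutable, so the result has to be assigned back.
        shovel = replaceLang(shovel, "de");
        System.out.println(shovel); // <root lang="de"><item lang="fr"/></root>
    }
}
```

Only the first occurrence is replaced, which fits the case where the attribute of interest sits near the start of the document.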