What is a good regular expression for matching a floating point number (i.e. like Java's Float)?
The answer must match against the following targets:
1) 1.
2) .2
3) 3.14
4) 5e6
5) 5e-6
6) 5E+6
7) 7.e8
8) 9.0E-10
9) .11e12
In summary, it should:
ignore preceding signs
require the first character to the left of the decimal point to be non-zero
allow 0 or more digits on either side of the decimal point
permit a number without a decimal point
allow scientific notation
allow capital or lowercase 'e'
allow positive or negative exponents
For those who are wondering, yes this is a homework problem. We received this as an assignment in my graduate CS class on compilers. I've already turned in my answer for the class and will post it as an answer to this question.
[Epilogue]
My solution didn't get full credit because it didn't handle more than 1 digit to the left of the decimal point. The assignment did mention handling Java floats, even though none of the examples had more than 1 digit to the left of the decimal point. I'll post the accepted answer in its own post.
Solution
[This is the answer from the professor]
Define:
N = [1-9]
D = 0 | N
E = [eE] [+-]? D+
L = 0 | ( N D* )
Then floating point numbers can be matched with:
( ( L . D* | . D+ ) E? ) | ( L E )
It was also acceptable to use D+ rather than L, and to prepend [+-]?.
A common mistake was to write D* . D*, but this can match just '.'.
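For anyone who wants to try this out, here is a rough sketch of how the definitions above might be written with java.util.regex. The class name FloatLiteral and the helper isFloat are my own inventions for illustration, not part of the professor's answer, and the literal decimal point is escaped as \. because an unescaped . matches any character in a Java regex.

```java
import java.util.regex.Pattern;

final class FloatLiteral {
    // Building blocks named after the definitions above (names are illustrative).
    private static final String D = "[0-9]";                    // D = 0 | N
    private static final String L = "(?:0|[1-9]" + D + "*)";    // L = 0 | ( N D* )
    private static final String E = "(?:[eE][+-]?" + D + "+)";  // E = [eE] [+-]? D+

    // ( ( L . D* | . D+ ) E? ) | ( L E ), with the decimal point escaped.
    static final Pattern FLOAT = Pattern.compile(
        "(?:" + L + "\\." + D + "*|\\." + D + "+)" + E + "?|" + L + E);

    // The common mistake D* . D*, kept only for comparison.
    static final Pattern NAIVE = Pattern.compile(D + "*\\." + D + "*");

    static boolean isFloat(String s) {
        // matches() succeeds only if the whole string matches the pattern.
        return FLOAT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] targets = {"1.", ".2", "3.14", "5e6", "5e-6", "5E+6", "7.e8", "9.0E-10", ".11e12"};
        for (String t : targets) {
            System.out.println(t + " -> " + isFloat(t));
        }
        System.out.println("'.' with D* . D* -> " + NAIVE.matcher(".").matches());
        System.out.println("'.' with the grammar -> " + isFloat("."));
    }
}
```

Note that, as written, the grammar rejects a bare integer such as 5 (it needs a decimal point or an exponent) and rejects a leading sign. It also does not cover everything Java accepts as a floating point literal, such as the f/d suffixes or hexadecimal forms.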
[Edit]
Someone asked about a leading sign; I should have asked the professor why it was excluded but never got the chance. Since this was part of the lecture on grammars, my guess is that either it made the problem easier (not likely) or there is a small detail in parsing where you divide the problem set such that the floating point value, regardless of sign, is the focus (possible).
If you are parsing through an expression, e.g.
-5.04e-10 + 3.14159E10
the sign of the floating point value is part of the operation to be applied to the value and not an attribute of the number itself. In other words,
subtract (5.04e-10)
add (3.14159E10)
to form the result of the expression. While I'm sure mathematicians may argue the point, remember this was from a lecture on parsing.
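To make that concrete, here is a small sketch of a lexer that treats the sign as its own operator token rather than as part of the number. This is only my guess at what the lecture was getting at; the class name SignAsOperatorLexer, the NUM(...)/OP(...) token labels, and the choice to support only + and - are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SignAsOperatorLexer {
    // Unsigned float pattern built from the professor's definitions.
    private static final String L = "(?:0|[1-9][0-9]*)";
    private static final String E = "(?:[eE][+-]?[0-9]+)";
    private static final String FLOAT =
        "(?:" + L + "\\.[0-9]*|\\.[0-9]+)" + E + "?|" + L + E;

    // One token: optional whitespace, then either a number or a +/- operator.
    private static final Pattern TOKEN =
        Pattern.compile("\\s*(?:(?<num>" + FLOAT + ")|(?<op>[+-]))");

    static List<String> tokenize(String input) {
        String text = input.trim();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) { // lookingAt() anchors the match at the region start
                throw new IllegalArgumentException("Unexpected input at index " + pos);
            }
            String num = m.group("num");
            tokens.add(num != null ? "NUM(" + num + ")" : "OP(" + m.group("op") + ")");
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-5.04e-10 + 3.14159E10"));
        // prints [OP(-), NUM(5.04e-10), OP(+), NUM(3.14159E10)]
    }
}
```

The leading - comes out as a separate operator token, while the - inside the exponent e-10 stays part of the number, which is consistent with the E definition allowing [+-]? only after the e.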