General File Naming Conventions on Microsoft Windows
Naming Conventions
The following fundamental rules enable applications to create and process valid names for files and directories, regardless of the file system:
(Note: these naming rules apply across the different file systems that Windows supports.)
- Use a period to separate the base file name from the extension in the name of a directory or file.
- (Note: on Windows file systems, "." separates the base name from the extension, for directories as well as files.)
- Use a backslash (\) to separate the components of a path. The backslash divides the file name from the path to it, and one directory name from another directory name in a path. You cannot use a backslash in the name for the actual file or directory because it is a reserved character that separates the names into components.
- (Note: Windows separates path components with "\". A "\" inside a name is therefore never read as part of the name; it is always treated as a path separator.)
- Use a backslash as required as part of volume names, for example, the "C:\" in "C:\path\file" or the "\\server\share" in "\\server\share\path\file" for Universal Naming Convention (UNC) names. For more information about UNC names, see the Maximum Path Length Limitation section.
- (Note: the drive or volume prefix is likewise separated from the rest of the path by "\".)
- Do not assume case sensitivity. For example, consider the names OSCAR, Oscar, and oscar to be the same, even though some file systems (such as a POSIX-compliant file system) may consider them as different. Note that NTFS supports POSIX semantics for case sensitivity but this is not the default behavior. For more information, see CreateFile.
- (Note: Windows does not distinguish case in file names by default. Linux and Unix differ here: they are strictly case-sensitive.)
- Volume designators (drive letters) are similarly case-insensitive. For example, "D:\" and "d:\" refer to the same volume.
- (Note: the same holds for drive letters.)
- Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following: (Note: extended ASCII characters are allowed in file names; the characters listed below are not.)
  - The following reserved characters:
    - < (less than)
    - > (greater than)
    - : (colon)
    - " (double quote)
    - / (forward slash)
    - \ (backslash)
    - | (vertical bar or pipe)
    - ? (question mark)
    - * (asterisk)
    - (Note: on Linux, all of the above except "/" are allowed in file names.)
  - Integer value zero, sometimes referred to as the ASCII NUL character. (Note: the NUL character cannot appear in a file name.)
  - Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams. (Note: the control characters 0x01–0x1F cannot appear in a file name, but they are allowed in alternate data stream names.)
  - Any other character that the target file system does not allow. (Note: that is, whatever the file system itself rejects.)
- Use a period as a directory component in a path to represent the current directory, for example ".\temp.txt". For more information, see Paths.
- Use two consecutive periods (..) as a directory component in a path to represent the parent of the current directory, for example "..\temp.txt". For more information, see Paths.
- (Note: on Linux and Unix, "." and ".." likewise denote the current and parent directories, so neither can be used as a file name; the same is true on Windows. A name of three dots, "...", is allowed on Linux and Unix but not on Windows.)
- Do not use the following reserved device names for the name of a file: CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended. For more information, see Namespaces.
- (Note: on Windows, none of these reserved words can be used on its own as a file name.)
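- Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface do not. However, it is acceptable to specify a period as the first character of a name. For example, ".temp".
- (Note: on Windows, do not end a file name with a space. The trailing space is silently dropped, so a single space cannot serve as a file name, although it can on Linux and Unix. It is also best to avoid names that begin with ".", because on Linux and Unix such files are hidden.)
- (Note: on Linux and Unix, "/" cannot appear in a file name, whether alone or repeated as in "///", but "\" can, even several times.)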
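The rules listed above can be sketched in portable C++. This is a minimal illustration only; it is not part of the original article or of any Windows API. The helper names `isReservedDeviceName` and `sanitizeFileName`, the choice of "_" as the replacement character, and the decision to prefix a reserved device name with "_" are all assumptions made for the example.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if the part of `name` before the first '.'
// is one of the reserved device names, compared without regard to case.
bool isReservedDeviceName(const std::string &name)
{
    static const char *const kReserved[] = {
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"};

    // Only the segment before the first '.' matters ("NUL.txt" is still NUL).
    std::string base = name.substr(0, name.find('.'));
    for (char &c : base)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));

    for (const char *reserved : kReserved)
        if (base == reserved)
            return true;
    return false;
}

// Hypothetical helper: replaces reserved and control characters with '_',
// strips trailing spaces and periods, and prefixes '_' when the result
// would be empty or a reserved device name. These choices are assumptions.
std::string sanitizeFileName(const std::string &name)
{
    static const std::string kReservedChars = "<>:\"/\\|?*";

    std::string out;
    for (char c : name)
    {
        const bool isControl = static_cast<unsigned char>(c) < 32;
        const bool isReserved = kReservedChars.find(c) != std::string::npos;
        out.push_back((isControl || isReserved) ? '_' : c);
    }

    while (!out.empty() && (out.back() == ' ' || out.back() == '.'))
        out.pop_back();

    if (out.empty() || isReservedDeviceName(out))
        out.insert(out.begin(), '_');
    return out;
}
```

Under these assumptions, `sanitizeFileName("a<b>.txt")` gives `a_b_.txt` and `sanitizeFileName("report. ")` gives `report`.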
// IsValidFileName.h
#ifndef __ISVALIDFILENAME_H__
#define __ISVALIDFILENAME_H__

#include <tchar.h>

enum
{
    ISVALID_FILENAME_ERROR = -1,
    INVALID_FILENAME_CLOCK = -2,
    INVALID_FILENAME_AUX = -3,
    INVALID_FILENAME_CON = -4,
    INVALID_FILENAME_NUL = -5,
    INVALID_FILENAME_PRN = -6,
    INVALID_FILENAME_COM1 = -7,
    INVALID_FILENAME_COM2 = -8,
    INVALID_FILENAME_COM3 = -9,
    INVALID_FILENAME_COM4 = -10,
    INVALID_FILENAME_COM5 = -11,
    INVALID_FILENAME_COM6 = -12,
    INVALID_FILENAME_COM7 = -13,
    INVALID_FILENAME_COM8 = -14,
    INVALID_FILENAME_COM9 = -15,
    INVALID_FILENAME_LPT1 = -16,
    INVALID_FILENAME_LPT2 = -17,
    INVALID_FILENAME_LPT3 = -18,
    INVALID_FILENAME_LPT4 = -19,
    INVALID_FILENAME_LPT5 = -20,
    INVALID_FILENAME_LPT6 = -21,
    INVALID_FILENAME_LPT7 = -22,
    INVALID_FILENAME_LPT8 = -23,
    INVALID_FILENAME_LPT9 = -24
};

// The array is defined in the .cpp file; "extern" makes this a declaration only.
extern const TCHAR *pInvalidFileNameErrStr[];
const TCHAR *GetIsValidFileNameErrStr(int err);

int IsValidFileName(const char *pFileName);
int IsValidFileName(const wchar_t *pFileName);

/**************************************************************************
   Copyright 2002 Joseph Woodbury.

   Use of this file constitutes a full acceptance of the following license
   agreement:

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

   1. Redistribution of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.

   2. A fee cannot be charged for any redistribution.

   3. Full source must accompany any redistribution in binary form which
      exposes the interfaces of that source whether directly or indirectly.

   4. This software cannot be used in such as matter as to cause it,
      or portions of it, in source and/or binary forms, to be covered,
      or required to be disclosed, by the GNU Public License (GPL) and/or
      any similarly structured software license.

   5. Any binaries produced using this software must fully indemnify the
      author with a disclaimer at least as effective and comprehensive as
      the following:

   THIS SOFTWARE IS PROVIDED BY JOSEPH WOODBURY "AS IS" AND ANY
   EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
   PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL JOSEPH WOODBURY BE LIABLE
   FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
   WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
   OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
   EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**************************************************************************/

#endif
#include "IsValidFileName.h"

#include <stddef.h> // NULL

///
// Strings are only for error display; they are not used by
// the IsValidFileName functions.

const TCHAR *pInvalidFileNameErrStr[] =
{
    _T("Error"),
    _T("CLOCK$"),
    _T("AUX"),
    _T("CON"),
    _T("NUL"),
    _T("PRN"),
    _T("COM1"),
    _T("COM2"),
    _T("COM3"),
    _T("COM4"),
    _T("COM5"),
    _T("COM6"),
    _T("COM7"),
    _T("COM8"),
    _T("COM9"),
    _T("LPT1"),
    _T("LPT2"),
    _T("LPT3"),
    _T("LPT4"),
    _T("LPT5"),
    _T("LPT6"),
    _T("LPT7"),
    _T("LPT8"),
    _T("LPT9"),
    NULL
};

///
// const TCHAR *GetIsValidFileNameErrStr(int err)
//
// Return a string associated with the passed error code.
//
// Parameters
//
//     err    A negative error number returned by IsValidFileName
//
// Returns
//
//     A pointer to the device string.
//
//     A pointer to the generic "Error" string if err is not in the range
//     of the INVALID_FILENAME_ enumeration.
//

const TCHAR *GetIsValidFileNameErrStr(int err)
{
    if (err >= 0 || err < INVALID_FILENAME_LPT9)
        return pInvalidFileNameErrStr[0];

    // -1 maps to index 0 ("Error"), -2 to index 1 ("CLOCK$"), and so on.
    return pInvalidFileNameErrStr[(-err) - 1];
}

///
// int IsValidFileName(const char *pFileName)
//
// Ensure a file name is legal.
//
// Parameters
//
//     pFileName    The file name to check. This must be only the file name.
//                  If a full path is passed, the check will fail.
//
// Returns
//
//     Zero on success.
//
//     Non-zero on failure.
//
//     The return code can be used to determine why the call failed:
//
//     >0     The illegal character that was used. If the value is a
//            dot ('.', 46) the file name was nothing but dots.
//
//     -1     A NULL or zero length file name was passed, or the file
//            name exceeded 255 characters.
//
//     <-1    A device name was used. The value corresponds to the
//            INVALID_FILENAME_... series of enumerations. You can pass
//            this value to GetIsValidFileNameErrStr to obtain a pointer to
//            the name of this device.
//
// Remarks
//
//     The NT file naming convention specifies that:
//
//     - All characters greater than ASCII 31 may be used except for the
//       following:
//
//           "*/:<>?\|
//
//     - A file name may not be only dots.
//
//     - The following device names cannot be used for a file name nor may they
//       be used for the first segment of a file name (that part which precedes
//       the first dot):
//
//           CLOCK$, AUX, CON, NUL, PRN, COM1, COM2, COM3, COM4, COM5, COM6, COM7,
//           COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9
//
//     - Device names are case insensitive. aux, AUX, Aux, etc. are identical.
//
//     The ANSI and UNICODE functions are identical except for the declaration of
//     the argument. A template function could have been used, but I chose not to
//     since there are only two valid cases.
//
//     The algorithm used looks convoluted because it is highly optimized. It is
//     more than 11.5 times faster than a scanning method using _strnicmp
//     would be.
//

int IsValidFileName(const char *pFileName)
{
    if (!pFileName || !*pFileName)
        return ISVALID_FILENAME_ERROR;

    int nonDot = -1; // position of the first non dot in the file name
    int dot = -1;    // position of the first dot in the file name
    int len = 0;     // length of the file name

    for (; len < 256 && pFileName[len]; len++)
    {
        // Work on the unsigned value so that extended characters (128-255)
        // are accepted and never returned as a negative error code.
        const unsigned char ch = static_cast<unsigned char>(pFileName[len]);

        if (ch == '.')
        {
            if (dot < 0)
                dot = len;
            continue;
        }
        else if (nonDot < 0)
            nonDot = len;

        // The upper characters can be passed with a single check and
        // since only the backslash and bar are above the at sign ('@')
        // it saves memory to do the check this way with little performance
        // cost.
        if (ch >= '@')
        {
            if (ch == '\\' || ch == '|')
                return ch;

            continue;
        }

        // Validity of the characters from ' ' (32) through '?' (63).
        static const bool isCharValid[32] =
        {
            // ' '  !     "      #     $     %     &     '
            true, true, false, true, true, true, true, true,
            // (    )     *      +     ,     -     .     /
            true, true, false, true, true, true, true, false,
            // 0    1     2     3     4     5     6     7
            true, true, true, true, true, true, true, true,
            // 8    9     :      ;     <      =     >      ?
            true, true, false, true, false, true, false, false
        };

        // This is faster, at the expense of memory, than checking each
        // invalid character individually. However, either method is much
        // faster than using strchr().
        if (ch >= 32)
        {
            if (isCharValid[ch - 32])
                continue;
        }
        return ch;
    }

    if (len == 256)
        return ISVALID_FILENAME_ERROR;

    // if nonDot is still -1, no non-dots were encountered, return a dot (period)
    if (nonDot < 0)
        return '.';

    // if the first character is a dot, the filename is okay
    if (dot == 0)
        return 0;

    // if the file name has a dot, we only need to check up to the first dot
    if (dot > 0)
        len = dot;

    // Since the device names aren't numerous, this method of checking is the
    // fastest. Note that each character is checked with both cases.
    if (len == 3)
    {
        if (pFileName[0] == 'a' || pFileName[0] == 'A')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The digit test must use &&: with || every fourth character matched.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            pFileName[5] == '$')
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}

// Wide-character overload; the logic is the same as in the char version.
int IsValidFileName(const wchar_t *pFileName)
{
    if (!pFileName || !*pFileName)
        return ISVALID_FILENAME_ERROR;

    int nonDot = -1; // position of the first non dot in the file name
    int dot = -1;    // position of the first dot in the file name
    int len = 0;     // length of the file name

    for (; len < 256 && pFileName[len]; len++)
    {
        if (pFileName[len] == '.')
        {
            if (dot < 0)
                dot = len;
            continue;
        }
        else if (nonDot < 0)
            nonDot = len;

        // The upper characters can be passed with a single check and
        // since only the backslash and bar are above the at sign ('@')
        // it saves memory to do the check this way with little performance
        // cost.
        if (pFileName[len] >= '@')
        {
            if (pFileName[len] == '\\' || pFileName[len] == '|')
                return pFileName[len];

            continue;
        }

        // Validity of the characters from ' ' (32) through '?' (63).
        static const bool isCharValid[32] =
        {
            // ' '  !     "      #     $     %     &     '
            true, true, false, true, true, true, true, true,
            // (    )     *      +     ,     -     .     /
            true, true, false, true, true, true, true, false,
            // 0    1     2     3     4     5     6     7
            true, true, true, true, true, true, true, true,
            // 8    9     :      ;     <      =     >      ?
            true, true, false, true, false, true, false, false
        };

        // This is faster, at the expense of memory, than checking each
        // invalid character individually. However, either method is much
        // faster than using strchr().
        if (pFileName[len] >= 32)
        {
            if (isCharValid[pFileName[len] - 32])
                continue;
        }
        return pFileName[len];
    }

    if (len == 256)
        return ISVALID_FILENAME_ERROR;

    // if nonDot is still -1, no non-dots were encountered, return a dot (period)
    if (nonDot < 0)
        return '.';

    // if the first character is a dot, the filename is okay
    if (dot == 0)
        return 0;

    // if the file name has a dot, we only need to check up to the first dot
    if (dot > 0)
        len = dot;

    // Since the device names aren't numerous, this method of checking is the
    // fastest. Note that each character is checked with both cases.
    if (len == 3)
    {
        if (pFileName[0] == 'a' || pFileName[0] == 'A')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The digit test must use &&: with || every fourth character matched.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            pFileName[5] == '$')
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}
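
A caller has to tell the three kinds of return value apart. The sketch below shows one way to do that. `describeResult` is a hypothetical helper that is not part of the original source, and its message strings are purely illustrative. It relies only on the return-code contract documented in the comments above: zero for a valid name, a positive value for an illegal character, -1 for a general error, and -2 through -24 for the device names in enumeration order.

```cpp
#include <string>

// Hypothetical helper (not in the original source): turns a result of
// IsValidFileName into a readable message, following the documented contract.
std::string describeResult(int code)
{
    // Device names in the same order as the INVALID_FILENAME_ enumeration,
    // so that code -2 maps to index 0 and code -24 maps to index 22.
    static const char *const kDevices[] = {
        "CLOCK$", "AUX", "CON", "NUL", "PRN",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"};

    if (code == 0)
        return "valid";
    if (code == '.')
        return "name consists only of dots";
    if (code > 0)
        return "illegal character (code " + std::to_string(code) + ")";
    if (code == -1)
        return "NULL, empty, or longer than 255 characters";
    if (code >= -24)
        return std::string("reserved device name ") + kDevices[-code - 2];
    return "unknown result";
}
```

With the real function available, a call such as `describeResult(IsValidFileName("com1.log"))` would be expected to report the COM1 device, assuming the digit range check uses `&&`.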