File naming rules on different systems (partly based on material found online)

General file naming conventions on Microsoft Windows
Naming Conventions

The following fundamental rules enable applications to create and process valid names for files and directories, regardless of the file system:

(On Windows, these rules apply regardless of the underlying file system.)

  • Use a period to separate the base file name from the extension in the name of a directory or file.
  • (On Windows file systems, a "." separates the base name of a file or directory from its extension.)
  • Use a backslash (\) to separate the components of a path. The backslash divides the file name from the path to it, and one directory name from another directory name in a path. You cannot use a backslash in the name for the actual file or directory because it is a reserved character that separates the names into components.
  • (On Windows, "\" separates the components of a path. A "\" inside a name is therefore never part of the file name; it is always read as a path separator.)
  • Use a backslash as required as part of volume names, for example, the "C:\" in "C:\path\file" or the "\\server\share" in "\\server\share\path\file" for Universal Naming Convention (UNC) names. For more information about UNC names, see the Maximum Path Length Limitation section.
  • (On Windows, volume prefixes such as drive letters and UNC shares are also written with "\".)
  • Do not assume case sensitivity. For example, consider the names OSCAR, Oscar, and oscar to be the same, even though some file systems (such as a POSIX-compliant file system) may consider them as different. Note that NTFS supports POSIX semantics for case sensitivity but this is not the default behavior. For more information, see CreateFile.
  • (By default, Windows does not distinguish case in file names. This differs from Linux and Unix, which are strictly case-sensitive.)
  • Volume designators (drive letters) are similarly case-insensitive. For example, "D:\" and "d:\" refer to the same volume.
  • (On Windows, the same applies to drive letters.)
  • Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following: (The characters below cannot be part of a file name; characters in the extended ASCII range are allowed.)

    • The following reserved characters:

      • < (less than)
      • > (greater than)
      • : (colon)
      • " (double quote)
      • / (forward slash)
      • \ (backslash)
      • | (vertical bar or pipe)
      • ? (question mark)
      • * (asterisk)
      • (On Linux, all of the above except "/" are allowed in file names.)
    • Integer value zero, sometimes referred to as the ASCII NUL character. (The NUL character cannot appear in a file name.)
    • Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams. (The control characters 0x01 through 0x1F cannot be used in file names, although they are allowed in alternate data stream names.)
    • Any other character that the target file system does not allow. (Characters that the file system itself rejects.)
  • Use a period as a directory component in a path to represent the current directory, for example ".\temp.txt". For more information, see Paths.
  • Use two consecutive periods (..) as a directory component in a path to represent the parent of the current directory, for example "..\temp.txt". For more information, see Paths.
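  • (On Linux and Unix, "." and ".." likewise denote the current and parent directory, so neither can be used as a name; the same holds on Windows. A name made of three dots ("...") is allowed on Linux and Unix but not on Windows.)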
  • Do not use the following reserved device names for the name of a file:

    CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended. For more information, see Namespaces.

    (On Windows, the reserved words above cannot be used on their own as the base name of a file.)

  • Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not. However, it is acceptable to specify a period as the first character of a name. For example, ".temp".
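  • (On Windows, do not end a file name with a space: the trailing space is dropped automatically, so a file cannot be named with a single space. Linux and Unix allow both. It is also best to avoid names that start with ".", because Linux and Unix treat such files as hidden.)
  • (On Linux and Unix, a name cannot contain "/" or a run such as "///", but it can contain one or more "\" characters.)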
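The rules above can be condensed into a short, readable check. The following is only a minimal sketch under stated assumptions: `LooksLikeValidWindowsName` is a hypothetical helper (not a Windows API), it works byte by byte on a single path component, it accepts every byte of 128 and above, and it ignores length limits and any extra restrictions of the target file system.

```cpp
#include <algorithm>
#include <cassert>   // only needed for the assertions in the usage note
#include <cctype>
#include <string>

// Hypothetical helper, not part of any Windows API: returns true when `name`
// (one path component, not a full path) passes the rules listed above.
bool LooksLikeValidWindowsName(const std::string& name)
{
    // "." and ".." are directory references, not usable names.
    if (name.empty() || name == "." || name == "..")
        return false;

    // NUL, the control characters 1 through 31, and the reserved characters.
    const std::string reserved = "<>:\"/\\|?*";
    for (unsigned char ch : name) {
        if (ch < 32 || reserved.find(static_cast<char>(ch)) != std::string::npos)
            return false;
    }

    // The shell does not support a trailing space or period.
    if (name.back() == ' ' || name.back() == '.')
        return false;

    // Upper-case the part before the first period and compare it with the
    // reserved device names, so that "nul.txt" is rejected as well.
    std::string base = name.substr(0, name.find('.'));
    std::transform(base.begin(), base.end(), base.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    if (base == "CON" || base == "PRN" || base == "AUX" || base == "NUL")
        return false;
    if (base.size() == 4 &&
        (base.compare(0, 3, "COM") == 0 || base.compare(0, 3, "LPT") == 0) &&
        base[3] >= '1' && base[3] <= '9')
        return false;

    return true;
}
```

For example, `assert(LooksLikeValidWindowsName(".temp"))` holds, while `"NUL.txt"`, `"a|b"` and `"name "` are all rejected. The function favors readability over speed.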
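Also pay attention to the file name length limits of each file system.
For more details, see the following document:
(It implements a check for legal file names on Windows.)

Reference code: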
#ifndef __ISVALIDFILENAME_H__
#define __ISVALIDFILENAME_H__

#include <tchar.h>

enum
{
    ISVALID_FILENAME_ERROR = -1,
    INVALID_FILENAME_CLOCK = -2,
    INVALID_FILENAME_AUX   = -3,
    INVALID_FILENAME_CON   = -4,
    INVALID_FILENAME_NUL   = -5,
    INVALID_FILENAME_PRN   = -6,
    INVALID_FILENAME_COM1  = -7,
    INVALID_FILENAME_COM2  = -8,
    INVALID_FILENAME_COM3  = -9,
    INVALID_FILENAME_COM4  = -10,
    INVALID_FILENAME_COM5  = -11,
    INVALID_FILENAME_COM6  = -12,
    INVALID_FILENAME_COM7  = -13,
    INVALID_FILENAME_COM8  = -14,
    INVALID_FILENAME_COM9  = -15,
    INVALID_FILENAME_LPT1  = -16,
    INVALID_FILENAME_LPT2  = -17,
    INVALID_FILENAME_LPT3  = -18,
    INVALID_FILENAME_LPT4  = -19,
    INVALID_FILENAME_LPT5  = -20,
    INVALID_FILENAME_LPT6  = -21,
    INVALID_FILENAME_LPT7  = -22,
    INVALID_FILENAME_LPT8  = -23,
    INVALID_FILENAME_LPT9  = -24
};

// Declared extern here; the array itself is defined in the source file.
extern const TCHAR *pInvalidFileNameErrStr[];
const TCHAR *GetIsValidFileNameErrStr(int err);

int IsValidFileName(const char *pFileName);
int IsValidFileName(const wchar_t *pFileName);

/**************************************************************************
  Copyright 2002 Joseph Woodbury.

  Use of this file constitutes a full acceptance of the following license
  agreement:

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:

  1. Redistribution of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. A fee cannot be charged for any redistribution.

  3. Full source must accompany any redistribution in binary form which
     exposes the interfaces of that source whether directly or indirectly.

  4. This software cannot be used in such as matter as to cause it,
     or portions of it, in source and/or binary forms, to be covered,
     or required to be disclosed, by the GNU Public License (GPL) and/or
     any similarly structured software license.

  5. Any binaries produced using this software must fully indemnify the
     author with a disclaimer at least as effective and comprehensive as
     the following:

  THIS SOFTWARE IS PROVIDED BY JOSEPH WOODBURY "AS IS" AND ANY
  EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
  PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL JOSEPH WOODBURY BE LIABLE
  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
  CONSEQUENTIAL DAMAGES HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
  OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
  EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**************************************************************************/

#endif

#include "IsValidFileName.h"

///
// Strings are only for error display, they are not used by
// the IsValidFileName functions

const TCHAR *pInvalidFileNameErrStr[] =
{
    _T("Error"),
    _T("CLOCK$"),
    _T("AUX"),
    _T("CON"),
    _T("NUL"),
    _T("PRN"),
    _T("COM1"),
    _T("COM2"),
    _T("COM3"),
    _T("COM4"),
    _T("COM5"),
    _T("COM6"),
    _T("COM7"),
    _T("COM8"),
    _T("COM9"),
    _T("LPT1"),
    _T("LPT2"),
    _T("LPT3"),
    _T("LPT4"),
    _T("LPT5"),
    _T("LPT6"),
    _T("LPT7"),
    _T("LPT8"),
    _T("LPT9"),
    NULL
};

///
// const TCHAR *GetIsValidFileNameErrStr(int err)
//
//  Return a string associated with the passed error code
//
// Parameters
//
//  err   A negative error number returned by IsValidFileName
//
// Returns
//
//  A pointer to the device string
//
//  The string "Error" if err is ISVALID_FILENAME_ERROR or is not in the
//  range of the INVALID_FILENAME_ enumeration.
//

const TCHAR *GetIsValidFileNameErrStr(int err)
{
    if (err >= 0 || err < INVALID_FILENAME_LPT9)
        return pInvalidFileNameErrStr[0];

    // Error codes start at -1, so -1 maps to index 0, -2 to index 1, etc.
    return pInvalidFileNameErrStr[(-err) - 1];
}

///
// int IsValidFileName(const char *pFileName)
//
//  Ensure a file name is legal.
//
// Parameters
//
//  pFileName   The file name to check. This must be only the file name.
//              If a full path is passed, the check will fail.
//
// Returns
//
//  Zero on success.
//
//  Non-Zero on failure.
//
//      The return code can be used to determine why the call failed:
//
//      >0      The illegal character that was used. If the value is a
//              dot ('.', 46) the file name was nothing but dots.
//
//      -1      A NULL or zero length file name was passed, or the file
//              name exceeded 255 characters.
//
//      <-1     A device name was used. The value corresponds to the
//              INVALID_FILENAME_... series of enumerations. You can pass
//              this value to GetIsValidFileNameErrStr to obtain a pointer to
//              the name of this device.
//
// Remarks
//
//  The NT file naming convention specifies that:
//
//  - All characters greater than ASCII 31 may be used except for the following:
//
//      "*/:<>?\|
//
//  - A file may not be only dots
//
//  - The following device names cannot be used for a file name nor may they
//    be used for the first segment of a file name (that part which precedes the
//    first dot):
//
//    CLOCK$, AUX, CON, NUL, PRN, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8,
//    COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9
//
//  - Device names are case insensitive. aux, AUX, Aux, etc. are identical.
//
//  The ANSI and UNICODE functions are identical except for the declaration of the
//  argument and the unsigned comparison in the char version. A template function
//  could have been used, but I chose not to since there are only two valid cases.
//
//  The algorithm used looks convoluted because it is highly optimized. It is
//  more than 11.5 times faster than a scanning method using _strnicmp.
//

int IsValidFileName(const char *pFileName)
{
    if (!pFileName || !*pFileName)
        return ISVALID_FILENAME_ERROR;

    int nonDot = -1;    // position of the first non dot in the file name
    int dot = -1;       // position of the first dot in the file name
    int len = 0;        // length of the file name

    for (; len < 256 && pFileName[len]; len++)
    {
        // Compare as unsigned so that extended characters (128-255) are not
        // mistaken for control characters where plain char is signed.
        const unsigned char ch = static_cast<unsigned char>(pFileName[len]);

        if (ch == '.')
        {
            if (dot < 0)
                dot = len;
            continue;
        }
        else if (nonDot < 0)
            nonDot = len;

        // The upper characters can be passed with a single check and
        // since only the backslash and bar are above the at sign ('@')
        // it saves memory to do the check this way with little performance
        // cost.
        if (ch >= '@')
        {
            if (ch == '\\' || ch == '|')
                return ch;

            continue;
        }

        // Validity of the characters 32 (' ') through 63 ('?').
        static const bool isCharValid[32] =
        {
        //  ' '   !     "      #     $     %     &     '     (     )     *      +     ,      -     .      /
            true, true, false, true, true, true, true, true, true, true, false, true, true,  true, true,  false,
        //  0     1     2      3     4     5     6     7     8     9     :      ;     <      =     >      ?
            true, true, true,  true, true, true, true, true, true, true, false, true, false, true, false, false
        };

        // This is faster, at the expense of memory, than checking each
        // invalid character individually. However, either method is much
        // faster than using strchr().
        if (ch >= 32)
        {
            if (isCharValid[ch - 32])
                continue;
        }
        return ch;
    }

    if (len == 256)
        return ISVALID_FILENAME_ERROR;

    // if nonDot is still -1, no non-dots were encountered, return a dot (period)
    if (nonDot < 0)
        return '.';

    // if the first character is a dot, the filename is okay
    if (dot == 0)
        return 0;

    // if the file name has a dot, we only need to check up to the first dot
    if (dot > 0)
        len = dot;

    // Since the device names aren't numerous, this method of checking is the
    // fastest. Note that each character is checked with both cases.
    if (len == 3)
    {
        if (pFileName[0] == 'a' || pFileName[0] == 'A')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The last character must be a digit from 1 to 9 (both bounds must
        // hold), otherwise names such as "COME" would be flagged.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            pFileName[5] == '$')
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}

int IsValidFileName(const wchar_t *pFileName)
{
    if (!pFileName || !*pFileName)
        return ISVALID_FILENAME_ERROR;

    int nonDot = -1;    // position of the first non dot in the file name
    int dot = -1;       // position of the first dot in the file name
    int len = 0;        // length of the file name
    for (; len < 256 && pFileName[len]; len++)
    {
        if (pFileName[len] == '.')
        {
            if (dot < 0)
                dot = len;
            continue;
        }
        else if (nonDot < 0)
            nonDot = len;

        // The upper characters can be passed with a single check and
        // since only the backslash and bar are above the at sign ('@')
        // it saves memory to do the check this way with little performance
        // cost.
        if (pFileName[len] >= '@')
        {
            if (pFileName[len] == '\\' || pFileName[len] == '|')
                return pFileName[len];

            continue;
        }

        // Validity of the characters 32 (' ') through 63 ('?').
        static const bool isCharValid[32] =
        {
        //  ' '   !     "      #     $     %     &     '     (     )     *      +     ,      -     .      /
            true, true, false, true, true, true, true, true, true, true, false, true, true,  true, true,  false,
        //  0     1     2      3     4     5     6     7     8     9     :      ;     <      =     >      ?
            true, true, true,  true, true, true, true, true, true, true, false, true, false, true, false, false
        };

        // This is faster, at the expense of memory, than checking each
        // invalid character individually. However, either method is much
        // faster than using strchr().
        if (pFileName[len] >= 32)
        {
            if (isCharValid[pFileName[len] - 32])
                continue;
        }
        return pFileName[len];
    }

    if (len == 256)
        return ISVALID_FILENAME_ERROR;

    // if nonDot is still -1, no non-dots were encountered, return a dot (period)
    if (nonDot < 0)
        return '.';

    // if the first character is a dot, the filename is okay
    if (dot == 0)
        return 0;

    // if the file name has a dot, we only need to check up to the first dot
    if (dot > 0)
        len = dot;

    // Since the device names aren't numerous, this method of checking is the
    // fastest. Note that each character is checked with both cases.
    if (len == 3)
    {
        if (pFileName[0] == 'a' || pFileName[0] == 'A')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The last character must be a digit from 1 to 9 (both bounds must
        // hold), otherwise names such as "COME" would be flagged.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            pFileName[5] == '$')
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}

/**************************************************************************
  Copyright 2002 Joseph Woodbury.

  Use of this file constitutes a full acceptance of the following license
  agreement:

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:

  1. Redistribution of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. A fee cannot be charged for any redistribution.

  3. Full source must accompany any redistribution in binary form which
     exposes the interfaces of that source whether directly or indirectly.

  4. This software cannot be used in such as matter as to cause it,
     or portions of it, in source and/or binary forms, to be covered,
     or required to be disclosed, by the GNU Public License (GPL) and/or
     any similarly structured software license.

  5. Any binaries produced using this software must fully indemnify the
     author with a disclaimer at least as effective and comprehensive as
     the following:

  THIS SOFTWARE IS PROVIDED BY JOSEPH WOODBURY "AS IS" AND ANY
  EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
  PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL JOSEPH WOODBURY BE LIABLE
  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
  CONSEQUENTIAL DAMAGES HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
  OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
  EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**************************************************************************/

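The remarks in the source mention that a template function could have replaced the two overloads. A minimal sketch of that idea, limited to the character scan, might look as follows. `FirstIllegalChar` is a hypothetical name and not part of the original code; it assumes a non-null, NUL-terminated string and leaves out the length limit, the dots-only rule and the device-name checks.

```cpp
#include <cassert>      // only needed for the assertions in the usage note
#include <type_traits>

// Hypothetical template, not from the original source: returns the index of
// the first character the naming rules reject, or -1 if there is none.
// `name` must be a non-null, NUL-terminated string.
template <typename CharT>
int FirstIllegalChar(const CharT *name)
{
    // Go through the unsigned counterpart of CharT so that a char above 127
    // is not mistaken for a control character where plain char is signed.
    using UCharT = typename std::make_unsigned<CharT>::type;

    for (int i = 0; name[i]; ++i)
    {
        const unsigned long ch = static_cast<UCharT>(name[i]);

        // Control characters 1 through 31 are never allowed.
        if (ch < 32)
            return i;

        // The nine reserved characters.
        switch (ch)
        {
        case '"': case '*': case '/': case ':': case '<':
        case '>': case '?': case '\\': case '|':
            return i;
        default:
            break;
        }
    }
    return -1;
}
```

With this sketch, `assert(FirstIllegalChar("readme.txt") == -1)` and `assert(FirstIllegalChar(L"a?b") == 1)` both hold. One instantiation is generated per character type, so the generated code is comparable to the two hand-written overloads; the switch statement trades the lookup table for readability.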

1  #include "IsValidFileName.h"
  2  
  3  ///
  4  // Strings are only for error display, they are not used by
  5  // the IsValidFileName functions
  6  
  7  const TCHAR *pInvalidFileNameErrStr[] =
  8  {
  9      _T("Error"),
 10      _T("CLOCK$"),
 11      _T("AUX"),
 12      _T("CON"),
 13      _T("NUL"),
 14      _T("PRN"),
 15      _T("COM1"),
 16      _T("COM2"),
 17      _T("COM3"),
 18      _T("COM4"),
 19      _T("COM5"),
 20      _T("COM6"),
 21      _T("COM7"),
 22      _T("COM8"),
 23      _T("COM9"),
 24      _T("LPT1"),
 25      _T("LPT2"),
 26      _T("LPT3"),
 27      _T("LPT4"),
 28      _T("LPT5"),
 29      _T("LPT6"),
 30      _T("LPT7"),
 31      _T("LPT8"),
 32      _T("LPT9"),
 33      NULL
 34  };
 35  
 36  ///
 37  // const TCHAR *GetIsValidFileNameErrStr(int err)
 38  //
 39  //  Return an string associated with the passed error code
 40  //
 41  // Parameters
 42  //
 43  //  err   A negative error number returned by IsValidFileName
 44  //
 45  // Returns
 46  //
 47  //  A pointer to the device string
 48  //
 49  //  NULL if err is not in the range of the INVALID_FILENAME_ enumeration.
 50  //
 51  
 52  const TCHAR *GetIsValidFileNameErrStr(int err)
 53  {
 54      if (err >= 0 || err < INVALID_FILENAME_LPT9)
 55          return pInvalidFileNameErrStr[0];
 56  
 57      return pInvalidFileNameErrStr[(-err) - 1];
 58  }
 59  
 60  ///
 61  // int IsValidFileName(const char *pFileName)
 62  //
 63  //  Ensure a file name is legal.
 64  //
 65  // Parameters
 66  //
 67  //  pFileName   The file name to check. This must be only the file name.
 68  //              If a full path is passed, the check will fail.
 69  //
 70  // Returns
 71  //
 72  //  Zero on success.
 73  // 
 74  //  Non-Zero on failure.
 75  //
 76  //      The return code can be used to determine why the call failed:
 77  //
 78  //      >0       The illegal character that was used. If the value is a
 79  //              dot ('.', 46) the file name was nothing but dots.
 80  //
 81  //      -1      A NULL or zero length file name was passed, or the file
 82  //              name exceeded 255 characters.
 83  //
 84  //      <-1      A device name was used. The value corresponds to the
 85  //              INVALID_FILENAME_... series of enumerations. You can pass
 86  //              this value to GetIsValidFileNameErrStr to obtain a pointer to
 87  //              the name of this device.
 88  //
 89  // Remarks
 90  //
 91  //  The NT file naming convention specifies that:
 92  //
 93  //  - All characters greater than ASCII 31 to be used except for the following:
 94  //
 95  //      "*/:<>?\|
 96  //
 97  //  - A file may not be only dots
 98  //
 99  //  - The following device names cannot be used for a file name nor may they
100  //    be used for the first segment of a file name (that part which precedes the
101  //    first dot):
102  //
103  //    CLOCK$, AUX, CON, NUL, PRN, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8,
104  //    COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9
105  //
106  //  - Device names are case insensitve. aux, AUX, Aux, etc. are identical. 
107  //
108  //  The ANSI and UNICODE functions are identical except for the declaration of the
109  //  argument. A template function could have been used, but I chose not to since
110  //  there are only two valid cases.
111  //
112  //  The algorithm used looks convoluted because it is highly optimized. It is
113  //  more than 11.5 times faster than if scanning method using _strnicmp
114  //  was used.
115  //
116  
117  int IsValidFileName(const char *pFileName)
118  {
119      if (!pFileName || !*pFileName)
120          return ISVALID_FILENAME_ERROR;
121  
122      int nonDot = -1;    // position of the first non dot in the file name
123      int dot = -1;       // position of the first dot in the file name
124      int len = 0;        // length of the file name
125  
126      // If a non-dot character has been encountered
127  
128      for (; len < 256 && pFileName[len]; len++)
129      {
130          if (pFileName[len] == '.')
131          {
132              if (dot < 0)
133                  dot = len;
134              continue;
135          }
136          else if (nonDot < 0)
137              nonDot = len;
138  
139          // The upper characters can be passed with a single check and
140          // since only the backslash and bar are above the ampersand
141          // it saves memory to do the check this way with little performance
142          // cost.
143          if (pFileName[len] >= '@')
144          {
145              if (pFileName[len] == '\\' || pFileName[len] == '|')
146                  return pFileName[len];
147  
148              continue;
149          }
150  
151          static bool isCharValid[32] =
152          {
153          //  ' '   !     "      #     $     %     &     '     (     )     *      +     ,      -     .      / 
154              true, true, false, true, true, true, true, true, true, true, false, true, true,  true, true,  false,
155          //  0     1     2      3     4     5     6     7     8     9     :      ;     <      =     >      ?
156              true, true, true,  true, true, true, true, true, true, true, false, true, false, true, false, false
157          //  0     1     2      3     4     5     6     7     8     9     :      ;     <      =     >      ?
158          };
159  
160          // This is faster, at the expense of memory, than checking each
161          // invalid character individually. However, either method is much
162          // faster than using strchr().
163          if (pFileName[len] >= 32)
164          {
165              if (isCharValid[pFileName[len] - 32])
166                  continue;
167          }
168          return pFileName[len];
169      }
170  
171      if (len == 256)
172          return ISVALID_FILENAME_ERROR;
173  
174      // if nonDot is still -1, no non-dots were encountered, return a dot (period)
175      if (nonDot < 0)
176          return '.';
177  
178      // if the first character is a dot, the filename is okay
179      if (dot == 0)
180          return 0;
181  
182      // if the file name has a dot, we only need to check up to the first dot
183      if (dot > 0)
184          len = dot;
185  
186      // Since the device names aren't numerous, this method of checking is the
187      // fastest. Note that each character is checked with both cases.
188      if (len == 3)
189      {
190          if (pFileName[0] == 'a' || pFileName[0] == 'A')
191          {
192              if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The fourth character must be a digit in the range '1'..'9'.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            (pFileName[5] == '$'))
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}

int IsValidFileName(const wchar_t *pFileName)
{
    if (!pFileName || !*pFileName)
        return ISVALID_FILENAME_ERROR;

    int nonDot = -1;    // position of the first non-dot character in the file name
    int dot = -1;       // position of the first dot in the file name
    int len = 0;        // length of the file name
    for (; len < 256 && pFileName[len]; len++)
    {
        if (pFileName[len] == '.')
        {
            if (dot < 0)
                dot = len;
            continue;
        }
        else if (nonDot < 0)
            nonDot = len;

        // Characters at or above the at sign ('@') can be accepted with a
        // single range check, since the backslash and the vertical bar are
        // the only invalid characters in that range. This keeps the lookup
        // table below small at little performance cost.
        if (pFileName[len] >= '@')
        {
            if (pFileName[len] == '\\' || pFileName[len] == '|')
                return pFileName[len];

            continue;
        }

        static const bool isCharValid[32] =
        {
        //  ' '   !     "      #     $     %     &     '     (     )     *      +     ,      -     .      /
            true, true, false, true, true, true, true, true, true, true, false, true, true,  true, true,  false,
        //  0     1     2      3     4     5     6     7     8     9     :      ;     <      =     >      ?
            true, true, true,  true, true, true, true, true, true, true, false, true, false, true, false, false
        };

        // This is faster, at the expense of memory, than checking each
        // invalid character individually. However, either method is much
        // faster than using strchr().
        if (pFileName[len] >= 32)
        {
            if (isCharValid[pFileName[len] - 32])
                continue;
        }
        // Control characters (below 32) and the characters marked false in
        // the table are invalid; return the offending character.
        return pFileName[len];
    }

    if (len == 256)
        return ISVALID_FILENAME_ERROR;

    // if nonDot is still -1, no non-dots were encountered, return a dot (period)
    if (nonDot < 0)
        return '.';

    // if the first character is a dot, the filename is okay
    if (dot == 0)
        return 0;

    // if the file name has a dot, we only need to check up to the first dot
    if (dot > 0)
        len = dot;

    // Since the device names aren't numerous, this method of checking is the
    // fastest. Note that each character is checked with both cases.
    if (len == 3)
    {
        if (pFileName[0] == 'a' || pFileName[0] == 'A')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'x' || pFileName[2] == 'X'))
                return INVALID_FILENAME_AUX;
        }
        else if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_CON;
        }
        else if (pFileName[0] == 'n' || pFileName[0] == 'N')
        {
            if ((pFileName[1] == 'u' || pFileName[1] == 'U') &&
                (pFileName[2] == 'l' || pFileName[2] == 'L'))
                return INVALID_FILENAME_NUL;
        }
        else if (pFileName[0] == 'p' || pFileName[0] == 'P')
        {
            if ((pFileName[1] == 'r' || pFileName[1] == 'R') &&
                (pFileName[2] == 'n' || pFileName[2] == 'N'))
                return INVALID_FILENAME_PRN;
        }
    }
    else if (len == 4)
    {
        // The fourth character must be a digit in the range '1'..'9'.
        if (pFileName[0] == 'c' || pFileName[0] == 'C')
        {
            if ((pFileName[1] == 'o' || pFileName[1] == 'O') &&
                (pFileName[2] == 'm' || pFileName[2] == 'M') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_COM1 - (pFileName[3] - '1');
        }
        else if (pFileName[0] == 'l' || pFileName[0] == 'L')
        {
            if ((pFileName[1] == 'p' || pFileName[1] == 'P') &&
                (pFileName[2] == 't' || pFileName[2] == 'T') &&
                (pFileName[3] >= '1' && pFileName[3] <= '9'))
                return INVALID_FILENAME_LPT1 - (pFileName[3] - '1');
        }
    }
    else if (len == 6)
    {
        if ((pFileName[0] == 'c' || pFileName[0] == 'C') &&
            (pFileName[1] == 'l' || pFileName[1] == 'L') &&
            (pFileName[2] == 'o' || pFileName[2] == 'O') &&
            (pFileName[3] == 'c' || pFileName[3] == 'C') &&
            (pFileName[4] == 'k' || pFileName[4] == 'K') &&
            (pFileName[5] == '$'))
            return INVALID_FILENAME_CLOCK;
    }

    return 0;
}

/**************************************************************************
  Copyright 2002 Joseph Woodbury.

  Use of this file constitutes a full acceptance of the following license
  agreement:

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:

  1. Redistribution of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. A fee cannot be charged for any redistribution.

  3. Full source must accompany any redistribution in binary form which
     exposes the interfaces of that source whether directly or indirectly.

  4. This software cannot be used in such as matter as to cause it,
     or portions of it, in source and/or binary forms, to be covered,
     or required to be disclosed, by the GNU Public License (GPL) and/or
     any similarly structured software license.

  5. Any binaries produced using this software must fully indemnify the
     author with a disclaimer at least as effective and comprehensive as
     the following:

  THIS SOFTWARE IS PROVIDED BY JOSEPH WOODBURY "AS IS" AND ANY
  EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
  PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL JOSEPH WOODBURY BE LIABLE
  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
  CONSEQUENTIAL DAMAGES HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
  OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
  EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**************************************************************************/

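The listing above checks the reserved device names by comparing each character against both cases by hand. As a rough alternative, the same check can be sketched with a small table and a case-insensitive comparison. This is only an illustration under stated assumptions: `IsReservedDeviceName` and `kReserved` are hypothetical names that do not appear in the original source, the function returns a plain `bool` instead of the `INVALID_FILENAME_*` codes, and, as in the listing, only the text before the first dot is examined.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper (not part of the original listing): returns true when
// the part of `name` before the first dot is one of the device names that
// the listing above rejects.
bool IsReservedDeviceName(const std::wstring& name)
{
    static const wchar_t* const kReserved[] = {
        L"AUX", L"CON", L"NUL", L"PRN", L"CLOCK$",
        L"COM1", L"COM2", L"COM3", L"COM4", L"COM5",
        L"COM6", L"COM7", L"COM8", L"COM9",
        L"LPT1", L"LPT2", L"LPT3", L"LPT4", L"LPT5",
        L"LPT6", L"LPT7", L"LPT8", L"LPT9"
    };

    // Keep only the base name. If there is no dot, find() returns npos and
    // substr() yields the whole string.
    std::wstring base = name.substr(0, name.find(L'.'));

    // Upper-case the base name so a single comparison covers both cases.
    for (wchar_t& ch : base)
        ch = static_cast<wchar_t>(std::towupper(ch));

    for (const wchar_t* reserved : kReserved)
    {
        if (base == reserved)
            return true;
    }
    return false;
}
```

The table version is slower than the hand-unrolled comparisons in the listing, but it is easier to audit and avoids range mistakes such as writing `||` where `&&` is needed for the `'1'`..`'9'` digit check.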
