What characters are forbidden in Windows and Linux directory names?

This article is a translation of the question of the same name.

I know that / is illegal in Linux, and the following are illegal in Windows (I think): * . " / \ [ ] : ; | ,

What else am I missing?

I need a comprehensive guide, however, and one that takes into account double-byte characters. Linking to outside resources is fine with me.

I first need to create a directory on the filesystem using a name that may contain forbidden characters, so I plan to replace those characters with underscores. I then need to write this directory and its contents to a zip file (using Java), so any additional advice concerning the names of zip directories would be appreciated.


#1

Reference: https://stackoom.com/question/8I35/Windows和Linux目录名称中禁止使用什么字符


#2

A "comprehensive guide" of forbidden filename characters is not going to work on Windows because it reserves filenames as well as characters. Yes, characters like * " ? and others are forbidden, but there are an infinite number of names composed only of valid characters that are forbidden. For example, spaces and dots are valid filename characters, but names composed only of those characters are forbidden.

Windows does not distinguish between upper-case and lower-case characters in file names, so you cannot create a folder named A if one named a already exists. Worse, seemingly allowed names like PRN and CON, and many others, are reserved and not allowed. Windows also has several length restrictions; some of them apply to the full path, so a filename valid in one folder may become invalid if moved to another folder. The rules for naming files and folders are on MSDN.

You cannot, in general, use user-generated text to create Windows directory names. If you want to allow users to name anything they want, you have to create safe names like A, AB, A2 et al., store user-generated names and their path equivalents in an application data file, and perform path mapping in your application.
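The mapping idea above could be sketched in Java roughly as follows. This is a minimal illustration, not a definitive implementation: the class name `SafeNameMapper`, the `D0001`-style naming scheme, and the in-memory maps are all assumptions made for the example. A real application would persist the mapping in its data file, as the answer suggests.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: maps arbitrary user-supplied names to generated on-disk names. */
final class SafeNameMapper {
    // User-visible name -> generated safe name (insertion order kept for readability).
    private final Map<String, String> userToSafe = new LinkedHashMap<>();
    // Generated safe name -> user-visible name, for the reverse lookup.
    private final Map<String, String> safeToUser = new HashMap<>();

    /** Returns the safe name for a user name, generating a new one on first use. */
    String safeNameFor(String userName) {
        String safe = userToSafe.get(userName);
        if (safe == null) {
            // Assumed scheme: "D" plus a zero-padded counter, so the result contains
            // only upper-case ASCII letters and digits.
            safe = String.format(Locale.ROOT, "D%04d", userToSafe.size() + 1);
            userToSafe.put(userName, safe);
            safeToUser.put(safe, userName);
        }
        return safe;
    }

    /** Returns the original user name for a safe name, or null if it is unknown. */
    String userNameFor(String safeName) {
        return safeToUser.get(safeName);
    }
}
```

With this sketch, a user who names a folder `CON` or `my: report?` gets the directories `D0001` and `D0002` on disk, and the application shows the user-supplied names by looking them up in the map.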

If you absolutely must allow user-generated folder names, the only way to tell if they are invalid is to catch exceptions and assume the name is invalid. Even that is fraught with peril, as the exceptions thrown for denied access, offline drives, and out of drive space overlap with those that can be thrown for invalid names. You are opening up one huge can of hurt.
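The "try it and catch the exception" approach could look something like the Java sketch below. The class `DirectoryCreator` and its `Result` enum are hypothetical names for this example. Which exception a particular bad name produces depends on the platform and file system, so the `FAILED_OTHER` bucket deliberately lumps together exactly the overlapping cases the answer warns about.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

/** Hypothetical helper: attempts to create a directory and classifies the outcome. */
final class DirectoryCreator {
    enum Result { CREATED, ALREADY_EXISTS, REJECTED_NAME, FAILED_OTHER }

    static Result tryCreate(Path parent, String name) {
        try {
            // resolve() throws InvalidPathException for names the platform cannot
            // even represent as a path (for example, a name containing NUL).
            Path target = parent.resolve(name);
            // Reject names that do not resolve to a direct child of parent,
            // such as "a/b" or an empty string.
            if (!parent.equals(target.getParent())) {
                return Result.REJECTED_NAME;
            }
            Files.createDirectory(target);
            return Result.CREATED;
        } catch (InvalidPathException e) {
            return Result.REJECTED_NAME;
        } catch (FileAlreadyExistsException e) {
            return Result.ALREADY_EXISTS;
        } catch (IOException e) {
            // Invalid name, denied access, missing drive, full disk: not distinguishable here.
            return Result.FAILED_OTHER;
        }
    }
}
```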


#3

Well, if only for research purposes, then your best bet is to look at the Wikipedia entry on Filenames.

If you want to write a portable function to validate user input and create filenames based on that, the short answer is: don't. Take a look at a portable module like Perl's File::Spec to get a glimpse of all the hoops needed to accomplish such a "simple" task.


#4

Under Linux and other Unix-related systems, there are only two characters that cannot appear in the name of a file or directory, and those are NUL '\0' and slash '/'. The slash, of course, can appear in a path name, separating directory components.
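As a minimal sketch of that rule in Java (the class and method names are made up for illustration), a check for a single Unix name component only has to look for those two characters. The sketch additionally rejects the empty string, `.` and `..`, on the assumption that those can never be used as the name of a new entry; it does not check length limits.

```java
/** Hypothetical helper: checks one Unix file or directory name (not a whole path). */
final class UnixNames {
    static boolean isValidComponent(String name) {
        // Assumption: the empty name and the special entries "." and ".."
        // cannot be used for a new file or directory.
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return false;
        }
        // The only two forbidden characters: slash and NUL.
        return name.indexOf('/') < 0 && name.indexOf('\0') < 0;
    }
}
```

Note that names such as `a*b?:` or a single space pass this check: they are legal on Unix even though they are awkward to handle in a shell.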

Rumour [1] has it that Steve Bourne (of 'shell' fame) had a directory containing 254 files, one for every single letter (character code) that can appear in a file name (excluding / and '\0'; the name . was the current directory, of course). It was used to test the Bourne shell and routinely wrought havoc on unwary programs such as backup programs.

Other people have covered the Windows rules.

Note that Mac OS X has a case-insensitive file system by default.


[1] It was Kernighan & Pike in The Practice of Programming who said as much in Chapter 6, Testing, §6.5 Stress Tests:

"When Steve Bourne was writing his Unix shell (which came to be known as the Bourne shell), he made a directory of 254 files with one-character names, one for each byte value except '\0' and slash, the two characters that cannot appear in Unix file names. He used that directory for all manner of tests of pattern-matching and tokenization. (The test directory was of course created by a program.) For years afterwards, that directory was the bane of file-tree-walking programs; it tested them to destruction."

Note that the directory must have contained the entries . and .., so it was arguably 253 files (and 2 directories), or 255 name entries, rather than 254 files. This doesn't affect the effectiveness of the anecdote, or the careful testing it describes.


#5

Instead of creating a blacklist of characters, you could use a whitelist. All things considered, the range of characters that make sense in a file or directory name context is quite small, and unless you have some very specific naming requirements, your users will not hold it against your application if they cannot use the whole ASCII table.

It does not solve the problem of reserved names in the target file system, but with a whitelist it is easier to mitigate the risks at the source.

In that spirit, this is a range of characters that can be considered safe:

  • Letters (a-z, A-Z) - Unicode characters as well, if needed
  • Digits (0-9)
  • Underscore (_)
  • Hyphen (-)
  • Space
  • Dot (.)

And any additional safe characters you wish to allow. Beyond this, you just have to enforce some additional rules regarding spaces and dots. This is usually sufficient:

  • Name must contain at least one letter or number (to avoid only dots/spaces)
  • Name must start with a letter or number (to avoid leading dots/spaces)
  • Name may not end with a dot or space (simply trim those if present, like Explorer does)

This already allows quite complex and nonsensical names. For example, these names would be possible with these rules, and be valid file names in Windows/Linux:

  • A...........ext
  • B -.- .ext

In essence, even with so few whitelisted characters you should still decide what actually makes sense, and validate/adjust the name accordingly. In one of my applications, I used the same rules as above but stripped any duplicate dots and spaces.


#6

Let's keep it simple and answer the question first.

  1. The forbidden printable ASCII characters are:

    • Linux/Unix:

       / (forward slash)

    • Windows:

       < (less than)
       > (greater than)
       : (colon - sometimes appears to work, but actually creates an NTFS Alternate Data Stream)
       " (double quote)
       / (forward slash)
       \ (backslash)
       | (vertical bar or pipe)
       ? (question mark)
       * (asterisk)

  2. Non-printable characters

    If your data comes from a source that would permit non-printable characters then there is more to check for.

    • Linux/Unix:

       0 (NUL byte)

    • Windows:

       0-31 (ASCII control characters)

    Note: While it is legal under Linux/Unix file systems to create files with control characters in the filename, it might be a nightmare for the users to deal with such files.

  3. Reserved file names

    The following filenames are reserved:

    • Windows:

       CON, PRN, AUX, NUL
       COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
       LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9

      (both on their own and with arbitrary file extensions, e.g. LPT1.txt).

  4. Other rules

    • Windows:

      Filenames cannot end in a space or dot.
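Combining the rules in this answer with the questioner's plan of replacing forbidden characters with underscores, a sanitizer could be sketched in Java as below. The class name `WindowsNameSanitizer`, the choice to prefix reserved names with an underscore, and the `_` fallback for names that end up empty are all assumptions made for illustration. The sketch covers only the rules listed here, not path length limits or every Windows edge case.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: makes one name acceptable on both Windows and Linux. */
final class WindowsNameSanitizer {
    private static final String FORBIDDEN = "<>:\"/\\|?*";
    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "CON", "PRN", "AUX", "NUL",
            "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"));

    static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Control characters 0-31 and the forbidden printable characters become '_'.
            boolean bad = c < 32 || FORBIDDEN.indexOf(c) >= 0;
            sb.append(bad ? '_' : c);
        }
        // Names cannot end in a space or dot: strip them.
        int end = sb.length();
        while (end > 0 && (sb.charAt(end - 1) == '.' || sb.charAt(end - 1) == ' ')) {
            end--;
        }
        sb.setLength(end);
        if (sb.length() == 0) {
            return "_"; // assumed fallback for names with nothing left
        }
        // Reserved names apply with or without an extension, in any letter case.
        int dot = sb.indexOf(".");
        String base = (dot < 0 ? sb.toString() : sb.substring(0, dot)).toUpperCase(Locale.ROOT);
        if (RESERVED.contains(base)) {
            sb.insert(0, '_');
        }
        return sb.toString();
    }
}
```

For example, `sanitize("what?*.txt")` gives `what__.txt` and `sanitize("LPT1.txt")` gives `_LPT1.txt`. Two different inputs can map to the same output, so the caller still has to handle collisions.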
