What does enctype='multipart/form-data' mean? - Stack Overflow

when should we use it

Quentin's answer is right: use multipart/form-data if the form contains a file upload, and application/x-www-form-urlencoded otherwise, which is the default if you omit enctype.

I'm going to:

add some more HTML5 references

explain why he is right with a form submit example

HTML5 references

There are three possibilities for enctype:

application/x-www-form-urlencoded

multipart/form-data (the spec points to RFC 7578)

text/plain. This is "not reliably interpretable by computer", so it should never be used in production, and we will not look further into it.

How to generate the examples

Once you see an example of each method, it becomes obvious how they work, and when you should use each one.

You can produce examples using:

nc -l or another echo server

a user agent like a browser or cURL

Save the form to a minimal .html file:

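<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8"/>
  <title>upload</title>
</head>
<body>
<form action="http://localhost:8000" method="post" enctype="multipart/form-data">
  <p><input type="text" name="text1" value="text default">
  <p><input type="text" name="text2" value="a&#x03C9;b">
  <p><input type="file" name="file1">
  <p><input type="file" name="file2">
  <p><input type="file" name="file3">
  <p><button type="submit">Submit</button>
</form>
</body>
</html>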

We set the default text value to a&#x03C9;b, which means aωb because ω is U+03C9. In UTF-8, aωb is the bytes 61 CF 89 62.
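To double-check those byte values, a short Python snippet can encode the string and print the bytes in hex. Python is an assumption here; the original setup only uses shell tools.

```python
# The string held by the text2 field: 'a', GREEK SMALL LETTER OMEGA (U+03C9), 'b'.
text = "a\u03c9b"

# UTF-8 encodes U+03C9 as the two bytes CF 89, so 3 characters become 4 bytes.
encoded = text.encode("utf-8")

print(len(text), len(encoded))  # 3 4
print(encoded.hex(" "))         # 61 cf 89 62
```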

Create files to upload:

echo 'Content of a.txt.' > a.txt
echo 'Content of a.html.' > a.html
# Binary file containing 4 bytes: 'a', 0xCF, 0x89 and 'b' (aωb in UTF-8).
printf 'a\xCF\x89b' > binary

Run our little echo server:

while true; do printf '' | nc -l localhost 8000; done

Open the HTML file in your browser, select the files, click Submit, and check the terminal.

nc prints the request received.

Tested on: Ubuntu 14.04.3, nc BSD 1.105, Firefox 40.
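If nc is not available, a rough stand-in can be sketched with Python's standard socket module. This is only an illustration and not part of the tested setup above: the function names read_request and serve_forever and the one-second idle timeout are hypothetical choices. Like the nc loop, it sends no HTTP response, so the browser will eventually report an error after the request has been printed.

```python
import socket


def read_request(conn: socket.socket, idle_timeout: float = 1.0) -> bytes:
    """Read raw bytes until the peer closes or stays quiet for idle_timeout seconds."""
    conn.settimeout(idle_timeout)
    chunks = []
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # peer closed the connection
                break
            chunks.append(chunk)
    except socket.timeout:
        # Browsers keep the connection open while waiting for a reply,
        # so treat a quiet period as the end of the request.
        pass
    return b"".join(chunks)


def serve_forever(host: str = "localhost", port: int = 8000) -> None:
    """Accept connections one at a time and print each raw request."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _addr = server.accept()
            with conn:
                raw = read_request(conn)
                # Non-UTF-8 bytes are replaced so that printing never fails.
                print(raw.decode("utf-8", errors="replace"))
```

Calling serve_forever() and then submitting the form should print the same kind of request dump as the nc loop.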

multipart/form-data

Firefox sent:

POST / HTTP/1.1
[[ Less interesting headers ... ]]
Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150
Content-Length: 834

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="text1"

text default
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="text2"

aωb
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file1"; filename="a.txt"
Content-Type: text/plain

Content of a.txt.

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file2"; filename="a.html"
Content-Type: text/html

Content of a.html.

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file3"; filename="binary"
Content-Type: application/octet-stream

aωb
-----------------------------735323031399963166993862150--

For the binary file and text field, the bytes 61 CF 89 62 (aωb in UTF-8) are sent literally. You could verify that with nc -l localhost 8000 | hd, which says that the bytes:

61 CF 89 62

were sent (61 == 'a' and 62 == 'b').

Therefore it is clear that:

Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150 sets the content type to multipart/form-data and says that the fields are separated by the given boundary string.

But note that the:

boundary=---------------------------735323031399963166993862150

has two fewer dashes -- than the actual delimiter:

-----------------------------735323031399963166993862150

This is because the standard requires each delimiter to start with two dashes -- followed by the boundary value. The other dashes appear to be just how Firefox chose to implement the arbitrary boundary. RFC 7578 clearly mentions that those two leading dashes -- are required:

4.1. "Boundary" Parameter of multipart/form-data

As with other multipart types, the parts are delimited with a
boundary delimiter, constructed using CRLF, "--", and the value of
the "boundary" parameter.

Every field gets some sub-headers before its data: Content-Disposition: form-data;, the field name, and, for file fields, the filename and a Content-Type, followed by the data.

The server reads the data until the next boundary string. The browser must choose a boundary that will not appear in any of the fields, so this is why the boundary may vary between requests.

Because we have the unique boundary, no encoding of the data is necessary: binary data is sent as is.

TODO: what is the optimal boundary size (log(N) I bet), and name / running time of the algorithm that finds it? Asked at: https://cs.stackexchange.com/questions/39687/find-the-shortest-sequence-that-is-not-a-sub-sequence-of-a-set-of-sequences

Content-Type is automatically determined by the browser.
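The points above can be sketched in Python by assembling a multipart/form-data body by hand. This is a minimal illustration and not how Firefox or any other browser actually implements it: the names choose_boundary and build_multipart are hypothetical, and the 27 leading dashes only mimic the trace above. The boundary is drawn at random and redrawn if it happens to occur in any payload.

```python
import secrets

CRLF = b"\r\n"


def choose_boundary(payloads):
    """Return a random boundary that does not occur in any payload."""
    while True:
        boundary = b"-" * 27 + secrets.token_hex(16).encode("ascii")
        if not any(boundary in payload for payload in payloads):
            return boundary


def build_multipart(fields, files):
    """fields: {name: str}; files: {name: (filename, content_type, bytes)}."""
    payloads = [value.encode("utf-8") for value in fields.values()]
    payloads += [data for _filename, _ctype, data in files.values()]
    boundary = choose_boundary(payloads)
    # Each delimiter line is "--" plus the boundary value from the header.
    delimiter = b"--" + boundary
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                    # blank line ends the part headers
            value.encode("utf-8"),  # data is sent as is, no escaping
        ]
    for name, (filename, ctype, data) in files.items():
        disposition = (
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'
        )
        lines += [
            delimiter,
            disposition.encode("utf-8"),
            f"Content-Type: {ctype}".encode("ascii"),
            b"",
            data,                   # raw bytes, binary included
        ]
    # The closing delimiter carries two extra trailing dashes.
    lines += [delimiter + b"--", b""]
    header = "multipart/form-data; boundary=" + boundary.decode("ascii")
    return header, CRLF.join(lines)


header, body = build_multipart(
    {"text1": "text default", "text2": "a\u03c9b"},
    {"file3": ("binary", "application/octet-stream", b"a\xcf\x89b")},
)
print("Content-Type:", header)
print("Content-Length:", len(body))
```

Because the boundary is random, the printed header and length differ between runs, just as the boundary differs between browser requests.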

application/x-www-form-urlencoded

Now change the enctype to application/x-www-form-urlencoded, reload the browser, and resubmit.

Firefox sent:

POST / HTTP/1.1
[[ Less interesting headers ... ]]
Content-Type: application/x-www-form-urlencoded
Content-Length: 71

text1=text+default&text2=a%CF%89b&file1=a.txt&file2=a.html&file3=binary

Clearly the file data was not sent, only the basenames. So this cannot be used for files.

As for the text field, we see that usual printable characters like a and b were sent in one byte, while non-printable ones like 0xCF and 0x89 took up 3 bytes each: %CF%89!
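The same body can be reproduced with Python's urllib.parse, which makes the percent-encoding easy to inspect. This is only an illustration: the field list is copied from the request above, and a browser builds the body with its own code.

```python
from urllib.parse import parse_qsl, urlencode

# Field names and values as they appear in the request above.
# For file inputs only the file name is submitted, not the content.
fields = [
    ("text1", "text default"),
    ("text2", "a\u03c9b"),
    ("file1", "a.txt"),
    ("file2", "a.html"),
    ("file3", "binary"),
]

body = urlencode(fields)
print(body)
# text1=text+default&text2=a%CF%89b&file1=a.txt&file2=a.html&file3=binary

# Decoding reverses the process: '+' becomes a space, %CF%89 becomes ω.
decoded = parse_qsl(body)
print(decoded[1])  # ('text2', 'aωb')
```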

Comparison

File uploads often contain lots of non-printable characters (e.g. images), while text forms almost never do.

From the examples we have seen that:

multipart/form-data: adds a few bytes of boundary overhead to the message, and must spend some time calculating it, but sends each byte in one byte.

application/x-www-form-urlencoded: has a single byte boundary per field (&), but adds a linear overhead factor of 3x for every non-printable character.

Therefore, even if we could send files with application/x-www-form-urlencoded, we wouldn't want to, because it is so inefficient.

But for printable characters found in text fields, it does not matter and generates less overhead, so we just use it.
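To put rough numbers on this trade-off, the sketch below compares the encoded size of a binary payload and of a short text value under both schemes. The helper names urlencoded_size and multipart_size, the fixed boundary, and the part headers are all assumptions made for illustration, and the multipart figure covers a single file part only.

```python
from urllib.parse import quote_plus


def urlencoded_size(data: bytes) -> int:
    """Size of data after percent-encoding, as in x-www-form-urlencoded."""
    return len(quote_plus(data))


def multipart_size(data: bytes, boundary: bytes = b"-" * 27 + b"0123456789") -> int:
    """Size of a one-part multipart body carrying data unchanged."""
    head = (
        b"--" + boundary + b"\r\n"
        + b'Content-Disposition: form-data; name="file1"; filename="f"\r\n'
        + b"Content-Type: application/octet-stream\r\n\r\n"
    )
    tail = b"\r\n--" + boundary + b"--\r\n"
    return len(head) + len(data) + len(tail)


binary = bytes(range(256)) * 4   # 1024 bytes covering every byte value
text = b"hello"                  # printable ASCII only

for label, payload in (("binary", binary), ("text", text)):
    print(label, len(payload), urlencoded_size(payload), multipart_size(payload))
```

For the binary payload the urlencoded size grows to more than twice the input, while the multipart overhead is a constant number of bytes regardless of payload size. For the short text value the situation is reversed: urlencoding adds nothing, and the multipart headers dominate.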
