WebServer

Water Fan

Category: Servlet

Article link: https://blog.csdn.net/qq_41765518/article/details/106843070

Preface
Afterword
- About Exceptions

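Preface

Web servers:

C/S (client/server) software: the vendor develops both the client and the server and also defines the data format exchanged between them, that is, a custom communication protocol. QQ is an example.
B/S (browser/server) software: the communication protocol, HTTP, is a public standard (developed through the W3C and the IETF), so clients and servers written by any organization can interoperate as long as they follow HTTP. Such a client is usually called a browser, and the server is called a Web server.
Compared with C/S software, B/S software decouples the client from the server. A software company no longer has to develop its own client or define its own protocol: as long as its server follows HTTP, any browser can reach it. This is what opened the Internet era.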

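1. HTTP Protocol: Request Data

1. The HTTP protocol:

In software development, what the client sends to the server is generally called a request, and what the server sends back to the client is called a response.
HTTP is a stateless communication protocol. It specifies the format of the transmitted data and follows a one-request, one-response rule.
HTTP specifies that a request consists of three parts: the request line, the message headers, and the message body. The request line and each header line end with CRLF, and an empty line (a bare CRLF) marks the end of the headers. The body, if present, is not terminated by CRLF.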

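2. CR and LF are carriage return and line feed:

Carriage return (CR, \r, ASCII 13) moves the cursor back to the start of the line. Line feed (LF, \n, ASCII 10) moves it down one line. Together they mean "go to the start of the next line". A line feed alone only moves down one line and leaves the cursor in the same column; it takes the carriage return to bring it back to the start. You can also think of the pair as the Enter key on the keyboard.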
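To make the CRLF rule concrete, here is a minimal sketch of reading one CRLF-terminated line from a byte stream. The class name CrlfLineReader and the method readLine are made up for this illustration and are not taken from the WebServer code of this series. Each byte is treated as one character, which is enough for the ASCII request line and headers.

```java
import java.io.IOException;
import java.io.InputStream;

class CrlfLineReader {
    /**
     * Reads bytes until the next CRLF and returns the text before it.
     * Returns null when the stream ends before any byte was read.
     */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int prev = -1; // previous byte, -1 means "none yet"
        int cur;
        while ((cur = in.read()) != -1) {
            if (prev == '\r' && cur == '\n') {
                sb.setLength(sb.length() - 1); // drop the CR that was already appended
                return sb.toString();
            }
            sb.append((char) cur); // one byte becomes one char (ASCII / ISO-8859-1)
            prev = cur;
        }
        // Stream ended without a CRLF: return what was read, or null if nothing was
        return sb.length() == 0 ? null : sb.toString();
    }
}
```

Called repeatedly on a request, this returns the request line first, then one header per call, and finally an empty string for the blank line that ends the headers.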

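3. Request line

The request line also has three parts, separated by single spaces. There is no space between the last part and the CRLF, although one is sometimes written for readability.
The first part is the request method.
The second part is the path of the requested resource.
The third part is the protocol version, which tells the server which version of HTTP the browser is using.
For example: GET /index.html HTTP/1.1CRLF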
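The three-part format above can be sketched as a small parser. This is only an illustration under assumptions: the class RequestLine and its field names are hypothetical, and the input is expected to be the line with its trailing CRLF already removed.

```java
class RequestLine {
    final String method;   // e.g. GET
    final String path;     // e.g. /index.html
    final String version;  // e.g. HTTP/1.1

    RequestLine(String method, String path, String version) {
        this.method = method;
        this.path = path;
        this.version = version;
    }

    /** Splits a request line (without the trailing CRLF) on single spaces. */
    static RequestLine parse(String line) {
        String[] parts = line.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine rl = RequestLine.parse("GET /index.html HTTP/1.1");
        System.out.println(rl.method + " | " + rl.path + " | " + rl.version);
    }
}
```

Rejecting anything that does not have exactly three parts keeps a malformed request from being half-processed later.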

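4. Request methods

The first part of the request line is the request method. The two most commonly used methods are GET and POST.
GET: the request data is sent to the server as part of the address shown in the browser's address bar, in the query string of the URL.
POST: the request data is packaged into the message body and sent to the server. Uploading attachments and images, or submitting forms that contain passwords, all use POST.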
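The difference between the two methods is easiest to see in the raw request text. The sketch below builds the same form data once as a GET and once as a POST. The class RequestTextDemo, the /login path, and the Host value localhost are assumptions chosen for illustration only.

```java
import java.nio.charset.StandardCharsets;

class RequestTextDemo {
    /** GET: the form data travels in the query string of the request line. */
    static String buildGet(String path, String formData) {
        return "GET " + path + "?" + formData + " HTTP/1.1\r\n"
             + "Host: localhost\r\n"
             + "\r\n"; // empty line: end of headers, no body follows
    }

    /** POST: the form data travels in the message body after the empty line. */
    static String buildPost(String path, String formData) {
        int length = formData.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
             + "Host: localhost\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"
             + formData;
    }

    public static void main(String[] args) {
        System.out.println(buildGet("/login", "user=tom&pwd=123"));
        System.out.println(buildPost("/login", "user=tom&pwd=123"));
    }
}
```

In the GET version the password is visible in the address itself, which is one reason login forms use POST.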

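5. URI

URI (Uniform Resource Identifier): a string that identifies a resource. Put simply, the address in the browser's address bar is a URI. More precisely it is a URL (Uniform Resource Locator), the kind of URI that also describes where the resource is located on the network. Its basic form has three parts:
Part 1 is the protocol (scheme);
Part 2 is the host name or IP address;
Part 3 is the path of the resource on that host.
The host in part 2 can also be written as a domain name. The protocol in a URI is not limited to http; ftp is another common one, and there are many more.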
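The three parts can be pulled out of an address with the standard java.net.URI class. This is a small sketch; the class UriParts and the method describe are names invented for the example.

```java
import java.net.URI;

class UriParts {
    /** Returns "scheme | host | path" for the given address. */
    static String describe(String address) {
        URI uri = URI.create(address);
        return uri.getScheme() + " | " + uri.getHost() + " | " + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.baidu.com/index.html"));
        System.out.println(describe("ftp://192.168.1.10/files/a.txt"));
    }
}
```

URI.create throws IllegalArgumentException for a string that is not a valid URI, so real code that handles user input would catch that.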

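6. Request path and protocol version

The second part of the request line is the request resource path (ServletPath). It is the third part of the URI, the path of the resource on the server host. We can simply call it the access path.
The third part of the request line is the HTTP protocol version, usually HTTP/1.1.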

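7. DNS

DNS (Domain Name System): the system that resolves domain names. In short, what you type in the browser's address bar is casually called a web address, but the host part of it is really a domain name, for example www.baidu.com.
So how does typing that address connect you to Baidu's server? On the Internet every computer has an IP address, something like 127.0.0.1 (strictly, that particular address is the loopback address of the local machine; it is used here only to show the format). Every server has its own IP address, and machines on the Internet locate each other by IP. The annoying part is that numbers are hard for people to remember. Take phone numbers: people usually remember names instead, because a phone number has no inherent connection to its owner and is hard to memorize, while a name is easy.
So every server on the network has its own IP address, but expecting users to memorize IPs is unrealistic. That is why DNS exists: it translates names that people can remember into the corresponding IP addresses. When you visit Taobao, you do not reach Taobao right away. When you enter www.taobao.com, the browser first asks a DNS server, "I want to visit Taobao; what is its IP?" DNS replies with the IP, and the browser then uses that IP to reach Taobao.
Many countries run their own DNS servers; China has them and so does the United States. If you want to build a website of your own someday, you first have to register a domain name, which amounts to registering a name like the ones above. You then put your server on the public network, where it gets a fixed public IP address, and at the place where you registered the domain you declare which IP address the domain maps to. Once that is set, a record exists in DNS, and anyone who visits the domain is directed to your IP and therefore to your server.
So when you visit http://www.baidu.com/index.html, you can think of it as corresponding to a URI that contains the IP address: http://163.177.151.109/index.html. From the name www.baidu.com we find the server's IP address 163.177.151.109, and /index.html is the path of the resource you want on that server.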
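In Java, the name-to-IP step can be tried out with java.net.InetAddress. The sketch below is an illustration only: the class DnsLookup and the method resolve are invented names, the result for a real domain depends on your network and DNS server, and a literal IP address is returned unchanged without any lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class DnsLookup {
    /** Resolves a host name to the textual form of one of its IP addresses. */
    static String resolve(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A domain name triggers a real DNS query; the answer varies by network
        System.out.println(resolve("www.baidu.com"));
        // A literal IP address is only parsed, no query is sent
        System.out.println(resolve("127.0.0.1"));
    }
}
```

A browser performs the same kind of lookup before it opens the TCP connection to the server.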

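8. Message headers and message body

A request has three parts in total: after the request line come the message headers and the message body. The headers are sent as a series of name: value pairs, one per line, and each line ends with CRLF. (The body of a form submission is also made of key=value pairs, but a body can carry any kind of data.) HTTP defines headers for data that the browser or the server itself needs, such as the host address and port, the connection state, browser engine information, and the type and format of the data. Ordinary cases seldom need these values, but some special cases do. For example, a site may pop up a message saying it does not support your browser, or it may otherwise know whether you are using Chrome, Firefox, or IE. It learns this from the message headers.
The headers also carry information about the message body. For a POST request, for example, they say how large the packaged data is. If you build something like a voting feature and have to deal with disguised requests, the headers also tell you a good deal about the client. We will not go into that here. It is enough to know a few simple headers: User-Agent gives the browser's name and engine (IE or otherwise); Content-Type and Content-Length, two common header fields, give the type of the data being sent and its length in bytes, so the receiver knows how much to read afterwards. In short, headers mostly describe the body that follows, along with information about the browser itself.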
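A minimal sketch of turning the header lines into a map follows. The class HeaderParser is a made-up name, and lower-casing the header names is a design choice of this example: HTTP header names are case-insensitive, so storing them in one case makes lookups predictable.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class HeaderParser {
    /** Parses "Name: value" lines separated by CRLF; stops at the first empty line. */
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r\n")) {
            if (line.isEmpty()) {
                break; // empty line marks the end of the headers
            }
            int colon = line.indexOf(':'); // only the first colon separates name and value
            if (colon <= 0) {
                continue; // skip lines that are not in "Name: value" form
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String block = "Host: localhost:8088\r\nUser-Agent: Mozilla/5.0\r\nContent-Length: 16\r\n\r\n";
        System.out.println(parse(block));
    }
}
```

Splitting on the first colon only matters for values such as localhost:8088, which contain a colon themselves.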

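9. Understanding the request message

To summarize the request: it consists of the request line followed by the message headers and the message body. The request line ends with CRLF, each header line ends with CRLF, and between the headers and the body there is one empty line marking the end of all the headers. Reading a request therefore feels like reading text line by line. The relevant sections of the HTTP/1.1 specification (RFC 2616) are quoted below.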
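Putting the pieces together, a whole request can be read in three steps: the request line, the header lines up to the empty line, and then exactly Content-Length bytes of body. The sketch below assumes a body is announced only through Content-Length (Transfer-Encoding is ignored). The class SimpleHttpRequest and its members are hypothetical names, not the classes used in the later versions of this WebServer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class SimpleHttpRequest {
    String method;
    String path;
    String version;
    final Map<String, String> headers = new LinkedHashMap<>();
    byte[] body = new byte[0];

    static SimpleHttpRequest read(InputStream in) throws IOException {
        SimpleHttpRequest req = new SimpleHttpRequest();

        // 1. Request line: Method SP Request-URI SP HTTP-Version
        String requestLine = readLine(in);
        if (requestLine == null) throw new IOException("Empty request");
        String[] parts = requestLine.split(" ");
        if (parts.length != 3) throw new IOException("Malformed request line: " + requestLine);
        req.method = parts[0];
        req.path = parts[1];
        req.version = parts[2];

        // 2. Headers: one "Name: value" per line, ended by an empty line
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                req.headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                                line.substring(colon + 1).trim());
            }
        }

        // 3. Body: read exactly as many bytes as Content-Length announces
        String length = req.headers.get("content-length");
        if (length != null) {
            byte[] buf = new byte[Integer.parseInt(length)];
            int off = 0;
            while (off < buf.length) {
                int n = in.read(buf, off, buf.length - off);
                if (n == -1) throw new IOException("Body shorter than Content-Length");
                off += n;
            }
            req.body = buf;
        }
        return req;
    }

    /** Reads up to the next CRLF; returns null if the stream ends with nothing read. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int prev = -1;
        int cur;
        while ((cur = in.read()) != -1) {
            if (prev == '\r' && cur == '\n') {
                sb.setLength(sb.length() - 1); // remove the CR
                return sb.toString();
            }
            sb.append((char) cur);
            prev = cur;
        }
        return sb.length() == 0 ? null : sb.toString();
    }
}
```

The body is read by byte count rather than by line because, unlike the headers, it has no CRLF terminator.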

3.2.2 http URL

   The "http" scheme is used to locate network resources via the HTTP
   protocol. This section defines the scheme-specific syntax and
   semantics for http URLs.

   http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

   If the port is empty or not given, port 80 is assumed. The semantics
   are that the identified resource is located at the server listening
   for TCP connections on that port of that host, and the Request-URI
   for the resource is abs_path (section 5.1.2). The use of IP addresses
   in URLs SHOULD be avoided whenever possible (see RFC 1900 [24]). If
   the abs_path is not present in the URL, it MUST be given as "/" when
   used as a Request-URI for a resource (section 5.1.2). If a proxy
   receives a host name which is not a fully qualified domain name, it
   MAY add its domain to the host name it received. If a proxy receives
   a fully qualified domain name, the proxy MUST NOT change the host
   name.
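Two defaults from the section above, port 80 when no port is given and "/" when the abs_path is absent, can be sketched with java.net.URI. The class HttpUrlDefaults and its method names are assumptions made for this example.

```java
import java.net.URI;

class HttpUrlDefaults {
    /** Port 80 is assumed when the URL does not give one (URI.getPort() returns -1). */
    static int effectivePort(URI url) {
        return url.getPort() == -1 ? 80 : url.getPort();
    }

    /** The Request-URI is the abs_path, or "/" if it is absent, plus the query if any. */
    static String requestUri(URI url) {
        String path = url.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String query = url.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        URI bare = URI.create("http://www.baidu.com");
        System.out.println(effectivePort(bare) + " " + requestUri(bare));
    }
}
```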

4.1 Message Types

   HTTP messages consist of requests from client to server and responses
   from server to client.

       HTTP-message   = Request | Response     ; HTTP/1.1 messages

   Request (section 5) and Response (section 6) messages use the generic
   message format of RFC 822 [9] for transferring entities (the payload
   of the message). Both types of message consist of a start-line, zero
   or more header fields (also known as "headers"), an empty line (i.e.,
   a line with nothing preceding the CRLF) indicating the end of the
   header fields, and possibly a message-body.

        generic-message = start-line
                          *(message-header CRLF)
                          CRLF
                          [ message-body ]
        start-line      = Request-Line | Status-Line

   In the interest of robustness, servers SHOULD ignore any empty
   line(s) received where a Request-Line is expected. In other words, if
   the server is reading the protocol stream at the beginning of a
   message and receives a CRLF first, it should ignore the CRLF.

   Certain buggy HTTP/1.0 client implementations generate extra CRLF's
   after a POST request. To restate what is explicitly forbidden by the
   BNF, an HTTP/1.1 client MUST NOT preface or follow a request with an
   extra CRLF.

4.3 Message Body

   The message-body (if any) of an HTTP message is used to carry the
   entity-body associated with the request or response. The message-body
   differs from the entity-body only when a transfer-coding has been
   applied, as indicated by the Transfer-Encoding header field (section
   14.41).

       message-body = entity-body
                    | <entity-body encoded as per Transfer-Encoding>

   Transfer-Encoding MUST be used to indicate any transfer-codings
   applied by an application to ensure safe and proper transfer of the
   message. Transfer-Encoding is a property of the message, not of the
   entity, and thus MAY be added or removed by any application along the
   request/response chain. (However, section 3.6 places restrictions on
   when certain transfer-codings may be used.)

   The rules for when a message-body is allowed in a message differ for
   requests and responses.

   The presence of a message-body in a request is signaled by the
   inclusion of a Content-Length or Transfer-Encoding header field in
   the request's message-headers. A message-body MUST NOT be included in
   a request if the specification of the request method (section 5.1.1)
   does not allow sending an entity-body in requests. A server SHOULD
   read and forward a message-body on any request; if the request method
   does not include defined semantics for an entity-body, then the
   message-body SHOULD be ignored when handling the request.

   For response messages, whether or not a message-body is included with
   a message is dependent on both the request method and the response
   status code (section 6.1.1). All responses to the HEAD request method
   MUST NOT include a message-body, even though the presence of entity-
   header fields might lead one to believe they do. All 1xx
   (informational), 204 (no content), and 304 (not modified) responses
   MUST NOT include a message-body. All other responses do include a
   message-body, although it MAY be of zero length.
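The response rules in the last paragraph above can be written as one small check. This is a sketch of those rules only; the class ResponseBodyRule and the method mayHaveBody are invented names.

```java
class ResponseBodyRule {
    /**
     * Returns false for the cases where a response MUST NOT include a message-body:
     * any response to HEAD, all 1xx responses, 204 and 304.
     */
    static boolean mayHaveBody(String requestMethod, int status) {
        if ("HEAD".equals(requestMethod)) {
            return false;
        }
        if (status >= 100 && status < 200) {
            return false;
        }
        return status != 204 && status != 304;
    }

    public static void main(String[] args) {
        System.out.println(mayHaveBody("GET", 200));
        System.out.println(mayHaveBody("HEAD", 200));
        System.out.println(mayHaveBody("GET", 304));
    }
}
```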

5 Request

   A request message from a client to a server includes, within the
   first line of that message, the method to be applied to the resource,
   the identifier of the resource, and the protocol version in use.

        Request       = Request-Line              ; Section 5.1
                        *(( general-header        ; Section 4.5
                         | request-header         ; Section 5.3
                         | entity-header ) CRLF)  ; Section 7.1
                        CRLF
                        [ message-body ]          ; Section 4.3

5.1 Request-Line

   The Request-Line begins with a method token, followed by the
   Request-URI and the protocol version, and ending with CRLF. The
   elements are separated by SP characters. No CR or LF is allowed
   except in the final CRLF sequence.

        Request-Line   = Method SP Request-URI SP HTTP-Version CRLF

5.1.1 Method

   The Method  token indicates the method to be performed on the
   resource identified by the Request-URI. The method is case-sensitive.

       Method         = "OPTIONS"                ; Section 9.2
                      | "GET"                    ; Section 9.3
                      | "HEAD"                   ; Section 9.4
                      | "POST"                   ; Section 9.5
                      | "PUT"                    ; Section 9.6
                      | "DELETE"                 ; Section 9.7
                      | "TRACE"                  ; Section 9.8
                      | "CONNECT"                ; Section 9.9
                      | extension-method
       extension-method = token

   The list of methods allowed by a resource can be specified in an
   Allow header field (section 14.7). The return code of the response
   always notifies the client whether a method is currently allowed on a
   resource, since the set of allowed methods can change dynamically. An
   origin server SHOULD return the status code 405 (Method Not Allowed)
   if the method is known by the origin server but not allowed for the
   requested resource, and 501 (Not Implemented) if the method is
   unrecognized or not implemented by the origin server. The methods GET
   and HEAD MUST be supported by all general-purpose servers. All other
   methods are OPTIONAL; however, if the above methods are implemented,
   they MUST be implemented with the same semantics as those specified
   in section 9.
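The choice between 405 and 501 described above can be sketched as follows. The class MethodCheck is a hypothetical name, the KNOWN set simply lists the methods from the grammar above, and the returned 200 is only a placeholder meaning "go on and handle the request". The comparison is case-sensitive, as the section requires.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MethodCheck {
    /** Methods named in the Method grammar; treated here as "known to the server". */
    static final Set<String> KNOWN = new HashSet<>(Arrays.asList(
        "OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"));

    /** Picks a status code for the method, given the methods allowed on the resource. */
    static int statusFor(String method, Set<String> allowedForResource) {
        if (!KNOWN.contains(method)) {
            return 501; // Not Implemented: method is unrecognized
        }
        if (!allowedForResource.contains(method)) {
            return 405; // Method Not Allowed: known, but not for this resource
        }
        return 200; // placeholder: continue with normal handling
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("GET", "HEAD"));
        System.out.println(statusFor("GET", allowed));
        System.out.println(statusFor("POST", allowed));
        System.out.println(statusFor("get", allowed));
    }
}
```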

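2. WebServer Implementation, Version 1: Basic Handling of Request and Response Messages

3. WebServer Implementation, Version 2: Completing the WebServer with Registration and Login

4. WebServer Implementation, Version 3: Refactoring WebServer Business Logic with JDBC, Part 1

5. WebServer Implementation, Version 4: Refactoring WebServer Business Logic with JDBC, Part 2

6. WebServer Implementation, Version 5: Restructuring the Page Directories, with the Complete Code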

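WebServer Directory Structure

This completes a simple WebServer that can handle GET requests. We create a test directory under webapp, move all the pages used so far into it, and update the related paths in the code. We then create another directory, 999, which simulates the test directory layout for a later project, a private cloud library, to be built on top of this WebServer. The project directory structure is as follows:

WebServer Code Implementation
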
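Afterword

About Exceptions

Java's Throwable has two subclasses: Error and Exception.
Error represents serious problems that the program itself is not expected to handle.
Exception in turn falls into two groups: checked exceptions (non-runtime exceptions, such as IOException) and RuntimeException (runtime exceptions).
RuntimeException is the kind you should try to avoid through careful program design.
Apart from RuntimeException and its subclasses, every other Exception class and its subclasses are checked exceptions. The compiler forces you to deal with them: either wrap the call in try-catch or declare the exception with a throws clause on the method signature.
Unchecked exceptions (those the compiler does not force you to handle) comprise runtime exceptions (RuntimeException and its subclasses) and errors (Error).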
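The difference shows up directly in code. In the sketch below, the class ExceptionKinds and its methods are made up for illustration: readConfig must declare its IOException, while parsePort needs no declaration even though Integer.parseInt can throw NumberFormatException.

```java
import java.io.IOException;

class ExceptionKinds {
    /** Checked: without "throws IOException" (or a try-catch) this would not compile. */
    static void readConfig(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated I/O failure");
        }
    }

    /** Unchecked: NumberFormatException is a RuntimeException, no declaration needed. */
    static int parsePort(String text) {
        return Integer.parseInt(text);
    }

    static String demo() {
        StringBuilder log = new StringBuilder();
        try {
            readConfig(true); // the compiler insists this call is handled
        } catch (IOException e) {
            log.append("checked:").append(e.getMessage());
        }
        try {
            parsePort("abc"); // catching is optional; done here to show the type
        } catch (NumberFormatException e) {
            log.append(";unchecked");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Removing the first try-catch makes the class fail to compile; removing the second one still compiles, and the exception would simply propagate at run time.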
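The following content is reposted:
Runtime and non-runtime exceptions
Exception is divided into two broad categories: runtime exceptions and non-runtime exceptions (compile-time exceptions). A program should handle these exceptions wherever possible.

Runtime exceptions: RuntimeException and its subclasses, such as NullPointerException and IndexOutOfBoundsException. They are unchecked: a program may catch them or leave them unhandled. They are generally caused by logic errors, so the program should avoid them as far as possible by getting its logic right.
The defining trait of a runtime exception is that the Java compiler does not check for it. Even if such an exception may occur and the code neither catches it with try-catch nor declares it with a throws clause, the code still compiles.

Non-runtime exceptions (compile-time exceptions): exceptions that the language requires you to handle; if you do not, the program will not compile. Examples are IOException, SQLException, and user-defined subclasses of Exception, although custom checked exceptions are usually not defined.
In short: RuntimeException and its subclasses, and Error, do not have to be caught or declared.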
————————————————
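Copyright notice: this is an original article by CSDN blogger 「qq_菜鸟向上爬」, licensed under CC 4.0 BY-SA. When reposting, include a link to the original source and this notice.
Original link: https://blog.csdn.net/qq_37734194/article/details/79966478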