[Solved] 400 error from front end to back end (The server cannot or will not process the request due to...)

A 400 error generally means the request is invalid. This exception usually arises in one of three situations:

Case 1:

Content submitted from the front end is normally received as a String on the back end; receiving it directly as a Date type will cause an error.
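A minimal sketch of this case (the handler names and the date format are my own assumptions, not from the original post): binding the submitted text directly onto a java.util.Date parameter is what typically produces the 400, while receiving it as a String and parsing it yourself works.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class BirthdayController {

    // Risky: with no date converter registered, Spring cannot turn the raw
    // string "2020-01-01" into a Date, binding fails, and the client gets 400.
    @RequestMapping("/saveByDate")
    @ResponseBody
    public String saveByDate(@RequestParam("birthday") Date birthday) {
        return "ok";
    }

    // Safer: accept a String and convert it explicitly on the back end.
    @RequestMapping("/saveByString")
    @ResponseBody
    public String saveByString(@RequestParam("birthday") String birthday) throws Exception {
        Date parsed = new SimpleDateFormat("yyyy-MM-dd").parse(birthday);
        return "ok: " + parsed;
    }
}
```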

Case 2:

When a form is submitted, a mismatch between the data entered and the types expected by the Controller layer causes a 400 error. Check the code to see whether the request parameters are wrong, that is, whether the data sent by the form cannot be bound to the POJO.
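For illustration only (the controller, POJO, and field names here are hypothetical): if a form field cannot be converted to the type of the matching POJO property, binding fails and the server answers 400.

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class ItemController {

    // Form fields are bound to Item by property name. If the JSP submits
    // price=abc, "abc" cannot be converted to Integer, binding to the POJO
    // fails, and the request is rejected with 400.
    @RequestMapping("/updateItem")
    @ResponseBody
    public String updateItem(Item item) {
        return "updated: " + item.getName();
    }
}

// Hypothetical POJO: the form's input names must match these property names,
// and the submitted values must be convertible to these types.
class Item {
    private String name;
    private Integer price;   // the form must send a numeric string such as "100"

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Integer getPrice() { return price; }
    public void setPrice(Integer price) { this.price = price; }
}
```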

Case 3:

The controller method uses the @RequestParam annotation, but the JSP does not submit a parameter whose name matches the annotation's name attribute. Because the required attribute of @RequestParam defaults to true, the parameter name in the JSP must match the name attribute of the @RequestParam annotation.
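A short sketch of this rule (method and parameter names are illustrative): since required defaults to true, a request that does not carry a parameter with the exact name given in @RequestParam is rejected with 400, unless required is set to false or a defaultValue is provided.

```java
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class UserController {

    // The request must carry a parameter literally named "userId"
    // (e.g. <input name="userId"/> in the JSP); otherwise Spring reports a
    // missing parameter and the response is 400.
    @RequestMapping("/findUser")
    @ResponseBody
    public String findUser(@RequestParam("userId") Integer userId) {
        return "user " + userId;
    }

    // Relaxed variant: the parameter may be omitted without causing a 400.
    @RequestMapping("/findUserOptional")
    @ResponseBody
    public String findUserOptional(
            @RequestParam(value = "userId", required = false, defaultValue = "0") Integer userId) {
        return "user " + userId;
    }
}
```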

The error demonstration is as follows (screenshots from the original post omitted).

Solutions

For case 1, the error comes from converting the submitted value into the entity class's date field (Date). Doing the conversion inside the setter should solve it. You can also write a custom type converter dedicated to the date format; for the concrete steps, see my article SpringMVC参数绑定学习总结【前后端数据参数传递】 (a summary of SpringMVC parameter binding).
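As a rough sketch of the custom-converter option (the class name and the date pattern are my assumptions; the full walkthrough is in the article linked above): implement Spring's Converter interface and register it with the MVC conversion service, for example through a FormattingConversionServiceFactoryBean referenced by <mvc:annotation-driven conversion-service="..."/> in springmvc.xml.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.springframework.core.convert.converter.Converter;

// Hypothetical converter: turns "yyyy-MM-dd HH:mm:ss" request strings into Date
// so that form values bind cleanly to Date fields of the entity class.
public class StringToDateConverter implements Converter<String, Date> {

    @Override
    public Date convert(String source) {
        try {
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse(source.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + source, e);
        }
    }
}
```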

For case 2, check whether the data types entered in the JSP match the types expected by the Controller layer.

If this article helped you, that would be great. Give it a like, please QAQ

Of course, my summary may well be incomplete. If any experts have insights of their own, please point them out. Much obliged~

Contents
Overview
Lesson 1: Index Concepts
Lesson 2: Concepts – Statistics
Lesson 3: Concepts – Query Optimization
Lesson 4: Information Collection and Analysis
Lesson 5: Formulating and Implementing Resolution

Module 6: Troubleshooting Query Performance

Overview
At the end of this module, you will be able to:
• Describe the different types of indexes and how indexes can be used to improve performance.
• Describe what statistics are used for and how they can help in optimizing query performance.
• Describe how queries are optimized.
• Analyze the information collected from various tools.
• Formulate a resolution to query performance problems.

Lesson 1: Index Concepts
Indexes are the most useful tool for improving query performance. Without a useful index, Microsoft® SQL Server™ must search every row on every page in the table to find the rows to return. With a multitable query, SQL Server must sometimes search a table multiple times, so each page may be scanned more than once. Having useful indexes speeds up finding individual rows in a table, as well as finding the matching rows needed to join two tables.

What You Will Learn
After completing this lesson, you will be able to:
• Understand the structure of SQL Server indexes.
• Describe how SQL Server uses indexes to find rows.
• Describe how fillfactor can impact the performance of data retrieval and insertion.
• Describe the different types of fragmentation that can occur within an index.

Recommended Reading
• Chapter 8: “Indexes”, Inside SQL Server 2000 by Kalen Delaney
• Chapter 11: “Batches, Stored Procedures and Functions”, Inside SQL Server 2000 by Kalen Delaney

Finding Rows without Indexes
With No Indexes, A Table Must Be Scanned
SQL Server keeps track of which pages belong to a table or index by using IAM pages. If there is no clustered index, there is a sysindexes row for the table with an indid value of 0, and that row will keep track of the address of the first IAM for the table. The IAM is a giant bitmap, and every 1 bit indicates that the corresponding extent belongs to the table. The IAM allows SQL Server to do efficient prefetching of the table’s extents, but every row still must be examined.

General Index Structure
All SQL Server Indexes Are Organized As B-Trees
Indexes in SQL Server store their information using standard B-trees. A B-tree provides fast access to data by searching on a key value of the index. B-trees cluster records with similar keys. The B stands for balanced, and balancing the tree is a core feature of a B-tree’s usefulness. The trees are managed, and branches are grafted as necessary, so that navigating down the tree to find a value and locate a specific record takes only a few page accesses. Because the trees are balanced, finding any record requires about the same amount of resources, and retrieval speed is consistent because the index has the same depth throughout.

Clustered and Nonclustered Indexes
Both Index Types Have Many Common Features
An index consists of a tree with a root from which the navigation begins, possible intermediate index levels, and bottom-level leaf pages. You use the index to find the correct leaf page. The number of levels in an index will vary depending on the number of rows in the table and the size of the key column or columns for the index. If you create an index using a large key, fewer entries will fit on a page, so more pages (and possibly more levels) will be needed for the index.
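The contrast between scanning every page and navigating an index is easy to see in a query plan. Below is a minimal sketch, assuming the credit sample database and its member table that later examples in this lesson use; the index name is hypothetical.

USE credit
GO
-- With no useful index on lastname, the plan shows a table (or clustered index) scan.
SET SHOWPLAN_TEXT ON
GO
SELECT lastname, firstname FROM member WHERE lastname = 'Ota'
GO
SET SHOWPLAN_TEXT OFF
GO
-- Build an index on the search column, then look at the plan again: for a selective
-- value, a seek (plus bookmark lookups) should replace the scan.
CREATE NONCLUSTERED INDEX nci_member_lastname ON member(lastname)
GO
SET SHOWPLAN_TEXT ON
GO
SELECT lastname, firstname FROM member WHERE lastname = 'Ota'
GO
SET SHOWPLAN_TEXT OFF
GO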
On a qualified select, update, or delete, the correct leaf page will be the lowest page of the tree in which one or more rows with the specified key or keys reside. A qualified operation is one that affects only specific rows that satisfy the conditions of a WHERE clause, as opposed to accessing the whole table.

An index can have multiple node levels
An index page above the leaf is called a node page. Each index row in node pages contains an index key (or set of keys for a composite index) and a pointer to a page at the next level for which the first key value is the same as the key value in the current index row.

Leaf Level contains all key values
In any index, whether clustered or nonclustered, the leaf level contains every key value, in key sequence. In SQL Server 2000, the sequence can be either ascending or descending.

The sysindexes table contains all sizing, location and distribution information
Any information about the size of indexes or tables is stored in sysindexes. The only source of any storage location information is the sysindexes table, which keeps track of the address of the root page for every index, and the first IAM page for the index or table. There is also a column for the first page of the table, but this is not guaranteed to be reliable. SQL Server can find all pages belonging to an index or table by examining the IAM pages. Sysindexes contains a pointer to the first IAM page, and each IAM page contains a pointer to the next one. (A short query sketch at the end of this topic shows how to look at these columns.)

The Difference between Clustered and Nonclustered Indexes
The main difference between the two types of indexes is how much information is stored at the leaf. The leaf levels of both types of indexes contain all the key values in order, but they also contain other information.

Clustered Indexes
The Leaf Level of a Clustered Index Is the Data
The leaf level of a clustered index contains the data pages, not just the index keys. Another way to say this is that the data itself is part of the clustered index. A clustered index keeps the data in a table ordered around the key. The data pages in the table are kept in a doubly linked list called the page chain. The order of pages in the page chain, and the order of rows on the data pages, is the order of the index key or keys. Deciding which key to cluster on is an important performance consideration. When the index is traversed to the leaf level, the data itself has been retrieved, not simply pointed to.

Uniqueness Is Maintained In Key Values
In SQL Server 2000, all clustered indexes are unique. If you build a clustered index without specifying the unique keyword, SQL Server forces uniqueness by adding a uniqueifier to the rows when necessary. This uniqueifier is a 4-byte value added as an additional sort key to only the rows that have duplicates of their primary sort key. You can see this extra value if you use DBCC PAGE to look at the actual index rows, as described in the section on index internals.

Finding Rows in a Clustered Index
The Leaf Level of a Clustered Index Contains the Data
A clustered index is like a telephone directory in which all of the rows for customers with the same last name are clustered together in the same part of the book. Just as the organization of a telephone directory makes it easy for a person to search, SQL Server quickly searches a table with a clustered index. Because a clustered index determines the sequence in which rows are stored in a table, there can only be one clustered index for a table at a time.
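The sizing and location columns just mentioned can be inspected directly. A minimal sketch, again assuming the member table (any user table works):

-- indid 0 = heap, 1 = clustered index, >1 = nonclustered indexes.
-- root and FirstIAM are page addresses; dpages and rowcnt give size information.
SELECT name, indid, root, FirstIAM, dpages, rowcnt
FROM sysindexes
WHERE id = OBJECT_ID('member')
ORDER BY indid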
Performance Considerations Keeping your clustered key value small increases the number of index rows that can be placed on an index page and decreases the number of levels that must be traversed. This minimizes I/O. As we’ll see, the clustered key is duplicated in every nonclustered index row, so keeping your clustered key small will allow you to have more index fit per page in all your indexes. Note The query corresponding to the slide is: SELECT lastname, firstname FROM member WHERE lastname = ‘Ota’ Nonclustered Indexes The Leaf Level of a Nonclustered Index Contains a Bookmark A nonclustered index is like the index of a textbook. The data is stored in one place and the index is stored in another. Pointers indicate the storage location of the indexed items in the underlying table. In a nonclustered index, the leaf level contains each index key, plus a bookmark that tells SQL Server where to find the data row corresponding to the key in the index. A bookmark can take one of two forms:  If the table has a clustered index, the bookmark is the clustered index key for the corresponding data row. This clustered key can be multiple column if the clustered index is composite, or is defined to be non-unique.  If the table is a heap (in other words, it has no clustered index), the bookmark is a RID, which is an actual row locator in the form File#:Page#:Slot#. Finding Rows with a NC Index on a Heap Nonclustered Indexes Are Very Efficient When Searching For A Single Row After the nonclustered key at the leaf level of the index is found, only one more page access is needed to find the data row. Searching for a single row using a nonclustered index is almost as efficient as searching for a single row in a clustered index. However, if we are searching for multiple rows, such as duplicate values, or keys in a range, anything more than a small number of rows will make the nonclustered index search very inefficient. Note The query corresponding to the slide is: SELECT lastname, firstname FROM member WHERE lastname BETWEEN ‘Master’ AND ‘Rudd’ Finding Rows with a NC Index on a Clustered Table A Clustered Key Is Used as the Bookmark for All Nonclustered Indexes If the table has a clustered index, all columns of the clustered key will be duplicated in the nonclustered index leaf rows, unless there is overlap between the clustered and nonclustered key. For example, if the clustered index is on (lastname, firstname) and a nonclustered index is on firstname, the firstname value will not be duplicated in the nonclustered index leaf rows. Note The query corresponding to the slide is: SELECT lastname, firstname, phone FROM member WHERE firstname = ‘Mike’ Covering Indexes A Covering Index Provides the Fastest Data Access A covering index contains ALL the fields accessed in the query. Normally, only the columns in the WHERE clause are helpful in determining useful indexes, but for a covering index, all columns must be included. If all columns needed for the query are in the index, SQL Server never needs to access the data pages. If even one column in the query is not part of the index, the data rows must be accessed. The leaf level of an index is the only level that contains every key value, or set of key values. For a clustered index, the leaf level is the data itself, so in reality, a clustered index ALWAYS covers any query. Nevertheless, for most of our optimization discussions, we only consider nonclustered indexes. 
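To tie the bookmark and covering ideas together before the examples that follow, here is a minimal sketch against the member table; the index names are hypothetical and assume no conflicting indexes exist yet.

CREATE CLUSTERED INDEX ci_member_lastname ON member(lastname)
CREATE NONCLUSTERED INDEX nci_member_firstname ON member(firstname)

-- Covered: firstname is the index key and lastname rides along as the clustered-key
-- bookmark, so the nonclustered leaf rows contain everything the query needs.
SELECT lastname, firstname FROM member WHERE firstname = 'Mike'

-- Not covered: phone is in neither the key nor the bookmark, so each qualifying
-- index row requires a lookup of the corresponding data row.
SELECT lastname, firstname, phone FROM member WHERE firstname = 'Mike'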
Scanning the leaf level of a nonclustered index is almost always faster than scanning a clustered index, so covering indexes are particularly valuable when we need ALL the key values of a particular nonclustered index.

Example: selecting an aggregate value of a column that has a nonclustered index. Suppose we have a nonclustered index on price; this query is covered:
SELECT avg(price) from titles
Since the clustered key is included in every nonclustered index row, the clustered key can take part in the covering. Suppose you have a nonclustered index on price and a clustered index on title_id; then this query is covered:
SELECT title_id, price FROM titles WHERE price between 10 and 20

Performance Considerations
In general, you do want to keep your indexes narrow. However, if you have a critical query that just is not giving you satisfactory performance no matter what you do, you should consider creating an index to cover it, or adding one or two extra columns to an existing index, so that the query will be covered. The leaf level of a nonclustered index is like a ‘mini’ clustered index, so you can have most of the benefits of clustering even if there already is another clustered index on the table. The tradeoffs of adding more, wider indexes for covering queries are the added disk space and the extra overhead for updating those columns that are now part of the index.

Bug
In general, SQL Server will detect when a query is covered, and detect the possible covering indexes. However, in some cases, you must force SQL Server to use a covering index by including a WHERE clause, even if the WHERE clause will return ALL the rows in the table. This is SHILOH bug #352079.
Steps to reproduce
1. Make a copy of the Orders table from Northwind:
USE Northwind
CREATE TABLE [NewOrders] (
    [OrderID] [int] NOT NULL ,
    [CustomerID] [nchar] (5) NULL ,
    [EmployeeID] [int] NULL ,
    [OrderDate] [datetime] NULL ,
    [RequiredDate] [datetime] NULL ,
    [ShippedDate] [datetime] NULL ,
    [ShipVia] [int] NULL ,
    [Freight] [money] NULL ,
    [ShipName] [nvarchar] (40) NULL ,
    [ShipAddress] [nvarchar] (60) NULL ,
    [ShipCity] [nvarchar] (15) NULL ,
    [ShipRegion] [nvarchar] (15) NULL ,
    [ShipPostalCode] [nvarchar] (10) NULL ,
    [ShipCountry] [nvarchar] (15) NULL
)
INSERT into NewOrders SELECT * FROM Orders
2. Build a nonclustered index on OrderDate:
create index dateindex on neworders(orderdate)
3. Test the query by looking at its query plan:
select orderdate from NewOrders
The index is being scanned, as expected.
4. Build an index on OrderID:
create index orderid_index on neworders(orderID)
5. Test the query again by looking at its query plan:
select orderdate from NewOrders
Now the TABLE is being scanned, instead of the original index!

Index Intersection
Multiple Indexes Can Be Used On A Single Table
In versions prior to SQL Server 7.0, only one index could be used for any table to process any single query. The only exception was a query involving an OR. In current SQL Server versions, multiple nonclustered indexes can each be accessed, retrieving a set of keys with bookmarks, and then the result sets can be joined on the common bookmarks. The optimizer weighs the cost of performing the unindexed join on the intermediate result sets against the cost of only using one index and then scanning the entire result set from that single index.

Fillfactor and Performance
Creating an Index with a Low Fillfactor Delays Page Splits when Inserting
DBCC SHOWCONTIG will show you a low value for “Avg. Page Density” when a low fillfactor has been specified.
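A minimal sketch of that point, using a hypothetical index on the member table and the object_id/index id form of DBCC SHOWCONTIG:

-- Build an index that leaves roughly 30 percent of each leaf page empty.
CREATE NONCLUSTERED INDEX nci_member_city ON member(city) WITH FILLFACTOR = 70

-- Check the result; Avg. Page Density should come back well below 100 percent.
DECLARE @id int, @indid int
SELECT @id = OBJECT_ID('member')
SELECT @indid = indid FROM sysindexes WHERE id = @id AND name = 'nci_member_city'
DBCC SHOWCONTIG (@id, @indid)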
This is good for inserts and updates, because it will delay the need to split pages to make room for new rows. It can be bad for scans, because fewer rows will be on each page, and more pages must be read to access the same amount of data. However, this cost will be minimal if the scan density value is good. Index Reorganization DBCC SHOWCONTIG Provides Lots of Information Here’s some sample output from running a basic DBCC SHOWCONTIG on the order details table in the Northwind database: DBCC SHOWCONTIG scanning 'Order Details' table... Table: 'Order Details' (325576198); index ID: 1, database ID:6 TABLE level scan performed. - Pages Scanned................................: 9 - Extents Scanned..............................: 6 - Extent Switches..............................: 5 - Avg. Pages per Extent........................: 1.5 - Scan Density [Best Count:Actual Count].......: 33.33% [2:6] - Logical Scan Fragmentation ..................: 0.00% - Extent Scan Fragmentation ...................: 16.67% - Avg. Bytes Free per Page.....................: 673.2 - Avg. Page Density (full).....................: 91.68% By default, DBCC SHOWCONTIG scans the page chain at the leaf level of the specified index and keeps track of the following values:  Average number of bytes free on each page (Avg. Bytes Free per Page)  Number of pages accessed (Pages scanned)  Number of extents accessed (Extents scanned)  Number of times a page had a lower page number than the previous page in the scan (This value for Out of order pages is not displayed, but is used for additional computations.)  Number of times a page in the scan was on a different extent than the previous page in the scan (Extent switches) SQL Server also keeps track of all the extents that have been accessed, and then it determines how many gaps are in the used extents. An extent is identified by the page number of its first page. So, if extents 8, 16, 24, 32, and 40 make up an index, there are no gaps. If the extents are 8, 16, 24, and 40, there is one gap. The value in DBCC SHOWCONTIG’s output called Extent Scan Fragmentation is computed by dividing the number of gaps by the number of extents, so in this example the Extent Scan Fragmentation is ¼, or 25 percent. A table using extents 8, 24, 40, and 56 has three gaps, and its Extent Scan Fragmentation is ¾, or 75 percent. The maximum number of gaps is the number of extents - 1, so Extent Scan Fragmentation can never be 100 percent. The value in DBCC SHOWCONTIG’s output called Logical Scan Fragmentation is computed by dividing the number of Out of order pages by the number of pages in the table. This value is meaningless in a heap. You can use either the Extent Scan Fragmentation value or the Logical Scan Fragmentation value to determine the general level of fragmentation in a table. The lower the value, the less fragmentation there is. Alternatively, you can use the value called Scan Density, which is computed by dividing the optimum number of extent switches by the actual number of extent switches. A high value means that there is little fragmentation. Scan Density is not valid if the table spans multiple files; therefore, it is less useful than the other values. SQL Server 2000 allows online defragmentation You can choose from several methods for removing fragmentation from an index. You could rebuild the index and have SQL Server allocate all new contiguous pages for you. 
To rebuild the index, you can use a simple DROP INDEX and CREATE INDEX combination, but in many cases using these commands is less than optimal. In particular, if the index is supporting a constraint, you cannot use the DROP INDEX command. Alternatively, you can use DBCC DBREINDEX, which can rebuild all the indexes on a table in one operation, or you can use the drop_existing clause along with CREATE INDEX. The drawback of these methods is that the table is unavailable while SQL Server is rebuilding the index. When you are rebuilding only nonclustered indexes, SQL Server takes a shared lock on the table, which means that users cannot make modifications, but other processes can SELECT from the table. Of course, those SELECT queries cannot take advantage of the index you are rebuilding, so they might not perform as well as they would otherwise. If you are rebuilding a clustered index, SQL Server takes an exclusive lock and does not allow access to the table, so your data is temporarily unavailable. SQL Server 2000 lets you defragment an index without completely rebuilding it. DBCC INDEXDEFRAG reorders the leaf-level pages into physical order as well as logical order, but using only the pages that are already allocated to the leaf level. This command does an in-place ordering, which is similar to a sorting technique called bubble sort (you might be familiar with this technique if you've studied and compared various sorting algorithms). In-place ordering can reduce logical fragmentation to 2 percent or less, making an ordered scan through the leaf level much faster. DBCC INDEXDEFRAG also compacts the pages of an index, based on the original fillfactor. The pages will not always end up with the original fillfactor, but SQL Server uses that value as a goal. The defragmentation process attempts to leave at least enough space for one average-size row on each page. In addition, if SQL Server cannot obtain a lock on a page during the compaction phase of DBCC INDEXDEFRAG, it skips the page and does not return to it. Any empty pages created as a result of compaction are removed. The algorithm SQL Server 2000 uses for DBCC INDEXDEFRAG finds the next physical page in a file belonging to the index's leaf level and the next logical page in the leaf level to swap it with. To find the next physical page, the algorithm scans the IAM pages belonging to that index. In a database spanning multiple files, in which a table or index has pages on more than one file, SQL Server handles pages on different files separately. SQL Server finds the next logical page by scanning the index's leaf level. After each page move, SQL Server drops all locks and saves the last key on the last page it moved. The next iteration of the algorithm uses the last key to find the next logical page. This process lets other users update the table and index while DBCC INDEXDEFRAG is running. Let us look at an example in which an index's leaf level consists of the following pages in the following logical order: 47 22 83 32 12 90 64 The first key is on page 47, and the last key is on page 64. SQL Server would have to scan the pages in this order to retrieve the data in sorted order. As its first step, DBCC INDEXDEFRAG would find the first physical page, 12, and the first logical page, 47. It would then swap the pages, using a temporary buffer as a holding area. After the first swap, the leaf level would look like this: 12 22 83 32 47 90 64 The next physical page is 22, which is also the next logical page, so no work would be necessary. 
DBCC INDEXDEFRAG would then swap the next physical page, 32, with the next logical page, 83:
12 22 32 83 47 90 64
After the next swap of 47 with 83, the leaf level would look like this:
12 22 32 47 83 90 64
Then, the defragmentation process would swap 64 with 83:
12 22 32 47 64 90 83
and 83 with 90:
12 22 32 47 64 83 90
At the end of the DBCC INDEXDEFRAG operation, the pages in the table or index are not contiguous, but their logical order matches their physical order. Now, if the pages were accessed from disk in sorted order, the head would need to move in only one direction. Keep in mind that DBCC INDEXDEFRAG uses only pages that are already part of the index's leaf level; it allocates no new pages. In addition, defragmenting a large table can take quite a while, and you will get a report every 5 minutes about the estimated percentage completed. However, except for the locks on the pages being switched, this command needs no additional locks. All the table's other pages and indexes are fully available for your applications to use during the defragmentation process.

If you must completely rebuild an index because you want a new fillfactor, or if simple defragmentation is not enough because you want to remove all fragmentation from your indexes, another SQL Server 2000 improvement makes index rebuilding less of an imposition on the rest of the system. SQL Server 2000 lets you create an index in parallel—that is, using multiple processors—which drastically reduces the time necessary to perform the rebuild. The algorithm SQL Server 2000 uses allows near-linear scaling with the number of processors you use for the rebuild, so four processors will take only one-fourth the time that one processor requires to rebuild an index. System availability increases because the length of time that a table is unavailable decreases. Note that only the SQL Server 2000 Enterprise Edition supports parallel index creation.

Indexes on Views and Computed Columns
Building an Index Gives the Data Physical Existence
Normally, views are only logical and the rows comprising the view’s data are not generated until the view is accessed. The values for computed columns are typically not stored anywhere in the database; only the definition for the computation is stored, and the computation is redone every time a computed column is accessed. The first index on a view must be a clustered index, so that the leaf level can hold all the actual rows that make up the view. Once that clustered index has been built, and the view’s data is now physical, additional (nonclustered) indexes can be built. An index on a computed column can be nonclustered, because all we need to store is the index key values.

Common Prerequisites for Indexed Views and Indexes on Computed Columns
In order for SQL Server to create or use these special indexes, you must have the seven SET options correctly specified:
ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER, ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS must all be ON
NUMERIC_ROUNDABORT must be OFF
Only deterministic expressions can be used in the definition of Indexed Views or indexes on Computed Columns. See the BOL for the list of deterministic functions and expressions. Property functions are available to check if a column or view meets the requirements and is indexable.
SELECT OBJECTPROPERTY (Object_id, ‘IsIndexable’) SELECT COLUMNPROPERTY (Object_id, column_name , ‘IsIndexable’ ) Schema Binding Guarantees That Object Definition Won’t Change A view can only be indexed if it has been built with schema binding. The SQL Server Optimizer Determines If the Indexed View Can Be Used The query must request a subset of the data contained in the view. The ability of the optimizer to use the indexed view even if the view is not directly referenced is available only in SQL Server 2000 Enterprise Edition. In Standard edition, you can create indexed views, and you can select directly from them, but the optimizer will not choose to use them if they are not directly referenced. Examples of Indexed Views: The best candidates for improvement by indexed views are queries performing aggregations and joins. We will explain how the useful indexed views may be created for these two major groups of queries. The considerations are valid also for queries and indexed views using both joins and aggregations. -- Example: USE Northwind -- Identify 5 products with overall biggest discount total. -- This may be expressed for example by two different queries: -- Q1. select TOP 5 ProductID, SUM(UnitPrice*Quantity)- SUM(UnitPrice*Quantity*(1.00-Discount)) Rebate from [order details] group by ProductID order by Rebate desc --Q2. select TOP 5 ProductID, SUM(UnitPrice*Quantity*Discount) Rebate from [order details] group by ProductID order by Rebate desc --The following indexed view will be used to execute Q1. create view Vdiscount1 with schemabinding as select SUM(UnitPrice*Quantity) SumPrice, SUM(UnitPrice*Quantity*(1.00-Discount)) SumDiscountPrice, COUNT_BIG(*) Count, ProductID from dbo.[order details] group By ProductID create unique clustered index VDiscountInd on Vdiscount1 (ProductID) However, it will not be used by the Q2 because the indexed view does not contain the SUM(UnitPrice*Quantity*Discount) aggregate. We can construct another indexed view create view Vdiscount2 with schemabinding as select SUM(UnitPrice*Quantity) SumPrice, SUM(UnitPrice*Quantity*(1.00-Discount)) SumDiscountPrice, SUM(UnitPrice*Quantity*Discount) SumDiscoutPrice2, COUNT_BIG(*) Count, ProductID from dbo.[order details] group By ProductID create unique clustered index VDiscountInd on Vdiscount2 (ProductID) This view may be used by both Q1 and Q2. Observe that the indexed view Vdiscount2 will have the same number of rows and only one more column compared to Vdiscount1, and it may be used by more queries. In general, try to design indexed views that may be used by more queries. The following query asking for the order with the largest total discount -- Q3. select TOP 3 OrderID, SUM(UnitPrice*Quantity*Discount) OrderRebate from dbo.[order details] group By OrderID Q3 can use neither of the Vdiscount views because the column OrderID is not included in the view definition. To address this variation of the discount analysis query we may create a different indexed view, similar to the query itself. An attempt to generalize the previous indexed view Vdiscount2 so that all three queries Q1, Q2, and Q3 can take advantage of a single indexed view would require a view with both OrderID and ProductID as grouping columns. Because the OrderID, ProductID combination is unique in the original order details table the resulting view would have as many rows as the original table and we would see no savings in using such view compared to using the original table. Consider the size of the resulting indexed view. 
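One rough way to check that size, assuming the Vdiscount2 view above has been created and indexed, is to compare page counts in sysindexes:

-- Compare the indexed view's clustered index (indid 1) with the base table.
SELECT OBJECT_NAME(id) AS object_name, indid, dpages, reserved
FROM sysindexes
WHERE id IN (OBJECT_ID('Vdiscount2'), OBJECT_ID('Order Details'))
  AND indid IN (0, 1)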
In the case of pure aggregation, the indexed view may provide no significant performance gains if its size is close to the size of the original table. Complex aggregates (STDEV, VARIANCE, AVG) cannot participate in the index view definition. However, SQL Server may use an indexed view to execute a query containing AVG aggregate. Query containing STDEV or VARIANCE cannot use indexed view to pre-compute these values. The next example shows a query producing the average price for a particular product -- Q4. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID group by ProductName, od.ProductID This is an example of indexed view that will be considered by the SQL Server to answer the Q4 create view v3 with schemabinding as select od.ProductID, SUM(od.UnitPrice*(1.00-Discount)) Price, COUNT_BIG(*) Count, SUM(od.Quantity) Units from dbo.[order details] od group by od.ProductID go create UNIQUE CLUSTERED index iv3 on v3 (ProductID) go Observe that the view definition does not contain the table Products. The indexed view does not need to contain all tables used in the query that uses the indexed view. In addition, the following query (same as above Q4 only with one additional search condition) will use the same indexed view. Observe that the added predicate references only columns from tables not present in the v3 view definition. -- Q5. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and p.ProductName like '%tofu%' group by ProductName, od.ProductID The following query cannot use the indexed view because the added search condition od.UnitPrice>10 contains a column from the table in the view definition and the column is neither grouping column nor the predicate appears in the view definition. -- Q6. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and od.UnitPrice>10 group by ProductName, od.ProductID To contrast the Q6 case, the following query will use the indexed view v3 since the added predicate is on the grouping column of the view v3. -- Q7. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from [order details] od, Products p where od.ProductID=p.ProductID and od.ProductID in (1,2,13,41) group by ProductName, od.ProductID -- The previous query Q6 will use the following indexed view V4: create view V4 with schemabinding as select ProductName, od.ProductID, SUM(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units, COUNT_BIG(*) Count from dbo.[order details] od, dbo.Products p where od.ProductID=p.ProductID and od.UnitPrice>10 group by ProductName, od.ProductID create unique clustered index VDiscountInd on V4 (ProductName, ProductID) The same index on the view V4 will be used also for a query where a join to the table Orders is added, for example -- Q8. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>10 group by ProductName, od.ProductID We will show several modifications of the query Q8 and explain why such modifications cannot use the above view V4. -- Q8a. 
select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>25 group by ProductName, od.ProductID 8a cannot use the indexed view because of the where clause mismatch. Observe that table Orders does not participate in the indexed view V4 definition. In spite of that, adding a predicate on this table will disallow using the indexed view because the added predicate may eliminate additional rows participating in the aggregates as it is shown in Q8b. -- Q8b. select ProductName, od.ProductID, AVG(od.UnitPrice*(1.00-Discount)) AvgPrice, SUM(od.Quantity) Units from dbo.[order details] od, dbo.Products p, dbo.Orders o where od.ProductID=p.ProductID and o.OrderID=od.OrderID and od.UnitPrice>10 and o.OrderDate>'01/01/1998' group by ProductName, od.ProductID Locking and Indexes In General, You Should Let SQL Server Control the Locking within Indexes The stored procedure sp_indexoption lets you manually control the unit of locking within an index. It also lets you disallow page locks or row locks within an index. Since these options are available only for indexes, there is no way to control the locking within the data pages of a heap. (But remember that if a table has a clustered index, the data pages are part of the index and are affected by the sp_indexoption setting.) The index options are set for each table or index individually. Two options, Allow Rowlocks and AllowPageLocks, are both set to TRUE initially for every table and index. If both of these options are set to FALSE for a table, only full table locks are allowed. As described in Module 4, SQL Server determines at runtime whether to initially lock rows, pages, or the entire table. The locking of rows (or keys) is heavily favored. The type of locking chosen is based on the number of rows and pages to be scanned, the number of rows on a page, the isolation level in effect, the update activity going on, the number of users on the system needing memory for their own purposes, and so on. SAP databases frequently use sp_indexoption to reduce deadlocks Setting vs. Querying In SQL Server 2000, the procedure sp_indexoption should only be used for setting an index option. To query an option, use the INDEXPROPERTY function. Lesson 2: Concepts – Statistics Statistics are the most important tool that the SQL Server query optimizer has to determine the ideal execution plan for a query. Statistics that are out of date or nonexistent seriously jeopardize query performance. SQL Server 2000 computes and stores statistics in a completely different format that all earlier versions of SQL Server. One of the improvements is an increased ability to determine which values are out of the normal range in terms of the number of occurrences. The new statistics maintenance routines are particularly good at determining when a key value has a very unusual skew of data. What You Will Learn After completing this lesson, you will be able to:  Define terms related to statistics collected by SQL Server.  Describe how statistics are maintained by SQL Server.  Discuss the autostats feature of SQL Server.  Describe how statistics are used in query optimization. Recommended Reading  Statistics Used by the Query Optimizer in Microsoft SQL Server 2000 http://msdn.microsoft.com/library/techart/statquery.htm Definitions Cardinality The cardinality means how many unique values exist in the data. 
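To make the terms concrete before the density discussion that follows, here is a minimal sketch against the member table; nci_member_lastname is the hypothetical index from an earlier sketch, and any existing index or column statistic would do.

-- Cardinality: how many distinct values the column holds.
SELECT COUNT(DISTINCT lastname) AS lastname_cardinality, COUNT(*) AS total_rows
FROM member

-- The "All density" values reported here are roughly 1 / (number of distinct values)
-- for each left-based prefix of the index key.
DBCC SHOW_STATISTICS ('member', 'nci_member_lastname')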
Density For each index and set of column statistics, SQL Server keeps track of details about the uniqueness (or density) of the data values encountered, which provides a measure of how selective the index is. A unique index, of course, has the lowest density —by definition, each index entry can point to only one row. A unique index has a density value of 1/number of rows in the table. Density values range from 0 through 1. Highly selective indexes have density values of 0.10 or lower. For example, a unique index on a table with 8345 rows has a density of 0.00012 (1/8345). If a nonunique nonclustered index has a density of 0.2165 on the same table, each index key can be expected to point to about 1807 rows (0.2165 × 8345). This is probably not selective enough to be more efficient than just scanning the table, so this index is probably not useful. Because driving the query from a nonclustered index means that the pages must be retrieved in index order, an estimated 1807 data page accesses (or logical reads) are needed if there is no clustered index on the table and the leaf level of the index contains the actual RID of the desired data row. The only time a data page doesn’t need to be reaccessed is when the occasional coincidence occurs in which two adjacent index entries happen to point to the same data page. In general, you can think of density as the average number of duplicates. We can also talk about the term ‘join density’, which applies to the average number of duplicates in the foreign key column. This would answer the question: in this one-to-many relationship, how many is ‘many’? Selectivity In general selectivity applies to a particular data value referenced in a WHERE clause. High selectivity means that only a small percentage of the rows satisfy the WHERE clause filter, and a low selectivity means that many rows will satisfy the filter. For example, in an employees table, the column employee_id is probably very selective, and the column gender is probably not very selective at all. Statistics Statistics are a histogram consisting of an even sampling of values for a column or for an index key (or the first column of the key for a composite index) based on the current data. The histogram is stored in the statblob field of the sysindexes table, which is of type image. (Remember that image data is actually stored in structures separate from the data row itself. The data row merely contains a pointer to the image data. For simplicity’s sake, we’ll talk about the index statistics as being stored in the image field called statblob.) To fully estimate the usefulness of an index, the optimizer also needs to know the number of pages in the table or index; this information is stored in the dpages column of sysindexes. During the second phase of query optimization, index selection, the query optimizer determines whether an index exists for a columns in your WHERE clause, assesses the index’s usefulness by determining the selectivity of the clause (that is, how many rows will be returned), and estimates the cost of finding the qualifying rows. Statistics for a single column index consist of one histogram and one density value. The multicolumn statistics for one set of columns in a composite index consist of one histogram for the first column in the index and density values for each prefix combination of columns (including the first column alone). The fact that density information is kept for all columns helps the optimizer decide how useful the index is for joins. 
Suppose, for example, that an index is composed of three key fields. The density on the first column might be 0.50, which is not too useful. However, as you look at more key columns in the index, the number of rows pointed to is fewer than (or in the worst case, the same as) the first column, so the density value goes down. If you are looking at both the first and second columns, the density might be 0.25, which is somewhat better. Moreover, if you examine three columns, the density might be 0.03, which is highly selective. It does not make sense to refer to the density of only the second column; the lead column density is always needed.

Statistics Maintenance
Statistics Information Tracks the Distribution of Key Values
SQL Server statistics are basically a histogram that contains up to 200 values of a given key column. In addition to the histogram, the statblob field contains the following information:
• The time of the last statistics collection
• The number of rows used to produce the histogram and density information
• The average key length
• Densities for other combinations of columns
In the statblob column, up to 200 sample values are stored; the range of key values between each sample value is called a step. The sample value is the endpoint of the range. Three values are stored along with each step: a value called EQ_ROWS, which is the number of rows that have a value equal to that sample value; a value called RANGE_ROWS, which specifies how many other values are inside the range (between two adjacent sample values); and the number of distinct values, or RANGE_DENSITY, of the range.

DBCC SHOW_STATISTICS
The DBCC SHOW_STATISTICS output shows us the first two of these three values, but not the range density. The RANGE_DENSITY is instead used to compute two additional values:
• DISTINCT_RANGE_ROWS—the number of distinct rows inside this range (not counting the RANGE_HI_KEY value itself). This is computed as 1/RANGE_DENSITY.
• AVG_RANGE_ROWS—the average number of rows per distinct value, computed as RANGE_DENSITY * RANGE_ROWS.
In addition to statistics on indexes, SQL Server can also keep track of statistics on columns with no indexes. Knowing the density, or the likelihood of a particular value occurring, can help the optimizer determine an optimum processing strategy, even if SQL Server can’t use an index to actually locate the values.

Statistics on Columns
Column statistics can be useful for two main purposes:
• When the SQL Server optimizer is determining the optimal join order, it frequently is best to have the smaller input processed first. By ‘input’ we mean the table after all filters in the WHERE clause have been applied. Even if there is no useful index on a column in the WHERE clause, statistics could tell us that only a few rows will qualify, and thus that the resulting input will be very small.
• The SQL Server query optimizer can use column statistics on non-initial columns in a composite nonclustered index to determine if scanning the leaf level to obtain the bookmarks will be an efficient processing strategy. For example, in the member table in the credit database, the firstname column is almost unique. Suppose we have a nonclustered index on (lastname, firstname), and we issue this query:
select * from member where firstname = 'MPRO'
In this case, statistics on the firstname column would indicate very few rows satisfying this condition, so the optimizer will choose to scan the nonclustered index, since it is smaller than the clustered index (the table).
The small number of bookmarks will then be followed to retrieve the actual data. Manually Updating Statistics You can also manually force statistics to be updated in one of two ways. You can run the UPDATE STATISTICS command on a table or on one specific index or column statistics, or you can also execute the procedure sp_updatestats, which runs UPDATE STATISTICS against all user-defined tables in the current database. You can create statistics on unindexed columns using the CREATE STATISTICS command or by executing sp_createstats, which creates single-column statistics for all eligible columns for all user tables in the current database. This includes all columns except computed columns and columns of the ntext, text, or image datatypes, and columns that already have statistics or are the first column of an index. Autostats By Default SQL Server Will Update Statistics on Any Index or Column as Needed Every database is created with the database options auto create statistics and auto update statistics set to true, but you can turn either one off. You can also turn off automatic updating of statistics for a specific table in one of two ways:  UPDATE STATISTICS In addition to updating the statistics, the option WITH NORECOMPUTE indicates that the statistics should not be automatically recomputed in the future. Running UPDATE STATISTICS again without the WITH NORECOMPUTE option enables automatic updates.  sp_autostats This procedure sets or unsets a flag for a table to indicate that statistics should or should not be updated automatically. You can also use this procedure with only the table name to find out whether the table is set to automatically have its index statistics updated. ' However, setting the database option auto update statistics to FALSE overrides any individual table settings. In other words, no automatic updating of statistics takes place. This is not a recommended practice unless thorough testing has shown you that you do not need the automatic updates or that the performance overhead is more than you can afford. Trace Flags Trace flag 205 – reports recompile due to autostats. Trace flag 8721 – writes information to the errorlog when AutoStats has been run. For more information, see the following Knowledge Base article: Q195565 “INF: How SQL Server 7.0 Autostats Work.” Statistics and Performance The Performance Penalty of NOT Having Up-To-Date Statistics Far Outweighs the Benefit of Avoiding Automatic Updating Autostats should be turned off only after thorough testing shows it to be necessary. Because autostats only forces a recompile after a certain number or percentage of rows has been changed, you do not have to make any adjustments for a read-only database. Lesson 3: Concepts – Query Optimization What You Will Learn After completing this lesson, you will be able to:  Describe the phases of query optimization.  Discuss how SQL Server estimates the selectivity of indexes and column and how this estimate is used in query optimization. Recommended Reading  Chapter 15: “The Query Processor”, Inside SQL Server 2000 by Kalen Delaney  Chapter 16: “Query Tuning”, Inside SQL Server 2000 by Kalen Delaney  Whitepaper about SQL Server Query Processor Architecture by Hal Berenson and Kalen Delaney http://msdn.microsoft.com/library/backgrnd/html/sqlquerproc.htm Phases of Query Optimization Query Optimization Involves several phases Trivial Plan Optimization Optimization itself goes through several steps. The first step is something called Trivial Plan Optimization. 
The whole idea of trivial plan optimization is that cost based optimization is a bit expensive to run. The optimizer can try a great many possible variations trying to find the cheapest plan. If SQL Server knows that there is only one really viable plan for a query, it could avoid a lot of work. A prime example is a query that consists of an INSERT with a VALUES clause. There is only one possible plan. Another example is a SELECT where all the columns are in a unique covering index, and that index is the only one that is useable. There is no other index that has that set of columns in it. These two examples are cases where SQL Server should just generate the plan and not try to find something better. The trivial plan optimizer finds the really obvious plans, which are typically very inexpensive. In fact, all the plans that get through the autoparameterization template result in plans that the trivial plan optimizer can find. Between those two mechanisms, the plans that are simple tend to be weeded out earlier in the process and do not pay a lot of the compilation cost. This is a good thing, because the number of potential plans in 7.0 went up astronomically as SQL Server added hash joins, merge joins and index intersections, to its list of processing techniques. Simplification and Statistics Loading If a plan is not found by the trivial plan optimizer, SQL Server can perform some simplifications, usually thought of as syntactic transformations of the query itself, looking for commutative properties and operations that can be rearranged. SQL Server can do constant folding, and other operations that do not require looking at the cost or analyzing what indexes are, but that can result in a more efficient query. SQL Server then loads up the metadata including the statistics information on the indexes, and then the optimizer goes through a series of phases of cost based optimization. Cost Based Optimization Phases The cost based optimizer is designed as a set of transformation rules that try various permutations of indexes and join strategies. Because of the number of potential plans in SQL Server 7.0 and SQL Server 2000, if the optimizer just ran through all the combinations and produced a plan, the optimization process would take a very long time to run. Therefore, optimization is broken up into phases. Each phase is a set of rules. After each phase is run, the cost of any resulting plan is examined, and if SQL Server determines that the plan is cheap enough, that plan is kept and executed. If the plan is not cheap enough, the optimizer runs the next phase, which is another set of rules. In the vast majority of cases, a good plan will be found in the preliminary phases. Typically, if the plan that a query would have had in SQL Server 6.5 is also the optimal plan in SQL Server 7.0 and SQL Server 2000, the plan will tend to be found either by the trivial plan optimizer or by the first phase of the cost based optimizer. The rules were intentionally organized to try to make that be true. The plan will probably consist of using a single index and using nested loops. However, every once in a while, because of lack of statistical information, or some other nuance, the optimizer will have to proceed with the later phases of optimization. Sometimes this is because there is a real possibility that the optimizer could find a better plan. When a plan is found, it becomes the optimizer’s output, and then SQL Server goes through all the caching mechanisms that we have already discussed in Module 5. 
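To make the trivial-plan cases mentioned above concrete, here is a minimal sketch; the throwaway temporary table is hypothetical.

CREATE TABLE #trivial_demo (id int NOT NULL PRIMARY KEY, val varchar(10) NULL)

-- INSERT with a VALUES clause: there is only one possible plan.
INSERT INTO #trivial_demo (id, val) VALUES (1, 'a')

-- Every referenced column lives in the unique clustered index created by the
-- PRIMARY KEY, and no other index exists, so there is nothing to cost against.
SELECT id FROM #trivial_demo WHERE id = 1

DROP TABLE #trivial_demo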
Full Optimization
At some point, the optimizer determines that it has gone through enough preliminary phases, and it reverts to a phase called full optimization. If the optimizer goes through all the preliminary phases and still has not found a cheap plan, it examines the cost of the plan that it has so far. If the cost is above the threshold, the optimizer goes into the full optimization phase. This threshold is configurable, as the configuration option ‘cost threshold for parallelism’. The full optimization phase assumes that this plan should be run in parallel. If the machine is very busy, the plan will end up running in serial, but the optimizer’s goal is to produce a good parallel plan. If the cost is below the threshold (or the machine has only a single processor), the full optimization phase just uses a brute force method to find a serial plan.

Selectivity Estimation
Selectivity Is One of The Most Important Pieces of Information
One of the most important things the optimizer needs to know is the number of rows from any table that will meet all the conditions in the query. If there are no restrictions on a table, and all the rows will be needed, the optimizer can determine the number of rows from the sysindexes table. This number is not absolutely guaranteed to be accurate, but it is the number the optimizer uses. If there is a filter on the table in a WHERE clause, the optimizer needs statistics information. Indexes automatically maintain statistics, and the optimizer will use these values to determine the usefulness of the index. If there is no index on the column involved in the filter, then column statistics can be used or generated.

Optimizing Search Arguments
In General, the Filters in the WHERE Clause Determine Which Indexes Will Be Useful
If an indexed column is referenced in a Search Argument (SARG), the optimizer will analyze the cost of using that index. A SARG has the form:
• column <operator> value
• value <operator> column
• Operator must be one of =, >, >=, <, <=
The value can be a constant, an operation, or a variable. Some functions also will be treated as SARGs. These queries have SARGs, and a nonclustered index on firstname will be used in most cases:
select * from member where firstname < 'AKKG'
select * from member where firstname = substring('HAAKGALSFJA', 2, 5)
select * from member where firstname = 'AA' + 'KG'
declare @name char(4)
set @name = 'AKKG'
select * from member where firstname < @name
Not all functions can be used in SARGs:
select * from charge where charge_amt < 2*2
select * from charge where charge_amt < sqrt(16)
Compare these queries to ones using = instead of <. With =, the optimizer can use the density information to come up with a good row estimate, even if it is not going to actually perform the function’s calculations.

A filter with a variable is usually a SARG
The issue is, can the optimizer come up with useful costing information?
A filter with a variable is not a SARG if the variable is of a different datatype and the column must be converted to the variable's datatype. For more information, see Knowledge Base article Q198625.
USE credit
GO
CREATE TABLE [member2] (
[member_no] [smallint] NOT NULL ,
[lastname] [shortstring] NOT NULL ,
[firstname] [shortstring] NOT NULL ,
[middleinitial] [letter] NULL ,
[street] [shortstring] NOT NULL ,
[city] [shortstring] NOT NULL ,
[state_prov] [statecode] NOT NULL ,
[country] [countrycode] NOT NULL ,
[mail_code] [mailcode] NOT NULL )
GO
insert into member2
select member_no, lastname, firstname, middleinitial, street, city, state_prov, country, mail_code
from member
alter table member2 add constraint pk_member2 primary key clustered (lastname, member_no, firstname, country)
declare @id int
set @id = 47
update member2
set city = city + ' City', state_prov = state_prov + ' State'
where lastname = 'Barr'
and member_no = @id
and firstname = 'URQYJBFVRRPWKVW'
and country = 'USA'
These queries don't have SARGs, and a table scan will be done:
select * from member where substring(lastname, 1,2) = 'BA'
Some non-SARGs can be converted:
select * from member where lastname like 'ba%'
In some cases, you can rewrite your query to turn a non-SARG into a SARG; for example, you can rewrite the substring query above as the LIKE query that follows it.
Join Order and Types of Joins
Join Order and Strategy Is Determined by the Optimizer
The execution plan output displays the join order from top to bottom; i.e., the table listed on top is the first one accessed in a join. You can override the optimizer's join order decision in two ways:
 OPTION (FORCE ORDER) applies to one query
 SET FORCEPLAN ON applies to the entire session, until set OFF
If either of these options is used, the join order is determined by the order in which the tables are listed in the query's FROM clause, and no join-order optimization is done. Forcing the join order may also force a particular join strategy. For example, in most outer join operations, the outer table is processed first, and a nested loops join is done. However, if you force the inner table to be accessed first, a merge join will need to be done. Compare the query plan for this query with and without the FORCE ORDER hint:
select * from titles right join publishers on titles.pub_id = publishers.pub_id -- OPTION (FORCE ORDER)
Nested Loop Join
A nested iteration is when the query optimizer constructs a set of nested loops, and the result set grows as it progresses through the rows. The query optimizer performs the following steps.
1. Finds a row from the first table.
2. Uses that row to scan the next table.
3. Uses the result of the previous table to scan the next table.
Evaluating Join Combinations
The query optimizer automatically evaluates at least four possible join combinations, even if those combinations are not specified in the join predicate. You do not have to add redundant clauses. The query optimizer balances the cost and uses statistics to determine the number of join combinations that it evaluates. Evaluating every possible join combination is inefficient and costly.
Evaluating Cost of Query Performance
When the query optimizer performs a nested join, you should be aware that certain costs are incurred. Nested loop joins are far superior to both merge joins and hash joins when executing small transactions, such as those affecting only a small set of rows.
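The three steps above amount to a doubly nested scan. A minimal Python sketch of the idea follows; the rows-as-dictionaries model and sample data are invented for illustration, and a real engine would normally drive the inner loop through an index seek rather than a full scan.
def nested_loop_join(outer, inner, key):
    """Naive nested loop join over two lists of row dictionaries."""
    result = []
    for o in outer:                 # 1. find a row from the first (outer) table
        for i in inner:             # 2. use that row to scan the next table
            if o[key] == i[key]:    # 3. keep matches; the result grows as we go
                result.append({**o, **i})
    return result

members = [{"member_no": 1, "lastname": "Barr"}, {"member_no": 2, "lastname": "Chen"}]
charges = [{"member_no": 1, "charge_amt": 9.50}, {"member_no": 1, "charge_amt": 3.25}]
print(nested_loop_join(members, charges, "member_no"))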
The query optimizer:  Uses nested loop joins if the outer input is quite small and the inner input is indexed and quite large.  Uses the smaller input as the outer table.  Requires that a useful index exist on the join predicate for the inner table.  Always uses a nested loop join strategy if the join operation uses an operator other than an equality operator. Merge Joins The columns of the join conditions are used as inputs to process a merge join. SQL Server performs the following steps when using a merge join strategy: 1. Gets the first input values from each input set. 2. Compares input values. 3. Performs a merge algorithm. • If the input values are equal, the rows are returned. • If the input values are not equal, the lower value is discarded, and the next input value from that input is used for the next comparison. 4. Repeats the process until all of the rows from one of the input sets have been processed. 5. Evaluates any remaining search conditions in the query and returns only rows that qualify. Note Only one pass per input is done. The merge join operation ends after all of the input values of one input have been evaluated. The remaining values from the other input are not processed. Requires That Joined Columns Are Sorted If you execute a query with join operations, and the joined columns are in sorted order, the query optimizer processes the query by using a merge join strategy. A merge join is very efficient because the columns are already sorted, and it requires fewer page I/O. Evaluates Sorted Values For the query optimizer to use the merge join, the inputs must be sorted. The query optimizer evaluates sorted values in the following order: 1. Uses an existing index tree (most typical). The query optimizer can use the index tree from a clustered index or a covered nonclustered index. 2. Leverages sort operations that the GROUP BY, ORDER BY, and CUBE clauses use. The sorting operation only has to be performed once. 3. Performs its own sort operation in which a SORT operator is displayed when graphically viewing the execution plan. The query optimizer does this very rarely. Performance Considerations Consider the following facts about the query optimizer's use of the merge join:  SQL Server performs a merge join for all types of join operations (except cross join or full join operations), including UNION operations.  A merge join operation may be a one-to-one, one-to-many, or many-to-many operation. If the merge join is a many-to-many operation, SQL Server uses a temporary table to store the rows. If duplicate values from each input exist, one of the inputs rewinds to the start of the duplicates as each duplicate value from the other input is processed.  Query performance for a merge join is very fast, but the cost can be high if the query optimizer must perform its own sort operation. If the data volume is large and the desired data can be obtained presorted from existing Balanced-Tree (B-Tree) indexes, merge join is often the fastest join algorithm.  A merge join is typically used if the two join inputs have a large amount of data and are sorted on their join columns (for example, if the join inputs were obtained by scanning sorted indexes).  Merge join operations can only be performed with an equality operator in the join predicate. Hashing is a strategy for dividing data into equal sets of a manageable size based on a given property or characteristic. The grouped data can then be used to determine whether a particular data item matches an existing value. 
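The merge steps listed above can be sketched compactly in Python. This is only an illustration of the one-pass algorithm over two pre-sorted inputs; the duplicate handling needed for many-to-many joins (the "rewind" described later) is deliberately omitted, and the data is made up.
def merge_join(left, right, key):
    """One-pass merge join over two inputs already sorted on the join key."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # 4. repeat until one input is exhausted
        a, b = left[i], right[j]
        if a[key] == b[key]:                  # 3. equal input values: return the joined row
            result.append({**a, **b})
            i += 1
            j += 1
        elif a[key] < b[key]:                 # 3. otherwise discard the lower value
            i += 1
        else:
            j += 1
    return result                             # remaining rows of the other input are never read

left = sorted([{"id": 1, "name": "a"}, {"id": 3, "name": "c"}], key=lambda r: r["id"])
right = sorted([{"id": 1, "val": 10}, {"id": 2, "val": 20}, {"id": 3, "val": 30}], key=lambda r: r["id"])
print(merge_join(left, right, "id"))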
Note Duplicate data or ranges of data are not useful for hash joins because the data is not organized together or in order. When a Hash Join Is Used The query optimizer uses a hash join option when it estimates that it is more efficient than processing queries by using a nested loop or merge join. It typically uses a hash join when an index does not exist or when existing indexes are not useful. Assigns a Build and Probe Input The query optimizer assigns a build and probe input. If the query optimizer incorrectly assigns the build and probe input (this may occur because of imprecise density estimates), it reverses them dynamically. The ability to change input roles dynamically is called role reversal. Build input consists of the column values from a table with the lowest number of rows. Build input creates a hash table in memory to store these values. The hash bucket is a storage place in the hash table in which each row of the build input is inserted. Rows from one of the join tables are placed into the hash bucket where the hash key value of the row matches the hash key value of the bucket. Hash buckets are stored as a linked list and only contain the columns that are needed for the query. A hash table contains hash buckets. The hash table is created from the build input. Probe input consists of the column values from the table with the most rows. Probe input is what the build input checks to find a match in the hash buckets. Note The query optimizer uses column or index statistics to help determine which input is the smaller of the two. Processing a Hash Join The following list is a simplified description of how the query optimizer processes a hash join. It is not intended to be comprehensive because the algorithm is very complex. SQL Server: 1. Reads the probe input. Each probe input is processed one row at a time. 2. Performs the hash algorithm against each probe input and generates a hash key value. 3. Finds the hash bucket that matches the hash key value. 4. Accesses the hash bucket and looks for the matching row. 5. Returns the row if a match is found. Performance Considerations Consider the following facts about the hash joins that the query optimizer uses:  Similar to merge joins, a hash join is very efficient, because it uses hash buckets, which are like a dynamic index but with less overhead for combining rows.  Hash joins can be performed for all types of join operations (except cross join operations), including UNION and DIFFERENCE operations.  A hash operator can remove duplicates and group data, such as SUM (salary) GROUP BY department. The query optimizer uses only one input for both the build and probe roles.  If join inputs are large and are of similar size, the performance of a hash join operation is similar to a merge join with prior sorting. However, if the size of the join inputs is significantly different, the performance of a hash join is often much faster.  Hash joins can process large, unsorted, non-indexed inputs efficiently. Hash joins are useful in complex queries because the intermediate results: • Are not indexed (unless explicitly saved to disk and then indexed). • Are often not sorted for the next operation in the execution plan.  The query optimizer can identify incorrect estimates and make corrections dynamically to process the query more efficiently.  A hash join reduces the need for database denormalization. 
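The build/probe flow just described can be sketched the same way: build a hash table on the smaller input, then probe it with each row of the larger input. Again, the table layout and values below are invented for illustration and are not the actual implementation.
from collections import defaultdict

def hash_join(build, probe, key):
    """Hash join: hash the build input into buckets, then probe row by row."""
    buckets = defaultdict(list)                 # hash table keyed on the join column
    for b in build:
        buckets[b[key]].append(b)               # insert each build row into its bucket
    result = []
    for p in probe:                             # read the probe input one row at a time
        for b in buckets.get(p[key], []):       # find the bucket matching the hash key
            result.append({**b, **p})           # return the row if a match is found
    return result

products = [{"productid": 1, "supplierid": 1}, {"productid": 2, "supplierid": 7}]
order_details = [{"productid": 1, "qty": 3}, {"productid": 1, "qty": 5}, {"productid": 2, "qty": 1}]
print(hash_join(products, order_details, "productid"))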
Denormalization is typically used to achieve better performance by reducing join operations, in spite of the redundancy and the dangers it brings, such as inconsistent updates. Hash joins give you the option to vertically partition your data as part of your physical database design. Vertical partitioning represents groups of columns from a single table in separate files or indexes.
Subquery Performance
Joins Are Not Inherently Better Than Subqueries
Here is an example showing three different ways to update a table, using a second table for lookup purposes. The first uses a JOIN with the update, the second uses a regular subquery introduced with IN, and the third uses a correlated subquery. All three yield nearly identical performance.
Note: Performance comparisons cannot just be made based on I/Os. With HASHING and MERGING techniques, the number of reads may be the same for two queries, yet one may take a lot longer and use more memory resources. Also, always be sure to monitor statistics time.
Suppose you want to add a 5 percent discount to order items in the Order Details table for which the supplier is Exotic Liquids, whose supplierid is 1.
-- JOIN solution
BEGIN TRAN
UPDATE OD SET discount = discount + 0.05
FROM [Order Details] AS OD JOIN Products AS P
ON OD.productid = P.productid
WHERE supplierid = 1
ROLLBACK TRAN
-- Regular subquery solution
BEGIN TRAN
UPDATE [Order Details]
SET discount = discount + 0.05
WHERE productid IN
(SELECT productid FROM Products WHERE supplierid = 1)
ROLLBACK TRAN
-- Correlated Subquery Solution
BEGIN TRAN
UPDATE [Order Details]
SET discount = discount + 0.05
WHERE EXISTS(SELECT supplierid FROM Products
WHERE [Order Details].productid = Products.productid
AND supplierid = 1)
ROLLBACK TRAN
Internally, Your Join May Be Rewritten
SQL Server's query processor has many different ways of resolving your JOIN expressions. Subqueries may be converted to a JOIN with an implied distinct, which may result in a logical operator of SEMI JOIN. Compare the plans of the first two queries:
USE credit
select member_no from member
where member_no in (select member_no from charge)
select distinct m.member_no
from member m join charge c on m.member_no = c.member_no
The second query uses a HASH MATCH as the final step to remove the duplicates. The first query only had to do a semi join. For these queries, although the I/O values are the same, the first query (with the subquery) runs much faster (almost twice as fast).
Another similar looking join is
Computer Networking: A Top-Down Approach, 6th Edition Solutions to Review Questions and Problems Version Date: May 2012 This document contains the solutions to review questions and problems for the 5th edition of Computer Networking: A Top-Down Approach by Jim Kurose and Keith Ross. These solutions are being made available to instructors ONLY. Please do NOT copy or distribute this document to others (even other instructors). Please do not post any solutions on a publicly-available Web site. We’ll be happy to provide a copy (up-to-date) of this solution manual ourselves to anyone who asks. Acknowledgments: Over the years, several students and colleagues have helped us prepare this solutions manual. Special thanks goes to HongGang Zhang, Rakesh Kumar, Prithula Dhungel, and Vijay Annapureddy. Also thanks to all the readers who have made suggestions and corrected errors. All material © copyright 1996-2012 by J.F. Kurose and K.W. Ross. All rights reserved Chapter 1 Review Questions There is no difference. Throughout this text, the words “host” and “end system” are used interchangeably. End systems include PCs, workstations, Web servers, mail servers, PDAs, Internet-connected game consoles, etc. From Wikipedia: Diplomatic protocol is commonly described as a set of international courtesy rules. These well-established and time-honored rules have made it easier for nations and people to live and work together. Part of protocol has always been the acknowledgment of the hierarchical standing of all present. Protocol rules are based on the principles of civility. Standards are important for protocols so that people can create networking systems and products that interoperate. 1. Dial-up modem over telephone line: home; 2. DSL over telephone line: home or small office; 3. Cable to HFC: home; 4. 100 Mbps switched Ethernet: enterprise; 5. Wifi (802.11): home and enterprise: 6. 3G and 4G: wide-area wireless. HFC bandwidth is shared among the users. On the downstream channel, all packets emanate from a single source, namely, the head end. Thus, there are no collisions in the downstream channel. In most American cities, the current possibilities include: dial-up; DSL; cable modem; fiber-to-the-home. 7. Ethernet LANs have transmission rates of 10 Mbps, 100 Mbps, 1 Gbps and 10 Gbps. 8. Today, Ethernet most commonly runs over twisted-pair copper wire. It also can run over fibers optic links. 9. Dial up modems: up to 56 Kbps, bandwidth is dedicated; ADSL: up to 24 Mbps downstream and 2.5 Mbps upstream, bandwidth is dedicated; HFC, rates up to 42.8 Mbps and upstream rates of up to 30.7 Mbps, bandwidth is shared. FTTH: 2-10Mbps upload; 10-20 Mbps download; bandwidth is not shared. 10. There are two popular wireless Internet access technologies today: Wifi (802.11) In a wireless LAN, wireless users transmit/receive packets to/from an base station (i.e., wireless access point) within a radius of few tens of meters. The base station is typically connected to the wired Internet and thus serves to connect wireless users to the wired network. 3G and 4G wide-area wireless access networks. In these systems, packets are transmitted over the same wireless infrastructure used for cellular telephony, with the base station thus being managed by a telecommunications provider. This provides wireless access to users within a radius of tens of kilometers of the base station. 11. At time t0 the sending host begins to transmit. 
At time t1 = L/R1, the sending host completes transmission and the entire packet is received at the router (no propagation delay). Because the router has the entire packet at time t1, it can begin to transmit the packet to the receiving host at time t1. At time t2 = t1 + L/R2, the router completes transmission and the entire packet is received at the receiving host (again, no propagation delay). Thus, the end-to-end delay is L/R1 + L/R2.
12. A circuit-switched network can guarantee a certain amount of end-to-end bandwidth for the duration of a call. Most packet-switched networks today (including the Internet) cannot make any end-to-end guarantees for bandwidth. FDM requires sophisticated analog hardware to shift signals into appropriate frequency bands.
13. a) 2 users can be supported because each user requires half of the link bandwidth.
b) Since each user requires 1 Mbps when transmitting, if two or fewer users transmit simultaneously, a maximum of 2 Mbps will be required. Since the available bandwidth of the shared link is 2 Mbps, there will be no queuing delay before the link. Whereas, if three users transmit simultaneously, the bandwidth required will be 3 Mbps, which is more than the available bandwidth of the shared link. In this case, there will be queuing delay before the link.
c) Probability that a given user is transmitting = 0.2
d) Probability that all three users are transmitting simultaneously = (0.2)^3 = 0.008. Since the queue grows when all the users are transmitting, the fraction of time during which the queue grows (which is equal to the probability that all three users are transmitting simultaneously) is 0.008.
14. If the two ISPs do not peer with each other, then when they send traffic to each other they have to send the traffic through a provider ISP (intermediary), to which they have to pay for carrying the traffic. By peering with each other directly, the two ISPs can reduce their payments to their provider ISPs. An Internet Exchange Point (IXP) (typically in a standalone building with its own switches) is a meeting point where multiple ISPs can connect and/or peer together. An IXP earns its money by charging each of the ISPs that connect to it a relatively small fee, which may depend on the amount of traffic sent to or received from the IXP.
15. Google's private network connects together all its data centers, big and small. Traffic between the Google data centers passes over its private network rather than over the public Internet. Many of these data centers are located in, or close to, lower-tier ISPs. Therefore, when Google delivers content to a user, it often can bypass higher-tier ISPs. What motivates content providers to create these networks? First, the content provider has more control over the user experience, since it has to use fewer intermediary ISPs. Second, it can save money by sending less traffic into provider networks. Third, if ISPs decide to charge more money to highly profitable content providers (in countries where net neutrality doesn't apply), the content providers can avoid these extra payments.
16. The delay components are processing delays, transmission delays, propagation delays, and queuing delays. All of these delays are fixed, except for the queuing delays, which are variable.
17. a) 1000 km, 1 Mbps, 100 bytes b) 100 km, 1 Mbps, 100 bytes
18. 10 msec; d/s; no; no
19. a) 500 kbps b) 64 seconds c) 100 kbps; 320 seconds
20. End system A breaks the large file into chunks.
It adds header to each chunk, thereby generating multiple packets from the file. The header in each packet includes the IP address of the destination (end system B). The packet switch uses the destination IP address in the packet to determine the outgoing link. Asking which road to take is analogous to a packet asking which outgoing link it should be forwarded on, given the packet’s destination address. 21. The maximum emission rate is 500 packets/sec and the maximum transmission rate is 350 packets/sec. The corresponding traffic intensity is 500/350 =1.43 > 1. Loss will eventually occur for each experiment; but the time when loss first occurs will be different from one experiment to the next due to the randomness in the emission process. 22. Five generic tasks are error control, flow control, segmentation and reassembly, multiplexing, and connection setup. Yes, these tasks can be duplicated at different layers. For example, error control is often provided at more than one layer. 23. The five layers in the Internet protocol stack are – from top to bottom – the application layer, the transport layer, the network layer, the link layer, and the physical layer. The principal responsibilities are outlined in Section 1.5.1. 24. Application-layer message: data which an application wants to send and passed onto the transport layer; transport-layer segment: generated by the transport layer and encapsulates application-layer message with transport layer header; network-layer datagram: encapsulates transport-layer segment with a network-layer header; link-layer frame: encapsulates network-layer datagram with a link-layer header. 25. Routers process network, link and physical layers (layers 1 through 3). (This is a little bit of a white lie, as modern routers sometimes act as firewalls or caching components, and process Transport layer as well.) Link layer switches process link and physical layers (layers 1 through2). Hosts process all five layers. 26. a) Virus Requires some form of human interaction to spread. Classic example: E-mail viruses. b) Worms No user replication needed. Worm in infected host scans IP addresses and port numbers, looking for vulnerable processes to infect. 27. Creation of a botnet requires an attacker to find vulnerability in some application or system (e.g. exploiting the buffer overflow vulnerability that might exist in an application). After finding the vulnerability, the attacker needs to scan for hosts that are vulnerable. The target is basically to compromise a series of systems by exploiting that particular vulnerability. Any system that is part of the botnet can automatically scan its environment and propagate by exploiting the vulnerability. An important property of such botnets is that the originator of the botnet can remotely control and issue commands to all the nodes in the botnet. Hence, it becomes possible for the attacker to issue a command to all the nodes, that target a single node (for example, all nodes in the botnet might be commanded by the attacker to send a TCP SYN message to the target, which might result in a TCP SYN flood attack at the target). 28. Trudy can pretend to be Bob to Alice (and vice-versa) and partially or completely modify the message(s) being sent from Bob to Alice. For example, she can easily change the phrase “Alice, I owe you $1000” to “Alice, I owe you $10,000”. Furthermore, Trudy can even drop the packets that are being sent by Bob to Alice (and vise-versa), even if the packets from Bob to Alice are encrypted. 
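Two of the numeric answers above (the store-and-forward delay in question 11 and the traffic intensity in question 21) reduce to one-line computations. The Python check below reproduces them; the 1500-byte packet and 1 Mbps / 2 Mbps link rates in the first call are illustrative values of my own, not figures from the questions.
def store_and_forward_delay(L, rates):
    """End-to-end delay of one L-bit packet over links with the given rates
    (propagation and queuing ignored): the sum of L/R over every link."""
    return sum(L / R for R in rates)

def traffic_intensity(arrival_rate_pkts, service_rate_pkts):
    """Ratio of arrival rate to transmission rate; > 1 means the queue grows and loss occurs."""
    return arrival_rate_pkts / service_rate_pkts

print(store_and_forward_delay(L=1500 * 8, rates=[1e6, 2e6]))   # L/R1 + L/R2 seconds
print(traffic_intensity(500, 350))                             # 1.43 > 1, as in question 21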
Chapter 1 Problems Problem 1 There is no single right answer to this question. Many protocols would do the trick. Here's a simple answer below: Messages from ATM machine to Server Msg name purpose -------- ------- HELO Let server know that there is a card in the ATM machine ATM card transmits user ID to Server PASSWD User enters PIN, which is sent to server BALANCE User requests balance WITHDRAWL User asks to withdraw money BYE user all done Messages from Server to ATM machine (display) Msg name purpose -------- ------- PASSWD Ask user for PIN (password) OK last requested operation (PASSWD, WITHDRAWL) OK ERR last requested operation (PASSWD, WITHDRAWL) in ERROR AMOUNT sent in response to BALANCE request BYE user done, display welcome screen at ATM Correct operation: client server HELO (userid) --------------> (check if valid userid) <------------- PASSWD PASSWD --------------> (check password) <------------- AMOUNT WITHDRAWL --------------> check if enough $ to cover withdrawl (check if valid userid) <------------- PASSWD PASSWD --------------> (check password) <------------- AMOUNT WITHDRAWL --------------> check if enough $ to cover withdrawl <------------- BYE Problem 2 At time N*(L/R) the first packet has reached the destination, the second packet is stored in the last router, the third packet is stored in the next-to-last router, etc. At time N*(L/R) + L/R, the second packet has reached the destination, the third packet is stored in the last router, etc. Continuing with this logic, we see that at time N*(L/R) + (P-1)*(L/R) = (N+P-1)*(L/R) all packets have reached the destination. Problem 3 a) A circuit-switched network would be well suited to the application, because the application involves long sessions with predictable smooth bandwidth requirements. Since the transmission rate is known and not bursty, bandwidth can be reserved for each application session without significant waste. In addition, the overhead costs of setting up and tearing down connections are amortized over the lengthy duration of a typical application session. b) In the worst case, all the applications simultaneously transmit over one or more network links. However, since each link has sufficient bandwidth to handle the sum of all of the applications' data rates, no congestion (very little queuing) will occur. Given such generous link capacities, the network does not need congestion control mechanisms. Problem 4 Between the switch in the upper left and the switch in the upper right we can have 4 connections. Similarly we can have four connections between each of the 3 other pairs of adjacent switches. Thus, this network can support up to 16 connections. We can 4 connections passing through the switch in the upper-right-hand corner and another 4 connections passing through the switch in the lower-left-hand corner, giving a total of 8 connections. Yes. For the connections between A and C, we route two connections through B and two connections through D. For the connections between B and D, we route two connections through A and two connections through C. In this manner, there are at most 4 connections passing through any link. Problem 5 Tollbooths are 75 km apart, and the cars propagate at 100km/hr. A tollbooth services a car at a rate of one car every 12 seconds. a) There are ten cars. It takes 120 seconds, or 2 minutes, for the first tollbooth to service the 10 cars. Each of these cars has a propagation delay of 45 minutes (travel 75 km) before arriving at the second tollbooth. 
Thus, all the cars are lined up before the second tollbooth after 47 minutes. The whole process repeats itself for traveling between the second and third tollbooths. It also takes 2 minutes for the third tollbooth to service the 10 cars. Thus the total delay is 96 minutes.
b) Delay between tollbooths is 8*12 seconds plus 45 minutes, i.e., 46 minutes and 36 seconds. The total delay is twice this amount plus 8*12 seconds, i.e., 94 minutes and 48 seconds.
Problem 6
a) dprop = m/s seconds.
b) dtrans = L/R seconds.
c) dend-to-end = (m/s + L/R) seconds.
d) The bit is just leaving Host A.
e) The first bit is in the link and has not reached Host B.
f) The first bit has reached Host B.
g) Want dprop = dtrans, i.e., m/s = L/R, so m = (L/R)*s km.
Problem 7
Consider the first bit in a packet. Before this bit can be transmitted, all of the bits in the packet must be generated. This requires 7 msec. The time required to transmit the packet is 0.224 msec. Propagation delay = 10 msec. The delay until decoding is 7 msec + 0.224 msec + 10 msec = 17.224 msec. A similar analysis shows that all bits experience a delay of 17.224 msec.
Problem 8
a) 20 users can be supported.
b) .
c) .
d) We use the central limit theorem to approximate the probability that 21 or more users are transmitting simultaneously. Let X1, X2, ..., Xn be independent random variables with Xi = 1 if user i is transmitting and Xi = 0 otherwise. Then P(21 or more users) = P(X1 + X2 + ... + Xn >= 21), which can be approximated by standardizing the sum and treating it as a standard normal r.v.
Problem 9
10,000
Problem 10
The first end system requires L/R1 to transmit the packet onto the first link; the packet propagates over the first link in d1/s1; the packet switch adds a processing delay of dproc; after receiving the entire packet, the packet switch connecting the first and the second link requires L/R2 to transmit the packet onto the second link; the packet propagates over the second link in d2/s2. Similarly, we can find the delay caused by the second switch and the third link: L/R3, dproc, and d3/s3. Adding these five delays gives
dend-end = L/R1 + L/R2 + L/R3 + d1/s1 + d2/s2 + d3/s3 + dproc + dproc
To answer the second question, we simply plug the values into the equation to get 6 + 6 + 6 + 20 + 16 + 4 + 3 + 3 = 64 msec.
Problem 11
Because bits are immediately transmitted, the packet switch does not introduce any delay; in particular, it does not introduce a transmission delay. Thus,
dend-end = L/R + d1/s1 + d2/s2 + d3/s3
For the values in Problem 10, we get 6 + 20 + 16 + 4 = 46 msec.
Problem 12
The arriving packet must first wait for the link to transmit 4.5 * 1,500 bytes = 6,750 bytes or 54,000 bits. Since these bits are transmitted at 2 Mbps, the queuing delay is 27 msec. Generally, the queuing delay is (nL + (L - x))/R.
Problem 13
a) The queuing delay is 0 for the first transmitted packet, L/R for the second transmitted packet, and generally, (n-1)L/R for the nth transmitted packet. Thus, the average delay for the N packets is:
(L/R + 2L/R + ... + (N-1)L/R)/N = L/(RN) * (1 + 2 + ... + (N-1)) = L/(RN) * N(N-1)/2 = (N-1)L/(2R)
Note that here we used the well-known fact 1 + 2 + ... + N = N(N+1)/2.
b) It takes LN/R seconds to transmit the N packets. Thus, the buffer is empty when each batch of packets arrives. Thus, the average delay of a packet across all batches is the average delay within one batch, i.e., (N-1)L/(2R).
Problem 14
a) The transmission delay is L/R. The total delay is IL/(R(1 - I)) + L/R = (L/R)/(1 - I), where I = aL/R is the traffic intensity.
b) Let x = L/R. Total delay = x/(1 - ax). For x = 0, the total delay = 0; as we increase x, the total delay increases, approaching infinity as x approaches 1/a.
Problem 15
Total delay = (L/R)/(1 - I) = (L/R)/(1 - aL/R) = 1/(R/L - a) = 1/(µ - a), where µ = R/L.
Problem 16
The total number of packets in the system includes those in the buffer and the packet that is being transmitted. So, N = 10 + 1. Because N = a*d (Little's formula), (10+1) = a*(queuing delay + transmission delay).
That is, 11 = a*(0.01 + 1/100) = a*(0.01 + 0.01). Thus, a = 550 packets/sec.
Problem 17
There are N nodes (the source host and the N-1 routers). Let dproc,n denote the processing delay at the nth node. Let Rn be the transmission rate of the nth link and let dtrans,n = L/Rn. Let dprop,n be the propagation delay across the nth link. Then
dend-end = sum over n = 1 to N of [dproc,n + dtrans,n + dprop,n].
Let dqueue,n denote the average queuing delay at node n. Then
dend-end = sum over n = 1 to N of [dproc,n + dtrans,n + dprop,n + dqueue,n].
Problem 18
On Linux you can use the command
traceroute www.targethost.com
and in the Windows command prompt you can use
tracert www.targethost.com
In either case, you will get three delay measurements. For those three measurements you can calculate the mean and standard deviation. Repeat the experiment at different times of the day and comment on any changes. Here is an example solution:
Traceroutes between San Diego Super Computer Center and www.poly.edu
The average (mean) of the round-trip delays at each of the three hours is 71.18 ms, 71.38 ms and 71.55 ms, respectively. The standard deviations are 0.075 ms, 0.21 ms, 0.05 ms, respectively. In this example, the traceroutes have 12 routers in the path at each of the three hours. No, the paths didn't change during any of the hours. Traceroute packets passed through four ISP networks from source to destination. Yes, in this experiment the largest delays occurred at peering interfaces between adjacent ISPs.
Traceroutes from www.stella-net.net (France) to www.poly.edu (USA)
The average round-trip delays at each of the three hours are 87.09 ms, 86.35 ms and 86.48 ms, respectively. The standard deviations are 0.53 ms, 0.18 ms, 0.23 ms, respectively. In this example, there are 11 routers in the path at each of the three hours. No, the paths didn't change during any of the hours. Traceroute packets passed three ISP networks from source to destination. Yes, in this experiment the largest delays occurred at peering interfaces between adjacent ISPs.
Problem 19
An example solution:
Traceroutes from two different cities in France to New York City in United States
In these traceroutes from two different cities in France to the same destination host in United States, seven links are in common including the transatlantic link. In this example of traceroutes from one city in France and from another city in Germany to the same host in United States, three links are in common including the transatlantic link.
Traceroutes to two different cities in China from the same host in United States
Five links are common in the two traceroutes. The two traceroutes diverge before reaching China.
Problem 20
Throughput = min{Rs, Rc, R/M}
Problem 21
If only one path is used, the maximum throughput is given by the best single path:
max{min{R1^1, R2^1, ..., RN^1}, min{R1^2, R2^2, ..., RN^2}, ..., min{R1^M, R2^M, ..., RN^M}}.
If all M paths are used, the maximum throughput is given by the sum over the paths:
sum over k = 1 to M of min{R1^k, R2^k, ..., RN^k}.
Problem 22
Probability of successfully receiving a packet is: ps = (1-p)^N. The number of transmissions needed until the packet is successfully received by the client is a geometric random variable with success probability ps. Thus, the average number of transmissions needed is given by: 1/ps. Then, the average number of re-transmissions needed is given by: 1/ps - 1.
Problem 23
Let's call the first packet A and call the second packet B. If the bottleneck link is the first link, then packet B is queued at the first link waiting for the transmission of packet A. So the packet inter-arrival time at the destination is simply L/Rs. If the second link is the bottleneck link and both packets are sent back to back, it must be true that the second packet arrives at the input queue of the second link before the second link finishes the transmission of the first packet.
That is, L/Rs + L/Rs + dprop <= L/Rs + dprop + L/Rc. Thus, the minimum value of T is L/Rc - L/Rs.
Problem 24
40 terabytes = 40 * 10^12 * 8 bits. So, if using the dedicated link, it will take 40 * 10^12 * 8 / (100 * 10^6) = 3,200,000 seconds = 37 days. But with FedEx overnight delivery, you can guarantee the data arrives in one day, and it should cost less than $100.
Problem 25
a) 160,000 bits
b) 160,000 bits
c) The bandwidth-delay product of a link is the maximum number of bits that can be in the link.
d) The width of a bit = length of link / bandwidth-delay product, so 1 bit is 125 meters long, which is longer than a football field.
e) s/R
Problem 26
s/R = 20,000 km, so R = s/(20,000 km) = 2.5*10^8/(2*10^7) = 12.5 bps.
Problem 27
a) 80,000,000 bits
b) 800,000 bits, because the maximum number of bits that will be in the link at any given time = min(bandwidth-delay product, packet size) = 800,000 bits.
c) 0.25 meters
Problem 28
a) ttrans + tprop = 400 msec + 80 msec = 480 msec.
b) 20 * (ttrans + 2*tprop) = 20*(20 msec + 80 msec) = 2 sec.
c) Breaking up a file takes longer to transmit because each data packet and its corresponding acknowledgement packet add their own propagation delays.
Problem 29
Recall that a geostationary satellite is 36,000 kilometers away from the earth's surface.
a) 150 msec
b) 1,500,000 bits
c) 600,000,000 bits
Problem 30
Let's suppose the passenger and his/her bags correspond to the data unit arriving to the top of the protocol stack. When the passenger checks in, his/her bags are checked, and a tag is attached to the bags and ticket. This is additional information added in the Baggage layer in Figure 1.20 that allows the Baggage layer to implement the service of separating the passengers and baggage on the sending side, and then reuniting them (hopefully!) on the destination side. When a passenger then passes through security, an additional stamp is often added to his/her ticket, indicating that the passenger has passed through a security check. This information is used to ensure (e.g., by later checks for the security information) secure transfer of people.
Problem 31
Time to send the message from the source host to the first packet switch = 800 * 5 msec = 4 sec (the message is the size of 800 packets). With store-and-forward message switching, the total time to move the message from source host to destination host over the three links = 3 * 4 sec = 12 sec.
Time to send the 1st packet from the source host to the first packet switch = 5 msec. Time at which the 2nd packet is received at the first switch = time at which the 1st packet is received at the second switch = 2 * 5 msec = 10 msec.
Time at which the 1st packet is received at the destination host = 3 * 5 msec = 15 msec. After this, every 5 msec one packet will be received; thus the time at which the last (800th) packet is received = 15 msec + 799 * 5 msec = 4.01 sec. It can be seen that the delay when using message segmentation is significantly less (almost 1/3rd).
Without message segmentation, if bit errors are not tolerated, if there is a single bit error, the whole message has to be retransmitted (rather than a single packet). Without message segmentation, huge packets (containing HD videos, for example) are sent into the network. Routers have to accommodate these huge packets. Smaller packets have to queue behind enormous packets and suffer unfair delays.
Packets have to be put in sequence at the destination. Message segmentation results in many smaller packets. Since header size is usually the same for all packets regardless of their size, with message segmentation the total amount of header bytes is more.
Problem 32
Yes, the delays in the applet correspond to the delays in Problem 31. The propagation delays affect the overall end-to-end delays both for packet switching and message switching equally.
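The arithmetic in Problem 31 is easy to check numerically. The sketch below uses the figures implied by the worked answer (a message equal to 800 packets, 5 msec per packet per hop, and three links); the 2 Mbps rate and the 8*10^6-bit message size are my reading of those figures, not values restated from the problem text.
R = 2e6                      # link rate, bits/sec (assumed)
hops = 3                     # source -> switch -> switch -> destination (assumed)
message_bits = 8e6           # total message size (assumed)
packet_bits = 1e4            # packet size with segmentation (assumed)

# Without segmentation: the whole message is forwarded hop by hop.
t_message = hops * (message_bits / R)

# With segmentation: the first packet takes hops * (L/R); after that one
# packet arrives every L/R seconds until the last of the 800 packets.
n_packets = int(message_bits / packet_bits)
t_segmented = hops * (packet_bits / R) + (n_packets - 1) * (packet_bits / R)

print(t_message)      # 12.0 seconds
print(t_segmented)    # 4.01 seconds, roughly one third of the above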
Problem 33
There are F/S packets. Each packet is S + 80 bits. The time at which the last packet is received at the first router is (F/S)*(S + 80)/R sec. At this time, the first F/S - 2 packets are at the destination, and the (F/S - 1)th packet is at the second router. The last packet must then be transmitted by the first router and the second router, with each transmission taking (S + 80)/R sec. Thus the delay in sending the whole file is
delay = (F/S)*(S + 80)/R + 2*(S + 80)/R = (F/S + 2)*(S + 80)/R.
To calculate the value of S which leads to the minimum delay, set d(delay)/dS = 0; this gives S = sqrt(40F).
Problem 34
The circuit-switched telephone networks and the Internet are connected together at "gateways". When a Skype user (connected to the Internet) calls an ordinary telephone, a circuit is established between a gateway and the telephone user over the circuit-switched network. The Skype user's voice is sent in packets over the Internet to the gateway. At the gateway, the voice signal is reconstructed and then sent over the circuit. In the other direction, the voice signal is sent over the circuit-switched network to the gateway. The gateway packetizes the voice signal and sends the voice packets to the Skype user.
Chapter 2 Review Questions
The Web: HTTP; file transfer: FTP; remote login: Telnet; e-mail: SMTP; BitTorrent file sharing: BitTorrent protocol
Network architecture refers to the organization of the communication process into layers (e.g., the five-layer Internet architecture). Application architecture, on the other hand, is designed by an application developer and dictates the broad structure of the application (e.g., client-server or P2P).
The process which initiates the communication is the client; the process that waits to be contacted is the server.
No. In a P2P file-sharing application, the peer that is receiving a file is typically the client and the peer that is sending the file is typically the server.
The IP address of the destination host and the port number of the socket in the destination process.
You would use UDP. With UDP, the transaction can be completed in one roundtrip time (RTT) - the client sends the transaction request into a UDP socket, and the server sends the reply back to the client's UDP socket. With TCP, a minimum of two RTTs are needed - one to set up the TCP connection, and another for the client to send the request, and for the server to send back the reply.
One such example is remote word processing, for example, with Google docs. However, because Google docs runs over the Internet (using TCP), timing guarantees are not provided.
a) Reliable data transfer: TCP provides a reliable byte-stream between client and server but UDP does not.
b) A guarantee that a certain value for throughput will be maintained: Neither
c) A guarantee that data will be delivered within a specified amount of time: Neither
d) Confidentiality (via encryption): Neither
SSL operates at the application layer. The SSL socket takes unencrypted data from the application layer, encrypts it and then passes it to the TCP socket. If the application developer wants TCP to be enhanced with SSL, she has to include the SSL code in the application.
A protocol uses handshaking if the two communicating entities first exchange control packets before sending data to each other. SMTP uses handshaking at the application layer whereas HTTP does not.
The applications associated with those protocols require that all application data be received in the correct order and without gaps. TCP provides this service whereas UDP does not.
When the user first visits the site, the server creates a unique identification number, creates an entry in its back-end database, and returns this identification number as a cookie number. This cookie number is stored on the user’s host and is managed by the browser. During each subsequent visit (and purchase), the browser sends the cookie number back to the site. Thus the site knows when this user (more precisely, this browser) is visiting the site. Web caching can bring the desired content “closer” to the user, possibly to the same LAN to which the user’s host is connected. Web caching can reduce the delay for all objects, even objects that are not cached, since caching reduces the traffic on links. Telnet is not available in Windows 7 by default. to make it available, go to Control Panel, Programs and Features, Turn Windows Features On or Off, Check Telnet client. To start Telnet, in Windows command prompt, issue the following command > telnet webserverver 80 where "webserver" is some webserver. After issuing the command, you have established a TCP connection between your client telnet program and the web server. Then type in an HTTP GET message. An example is given below: Since the index.html page in this web server was not modified since Fri, 18 May 2007 09:23:34 GMT, and the above commands were issued on Sat, 19 May 2007, the server returned "304 Not Modified". Note that the first 4 lines are the GET message and header lines inputed by the user, and the next 4 lines (starting from HTTP/1.1 304 Not Modified) is the response from the web server. FTP uses two parallel TCP connections, one connection for sending control information (such as a request to transfer a file) and another connection for actually transferring the file. Because the control information is not sent over the same connection that the file is sent over, FTP sends control information out of band. The message is first sent from Alice’s host to her mail server over HTTP. Alice’s mail server then sends the message to Bob’s mail server over SMTP. Bob then transfers the message from his mail server to his host over POP3. 17. Received: from 65.54.246.203 (EHLO bay0-omc3-s3.bay0.hotmail.com) (65.54.246.203) by mta419.mail.mud.yahoo.com with SMTP; Sat, 19 May 2007 16:53:51 -0700 Received: from hotmail.com ([65.55.135.106]) by bay0-omc3-s3.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.2668); Sat, 19 May 2007 16:52:42 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Sat, 19 May 2007 16:52:41 -0700 Message-ID: Received: from 65.55.135.123 by by130fd.bay130.hotmail.msn.com with HTTP; Sat, 19 May 2007 23:52:36 GMT From: "prithula dhungel" To: [email protected] Bcc: Subject: Test mail Date: Sat, 19 May 2007 23:52:36 +0000 Mime-Version: 1.0 Content-Type: Text/html; format=flowed Return-Path: [email protected] Figure: A sample mail message header Received: This header field indicates the sequence in which the SMTP servers send and receive the mail message including the respective timestamps. In this example there are 4 “Received:” header lines. This means the mail message passed through 5 different SMTP servers before being delivered to the receiver’s mail box. The last (forth) “Received:” header indicates the mail message flow from the SMTP server of the sender to the second SMTP server in the chain of servers. The sender’s SMTP server is at address 65.55.135.123 and the second SMTP server in the chain is by130fd.bay130.hotmail.msn.com. 
The third “Received:” header indicates the mail message flow from the second SMTP server in the chain to the third server, and so on. Finally, the first “Received:” header indicates the flow of the mail messages from the forth SMTP server to the last SMTP server (i.e. the receiver’s mail server) in the chain. Message-id: The message has been given this number [email protected] (by bay0-omc3-s3.bay0.hotmail.com. Message-id is a unique string assigned by the mail system when the message is first created. From: This indicates the email address of the sender of the mail. In the given example, the sender is “[email protected]” To: This field indicates the email address of the receiver of the mail. In the example, the receiver is “[email protected]” Subject: This gives the subject of the mail (if any specified by the sender). In the example, the subject specified by the sender is “Test mail” Date: The date and time when the mail was sent by the sender. In the example, the sender sent the mail on 19th May 2007, at time 23:52:36 GMT. Mime-version: MIME version used for the mail. In the example, it is 1.0. Content-type: The type of content in the body of the mail message. In the example, it is “text/html”. Return-Path: This specifies the email address to which the mail will be sent if the receiver of this mail wants to reply to the sender. This is also used by the sender’s mail server for bouncing back undeliverable mail messages of mailer-daemon error messages. In the example, the return path is “[email protected]”. With download and delete, after a user retrieves its messages from a POP server, the messages are deleted. This poses a problem for the nomadic user, who may want to access the messages from many different machines (office PC, home PC, etc.). In the download and keep configuration, messages are not deleted after the user retrieves the messages. This can also be inconvenient, as each time the user retrieves the stored messages from a new machine, all of non-deleted messages will be transferred to the new machine (including very old messages). Yes an organization’s mail server and Web server can have the same alias for a host name. The MX record is used to map the mail server’s host name to its IP address. You should be able to see the sender's IP address for a user with an .edu email address. But you will not be able to see the sender's IP address if the user uses a gmail account. It is not necessary that Bob will also provide chunks to Alice. Alice has to be in the top 4 neighbors of Bob for Bob to send out chunks to her; this might not occur even if Alice provides chunks to Bob throughout a 30-second interval. Recall that in BitTorrent, a peer picks a random peer and optimistically unchokes the peer for a short period of time. Therefore, Alice will eventually be optimistically unchoked by one of her neighbors, during which time she will receive chunks from that neighbor. The overlay network in a P2P file sharing system consists of the nodes participating in the file sharing system and the logical links between the nodes. There is a logical link (an “edge” in graph theory terms) from node A to node B if there is a semi-permanent TCP connection between A and B. An overlay network does not include routers. Mesh DHT: The advantage is in order to a route a message to the peer (with ID) that is closest to the key, only one hop is required; the disadvantage is that each peer must track all other peers in the DHT. 
Circular DHT: the advantage is that each peer needs to track only a few other peers; the disadvantage is that O(N) hops are needed to route a message to the peer that is closest to the key. 25. File Distribution Instant Messaging Video Streaming Distributed Computing With the UDP server, there is no welcoming socket, and all data from different clients enters the server through this one socket. With the TCP server, there is a welcoming socket, and each time a client initiates a connection to the server, a new socket is created. Thus, to support n simultaneous connections, the server would need n+1 sockets. For the TCP application, as soon as the client is executed, it attempts to initiate a TCP connection with the server. If the TCP server is not running, then the client will fail to make a connection. For the UDP application, the client does not initiate connections (or attempt to communicate with the UDP server) immediately upon execution Chapter 2 Problems Problem 1 a) F b) T c) F d) F e) F Problem 2 Access control commands: USER, PASS, ACT, CWD, CDUP, SMNT, REIN, QUIT. Transfer parameter commands: PORT, PASV, TYPE STRU, MODE. Service commands: RETR, STOR, STOU, APPE, ALLO, REST, RNFR, RNTO, ABOR, DELE, RMD, MRD, PWD, LIST, NLST, SITE, SYST, STAT, HELP, NOOP. Problem 3 Application layer protocols: DNS and HTTP Transport layer protocols: UDP for DNS; TCP for HTTP Problem 4 The document request was http://gaia.cs.umass.edu/cs453/index.html. The Host : field indicates the server's name and /cs453/index.html indicates the file name. The browser is running HTTP version 1.1, as indicated just before the first pair. The browser is requesting a persistent connection, as indicated by the Connection: keep-alive. This is a trick question. This information is not contained in an HTTP message anywhere. So there is no way to tell this from looking at the exchange of HTTP messages alone. One would need information from the IP datagrams (that carried the TCP segment that carried the HTTP GET request) to answer this question. Mozilla/5.0. The browser type information is needed by the server to send different versions of the same object to different types of browsers. Problem 5 The status code of 200 and the phrase OK indicate that the server was able to locate the document successfully. The reply was provided on Tuesday, 07 Mar 2008 12:39:45 Greenwich Mean Time. The document index.html was last modified on Saturday 10 Dec 2005 18:27:46 GMT. There are 3874 bytes in the document being returned. The first five bytes of the returned document are : <!doc. The server agreed to a persistent connection, as indicated by the Connection: Keep-Alive field Problem 6 Persistent connections are discussed in section 8 of RFC 2616 (the real goal of this question was to get you to retrieve and read an RFC). Sections 8.1.2 and 8.1.2.1 of the RFC indicate that either the client or the server can indicate to the other that it is going to close the persistent connection. It does so by including the connection-token "close" in the Connection-header field of the http request/reply. HTTP does not provide any encryption services. (From RFC 2616) “Clients that use persistent connections should limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.” Yes. (From RFC 2616) “A client might have started to send a new request at the same time that the server has decided to close the "idle" connection. 
From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress."
Problem 7
The total amount of time to get the IP address is RTT1 + RTT2 + ... + RTTn. Once the IP address is known, RTTo elapses to set up the TCP connection and another RTTo elapses to request and receive the small object. The total response time is 2*RTTo + RTT1 + RTT2 + ... + RTTn.
Problem 8
. .
Problem 9
a) The time to transmit an object of size L over a link of rate R is L/R. The average time is the average size of the object divided by R: (850,000 bits)/(15,000,000 bits/sec) = .0567 sec. The traffic intensity on the link is (16 requests/sec)(.0567 sec/request) = 0.907. Thus, the average access delay is (.0567 sec)/(1 - .907) ≈ .6 seconds. The total average response time is therefore .6 sec + 3 sec = 3.6 sec.
b) The traffic intensity on the access link is reduced by 60% since 60% of the requests are satisfied within the institutional network. Thus the average access delay is (.0567 sec)/[1 - (.4)(.907)] = .089 seconds. The response time is approximately zero if the request is satisfied by the cache (which happens with probability .6); the average response time is .089 sec + 3 sec = 3.089 sec for cache misses (which happens 40% of the time). So the average response time is (.6)(0 sec) + (.4)(3.089 sec) = 1.24 seconds. Thus the average response time is reduced from 3.6 sec to 1.24 sec.
Problem 10
Note that each downloaded object can be completely put into one data packet. Let Tp denote the one-way propagation delay between the client and the server.
First consider parallel downloads using non-persistent connections. Parallel downloads would allow 10 connections to share the 150 bits/sec bandwidth, giving each just 15 bits/sec. Thus, the total time needed to receive all objects is given by:
(200/150 + Tp + 200/150 + Tp + 200/150 + Tp + 100,000/150 + Tp) + (200/(150/10) + Tp + 200/(150/10) + Tp + 200/(150/10) + Tp + 100,000/(150/10) + Tp) = 7377 + 8*Tp (seconds)
Now consider a persistent HTTP connection. The total time needed is given by:
(200/150 + Tp + 200/150 + Tp + 200/150 + Tp + 100,000/150 + Tp) + 10*(200/150 + Tp + 100,000/150 + Tp) = 7351 + 24*Tp (seconds)
Assuming the speed of light is 300*10^6 m/sec, then Tp = 10/(300*10^6) = 0.03 microsec. Tp is therefore negligible compared with transmission delay. Thus, we see that persistent HTTP is not significantly faster (less than 1 percent) than the non-persistent case with parallel download.
Problem 11
a) Yes, because Bob has more connections, he can get a larger share of the link bandwidth.
b) Yes, Bob still needs to perform parallel downloads; otherwise he will get less bandwidth than the other four users.
Problem 12
Server.py
from socket import *
serverPort = 12000
serverSocket = socket(AF_INET, SOCK_STREAM)
serverSocket.bind(('', serverPort))
serverSocket.listen(1)
connectionSocket, addr = serverSocket.accept()
while 1:
    sentence = connectionSocket.recv(1024)
    print 'From Server:', sentence, '\n'
serverSocket.close()
Problem 13
The MAIL FROM: in SMTP is a message from the SMTP client that identifies the sender of the mail message to the SMTP server. The From: on the mail message itself is NOT an SMTP message, but rather is just a line in the body of the mail message.
Problem 14
SMTP uses a line containing only a period to mark the end of a message body. HTTP uses the "Content-Length" header field to indicate the length of a message body.
No, HTTP cannot use the method used by SMTP, because HTTP message could be binary data, whereas in SMTP, the message body must be in 7-bit ASCII format. Problem 15 MTA stands for Mail Transfer Agent. A host sends the message to an MTA. The message then follows a sequence of MTAs to reach the receiver’s mail reader. We see that this spam message follows a chain of MTAs. An honest MTA should report where it receives the message. Notice that in this message, “asusus-4b96 ([58.88.21.177])” does not report from where it received the email. Since we assume only the originator is dishonest, so “asusus-4b96 ([58.88.21.177])” must be the originator. Problem 16 UIDL abbreviates “unique-ID listing”. When a POP3 client issues the UIDL command, the server responds with the unique message ID for all of the messages present in the user's mailbox. This command is useful for “download and keep”. By maintaining a file that lists the messages retrieved during earlier sessions, the client can use the UIDL command to determine which messages on the server have already been seen. Problem 17 a) C: dele 1 C: retr 2 S: (blah blah … S: ………..blah) S: . C: dele 2 C: quit S: +OK POP3 server signing off b) C: retr 2 S: blah blah … S: ………..blah S: . C: quit S: +OK POP3 server signing off C: list S: 1 498 S: 2 912 S: . C: retr 1 S: blah ….. S: ….blah S: . C: retr 2 S: blah blah … S: ………..blah S: . C: quit S: +OK POP3 server signing off Problem 18 For a given input of domain name (such as ccn.com), IP address or network administrator name, the whois database can be used to locate the corresponding registrar, whois server, DNS server, and so on. NS4.YAHOO.COM from www.register.com; NS1.MSFT.NET from ww.register.com Local Domain: www.mindspring.com Web servers : www.mindspring.com 207.69.189.21, 207.69.189.22, 207.69.189.23, 207.69.189.24, 207.69.189.25, 207.69.189.26, 207.69.189.27, 207.69.189.28 Mail Servers : mx1.mindspring.com (207.69.189.217) mx2.mindspring.com (207.69.189.218) mx3.mindspring.com (207.69.189.219) mx4.mindspring.com (207.69.189.220) Name Servers: itchy.earthlink.net (207.69.188.196) scratchy.earthlink.net (207.69.188.197) www.yahoo.com Web Servers: www.yahoo.com (216.109.112.135, 66.94.234.13) Mail Servers: a.mx.mail.yahoo.com (209.191.118.103) b.mx.mail.yahoo.com (66.196.97.250) c.mx.mail.yahoo.com (68.142.237.182, 216.39.53.3) d.mx.mail.yahoo.com (216.39.53.2) e.mx.mail.yahoo.com (216.39.53.1) f.mx.mail.yahoo.com (209.191.88.247, 68.142.202.247) g.mx.mail.yahoo.com (209.191.88.239, 206.190.53.191) Name Servers: ns1.yahoo.com (66.218.71.63) ns2.yahoo.com (68.142.255.16) ns3.yahoo.com (217.12.4.104) ns4.yahoo.com (68.142.196.63) ns5.yahoo.com (216.109.116.17) ns8.yahoo.com (202.165.104.22) ns9.yahoo.com (202.160.176.146) www.hotmail.com Web Servers: www.hotmail.com (64.4.33.7, 64.4.32.7) Mail Servers: mx1.hotmail.com (65.54.245.8, 65.54.244.8, 65.54.244.136) mx2.hotmail.com (65.54.244.40, 65.54.244.168, 65.54.245.40) mx3.hotmail.com (65.54.244.72, 65.54.244.200, 65.54.245.72) mx4.hotmail.com (65.54.244.232, 65.54.245.104, 65.54.244.104) Name Servers: ns1.msft.net (207.68.160.190) ns2.msft.net (65.54.240.126) ns3.msft.net (213.199.161.77) ns4.msft.net (207.46.66.126) ns5.msft.net (65.55.238.126) d) The yahoo web server has multiple IP addresses www.yahoo.com (216.109.112.135, 66.94.234.13) e) The address range for Polytechnic University: 128.238.0.0 – 128.238.255.255 f) An attacker can use the whois database and nslookup tool to determine the IP address ranges, DNS server addresses, etc., for the target 
Problem 17

a)
C: dele 1
C: retr 2
S: (blah blah …
S: ………..blah)
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off

b)
C: retr 2
S: blah blah …
S: ………..blah
S: .
C: quit
S: +OK POP3 server signing off

c)
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: blah …..
S: ….blah
S: .
C: retr 2
S: blah blah …
S: ………..blah
S: .
C: quit
S: +OK POP3 server signing off

Problem 18

a) For a given input of a domain name (such as ccn.com), an IP address, or a network administrator name, the whois database can be used to locate the corresponding registrar, whois server, DNS server, and so on.

b) NS4.YAHOO.COM from www.register.com; NS1.MSFT.NET from www.register.com

c) Local domain: www.mindspring.com
Web servers: www.mindspring.com
207.69.189.21, 207.69.189.22, 207.69.189.23, 207.69.189.24, 207.69.189.25, 207.69.189.26, 207.69.189.27, 207.69.189.28
Mail servers:
mx1.mindspring.com (207.69.189.217)
mx2.mindspring.com (207.69.189.218)
mx3.mindspring.com (207.69.189.219)
mx4.mindspring.com (207.69.189.220)
Name servers:
itchy.earthlink.net (207.69.188.196)
scratchy.earthlink.net (207.69.188.197)

www.yahoo.com
Web servers: www.yahoo.com (216.109.112.135, 66.94.234.13)
Mail servers:
a.mx.mail.yahoo.com (209.191.118.103)
b.mx.mail.yahoo.com (66.196.97.250)
c.mx.mail.yahoo.com (68.142.237.182, 216.39.53.3)
d.mx.mail.yahoo.com (216.39.53.2)
e.mx.mail.yahoo.com (216.39.53.1)
f.mx.mail.yahoo.com (209.191.88.247, 68.142.202.247)
g.mx.mail.yahoo.com (209.191.88.239, 206.190.53.191)
Name servers:
ns1.yahoo.com (66.218.71.63)
ns2.yahoo.com (68.142.255.16)
ns3.yahoo.com (217.12.4.104)
ns4.yahoo.com (68.142.196.63)
ns5.yahoo.com (216.109.116.17)
ns8.yahoo.com (202.165.104.22)
ns9.yahoo.com (202.160.176.146)

www.hotmail.com
Web servers: www.hotmail.com (64.4.33.7, 64.4.32.7)
Mail servers:
mx1.hotmail.com (65.54.245.8, 65.54.244.8, 65.54.244.136)
mx2.hotmail.com (65.54.244.40, 65.54.244.168, 65.54.245.40)
mx3.hotmail.com (65.54.244.72, 65.54.244.200, 65.54.245.72)
mx4.hotmail.com (65.54.244.232, 65.54.245.104, 65.54.244.104)
Name servers:
ns1.msft.net (207.68.160.190)
ns2.msft.net (65.54.240.126)
ns3.msft.net (213.199.161.77)
ns4.msft.net (207.46.66.126)
ns5.msft.net (65.55.238.126)

d) The Yahoo web server has multiple IP addresses: www.yahoo.com (216.109.112.135, 66.94.234.13)

e) The address range for Polytechnic University: 128.238.0.0 – 128.238.255.255

f) An attacker can use the whois database and the nslookup tool to determine the IP address ranges, DNS server addresses, etc., for the target institution. By analyzing the source address of attack packets, the victim can use whois to obtain information about the domain from which the attack is coming and possibly inform the administrators of the origin domain.

Problem 19

The following delegation chain is used for gaia.cs.umass.edu:

a.root-servers.net
E.GTLD-SERVERS.NET
ns1.umass.edu (authoritative)

First command:
dig +norecurse @a.root-servers.net any gaia.cs.umass.edu

;; AUTHORITY SECTION:
edu. 172800 IN NS E.GTLD-SERVERS.NET.
edu. 172800 IN NS A.GTLD-SERVERS.NET.
edu. 172800 IN NS G3.NSTLD.COM.
edu. 172800 IN NS D.GTLD-SERVERS.NET.
edu. 172800 IN NS H3.NSTLD.COM.
edu. 172800 IN NS L3.NSTLD.COM.
edu. 172800 IN NS M3.NSTLD.COM.
edu. 172800 IN NS C.GTLD-SERVERS.NET.

Among all of the returned edu DNS servers, we send a query to the first one:
dig +norecurse @E.GTLD-SERVERS.NET any gaia.cs.umass.edu

umass.edu. 172800 IN NS ns1.umass.edu.
umass.edu. 172800 IN NS ns2.umass.edu.
umass.edu. 172800 IN NS ns3.umass.edu.

Among the three returned authoritative DNS servers, we send a query to the first one:
dig +norecurse @ns1.umass.edu any gaia.cs.umass.edu

gaia.cs.umass.edu. 21600 IN A 128.119.245.12

The answer for google.com could be:
a.root-servers.net
E.GTLD-SERVERS.NET
ns1.google.com (authoritative)

Problem 20

We can periodically take a snapshot of the DNS caches in the local DNS servers. The Web server that appears most frequently in the DNS caches is the most popular server. This is because if more users are interested in a Web server, then DNS requests for that server are sent more frequently by users, so that Web server will appear in the DNS caches more frequently.

For a complete measurement study, see: Craig E. Wills, Mikhail Mikhailov, Hao Shang, "Inferring Relative Popularity of Internet Applications by Actively Querying DNS Caches", IMC'03, October 27-29, 2003, Miami Beach, Florida, USA.

Problem 21

Yes, we can use dig to query that Web site from the local DNS server. For example, "dig cnn.com" will return the query time for finding cnn.com. If cnn.com was accessed just a couple of seconds ago, an entry for cnn.com is cached in the local DNS cache, so the query time is 0 msec. Otherwise, the query time is large.

Problem 22

For calculating the minimum distribution time for client-server distribution, we use the following formula:

Dcs = max {NF/us, F/dmin}

Similarly, for calculating the minimum distribution time for P2P distribution, we use the following formula:

DP2P = max {F/us, F/dmin, NF/(us + u1 + u2 + … + uN)}

where
F = 15 Gbits = 15 * 1024 Mbits
us = 30 Mbps
dmin = di = 2 Mbps

Note that 300 Kbps = 300/1024 Mbps.

Client-server distribution time (seconds):

                 N = 10    N = 100    N = 1000
u = 300 Kbps      7680      51200      512000
u = 700 Kbps      7680      51200      512000
u = 2 Mbps        7680      51200      512000

Peer-to-peer distribution time (seconds):

                 N = 10    N = 100    N = 1000
u = 300 Kbps      7680      25904       47559
u = 700 Kbps      7680      15616       21525
u = 2 Mbps        7680       7680        7680
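As a quick sanity check on the table entries (added here, not part of the original solution), the two formulas above can be evaluated in a few lines of Python, with all rates in Mbps and results rounded to whole seconds:

# Reproduce the Problem 22 tables: minimum distribution times in seconds.
F = 15 * 1024          # file size in Mbits
us = 30                # server upload rate in Mbps
dmin = 2               # minimum client download rate in Mbps

def d_client_server(N, u):
    # The client upload rate u does not matter in the client-server case.
    return max(N * F / us, F / dmin)

def d_p2p(N, u):
    return max(F / us, F / dmin, N * F / (us + N * u))

for label, u in [('300 Kbps', 300 / 1024), ('700 Kbps', 700 / 1024), ('2 Mbps', 2)]:
    cs  = [round(d_client_server(N, u)) for N in (10, 100, 1000)]
    p2p = [round(d_p2p(N, u))           for N in (10, 100, 1000)]
    print(label, 'client-server:', cs, 'P2P:', p2p)

Running this reproduces every row of both tables above.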
Problem 23

a) Consider a distribution scheme in which the server sends the file to each client, in parallel, at a rate of us/N. Note that this rate is less than each client's download rate, since by assumption us/N ≤ dmin. Thus each client can also receive at rate us/N. Since each client receives at rate us/N, the time for each client to receive the entire file is F/(us/N) = NF/us. Since all the clients receive the file in NF/us, the overall distribution time is also NF/us.

b) Consider a distribution scheme in which the server sends the file to each client, in parallel, at a rate of dmin. Note that the aggregate rate, N dmin, is less than the server's link rate us, since by assumption us/N ≥ dmin. Since each client receives at rate dmin, the time for each client to receive the entire file is F/dmin. Since all the clients receive the file in this time, the overall distribution time is also F/dmin.

c) From Section 2.6 we know that

DCS ≥ max {NF/us, F/dmin}   (Equation 1)

Suppose that us/N ≤ dmin. Then from Equation 1 we have DCS ≥ NF/us. But from (a) we have DCS ≤ NF/us. Combining these two gives:

DCS = NF/us when us/N ≤ dmin   (Equation 2)

We can similarly show that:

DCS = F/dmin when us/N ≥ dmin   (Equation 3)

Combining Equation 2 and Equation 3 gives the desired result.

Problem 24

a) Define u = u1 + u2 + … + uN. By assumption,

us ≤ (us + u)/N   (Equation 1)

Divide the file into N parts, with the ith part having size (ui/u)F. The server transmits the ith part to peer i at rate ri = (ui/u)us. Note that r1 + r2 + … + rN = us, so the aggregate server rate does not exceed the link rate of the server. Also have each peer i forward the bits it receives to each of the N-1 peers at rate ri. The aggregate forwarding rate of peer i is

(N-1)ri = (N-1)(us ui)/u ≤ ui

where the inequality follows from Equation 1; thus each peer's forwarding rate does not exceed its link rate. In this distribution scheme, peer i receives bits at an aggregate rate of r1 + r2 + … + rN = us, so each peer receives the file in F/us.

b) Again define u = u1 + u2 + … + uN. By assumption,

us ≥ (us + u)/N   (Equation 2)

Let ri = ui/(N-1) for i = 1, …, N, and rN+1 = (us - u/(N-1))/N. In this distribution scheme, the file is broken into N+1 parts. The server sends bits from the ith part to the ith peer (i = 1, …, N) at rate ri. Each peer i forwards the bits arriving at rate ri to each of the other N-1 peers. Additionally, the server sends bits from the (N+1)st part at rate rN+1 to each of the N peers. The peers do not forward the bits from the (N+1)st part.

The aggregate send rate of the server is

r1 + … + rN + N rN+1 = u/(N-1) + us - u/(N-1) = us

Thus, the server's send rate does not exceed its link rate. The aggregate send rate of peer i is (N-1)ri = ui, so each peer's send rate does not exceed its link rate. In this distribution scheme, peer i receives bits at an aggregate rate of

u/(N-1) + (us - u/(N-1))/N = (us + u)/N

Thus each peer receives the file in NF/(us + u). (For simplicity, we neglected to specify the size of the file parts for i = 1, …, N+1. We now provide that here. Let Δ = NF/(us + u) be the distribution time. For i = 1, …, N, the ith file part is Fi = ri Δ bits. The (N+1)st file part is FN+1 = rN+1 Δ bits. It is straightforward to show that F1 + … + FN+1 = F.)

c) The solution to this part is similar to that of Problem 23 (c). We know from Section 2.6 that

DP2P ≥ max {F/us, NF/(us + u)}

Combining this with (a) and (b) gives the desired result.

Problem 25

There are N nodes in the overlay network. There are N(N-1)/2 edges.

Problem 26

Yes. His first claim is possible, as long as there are enough peers staying in the swarm for a long enough time. Bob can always receive data through optimistic unchoking by other peers.

His second claim is also true. He can run a client on each host, let each client "free-ride," and combine the collected chunks from the different hosts into a single file. He can even write a small scheduling program to make the different hosts ask for different chunks of the file. This is actually a kind of Sybil attack in P2P networks.

Problem 27

Peer 3 learns that peer 5 has just left the system, so peer 3 asks its first successor (peer 4) for the identifier of its immediate successor (peer 8). Peer 3 will then make peer 8 its second successor.

Problem 28

Peer 6 would first send peer 15 a message, saying "what will be peer 6's predecessor and successor?" This message gets forwarded through the DHT until it reaches peer 5, who realizes that it will be 6's predecessor and that its current successor, peer 8, will become 6's successor. Next, peer 5 sends this predecessor and successor information back to 6. Peer 6 can now join the DHT by making peer 8 its successor and by notifying peer 5 that it should change its immediate successor to 6.

Problem 29

For each key, we first calculate the distances (using d(k,p)) between the key and all peers, and then store the key in the peer that is closest to the key (that is, the peer with the smallest distance value).

Problem 30

Yes. Randomly assigning keys to peers does not consider the underlying network at all, so it is very likely to cause mismatches. Such mismatches may degrade the search performance. For example, consider a logical path p1 (consisting of only two logical links): A-B-C, where A and B are neighboring peers, and B and C are neighboring peers. Suppose that there is another logical path p2 from A to C (consisting of three logical links): A-D-E-C. It might be the case that A and B are very far apart physically (and separated by many routers), and that B and C are also very far apart physically, while A, D, E, and C are all physically close (separated by few routers). In other words, a shorter logical path may correspond to a much longer physical path.

Problem 31

If you run TCPClient first, the client will attempt to make a TCP connection with a non-existent server process; a TCP connection will not be made.

UDPClient doesn't establish a TCP connection with the server. Thus, everything should work fine if you first run UDPClient, then run UDPServer, and then type some input at the keyboard.

If you use different port numbers, the client will attempt to establish a TCP connection with the wrong process or a non-existent process; errors will occur.

Problem 32

In the original program, UDPClient does not specify a port number when it creates the socket. In this case, the code lets the underlying operating system choose a port number. With the additional line, when UDPClient is executed, a UDP socket is created with port number 5432.

UDPServer needs to know the client port number so that it can send packets back to the correct client socket. Glancing at UDPServer, we see that the client port number is not "hard-wired" into the server code; instead, UDPServer determines the client port number by unraveling the datagram it receives from the client. Thus the UDP server will work with any client port number, including 5432, and therefore does not need to be modified.

Before: client socket = x (chosen by the OS); server socket = 9876.
After: client socket = 5432.
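As a rough illustration (assuming a Python client along the lines of the textbook's UDPClient; the server name is a placeholder, and port 9876 follows the answer above), the added bind() call is the only client-side change:

from socket import socket, AF_INET, SOCK_DGRAM

SERVER_NAME = 'servername.example.com'   # placeholder
SERVER_PORT = 9876                       # the answer above assumes the server socket is 9876

clientSocket = socket(AF_INET, SOCK_DGRAM)
clientSocket.bind(('', 5432))            # the added line: fix the client's source port to 5432

message = input('Input lowercase sentence: ')
clientSocket.sendto(message.encode(), (SERVER_NAME, SERVER_PORT))

# The server reads the source port (5432) from the arriving datagram,
# so it replies to the correct socket without any server-side changes.
modifiedMessage, serverAddress = clientSocket.recvfrom(2048)
print(modifiedMessage.decode())
clientSocket.close()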
Problem 33

Yes, you can configure many browsers to open multiple simultaneous connections to a Web site. The advantage is that you will potentially download the file faster. The disadvantage is that you may be hogging the bandwidth, thereby significantly slowing down the downloads of other users who are sharing the same physical links.

Problem 34

For an application such as remote login (telnet and ssh), a byte-stream-oriented protocol is very natural, since there is no notion of message boundaries in the application. When a user types a character, we simply drop the character into the TCP connection.

In other applications, we may be sending a series of messages that have inherent boundaries between them; for example, one SMTP mail server may send another SMTP mail server several email messages back to back. Since TCP does not have a mechanism to indicate the boundaries, the application must add the indications itself, so that the receiving side of the application can distinguish one message from the next. If each message were instead put into a distinct UDP segment, the receiving end would be able to distinguish the various messages without any indications added by the sending side of the application.
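One common way for an application to "add the indications itself" over TCP is length-prefix framing. The sketch below is an illustration added here (not part of the original solution): the sender prepends a 4-byte length to every message, and the receiver uses it to recover the message boundaries from the byte stream.

import struct

def send_message(sock, payload: bytes) -> None:
    # Prepend a 4-byte big-endian length so the receiver knows where the message ends.
    sock.sendall(struct.pack('!I', len(payload)) + payload)

def recv_exactly(sock, n: int) -> bytes:
    # TCP is a byte stream: keep reading until exactly n bytes have arrived.
    data = b''
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError('connection closed mid-message')
        data += chunk
    return data

def recv_message(sock) -> bytes:
    (length,) = struct.unpack('!I', recv_exactly(sock, 4))
    return recv_exactly(sock, length)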
Problem 35

To create a web server, we need to run web server software on a host. Many vendors sell web server software. However, the most popular web server software today is Apache, which is open source and free. Over the years it has been highly optimized by the open-source community.

Problem 36

The key is the infohash; the value is an IP address that currently has the file designated by the infohash.

Chapter 3 Review Questions

Call this protocol Simple Transport Protocol (STP). At the sender side, STP accepts from the sending process a chunk of data not exceeding 1196 bytes, a destination host address, and a destination port number. STP adds a four-byte header to each chunk and puts the port number of the destination process in this header. STP then gives the destination host address and the resulting segment to the network layer. The network layer delivers the segment to STP at the destination host. STP then examines the port number in the segment, extracts the data from the segment, and passes the data to the process identified by the port number.

The segment now has two header fields: a source port field and a destination port field. At the sender side, STP accepts a chunk of data not exceeding 1192 bytes, a destination host address, a source port number, and a destination port number. STP creates a segment which contains the application data, the source port number, and the destination port number. It then gives the segment and the destination host address to the network layer. After receiving the segment, STP at the receiving host gives the application process the application data and the source port number.
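To make the header handling concrete, here is a small sketch added for illustration (it is not the textbook's code). It reads the answer above as using a four-byte field per port, i.e. an eight-byte header and at most 1192 bytes of application data per segment; those sizes are our interpretation.

import struct

HEADER = struct.Struct('!II')    # two 4-byte fields: source port, destination port
MAX_DATA = 1192                  # 1192 bytes of data + 8-byte header = 1200-byte segment

def stp_encapsulate(data: bytes, src_port: int, dst_port: int) -> bytes:
    # Sender side: prepend the two port fields to at most 1192 bytes of data.
    assert len(data) <= MAX_DATA
    return HEADER.pack(src_port, dst_port) + data

def stp_demultiplex(segment: bytes):
    # Receiver side: read the ports back, hand the data to the process bound
    # to dst_port, and report src_port so that process can reply.
    src_port, dst_port = HEADER.unpack_from(segment)
    return src_port, dst_port, segment[HEADER.size:]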
No, the transport layer does not have to do anything in the core; the transport layer "lives" in the end systems.

For sending a letter, the family member is required to give the delegate the letter itself, the address of the destination house, and the name of the recipient. The delegate clearly writes the recipient's name on the top of the letter. The delegate then puts the letter in an envelope and writes the address of the destination house on the envelope. The delegate then gives the letter to the planet's mail service. At the receiving side, the delegate receives the letter from the mail service, takes the letter out of the envelope, and takes note of the recipient name written at the top of the letter. The delegate then gives the letter to the family member with this name.

No, the mail service does not have to open the envelope; it only examines the address on the envelope.

Source port number y and destination port number x.

An application developer may not want its application to use TCP's congestion control, which can throttle the application's sending rate at times of congestion. Often, designers of IP telephony and IP videoconference applications choose to run their applications over UDP because they want to avoid TCP's congestion control. Also, some applications do not need the reliable data transfer provided by TCP.

Since most firewalls are configured to block UDP traffic, using TCP for video and voice traffic lets the traffic through the firewalls.

Yes. The application developer can put reliable data transfer into the application layer protocol. This would require a significant amount of work and debugging, however.

Yes, both segments will be directed to the same socket. For each received segment, at the socket interface, the operating system will provide the process with the IP addresses, so that the process can determine the origins of the individual segments.

For each persistent connection, the Web server creates a separate "connection socket". Each connection socket is identified with a four-tuple: (source IP address, source port number, destination IP address, destination port number). When host C receives an IP datagram, it examines these four fields in the datagram/segment to determine to which socket it should pass the payload of the TCP segment. Thus, the requests from A and B pass through different sockets. The identifier for both of these sockets has 80 for the destination port; however, the identifiers for these sockets have different values for the source IP addresses. Unlike UDP, when the transport layer passes a TCP segment's payload to the application process, it does not specify the source IP address, as this is implicitly specified by the socket identifier.

Sequence numbers are required for a receiver to find out whether an arriving packet contains new data or is a retransmission.

To handle losses in the channel. If the ACK for a transmitted packet is not received within the duration of the timer for the packet, the packet (or its ACK or NACK) is assumed to have been lost. Hence, the packet is retransmitted.

A timer would still be necessary in the protocol rdt 3.0. If the round-trip time is known, the only advantage is that the sender knows for sure that either the packet or the ACK (or NACK) for the packet has been lost, whereas in the real scenario the ACK (or NACK) might still be on its way to the sender when the timer expires. However, to detect the loss, a timer of constant duration is still necessary at the sender for each packet.

The packet loss caused a timeout, after which all five packets were retransmitted. Loss of an ACK did not trigger any retransmission, as Go-Back-N uses cumulative acknowledgements. The sender was unable to send the sixth packet because the send window size is fixed to 5.

When the packet was lost, the four received packets were buffered at the receiver. After the timeout, the sender retransmitted the lost packet and the receiver delivered the buffered packets to the application in the correct order. A duplicate ACK was sent by the receiver for the lost ACK. The sender was unable to send the sixth packet because the send window size is fixed to 5.
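The Go-Back-N behaviour described in the last two answers (cumulative ACKs, retransmission of the whole outstanding window on timeout, a send window fixed at 5) can be summarised in a short bookkeeping sketch. This is an illustration added here, not the textbook's code; transmit() is a placeholder for the real send operation.

WINDOW = 5

def transmit(seq, packet):
    # Placeholder for actually sending the packet into the channel.
    print('sending packet', seq)

class GBNSender:
    def __init__(self):
        self.base = 1          # oldest unacknowledged sequence number
        self.nextseq = 1       # next sequence number to use
        self.buffer = {}       # seq -> packet, kept until acknowledged

    def send(self, packet):
        if self.nextseq < self.base + WINDOW:
            self.buffer[self.nextseq] = packet
            transmit(self.nextseq, packet)
            self.nextseq += 1
        # else: window full, so a sixth outstanding packet must wait (as in the trace above)

    def on_ack(self, acked_seq):
        # Cumulative ACK: everything up to and including acked_seq is acknowledged,
        # so a lost ACK is covered by a later one and triggers no retransmission.
        for seq in range(self.base, acked_seq + 1):
            self.buffer.pop(seq, None)
        self.base = max(self.base, acked_seq + 1)

    def on_timeout(self):
        # Retransmit every packet that is still unacknowledged (up to 5 of them).
        for seq in range(self.base, self.nextseq):
            transmit(seq, self.buffer[seq])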
A Novel RFID Authentication Protocol with Ownership Transfer

Han Jia 1, Jun Wen 2
School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
[email protected], 2wenjun@uestc.edu.cn

Abstract. RFID technology is widely used in many fields; however, it raises many security and privacy issues. This paper presents an RFID security protocol intended to raise the security level. The protocol proceeds in the following steps: it first builds a secure communication channel, then performs authentication between tags and the corresponding reader, and finally handles the ownership-transfer issue. It involves minimal interaction between tags and the corresponding reader, which efficiently lowers the computational burden on the tag. Its security is verified with BAN logic.

Keywords: Radio Frequency Identification; authentication; BAN; security and privacy

1 Introduction

RFID is one of the most rapidly developing technologies of recent years. It is widely used in many fields, such as retail, libraries, car tracking, product identification, and passports, and it will play an important role in the future. An RFID system consists of tags, readers, and a database server.

During its lifetime, a tag may change owners on many occasions, for example when a manufacturer delivers it to a retailer. Therefore, seamless ownership transfer of tags is required in an RFID system. The tag may come under attack when both the previous owner and the new owner can access its information during the ownership-transfer process; after the transfer, the previous owner must no longer be able to access the tag's information. Typically, a tag has 5K-10K logic gates and can store only a few hundred bits. Within this limit, only about 300 to 3000 gates can be devoted to security functions. Existing RFID systems lack such a secure transfer mechanism because of the limited computation capability and storage of tags.

An RFID system may suffer from the following security threats:

Replay attack: An attacker retransmits information it has captured and spoofs a legitimate tag. This attack may leak the tag's information.
Impersonation: An attacker forges a tag or a reader as an authenticated one to steal information from the database server.
Eavesdropping: It is easy for eavesdroppers to capture the signal in the open wireless environment, which may leak business information.
DoS attack: An attacker transmits messages to interrupt the communication among tags, readers, and database servers.
De-synchronization attack: A difference between the key stored in the tag and the one in the database means that an authenticated tag can no longer be recognized.
Windowing problem: During the ownership-transfer process, both the old and the new owner possess the information needed to authenticate the tag.

This paper proposes an RFID protocol that can resist the above attacks. It uses random numbers to make sure that every round of access is fresh. The messages transmitted over the channel are ciphertext, which prevents leakage of the tag's information. If de-synchronization does occur, the protocol provides a mechanism to recover from it. The major contribution of this paper is a novel secure and private RFID scheme with group ownership transfer. The protocol involves minimal interaction between the reader and the tags, and provides not only security but also efficiency.

This paper is organized as follows. Section 2 describes related work on RFID. Section 3 presents a new mutual authentication method. Section 4 verifies the protocol's security with BAN logic. Section 5 draws a conclusion.
2 Related Work

Previous papers have studied RFID security. [1] pointed out that there are several practical scenarios for group transfer; in addition, group transfer can substantially expand the applications of RFID systems. [2] is one of the earlier ownership-transfer protocols; unfortunately, there is a flaw in the solution that allows killing of the tag. [3] improved the protocol of [2] by adding some data to the last message from the database to the reader; however, this can lead to de-synchronization, and it cannot resist DoS attacks. [4] proposed a protocol that achieves group ownership transfer with the help of the database server, which plays the role of a trusted third party; however, it may leak the owner's privacy. [5] adopted a dynamic ID to avoid replay attacks, but it may suffer from DoS attacks. [6] adopted one-time secrets to prevent attacks: the secrets shared between tags and servers are changed once ownership transfer occurs, which may lead to de-synchronization even if a trusted third party is used. [7] achieved ownership transfer without a TTP; this protocol is vulnerable to eavesdropping attacks by the previous owner, and it cannot resist DoS attacks.

The above solutions can be divided into two kinds according to the involvement of a Trusted Third Party (TTP). Those not involving a TTP require many rounds to complete authentication. Those involving a TTP rely on the security and robustness of the server, which requires the TTP to be online at all times.

3 A Novel RFID Security Protocol

An RFID security method that achieves all of the requirements, based on XOR and a public key infrastructure, is proposed. The notation used throughout this paper is listed in Table 1.

Table 1. Notation

Ek()       Encryption function (under key k); it may be XOR, symmetric encryption, or asymmetric encryption, depending on the computational capability of the entity.
D()        Decryption function.
K, K1      Keys for encryption. Different group owners have different keys, so the key identifies the owner of the tag. In this paper, K1 represents the new owner.
ID         The unique identifier of the tag.
Rt         The random number generated by the tag.
Rr         The random number generated by the reader.
Rs         The random number generated by the database server.
⊕          XOR operation.
Info(ID)   The specific information of the tag that has this ID.

Assume a public key infrastructure has been constructed between the reader and the server. The protocol proceeds as follows.

Establishing a secure communication channel:

1. The reader generates a random number Rr and transmits a ClientHello (containing Rr), encrypted with the server's public key, to the database server.
2. The server generates a random number Rs after receiving Rr from the reader. A ServerHello (containing Rr and Rs) is encrypted with the reader's public key and sent to the reader.
3. The reader checks whether the random number received from the server equals Rr. If so, it stores Rs and sends ClientHelloDone to the server; otherwise it discards the message and returns to step 1.
4. The server sends ServerHelloDone to the reader when it receives ClientHelloDone. At this point a secure communication channel has been established.

Mutual authentication:

5. The reader sends a request (containing Rr) to the tag.
6. The tag stores Rr and generates a random number Rt. The tag sends Ek(Rt) and Ek(ID) to the reader.
7. The reader encrypts Ek(ID)⊕Rs, Ek(Rt), and Rs with the server's public key and sends them to the server.
8. The server checks whether the random number received from the reader equals Rs. If not, the protocol is terminated; otherwise the server recovers Ek(ID) by computing Ek(ID)⊕Rs⊕Rs, and obtains the real ID with the decryption function Dk(Ek(ID)). The server searches for the ID in the database. If it is found, the authentication process is successful and the server sends Info(ID), encrypted with the reader's public key, to the reader. If it is not found, the server decrypts Ek(ID) using the last successfully authenticated key; if the ID is then found, the authentication process is successful, the server sends Info(ID), encrypted with the reader's public key, to the reader, and the protocol goes to step 9. Otherwise the protocol is terminated.

Figure 1. Authentication Process

Ownership transfer:

9. The server obtains the public key K1 of the new group owner. The server sends Ek(ID⊕Rt), Rt⊕K1, Ek(Rt), and Rr, encrypted with the reader's public key, to the reader. It then updates the owner's public key to K1 and stores K as the last successful authentication key.
10. The reader checks whether Rr equals the random number it stored. If they are equal, it goes to the next step; otherwise it terminates the protocol.
11. The reader sends Ek(ID⊕Rt), Ek(Rt)⊕Rr, and Rt⊕K1 to the tag.
12. The tag recovers Rr by computing Ek(Rt)⊕Rr⊕Ek(Rt) and checks whether it equals the random number the tag stored. If not, it terminates the protocol; otherwise the reader is authenticated.
13. The tag checks whether Ek(ID⊕Rt) matches what it has stored. If so, it goes to the next step; otherwise it terminates the protocol.
14. The tag recovers the public key K1 of the new group owner by computing Rt⊕K1⊕Rt, and then replaces K with K1. The group ownership transfer is finished.

Figure 2. Ownership Transfer Process

Figure 1 shows the process of mutual authentication, and Figure 2 shows the process of group ownership transfer. This protocol helps resist the attacks mentioned above. The protocol can also adapt to the computational capability of the tag: if the tag has strong computing capability, it can use a hash function or the PKI; if the tag has weak computing capability, it can use XOR instead of Ek(), provided the key is long enough.
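As a rough illustration of steps 11-14, the sketch below models the tag-side checks for the low-cost variant in which Ek() is instantiated as XOR with the key. The values are toy 128-bit integers and the helper names are ours, not the paper's.

import secrets

def xor_enc(value: int, key: int) -> int:
    # Low-cost instantiation of Ek(): simple XOR with the key.
    return value ^ key

# Toy state shared before step 11 (all 128-bit integers here).
K  = secrets.randbits(128)        # current group owner key, known to tag and server
K1 = secrets.randbits(128)        # new group owner key
ID = secrets.randbits(128)        # tag identifier
Rt = secrets.randbits(128)        # nonce generated by the tag in step 6
Rr = secrets.randbits(128)        # nonce generated by the reader, stored by the tag

# Step 11: the reader sends these three values to the tag.
m1 = xor_enc(ID ^ Rt, K)          # Ek(ID ⊕ Rt)
m2 = xor_enc(Rt, K) ^ Rr          # Ek(Rt) ⊕ Rr
m3 = Rt ^ K1                      # Rt ⊕ K1

# Steps 12-14: tag-side processing.
recovered_Rr = m2 ^ xor_enc(Rt, K)      # Ek(Rt) ⊕ Rr ⊕ Ek(Rt) = Rr
assert recovered_Rr == Rr               # step 12: the reader is authenticated
assert m1 == xor_enc(ID ^ Rt, K)        # step 13: check Ek(ID ⊕ Rt)
new_key = m3 ^ Rt                       # step 14: Rt ⊕ K1 ⊕ Rt = K1
K = new_key                             # the tag replaces K with K1
assert K == K1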
4 Analyzing the Protocol with BAN Logic

Whether this protocol achieves the expected security goals can be proved with formal methods. BAN logic is a well-known authentication logic: protocol security can be verified with BAN logic to decide whether a protocol reaches its expected targets, and flaws can thus be found. The syntax and semantics of BAN logic are as follows [8]:

P|≡X : P trusts that the message X is true; P believes X.
P ◁ X : P received a message containing X; P sees X.
P|~X : P has transmitted a message containing X; P said X.
P|⇒X : P controls X.
#(X) : X is fresh; X has not been transmitted in any message before.
P ←K→ Q : P and Q communicate with the shared key K; no one has discovered K except P, Q, or a third party trusted by P or Q.
{X}K : X is encrypted under K.

Rules of BAN logic:

Message-meaning rule
Rule 1 : P|≡ P ←K→ Q , P ◁ {X}K ┣ P|≡ Q|~X

Nonce-verification rule
Rule 2 : P|≡ #(X) , P|≡ Q|~X ┣ P|≡ Q|≡X

Jurisdiction rule
Rule 3 : P|≡ Q|⇒X , P|≡ Q|≡X ┣ P|≡X

Seeing rules
Rule 4 : P ◁ (X, Y) ┣ P ◁ X
Rule 5 : P ◁ <X>K ┣ P ◁ X
Rule 6 : P|≡ P ←K→ Q , P ◁ {X}K ┣ P ◁ X

Freshness rule
Rule 7 : P|≡ #(X) ┣ P|≡ #(X, Y)

Belief rules
Rule 8 : P|≡X , P|≡Y ┣ P|≡(X, Y)
Rule 9 : P|≡(X, Y) ┣ P|≡X
Rule 10 : P|≡ Q|≡(X, Y) ┣ P|≡ Q|≡X
Rule 11 : P|≡ Q|~(X, Y) ┣ P|≡ Q|~X

Assume A represents the tag, B the reader, and S the database server. KBS represents the key shared between B and S. The initial assumptions are as follows:

B|≡ B ←KBS→ S   (1)
S|≡ B ←KBS→ S   (2)
A|≡ A ←K→ S   (3)
S|≡ A ←K→ S   (4)
A|≡ S|⇒ (A ←→ S)   (5)
A|≡ #(Rt)   (6)
B|≡ #(Info(ID))   (7)
B|≡ S|⇒ Info(ID)   (8)
A|≡ S|⇒ K1   (9)

The idealization of the RFID protocol is listed as follows:

B→S :
S→B :
B→A : Rr
A→B :
B→S :
S→B :
S→B :
B→A : , Rr

According to BAN logic, the interpretation of the RFID protocol is as follows:

S   (10)
B   (11)
A   (12)
B   (13)
S   (14)
B   (15)
B   (16)
A , Rr   (17)

The goals expected to be achieved are: B|≡ Info(ID) and A|≡ S ←K1→ A.

Under Rule 1, formula (15), and assumption (1), we obtain:
B|≡ S|~ Info(ID)   (18)
Under Rule 2, formula (18), and assumption (7), we obtain:
B|≡ S|≡ Info(ID)   (19)
Under Rule 3, formula (19), and assumption (8), we obtain:
B|≡ Info(ID)
So the goal B|≡ Info(ID) has been proved.

Under the message-meaning rule, formula (17), and assumption (3), we obtain:
A|≡ S|~   (20)
Under Rule 5 and assumption (6), we obtain:
A|≡ #   (21)
Under Rule 2, formulas (20) and (21), we obtain:
A|≡ S|≡   (22)
Under Rule 9 and formula (22), we obtain:
A|≡ S|≡   (23)
Under the jurisdiction rule, formula (23), and assumption (9), we obtain:
A|≡ S ←K1→ A

From the initial assumptions, the goals are deduced by applying the logic rules. By analyzing the RFID protocol with BAN logic, we verify that the protocol is secure and flawless with respect to these goals.

The security of the protocol against the issues listed above is analyzed as follows. This paper adopts random numbers to make sure that every round of communication is fresh; the random numbers are generated in every round to prevent replay attacks. The messages transmitted over the channel are ciphertext, so an attacker cannot figure out the original message. If an attacker forges a tag to take part in the information exchange, the database server can detect that the identifier does not exist in the database and will deny its further operations. When de-synchronization happens, the database server can identify the tag by using the old owner's secret key; the database server then sends the new secret key once again to recover. This protocol transfers ownership in a one-step operation: if the operation succeeds, the old owner cannot access the tag because it does not know the new secret key; otherwise, the tag cannot identify the new owner and will deny its access. In this way, the protocol also resists the windowing problem.

5 Conclusions

This paper proposes an RFID protocol which can be implemented on either high-cost or low-cost tags. The protocol's security has been proved with BAN logic. Three random numbers are involved in the protocol; reducing the number of random numbers and simplifying the operations is left for future work.
References

1. A. Juels: Yoking-Proofs for RFID Tags. Second IEEE Annual Conference on Pervasive Computing and Communications Workshops, Washington DC, USA, 2004, pp. 138-142.
2. K. Osaka, T. Takagi, K. Yamazaki, O. Takahashi: An Efficient and Secure RFID Method with Ownership Transfer. Computational Intelligence and Security, vol. 2, 2006, pp. 1090-1095.
3. P. Jappinen, H. Hamalainen: Enhanced RFID Security Method with Ownership Transfer. Proceedings of the International Conference on Computational Intelligence and Security, 2008, pp. 382-385.
4. H. Lei, T. Cao: RFID Protocol Enabling Ownership Transfer to Protect against Traceability and DoS Attacks. The First International Symposium on Data, Privacy and E-Commerce (ISDPE 2007), 1-3 Nov. 2007, pp. 508-510.
5. S. Tripathy, S. Nandi: Robust Mutual Authentication for Low-Cost RFID Systems. 2006 IEEE International Conference on Industrial Informatics, Aug. 2006, pp. 949-954.
6. L. Kulseng: Lightweight Mutual Authentication, Owner Transfer, and Secure Search Protocols for RFID Systems. Master of Science thesis, Electrical & Computer Engineering Department, Iowa State University, 2009.
7. T. Dimitriou: RFIDDOT: RFID Delegation and Ownership Transfer Made Simple. Proceedings of the International Conference on Computational Intelligence and Security, 2008, pp. 382-385.
8. K. Bicakci, N. Baykal: One-Time Passwords: Security Analysis Using BAN Logic and Integrating with Smartcard Authentication. Lecture Notes in Computer Science, 2003, pp. 794-801.
