Uniform Resource Locators (URLs) in WinHTTP

A URL is a compact representation of the location and access method for a resource located on the Internet. Each URL consists of a scheme (HTTP, HTTPS, FTP, or Gopher) and a scheme-specific string. This string can also include a combination of a directory path, search string, or name of the resource. The Microsoft Windows HTTP Services (WinHTTP) functions provide the ability to create, combine, break down, and canonicalize URLs. For more information, see RFC 1738, Uniform Resource Locators, and RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax.

What Is a Canonicalized URL?

The specified syntax and semantics of URLs leave room for variation and error. Canonicalization is the process of normalizing an actual URL into a correct, standard, "canonical" form.


This involves coding some characters as "escape sequences." Alphanumeric US-ASCII characters need not be encoded (the digits 0-9, the capital letters A-Z, and the lowercase letters a-z). Most other characters must be escaped, including control characters, the space character, the percent sign, "unsafe characters" ( <, >, ", #, {, }, |, \, ^, ~, [, ], and ' ), and all characters with a code point above 127.
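The escaping rule described above can be illustrated with a small portable sketch. This is not WinHTTP's own canonicalization code, and the names needsEscape and escapeUrlText are hypothetical helpers introduced only for this example. It assumes the input is a sequence of bytes and escapes exactly the character classes the paragraph lists, leaving everything else untouched.

```cpp
#include <string>

// Hypothetical helper: true if the byte belongs to one of the classes
// the text says must be escaped.
bool needsEscape(unsigned char c) {
    if (c < 0x20 || c == 0x7F) return true;   // control characters
    if (c > 127) return true;                 // code points above 127
    if (c == ' ' || c == '%') return true;    // space and percent sign
    const std::string unsafe = "<>\"#{}|\\^~[]'";  // "unsafe characters"
    return unsafe.find(static_cast<char>(c)) != std::string::npos;
}

// Hypothetical helper: replaces every byte that needs escaping with a
// %XX escape sequence (two uppercase hexadecimal digits).
std::string escapeUrlText(const std::string& in) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        if (needsEscape(c)) {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Because the percent sign is itself escaped, running this sketch twice over the same text encodes it twice. A real canonicalizer has to decide whether its input is already escaped.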

Using the WinHTTP Functions to Handle URLs

WinHTTP provides two functions for handling URLs. WinHttpCrackUrl separates a URL into its component parts, and WinHttpCreateUrl creates a URL from components.



The WinHttpCrackUrl function separates a URL into its component parts and returns the components indicated by the URL_COMPONENTS structure that is passed to the function.


The components that make up the URL_COMPONENTS structure are the scheme number, host name, port number, user name, password, URL path, and additional information such as search parameters. Each component, except the scheme and port numbers, has a string member that holds the information and a member that holds the length of the string member. The scheme and port numbers have only a member that stores the corresponding value; both the scheme and port numbers are returned on all successful calls to WinHttpCrackUrl.


To retrieve the value of a particular component in the URL_COMPONENTS structure, the member that stores the string length of that component must be set to a nonzero value. The string member can be either a pointer to a buffer or NULL.


If the pointer member contains a pointer to a buffer, the string length member must contain the size of that buffer. The WinHttpCrackUrl function returns the component information as a string in the buffer and stores the string length in the string length member.

If the pointer member is set to NULL, the string length member can be set to any nonzero value. The WinHttpCrackUrl function stores a pointer to the first character of the URL string that contains the component information and sets the string length to the number of characters in the remaining part of the URL string that pertains to the component.


All pointer members set to NULL with a nonzero length member point to the appropriate starting point in the URL string. The length stored in the length member must be used to determine the end of the individual component's information.
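The point about using the length member to find the end of a component can be sketched as follows. The helper copyComponent is a hypothetical name introduced for illustration. In real code its arguments would come from a cracked structure, for example urlComp.lpszHostName and urlComp.dwHostNameLength. The sketch assumes only what the text states: the pointer refers to a position inside the original URL string, and the component is not zero-terminated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies a component described by a start pointer and
// a character count. The pointer refers into the original URL string, so the
// length, not a terminating zero, marks where the component ends.
std::wstring copyComponent(const wchar_t* start, std::size_t length) {
    if (start == nullptr || length == 0) {
        return std::wstring();
    }
    return std::wstring(start, length);
}
```

Treating such a pointer as an ordinary zero-terminated string would also pick up the rest of the URL that follows the component. Copying with the reported length avoids that, and the copy stays valid after the original URL buffer is freed.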


To finish initializing the URL_COMPONENTS structure properly, the dwStructSize member must be set to the size of the URL_COMPONENTS structure.



The WinHttpCreateUrl function uses the information in the previously described URL_COMPONENTS structure to create a URL.


For each required component, the pointer member should contain a pointer to the buffer that holds the information. The length member should be set to zero if the pointer member contains a pointer to a zero-terminated string; the length member should be set to the string length if the pointer member contains a pointer to a string that is not zero-terminated. The pointer member of any components that are not required must be set to NULL.
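The pointer and length rule for building a URL can be modeled with a short portable sketch. It does not show how WinHttpCreateUrl is implemented. The Part structure and the partText and joinUrl functions are hypothetical stand-ins for one pointer/length pair of URL_COMPONENTS and for the assembly step. The sketch ignores ports, credentials, and escaping, and only shows how a zero length, a nonzero length, and a NULL pointer are interpreted.

```cpp
#include <string>

// Hypothetical stand-in for one pointer/length pair of URL_COMPONENTS.
struct Part {
    const wchar_t* text;    // NULL: the component is not supplied
    unsigned long  length;  // 0: text is zero-terminated; otherwise a count
};

// Resolves a part according to the rule described in the text.
std::wstring partText(const Part& p) {
    if (p.text == nullptr) return std::wstring();
    if (p.length == 0)     return std::wstring(p.text);
    return std::wstring(p.text, p.length);
}

// Simplified assembly of scheme, host, path, and extra information.
std::wstring joinUrl(const Part& scheme, const Part& host,
                     const Part& path, const Part& extra) {
    return partText(scheme) + L"://" + partText(host) +
           partText(path) + partText(extra);
}
```

For example, a host part that points into a longer string with a length of 15 contributes only its first 15 characters, while a path part with a length of zero is read up to its terminating zero.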



The following sample code shows how to use WinHttpCrackUrl and WinHttpCreateUrl to disassemble an existing URL, modify one of its components, and reassemble it into a new URL.

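  // Build as C++ and link with winhttp.lib.
  #include <windows.h>
  #include <winhttp.h>
  #include <stdio.h>

  int main()
  {
    URL_COMPONENTS urlComp;
    // NOTE: the URL literal is missing from the source text. The value
    // below is a placeholder assumed for illustration only.
    LPCWSTR pwszUrl1 =
      L"https://www.example.com/results.asp?RS=CHECKED&FORM=MSNH&v=1&q=wininet";
    DWORD dwUrlLen = 0;

    // Initialize the URL_COMPONENTS structure.
    ZeroMemory(&urlComp, sizeof(urlComp));
    urlComp.dwStructSize = sizeof(urlComp);

    // Set the required component lengths to nonzero so that they are cracked.
    urlComp.dwSchemeLength    = (DWORD)-1;
    urlComp.dwHostNameLength  = (DWORD)-1;
    urlComp.dwUrlPathLength   = (DWORD)-1;
    urlComp.dwExtraInfoLength = (DWORD)-1;

    // Crack the URL.
    if( !WinHttpCrackUrl( pwszUrl1, (DWORD)wcslen(pwszUrl1), 0, &urlComp ) )
    {
      printf( "Error %u in WinHttpCrackUrl.\n", GetLastError( ) );
      return 1;
    }

    // Change the search information. A writable array is used because
    // lpszExtraInfo is a non-const pointer; the length is updated to match.
    WCHAR szExtraInfo[] = L"?RS=CHECKED&FORM=MSNH&v=1&q=winhttp";
    urlComp.lpszExtraInfo     = szExtraInfo;
    urlComp.dwExtraInfoLength = (DWORD)wcslen(szExtraInfo);

    // Obtain the size of the new URL (this call is expected to fail and
    // report the required length), then allocate memory.
    WinHttpCreateUrl( &urlComp, 0, NULL, &dwUrlLen );
    LPWSTR pwszUrl2 = new WCHAR[dwUrlLen];

    // Create the new URL.
    if( !WinHttpCreateUrl( &urlComp, 0, pwszUrl2, &dwUrlLen ) )
      printf( "Error %u in WinHttpCreateUrl.\n", GetLastError( ) );
    else
      // Show both the old and the new URL.
      printf( "Old URL:  %S\nNew URL:  %S\n", pwszUrl1, pwszUrl2 );

    // Free the allocated memory.
    delete [] pwszUrl2;
    return 0;
  }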







