Using BHO to Detect Files Downloaded into the IE Cache While Browsing

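This article describes how to use BHO (Browser Helper Object) technology to detect the files that Internet Explorer downloads into the IE cache while browsing web pages. It captures the DISPID_STATUSTEXTCHANGE event and parses the status text to find each downloaded file, then looks the file up in the cache and uses the FindMimeFromData function to determine its MIME type from the file contents.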

For an introduction to using BHOs, see:

(1)Building Browser Helper Objects with Visual Studio 2005

Link: http://msdn.microsoft.com/en-us/library/bb250489.aspx

(2)Browser Helper Objects: The Browser the Way You Want It

Link: http://msdn.microsoft.com/en-us/library/bb250436%28v=vs.85%29.aspx

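Both articles were covered in detail in my earlier posts.

Here is a brief description of the detection process.

First, we need to capture the DISPID_STATUSTEXTCHANGE event. (I have not found a single event that covers every file download. You can also try events such as DISPID_DOCUMENTCOMPLETE, DISPID_DOWNLOADCOMPLETE, and DISPID_FILEDOWNLOAD.)

In the Invoke function, add the following event-handling code: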

STDMETHODIMP CIECacheFileDetecter::Invoke(DISPID dispidMember, REFIID riid, LCID lcid, WORD wFlags, DISPPARAMS* pDispParams, VARIANT* pvarResult, EXCEPINFO* pExcepInfo, UINT* puArgErr)
{
    USES_CONVERSION;
    strstream strEventInfo;

    if (!pDispParams)
        return E_INVALIDARG;

    switch (dispidMember)
    {
    ...
    case DISPID_STATUSTEXTCHANGE:
    {
        // Braces give the locals below their own scope inside the switch.
        // get_StatusText returns a BSTR; CComBSTR frees it automatically.
        CComBSTR bstrStatusText;
        HRESULT hrText = m_spWebBrowser2->get_StatusText(&bstrStatusText);
        strEventInfo << "StatusTextChange: ";
        if (FAILED(hrText) || bstrStatusText.Length() == 0)
        {
            ATLTRACE(_T("Status Text: NULL\n"));
            break;
        }

        LPCTSTR pszStatusText = OLE2CT(bstrStatusText);
        // The URL starts at the first "http" in the status text.
        LPCTSTR pszSubText = StrStr(pszStatusText, _T("http"));
        if (NULL != pszSubText)
        {
            ATLTRACE(_T("Status Text Change: %s\n"), pszStatusText);
            // The URL ends where the trailing "..." begins.
            LPCTSTR pszEnd = StrStr(pszSubText, _T("..."));
            if (NULL != pszEnd)
            {
                size_t cchUrl = pszEnd - pszSubText;
                // One extra element for the terminating null character.
                TCHAR* pszUrl = new TCHAR[cchUrl + 1];
                _tcsncpy(pszUrl, pszSubText, cchUrl);
                pszUrl[cchUrl] = _T('\0');
                ATLTRACE(_T("URL: %s\n"), pszUrl);
                DetectMimeType(pszUrl);
                delete[] pszUrl;
            }
        }
        break;
    }
    ...
    }

    return S_OK;
}
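The URL extraction step can be tried in isolation. The following is a minimal sketch in portable C++, assuming (as the handler does) that the status text contains a URL starting with "http" and followed by "...". The helper name ExtractUrlFromStatusText and the sample string in the usage note are made up for illustration and are not part of the BHO.

```cpp
#include <string>

// Hypothetical helper: copies the URL embedded in a status-bar message into
// `url`. Returns false when the message has no "http" or no "..." after it.
bool ExtractUrlFromStatusText(const std::string& statusText, std::string& url)
{
    // The URL is assumed to start at the first occurrence of "http".
    const std::string::size_type begin = statusText.find("http");
    if (begin == std::string::npos)
        return false;

    // The URL is assumed to end where the "..." marker begins.
    const std::string::size_type end = statusText.find("...", begin);
    if (end == std::string::npos)
        return false;

    url = statusText.substr(begin, end - begin);
    return true;
}
```

For example, calling it with the made-up text "Downloading picture http://example.com/a.png..." stores "http://example.com/a.png" in url, while a message such as "Done" makes it return false.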

Here, DetectMimeType(pszUrl) looks up the corresponding file in the IE cache and detects its MIME type. It is implemented as follows:

BOOL CIECacheFileDetecter::DetectMimeType(LPCTSTR pszUrl)
{
    USES_CONVERSION;

    // The first call passes a NULL buffer and only queries the required size,
    // so it is expected to fail with ERROR_INSUFFICIENT_BUFFER.
    DWORD dwEntrySize = 0;
    if (GetUrlCacheEntryInfo(pszUrl, NULL, &dwEntrySize) ||
        GetLastError() != ERROR_INSUFFICIENT_BUFFER)
    {
        return FALSE;
    }

    char* pEntryBuffer = new char[dwEntrySize];
    LPINTERNET_CACHE_ENTRY_INFO lpCacheEntry = (LPINTERNET_CACHE_ENTRY_INFO)pEntryBuffer;
    if (!GetUrlCacheEntryInfo(pszUrl, lpCacheEntry, &dwEntrySize))
    {
        delete[] pEntryBuffer;
        return FALSE;
    }

    if (lpCacheEntry->dwHeaderInfoSize != 0)
    {
        // Trace the header information, bounded by its reported length.
        ATLTRACE(_T("Header Info:\n%.*s\n"), (int)lpCacheEntry->dwHeaderInfoSize, lpCacheEntry->lpHeaderInfo);
    }

    // Read the leading bytes of the cached file in binary mode.
    const int BUF_SIZE = 256;
    BYTE buffer[BUF_SIZE];
    FILE* fh = NULL;
    if (NULL != lpCacheEntry->lpszLocalFileName)
        fh = _tfopen(lpCacheEntry->lpszLocalFileName, _T("rb"));
    delete[] pEntryBuffer; // the cache entry is no longer needed
    if (NULL == fh)
        return FALSE;
    size_t cbRead = fread(buffer, 1, BUF_SIZE, fh);
    fclose(fh);

    // Pass only the bytes that were actually read.
    LPWSTR strMime = NULL;
    HRESULT hr = FindMimeFromData(NULL, NULL, buffer, (DWORD)cbRead, NULL, FMFD_DEFAULT, &strMime, 0);
    if (SUCCEEDED(hr))
    {
        ATLTRACE(_T("Mime Type: %s\n"), W2CT(strMime));
        // Assumption: the returned string is released with CoTaskMemFree, as is
        // common practice; the MSDN text quoted below says operator delete.
        CoTaskMemFree(strMime);
    }
    else
    {
        ATLTRACE(_T("Detect Mime Type Failed.\n"));
    }

    return TRUE;
}
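When a cached file is read for sniffing, the bytes should be read in binary mode: on Windows, text mode translates CR/LF pairs and treats the byte 0x1A as end of file, so the buffer can end up shorter than or different from the file's real leading bytes. The following is a minimal sketch of such a read in portable C++. The helper name ReadLeadingBytes is an assumption made for illustration, not something the BHO defines.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: returns up to maxBytes from the start of the file,
// read without any text-mode translation. Returns an empty vector when the
// file cannot be opened.
std::vector<unsigned char> ReadLeadingBytes(const std::string& path, std::size_t maxBytes)
{
    std::ifstream file(path.c_str(), std::ios::binary);
    if (!file)
        return std::vector<unsigned char>();

    std::vector<unsigned char> buffer(maxBytes);
    file.read(reinterpret_cast<char*>(buffer.data()),
              static_cast<std::streamsize>(maxBytes));

    // gcount() is the number of bytes actually read, which may be fewer
    // than requested when the file is short.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return buffer;
}
```

The size of the returned vector, rather than the requested maximum, is the value to pass on as the buffer size when sniffing.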

Here we use FindMimeFromData to detect the MIME type from the file contents. The function is documented as follows:

FindMimeFromData Function

Determines the MIME type from the data provided.

Syntax

HRESULT FindMimeFromData(
    LPBC pBC,
    LPCWSTR pwzUrl,
    LPVOID pBuffer,
    DWORD cbSize,
    LPCWSTR pwzMimeProposed,
    DWORD dwMimeFlags,
    LPWSTR *ppwzMimeOut,
    DWORD dwReserved
);

Parameters

pBC

A pointer to the IBindCtx interface. Can be set to NULL.

pwzUrl

A pointer to a string value that contains the URL of the data. Can be set to NULL if pBuffer contains the data to be sniffed.

pBuffer

A pointer to the buffer that contains the data to be sniffed. Can be set to NULL if pwzUrl contains a valid URL.

cbSize

An unsigned long integer value that contains the size of the buffer.

pwzMimeProposed

A pointer to a string value that contains the proposed MIME type. This value is authoritative if type cannot be determined from the data. If the proposed type contains a semi-colon (;) it is removed. This parameter can be set to NULL.

dwMimeFlags

One of the following required values:

FMFD_DEFAULT

No flags specified. Use default behavior for the function.

FMFD_URLASFILENAME

Treat the specified pwzUrl as a file name.

FMFD_ENABLEMIMESNIFFING

Microsoft Internet Explorer 6 for Windows XP Service Pack 2 (SP2) and later. Use MIME-type detection even if FEATURE_MIME_SNIFFING is detected. Usually, this feature control key would disable MIME-type detection.

FMFD_IGNOREMIMETEXTPLAIN

Internet Explorer 6 for Windows XP SP2 and later. Perform MIME-type detection if "text/plain" is proposed, even if data sniffing is otherwise disabled. Plain text may be converted to text/html if HTML tags are detected.

FMFD_SERVERMIME

Windows Internet Explorer 8. Use the authoritative MIME type specified in pwzMimeProposed. Unless FMFD_IGNOREMIMETEXTPLAIN is specified, no data sniffing is performed.

FMFD_RESPECTTEXTPLAIN

Internet Explorer 9. Do not perform detection if "text/plain" is specified in pwzMimeProposed.

ppwzMimeOut

The address of a string value that receives the suggested MIME type.

dwReserved

Reserved. Must be set to 0.

Return Value

Returns one of the following values.

S_OK

The operation completed successfully.

E_FAIL

The operation failed.

E_INVALIDARG

One or more arguments are invalid.

E_OUTOFMEMORY

There is insufficient memory to complete the operation.

Remarks

MIME type detection, or "data sniffing," refers to the process of determining an appropriate MIME type from binary data. The final result depends on a combination of server-supplied MIME type headers, file name extension, and/or the data itself. Usually, only the first 256 bytes of data are significant. For more information and a complete list of recognized MIME types, see MIME Type Detection in Internet Explorer.

If pwzUrl is specified without data to be sniffed (pBuffer), the file name extension determines the MIME type. If the file name extension cannot be mapped to a MIME type, this method returns E_FAIL unless a proposed MIME type is supplied in pwzMimeProposed.

After ppwzMimeOut returns and is read, the memory allocated for it should be freed with the operator delete function.

Internet Explorer 8 and later. FindMimeFromData will not promote image types to "text/html" even if the data lacks signature bytes.
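To make the idea of "data sniffing" concrete, here is a simplified sketch that matches the leading bytes of a buffer against a few well-known file signatures. It is only an illustration of the general technique, not the algorithm FindMimeFromData uses: the function name SniffMimeType is hypothetical, the signature table is deliberately tiny, and the returned strings are the standard registered names, which may differ from what FindMimeFromData reports.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// One entry per recognized signature: the leading bytes and the MIME type.
struct Signature
{
    const char* bytes;
    std::size_t length;
    const char* mimeType;
};

// Hypothetical helper: returns a MIME type when the buffer starts with a
// known signature, or an empty string when nothing matches.
std::string SniffMimeType(const unsigned char* data, std::size_t size)
{
    static const Signature kSignatures[] = {
        {"\x89PNG\r\n\x1a\n", 8, "image/png"},
        {"GIF87a", 6, "image/gif"},
        {"GIF89a", 6, "image/gif"},
        {"\xFF\xD8\xFF", 3, "image/jpeg"},
        {"%PDF-", 5, "application/pdf"},
    };

    for (const Signature& sig : kSignatures)
    {
        // A signature can only match if the buffer is at least that long.
        if (size >= sig.length && std::memcmp(data, sig.bytes, sig.length) == 0)
            return sig.mimeType;
    }
    return std::string();
}
```

A buffer that is shorter than a signature never matches it, which is one reason to pass the number of bytes actually read rather than the capacity of the buffer.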
