An introduction to the Package object in Word
References:
http://www.eggheadcafe.com/software/aspnet/30971618/word--embedded-object-of-type-package.aspx
http://social.msdn.microsoft.com/forums/en-US/vsto/thread/c751c3ae-235d-4327-a26b-74fc297263b6/
In the simplest terms:
I can easily embed such an object from Word's menu: Insert -> Object..., then choose Object type = "Package".
A few excerpts from the English threads follow:
Hi Bernard,
The "Package" is a legacy of the OLE 1.0 days. Ole 1.0 had a tool called the Packager that would provide an Ole object wrapper around an arbitrary file to allow it to be embedded in a client. In Ole 2.0, they took that concept and exposed it in such a way that clients could create packages via the Insert Object dialog without needing the tool.
I would have expected that extracting the data would have been a simple matter of casting the OLEFormat.Object to an IDataObject and calling GetFormats to get a list of the supported formats. I believe you would have found a "Native" format as an option, which would return the raw bits for the object. However, when I tried this, I found that OLEFormat.Object is throwing an invalid cast exception.
Further research confirms that the problem is that the Packager doesn't implement IDispatch. Since the Object type is a wrapper around IDispatch, the Object accessor is going to throw as a result of the QueryInterface for IDispatch failing. In short, there is no way to access the Object Packager through the Word Object model.
So now it is gut check time. How bad do you want this? There are only two ways I can think of to work around this problem. No managed code is involved--from here on out when I refer to APIs, assume they are Win32, COM, OLE etc. You'll have to create P/Invoke declarations for all of this (or else do this work in a native dll that you call from managed code).
The easiest way would probably be to crack open the Word storage, locate the substorage that the Packager wrote and call OleLoad on it. You can locate all of the Packager substorages by finding the ones with Ole10Native streams in them. You won't be able to map the substorages back to specific objects (that information is buried deep inside of the Word storage somewhere), but if all you are trying to do is extract all embedded objects, this shouldn't matter. So you will need to call StgOpenStorage on the Word document to open it read-only, and then use the storage APIs to enumerate the sub-storages and streams. Of course this approach gets considerably more complicated if you need to support metro file formats (Office 12). In addition to locating the Packager sub-storage, you will also need to grab the IOleClientSite from the document (since you will have to provide this to OleLoad). To do that, just get a Document reference, cast it to IOleObject (you'll have to implement the P/Invoke declarations for this) and call GetClientSite. If you can do all of that, you can call OleLoad with an IID of IDataObject. You should be able to pass in a .Net IDataObject class reference, but if that doesn't work, just get back an IUnknown and cast it. If you opt for this approach, you will probably want to download the Win32 SDK so you can get the doc-file viewer tool (DFVIEW.EXE). This tool will allow you to view the structured storage layout of a doc file. You can get the Win32 SDK here: http://www.microsoft.com/downloads/details.aspx?FamilyId=A55B6B43-E24F-4EA3-A93E-40C0EC4F68E5&displaylang=en
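To make the "find the substorages with Ole10Native streams" step more concrete, here is a minimal Python sketch. It does not call the Win32 storage APIs; it assumes the stream paths have already been enumerated by some other means and are available as lists of path components. The function name `find_packager_storages` and the sample listing are hypothetical, made up for illustration. Note that the stream name carries a leading 0x01 control character.

```python
from typing import Iterable, List, Sequence

# Name of the stream the Packager writes; note the leading 0x01 character.
OLE10_NATIVE = "\x01Ole10Native"


def find_packager_storages(stream_paths: Iterable[Sequence[str]]) -> List[str]:
    """Return the storages that directly contain an Ole10Native stream.

    Each entry in stream_paths is a sequence of path components, with the
    stream name as the last component. Names are compared case-insensitively.
    """
    found = []
    for parts in stream_paths:
        if parts and parts[-1].lower() == OLE10_NATIVE.lower():
            # Everything before the stream name is the containing storage.
            storage = "/".join(parts[:-1])
            if storage not in found:
                found.append(storage)
    return found


# Hypothetical listing of the streams in a .doc file, for illustration only.
sample = [
    ["WordDocument"],
    ["1Table"],
    ["ObjectPool", "_1234567890", "\x01Ole10Native"],
    ["ObjectPool", "_1234567890", "\x03ObjInfo"],
    ["ObjectPool", "_1234567891", "\x01CompObj"],
]
print(find_packager_storages(sample))  # ['ObjectPool/_1234567890']
```

As the text says, this only tells you which substorages belong to the Packager; it does not map them back to specific objects in the document.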
The other approach is IMO a better one, but it requires a much higher level of technical skill. What you would need to do is hook the CoCreateInstance API and replace the Packager IUnknown with a wrapper class that supports IDispatch. You may not even need to implement IDispatch--it may be enough to simply satisfy the QueryInterface for IDispatch by returning an IUnknown pointer. As long as Word doesn't actually try to call any IDispatch methods on the pointer, you would be fine. Doing this would enable Word to create the Packager object and return you the pointer via the OM. This would enable you to get at IDataObject which would in turn allow you to access the Packager storage in a supported way. I'm not sure whether the Packager supports aggregation; ideally your wrapper would just aggregate the packager and tack on IDispatch. Otherwise, you would need to use containment to expose IDataObject, IPersist, and IPersistStorage (where your implementations would just pass through to the contained package object).
The hooking of the API is the hard part. Basically, you would need to fix up the import address tables in the PE image and replace the CoCreateInstance address with your own function. I've done this before, so I can assure you it's possible. You can find the PE specification here: http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx. Alternatively, I believe there are some commercial products out there that you might choose to integrate. If you search the internet for "windows api hook" I suspect you will find some offerings. I have no idea if any of them are any good, but you can certainly investigate that on your own. If you do go the hook route, you will want to make sure you install the hook immediately prior to calling OLEFormat.Object and then remove it immediately after so you don't affect any unrelated CoCreateInstance calls.
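The "install immediately before, remove immediately after" discipline can be sketched in Python as a scoped replace-and-restore. This is only an analogy, not real import address table patching: a plain dictionary stands in for the table of imported function addresses, and every name here (`scoped_hook`, `imports`, `real_create`, `wrapping_create`) is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def scoped_hook(table, name, replacement):
    """Swap table[name] for replacement, restoring the original on exit."""
    original = table[name]
    table[name] = replacement
    try:
        yield original
    finally:
        # Always restore, even if the hooked call raises.
        table[name] = original


def real_create(clsid):
    # Stand-in for the real CoCreateInstance; returns a dummy value.
    return "object:" + clsid


# Stand-in for the import table that callers resolve the function through.
imports = {"CoCreateInstance": real_create}


def call_through_table(clsid):
    return imports["CoCreateInstance"](clsid)


def wrapping_create(clsid):
    # Stand-in for a hook that wraps the object the real function creates.
    return "wrapped(" + real_create(clsid) + ")"


with scoped_hook(imports, "CoCreateInstance", wrapping_create):
    inside = call_through_table("Package")
after = call_through_table("Package")

print(inside)  # wrapped(object:Package)
print(after)   # object:Package
```

Restoring in a `finally` block means the original entry comes back even when the hooked call fails, which is the property the advice above is after.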
The thing I like about the hook approach is that it isn't a workaround. It fixes the underlying problem in such a way that you can access the Packager object without relying on internal knowledge of the file format. To me, this point is critical given that the file format changed in Office 12--which means you would need to implement two different solutions (one for doc files, one for metro) if you opted for the first approach.
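To see why the first approach needs two code paths, here is a small Python sketch that tells the two containers apart by their leading bytes. It assumes that "metro" here means the ZIP-based Office 12 (.docx) format, and that embedded objects in such a file typically sit under `word/embeddings/`. The helper names `container_kind` and `list_zip_embeddings` are hypothetical, and the sample document is built in memory purely for illustration.

```python
import io
import zipfile

# Signature at the start of an OLE compound file ("doc file"), e.g. .doc.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local-file-header signature at the start of a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def container_kind(header: bytes) -> str:
    """Classify a file by its first bytes."""
    if header.startswith(CFB_MAGIC):
        return "compound-file"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def list_zip_embeddings(data: bytes) -> list:
    """List the entries under word/embeddings/ in a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return sorted(n for n in zf.namelist() if n.startswith("word/embeddings/"))


# Build a tiny stand-in for a .docx in memory (not a valid Word document).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<w:document/>")
    zf.writestr("word/embeddings/oleObject1.bin", CFB_MAGIC)
data = buf.getvalue()

print(container_kind(data[:8]))   # zip
print(list_zip_embeddings(data))  # ['word/embeddings/oleObject1.bin']
print(container_kind(CFB_MAGIC))  # compound-file
```

In the ZIP case each embedded part is usually itself a compound file, so the Ole10Native search still applies, just one level further in.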
Once you get an IDataObject, you might have to play around with the formats a bit. As I said, I believe that there will be a "Native" format that will return the blob. However, I do not know this for certain. One thing that makes me a bit cautious is that when I tried copying the package to the clipboard, DOBJVIEW (another Win32 SDK tool that allows you to look at the data object on the clipboard) does not show a "Native" format (though it might be that it just doesn't expose it via IEnumFORMATETC). So the question remains as to which format would return the contents of the Ole10Native stream. I would definitely answer this question before investing too much time in either approach.
For that reason, what I would probably suggest is to create a document with a package in it and then write a quick application that calls CoCreateInstance on the Packager, does a QueryInterface for IPersistStorage, and then calls IPersistStorage::Load passing in the sub storage from the document (just use DFVIEW to get the sub storage name). Then you can QueryInterface the Packager for IDataObject and enumerate formats to see what you get. This is effectively a quick and dirty implementation of option number one. If you prefer, you could use this sequence of calls instead of calling OleLoad. Incidentally, if you don't get a full IDataObject, try calling OleRun after you call IPersistStorage::Load. It may be necessary to put the Packager in a running state to get the full functionality. If that turns out to be the case, you would need to call OleRun after calling OleLoad if you took the first approach.
Sorry I don't have an easy answer for you, but if you really have to solve this problem you should have enough information to do so.
Sincerely,
Geoff Darst
Microsoft VSTO Team
"Package" is what Word (or Office) uses when it isn't able to recognize what the object should be. Often, it indicates there was a problem with the OLE Server not being recognized correctly when the item was created or last edited. In the last couple of years, this appears to be due more and more frequently to third-party applications (file management or anti-virus) interfering.
As far as I know, once the OLE Server information has been lost there's not much you can do. Certainly, there's no chance via the object model.
If you want to pursue the question, a more proper venue would be one of the office.developer newsgroups. You'll find a list in the "Please Read First" message at the top of the forum. This forum is specifically for questions concerning the VSTO technology and not general Office automation.