tCompressedString in SharePoint 2010 Content Database

http://www.digitude.net/blog/?p=362

To understand a specific aspect of SharePoint, it is sometimes useful to peek into its databases. One aspect where the back-end storage matters is the use of content types. When a content type is assigned to a list, the association is recorded in the content database. The data that is actually stored depends on how the association was made: through a feature, or manually through the UI.

The content type association is stored in the ‘tp_ContentTypes’ field of the ‘AllLists’ table. In SharePoint 2007 this field contained literal XML fragments; in SharePoint 2010 it contains compressed data, so it is not immediately readable.

The data type of the ‘tp_ContentTypes’ field is tCompressedString, which is a user-defined type based on the varbinary(max) data type. The compression method for these fields is described in this document: [MS-WSSFO2] (File Operations Database Communications Version 2 Protocol Specification).

If you go to topic 2.2.5.8 (WSS Compressed Structures) in this document, you can see that zlib is used to compress the data. The structure schema also shows that the compressed data starts at an offset of 12 bytes; that is where the zlib data begins. As far as I understand, zlib is an envelope format (a 2-byte header, the compressed payload, and a trailing checksum) that could in principle wrap different compression methods; in practice the one used is deflate. The .NET Framework supports deflate through the ‘DeflateStream’ class, which expects the raw deflate payload, so the 2-byte zlib header has to be skipped in addition to the 12-byte structure header. I have read in some articles that there are more robust ways to work with deflate, but for our purposes it will do. I compiled the code for .NET Framework 4.0 so that I could use the ‘CopyTo’ method, which makes reading from the underlying stream much easier.
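Before decompressing, it can help to confirm that a zlib stream really starts at offset 12. The following is a minimal sketch, not part of the original utility: the class and method names (`ZlibHeaderCheck`, `LooksLikeZlibAt`) are made up for illustration, and the check relies only on the general zlib header rules (compression method 8 means deflate, and the two header bytes read as a big-endian number must be divisible by 31).

```csharp
using System;

public static class ZlibHeaderCheck
{
    // Offset of the zlib data inside the compressed structure, as quoted in the article.
    public const int CompressedDataOffset = 12;

    // Returns true when the two bytes at 'offset' form a plausible zlib header.
    // Hypothetical helper: not taken from the original utility.
    public static bool LooksLikeZlibAt(byte[] buffer, int offset)
    {
        if (buffer == null || offset < 0 || buffer.Length - offset < 2)
        {
            return false;
        }

        int cmf = buffer[offset];     // compression method and info
        int flg = buffer[offset + 1]; // flags, including the check bits

        bool isDeflate = (cmf & 0x0F) == 8;               // method 8 = deflate
        bool checkBitsOk = ((cmf << 8) | flg) % 31 == 0;  // header checksum rule

        return isDeflate && checkBitsOk;
    }
}
```

Calling `ZlibHeaderCheck.LooksLikeZlibAt(bytes, ZlibHeaderCheck.CompressedDataOffset)` on a value read from ‘tp_ContentTypes’ should return true if the layout described above holds; a false result suggests the bytes were copied incorrectly or the field uses a different layout.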

The following code snippet shows how to decompress the data coming from the ‘tp_ContentTypes’ field:

// Requires: using System; using System.IO; using System.IO.Compression;
private string Decompress(byte[] compressedBytesBuffer)
{
    string uncompressedString = String.Empty;

    using (MemoryStream compressedMemoryStream = new MemoryStream(compressedBytesBuffer))
    {
        compressedMemoryStream.Position += 12; // Compressed structure header according to [MS-WSSFO2].
        compressedMemoryStream.Position += 2;  // Zlib header.

        // DeflateStream reads the raw deflate data that follows the zlib header.
        using (DeflateStream deflateStream = new DeflateStream(compressedMemoryStream, CompressionMode.Decompress))
        {
            using (MemoryStream uncompressedMemoryStream = new MemoryStream())
            {
                // Stream.CopyTo is available from .NET Framework 4.0 onwards.
                deflateStream.CopyTo(uncompressedMemoryStream);

                // Rewind so the reader starts at the first decompressed byte.
                uncompressedMemoryStream.Position = 0;

                using (StreamReader streamReader = new StreamReader(uncompressedMemoryStream))
                {
                    uncompressedString = streamReader.ReadToEnd();
                }
            }
        }
    }

    return uncompressedString;
}
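
To try the decompression logic without a SharePoint database at hand, one option is a round trip on synthetic data. The sketch below is an assumption-laden illustration rather than the original utility: `CompressedStructureSample` and `BuildSampleBlob` are hypothetical names, the 12 header bytes are zero-filled placeholders instead of the real field layout from the specification, and the trailing zlib checksum is left out because ‘DeflateStream’ stops reading once the deflate data ends.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressedStructureSample
{
    private const int StructureHeaderLength = 12; // offset quoted in the article
    private const int ZlibHeaderLength = 2;

    // Hypothetical helper: builds a test blob shaped like the layout described above.
    // The 12 header bytes are placeholders (all zero), not real header fields.
    public static byte[] BuildSampleBlob(string text)
    {
        using (var blob = new MemoryStream())
        {
            blob.Write(new byte[StructureHeaderLength], 0, StructureHeaderLength);
            blob.WriteByte(0x78); // zlib header: deflate, 32K window
            blob.WriteByte(0x9C); // zlib header: flags and check bits

            // leaveOpen = true so the blob can still be read after the deflate stream is flushed.
            using (var deflate = new DeflateStream(blob, CompressionMode.Compress, true))
            {
                byte[] payload = Encoding.UTF8.GetBytes(text);
                deflate.Write(payload, 0, payload.Length);
            }

            return blob.ToArray();
        }
    }

    // Same approach as the method above: skip both headers, then inflate.
    public static string Decompress(byte[] blob)
    {
        using (var input = new MemoryStream(blob))
        {
            input.Position = StructureHeaderLength + ZlibHeaderLength;

            using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
            using (var reader = new StreamReader(deflate, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Passing the result of `BuildSampleBlob("<Sample />")` to `Decompress` should give back the same string, which is a quick way to check the two header offsets before pointing the code at real data.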

The method accepts an array of bytes. Because the input comes from a textbox as a string representation of the byte sequence, a helper function converts that string into the corresponding bytes. For example, it converts the string “A8B2” to the byte sequence {0xA8, 0xB2}.
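
The helper itself is not listed in this post, so here is a minimal sketch of what such a conversion could look like. The names `HexConverter` and `HexToBytes` are hypothetical, and accepting an optional leading "0x" is my own assumption, added because SQL Server tools display varbinary values with that prefix.

```csharp
using System;

public static class HexConverter
{
    // Converts a hex string such as "A8B2" (optionally prefixed with "0x")
    // into the corresponding byte array. Hypothetical helper, not the original code.
    public static byte[] HexToBytes(string hex)
    {
        if (hex == null)
        {
            throw new ArgumentNullException("hex");
        }

        hex = hex.Trim();

        // Assumption: tolerate the "0x" prefix used when copying varbinary values.
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
        {
            hex = hex.Substring(2);
        }

        if (hex.Length % 2 != 0)
        {
            throw new FormatException("A hex string must contain an even number of digits.");
        }

        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Each pair of hex digits becomes one byte.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }

        return bytes;
    }
}
```

With this sketch, `HexConverter.HexToBytes("A8B2")` yields a two-element array containing 0xA8 and 0xB2, and the result can be handed straight to the decompression method.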

I have created a small Windows Forms application that performs this decompression. The interface is very basic, as I put this small utility together rather quickly.


The source code for this small utility can be found here. The code is written in C# using Visual Studio 2010.

 


Reposted from: https://www.cnblogs.com/frankzye/archive/2011/04/12/2014929.html
