Executive Summary
The following C++ XML serialisation classes and templates can be used with or without MFC/STL. The XML may also, optionally, be compressed to save storage space. XML compresses very well because it is highly repetitive; a compression ratio of 98% is quite common.
Introduction
Being able to save and load your data in XML is very useful for many different reasons. Here are a few benefits:
- Human readable.
- Platform independent.
- Seamless schema versioning.
The target development and test platform used to create the XML serialisation classes is Microsoft Visual C++ 6 onwards. The XML serialisation classes and templates can be used with or without MFC/STL. The compression classes use zlib to compress the XML data before saving, and to uncompress before loading. This is useful as XML data can be quite large and XML compresses well. This also provides a small amount of protection/privacy from prying eyes, as the raw XML will not be readily viewable. The compression is optional. You only have to concern yourself with the macros, but documentation is provided for all the other classes and templated functions.
Here is a very brief example for clarity:
    #include "HsXmlArchive.h"

    class CFred
    {
    private:
        int m_cost;
        float m_markup;
    public:
        CFred() : m_cost(123), m_markup(456.789f) { }
        DECLARE_XML_SERIAL;
    };

    bool CFred::SerializeXml(HS::CXmlArchive &archive, MSXML::IXMLDOMNodePtr pCurNode)
    {
        IMPLEMENT_XML_SERIAL_BEGIN(CFred);
            XML_ELEMENT(m_cost);
            XML_ELEMENT(m_markup);
        IMPLEMENT_XML_SERIAL_END;
    }

    int main()
    {
        CFred fred;
        // Open the archive for saving, uncompressed, to output.xml
        HS::CXmlArchive archive(HS::CXmlArchive::save, false, "output.xml");
        MSXML::IXMLDOMNodePtr pCurNode = archive.Start("Your message goes here");
        fred.SerializeXml(archive, pCurNode);
        archive.End();
        return 0;
    }
The generated output placed in output.xml is:
<?xml version="1.0" ?>
<!-- Your message goes here -->
<root>
<CFred>
<m_cost>123</m_cost>
<m_markup>456.789</m_markup>
</CFred>
</root>
How to Implement
The design goal of the XML serialisation classes and macros was to facilitate the minimum amount of coding for the developer. MFC and STL containers are also supported.
Compiler settings
Paths
If you are going to use compression, then a path to the zlib.h header file is required. You can accomplish this in one of two ways:
- VC 6:
- Project settings --> C++ tab --> Pre-processor category. Specify the path in the Additional Include Directories edit box, or,
- Tools --> Options --> Directories tab. Add in the path for the include directories.
- VC 7:
- Project properties --> Configuration properties --> C/C++. Specify the path in the Additional Include Directories edit box, or,
- Tools --> Options --> Projects --> VC++ Directories. Select "Include Files" from the ‘Show directories for’ combo box. Add in the path for the include directories.
A path to the XML serialisation files is also required. This can be accomplished using the same method as shown above.
Definitions
If you are going to use compression, then some definitions are required. You can accomplish this in one of the following ways:
- VC 6: Select Project settings – C++ tab, General category, Preprocessor definitions: Append the following to the definitions already there:
,HS_USE_COMPRESSION,_WINDOWS,ZLIB_DLL
- VC 7: Project properties --> Configuration properties --> C/C++ --> Preprocessor. Append the following to the definitions already there:
;HS_USE_COMPRESSION;_WINDOWS;ZLIB_DLL
- Add the following to the stdafx.h file:
    #define HS_USE_COMPRESSION
    // Required for zlib
    #define _WINDOWS
    #define ZLIB_DLL
Header file
In the header file of your class, add the following #include statement:
    #include "HsXmlArchive.h"
Within the class, add the following macro in a public section:
    DECLARE_XML_SERIAL;
Here is an example of a simple class. The only XML serialisation additions are the #include line and the DECLARE_XML_SERIAL macro:
    #include "HsXmlArchive.h"

    class CSimpleClass
    {
    private:
        // Member variables here
        float m_num;
    public:
        CSimpleClass();
        ... // Other functions here
        DECLARE_XML_SERIAL; // The SerializeXml prototype macro
    };
That’s all that is required in the header file. You may need to provide a path to HsXmlArchive.h.
Implementation file
- In the implementation file, add the following function, which was declared by the DECLARE_XML_SERIAL macro in the header file:
    bool SerializeXml(HS::CXmlArchive &archive, MSXML::IXMLDOMNodePtr pCurNode)
- Use the following macro, passing in the name of your class:
    IMPLEMENT_XML_SERIAL_BEGIN(your class name goes here);
- Use the relevant element macros, passing in the variable to be serialised:
    XML_ELEMENT(variable);
    XML_ELEMENT_NAMED(XmlVariableName, variable);
  Remember to indent these, as the IMPLEMENT_XML_SERIAL_BEGIN macro declares an open curly bracket {. This is not strictly necessary, but static checkers such as PC-Lint would complain about the lack of indentation.
- If your class inherits from one or more base classes that also declare the DECLARE_XML_SERIAL macro, then simply add the following line for each base class:
    SERIALIZE_XML_BASE_CLASS(base class name goes here);
- End the function with the closing macro:
    IMPLEMENT_XML_SERIAL_END;
- Here is a full example of our simple class:
    bool CSimpleClass::SerializeXml(HS::CXmlArchive &archive, MSXML::IXMLDOMNodePtr pCurNode)
    {
        IMPLEMENT_XML_SERIAL_BEGIN(CSimpleClass);
            XML_ELEMENT(m_num);
        IMPLEMENT_XML_SERIAL_END;
    }
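To make the brace-opening behaviour of the macros less mysterious, here is a minimal sketch of how such a BEGIN/ELEMENT/END trio could be written. The DEMO_ macros, CDemoSimple and DemoToString are hypothetical stand-ins invented for this illustration; they write text to a stream and are not the real HS macros, whose expansions are not shown in this article.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins, NOT the real HS macros. They only illustrate how a
// BEGIN macro can open a scope that the END macro closes, and how the
// stringizing operator (#) turns a member name into an element name.
#define DEMO_SERIAL_BEGIN(class_name)            \
    {                                            \
        const char *demoClassName = #class_name; \
        out << "<" << demoClassName << ">\n"

#define DEMO_ELEMENT(variable) \
        out << "<" #variable ">" << (variable) << "</" #variable ">\n"

#define DEMO_SERIAL_END                          \
        out << "</" << demoClassName << ">\n";   \
    }                                            \
    return true

class CDemoSimple
{
    int m_cost;
    int m_count;
public:
    CDemoSimple() : m_cost(123), m_count(4) {}
    bool SerializeDemo(std::ostream &out) const;
};

bool CDemoSimple::SerializeDemo(std::ostream &out) const
{
    DEMO_SERIAL_BEGIN(CDemoSimple);   // opens the scope
        DEMO_ELEMENT(m_cost);         // element name comes from #variable
        DEMO_ELEMENT(m_count);
    DEMO_SERIAL_END;                  // closes the scope and returns
}

// Convenience helper for the sketch: serialise into a string.
std::string DemoToString(const CDemoSimple &obj)
{
    std::ostringstream out;
    obj.SerializeDemo(out);
    return out.str();
}
```

Calling DemoToString(CDemoSimple()) yields one element per member, wrapped in an element named after the class. This is the same shape as the output.xml shown earlier, minus the XML declaration and root node.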
An Example
For this example, three classes are going to be serialised: CFred, CFred2, and CMyStringVector. These can be found in the "MFC demo" directory.
Figure 1
Classes
CMyStringVector
This class inherits from CObject and utilises CString, and therefore uses MFC. It contains two public member variables, both of which utilise STL containers:
- m_FredList is an STL container of type vector<CFred>.
- m_list is an STL container of type vector<CString>.
CFred
This is a simple class, and contains four private member variables that can be seen in Figure 1.
CFred2
This class utilises multiple inheritance by inheriting from both CFred and CMyStringVector. This class contains three protected member variables that can be seen in Figure 1.
How classes save / load their state via XML
CMyStringVector
The two STL containers are serialised as shown below:
    IMPLEMENT_XML_SERIAL_BEGIN(CMyStringVector);
        SERIALIZE_XML_STL_VARIANT(m_list, true);
        SERIALIZE_XML_STL_CLASS(m_FredList, true);
    IMPLEMENT_XML_SERIAL_END;
- m_list contains a list of CHsVariant compatible variables, therefore the SERIALIZE_XML_STL_VARIANT macro is used. Note the ‘true’ parameter: this tells the XML serialisation classes that this STL container has the reserve function, and that reserve should be called during a load operation to pre-allocate memory. A vector implements reserve. More on this topic later.
- m_FredList contains a list of classes, which must all utilise the DECLARE_XML_SERIAL macro; therefore, the SERIALIZE_XML_STL_CLASS macro is used.
CFred
Having four variables, it is implemented simply as follows:
    IMPLEMENT_XML_SERIAL_BEGIN(CFred);
        XML_ELEMENT(m_a);
        XML_ELEMENT(m_b);
        XML_ELEMENT(m_c);
        XML_ELEMENT(m_d);
    IMPLEMENT_XML_SERIAL_END;
CFred2
This class contains three variables (two integers and one MFC CString). XML serialisation is just the same, utilising the XML_ELEMENT macro. The class inherits from two classes, both of which declare DECLARE_XML_SERIAL. To serialise a base class, the SERIALIZE_XML_BASE_CLASS macro is used.
    IMPLEMENT_XML_SERIAL_BEGIN(CFred2);
        XML_ELEMENT(c);
        XML_ELEMENT(d);
        XML_ELEMENT(txt);
        SERIALIZE_XML_BASE_CLASS(CFred);
        SERIALIZE_XML_BASE_CLASS(CMyStringVector);
    IMPLEMENT_XML_SERIAL_END;
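The expansion of SERIALIZE_XML_BASE_CLASS is not shown in this article. One plausible way to get the same effect by hand is for the derived class to call each base class's serialise function with an explicitly qualified name. The sketch below shows that idea with hypothetical CDemo classes writing to a stream; it is an illustration of the technique, not the library's code.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical demo classes; they only show explicitly qualified calls
// to each base class when multiple inheritance is involved.
struct CDemoBaseA
{
    int m_a;
    CDemoBaseA() : m_a(1) {}
    void Serialize(std::ostream &out) const { out << "<m_a>" << m_a << "</m_a>"; }
};

struct CDemoBaseB
{
    int m_b;
    CDemoBaseB() : m_b(2) {}
    void Serialize(std::ostream &out) const { out << "<m_b>" << m_b << "</m_b>"; }
};

struct CDemoDerived : public CDemoBaseA, public CDemoBaseB
{
    int m_c;
    CDemoDerived() : m_c(3) {}

    void Serialize(std::ostream &out) const
    {
        out << "<m_c>" << m_c << "</m_c>";  // own members first
        CDemoBaseA::Serialize(out);         // then each base class in turn
        CDemoBaseB::Serialize(out);
    }
};

// Convenience helper for the sketch: serialise a default object to a string.
std::string DemoDerivedToString()
{
    std::ostringstream out;
    CDemoDerived().Serialize(out);
    return out.str();
}
```

The qualified names matter: an unqualified Serialize(out) inside CDemoDerived::Serialize would call itself recursively.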
Dynamic types
This section deals with lists that contain base class pointers pointing to various derived types. For example, if you have a container holding a list of CMyGraphicalObject base class pointers, those pointers could point to any type derived from CMyGraphicalObject, such as CMyCircle, CMyTriangle, or CMySquare. In order to save/load these dynamic types, the XML serialisation needs a little help from you. We can’t use the mechanism used by MFC and the CRuntimeClass class, as this would not be a generic solution suitable for non-MFC applications.
The help required is as follows:
- The base class inherits from the HS::CHsObject abstract class.
- An enumeration named eHsObjectType needs to be created.
- Each of the dynamically creatable classes needs to implement the HsObjectType() virtual function. This returns the eHsObjectType value relevant to that class.
- A function named CreateHsObject needs to be created. This function takes the eHsObjectType enumerated type as a parameter. It should return a newly created object of that type.
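The four steps above could be sketched roughly as follows. CDemoHsObject is a hypothetical stand-in for HS::CHsObject (the real class is not shown in this article), and the enumeration values other than NO_OBJECT are made up for the illustration.

```cpp
// The enumeration the steps ask for; CIRCLE and SQUARE are illustrative values.
enum eHsObjectType { NO_OBJECT, CIRCLE, SQUARE };

// Hypothetical stand-in for HS::CHsObject, reduced to the one virtual
// function discussed above.
class CDemoHsObject
{
public:
    virtual ~CDemoHsObject() {}
    virtual eHsObjectType HsObjectType() const = 0;
};

// The common base class stored in the container.
class CMyGraphicalObject : public CDemoHsObject {};

class CMyCircle : public CMyGraphicalObject
{
public:
    eHsObjectType HsObjectType() const { return CIRCLE; }
};

class CMySquare : public CMyGraphicalObject
{
public:
    eHsObjectType HsObjectType() const { return SQUARE; }
};

// Factory: maps a type tag read back from the XML to a newly created object.
// Returns 0 for an unknown tag; the caller owns the returned object.
CMyGraphicalObject *CreateHsObject(eHsObjectType type)
{
    switch (type)
    {
    case CIRCLE: return new CMyCircle;
    case SQUARE: return new CMySquare;
    default:     return 0;
    }
}
```

During a save, the tag returned by HsObjectType() is written out ahead of the object's own data. During a load, the tag is read first and handed to CreateHsObject so that the correct derived type exists before its data is read.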
MFC
Inside MyList.h, there is a function named SerializeXmlDynamicHsObject that performs the saving/loading of dynamic types. The saving code looks like this:
    int nCount(m_nCount);
    XML_ELEMENT(nCount);
    POSITION pos = GetHeadPosition();
    while(pos)
    {
        TYPE *p = GetNext(pos);
        eHsObjectType type(p->HsObjectType());
        XML_ELEMENT_ENUM(type, eHsObjectType);
        p->SerializeXml(archive, pCurNode);
    }
The loading code looks like this:
    RemoveAll();
    int nCount(0);
    XML_ELEMENT(nCount);
    while(nCount--)
    {
        eHsObjectType type(NO_OBJECT);
        XML_ELEMENT_ENUM(type, eHsObjectType);
        TYPE *p = CreateHsObject(type);
        ASSERT(p);
        if(p)
        {
            AddTail(p);
            p->SerializeXml(archive, pCurNode);
        }
    }
STL
For STL containers, there are two macros inside HsXmlArchive.h, named:
- SERIALIZE_XML_STL_CLASS_DynamicHsObject
- SERIALIZE_XML_STL_MAP_CLASS_DynamicHsObject
These macros expand to the following, respectively:
    HS::STLSerializeClassTypeDynamic<(bCallReserve)>(
        pCurNode, (list_name), archive, (#list_name))
    HS::STLSerializeMapClassTypeDynamic(
        pCurNode, (list_name), archive, (#list_name))
How it Works
As you can see from the examples above, you only have to utilise the macros provided. This saves a lot of typing and aggravation, and makes your code look neater and simpler.
All the XML serialisation classes are in the namespace HS (short for Hicrest Systems).
You only have to concern yourself with the macros, but documentation is provided for all the other classes and templated functions. You may possibly need to extend CHsVariant if you have a type that cannot be converted to a variant.
Macros
Please refer to the Word documentation for the macros; there are too many to list here.
CHsVariant
This class is a replacement for the _variant_t class. It adds extra functionality not present in _variant_t. It should be noted that the destructor for _variant_t is not virtual, which could cause problems when a derived object is destroyed through a _variant_t pointer. Therefore, replacing _variant_t with this class seemed logical.
I have added extra functionality to convert int, CString and std::string into the relevant variant types. You may also want to add extra functionality to this class to help convert your types.
It will convert the following types for you:
    short, long, float, double, CY, _bstr_t, wchar_t, char*, IDispatch*, bool, IUnknown, DECIMAL, BYTE, int, CString, std::string
When loading an XML file, CHsVariant is called upon to set the variable. The function that does this is called GetValue():
    template <typename T>
    void GetValue(T *pVar) const throw(_com_error) // Sets variable passed in pVar
    {
        *pVar = *this; // T cannot be const as it is being set here
    }
You will see GetValue() being used within CXmlArchive during a load. If you experience a compiler error at the *pVar = *this assignment, then you passed in a const variable. This is OK for saving, but not for loading. As the same function is used for saving and loading, a non-const variable should be passed.
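The "same function for saving and loading" point is easier to see in miniature. The sketch below uses a hypothetical CDemoArchive that keeps values as text in a map; it is not HS::CXmlArchive or CHsVariant, and SetState exists only so the demo can reuse one archive object. It does show why the variable handed to an element call has to be non-const.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical miniature archive, not HS::CXmlArchive. Values are stored as
// text so that one Element() call can either save or load.
class CDemoArchive
{
public:
    enum eState { save, load };

    explicit CDemoArchive(eState state) : m_eState(state) {}

    // Only here so the demo can switch one object from saving to loading.
    void SetState(eState state) { m_eState = state; }

    // T must be non-const: in the load state the variable is written to.
    template <typename T>
    void Element(const std::string &name, T &var)
    {
        if (m_eState == save)
        {
            std::ostringstream out;
            out << var;
            m_values[name] = out.str();
        }
        else
        {
            std::istringstream in(m_values[name]);
            in >> var;   // would not compile if T were const
        }
    }

private:
    eState m_eState;
    std::map<std::string, std::string> m_values;
};

struct CDemoPrice
{
    int m_cost;
    int m_quantity;

    // The same function body serves both directions.
    void Serialize(CDemoArchive &archive)
    {
        archive.Element("m_cost", m_cost);
        archive.Element("m_quantity", m_quantity);
    }
};
```

Because both branches live in one template, the load branch is compiled even when you only ever save, which is why a const variable is rejected at compile time rather than at run time.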
CXmlArchive
This class is the heart of the XML serialisation mechanism; all other classes and macros are peripheral to this class.
Member variables
    public:
        static enum eState { save, load };
    private:
        // Main document pointer
        MSXML::IXMLDOMDocumentPtr m_pDom;
        // True if all is ok
        bool m_bIsOk;
        // True if to compress the XML data during save/load.
        bool m_bCompress;
        // Saving or loading
        const eState m_eState;
        // Filename to save/load
        const _variant_t m_sFileName;
        // Error string if there is an error
        std::string m_sError;
- eState – This is an enumeration for the saving or loading state. It is public so that you can pass the state to the constructor of CXmlArchive to specify whether you are loading or saving. This is the only member you have to concern yourself with, as it is the only public one.
- m_pDom – This is the main XML document pointer, and points to an instance of the IXMLDOMDocument interface.
- m_bIsOk – ‘true’ if m_pDom points to a successfully created instance of DOMDocument, ‘false’ otherwise.
- m_bCompress – ‘true’ if the XML data is to be compressed before saving or uncompressed during loading. ‘false’ if the raw XML text is to be used.
- m_eState – The saving/loading state as passed in by you to the CXmlArchive constructor. See the declaration of eState above for the valid values.
- m_sFileName – The filename to save or load as passed in by you to the CXmlArchive constructor.
- m_sError – If an error occurs, then this string holds the error information.
Please refer to the Word documentation for the full documentation for this class.
Global stuff
STLSerializeVariantType
    template <bool bCallReserve, class T>
    bool STLSerializeVariantType(
        MSXML::IXMLDOMNodePtr pCurNode, T &lst,
        CXmlArchive &archive, const char *name)
Save or load an STL collection that contains CHsVariant compatible types. The parameters are as follows:
- bCallReserve – ‘true’ if reserve should be called on the STL container during a load to pre-allocate memory.
- pCurNode – The current DOM node.
- lst – This is the STL container.
- archive – The CXmlArchive class.
- name – The name of the STL container.
Please also see the CReserve template described below, as this is used within this function.
STLSerializeClassType
    template <bool bCallReserve, class T>
    bool STLSerializeClassType(
        MSXML::IXMLDOMNodePtr pCurNode, T &lst,
        CXmlArchive &archive, const char *name)
Save or load an STL collection that contains classes that declare DECLARE_XML_SERIAL. The parameters are as follows:
- bCallReserve – ‘true’ if reserve should be called on the STL container during a load to pre-allocate memory.
- pCurNode – The current DOM node.
- lst – This is the STL container.
- archive – The CXmlArchive class.
- name – The name of the STL container.
Please also see the CReserve template described below, as this is used within this function.
CReserve
This template was created to facilitate calling reserve on a vector to pre-allocate memory. Simply passing ‘true’ or ‘false’ as a run-time flag to STLSerializeVariantType or STLSerializeClassType would cause a compiler error for any STL container that does not provide reserve. For example, the following would not compile when passing in a container, such as std::list, that has no reserve function:
    if(bCallReserve)
        lst.reserve(nCount);
This is because the call to lst.reserve() has to be compiled whether bCallReserve is true or not. The way round this problem is a clever technique that selects an overloaded function at compile time using Int2Type. The code for CReserve is as follows:
    // From Loki's TypeManip.h
    namespace Loki
    {
        template <int v>
        struct Int2Type
        {
            enum { value = v };
        };
    }

    template <bool b, class T>
    class CReserve
    {
    private:
        static void reserve(T &lst, const int &n, Loki::Int2Type<true>)
        {
            lst.reserve(n);
        }
        static void reserve(T &lst, const int &n, Loki::Int2Type<false>)
        {
            (void)lst;
            (void)n;
        }
    public:
        static void reserve(T &lst, const int &n)
        {
            reserve(lst, n, Loki::Int2Type<b>());
        }
    };
STLSerializeVariantType has a template parameter bCallReserve, which is a bool. This is ‘true’ if reserve should be called on the STL container, ‘false’ to not even compile a call to reserve. CReserve is used as follows:
    CReserve<bCallReserve, T>::reserve(lst, nCount);
An instance of CReserve is not necessary as all the functions are static, hence we can call straight into the public reserve function. This public function calls one of the two private reserve functions depending on the template parameter b. The two private functions are overloaded on Int2Type<true> and Int2Type<false>, and because member functions of a class template are only instantiated when they are used, the compiler only compiles the one that is required.
For example:
- If we have a std::vector, we want to pass ‘true’ to call reserve on the container before populating it.
    std::vector<int> intList;
    … // Populate intList
    SERIALIZE_XML_STL_VARIANT(intList, true);
  This expands to:
    std::vector<int> intList;
    … // Populate intList
    HS::STLSerializeVariantType<true>(pCurNode, intList, archive, "intList");
  CReserve then compiles as:
    template <bool b, class T>
    class CReserve
    {
    private:
        static void reserve(T &lst, const int &n, Loki::Int2Type<true>)
        {
            lst.reserve(n);
        }
    public:
        static void reserve(T &lst, const int &n)
        {
            reserve(lst, n, Loki::Int2Type<b>());
        }
    };
  So the call to:
    CReserve<true, T>::reserve(lst, nCount);
  will call:
    intList.reserve(nCount);
- If we have a std::list, we want to pass ‘false’ so as not to call reserve on the container, as this function does not exist.
    std::list<int> intList;
    … // Populate intList
    SERIALIZE_XML_STL_VARIANT(intList, false);
  This expands to:
    std::list<int> intList;
    … // Populate intList
    HS::STLSerializeVariantType<false>(pCurNode, intList, archive, "intList");
  CReserve then compiles as:
    template <bool b, class T>
    class CReserve
    {
    private:
        static void reserve(T &lst, const int &n, Loki::Int2Type<false>)
        {
            (void)lst;
            (void)n;
        }
    public:
        static void reserve(T &lst, const int &n)
        {
            reserve(lst, n, Loki::Int2Type<b>());
        }
    };
  So the call to:
    CReserve<false, T>::reserve(lst, nCount);
  will call:
    (void)lst; (void)n;
  This does nothing. It just pretends to use the lst and n parameters so as not to get a compiler warning at warning level 4.
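If you want to try the technique in isolation, here is a self-contained sketch that builds without Loki or the HS headers. DemoInt2Type, CDemoReserve and DemoLoad are names invented for this illustration; DemoLoad only imitates the "reserve, then populate" shape of a load and is not the library's STLSerializeVariantType.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Local copy of the idea behind Loki's Int2Type: each constant is its own type.
template <int v>
struct DemoInt2Type
{
    enum { value = v };
};

template <bool b, class T>
class CDemoReserve
{
private:
    // Selected when b is true: the container must provide reserve().
    static void reserve(T &lst, std::size_t n, DemoInt2Type<true>)
    {
        lst.reserve(n);
    }

    // Selected when b is false: never mentions lst.reserve().
    static void reserve(T &, std::size_t, DemoInt2Type<false>)
    {
    }

public:
    static void reserve(T &lst, std::size_t n)
    {
        // Overload resolution on the tag type picks one private function;
        // the other one is never instantiated for this T.
        reserve(lst, n, DemoInt2Type<b>());
    }
};

// Hypothetical helper imitating a load: optionally reserve, then fill.
template <bool bCallReserve, class T>
void DemoLoad(T &lst, std::size_t nCount)
{
    CDemoReserve<bCallReserve, T>::reserve(lst, nCount);
    for (std::size_t i = 0; i < nCount; ++i)
        lst.push_back(static_cast<typename T::value_type>(i));
}
```

DemoLoad<true>(someVector, n) reserves before filling, while DemoLoad<false>(someList, n) compiles cleanly even though std::list has no reserve. Writing DemoLoad<true>(someList, n) fails to compile, which is exactly the mistake the flag is meant to expose.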
Int2Type
Just a quick note about the Int2Type template. It converts each integral constant into a unique type. Invocation: Int2Type<V>, where V is a compile-time integral constant. It defines 'value', an enum that evaluates to V.
This class was designed by Andrei Alexandrescu, who wrote "Modern C++ Design: Generic Programming and Design Patterns Applied", published by Addison-Wesley. The Loki library is free but undocumented, so you really need the book, which is well worth the price anyway. The book fully describes the Loki library and the design patterns behind it. Fundamentally, it demonstrates ‘generic patterns’ or ‘pattern templates’ as a powerful way of creating extensible designs in C++, combining templates and patterns in ways we may never have thought possible. If your work involves C++ design and coding, you should read this book. Highly recommended. Loki can be freely downloaded; just search for it.
CXmlCompression
Compresses and decompresses the XML data provided by CXmlArchive. Again, please refer to the Word documentation for the full documentation for this class.
CHsZlibFile
This class is a simple wrapper class for the gzFile zlib file handler functions. It makes sure that the file is closed upon destruction. Again, please refer to the Word documentation for full documentation for this class.
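The "closed upon destruction" idea is the usual RAII pattern. The sketch below shows it with std::FILE* rather than zlib's gzFile, purely so that it builds without zlib; CDemoFile is a hypothetical class and not the real CHsZlibFile. With zlib, the open and close calls would be gzopen and gzclose instead of fopen and fclose.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical RAII wrapper, shown with std::FILE* instead of zlib's gzFile.
class CDemoFile
{
public:
    CDemoFile(const char *name, const char *mode)
        : m_file(std::fopen(name, mode)) {}

    ~CDemoFile()
    {
        if (m_file)
            std::fclose(m_file);   // runs on every exit path from the scope
    }

    bool IsOpen() const { return m_file != 0; }

    // Writes the whole string; returns false on a short write or closed file.
    bool Write(const std::string &text)
    {
        return m_file != 0 &&
               std::fwrite(text.data(), 1, text.size(), m_file) == text.size();
    }

    // Reads from the current position to the end of the file.
    std::string ReadAll()
    {
        std::string result;
        char buffer[256];
        std::size_t n = 0;
        while (m_file != 0 &&
               (n = std::fread(buffer, 1, sizeof buffer, m_file)) > 0)
            result.append(buffer, n);
        return result;
    }

private:
    CDemoFile(const CDemoFile &);             // non-copyable: one owner per handle
    CDemoFile &operator=(const CDemoFile &);

    std::FILE *m_file;
};
```

Copying is disabled so that two objects can never close the same handle. The caller simply lets the object go out of scope instead of remembering to close the file on every return path.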
Files
Here is a list of the XML serialisation files, in alphabetical order:
Filename | Brief description | Usage |
HsAssert.h | Provides ASSERT capability. Will either use MFC's ASSERT if defined, or otherwise uses assert. Also declares TRUE, FALSE, and NULL, if not already defined. | All |
HsBuffer.h and .cpp | Provides a buffer that will automatically be de-allocated during destruction. | Compression |
HsCreationPolicy.h | Not used. Provides the following creation templates: CreateUsingNew, CreateUsingMalloc, CreateStatic. | None |
HsVariant.h and .cpp | Extends _variant_t for extra types. | All |
HsXmlArchive.h and .cpp | Provides an XML serialisation mechanism for classes and STL containers. Also supports MFC. | All |
HsXmlCompression.h and .cpp | Compresses and decompresses the XML data provided by CXmlArchive. | Compression |
HsZlibFile.h and .cpp | This class is a simple wrapper class for the gzFile zlib file handler functions. It makes sure that the file is closed upon destruction. | Compression |
TypeManip.h | Part of the Loki library written by Andrei Alexandrescu. Provides the class template Int2Type. | All |
Testing
There are several test programs supplied with the XML serialisation classes. These are:
- MFC demo – Saves and loads MFC objects and data. It also saves classes that contain STL data.
- Non-MFC demo – Saves and loads a simple class.
- STL demo – Saves and loads a std::vector and a std::list which contain classes and std::strings.
- Compression simple – Similar to the non-MFC demo, except the XML is compressed using zlib.
- Compression complex – Similar to the MFC demo, except the XML is compressed using zlib.
The demo classes all perform the same way in order to validate and test the XML serialisation mechanism. They each:
- Initialise the class data to be saved.
- Save the XML output to "objects.xml". This XML output is also retrieved and stored in the xml_saved string for validation purposes.
- Destroy the class data.
- Read in the XML file named "objects.xml" and process it.
- Save the XML output again, this time without supplying an output filename. This XML output is also retrieved and stored in the xml_loaded string for validation purposes.
- Compare the xml_saved and xml_loaded strings. They should be identical.
Please note that the output filename for the compression examples is "objects.gz".
Miscellaneous
Currently, the creation policies are not used; they can be found in HsCreationPolicy.h.
If you have any comments or improvements, please send them to me: Simon Hughes.
Revision Log
Date | By | Ver. | Description |
21 Aug 2001 | SJH | 1.0 | First release. |
6 Sep 2001 | SJH | 1.1 | Added ZLIB compression. |
27 May 2004 | SJH | 1.2 | Coloured the C++ code, and added instructions relating to VC 7 project settings. |
Sorry it took so long to publish.