• The Journal consists of a rolling log of messages and commands (such as transactional boundaries and message deletions) stored in data files of a certain length. When the maximum length of the currently used data file has been reached, a new data file is created. All the messages in a data file are reference counted, so that once every message in that data file is no longer required, the data file can be removed or archived. The journal only appends messages to the end of the current data file, so storage is very fast.
• The Cache holds messages in memory for fast retrieval after they have been written to the journal. The cache periodically updates the reference store with its current message IDs and the locations of the messages in the journal. This process is known as performing a checkpoint. Once the reference store has been updated, messages can be safely removed from the cache. The period of time between the cache updates to the reference store is configurable and can be set by the checkpointInterval property. A checkpoint will also occur if the ActiveMQ message broker is reaching its memory limit.
• The Reference Store holds references to the messages in the Journal, indexed by their message ID. It is the reference store that maintains the FIFO data structure for queues and the durable subscriber pointers to their topic messages. The index type is configurable, the default being a disk-based hash index. It is also possible to use an in-memory hash map, but this is only recommended if the total number of messages expected to be held in the message store is less than 1 million at any given time.
5.2.1.2. The AMQ Message Store Directory Structure
When you start ActiveMQ with the AMQ message store, a directory is automatically created in which the persistent messages are stored. The AMQ message store directory contains sub-directories for all the brokers that are running on the machine. It is for this reason that it is strongly recommended that each broker use a unique name. In the default configuration for ActiveMQ, the broker name is localhost, which needs to be changed to something unique. This directory structure is represented in Figure 5.4, the AMQ Store directory structure.
Inside of the directory for a given broker, the following items will be found:
• The data directory contains the indexes and collections used to reference
the messages held in the journal. This data directory is deleted and rebuilt as
part of recovery, if the broker has not shut down cleanly. You can force
recovery by manually deleting this directory before starting the broker.
• The state directory holds information about durable topic consumers. The journal itself does not hold information about consumers, so when it is
recovered it has to retrieve information about the durable subscribers first to
accurately rebuild its database.
• A lock file to ensure only one broker can access this data at any given time.
The lock is commonly used for hot stand-by purposes where more than one
broker with the same name will exist on the same system yet only one broker
will be able to acquire the lock and start up, while the other broker(s) remain
in stand-by mode.
• A temp-storage directory is used for storing non-persistent messages that
can no longer be stored in broker memory. These messages are typically
awaiting delivery to a slow consumer.
• The kr-store is the directory structure used by the reference (index) part of
the AMQ message store. It uses the Kaha database by default (Kaha is part
of the ActiveMQ core library) to index and store references to messages in
the journal. The kr-store has two distinct parts: the data directory and the
state directory, both described above.
• The journal directory contains the data files for the journal, and a
data-control file which holds some meta information. The data files are
reference counted, so when all the contained messages are delivered, the data
file can be deleted or archived.
• The archive directory exists only if archiving is enabled. Its default
location can be found next to the journal. It makes sense, however, to use a
separate partition or disk. The archive is used to store data logs from the
journal directory which are moved here instead of being deleted. This makes
it possible to replay messages from the archive at a later point. To replay
messages, move the archived data logs (or a subset) to a new journal
directory and start a new broker pointed to the location of this directory. It
will automatically replay the data logs in the journal.
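The reference counting of journal data files described above can be sketched
in a few lines of Java. This is a minimal in-memory model, not ActiveMQ's
actual implementation: the class name RollingJournalSketch, its methods, and
the archived and deleted lists are all hypothetical, and data files are
represented only by integer ids rather than real files on disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; these are not ActiveMQ classes. */
final class RollingJournalSketch {
    private final int maxFileLength;       // maximum bytes per data file
    private final boolean archiveEnabled;  // archive finished files instead of deleting them
    private int currentFileId = 1;
    private int currentFileLength = 0;
    // data file id -> number of messages in that file that are still required
    private final Map<Integer, Integer> refCounts = new HashMap<>();
    final List<Integer> deleted = new ArrayList<>();
    final List<Integer> archived = new ArrayList<>();

    RollingJournalSketch(int maxFileLength, boolean archiveEnabled) {
        this.maxFileLength = maxFileLength;
        this.archiveEnabled = archiveEnabled;
    }

    /** Appends a message to the current data file and returns that file's id. */
    int append(byte[] message) {
        // Roll over to a new data file when the current one would exceed its maximum length.
        if (currentFileLength > 0 && currentFileLength + message.length > maxFileLength) {
            int full = currentFileId;
            currentFileId++;
            currentFileLength = 0;
            reclaimIfUnused(full);
        }
        currentFileLength += message.length;
        refCounts.merge(currentFileId, 1, Integer::sum);
        return currentFileId;
    }

    /** Marks one message in the given data file as no longer required. */
    void release(int fileId) {
        Integer count = refCounts.get(fileId);
        if (count == null || count == 0) {
            return; // unknown file or nothing left to release
        }
        refCounts.put(fileId, count - 1);
        reclaimIfUnused(fileId);
    }

    private void reclaimIfUnused(int fileId) {
        if (fileId == currentFileId) {
            return; // the file still being appended to is never reclaimed
        }
        Integer count = refCounts.get(fileId);
        if (count != null && count == 0) {
            refCounts.remove(fileId);
            (archiveEnabled ? archived : deleted).add(fileId);
        }
    }
}
```

In this model a data file is reclaimed only after the journal has rolled past
it and every message it contains has been released. Whether the file ends up
in the archived or the deleted list depends solely on the archiveEnabled flag,
mirroring the choice between moving data logs to the archive directory and
removing them.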
Now that the basics of the AMQ message store have been covered, the next step is
to review its configuration.
The KahaDB message store is a new message store that has been developed to
address some of the limitations in the AMQ message store. The AMQ message
store uses two separate files for every index (there is one index per destination) and
recovery can be slow if the ActiveMQ broker is not shut down cleanly. The reason
for this is that all the indexes need to be rebuilt, which requires the broker to
traverse all its message logs.
To overcome these limitations, the KahaDB message store uses a transactional log
for its indexes and only uses one index file for all its destinations. It has been used
in production environments with 10,000 active connections, each connection
having a separate queue.
The main components of KahaDB are very similar to those of the AMQ message
store, namely:
• A cache
• Reference Indexes
• A message journal
All index file updates are also recorded in a redo log. This ensures that the indexes
can be brought back to a consistent state. In addition, KahaDB uses a B-Tree
layout for storage, as opposed to the AMQ message store, which keeps all of its
indexes in a persistent hash table. Because hash indexes have to be resized from
time to time, KahaDB has more consistent performance characteristics.
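The idea of recording every index update in a redo log can be illustrated
with a short Java sketch. It makes several simplifying assumptions that are
not taken from KahaDB itself: the class RedoLoggedIndexSketch and its methods
are hypothetical, an in-memory list stands in for a durable log on disk, and
java.util.TreeMap (a red-black tree, not a B-Tree) stands in for the sorted
index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustrative model only; these are not KahaDB classes. */
final class RedoLoggedIndexSketch {
    /** One logged index update: either a put or a removal. */
    private static final class Update {
        final String messageId;
        final long location;
        final boolean removal;

        Update(String messageId, long location, boolean removal) {
            this.messageId = messageId;
            this.location = location;
            this.removal = removal;
        }
    }

    // Stand-in for a durable redo log; a real store would write this to disk.
    private final List<Update> redoLog = new ArrayList<>();
    // Sorted map standing in for the B-Tree index (message id -> journal location).
    private TreeMap<String, Long> index = new TreeMap<>();

    void put(String messageId, long location) {
        redoLog.add(new Update(messageId, location, false)); // record the update first
        index.put(messageId, location);                      // then apply it to the index
    }

    void remove(String messageId) {
        redoLog.add(new Update(messageId, -1L, true));
        index.remove(messageId);
    }

    Long lookup(String messageId) {
        return index.get(messageId);
    }

    /** Simulates an unclean shutdown in which the index contents are lost. */
    void loseIndex() {
        index = new TreeMap<>();
    }

    /** Rebuilds the index by replaying the redo log in order. */
    void recover() {
        TreeMap<String, Long> rebuilt = new TreeMap<>();
        for (Update u : redoLog) {
            if (u.removal) {
                rebuilt.remove(u.messageId);
            } else {
                rebuilt.put(u.messageId, u.location);
            }
        }
        index = rebuilt;
    }
}
```

Recovery in this sketch replays only the logged index updates, so it never
has to scan the messages themselves. That is the contrast with the AMQ
message store drawn above, where rebuilding the indexes requires traversing
all of the message logs.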
The difference in architecture is depicted in Figure 5.5 below:
The KahaDB store is very straightforward to use, as its minimal configuration
demonstrates.