GemFire Tutorial

http://community.gemstone.com/display/gemfire/GemFire+Tutorial


In this tutorial you will learn about the basic features of GemFire Enterprise by walking through the code of a rudimentary social networking application built on GemFire. The application demonstrates how GemFire lets you kill VMs without loss of service, dynamically add more storage while the application is running, and get low-latency access to your data.

Note: The tutorial demonstrates how to use these features with GemFire's Java API. If you plan to use C++ or C#, you may prefer to get started with the Native Client Documentation.

 

Tutorial Overview

 

Main concepts

The tutorial covers the following concepts.

GemFire Distributed system

VMs running GemFire form a distributed system. Each VM is referred to as a GemFire peer. To start a GemFire peer, you create a GemFire cache in each VM. The cache manages the connectivity to other GemFire peers. Peers discover each other through multicast messaging or a TCP location service.

Regions

Regions are an abstraction on top of the distributed system. A region lets you store data in many VMs in the system without regard to which peer the data is stored on. Regions give you a map interface that transparently fetches your data from the appropriate VM. The Region interface extends the java.util.Map interface, but it also supports querying and transactions.

Replicated regions

A replicated region keeps a full copy of the region on each GemFire peer.

Partitioned regions

Partitioned regions allow you to configure the number of copies of data in your distributed system. GemFire partitions your data so that each peer only stores a part of the region contents.

Client caching

A GemFire distributed system is a mesh network, where all peers are connected directly to all other peers. GemFire also supports clients of the distributed system, which are connected only to a few of the peers in the distributed system. Clients can have their own local cache of data that is in the server distributed system, and they can update their local cache by registering with the server to receive changes to the data. GemFire also provides C++ and C# client APIs.

Shared-nothing persistence

GemFire supports shared-nothing persistence, where each peer persists its partition of the data set to its local disk. GemFire persistence also allows you to maintain a configurable number of copies of your data on disk.

 

Requirements

In this tutorial it is assumed that you are familiar with programming and running Java applications. Running the examples in this tutorial requires Java 1.5 or greater.

 

Sample application

The tutorial shows a sample social networking application (download) that stores people's profiles and posts. Each person has a profile that lists their friends, and each person can write posts.

To store this data the application uses two regions: one holds people and the other holds posts.

 

People region

              key                  value
Type          String               com.gemstone.gemfire.tutorial.model.Profile
Description   The person's name    The profile of this person

 

Posts region

              key                                           value
Type          com.gemstone.gemfire.tutorial.model.PostID    String
Description   A unique ID for this post                     The text of the post

The Profile class holds a set of friends belonging to one person. A PostID is an author's name and a timestamp. The author's name should match a key (the person's name) in the people region, but that constraint is not enforced in this simple application.

A simple command line UI enables you to add people and posts to the system.

 

Setting Up the Environment

If you have not already done so, install and configure GemFire Enterprise and the GemFire product examples.

You will need several terminal sessions to run the tutorial.

 

Distributed System and Regions

 

1. Start a locator

VMs running GemFire discover each other through multicast messaging or a TCP location service, which is called a locator. The locator runs as a separate process; each new VM connects to it first to discover the list of available peers. For this example, we'll use a locator.

Figure: Peer discovery using a locator

Start a locator:

$ gemfire start-locator -port=55221

The locator process runs in the background, listening for connections on port 55221. To stop the process, you can use "gemfire stop-locator", but don't stop it yet.

 

2. Walk through creating a cache

You will store the data on several GemFire peers. The first step to starting up a GemFire peer is to create a cache. The cache is the central component of GemFire. It manages connections to other GemFire peers.

In a text editor or your favorite IDE, open the GemfireDAO class in the sample application code. Look at the initPeer() method. The first thing this method does is create a cache:

Cache cache = new CacheFactory()
    .set("locators", "localhost[55221]")
    .set("mcast-port", "0")
    .set("log-level", "error")
    .create();

The GemFire "locators" property tells the cache which locators to use to discover other GemFire VMs. The mcast-port property tells GemFire not to use multicast discovery to find peers. The log-level property controls the log level of GemFire's internal logging. Here it is set to "error" to limit the amount of messaging that will show up in the console. After the call to create finishes, this peer has discovered other peers and connected to them.

 

3. Start peers

You already have one window open where you started the locator. Start another terminal window. In each window, run the Peer application:

$ java com.gemstone.gemfire.tutorial.Peer

The peers start up and connect to each other.

 

4. Create the People region (a replicated region)

GemFire regions are key-value collections. They extend the java.util.concurrent.ConcurrentMap interface. The simplest type of region is a replicated region. Every peer that hosts a replicated region stores a copy of the entire region locally. Changes to the replicated region are sent synchronously to all peers that host the region.

a. Look at the initPeer method in GemfireDAO. To create a region, you use a RegionFactory. Here's where the people region is created:

people = cache.<String, Profile>createRegionFactory(REPLICATE)
    .addCacheListener(listener)
    .create("people");

In this case, the people region is constructed with RegionShortcut.REPLICATE, which tells the factory to start with the configuration for a replicated region. This method adds a cache listener to the region. You can use a cache listener to receive notifications when the data in the region changes. This sample application includes a LoggingCacheListener class, which prints changes to the region to System.out and enables you to see how entries are distributed.

b. Take a look at the addPerson method. It adds the entry to the region by calling the put method of Region.

public void addPerson(String name, Profile profile) {
  people.put(name, profile);
}

Calling put on the people region distributes the person to all other peers that host that region. After this call completes, each peer will have a copy of this person.

c. Add people. In one of your terminal windows, type:

person Isabella
person Ethan

You will see the users show up in the other window:

In region people created key Isabella value Profile [friends=[]]
In region people created key Ethan value Profile [friends=[]]
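
The behavior in step 4 can be pictured with a small stand-alone model. The sketch below does not use the GemFire API: ReplicatedRegionSketch, Peer, and its put method are hypothetical names, values are plain strings rather than Profile objects, and the network hop is replaced by a loop. It only illustrates that a put completes after every peer holds the entry and that a listener on each receiving peer is notified.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Stand-alone model of a replicated region. None of these types are GemFire classes.
final class ReplicatedRegionSketch {

    // One simulated peer: a full local copy of the region plus an optional listener.
    static final class Peer {
        final String name;
        final Map<String, String> localCopy = new LinkedHashMap<>();
        final BiConsumer<String, String> listener; // may be null

        Peer(String name, BiConsumer<String, String> listener) {
            this.name = name;
            this.listener = listener;
        }
    }

    private final List<Peer> peers = new ArrayList<>();

    void addPeer(Peer peer) {
        peers.add(peer);
    }

    // Stores the entry on every peer before returning, which stands in for
    // the synchronous distribution of a replicated region.
    void put(String key, String value) {
        for (Peer peer : peers) {
            boolean created = !peer.localCopy.containsKey(key);
            peer.localCopy.put(key, value);
            if (created && peer.listener != null) {
                peer.listener.accept(key, value);
            }
        }
    }

    public static void main(String[] args) {
        ReplicatedRegionSketch people = new ReplicatedRegionSketch();
        for (String name : new String[] {"peer1", "peer2"}) {
            people.addPeer(new Peer(name, (key, value) ->
                System.out.println(name + ": created key " + key + " value " + value)));
        }
        people.put("Isabella", "Profile [friends=[]]");
        people.put("Ethan", "Profile [friends=[]]");
    }
}
```

Because every peer keeps the whole map, any peer can answer a read locally. The cost is that storage grows with the number of peers, which is the motivation for the partitioned region in the next step.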

 

5. Create the Posts region (partitioned region)

You expect to create a lot of posts. Because of that, you don't want to host a copy of the posts on every server. Thus you store them in a partitioned region. The API to use the partitioned region is the same as a replicated region, but the data is stored differently. A partitioned region lets you control how many copies of your data will exist in the distributed system. The data is partitioned over all of the peers that host the partitioned region. GemFire automatically maintains the number of copies of each partition that you request.

Look at the initPeer method of GemfireDAO. You can create partitioned regions with the PARTITION_XXX shortcuts. To create the posts region, the method uses the PARTITION_REDUNDANT shortcut to tell GemFire to create a partitioned region that keeps one primary and one redundant copy of each post on different machines.

posts = cache.<PostID, String>createRegionFactory(PARTITION_REDUNDANT)
    .addCacheListener(listener)
    .create("posts");

 

6. Start up another terminal window and launch the peer application in that window

You should have three peers running now. Each post you create will be stored in only two of these peers.

 

7. Add some posts

> post Isabella I like toast
> post Ethan Waaa!
> post Ethan Hello

You will see that the listener in only one of the VMs is invoked for each post. That's because partitioned regions choose one of the copies of the post to be the primary copy. By default GemFire only invokes the listener in the peer that holds the primary copy of each post.
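
One way to picture the primary and redundant copies is the simplified placement model below. It is only a sketch under stated assumptions: PartitionSketch and its methods are hypothetical, the bucket count of 113 is an arbitrary value chosen for the example, and GemFire's real placement logic is more involved than a modulo over the peer count. The point is that each key maps to exactly two different peers, and only the primary would invoke its listener.

```java
// Simplified placement model for a partitioned region with one redundant copy.
// GemFire's real bucket placement is more involved; this only shows the idea.
final class PartitionSketch {
    private final int bucketCount; // assumed fixed number of buckets
    private final int peerCount;

    PartitionSketch(int bucketCount, int peerCount) {
        if (bucketCount < 1 || peerCount < 2) {
            throw new IllegalArgumentException("need at least 1 bucket and 2 peers");
        }
        this.bucketCount = bucketCount;
        this.peerCount = peerCount;
    }

    // Keys are spread over buckets by hash code.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // The peer holding the primary copy; only this peer's listener would fire.
    int primaryPeer(Object key) {
        return bucketFor(key) % peerCount;
    }

    // The redundant copy always lands on a different peer than the primary.
    int redundantPeer(Object key) {
        return (primaryPeer(key) + 1) % peerCount;
    }

    public static void main(String[] args) {
        PartitionSketch posts = new PartitionSketch(113, 3);
        for (String key : new String[] {"Isabella@1", "Ethan@2", "Ethan@3"}) {
            System.out.println(key + " -> primary peer " + posts.primaryPeer(key)
                + ", redundant peer " + posts.redundantPeer(key));
        }
    }
}
```

With three peers, each key is stored on two of them and absent from the third, which matches the observation that each post lives on only two peers.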

 

8. List the available posts with the posts command

Try listing the posts from any window. You should be able to list all posts, because GemFire fetches them from the peer that hosts each post.

> posts
Ethan: Waaa!
Isabella: I like toast
Ethan: Hello

If you kill one of the VMs, you should still be able to list all posts. You can bring that VM back up and kill another one, and you should still see all of the posts.

 

9. Kill your peers before moving on to the next section

Type quit in each window.

 

Client/Server Caching

You have a fully working system now, but the UI code is running in the same VM in which you are storing data. That works well for some use cases, but it may be undesirable for others. For example, if you have a web server for the UI layer, you may want to increase or decrease the number of web servers without modifying your data servers. Or you may need to host only 100 GB of data, but have thousands of clients that might access it. For these use cases, it makes more sense to have dedicated GemFire servers and access the data through a GemFire client.

GemFire servers are GemFire peers, but they also listen on a separate port for connections from GemFire clients. GemFire clients connect to a limited number of these cache servers.

Like regular peers, GemFire servers still need to define the regions they will host. You could take the peer code you already have and create a CacheServer programmatically, but this example uses the cacheserver script that ships with GemFire. The cacheserver script reads the cache configuration from a cache XML file, which is a declarative way to define the regions in the cache.

 

1. Walk through configuring GemFire with XML

All of the GemFire configuration you can do in Java you can also do in XML. Look at the server.xml file in the xml directory. This file creates the same regions as the Java code in the GemfireDAO.initPeer method.

<?xml version="1.0"?>
<!DOCTYPE cache PUBLIC
    "-//GemStone Systems, Inc.//GemFire Declarative Caching 6.5//EN"
    "http://www.gemstone.com/dtd/cache6_5.dtd">
<cache>
  <region name="people" refid="REPLICATE">
    <region-attributes>
      <cache-listener>
        <class-name>com.gemstone.gemfire.tutorial.storage.LoggingCacheListener</class-name>
      </cache-listener>
    </region-attributes>
  </region>
  <region name="posts" refid="PARTITION_REDUNDANT">
    <region-attributes>
      <cache-listener>
        <class-name>com.gemstone.gemfire.tutorial.storage.LoggingCacheListener</class-name>
      </cache-listener>
    </region-attributes>
  </region>
</cache>

 

2. Start cache servers

The cacheserver script starts a GemFire peer that listens for client connections. Start two cache servers. You need to run these commands in the tutorial directory:

$ mkdir server1
$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/server.xml -server-port=0 -dir=server1
$ mkdir server2
$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/server.xml -server-port=0 -dir=server2

The cache servers should now be running. Here is what all those command line parameters mean.

Parameter        Description
locators         List of locator hosts and ports that are used to discover other cache servers.
mcast-port       Multicast port to use to discover other cache servers. 0 means don't use multicast.
cache-xml-file   Where to find the cache xml file to use.
server-port      Port to listen on for GemFire clients. 0 means the server will listen on an ephemeral port, which is a temporary port assigned to the process by the OS.
dir              The working directory of the server. Logs for the server are written to this directory.

 

3. Start a client

Starting a GemFire client is very similar to starting a GemFire peer. In the GemFire client, you create a ClientCache, which connects to the locator to discover servers. Look at the GemfireDAO.initClient method. The first thing the method does is create a ClientCache:

ClientCache cache = new ClientCacheFactory()
    .addPoolLocator("localhost", 55221)
    .setPoolSubscriptionEnabled(true)
    .setPoolSubscriptionRedundancy(1)
    .set("log-level", "error")
    .create();

Once you create a ClientCache, it maintains a pool of connections to the servers similar to a JDBC connection pool. However, with GemFire you do not need to retrieve connections from the pool and return them. That happens automatically as you perform operations on the regions. The pool locator property tells the client how to discover the servers. The client uses the same locator that the peers do to discover cache servers.

Setting the subscription enabled and subscription redundancy properties allows the client to subscribe to updates for entries that change in the server regions. You are going to subscribe to notifications about any people that are added. The updates are sent asynchronously to the client, so they need to be queued on the server side. The subscription redundancy setting controls how many copies of the queue are kept on the server side. Setting the redundancy level to 1 means that you can lose 1 server without missing any updates.

 

4. Create a proxy region, Post, in the client

Creating regions in the client is similar to creating regions in a peer. There are two main types of client regions, PROXY regions and CACHING_PROXY regions. PROXY regions store no data on the client. CACHING_PROXY regions allow the client to store entries locally. This example uses a lot of posts, so you won't cache any posts on the client. You can create a proxy region with the shortcut PROXY. Look at the GemfireDAO.initClient method. This method creates the posts region like this:

posts = cache.<PostID, String>createClientRegionFactory(PROXY)
    .create("posts");

 

5. Create a caching proxy region, People, in the client

You don't have that many people, so for this sample the client caches all people locally. First you create a region that has local storage enabled. You can create a region with local storage on the client with ClientRegionShortcut.CACHING_PROXY. In the initClient method, here's where the people region is created.

people = cache.<String, Profile>createClientRegionFactory(CACHING_PROXY)
    .addCacheListener(listener)
    .create("people");

 

6. Call the registerInterest method to subscribe to notifications from the server

By creating a CACHING_PROXY, you told GemFire to cache any people that you create from this client. However, you can also choose to receive any updates to the people region that happen from other peers or other clients by invoking the registerInterest methods on the client. In this case you want to register interest in all people, so you cache the entire people region on the client. The regular expression ".*" matches all keys in the people region. Look at the initClient method. The next line calls registerInterestRegex:

people.registerInterestRegex(".*");

When the registerInterestRegex method is invoked, the client downloads all existing people from the server. When a new person is added on the server it is pushed to the client.
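
To see which keys a given pattern would select, you can try the expression against sample keys with java.util.regex. This is a stand-alone sketch, not GemFire code: InterestRegexSketch and matchingKeys are hypothetical names, and it assumes the pattern must match the whole key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Stand-alone helper for experimenting with interest patterns; not part of GemFire.
final class InterestRegexSketch {

    // Returns the keys that the regular expression matches in full.
    static List<String> matchingKeys(List<String> keys, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (pattern.matcher(key).matches()) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("Isabella", "Ethan");
        System.out.println(matchingKeys(keys, ".*"));  // prints [Isabella, Ethan]
        System.out.println(matchingKeys(keys, "E.*")); // prints [Ethan]
    }
}
```

A narrower pattern such as "E.*" would limit the client's cached subset to matching keys, which may be useful when the region is too large to mirror completely.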

 

7. Iterate over keys from the client

Look at the getPeople and getPosts methods in GemfireDAO.

public Set<String> getPeople() {
  return people.keySet();
}

public Set<PostID> getPosts() {
  if (isClient) {
    return posts.keySetOnServer();
  } else {
    return posts.keySet();
  }
}

GemfireDAO.getPeople calls people.keySet(), whereas GemfireDAO.getPosts calls posts.keySetOnServer() for the client. That's because the keySet method returns the keys that are stored locally on the client. For the People region you can use your locally cached copy of the keys, but for the Post region you need to go to the server to get the list of keys.
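
The difference between the two calls can be modeled without GemFire. In the sketch below, ClientRegionSketch is a hypothetical class: a shared map stands in for the servers, and a boolean stands in for the PROXY versus CACHING_PROXY choice. Interest registration is deliberately left out, so a caching client only sees keys it wrote itself.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-alone model of PROXY versus CACHING_PROXY client regions; not GemFire code.
final class ClientRegionSketch {
    private final Map<String, String> server;              // shared "server" data
    private final Map<String, String> local = new HashMap<>();
    private final boolean cachingEnabled;                   // true models CACHING_PROXY

    ClientRegionSketch(Map<String, String> server, boolean cachingEnabled) {
        this.server = server;
        this.cachingEnabled = cachingEnabled;
    }

    // Every put goes to the server; only a caching region also keeps a local copy.
    void put(String key, String value) {
        server.put(key, value);
        if (cachingEnabled) {
            local.put(key, value);
        }
    }

    // Keys held on the client only.
    Set<String> keySet() {
        return new HashSet<>(local.keySet());
    }

    // Keys held on the server, fetched on demand.
    Set<String> keySetOnServer() {
        return new HashSet<>(server.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        ClientRegionSketch posts = new ClientRegionSketch(server, false);
        posts.put("Ethan@1", "Hello");
        System.out.println(posts.keySet());         // prints []
        System.out.println(posts.keySetOnServer()); // prints [Ethan@1]
    }
}
```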

 

8. Open a new terminal window and start the client application

$ java com.gemstone.gemfire.tutorial.Client

You should be able to add people and posts from the client. You can start another client and see that people are sent from one client to the other.

 

Adding and Stopping Cache Servers (Peers)

You can dynamically add peers to the system while it is running. The new peers are automatically discovered by the other peers and the client. The new peers automatically receive a copy of any replicated regions they create. However, partitioned regions do not automatically redistribute data to the new peer unless you explicitly instruct GemFire to rebalance the partitioned region. With the cacheserver script, you can pass a command line flag indicating that the new peer should trigger a rebalance of all partitioned regions.

 

Add a new peer

$ mkdir server3
$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/server.xml -server-port=0 -dir=server3 -rebalance

You can also rebalance the system in Java code with a RebalanceFactory. Use this method if you start your server without the cacheserver script.

 

Stop cache servers

1. Stop one of the cache servers:

$ cacheserver stop -dir=server1

Your data is still available and the client automatically ignores the dead server.

2. Stop the other cache servers before moving on to the next step. You can leave the client running.

$ cacheserver stop -dir=server2
$ cacheserver stop -dir=server3

 

Using Persistence

GemFire supports shared-nothing persistence. Each VM writes its portion of the region data to its own disk files. For the People region, each peer will be writing the entire region to its own disk files. For the Posts region a copy of each post will reside on two different peers.

 

1. Walk through configuring a persistent region

Look at the xml/persistent_server.xml file. It's exactly the same as the server.xml file, except that each region uses the XXX_PERSISTENT shortcut.

<?xml version="1.0"?>
<!DOCTYPE cache PUBLIC
    "-//GemStone Systems, Inc.//GemFire Declarative Caching 6.5//EN"
    "http://www.gemstone.com/dtd/cache6_5.dtd">
<cache>
  <region name="people" refid="REPLICATE_PERSISTENT">
    <region-attributes>
      <cache-listener>
        <class-name>com.gemstone.gemfire.tutorial.storage.LoggingCacheListener</class-name>
      </cache-listener>
    </region-attributes>
  </region>
  <region name="posts" refid="PARTITION_REDUNDANT_PERSISTENT">
    <region-attributes>
      <cache-listener>
        <class-name>com.gemstone.gemfire.tutorial.storage.LoggingCacheListener</class-name>
      </cache-listener>
    </region-attributes>
  </region>
</cache>

 

2. Start two servers using this persistent_server.xml

$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server1
$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server2

 

3. Add some posts from the client

 

4. Kill both servers and restart them to make sure the posts are recovered

$ cacheserver stop -dir=server1
$ cacheserver stop -dir=server2

When you restart persistent members, you need to call cacheserver start in parallel for each server.

The reason is that GemFire ensures that your complete data set is recovered before allowing the VMs to start. Remember that each VM is persisting only part of the Posts region. Each GemFire VM waits until all of the posts are available before starting. This protects you from seeing an incomplete view of the posts region.

 

Unix

$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server1 &
$ cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server2 &

 

Windows

start /B cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server1
start /B cacheserver start locators=localhost[55221] mcast-port=0 cache-xml-file=../xml/persistent_server.xml -server-port=0 -dir=server2

 

 

Other Features

Here are some other features of GemFire that you can use to expand the example.

 

Faster serialization with GemFire serialization

In this example, the PostID and Profile classes implement java.io.Serializable. Java serialization works, but it is inefficient in a few ways: it uses reflection to find the fields to serialize, and serializing an object writes the entire class name and the field names to the output stream. GemFire provides a more efficient serialization mechanism that you can use to improve the performance of serialization in your application. See the javadocs for DataSerializable for more information. In addition to improving performance, using GemFire serialization makes it easier to share data between Java, C++, and C# clients.
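
The size difference is easy to observe with the standard library alone. The sketch below is an illustration, not the tutorial's code: PostIdSketch is a hypothetical stand-in for PostID, its field names are assumptions, and its toData and fromData methods only mirror the shape of the DataSerializable methods without implementing the GemFire interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the tutorial's PostID; the field names are assumptions.
final class PostIdSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    String author;
    long timestamp;

    PostIdSketch() { }  // lets fromData fill in an empty instance

    PostIdSketch(String author, long timestamp) {
        this.author = author;
        this.timestamp = timestamp;
    }

    // Writes only the field values, with no class or field names.
    void toData(DataOutput out) throws IOException {
        out.writeUTF(author);
        out.writeLong(timestamp);
    }

    // Reads the fields back in the same order they were written.
    void fromData(DataInput in) throws IOException {
        author = in.readUTF();
        timestamp = in.readLong();
    }

    static byte[] explicitBytes(PostIdSketch id) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            id.toData(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] javaSerializedBytes(PostIdSketch id) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static PostIdSketch read(byte[] data) {
        PostIdSketch id = new PostIdSketch();
        try {
            id.fromData(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return id;
    }

    public static void main(String[] args) {
        PostIdSketch id = new PostIdSketch("Ethan", 42L);
        System.out.println("explicit: " + explicitBytes(id).length + " bytes");
        System.out.println("java.io.Serializable: " + javaSerializedBytes(id).length + " bytes");
    }
}
```

The explicit form for this value is 15 bytes (a 2-byte length, 5 bytes of text, and an 8-byte long), while the Java-serialized form also carries the class descriptor and is several times larger.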

 

Locator redundancy

In this example we used only a single locator, which is a single point of failure for the application. In a production scenario you should use at least two locators. The locators property accepts a comma-separated list of locators.

 

Executing queries

Suppose you want to show all people who have listed a particular person as a friend. GemFire supports querying region contents using OQL, which stands for Object Query Language. See the Querying section in the manual and the javadocs for the query package for more information. To find all people who have listed Ethan as a friend:

select p.key from /people.entrySet p, p.value.friends f where f='Ethan'
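
To make clear what this query computes, here is the same selection written as plain Java over an in-memory map. It is a sketch for comparison only: FriendQuerySketch is a hypothetical class, a Map of names to friend sets stands in for the people region, and nothing here goes through GemFire's query engine.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java equivalent of the OQL selection; not GemFire code.
final class FriendQuerySketch {

    // Returns the keys (names) of everyone whose friend set contains the given name.
    static List<String> peopleWithFriend(Map<String, Set<String>> people, String friend) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : people.entrySet()) {
            if (entry.getValue().contains(friend)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> people = new LinkedHashMap<>();
        Set<String> isabellasFriends = new HashSet<>();
        isabellasFriends.add("Ethan");
        people.put("Isabella", isabellasFriends);
        people.put("Ethan", new HashSet<>());
        System.out.println(peopleWithFriend(people, "Ethan")); // prints [Isabella]
    }
}
```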

 

Continuous queries

In this example the client registered interest in updates to the people region using a regular expression. You can also register for updates using an OQL query. For example, you could register interest in all of the posts from a particular user. See Continuous Querying for more information.

 

Grid programming model

By default, posts are assigned to different peers based on the hashCode of the key. That means there is no organization of which posts go to which servers. If you need to do something to all of the posts from a particular user, such as running a spelling check on them, you would have to access posts on many different peers. To make this sort of operation efficient it makes sense to group those posts by user name, so that the spellcheck could be performed in a single VM. GemFire lets you do this with a PartitionResolver. The partition resolver lets you return a value that indicates the logical group that a key belongs to. In this sample app, the PartitionResolver could just return the author field of the PostID. That would tell GemFire to put all posts by the same author in the same VM.
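
The effect of routing by author can be sketched without GemFire. In the example below, AuthorRoutingSketch and PostKey are hypothetical names, PostKey stands in for PostID with assumed field names, and the bucket count is arbitrary. In GemFire the corresponding hook is the PartitionResolver's routing object; here it is reduced to choosing which value gets hashed.

```java
import java.util.Objects;

// Stand-alone comparison of default hashing and author-based routing; not GemFire code.
final class AuthorRoutingSketch {

    // Hypothetical stand-in for PostID: an author name plus a timestamp.
    static final class PostKey {
        final String author;
        final long timestamp;

        PostKey(String author, long timestamp) {
            this.author = author;
            this.timestamp = timestamp;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof PostKey)) {
                return false;
            }
            PostKey that = (PostKey) other;
            return author.equals(that.author) && timestamp == that.timestamp;
        }

        @Override
        public int hashCode() {
            return Objects.hash(author, timestamp);
        }
    }

    // Default behavior: the whole key is hashed, so one author's posts scatter.
    static int bucketByKeyHash(PostKey key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Resolver behavior: only the author is hashed, so one author's posts share a bucket.
    static int bucketByAuthor(PostKey key, int bucketCount) {
        return Math.floorMod(key.author.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int buckets = 113; // arbitrary bucket count for the sketch
        for (long timestamp = 1; timestamp <= 3; timestamp++) {
            PostKey key = new PostKey("Ethan", timestamp);
            System.out.println("post " + timestamp
                + ": key-hash bucket " + bucketByKeyHash(key, buckets)
                + ", author bucket " + bucketByAuthor(key, buckets));
        }
    }
}
```

Every post by the same author reports the same author bucket, while the key-hash bucket varies with the timestamp.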

Once the posts are grouped together logically, you need to execute your spelling check on the VM that actually stores the posts. GemFire lets you ship functions to a subset of the peers based on what keys those functions intend to work with. The function is then executed in parallel on the subset of peers that host those keys. To execute the spelling check function:

SpellingCheck spellcheck = new SpellingCheck(); // implements Function
Set<String> authors = new HashSet<String>();
authors.add("Ethan");
FunctionService.onRegion(people).withFilter(authors).execute(spellcheck);

See Function Execution for more details.

 

Multiple geographical locations - the WAN gateway

If you have multiple data centers in remote locations, the synchronous replication between GemFire peers may cause too much latency if those peers are in different locations. GemFire provides a WAN gateway, which allows two or more remote sites to send updates asynchronously. With the gateway, many updates are sent at once to improve throughput. See Configuring Multi-site Installations for more details.

 

Cache writers and loaders

As you have seen, a CacheListener callback is invoked after an entry is changed. A CacheWriter callback is invoked before the entry is updated. You can block the update, or send the update to some other system in a cache writer. You can add a CacheLoader callback to a region to fetch or generate a value if it was not present in the cache when a get was invoked. See Handling Events for more information on these callbacks.

 

Eviction and expiration

GemFire provides full support for the eviction and expiration features you might expect from a cache. You can limit the size of a region to a certain number of entries or a certain size in bytes. You can configure regions to expire entries after a certain amount of time, or simply evict entries when your heap is getting full.

Rather than lose the values completely, you can configure the region to overflow values onto disk.

See Controlling Memory Use With Eviction and Overflow and Keeping Your Data Current With Expiration for more information.
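
For readers who have not used an entry-count limit before, the idea can be shown with java.util.LinkedHashMap. This is a generic least-recently-used sketch, not how GemFire implements or configures eviction: LruEvictionSketch is a hypothetical class, and in GemFire you would set eviction on the region rather than write code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic LRU map capped at a fixed number of entries; not GemFire code.
final class LruEvictionSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruEvictionSketch(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by most recent access
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insert; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruEvictionSketch<String, String> posts = new LruEvictionSketch<>(2);
        posts.put("a", "first");
        posts.put("b", "second");
        posts.get("a");          // touch "a" so that "b" becomes the eldest
        posts.put("c", "third"); // exceeds the limit and evicts "b"
        System.out.println(posts.keySet()); // prints [a, c]
    }
}
```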

 

Next Steps

 

Quick Start examples

The Quick Start Examples provide several more sample applications that demonstrate different features of GemFire. You can read through and copy from these examples to try out these features.

 

Product documentation

The GemFire Product Documentation provides in-depth information about GemFire Enterprise.

 

Spring GemFire Integration

After you familiarize yourself with basic GemFire Enterprise functionality, you'll want to check out Spring GemFire Integration. Use Spring GemFire Integration with GemFire as a distributed data management platform to build Spring-powered, highly scalable applications.

 

Forums

Haven't found the information you need? Get answers from GemFire developers and other users on the GemFire Forums.

 

 

 


Setting Up GemFire

 

In this procedure it is assumed that you know how to configure environment variables for your operating system.

Install and configure GemFire Enterprise and Java as described here for every machine on which you will run GemFire.

1. Confirm that your system meets the requirements to run GemFire Enterprise.

See System Requirements. To check your current Java version, type java -version at a command-line prompt. You can download Sun/Oracle Java SE here.

2. Install GemFire.

a. Register at the SpringSource GemFire download site to access the download.
The registration process is quick and helps us help you later if you have questions or problems. The product includes an evaluation license.

b. Install GemFire Enterprise 6.5 according to the instructions on the download site.
You can also get GemFire from your salesperson.

3. Configure your environment for GemFire.

a. Set the JAVA_HOME environment variable to point to your Java runtime installation. (There should be a bin directory under JAVA_HOME.)

b. Set the GEMFIRE environment variable to point to your GemFire installation top-level directory. (There should be bin, lib, dtd, and other directories under GEMFIRE.)

c. Configure GF_JAVA and your PATH and CLASSPATH as shown in these examples. (GF_JAVA must point to the java executable file under your JAVA_HOME.)

Unix Bourne and Korn shells (sh, ksh, bash)

GF_JAVA=$JAVA_HOME/bin/java;export GF_JAVA
PATH=$PATH:$JAVA_HOME/bin:$GEMFIRE/bin;export PATH
CLASSPATH=$GEMFIRE/lib/gemfire.jar:$GEMFIRE/lib/antlr.jar:$GEMFIRE/lib/gfSecurityImpl.jar:$CLASSPATH;export CLASSPATH

Windows

set GF_JAVA=%JAVA_HOME%\bin\java.exe
set PATH=%PATH%;%JAVA_HOME%\bin;%GEMFIRE%\bin
set CLASSPATH=%GEMFIRE%\lib\gemfire.jar;%GEMFIRE%\lib\antlr.jar;%GEMFIRE%\lib\gfSecurityImpl.jar;%CLASSPATH%

4. Explore GemFire features.

With GemFire installed and your environment configured, you can run all of the GemFire product features. See the documentation for guidance on using GemFire.

To run the product tutorial and examples, follow the steps in Setting Up the Product Examples.

 

 

 

Setting Up the Product Examples

 

Follow this procedure for each machine on which you will run the GemFire examples:

1. If you have not already done so, install and configure GemFire Enterprise as described in Setting Up GemFire.
2. Download and unzip the examples zip file.

Unzip the GemFire product examples zip to a directory on the machine where you want to run the examples.

Open the file from the dialogue box that appears when you click on the link, or save the file to disk and then double-click it there to open it. Extract the file to the directory in which you want to store the examples.

3. Add the example class directories to your CLASSPATH.

You need these examples subdirectories in your CLASSPATH to run all of the product examples:

  • tutorial/classes
  • helloworld/classes
  • quickstart/classes
  • examples/dist/classes

Example CLASSPATH settings

Unix Bourne and Korn shells (sh, ksh, bash)
In these settings, replace $SamplesDirectory with the directory where you extracted the zip file.

CLASSPATH=$SamplesDirectory/tutorial/classes:$SamplesDirectory/helloworld/classes:$SamplesDirectory/quickstart/classes:$SamplesDirectory/examples/dist/classes:$CLASSPATH;export CLASSPATH

Windows
In these settings, replace %SamplesDirectory% with the directory where you extracted the zip file.

set CLASSPATH=%SamplesDirectory%\tutorial\classes;%SamplesDirectory%\helloworld\classes;%SamplesDirectory%\quickstart\classes;%SamplesDirectory%\examples\dist\classes;%CLASSPATH%

 
