Ignite SQL Grid Quick Start (Part 1)

Anokata

Category: Ignite. Tags: ignite, sql, distributed, grid, transactions

Article link: https://blog.csdn.net/qq_31179577/article/details/75316495

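I spent a while translating and studying rule-engine material, and I am now back to continue the Ignite posts. I hope we can all improve together.

1. Basic SQL queries: SqlQuery

Ignite has two commonly used kinds of SQL query, SqlQuery and SqlFieldsQuery, plus some other special-purpose queries that I will cover one by one. I split the basics into two parts, and as the title says, the first part covers SqlQuery.

1.1 Preparation

	// Java class Company; getters and setters are omitted, it is just a simple POJO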
public class Company {
    private static final AtomicLong ID_GEN = new AtomicLong();

    @QuerySqlField(index = true)
    private Long id;

    @QuerySqlField(index = true)
    private String name;
    ...
}

	// Java class Person; getters and setters are omitted, it is just a simple POJO
public class Person {
    private static final AtomicLong ID_GEN = new AtomicLong();
    @QuerySqlField(index = true)
    public Long id;
    @QuerySqlField(index = true)
    public Long companyId;
    @QuerySqlField
    public String firstName;
    @QuerySqlField
    public String lastName;
    @QuerySqlField
    public String resume;
    @QuerySqlField(index = true)
    public double salary;
    private transient AffinityKey<Long> key;
    
    public AffinityKey<Long> key() {
        if (key == null)
            key = new AffinityKey<>(id, companyId);
        return key;
    }
	...
}

	// generate the test data
	private static void initData(IgniteCache<Long, Company> cacheOnlyCompany, IgniteCache<AffinityKey<Long>,Person> cacheaffinity, IgniteCache<Long, Person> cacheOnlyPerson) {
		// create the companies
		Company cloud = new Company("yihecloud");
		Company ultra = new Company("ultrapower");
		// store the companies
		cacheOnlyCompany.put(cloud.id(), cloud);
		cacheOnlyCompany.put(ultra.id(), ultra);
		// create the persons
		Person p1 = new Person(cloud, "John", "Doe", 2000D, "John Doe has Master Degree.");
		Person p2 = new Person(cloud, "Jane", "Doe", 1000D, "Jane Doe has Bachelor Degree.");
		Person p3 = new Person(ultra, "John", "Smith", 3000D, "John Smith has Bachelor Degree.");
		Person p4 = new Person(ultra, "Jane", "Smith", 4000D, "Jane Smith has Master Degree.");
		// collocated storage (affinity; PS: covered in depth in my data grid quick start)
		cacheaffinity.put(p1.key(), p1);
		cacheaffinity.put(p2.key(), p2);
		cacheaffinity.put(p3.key(), p3);
		cacheaffinity.put(p4.key(), p4);
		// non-collocated storage
		cacheOnlyPerson.put(p1.id, p1);
		cacheOnlyPerson.put(p2.id, p2);
		cacheOnlyPerson.put(p3.id, p3);
		cacheOnlyPerson.put(p4.id, p4);
	}

1.2 Running a SqlQuery

	public static void main(String[] args) {
		// 1.0 start Ignite
		try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
			// 2.1 configure the Company cache
			CacheConfiguration<Long, Company> cacheOnlyCompanyCfg = new CacheConfiguration<Long, Company>("CACHE_ONLY_COMPANY");
			cacheOnlyCompanyCfg.setIndexedTypes(Long.class, Company.class);
			// 2.2 configure the collocated Person cache (note the key type of the cache!)
			CacheConfiguration<AffinityKey<Long>, Person> cacheaffinityCfg = new CacheConfiguration<AffinityKey<Long>, Person>("CACHE_AFFINITY");
			// the value type indexed here must be Person, since this cache stores Person objects
			cacheaffinityCfg.setIndexedTypes(AffinityKey.class, Person.class);
			// 2.3 configure the non-collocated Person cache (note the key type of the cache!)
			CacheConfiguration<Long, Person> cacheOnlyPersonCfg = new CacheConfiguration<Long, Person>("CACHE_ONLY_PERSON");
			cacheOnlyPersonCfg.setIndexedTypes(Long.class, Person.class);
			// 3.0 obtain the cache instances
			try (IgniteCache<Long, Company> cacheOnlyCompany = ignite.getOrCreateCache(cacheOnlyCompanyCfg);
				IgniteCache<AffinityKey<Long>, Person> cacheaffinity = ignite.getOrCreateCache(cacheaffinityCfg);
				IgniteCache<Long, Person> cacheOnlyPerson = ignite.getOrCreateCache(cacheOnlyPersonCfg)) {
				// 4.0 initialize the data
				initData(cacheOnlyCompany, cacheaffinity, cacheOnlyPerson);
				// run the SqlQuery
				sqlQueryEnjoy(cacheOnlyPerson);
			}
		}
	}

	private static void sqlQueryEnjoy(IgniteCache<Long, Person> cacheOnlyPerson) {
		String sql = "salary > ? and salary < ?";
		List<Entry<Long, Person>> matched = cacheOnlyPerson.query(new SqlQuery<Long,Person>(Person.class, sql).setArgs(1000,4000)).getAll();
		if(null != matched)
			matched.stream().forEach((entry) -> {
				System.out.println(entry.getKey() + ":" + entry.getValue());
			});
	}

Log output

	// PS: the query asks for salaries greater than 1000 and less than 4000
1:Person [id=1, companyId=1, lastName=Doe, firstName=John, salary=2000.0, resume=John Doe has Master Degree.]
3:Person [id=3, companyId=2, lastName=Smith, firstName=John, salary=3000.0, resume=John Smith has Bachelor Degree.]
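
To make the filter concrete, here is a minimal plain-Java sketch of what the predicate "salary > ? and salary < ?" selects once the arguments 1000 and 4000 are bound. It does not use Ignite at all; the class name SalaryRangeSketch and its methods are made up for illustration, and the ids and salaries are simply copied from the test data above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the WHERE clause "salary > ? and salary < ?"; no Ignite involved.
class SalaryRangeSketch {
    // id -> salary, mirroring the four Person objects created in initData
    static Map<Long, Double> salaries() {
        Map<Long, Double> m = new LinkedHashMap<>();
        m.put(1L, 2000D);
        m.put(2L, 1000D);
        m.put(3L, 3000D);
        m.put(4L, 4000D);
        return m;
    }

    // Both comparisons are strict, so the bounds themselves are excluded.
    static List<Long> matchingIds(double low, double high) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Double> e : salaries().entrySet())
            if (e.getValue() > low && e.getValue() < high)
                ids.add(e.getKey());
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(matchingIds(1000, 4000)); // prints [1, 3]
    }
}
```

Persons 2 and 4 sit exactly on the bounds, which is why only ids 1 and 3 appear in the log.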

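1.3 Summary

The code above does a few things that have nothing to do with SqlQuery, but all of them are used later on. First, seeing them now makes the later sections feel familiar; second, I would rather not repeat them, so they go at the start while your attention is still fresh.

Since this is the first topic of the SQL grid, I will go through it item by item.

1.3.1 SQL statements

Ignite's SQL grid supports standard SQL and DML commands, including SELECT, UPDATE, INSERT, MERGE and DELETE queries.

The SQL syntax itself is beyond the scope of this post, so let's move straight on to the next topic.

1.3.2 SqlQuery

We built a SqlQuery object from a SQL statement. It is one implementation of the abstract class Query; SqlFieldsQuery, covered later, is another. The class carries no notable comments, so here I only describe its structure and then move on to the IgniteCache.query(Query q) method.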

public final class SqlQuery<K, V> extends Query<Cache.Entry<K, V>> {
	...
    public SqlQuery(String type, String sql) {
        setType(type);
        setSql(sql);
    }
    public SqlQuery(Class<?> type, String sql) {
        setType(type);
        setSql(sql);
    }
    ....
}

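The snippet above is taken from the declaration in the SqlQuery source. The class has two type parameters, K and V.

Constructor	Parameters	Description
SqlQuery	(String type: the type of the results we want back, String sql: the SQL statement to execute)	The two overloads are a matter of preference. Passing a string is arguably better, since Ignite stores the type field of SqlQuery as a String anyway, so you may as well pass the class name directly.
SqlQuery	(Class<?> type: the type of the results we want back, given as a Class instance rather than a String, String sql)	See above.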

1.3.3 IgniteCache#query(Query< R > qry)

    @SuppressWarnings("unchecked")
    @Override public <R> QueryCursor<R> query(Query<R> qry) {
        A.notNull(qry, "qry");

        GridCacheGateway<K, V> gate = this.gate;

        CacheOperationContext prev = onEnter(gate, opCtx);

        try {
            ctx.checkSecurity(SecurityPermission.CACHE_READ);

            validate(qry);

            convertToBinary(qry);

            CacheOperationContext opCtxCall = ctx.operationContextPerCall();

            boolean keepBinary = opCtxCall != null && opCtxCall.isKeepBinary();

            if (qry instanceof ContinuousQuery)
                return (QueryCursor<R>)queryContinuous((ContinuousQuery<K, V>)qry, qry.isLocal(), keepBinary);

            if (qry instanceof SqlQuery)
                return (QueryCursor<R>)ctx.kernalContext().query().querySql(ctx, (SqlQuery)qry, keepBinary);

            if (qry instanceof SqlFieldsQuery)
                return (FieldsQueryCursor<R>)ctx.kernalContext().query().querySqlFields(ctx, (SqlFieldsQuery)qry,
                    keepBinary);

            if (qry instanceof ScanQuery)
                return query((ScanQuery)qry, null, projection(qry.isLocal()));

            return (QueryCursor<R>)query(qry, projection(qry.isLocal()));
        }
        catch (Exception e) {
            if (e instanceof CacheException)
                throw (CacheException)e;

            throw new CacheException(e);
        }
        finally {
            onLeave(gate, prev);
        }
    }

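Above is the source of the query method we called. I do not follow all of it either, so let's look only at the part that matters for our SqlQuery, namely the chain of if checks.
[Screenshot: debugger stopped inside the qry instanceof SqlQuery branch of query()]

The screenshot is a bit blurry, so here is an explanation. Because we issued a SqlQuery, execution enters the branch where the cursor sits. I also printed some details of ctx, the context object: its name is CACHE_ONLY_PERSON, which is exactly the name of our cache instance.

Now look at the expression on the line where the cursor sits in the screenshot:

	return (QueryCursor<R>)ctx.kernalContext().query().querySql(ctx, (SqlQuery)qry, keepBinary);

The method returns right here, so this is effectively its last statement.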
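
The branching can be boiled down to a small model. The sketch below is not Ignite code: the nested classes are hypothetical stand-ins for the real query types, and dispatch only returns the name of the branch that the query method listed above would take. It mirrors the order of the instanceof checks and nothing else.

```java
// Hypothetical stand-ins for the Ignite query classes; only the dispatch order is modelled.
class QueryDispatchSketch {
    static class Query {}
    static class ContinuousQuery extends Query {}
    static class SqlQuery extends Query {}
    static class SqlFieldsQuery extends Query {}
    static class ScanQuery extends Query {}

    // Returns the name of the branch taken for the given query object.
    static String dispatch(Query qry) {
        if (qry == null)
            throw new NullPointerException("qry");
        if (qry instanceof ContinuousQuery)
            return "queryContinuous";
        if (qry instanceof SqlQuery)
            return "querySql";
        if (qry instanceof SqlFieldsQuery)
            return "querySqlFields";
        if (qry instanceof ScanQuery)
            return "query(ScanQuery)";
        return "query(generic)";
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new SqlQuery()));       // prints querySql
        System.out.println(dispatch(new SqlFieldsQuery())); // prints querySqlFields
    }
}
```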

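The expression first calls kernalContext() on ctx, which is a GridCacheContext<K, V>. That is the cache context object, which is why its name is the cache name we configured. It holds a GridKernalContext, the grid kernal context. Here is a rough diagram of how the classes relate:

	query method of IgniteCacheProxy, the implementation of IgniteCache
	-query->  GridCacheContext<K, V> (the ctx in the screenshot)
	-holds->  GridKernalContext
	-extends->  Iterable<GridComponent>
	------->  GridComponent (the interface behind all major internal Ignite components)

All we need to know is that ctx is tied to our cache instance, and that the GridKernalContext it holds is an iterable collection of GridComponent objects, that is, grid components. GridComponent is the parent interface of many Ignite internals. What we need here is SQL querying, so GridKernalContext#query() returns the GridQueryProcessor component (I do not recommend reading its source: 2600 lines, life is short, be kind to yourself), which is one of the GridComponent implementations.

The call returns a QueryCursor<Cache.Entry<K,V>>. QueryCursor is an interface; we can look at its implementation class QueryCursorImpl:

public class QueryCursorImpl<T> implements QueryCursorEx<T>, FieldsQueryCursor<T>

The class is an Iterable, and its Iterator iter field holds the rows produced by the query. I will go into the details later when we need them.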
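
To picture that shape, here is a stripped-down model of a cursor: an Iterable that wraps a single Iterator and offers a getAll() that drains it. CursorSketch is a made-up class, not the real QueryCursorImpl, and closing the cursor after getAll() is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Made-up model of a query cursor: iterable once, drainable with getAll(), closeable.
class CursorSketch<T> implements Iterable<T>, AutoCloseable {
    private final Iterator<T> iter; // plays the role of the "iter" field mentioned above
    private boolean closed;

    CursorSketch(Iterator<T> iter) {
        this.iter = iter;
    }

    @Override public Iterator<T> iterator() {
        return iter;
    }

    // Reads every remaining row into a list, then closes the cursor (assumed behaviour).
    List<T> getAll() {
        List<T> all = new ArrayList<>();
        try {
            while (iter.hasNext())
                all.add(iter.next());
        }
        finally {
            close();
        }
        return all;
    }

    @Override public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Usage: try-with-resources plus getAll(), the same pattern as in the examples of this post.
    static List<String> demo() {
        try (CursorSketch<String> c = new CursorSketch<>(Arrays.asList("1:John Doe", "3:John Smith").iterator())) {
            return c.getAll();
        }
    }
}
```

Because the rows come from a single iterator, a cursor of this shape can be traversed only once, so read it either through iteration or through getAll(), not both.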

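SqlQuery is useful when you want the query to return whole objects stored in the cache (both key and value) as the final result set. If you only need some of the fields, or a computed value, use SqlFieldsQuery instead.

2. Basic SQL queries: SqlFieldsQuery

2.1 Preparation (largely the same as 1.1)

2.2 Running a SqlFieldsQuery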

	public static void main(String[] args) {
		try(Ignite ignite = Ignition.start("examples/config/example-ignite.xml")){
			CacheConfiguration<Long,Company> companyConfig = new CacheConfiguration<Long,Company>("CACHE_ONLY_COMPANY");
			companyConfig.setIndexedTypes(Long.class,Company.class);
			
			
			CacheConfiguration<Long, Person> personConfig = new CacheConfiguration<>("CACHE_ONLY_PERSON");
			personConfig.setIndexedTypes(Long.class, Person.class);
			
			try(IgniteCache<Long, Company> cacheOnlyCompany = ignite.getOrCreateCache(companyConfig);
				IgniteCache<Long, Person> cacheOnlyPerson = ignite.getOrCreateCache(personConfig)){
				// initialize the data
				initData(cacheOnlyCompany,cacheOnlyPerson);
				executeSqlFieldQuery(cacheOnlyPerson);
				
			}
		}
	}

	private static void executeSqlFieldQuery(IgniteCache<Long, Person> cacheOnlyPerson) {
		
		SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery("select lastName,salary from Person");
		FieldsQueryCursor<List<?>> result = cacheOnlyPerson.query(sqlFieldsQuery);
		result.getAll().stream().forEach((data) -> {
			System.out.println("start...." );
			data.stream().forEach(System.out::println);
			System.out.println("end...." );
		});
	}

Log output

[10:52:46] Ignite node started OK (id=2707ffc6)
[10:52:46] Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, heap=0.87GB]
start....
Doe
2000.0
end....
start....
Doe
1000.0
end....
start....
Smith
3000.0
end....
start....
Smith
4000.0
end....
[10:53:05] Ignite node stopped OK [uptime=00:00:18:593]

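2.3 Summary

The code and the log output show that the query does not return whole objects, only a few of their fields. That is what SqlFieldsQuery is for. Below I walk through SqlFieldsQuery at the source level.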
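
The shape of that result can be modelled without Ignite. In the sketch below, FieldProjectionSketch and its nested Person class are hypothetical; the method builds one List per row with one element per selected column, which is the structure printed in the log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of "select lastName, salary from Person": rows of columns, not whole objects.
class FieldProjectionSketch {
    static class Person {
        final long id;
        final String lastName;
        final double salary;

        Person(long id, String lastName, double salary) {
            this.id = id;
            this.lastName = lastName;
            this.salary = salary;
        }
    }

    // The same four persons as in the test data.
    static List<Person> people() {
        return Arrays.asList(
            new Person(1, "Doe", 2000D),
            new Person(2, "Doe", 1000D),
            new Person(3, "Smith", 3000D),
            new Person(4, "Smith", 4000D));
    }

    // One List<?> per row; the element order follows the select list.
    static List<List<?>> selectLastNameAndSalary(List<Person> people) {
        List<List<?>> rows = new ArrayList<>();
        for (Person p : people)
            rows.add(Arrays.<Object>asList(p.lastName, p.salary));
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(selectLastNameAndSalary(people()).get(0)); // prints [Doe, 2000.0]
    }
}
```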

2.3.1 SqlFieldsQuery

In executeSqlFieldQuery we first create a SqlFieldsQuery from the SQL statement; this object is then passed to IgniteCache#query.

Like SqlQuery, SqlFieldsQuery is a subclass of the abstract class Query. Here is its signature:

public class SqlFieldsQuery extends Query<List<?>>

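The signature tells us that each row of the final result set is stored as a List (and that is indeed how it works).

The constructors of the class are:

Constructor	Parameters	Description
SqlFieldsQuery	String sql: the SQL statement	The commonly used one
SqlFieldsQuery	String sql: the SQL statement, boolean collocated: the source comment describes this as an optimization flag; simple examples do not reveal its practical effect, and it does not change the results of collocated or non-collocated queries.	If your caches are collocated through affinity, turning it on is recommended; since it is described as an optimization, it presumably optimizes how the SQL is executed.

The remaining fields are easy to understand. Below are the commonly used methods; the others are simple enough to read on your own.

Method	Parameters	Description
setArgs()	Object… args: varargs. When some values in the SQL have to be supplied at run time we write ? as a placeholder, and this method fills those placeholders in order. The parameter type is Object, so any value can be passed.	Used often, since business conditions are rarely fixed; it makes the SQL reusable.
setTimeout()	int timeout: the timeout value, TimeUnit timeUnit: the unit of the timeout	Use it when the system has to stay responsive: it bounds how long the query may run.
setLocal()	boolean loc: whether to run the query locally	If true, only the data on the local node is queried, so with a partitioned cache the result may be incomplete. If false, the query may be executed as a distributed query.
setReplicatedOnly	boolean replicatedOnly: whether the query touches replicated tables only; also an optimization flag	If you know that the caches you query use the replicated mode, Ignite can optimize the query so that it runs against local data only, which is faster.
setDistributedJoins	boolean distributedJoins: whether distributed (non-collocated) joins are enabled for this query	This flag affects the completeness of join results. Suppose there are two server nodes and we run a cross-cache query (much like a table join in MySQL) with the flag set to false: each node joins only the data it holds locally, so rows whose join partner lives on the other node are missing from the result. If the query reads a single cache, with no join, Ignite returns all rows even when the flag is false.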

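2.3.2 IgniteCache#query

For this method, refer to the code in 1.3.3; it is the same. As explained there, the flow is controlled by the chain of if checks, and since we now pass a SqlFieldsQuery, execution takes that branch.

PS: the source changed a lot between Ignite 2.0 and Ignite 2.2, so upgrade your environment to the latest version if you can; otherwise the code you see will differ from what is shown here.

The actual work is again done by GridQueryProcessor, one of the components (GridComponent) held by GridKernalContext.

Here is how GridQueryProcessor is structured:

// declaration of GridQueryProcessor
public class GridQueryProcessor extends GridProcessorAdapter 
// declaration of GridProcessorAdapter
public abstract class GridProcessorAdapter implements GridProcessor
// declaration of GridProcessor
public interface GridProcessor extends GridComponent

These declarations make the class hierarchy of GridQueryProcessor clear.

We call the query(…) method of this class. It returns a FieldsQueryCursor<List<?>>, an interface that extends QueryCursor. The implementation class is the same one used for SqlQuery, QueryCursorImpl, which serves the result sets of both SqlQuery and SqlFieldsQuery. If anything is unclear, refer back to 1.3.3.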

3. Cross-cache queries

3.1 Preparation

Set up the data as in section 1.

3.2 Cross-cache query

	public static void main(String[] args) {
		try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
			CacheConfiguration<Long, Company> companyConfig = new CacheConfiguration<Long, Company>("CACHE_ONLY_COMPANY");
			companyConfig.setIndexedTypes(Long.class, Company.class);

			CacheConfiguration<Long, Person> personConfig = new CacheConfiguration<>("CACHE_ONLY_PERSON");
			personConfig.setIndexedTypes(Long.class, Person.class);
			try (IgniteCache<Long, Company> cacheOnlyCompany = ignite.getOrCreateCache(companyConfig);
					IgniteCache<Long, Person> cacheOnlyPerson = ignite.getOrCreateCache(personConfig)) {
				// initialize the data
				initData(cacheOnlyCompany, cacheOnlyPerson);
				executeAcrossCacheQuery(cacheOnlyPerson);

			}
		}
	}

	private static void executeAcrossCacheQuery(IgniteCache<Long, Person> cacheOnlyPerson) {
		final String ORG_CACHE = "CACHE_ONLY_COMPANY";
		String sql = "select concat(firstName, ' ', lastName), org.name " +
	            "from Person, \"" + ORG_CACHE + "\".Company as org " +
	            "where Person.companyId = org.id";
		SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
		sqlFieldsQuery.setDistributedJoins(true);
		FieldsQueryCursor<List<?>> result = cacheOnlyPerson.query(sqlFieldsQuery);
		result.getAll().stream().forEach((data) -> {
			System.out.println("start...." );
			data.stream().forEach(System.out::println);
			System.out.println("end...." );
		});
		
	}

Log output

[19:08:36] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=1.7GB]
start....
John Smith
ultrapower
end....
start....
John Doe
yihecloud
end....
start....
Jane Doe
yihecloud
end....
start....
Jane Smith
ultrapower
end....
[19:08:38] Ignite node stopped OK [uptime=00:00:02:088]

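3.3 Summary

Ignite uses the cache name as the schema of each table. In ordinary SQL, joining a table from another database requires the database name as a prefix; in the same way, joining a table from another cache requires that cache's schema. That is why a cache name appears in our SQL. The schema can also be customized: CacheConfiguration.setSqlSchema(…) sets it to whatever name you like.

Person carries no schema in the SQL above because it is a table of the IgniteCache instance we run the query on, so the schema can be omitted.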
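
The quoting is easy to get wrong when the SQL is assembled by string concatenation, so here is a small helper that builds the schema-qualified table name. SchemaNameSketch, qualified and joinSql are made-up names for illustration; the SQL text is the one used in this section.

```java
// Builds the schema-qualified table name used when joining a table from another cache.
class SchemaNameSketch {
    // The cache name is wrapped in double quotes and used as the schema: "CACHE".Type
    static String qualified(String cacheName, String typeName) {
        return "\"" + cacheName + "\"." + typeName;
    }

    // Person needs no schema because the query is run on the Person cache itself.
    static String joinSql(String orgCache) {
        return "select concat(firstName, ' ', lastName), org.name "
            + "from Person, " + qualified(orgCache, "Company") + " as org "
            + "where Person.companyId = org.id";
    }

    public static void main(String[] args) {
        System.out.println(qualified("CACHE_ONLY_COMPANY", "Company")); // prints "CACHE_ONLY_COMPANY".Company
        System.out.println(joinSql("CACHE_ONLY_COMPANY"));
    }
}
```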

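Overall this is very close to ordinary SQL syntax; a little practice is all it takes.

This also shows that Ignite supports cross-cache operations. The test adds the line sqlFieldsQuery.setDistributedJoins(true); because I started two nodes and the Person objects were stored in a non-collocated way. Without that line each node joins only its local data and no distributed join is performed, so the final result is incomplete. Try it yourself to see the effect of this flag.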

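4. Distributed collocated join queries

Sections 4 and 5 both deal with distributed queries and are harder to follow. We start with the distributed collocated join.

What is a distributed collocated join query?

When covering the data grid we discussed affinity, which gives us collocation: data to be stored is placed on the same node as the data related to it through the affinityKey. Distributed means that several nodes cooperate on a query. Suppose the caches in our Ignite cluster use the partitioned mode (the default) rather than the replicated or local mode. The data may then be stored on any node of the cluster, and no single node can return all of it, so several nodes have to cooperate. That is what distributed means here.

Distributed collocated join query: several nodes cooperate to query cross-cache data that is tied together by the affinityKey.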
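
The idea of collocation can be shown with a toy placement rule. The sketch below is a deliberate simplification, not Ignite's real affinity function: it assumes a node is chosen as hash(affinity key) modulo the number of nodes, and AffinitySketch and its methods are made-up names. What it demonstrates is only the property that matters here: the affinity part of the key, not the Person id, decides where an entry lives.

```java
// Toy placement rule (an assumption of this sketch): node = hash(affinity key) mod node count.
class AffinitySketch {
    static int nodeFor(long affinityKey, int nodes) {
        return Math.floorMod(Long.hashCode(affinityKey), nodes);
    }

    // A Company is placed by its own id.
    static int companyNode(long companyId, int nodes) {
        return nodeFor(companyId, nodes);
    }

    // A Person keyed by (id, companyId) is placed by the affinity part only; the id is ignored.
    static int personNode(long personId, long companyId, int nodes) {
        return nodeFor(companyId, nodes);
    }

    public static void main(String[] args) {
        int nodes = 3;
        System.out.println(companyNode(11L, nodes));    // prints 2
        System.out.println(personNode(3L, 11L, nodes)); // prints 2, same node as company 11
        System.out.println(personNode(1L, 1L, nodes));  // prints 1, same node as company 1
    }
}
```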

4.1 Data preparation

	private static void initData(IgniteCache<Long, Company> cacheOnlyCompany, IgniteCache<AffinityKey<Long>,Person> cacheaffinity) {
		// initialize cacheOnlyCompany; the two constructor arguments are the name and the id
		Company cloud = new Company("yihecloud",1L);
		Company ultra = new Company("ultrapower",11L);
		cacheOnlyCompany.put(cloud.id(), cloud);
		cacheOnlyCompany.put(ultra.id(), ultra);
		// create the Person objects
		Person p1 = new Person(cloud, "John", "Doe", 2000D, "John Doe has Master Degree.");
		Person p2 = new Person(cloud, "Jane", "Doe", 1000D, "Jane Doe has Bachelor Degree.");
		Person p3 = new Person(ultra, "John", "Smith", 3000D, "John Smith has Bachelor Degree.");
		Person p4 = new Person(ultra, "Jane", "Smith", 4000D, "Jane Smith has Master Degree.");
		// store the persons under their affinity keys (collocated with their company)
        cacheaffinity.put(p1.key(), p1);
        cacheaffinity.put(p2.key(), p2);
        cacheaffinity.put(p3.key(), p3);
        cacheaffinity.put(p4.key(), p4);
	}

4.2 Distributed collocated join query

	public static void main(String[] args) {
		try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
			CacheConfiguration<Long, Company> companyConfig = new CacheConfiguration<Long, Company>(
					"CACHE_ONLY_COMPANY");
			companyConfig.setIndexedTypes(Long.class, Company.class);

			CacheConfiguration<AffinityKey<Long>, Person> personConfig = new CacheConfiguration<AffinityKey<Long>, Person>(
					"CACHE_AFFINITY_PERSON");
			personConfig.setIndexedTypes(AffinityKey.class, Person.class);

			try (IgniteCache<Long, Company> cacheOnlyCompany = ignite.getOrCreateCache(companyConfig);
					IgniteCache<AffinityKey<Long>, Person> cacheAffinityPerson = ignite
							.getOrCreateCache(personConfig)) {
				// initialize the data
				initData(cacheOnlyCompany, cacheAffinityPerson);
				distributedCollocatedJoin(cacheAffinityPerson);

			}
		}
	}

	private static void distributedCollocatedJoin(IgniteCache<AffinityKey<Long>, Person> cacheAffinityPerson) {
		final String ORG_CACHE = "CACHE_ONLY_COMPANY";
		String sql = "select concat(firstName, ' ', lastName), org.name " + "from Person, \"" + ORG_CACHE
				+ "\".Company as org " + "where Person.companyId = org.id";
		SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
		FieldsQueryCursor<List<?>> result = cacheAffinityPerson.query(sqlFieldsQuery);
		result.getAll().stream().forEach((data) -> {
			System.out.println("start....");
			data.stream().forEach(System.out::println);
			System.out.println("end....");
		});
		
		
		QueryCursor<Entry<AffinityKey<Long>, Person>> query = cacheAffinityPerson.query(new SqlQuery<AffinityKey<Long>, Person>(Person.class, "from Person").setLocal(true));
		print("Local all persons:", query);
		IgniteCache<Long, Company> cacheOnlyCompany = Ignition.ignite().cache("CACHE_ONLY_COMPANY");
		QueryCursor<Entry<Long, Company>> query2 = cacheOnlyCompany
				.query(new SqlQuery<Long, Company>(Company.class, "from Company").setLocal(true));
		print("Local all company:", query2);
	}

Log output

[10:16:44] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=1.7GB]
start....
John Smith
ultrapower
end....
start....
Jane Smith
ultrapower
end....
start....
John Doe
yihecloud
end....
start....
Jane Doe
yihecloud
end....
------------------divider line----------------------
>>> Local all persons:
>>>     Entry [key=AffinityKey [key=3, affKey=11], val=Person [id=3, companyId=11, lastName=Smith, firstName=John, salary=3000.0, resume=John Smith has Bachelor Degree.]]
>>>     Entry [key=AffinityKey [key=4, affKey=11], val=Person [id=4, companyId=11, lastName=Smith, firstName=Jane, salary=4000.0, resume=Jane Smith has Master Degree.]]

>>> Local all company:
>>>     Entry [key=11, val=Organization [id=11, name=ultrapower93]
[10:16:45] Ignite node stopped OK [uptime=00:00:01:727]

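A brief explanation. I started two server nodes; the node started by the test's main method is one of them. When initializing the data we created two Company objects and four associated Person objects. Each Person must live on the same node as its Company, because we collocate them with an affinityKey.

In the log, everything below the divider line comes from a local-mode query against the server node that runs main. It shows that this node holds one Company and the two Person objects that belong to it, exactly as expected. The other Company and its two Person objects live on the other server node.

Now look at the log above the divider line: the query returned every Person. The reason is explained below.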

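4.3 Summary

4.3.1 How it works

[Figure: a client sends the query Q to server nodes Node1 to Node3; each node runs E(Q) locally and returns a result set R]

Look at the figure above. It shows three server nodes, Node1 to Node3. We, the client, issue the query Q, and Ignite sends Q to every node in the cluster; the figure shows three Q routes from the client, one per server node. (PS: this assumes the default partitioned mode; the replicated mode behaves differently.) Our example contains a join between the Person and Company tables, and the data is collocated through affinity. The E(Q) operation on each node is a query over that node's local data set, and since every Person is guaranteed to be on the same node as its Company, the local join always finds its match. Joining local data sets is also what Ignite does by default. Affinity is therefore best seen as the way to fit your data to Ignite's execution model: joins are indispensable in business code, Ignite joins locally by default, and collocation through affinity is what keeps the result complete.

Once E(Q) has finished, each node returns its result set R to the client, which performs the reduce step of the map-reduce and merges them. That completes the flow.

Collocation through affinity has clear advantages: because the join runs over local data sets it is fast, which is why collocated queries make up the vast majority of queries in production.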
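
The flow just described (send Q to every node, run E(Q) on local data, merge the partial results R on the client) can be simulated in a few lines of plain Java. This is a toy model under stated assumptions, not Ignite code: CollocatedJoinSketch, Node and Person are made-up classes, and the data layout assumes that affinity has already placed each Person next to its Company.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy simulation of a collocated join: every node joins locally, the client merges.
class CollocatedJoinSketch {
    static class Person {
        final String name;
        final long companyId;

        Person(String name, long companyId) {
            this.name = name;
            this.companyId = companyId;
        }
    }

    // The data held by one server node.
    static class Node {
        final Map<Long, String> companies = new HashMap<>();
        final List<Person> persons = new ArrayList<>();
    }

    // E(Q): join only the data this node holds locally.
    static List<String> localJoin(Node n) {
        List<String> rows = new ArrayList<>();
        for (Person p : n.persons) {
            String company = n.companies.get(p.companyId);
            if (company != null)
                rows.add(p.name + " / " + company);
        }
        return rows;
    }

    // Client side: send Q to every node and merge the partial results R.
    static List<String> query(List<Node> cluster) {
        List<String> merged = new ArrayList<>();
        for (Node n : cluster)
            merged.addAll(localJoin(n));
        return merged;
    }

    // Two nodes, each holding one company together with its two employees.
    static List<Node> collocatedCluster() {
        Node a = new Node();
        a.companies.put(1L, "yihecloud");
        a.persons.add(new Person("John Doe", 1L));
        a.persons.add(new Person("Jane Doe", 1L));
        Node b = new Node();
        b.companies.put(11L, "ultrapower");
        b.persons.add(new Person("John Smith", 11L));
        b.persons.add(new Person("Jane Smith", 11L));
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        System.out.println(query(collocatedCluster()).size()); // prints 4
    }
}
```

No node ever looks at another node's data, yet the merged result is complete, because collocation guarantees that every local join finds its partner.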

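4.3.2 Other scenarios (1)

The test above uses the default partitioned mode. Now suppose Company is partitioned and Person is replicated. What does the execution flow look like?

First, the log output: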

[11:05:07] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=1.7GB]
start....
John Smith
ultrapower
end....
start....
Jane Smith
ultrapower
end....
start....
John Doe
yihecloud
end....
start....
Jane Doe
yihecloud
end....

>>> Local all persons:
>>>     Entry [key=AffinityKey [key=3, affKey=11], val=Person [id=3, companyId=11, lastName=Smith, firstName=John, salary=3000.0, resume=John Smith has Bachelor Degree.]]
>>>     Entry [key=AffinityKey [key=1, affKey=1], val=Person [id=1, companyId=1, lastName=Doe, firstName=John, salary=2000.0, resume=John Doe has Master Degree.]]
>>>     Entry [key=AffinityKey [key=4, affKey=11], val=Person [id=4, companyId=11, lastName=Smith, firstName=Jane, salary=4000.0, resume=Jane Smith has Master Degree.]]
>>>     Entry [key=AffinityKey [key=2, affKey=1], val=Person [id=2, companyId=1, lastName=Doe, firstName=Jane, salary=1000.0, resume=Jane Doe has Bachelor Degree.]]

>>> Local all company:
>>>     Entry [key=11, val=Organization [id=11, name=ultrapower93]
[11:05:09] Ignite node stopped OK [uptime=00:00:02:194]

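The log shows that all persons are still returned, exactly as before. The local node still holds only one Company, but it now holds all four persons, because Person uses the replicated mode. This may look puzzling, but it matches the figure above.

The query Q is still routed to every node in the cluster and all result sets are collected. Take two nodes, A and B, where B is the one started by our main method. The log shows that B holds the Company ultrapower, so A must hold yihecloud. When the query runs, node A finds all persons related to yihecloud, since that is what the join matches, and node B finds all employees of ultrapower. The result sets are gathered on the client, which ends up with the complete result.

4.3.3 Other scenarios (2)

Continuing with the same setup: every node holds all persons, and only the companies are spread out. Can we query a single node, while allowing the nodes to talk to each other, and still get all four persons from that one node? (Note that with a local query, that is setLocal(true), every node has the full Person data but not the full Company data. The join drops every Person whose Company is not present, so in our example only the persons of the companies held by the local node are returned.)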
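
The remark in parentheses can be checked with a tiny model of a local-only inner join. The sketch assumes the setup of this scenario (Person replicated, Company partitioned) and uses made-up names throughout; it is plain Java, not an Ignite call.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Model of a local-only join on one node: all persons are present, but only some companies.
class LocalOnlyJoinSketch {
    // name -> companyId; with a replicated Person cache every node holds all four entries
    static Map<String, Long> replicatedPersons() {
        Map<String, Long> m = new LinkedHashMap<>();
        m.put("John Doe", 1L);
        m.put("Jane Doe", 1L);
        m.put("John Smith", 11L);
        m.put("Jane Smith", 11L);
        return m;
    }

    // Inner join against the companies present on this node; unmatched persons are dropped.
    static List<String> localJoin(Map<Long, String> localCompanies) {
        return replicatedPersons().entrySet().stream()
            .filter(e -> localCompanies.containsKey(e.getValue()))
            .map(e -> e.getKey() + " / " + localCompanies.get(e.getValue()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<Long, String> nodeB = new LinkedHashMap<>();
        nodeB.put(11L, "ultrapower");
        System.out.println(localJoin(nodeB)); // prints [John Smith / ultrapower, Jane Smith / ultrapower]
    }
}
```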

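In my tests this was not possible, though there may be a way I am not aware of. There is a workaround: give both the Company cache and the Person cache the LOCAL mode, and the query works in local mode. But a LOCAL cache is not visible to other nodes, so data you create in LOCAL mode cannot be seen by any other node. Keep that in mind.

5. Distributed non-collocated join queries

So far the relationship was kept collocated through an AffinityKey. In complex domain logic, however, you sometimes have to assemble pieces of information from aggregates that are not related to each other, and such data cannot use an AffinityKey. Joining them is then no longer as carefree as in section 4.

PS: Ignite has supported non-collocated joins only since version 1.7. As we know, by default Ignite runs the join on each node locally, without talking to other nodes, until the join is finished and the data is returned to the client.

Since Ignite 1.7, distributed non-collocated joins are supported. Let me show you:

5.1 Data preparation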

	private static void initData(IgniteCache<Long, Company> cacheOnlyCompany,
			IgniteCache<Long, Person> cacheOnlyPerson) {
		// initialize cacheOnlyCompany
		Company cloud = new Company("yihecloud",1L);
		Company ultra = new Company("ultrapower",11L);
		cacheOnlyCompany.put(cloud.id(), cloud);
		cacheOnlyCompany.put(ultra.id(), ultra);
		// create the Person objects
		Person p1 = new Person(cloud, "John", "Doe", 2000D, "John Doe has Master Degree.");
		Person p2 = new Person(cloud, "Jane", "Doe", 1000D, "Jane Doe has Bachelor Degree.");
		Person p3 = new Person(ultra, "John", "Smith", 3000D, "John Smith has Bachelor Degree.");
		Person p4 = new Person(ultra, "Jane", "Smith", 4000D, "Jane Smith has Master Degree.");
		// store the persons in cacheOnlyPerson, keyed by id only (not collocated)
        cacheOnlyPerson.put(p1.id, p1);
        cacheOnlyPerson.put(p2.id, p2);
        cacheOnlyPerson.put(p3.id, p3);
        cacheOnlyPerson.put(p4.id, p4);
	}

The test data does not use affinity; Company and Person are simply cached separately.

5.2 Distributed non-collocated join query

	public static void main(String[] args) {
		try (Ignite ignite = Ignition.start("examples/config/example-ignite.xml")) {
			CacheConfiguration<Long, Company> companyConfig = new CacheConfiguration<Long, Company>(
					"CACHE_ONLY_COMPANY");
			companyConfig.setIndexedTypes(Long.class, Company.class);

			CacheConfiguration<Long, Person> personConfig = new CacheConfiguration<Long, Person>("CACHE_ONLY_PERSON");
			personConfig.setIndexedTypes(Long.class,Person.class);
			try (IgniteCache<Long, Company> cacheOnlyCompany = ignite.getOrCreateCache(companyConfig);
					IgniteCache<Long, Person> cacheOnlyPerson = ignite
							.getOrCreateCache(personConfig)) {
				// initialize the data
				initData(cacheOnlyCompany, cacheOnlyPerson);
				distributedNonCollocatedJoin(cacheOnlyPerson);
			}
		}
	}

	private static void distributedNonCollocatedJoin(IgniteCache<Long, Person> cacheOnlyPerson) {
		final String ORG_CACHE = "CACHE_ONLY_COMPANY";
		String sql = "select concat(firstName, ' ', lastName), org.name " + "from Person, \"" + ORG_CACHE
				+ "\".Company as org " + "where Person.companyId = org.id";
		SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
		// enable distributed (non-collocated) joins
		sqlFieldsQuery.setDistributedJoins(true);
		FieldsQueryCursor<List<?>> result = cacheOnlyPerson.query(sqlFieldsQuery);
		result.getAll().stream().forEach((data) -> {
			System.out.println("start....");
			data.stream().forEach(System.out::println);
			System.out.println("end....");
		});
		
		// local-only queries that show what the node running main holds; the keys of this cache are plain Long
		QueryCursor<Entry<Long, Person>> query = cacheOnlyPerson.query(new SqlQuery<Long, Person>(Person.class, "from Person").setLocal(true));
		print("Local all persons:", query);
		IgniteCache<Long, Company> cacheOnlyCompany = Ignition.ignite().cache("CACHE_ONLY_COMPANY");
		QueryCursor<Entry<Long, Company>> query2 = cacheOnlyCompany
				.query(new SqlQuery<Long, Company>(Company.class, "from Company").setLocal(true));
		print("Local all company:", query2);
	}

Log output:

[15:28:01] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=1.7GB]
start....
John Smith
ultrapower
end....
start....
John Doe
yihecloud
end....
start....
Jane Doe
yihecloud
end....
start....
Jane Smith
ultrapower
end....
-------------------divider line------------------------
>>> Local all persons:
>>>     Entry [key=3, val=Person [id=3, companyId=11, lastName=Smith, firstName=John, salary=3000.0, resume=John Smith has Bachelor Degree.]]

>>> Local all company:
>>>     Entry [key=11, val=Organization [id=11, name=ultrapower93]
[15:28:04] Ignite node stopped OK [uptime=00:00:02:192]

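The log shows that the local node holds only one Company (printed in the log as ultrapower93) and one Person. The two happen to be related, since this Person belongs to this Company, but that is pure coincidence. I also ran a test with distributed non-collocated joins disabled; that is covered below.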

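5.3 Summary

5.3.1 Distributed non-collocated join query

In distributedNonCollocatedJoin we call setDistributedJoins(true) to enable non-collocated joins. This flag is required if you want the complete result set; without it the data is incomplete, as section 5.3.2 shows.

First, a figure:
[Figure: a client sends the query Q to the server nodes; each node runs the join locally and exchanges data with the other nodes through D(Q) before the results are merged]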

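Let's use the figure to break down the distributed non-collocated join:

The client issues the query Q. Since we did not ask for the SQL to run locally, Ignite distributes Q to all server nodes of the cluster (note: we are in partitioned mode), and each node runs the join on its local data. Some rows are bound to find no join partner, because the data is not collocated.

Take my test as the example. I started two nodes: node A just spins in a while loop, and node B is the one started by the main method shown above. The log below the divider line shows that my local node holds a single Company{name:ultrapower,id:11} and a single Person{id:3,companyId:11}, so the rest of the data must be on node A. With Ignite's default behavior, where each node joins only local data, we would get just three rows (think it through). Our test returned four, which means the Person with id=4 on node A went through the D(Q) operation in the figure: the nodes exchanged data, and that is how the Person was joined to the Company with id=11 on node B.

Finally the data is merged, and we are done.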
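
The three-versus-four row count can be reproduced with a toy simulation of the same data layout. Everything here is an assumption made for illustration: NonCollocatedJoinSketch is a made-up class, the remote lookup is a plain loop over the other nodes, and the boolean parameter only mimics the effect of setDistributedJoins; none of it is Ignite's actual algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy simulation of a non-collocated join, with and without fetching remote join partners.
class NonCollocatedJoinSketch {
    static class Node {
        final Map<Long, String> companies = new HashMap<>();
        final Map<String, Long> persons = new LinkedHashMap<>(); // name -> companyId
    }

    static List<String> query(List<Node> cluster, boolean distributedJoins) {
        List<String> rows = new ArrayList<>();
        for (Node n : cluster) {
            for (Map.Entry<String, Long> p : n.persons.entrySet()) {
                String company = n.companies.get(p.getValue());
                // D(Q): if the partner is not local, ask the other nodes for it
                if (company == null && distributedJoins)
                    for (Node other : cluster)
                        if (other != n && other.companies.containsKey(p.getValue()))
                            company = other.companies.get(p.getValue());
                if (company != null)
                    rows.add(p.getKey() + " / " + company);
            }
        }
        return rows;
    }

    // Layout taken from the logs: node B holds company 11 and person 3, node A holds the rest.
    static List<Node> cluster() {
        Node a = new Node();
        a.companies.put(1L, "yihecloud");
        a.persons.put("John Doe", 1L);
        a.persons.put("Jane Doe", 1L);
        a.persons.put("Jane Smith", 11L); // her company lives on node B
        Node b = new Node();
        b.companies.put(11L, "ultrapower");
        b.persons.put("John Smith", 11L);
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        System.out.println(query(cluster(), false).size()); // prints 3
        System.out.println(query(cluster(), true).size());  // prints 4
    }
}
```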

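5.3.1.1 Tuning

In the scenario above, to get the employee with ID=4 into the result (it sits on node A while its Company sits on node B), the node broadcasts: it asks every node to help it complete the join. That is slow.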

String sql = "select concat(firstName, ' ', lastName), org.name " + "from Person, \"" + ORG_CACHE
				+ "\".Company as org " + "where Person.companyId = org._key";

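Fortunately, Ignite provides two predefined fields that help:

_key
_val

They give us access to the complete key and value under which an entry was cached. Beyond that, look at the SQL above: where we would have written org.id we wrote org._key, since our key is the id. The result is still complete, and Ignite can now work out which node holds the data and send a unicast request to that node only, which is far more efficient. The principle is the same as storing data with affinity: why can an entry be stored on the node that holds its related data? Go back to the data grid quick start for the answer.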
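
The difference between the two lookups can be expressed as a request count. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a toy rule where the owner of a key is hash(key) modulo the node count, and LookupRoutingSketch and its methods are made-up names.

```java
// Rough model of how many remote requests one node sends to find a join partner.
class LookupRoutingSketch {
    // Join on an ordinary field: the owner is unknown, so every other node is asked.
    static int broadcastRequests(int nodes) {
        return nodes - 1;
    }

    // Toy ownership rule (an assumption of this sketch): owner = hash(key) mod node count.
    static int ownerOf(long key, int nodes) {
        return Math.floorMod(Long.hashCode(key), nodes);
    }

    // Join on the cache key: the owner can be computed, so at most one node is asked.
    static int unicastRequests(long key, int localNode, int nodes) {
        return ownerOf(key, nodes) == localNode ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(broadcastRequests(3));        // prints 2
        System.out.println(unicastRequests(11L, 0, 3));  // prints 1
    }
}
```

With two nodes the two approaches cost the same; the gap grows with the size of the cluster, since the broadcast scales with the node count while the key-based lookup stays at one request.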

5.3.2 Disabling distributed non-collocated joins

The log below was produced with setDistributedJoins(false).

Log output

[15:36:56] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=1.7GB]
start....
John Smith
ultrapower
end....
start....
John Doe
yihecloud
end....
start....
Jane Doe
yihecloud
end....
--------------------------divider line------------------------------
>>> Local all persons:
>>>     Entry [key=3, val=Person [id=3, companyId=11, lastName=Smith, firstName=John, salary=3000.0, resume=John Smith has Bachelor Degree.]]

>>> Local all company:
>>>     Entry [key=11, val=Organization [id=11, name=ultrapower93]
[15:36:59] Ignite node stopped OK [uptime=00:00:02:994]

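The log shows three result rows. Below the divider line we see that the local node holds only Company{name:ultrapower,id:11} and Person{id:3,companyId:11,firstname:John,lastname:Smith}. Attentive readers will have spotted that the missing row is Person 4, Jane Smith: she lives on node A but is an employee of the Company on node B. With distributed non-collocated joins disabled that join cannot be made, so she is not returned and the result is incomplete.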