SnappyData Tutorial

1. Running Spark and SnappyData separately:

Start Spark:

cd /opt/spark-2.1.1-bin-hadoop2.7/sbin
./start-master.sh
./start-slave.sh spark://localhost:7077

Start SnappyData:

mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-locator-1
mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-server-1
mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-lead-1
cd /opt/snappydata-1.0.1-bin/bin/
./snappy locator start  -dir=/opt/snappydata-1.0.1-bin/work/localhost-locator-1
./snappy server start  -dir=/opt/snappydata-1.0.1-bin/work/localhost-server-1  -locators=localhost[10334] -heap-size=8g
./snappy leader start  -dir=/opt/snappydata-1.0.1-bin/work/localhost-lead-1  -locators=localhost[10334] -spark.executor.cores=4
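
To confirm that the cluster came up, you can connect with the bundled SQL shell (the same snappy-sql client used later in this tutorial) and list the members. A minimal sketch; the sys.members system table is an assumption here:

cd /opt/snappydata-1.0.1-bin
./bin/snappy-sql
snappy> connect client 'localhost:1527';
snappy> -- the locator, server and lead started above should each show up as one row (sys.members is assumed)
snappy> select * from sys.members;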

Run the example:

package org.apache.spark.examples.snappydata

import org.apache.spark.sql.{SnappySession, SparkSession}

/**
 * This example shows how an application can interact with SnappyStore in Split cluster mode.
 * In this mode an application can access the metastore of an existing, running SnappyStore, and
 * can therefore query and write to tables that reside in that SnappyStore.
 *
 * To run this example you need to set up a Snappy cluster first. To do so, follow the steps
 * mentioned below.
 *
 * 1.  Go to SNAPPY_HOME, your Snappy installation directory.
 *
 * 2.  Start a Snappy cluster
 * ./sbin/snappy-start-all.sh
 * This will start a simple cluster with one data node, one lead node and a locator
 *
 * 3.  Open Snappy Shell
 * ./bin/snappy-sql
 * This will open Snappy shell which can be used to create and query tables.
 *
 * 4. Connect to the Snappy Cluster. On the shell prompt type
 * connect client 'localhost:1527';
 *
 * 5. Create a column table and insert some rows in SnappyStore. Type the following in the Snappy shell.
 *
 * CREATE TABLE SNAPPY_COL_TABLE(r1 Integer, r2 Integer) USING COLUMN;
 *
 * insert into SNAPPY_COL_TABLE VALUES(1,1);
 * insert into SNAPPY_COL_TABLE VALUES(2,2);
 *
 * 6. Run this example to see how the program interacts with the table (SNAPPY_COL_TABLE)
 * that we created in the Snappy cluster. This program also creates a table in SnappyStore.
 * After running this example you can also query the table from Snappy shell
 * e.g. select count(*) from TestColumnTable.
 *
 * bin/run-example snappydata.SmartConnectorExample
 *
 */

object SmartConnectorExample {

  def main(args: Array[String]): Unit = {

    val builder = SparkSession
      .builder
      .appName("SmartConnectorExample")
      .master("spark://localhost:7077")
      // snappydata.connection property enables the application to interact with SnappyData store
      .config("snappydata.connection", "localhost:1527")


    args.foreach( prop => {
      val params = prop.split("=")
      builder.config(params(0), params(1))
    })

    val spark: SparkSession = builder
        .getOrCreate
    val snSession = new SnappySession(spark.sparkContext)

    println("\n\n ####  Reading from the SnappyStore table SNAPPY_COL_TABLE  ####  \n")
    val colTable = snSession.table("SNAPPY_COL_TABLE")
    colTable.show(10)


    println(" ####  Creating a table TestColumnTable  #### \n")

    snSession.dropTable("TestColumnTable", ifExists = true)

    // Creating a table from a DataFrame
    val dataFrame = snSession.range(1000).selectExpr("id", "floor(rand() * 10000) as k")

    snSession.sql("create table TestColumnTable (id bigint not null, k bigint not null) using column")

    dataFrame.write.insertInto("TestColumnTable")

    println(" ####  Write to table completed. ### \n\n" +
        "Now you can query table TestColumnTable using $SNAPPY_HOME/bin/snappy-shell")

  }

}
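
Besides bin/run-example, the same program can be packaged into a jar and submitted to the standalone Spark master started above. A minimal sketch, assuming a hypothetical jar path and reusing the SnappyData package coordinates from section 2 below; the trailing key=value argument is picked up by the args.foreach loop in the example:

cd /opt/spark-2.1.1-bin-hadoop2.7
./bin/spark-submit \
  --class org.apache.spark.examples.snappydata.SmartConnectorExample \
  --master spark://localhost:7077 \
  --packages "SnappyDataInc:snappydata:1.0.1-s_2.11" \
  /path/to/smart-connector-example.jar \
  snappydata.connection=localhost:1527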

 

2. Running Spark with SnappyData integrated:

 

Getting Started with your Spark Distribution

If you are a Spark developer already using Spark 2.1.1, the fastest way to work with SnappyData is to add SnappyData as a dependency, for instance with the --packages option of the Spark shell.

Open a command terminal, go to the location of the Spark installation directory, and enter the following:

$ cd <Spark_Install_dir>
# Create a directory for SnappyData artifacts
$ mkdir quickstartdatadir
$ ./bin/spark-shell --conf spark.snappydata.store.sys-disk-dir=quickstartdatadir --conf spark.snappydata.store.log-file=quickstartdatadir/quickstart.log --packages "SnappyDataInc:snappydata:1.0.1-s_2.11"

This opens the Spark shell and downloads the relevant SnappyData files to your local machine. Depending on your network connection speed, it may take some time to download the files.
All SnappyData metadata, as well as persistent data, is stored in the directory quickstartdatadir. The spark-shell can now be used to work with SnappyData using Scala APIs and SQL.

Using the Spark shell to run SnappyData SQL

After opening the Spark shell, import SnappySession and create a session:

scala> import org.apache.spark.sql.{SnappySession, SparkSession}
scala> val snappy = new SnappySession(spark.sparkContext)

Then we can run Snappy SQL:

scala> snappy.sql("CREATE TABLE SNAPPY_COL_TABLE(r1 Integer, r2 Integer) USING COLUMN")
scala> snappy.sql("insert into SNAPPY_COL_TABLE VALUES(1,1)")
scala> snappy.sql("insert into SNAPPY_COL_TABLE VALUES(2,2)")
scala> snappy.sql("select count(*) from SNAPPY_COL_TABLE")

 

Reposted from: https://my.oschina.net/u/2935389/blog/1819826
