Frequently Asked Questions in Spark

Reposted 2015-11-17 19:32:28

Using PredictionIO

Q: How do I check to see if various dependencies, such as ElasticSearch and HBase, are running?

Run pio status from the terminal; it reports the status of the components that PredictionIO depends on.

  • You should see the following message if everything is OK:
$ pio status
PredictionIO
  Installed at: /home/vagrant/PredictionIO
  Version: 0.8.6

Apache Spark
  Installed at: /home/vagrant/PredictionIO/vendors/spark-1.2.0
  Version: 1.2.0 (meets minimum requirement of 1.2.0)

Storage Backend Connections
  Verifying Meta Data Backend
  Verifying Model Data Backend
  Verifying Event Data Backend
  Test write Event Store (App Id 0)
2015-02-03 18:52:38,904 INFO  hbase.HBLEvents - The table predictionio_eventdata:events_0 doesn't exist yet. Creating now...
2015-02-03 18:52:39,868 INFO  hbase.HBLEvents - Removing table predictionio_eventdata:events_0...

(sleeping 5 seconds for all messages to show up...)
Your system is all ready to go.
  • If you see the following error message, it usually means ElasticSearch is not running properly:
  ...
Storage Backend Connections
  Verifying Meta Data Backend
  ...
Caused by: org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: []
    at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:298)
  ...

Unable to connect to all storage backend(s) successfully. Please refer to error message(s) above. Aborting.

You can check if there is any ElasticSearch process by running 'jps'.

Please see How to start ElasticSearch below.

  • If you see the following error message, it usually means HBase is not running properly:
Storage Backend Connections
  Verifying Meta Data Backend
  Verifying Model Data Backend
  Verifying Event Data Backend
2015-02-03 18:40:04,810 ERROR zookeeper.RecoverableZooKeeper - ZooKeeper exists failed after 1 attempts
2015-02-03 18:40:04,812 ERROR zookeeper.ZooKeeperWatcher - hconnection-0x1e4075ce, quorum=localhost:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
...
2015-02-03 18:40:07,021 ERROR hbase.StorageClient - Failed to connect to HBase. Plase check if HBase is running properly.
2015-02-03 18:40:07,026 ERROR storage.Storage$ - Error initializing storage client for source HBASE
2015-02-03 18:40:07,027 ERROR storage.Storage$ - Can't connect to ZooKeeper
java.util.NoSuchElementException: None.get
...

Unable to connect to all storage backend(s) successfully. Please refer to error message(s) above. Aborting.

You can check if there is any HBase-related process by running 'jps'.

Please see How to start HBase below.
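
The jps check above can be scripted. The following is a minimal sketch, not part of PredictionIO: it assumes jps prints one "<pid> <MainClass>" pair per line, and that a default install shows the process names "Elasticsearch" and "HMaster" (adjust them if your jps output differs). The helper names has_jvm_process and report_backends are hypothetical.

```shell
#!/bin/sh
# has_jvm_process NAME: reads jps-style output on stdin and succeeds
# if the second field of any line equals NAME.
has_jvm_process() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# report_backends: hypothetical helper that prints one line per backend.
# The process names are assumptions about what a default install shows.
report_backends() {
  listing=$(jps)
  for name in Elasticsearch HMaster; do
    if printf '%s\n' "$listing" | has_jvm_process "$name"; then
      echo "$name: running"
    else
      echo "$name: not found"
    fi
  done
}
```

Calling report_backends after sourcing the file prints which of the two processes jps can see. A process being listed does not guarantee that it is accepting connections, so pio status remains the real test.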

Q: How to start ElasticSearch?

If you used the install script to install PredictionIO, ElasticSearch is installed at ~/PredictionIO/vendors/elasticsearch-x.y.z/, where x.y.z is the version number (currently 1.4.4). To start it, run:

$ ~/PredictionIO/vendors/elasticsearch-x.y.z/bin/elasticsearch

If you didn't use the install script, go to the directory where ElasticSearch is installed and start it from there.

It may take some time (15 seconds or so) for ElasticSearch to become ready after you start it (wait a bit before you run pio status again).

Q: How to start HBase?

If you used the install script to install PredictionIO, HBase is installed at ~/PredictionIO/vendors/hbase-x.y.z/, where x.y.z is the version number (currently 0.98.6). To start it, run:

$ ~/PredictionIO/vendors/hbase-x.y.z/bin/start-hbase.sh

If you didn't use the install script, go to the directory where HBase is installed and start it from there.

It may take some time (15 seconds or so) for HBase to become ready after you start it (wait a bit before you run pio status again).
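
Instead of guessing how long to wait, you can poll until the check passes. This is a minimal sketch under one assumption: that pio status exits with a non-zero code when a backend is unreachable, which the "Aborting." message suggests but which you should confirm on your version. The helper name retry_until is hypothetical.

```shell
#!/bin/sh
# retry_until ATTEMPTS DELAY CMD...: run CMD until it succeeds, sleeping
# DELAY seconds between tries; returns 1 after ATTEMPTS failed tries.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    "$@" >/dev/null 2>&1 && return 0
    # Sleep only if another attempt is still to come.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): poll for up to about a minute.
# retry_until 12 5 pio status && echo "storage backends are ready"
```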

Problem with Event Server

Q: How do I increase the JVM heap size of the Event Server?

Set the JAVA_OPTS environment variable to supply JVM options, e.g.

$ JAVA_OPTS=-Xmx16g bin/pio eventserver ...

Engine Training

Q: How to increase Spark driver program and worker executor memory size?

In general, the PredictionIO bin/pio script wraps Spark's spark-submit script. You can specify many Spark configurations (e.g. executor memory, cores, master URL) with it by supplying them as pass-through arguments at the end of the bin/pio command.

If the engine training seems stuck, it's possible that the executor doesn't have enough memory.

First, follow the Spark documentation to start a standalone Spark cluster and get the master URL. If you used the provided quick install script to install PredictionIO, Spark is installed at PredictionIO/vendors/spark-1.2.0/, where you can run the Spark commands in sbin/ as described in the Spark documentation. Then use the following train command to specify the executor memory (the default is only 512 MB) and the driver memory.

For example, the following command sets the Spark master to spark://localhost:7077 (the default URL of a standalone cluster), the driver memory to 16G and the executor memory to 24G for pio train.

$ pio train -- --master spark://localhost:7077 --driver-memory 16G --executor-memory 24G
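
The bare "--" in the command above separates pio's own arguments from the ones handed to spark-submit. The sketch below only illustrates that convention; it is not PredictionIO's actual argument parser, and the function name passthrough_args is hypothetical.

```shell
#!/bin/sh
# passthrough_args ARGS...: print, one per line, every argument that
# follows the first bare "--". Prints nothing if there is no "--".
passthrough_args() {
  while [ "$#" -gt 0 ]; do
    arg=$1
    shift
    if [ "$arg" = "--" ]; then
      # Everything still in "$@" belongs to the wrapped command.
      [ "$#" -gt 0 ] && printf '%s\n' "$@"
      return 0
    fi
  done
}

# Example:
# passthrough_args train -- --driver-memory 16G --executor-memory 24G
```

If the "--" is left out, nothing counts as a pass-through argument in this sketch, so flags such as --driver-memory belong after the "--".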

Q: How to resolve "Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Serialized task 165:35 was 110539813 bytes, which exceeds max allowed: spark.akka.frameSize (10485760 bytes) - reserved (204800 bytes). Consider increasing spark.akka.frameSize or using broadcast variables for large values."?

A likely reason is that the local algorithm model is larger than the default frame size. You can specify a larger value as a pass-through argument to spark-submit when you run pio train. The following command increases the frame size to 1024 MB.

$ pio train -- --conf spark.akka.frameSize=1024
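
To pick a value that is no larger than necessary, you can work backwards from the numbers in the error message. The sketch below assumes the limit is exactly what the message states (frame size in bytes minus the reserved bytes) and that spark.akka.frameSize is given in whole MB; the function name min_frame_size_mb is hypothetical.

```shell
#!/bin/sh
# min_frame_size_mb TASK_BYTES [RESERVED_BYTES]: smallest whole number of
# MB such that (MB * 1024 * 1024 - RESERVED_BYTES) >= TASK_BYTES.
# RESERVED_BYTES defaults to the 204800 shown in the error message.
min_frame_size_mb() {
  task_bytes=$1
  reserved=${2:-204800}
  mb=1048576
  # Integer ceiling division of (task_bytes + reserved) by one MB.
  echo $(( (task_bytes + reserved + mb - 1) / mb ))
}

# For the task size in the error message above:
# min_frame_size_mb 110539813    # prints 106
```

For the 110539813-byte task in the question, this gives 106, so any value from 106 upwards (including the 1024 used above) clears the limit. Leaving some headroom is sensible because the serialized task size can change between runs.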

Deploy Engine

Q: How to increase heap space memory for "pio deploy"?

If you see the following error during pio deploy, it means there is not enough heap space memory.

...
[ERROR] [LocalFSModels] Java heap space
[ERROR] [OneForOneStrategy] None.get
...

To increase the heap space, pass "-- --driver-memory" followed by the desired size in the command. For example, to set the driver memory to 8G when deploying the engine:

$ pio deploy -- --driver-memory 8G

Building PredictionIO

Q: How to resolve "Error: Could not find or load main class io.prediction.tools.Console" after ./make_distribution.sh?

$ bin/pio app
Error: Could not find or load main class io.prediction.tools.Console

When PredictionIO bumps a version, it creates another JAR file with the new version number.

Delete everything but the latest pio-assembly-<VERSION>.jar in $PIO_HOME/assembly directory. For example:

PredictionIO$ cd assembly/
PredictionIO/assembly$ ls -al
total 197776
drwxr-xr-x  2 yipjustin yipjustin      4096 Nov 12 00:08 .
drwxr-xr-x 17 yipjustin yipjustin      4096 Nov 12 00:09 ..
-rw-r--r--  1 yipjustin yipjustin 101184982 Nov  5 06:05 pio-assembly-0.8.1-SNAPSHOT.jar
-rw-r--r--  1 yipjustin yipjustin 101324859 Nov 12 00:09 pio-assembly-0.8.2.jar

PredictionIO/assembly$ rm pio-assembly-0.8.1-SNAPSHOT.jar
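
The cleanup above can be scripted. This is a minimal sketch that treats the most recently modified JAR as the latest one, which is an assumption: it matches the listing above but may not hold if you have touched older files. The function name stale_assemblies is hypothetical, and the sketch does not handle file names containing newlines.

```shell
#!/bin/sh
# stale_assemblies DIR: list every pio-assembly-*.jar in DIR except the
# most recently modified one.
stale_assemblies() {
  # ls -t sorts newest first; tail -n +2 skips that newest entry.
  ls -t "$1"/pio-assembly-*.jar 2>/dev/null | tail -n +2
}

# Review the list before deleting anything:
# stale_assemblies "$PIO_HOME/assembly"
# stale_assemblies "$PIO_HOME/assembly" | while read -r jar; do rm "$jar"; done
```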

Q: How to resolve ".......error java.lang.AssertionError: assertion failed: java.lang.AutoCloseable" when running ./make_distribution.sh?

PredictionIO only supports Java 7 or later. Please make sure you have the correct Java version with the command:

$ javac -version
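
If you want a build script to enforce this, the version string can be parsed. The sketch below assumes the output has the form "javac 1.7.0_75" (older JDKs) or "javac 11.0.2" (newer JDKs); the function name java_major is hypothetical.

```shell
#!/bin/sh
# java_major VERSION_LINE: print the major Java version from a line such
# as "javac 1.7.0_75" (prints 7) or "javac 11.0.2" (prints 11).
java_major() {
  v=${1#javac }       # drop the leading "javac "
  v=${v#1.}           # drop the legacy "1." prefix used up to Java 8
  echo "${v%%[._-]*}" # keep only the digits before the first separator
}

# Example (older JDKs print the version to stderr, hence 2>&1):
# major=$(java_major "$(javac -version 2>&1)")
# [ "$major" -ge 7 ] || echo "Java 7 or later is required" >&2
```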

Engine Development

Q: What's the difference between P- and L- prefixed classes and functions?

PredictionIO v0.8 is built on top of Spark, a massively scalable programming framework. A Spark algorithm differs from a conventional single-machine algorithm in that it uses the RDD abstraction as its primary data type.

The PredictionIO framework natively supports both RDD-based algorithms and traditional single-machine algorithms. Controllers prefixed with "P" (e.g. PJavaDataSource, PJavaAlgorithm) work with data that includes the RDD abstraction; controllers prefixed with "L" are traditional single-machine algorithms.

Running HBase

Q: How to resolve 'Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.net.DNS.reverseDns(DNS.java:92)'?

HBase relies on reverse DNS being set up properly. If your network configuration changes (for example, when working on a laptop with public WiFi hotspots), reverse DNS may stop functioning properly. You can install a DNS server on your own computer. Some users have reported that using Google Public DNS also solves the problem.

If you have other questions, you can search or post on the user group or email the core team directly.
