Developing Your First Application


In this tutorial, you will learn how to develop your first InsightEdge application that reads data from and writes data to the Data Grid. This tutorial assumes that you have basic knowledge of Apache Spark.

See also:

For installing and launching the InsightEdge cluster, refer to Quick Start for the minimum cluster setup.

Project dependencies

InsightEdge 1.0.0 runs on Spark 1.6.0 and Scala 2.10.4. These dependencies are included transitively when you depend on the InsightEdge artifacts.

InsightEdge jars are not published to the Maven Central Repository yet. To install the artifacts into your local Maven repository, make sure you have Maven installed and then run:

Unix:
./sbin/insightedge-maven.sh

Windows:
sbin\insightedge-maven.cmd

For SBT projects, include the following:

resolvers += Resolver.mavenLocal
resolvers += "Openspaces Maven Repository" at "http://maven-repository.openspaces.org"

libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-core" % "1.0.0" % "provided" exclude("javax.jms", "jms")

libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-scala" % "1.0.0" % "provided" exclude("javax.jms", "jms")

And if you are building with Maven:

<dependency>
    <groupId>org.gigaspaces.insightedge</groupId>
    <artifactId>insightedge-core</artifactId>
    <version>1.0.0</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.gigaspaces.insightedge</groupId>
    <artifactId>insightedge-scala</artifactId>
    <version>1.0.0</version>
    <scope>provided</scope>
</dependency>

The InsightEdge jars are already packed into the InsightEdge distribution and are loaded automatically with your application when you submit it with the insightedge-submit script or run it from the Web Notebook, so you don't need to pack them into your uber jar. However, if you want to run Spark in local[*] mode, declare the dependencies with the compile scope.
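For example, if you build with SBT and run in local[*] mode, the same artifacts can be declared without the provided scope (compile is SBT's default scope); this is a sketch of that change:

// compile scope (the default), so the InsightEdge jars are packed with the application for local runs
libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-core" % "1.0.0" exclude("javax.jms", "jms")
libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-scala" % "1.0.0" exclude("javax.jms", "jms")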

Developing the Spark application

InsightEdge provides an extension to the regular Spark API.

See also:

Please refer to Self-Contained Applications if you are new to Spark.

InsightEdgeConfig is the starting point in connecting Spark with the Data Grid.

Create the InsightEdgeConfig and the SparkContext:

import org.apache.spark.{SparkConf, SparkContext}
import org.insightedge.spark.context.InsightEdgeConfig
import org.insightedge.spark.implicits.all._

val ieConfig = InsightEdgeConfig("insightedge-space", Some("insightedge"), Some("127.0.0.1:4174"))
val sparkConf = new SparkConf().setAppName("sample-app").setMaster("spark://127.0.0.1:7077").setInsightEdgeConfig(ieConfig)
val sc = new SparkContext(sparkConf)

It is important to import org.insightedge.spark.implicits.all._, as it enables the Data Grid-specific API on the Spark classes.

“insightedge-space”, “insightedge” and “127.0.0.1:4174” are the default Data Grid settings (the space name, lookup group and lookup locator, respectively).
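If your grid runs with non-default settings, pass them in the same order (space name, lookup group, lookup locator); the values below are hypothetical:

val ieConfig = InsightEdgeConfig("my-space", Some("my-lookup-group"), Some("10.0.0.5:4174"))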

When you are running Spark applications from the Web Notebook, the InsightEdgeConfig is created implicitly with the properties defined in the Spark interpreter.

Modeling Data Grid objects

Create a Product.scala file with a case class that represents a Product entity in the Data Grid:

import org.insightedge.scala.annotation._
import scala.beans.{BeanProperty, BooleanBeanProperty}

case class Product(

   @BeanProperty
   @SpaceId  // marks the property used as the Data Grid ID
   var id: Long,

   @BeanProperty
   var description: String,

   @BeanProperty
   var quantity: Int,

   @BooleanBeanProperty
   var featuredProduct: Boolean

) {
    // no-arg constructor so the Data Grid can instantiate the class
    def this() = this(-1, null, -1, false)
}

Saving to Data Grid

To save a Spark RDD to the Data Grid, use the saveToGrid method.

import scala.util.Random

val products = (1 to 1000).map(i => Product(i, "Description of product " + i, Random.nextInt(10), Random.nextBoolean()))
val rdd = sc.parallelize(products)
rdd.saveToGrid()

Loading and analyzing data from Data Grid

Use the gridRdd method of the SparkContext to view Data Grid objects as a Spark RDD:

val gridRdd = sc.gridRdd[Product]()
println("total products quantity: " + gridRdd.map(_.quantity).sum())

Closing context

When you are done, close the Spark context and all connections to the Data Grid with:

sc.stopInsightEdgeContext()

Under the hood it also calls the regular Spark sc.stop(), so there is no need to call that manually.

Running your Spark application

After you have packaged the jar, submit the Spark job via insightedge-submit instead of spark-submit.

Unix:
./bin/insightedge-submit --class com.insightedge.spark.example.YourMainClass --master spark://127.0.0.1:7077 path/to/jar/insightedge-examples.jar

Windows:
bin\insightedge-submit --class com.insightedge.spark.example.YourMainClass --master spark://127.0.0.1:7077 path\to\jar\insightedge-examples.jar