Graph Views with Apache Spark GraphX

azharuddin · February 22, 2018, 1:22pm

Spark GraphX and Its Importance

Graphs have become important in almost every sector, from targeted advertising to social networks. Despite this demand, it is hard to find a tool that handles graph workloads efficiently, which makes graph computation tedious to build and difficult to maintain. For developers, it often feels like a burden. This need led the Spark project to introduce GraphX, a component designed to cope with demanding graph workloads.


Spark has proved itself efficient from the beginning, and GraphX builds on that foundation. GraphX is Spark's API for graphs, such as social networks and web graphs, and for graph-parallel computation, such as collaborative filtering and PageRank. It extends the Spark RDD abstraction by introducing the Resilient Distributed Property Graph: a directed multigraph with properties attached to every vertex and edge. To support graph computation, GraphX exposes a set of fundamental operators, such as mapReduceTriplets, joinVertices and subgraph, along with an optimized variant of the Pregel API. GraphX also includes a growing collection of graph builders and algorithms. Because it runs on Spark, it offers the same speed, efficiency and scalability for graph workloads as for other Spark jobs. In short, it covers most common graph-processing needs in a single package.
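To make the idea of a property graph more concrete, here is a minimal sketch in Scala. It assumes Spark 2.x or later with the GraphX module on the classpath and a local master. The object name, the relationship labels and the edges are made up for illustration; only the user names and ages are borrowed from the sample output later in this post. It builds a small directed multigraph and then applies the subgraph operator.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object PropertyGraphSketch {
  // Vertex property: (name, age). Edge property: a relationship label.
  // The edges and labels are hypothetical illustration data.
  def build(sc: SparkContext): Graph[(String, Int), String] = {
    val users: RDD[(VertexId, (String, Int))] = sc.parallelize(Seq(
      (1L, ("Jason", 52)),
      (2L, ("Morgan", 50)),
      (3L, ("Nicolas", 55))
    ))
    val relations: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"),
      Edge(1L, 2L, "messaged"), // parallel edge, allowed in a multigraph
      Edge(2L, 3L, "follows")
    ))
    Graph(users, relations)
  }

  // subgraph keeps only the edges that satisfy the edge predicate.
  def followsOnly(graph: Graph[(String, Int), String]): Graph[(String, Int), String] =
    graph.subgraph(epred = t => t.attr == "follows")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("PropertyGraphSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val graph = build(sc)
      println(s"all edges: ${graph.edges.count()}")
      println(s"follows edges: ${followsOnly(graph).edges.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Note that the two edges from vertex 1 to vertex 2 coexist because each carries its own property, which is what distinguishes a multigraph from a simple graph.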

Graph Views

Sometimes you need to extract the vertex and edge views of a graph as RDDs, for example when saving or aggregating the result of a calculation. The Graph class provides the members graph.vertices and graph.edges for accessing the vertices and edges of the graph. Both are RDDs, and they are backed by GraphX's optimized internal representation of the graph data.
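The two views can be pulled out roughly as follows. This is a sketch only, assuming Spark 2.x or later with GraphX on the classpath. The object name, the helper names (build, vertexRows, edgeRows) and the edge data are hypothetical; graph.vertices and graph.edges are the actual GraphX members.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD, Graph, VertexId, VertexRDD}

object GraphViewsSketch {
  type UserGraph = Graph[(String, Int), String]

  // Hypothetical sample graph: (name, age) on vertices, a label on edges.
  def build(sc: SparkContext): UserGraph = {
    val users = sc.parallelize(Seq[(VertexId, (String, Int))](
      (1L, ("Jason", 52)), (2L, ("Morgan", 50)), (3L, ("Nicolas", 55))))
    val relations = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
    Graph(users, relations)
  }

  // Vertex view: an RDD of (VertexId, vertex property) pairs.
  def vertexRows(graph: UserGraph): List[(VertexId, String, Int)] = {
    val view: VertexRDD[(String, Int)] = graph.vertices
    view.map { case (id, (name, age)) => (id, name, age) }.collect().toList.sorted
  }

  // Edge view: an RDD of Edge objects carrying srcId, dstId and attr.
  def edgeRows(graph: UserGraph): List[(VertexId, VertexId, String)] = {
    val view: EdgeRDD[String] = graph.edges
    view.map(e => (e.srcId, e.dstId, e.attr)).collect().toList.sorted
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("GraphViewsSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val graph = build(sc)
      vertexRows(graph).foreach(println)
      edgeRows(graph).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because both views are ordinary RDDs, the usual RDD operations (map, filter, collect, and the save methods) apply to them directly.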

To display the names of users who are older than thirty, filter the graph.vertices view.
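A minimal sketch of such a filter is shown here, assuming Spark 2.x or later with GraphX on the classpath. The object and helper names are hypothetical, as are the single edge and the two younger users (Alice and Bob), which are included only so that the filter has something to exclude.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object OlderThanThirtySketch {
  type UserGraph = Graph[(String, Int), String]

  // Vertex property: (name, age). Alice and Bob are hypothetical additions.
  def build(sc: SparkContext): UserGraph = {
    val users = sc.parallelize(Seq[(VertexId, (String, Int))](
      (1L, ("Jason", 52)), (2L, ("Morgan", 50)), (3L, ("Nicolas", 55)),
      (4L, ("john", 45)), (5L, ("Alice", 28)), (6L, ("Bob", 27))))
    val relations = sc.parallelize(Seq(Edge(5L, 1L, "follows"))) // hypothetical edge
    Graph(users, relations)
  }

  // Filter the vertex view by age and format one line per remaining user.
  def describeOlderThan(graph: UserGraph, minAge: Int): List[String] =
    graph.vertices
      .filter { case (_, (_, age)) => age > minAge }
      .map { case (_, (name, age)) => s"$name is $age" }
      .collect()
      .toList

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("OlderThanThirtySketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      describeOlderThan(build(sc), 30).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The order of the printed lines depends on how the vertices are partitioned, so it can differ from run to run.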

The output might look like this:

Jason is 52
Morgan is 50
Nicolas is 55
john is 45

Remember that Spark's own log messages will appear alongside this output.