Commit 192d1f9

WeichenXu123 authored and srowen committed
[GRAPHX][EXAMPLES] move graphx test data directory and update graphx document
## What changes were proposed in this pull request?

Two test data files used by the GraphX examples lived in the `graphx/data` directory. I moved them under the top-level `data/` directory (the updated docs reference `data/graphx/`), because the `graphx` directory is meant for code and the other test data files (such as the MLlib and streaming test data) already live under `data/`. I also updated the GraphX documentation wherever it references the moved files.

## How was this patch tested?

N/A

Author: WeichenXu <[email protected]>

Closes apache#14010 from WeichenXu123/move_graphx_data_dir.
1 parent bad0f7d commit 192d1f9

3 files changed: +9 -9 lines changed
File renamed without changes.
File renamed without changes.

docs/graphx-programming-guide.md

+9 -9
@@ -1007,15 +1007,15 @@ PageRank measures the importance of each vertex in a graph, assuming an edge fro

 GraphX comes with static and dynamic implementations of PageRank as methods on the [`PageRank` object][PageRank]. Static PageRank runs for a fixed number of iterations, while dynamic PageRank runs until the ranks converge (i.e., stop changing by more than a specified tolerance). [`GraphOps`][GraphOps] allows calling these algorithms directly as methods on `Graph`.

-GraphX also includes an example social network dataset that we can run PageRank on. A set of users is given in `graphx/data/users.txt`, and a set of relationships between users is given in `graphx/data/followers.txt`. We compute the PageRank of each user as follows:
+GraphX also includes an example social network dataset that we can run PageRank on. A set of users is given in `data/graphx/users.txt`, and a set of relationships between users is given in `data/graphx/followers.txt`. We compute the PageRank of each user as follows:

 {% highlight scala %}
 // Load the edges as a graph
-val graph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")
+val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
 // Run PageRank
 val ranks = graph.pageRank(0.0001).vertices
 // Join the ranks with the usernames
-val users = sc.textFile("graphx/data/users.txt").map { line =>
+val users = sc.textFile("data/graphx/users.txt").map { line =>
   val fields = line.split(",")
   (fields(0).toLong, fields(1))
 }
@@ -1032,11 +1032,11 @@ The connected components algorithm labels each connected component of the graph

 {% highlight scala %}
 // Load the graph as in the PageRank example
-val graph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")
+val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
 // Find the connected components
 val cc = graph.connectedComponents().vertices
 // Join the connected components with the usernames
-val users = sc.textFile("graphx/data/users.txt").map { line =>
+val users = sc.textFile("data/graphx/users.txt").map { line =>
   val fields = line.split(",")
   (fields(0).toLong, fields(1))
 }
@@ -1053,11 +1053,11 @@ A vertex is part of a triangle when it has two adjacent vertices with an edge be

 {% highlight scala %}
 // Load the edges in canonical order and partition the graph for triangle count
-val graph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt", true).partitionBy(PartitionStrategy.RandomVertexCut)
+val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt", true).partitionBy(PartitionStrategy.RandomVertexCut)
 // Find the triangle count for each vertex
 val triCounts = graph.triangleCount().vertices
 // Join the triangle counts with the usernames
-val users = sc.textFile("graphx/data/users.txt").map { line =>
+val users = sc.textFile("data/graphx/users.txt").map { line =>
   val fields = line.split(",")
   (fields(0).toLong, fields(1))
 }
@@ -1081,11 +1081,11 @@ all of this in just a few lines with GraphX:
 val sc = new SparkContext("spark://master.amplab.org", "research")

 // Load my user data and parse into tuples of user id and attribute list
-val users = (sc.textFile("graphx/data/users.txt")
+val users = (sc.textFile("data/graphx/users.txt")
   .map(line => line.split(",")).map( parts => (parts.head.toLong, parts.tail) ))

 // Parse the edge data which is already in userId -> userId format
-val followerGraph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")
+val followerGraph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")

 // Attach the user attributes
 val graph = followerGraph.outerJoinVertices(users) {
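
Every example touched by this diff parses `users.txt` the same way: split each line on commas and treat the first field as the vertex id. The following is a minimal, Spark-free sketch of that parsing step, useful for checking the logic without a cluster. The object name, the helper names `parseUser` and `parseUserAttrs`, and the sample lines are hypothetical and not part of the commit; the real file contents are not shown in the diff.

```scala
// Hypothetical sketch: none of these names come from the commit itself.
object ParseUsersSketch {
  // Mirrors the PageRank, connected components, and triangle count examples:
  // keep the vertex id and the first attribute after it.
  def parseUser(line: String): (Long, String) = {
    val fields = line.split(",")
    (fields(0).toLong, fields(1))
  }

  // Mirrors the final example in the diff: keep the vertex id and every
  // remaining attribute as an array.
  def parseUserAttrs(line: String): (Long, Array[String]) = {
    val parts = line.split(",")
    (parts.head.toLong, parts.tail)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample lines, assumed to resemble "id,username,..." records.
    val sample = Seq("1,alice,Alice A", "2,bob")
    sample.map(parseUser).foreach(println)
    sample.map(parseUserAttrs).foreach { case (id, attrs) =>
      println(s"$id -> ${attrs.mkString("[", ", ", "]")}")
    }
  }
}
```

In the Spark examples the same functions would be applied inside `sc.textFile("data/graphx/users.txt").map(...)`; because that path is relative, it presumably resolves against the directory the job is launched from.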