Start with an ordinary array:
<code class="hljs javascript has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">scala> var arr=Array(1.0,2,3,4)
arr: Array[Double] = Array(1.0, 2.0, 3.0, 4.0)</code>
It can be converted into a Vector:
<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">scala> import org.apache.spark.mllib.linalg._
scala> var vec=Vectors.dense(arr)
vec: org.apache.spark.mllib.linalg.Vector = [1.0,2.0,3.0,4.0]</code>
Next, build an RDD[Vector]:
<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">scala> val rdd=sc.makeRDD(Seq(Vectors.dense(arr),Vectors.dense(arr.map(_*10)),Vectors.dense(arr.map(_*100))))
rdd: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector] = ParallelCollectionRDD[6] at makeRDD at &lt;console&gt;:26</code>
A distributed matrix can be built from this RDD:
<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">scala> import org.apache.spark.mllib.linalg.distributed._
scala> val mat: RowMatrix = new RowMatrix(rdd)
mat: org.apache.spark.mllib.linalg.distributed.RowMatrix = org.apache.spark.mllib.linalg.distributed.RowMatrix@3133b850
scala> val m = mat.numRows()
m: Long = 3
scala> val n = mat.numCols()
n: Long = 4</code>
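As a side note, a RowMatrix can also report per-column statistics itself through `computeColumnSummaryStatistics()`. The sketch below is a standalone variant of the session above. The object name, the `local[*]` master and the app name are assumptions made for illustration; inside spark-shell you would reuse the existing `sc` instead of creating a new context.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

object RowMatrixSummary {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master for a standalone run; not part of the original session.
    val conf = new SparkConf().setMaster("local[*]").setAppName("RowMatrixSummary")
    val sc = new SparkContext(conf)
    try {
      // Same three rows as in the session: arr, arr * 10, arr * 100.
      val arr = Array(1.0, 2.0, 3.0, 4.0)
      val rdd = sc.makeRDD(Seq(
        Vectors.dense(arr),
        Vectors.dense(arr.map(_ * 10)),
        Vectors.dense(arr.map(_ * 100))))

      val mat = new RowMatrix(rdd)

      // One pass over the rows yields a summary with several column-wise statistics.
      val summary = mat.computeColumnSummaryStatistics()
      println(s"count = ${summary.count}")
      println(s"mean  = ${summary.mean}")
      println(s"min   = ${summary.min}")
      println(s"max   = ${summary.max}")
    } finally {
      sc.stop() // always release the local context
    }
  }
}
```

Running this requires the spark-core and spark-mllib artifacts on the classpath.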
Try the statistics utilities to compute the column means:
<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">scala> import org.apache.spark.mllib.stat.Statistics
scala> var sum=Statistics.colStats(rdd)
scala> sum.mean
res7: org.apache.spark.mllib.linalg.Vector = [37.0,74.0,111.0,148.0]</code>
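The means reported above can be cross-checked by hand without Spark: each column holds x, 10x and 100x, so its mean is 111x / 3 = 37x. Here is a minimal plain-Scala sketch of that check. The object `ColumnMeans` and the helper `columnMeans` are hypothetical names introduced only for this illustration, and the helper assumes every row has the same length.

```scala
object ColumnMeans {
  // Column-wise mean of a row-major matrix.
  // Assumption: all rows have the same number of elements.
  def columnMeans(rows: Seq[Seq[Double]]): Seq[Double] =
    rows.transpose.map(col => col.sum / col.length)

  def main(args: Array[String]): Unit = {
    // Same data as the RDD in the session: arr, arr * 10, arr * 100.
    val arr = Seq(1.0, 2.0, 3.0, 4.0)
    val rows = Seq(arr, arr.map(_ * 10), arr.map(_ * 100))

    // prints [37.0,74.0,111.0,148.0]
    println(columnMeans(rows).mkString("[", ",", "]"))
  }
}
```

This only mirrors the arithmetic on a local collection; `Statistics.colStats` computes the same quantity over the distributed rows.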
Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.