1. Declaring values and variables
scala> val myStr = "Hello World!"
scala> val myStr2 : String = "Hello World!"
scala> var myPrice : Double = 9.9
scala> myPrice = 10.6
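2. Basic data types and operations
scala> i += 1 // increment i (i must have been declared as a var)
scala> val sum1 = 5 + 3
5.toString() // yields the string "5"
"abc".intersect("bcd") // returns "bc"; intersect() returns the characters that occur in both strings
scala> println(mySet.contains("Scala")) // assuming mySet is an immutable set that contains "Scala"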
true
3. Range
scala> 1 to 5
scala> 1 until 5
scala> 1 to 10 by 2
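As a small sketch of how the three forms above differ, the following converts each Range to a List so that the elements are visible. The object name RangeDemo is only an illustration and is not part of the original notes.

```scala
object RangeDemo {
  // `to` includes the upper bound
  val inclusive: List[Int] = (1 to 5).toList      // List(1, 2, 3, 4, 5)
  // `until` excludes the upper bound
  val exclusive: List[Int] = (1 until 5).toList   // List(1, 2, 3, 4)
  // `by` sets the step between elements
  val stepped: List[Int] = (1 to 10 by 2).toList  // List(1, 3, 5, 7, 9)

  def main(args: Array[String]): Unit = {
    println(inclusive)
    println(exclusive)
    println(stepped)
  }
}
```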
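4. Output with print
print("My name is:")
println("My name is:") // println appends a newline
printf("My name is %s. I have %d apples and %d eggs.\n","Ziyu",i,j) // C-style formatted output; i and j are assumed to be Int values defined earlier
5. Reading and writing files
Writing data to a text file
import java.io.PrintWriter // PrintWriter comes from the Java standard library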
val out = new PrintWriter("/usr/local/scala/mycode/output.txt")
for (i <- 1 to 5) out.println(i)
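out.close() // Note: call out.close() so that the buffered data is flushed to output.txt
Reading lines from a text file
scala> import scala.io.Source
scala> val inputFile = Source.fromFile("output.txt", "UTF-8") // relative path: assumes the REPL was started in the directory that holds output.txt
scala> val lines = inputFile.getLines() // the result is an iterator
scala> for (line <- lines) println(line)
scala> val lineArray = Source.fromFile("output.txt", "UTF-8").getLines().toArray // the iterator above is now exhausted, so the file is opened again
scala> println(lineArray.size)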
6. Conditionals and loops
Conditionals
val x = 3
val a = if (x > 0) 1 else -1 // if is an expression, so its result can be assigned
if (x > 0) {
  println("This is a positive number")
} else if (x == 0) {
  println("This is a zero")
} else {
  println("This is a negative number")
}
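Loops
var i = 9
while (i > 0) {
  i -= 1
  printf("i is %d\n", i)
}
var i = 0 // in the REPL, i can simply be declared again
do { // do-while is Scala 2 syntax; Scala 3 dropped it
  i += 1
  println(i)
} while (i < 5)
The for loop has the form: for (variable <- expression) statement block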
for (i <- 1 to 5) println(i)
for (i <- 1 to 5 by 2) println(i)
for (i <- 1 to 5; j <- 1 to 3) println(i*j)
7. Data structures
Arrays
val myStrArr = new Array[String](3) // declares a String array of length 3; every element is initialized to null
myStrArr(0) = "BigData"
myStrArr(1) = "Hadoop"
myStrArr(2) = "Spark"
for (i <- 0 to 2) println(myStrArr(i))
In practice, Scala offers a more concise way to declare and initialize an array:
val intValueArr = Array(12,45,33)
val myStrArr = Array("BigData","Hadoop","Spark")
Lists
val intList = List(1,2,3)
val intListOther = 0::intList
val intList = 1::2::3::Nil
val intList1 = List(1,2)
val intList2 = List(3,4)
val intList3 = intList1:::intList2
Tuples
Unlike a list, whose elements must all have the same type, a tuple can hold elements of different types.
val tuple = ("BigData",2015,45.0)
println(tuple._2)
2015
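Sets
A set is a collection of distinct elements; unlike a list, it holds no duplicates and, by default, does not keep insertion order.
1. Immutable sets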
scala> var mySet = Set("Hadoop","Spark")
scala> println(mySet.contains("Spark"))
2. Mutable sets
import scala.collection.mutable.Set
val myMutableSet = Set("Database","BigData")
myMutableSet += "Cloud Computing"
println(myMutableSet)
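// prints, for example, Set(BigData, Cloud Computing, Database); the element order is not guaranteed
Maps
In Scala, a Map is a collection of key-value pairs (similar to a dictionary).
1. Immutable maps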
val university = Map("XMU" -> "Xiamen University", "THU" -> "Tsinghua University","PKU"->"Peking University")
println(university("XMU")) // look up the value for a key
// check whether the map contains a given key before reading it
val xmu = if (university.contains("XMU")) university("XMU") else 0
println(xmu)
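Because the if/else lookup above mixes a String and an Int, the result is inferred as Any. As a hedged alternative, the sketch below uses the standard get and getOrElse methods of Map, which keep the result typed. The object name MapLookupDemo and the default string "unknown" are assumptions made for this illustration.

```scala
object MapLookupDemo {
  val university: Map[String, String] = Map(
    "XMU" -> "Xiamen University",
    "THU" -> "Tsinghua University"
  )

  // get returns an Option: Some(value) if the key exists, None otherwise
  val found: Option[String] = university.get("XMU")
  val missing: Option[String] = university.get("FZU")

  // getOrElse supplies a default of the same type as the values
  val nameOrDefault: String = university.getOrElse("FZU", "unknown")

  def main(args: Array[String]): Unit = {
    println(found)
    println(missing)
    println(nameOrDefault)
  }
}
```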
2. Mutable maps
import scala.collection.mutable.Map
val university2 = Map("XMU" -> "Xiamen University", "THU" -> "Tsinghua University","PKU"->"Peking University")
university2("XMU") = "Ximan University" // update the value of an existing key
university2("FZU") = "Fuzhou University" // add a new entry
university2 += ("TJU"->"Tianjin University") // add one new entry
university2 ++= List("SDU"->"Shandong University","WHU"->"Wuhan University") // add two new entries at once
Iterating over a map
for ((k,v) <- map) statement block
Example:
for ((k,v) <- university) printf("Code is : %s and name is: %s\n",k,v)
for (k<-university.keys) println(k)
The sections below are the main focus.
I. Classes
Getter and setter methods
class Counter {
  private var privateValue = 0 // a private field; how can it be modified from outside?
  def value = privateValue // getter: value returns the private field privateValue
  def value_=(newValue: Int): Unit = {
    if (newValue > 0) privateValue = newValue
  }
  // setter: calling value_= modifies privateValue
  def increment(step: Int): Unit = { value += step }
  def current(): Int = { value }
}
object MyCounter {
  def main(args: Array[String]): Unit = {
    val myCounter = new Counter
    println(myCounter.value) // prints value, i.e. the initial privateValue (0)
    myCounter.value_=(3) // calls value_=() to modify privateValue
    println(myCounter.value) // prints the new value (3)
    myCounter.increment(1) // increases the counter by a step of 1
    println(myCounter.current()) // prints 4
  }
}
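When a class defines both a getter named value and a setter named value_=, Scala also accepts plain assignment syntax for the setter. The following is a minimal sketch of that; the wrapper object SetterDemo is a name invented for this illustration, and the Counter class is a trimmed copy of the one above.

```scala
object SetterDemo {
  class Counter {
    private var privateValue = 0
    def value: Int = privateValue            // getter
    def value_=(newValue: Int): Unit = {     // setter, only accepts positive values
      if (newValue > 0) privateValue = newValue
    }
  }

  def main(args: Array[String]): Unit = {
    val c = new Counter
    c.value = 3       // the compiler rewrites this to c.value_=(3)
    println(c.value)  // 3
    c.value = -1      // rejected by the guard in the setter
    println(c.value)  // still 3
  }
}
```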
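Constructors
Auxiliary constructors:
class Counter {
  private var value = 0 // the counter's starting value
  private var name = "" // the counter's name
  private var mode = 1 // the counter type (for example, 1 for a step counter, 2 for a time counter)
  def this(name: String) = { // first auxiliary constructor
    this() // call the primary constructor
    this.name = name
  }
  def this(name: String, mode: Int) = { // second auxiliary constructor
    this(name) // call the previous auxiliary constructor
    this.mode = mode
  }
  def increment(step: Int): Unit = { value += step }
  def current(): Int = { value }
  def info(): Unit = { printf("Name:%s and mode is %d\n", name, mode) }
}
object MyCounter {
  def main(args: Array[String]): Unit = {
    val myCounter1 = new Counter // primary constructor
    val myCounter2 = new Counter("Runner") // first auxiliary constructor; the counter is named Runner and counts running steps
    val myCounter3 = new Counter("Timer", 2) // second auxiliary constructor; the counter is named Timer and counts seconds
    myCounter1.info() // show the counter's information
    myCounter1.increment(1) // increase the counter by a step of 1
    printf("Current Value is: %d\n", myCounter1.current()) // show the counter's current value
    myCounter2.info() // show the counter's information
    myCounter2.increment(2) // increase the counter by a step of 2
    printf("Current Value is: %d\n", myCounter2.current()) // show the counter's current value
    myCounter3.info() // show the counter's information
    myCounter3.increment(3) // increase the counter by a step of 3
    printf("Current Value is: %d\n", myCounter3.current()) // show the counter's current value
  }
}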
Primary constructor:
class Counter(val name: String, val mode: Int) {
  private var value = 0 // the counter's starting value
  def increment(step: Int): Unit = { value += step }
  def current(): Int = { value }
  def info(): Unit = { printf("Name:%s and mode is %d\n", name, mode) }
}
object MyCounter {
  def main(args: Array[String]): Unit = {
    val myCounter = new Counter("Timer", 2)
    myCounter.info() // show the counter's information
    myCounter.increment(1) // increase the counter by a step of 1
    printf("Current Value is: %d\n", myCounter.current()) // show the counter's current value
  }
}
II. Objects
class Person(val name: String) {
  private val id = Person.newPersonId() // calls a method of the companion object; a class and its companion object can access each other's private members
  def info(): Unit = { printf("The id of %s is %d.\n", name, id) }
}
object Person {
  private var lastId = 0 // the most recently assigned person id
  private def newPersonId() = {
    lastId += 1
    lastId
  }
  // main is defined in the Person singleton object (a program compiled with scalac needs a main method as its entry point)
  def main(args: Array[String]): Unit = {
    val person1 = new Person("Ziyu") // Person here refers to class Person
    val person2 = new Person("Minxing")
    person1.info()
    person2.info()
  }
}
class TestApplyClassAndObject {
}
class ApplyTest {
  def apply() = println("apply method in class is called!")
  def greetingOfClass: Unit = {
    println("Greeting method in class is called.")
  }
}
object ApplyTest {
  def apply() = {
    println("apply method in object is called")
    new ApplyTest()
  }
}
object TestApplyClassAndObject {
  def main(args: Array[String]): Unit = {
    val a = ApplyTest() // calls the apply method of the companion object
    a.greetingOfClass
    a() // calls the apply method of the companion class
  }
}
class Car(name: String) {
  def info(): Unit = { println("Car name is " + name) }
}
object Car {
  def apply(name: String) = new Car(name) // apply calls the constructor of the companion class Car to create a Car instance
}
object MyTest {
  def main(args: Array[String]): Unit = {
    val mycar = Car("BMW") // calls the apply method of the companion object, which creates a Car instance
    mycar.info()
  }
}
III. Inheritance
abstract class Car {
  val carBrand: String // abstract field
  def info(): Unit // abstract method
  def greeting(): Unit = { println("Welcome to my car!") } // concrete method that is already implemented
}
// Overriding a non-abstract (concrete) method requires the override modifier.
class BMWCar extends Car {
  override val carBrand = "BMW" // implements the abstract field of the parent class
  def info(): Unit = { printf("This is a %s car. It is expensive.\n", carBrand) } // implements the abstract method of the parent class
  override def greeting(): Unit = { println("Welcome to my BMW car!") } // re-implements a concrete method of the parent class; override is mandatory
}
class BYDCar extends Car {
  override val carBrand = "BYD"
  def info(): Unit = { printf("This is a %s car. It is cheap.\n", carBrand) }
  override def greeting(): Unit = { println("Welcome to my BYD car!") }
}
object MyCar {
  def main(args: Array[String]): Unit = {
    val myCar1 = new BMWCar()
    val myCar2 = new BYDCar()
    myCar1.greeting()
    myCar1.info()
    myCar2.greeting()
    myCar2.info()
  }
}
IV. Traits
trait CarId {
  var id: Int
  def currentId(): Int // an abstract method
}
class BYDCarId extends CarId { // uses the extends keyword
  override var id = 10000 // BYD car ids start from 10000
  def currentId(): Int = { id += 1; id } // increments and returns the car id
}
class BMWCarId extends CarId { // uses the extends keyword
  override var id = 20000 // BMW car ids start from 20000
  def currentId(): Int = { id += 1; id } // increments and returns the car id
}
object MyCar {
  def main(args: Array[String]): Unit = {
    val myCarId1 = new BYDCarId()
    val myCarId2 = new BMWCarId()
    printf("My first CarId is %d.\n", myCarId1.currentId())
    printf("My second CarId is %d.\n", myCarId2.currentId())
  }
}
trait CarId {
  var id: Int
  def currentId(): Int // an abstract method
}
trait CarGreeting {
  def greeting(msg: String): Unit = { println(msg) }
}
class BYDCarId extends CarId with CarGreeting { // extends mixes in the first trait; with can then be used repeatedly to mix in more traits
  override var id = 10000 // BYD car ids start from 10000
  def currentId(): Int = { id += 1; id } // increments and returns the car id
}
class BMWCarId extends CarId with CarGreeting { // extends mixes in the first trait; with can then be used repeatedly to mix in more traits
  override var id = 20000 // BMW car ids start from 20000
  def currentId(): Int = { id += 1; id } // increments and returns the car id
}
object MyCar {
  def main(args: Array[String]): Unit = {
    val myCarId1 = new BYDCarId()
    val myCarId2 = new BMWCarId()
    myCarId1.greeting("Welcome my first car.")
    printf("My first CarId is %d.\n", myCarId1.currentId())
    myCarId2.greeting("Welcome my second car.")
    printf("My second CarId is %d.\n", myCarId2.currentId())
  }
}
V. Pattern matching
Scala's match expression plays the role of Java's switch-case, and is far more powerful.
for (elem <- List(9, 12.3, "Spark", "Hadoop", Symbol("Hello"))) {
  val str = elem match {
    case i: Int => s"$i is an int value."
    case d: Double => s"$d is a double value."
    case "Spark" => "Spark is found."
    case s: String => s"$s is a string value."
    case _ => "This is an unexpected value."
  }
  println(str)
}
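VI. Functional programming
These are the function forms you will meet most often:
- Functions with no return value (Unit), the most common kind:
def fun1(name: String): Unit = { println(name) }
and def fun1(name: String) = println(name)
and def fun1(name: String) { println(name) }
The form def fun1(name: String) = println(name) is the recommended one; the last form (procedure syntax) was dropped in Scala 3.
- Functions with a return value, the most common kind:
def counter(value: Int): Int = { value + 1 }
- Anonymous functions:
(num: Int) => num * 2
- Function values (defining a function the same way a variable is defined):
- (no return value)
val fun2 = (content: String) => println(content)
- (with a return value)
val myNumFunc: Int => Int = (num: Int) => num * 2
and val counter: Int => Int = { (value) => value + 1 }
- (with a return value, shorthand)
val myNumFunc = (num: Int) => num * 2
Using the underscore "_"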
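The notes leave this heading without an example, so here is a small sketch of two common uses of the underscore: as a placeholder for the parameters of an anonymous function, and as a wildcard in pattern matching. The object name UnderscoreDemo and the helper describe are hypothetical names chosen for this illustration.

```scala
object UnderscoreDemo {
  val nums: List[Int] = List(1, 2, 3, 4, 5)

  // _ stands for the single parameter: same as nums.map(x => x * 2)
  val doubled: List[Int] = nums.map(_ * 2)

  // two underscores stand for two parameters in order: (a, b) => a + b
  val total: Int = nums.reduce(_ + _)

  // _ as a wildcard pattern that matches anything
  def describe(x: Any): String = x match {
    case 0 => "zero"
    case _ => "something else"
  }

  def main(args: Array[String]): Unit = {
    println(doubled)
    println(total)
    println(describe(0))
    println(describe("Spark"))
  }
}
```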
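Higher-order functions
A function that takes another function as a parameter is a higher-order function. Many Spark operations are higher-order functions.
For comparison, first look at how the same tasks are written without them.
Example:
1. Plain methods:
// plain method: sum of all integers in the range from a to b
def sumInts(a: Int, b: Int): Int = {
  if (a > b) 0 else a + sumInts(a + 1, b)
}
// plain method: sum of the squares of consecutive integers
def square(x: Int): Int = x * x
def sumSquares(a: Int, b: Int): Int = {
  if (a > b) 0 else square(a) + sumSquares(a + 1, b)
}
// plain method: sum of the powers of 2 over consecutive integers
def powerOfTwo(x: Int): Int = {
  if (x == 0) 1 else 2 * powerOfTwo(x - 1)
} // for example, powerOfTwo(4) is 2^4 = 16
def sumPowersOfTwo(a: Int, b: Int): Int = {
  if (a > b) 0 else powerOfTwo(a) + sumPowersOfTwo(a + 1, b)
}
// for example, sumPowersOfTwo(2,4) = 2^2 + sumPowersOfTwo(3,4) = 2^2 + 2^3 + sumPowersOfTwo(4,4) = 2^2 + 2^3 + 2^4 + 0 = 28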
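The three plain methods above share the same recursive shape and differ only in the function applied to each integer. One possible way to factor that out with a higher-order function is sketched below; the names HigherOrderDemo and sum are assumptions for this illustration, not part of the original notes.

```scala
object HigherOrderDemo {
  // sum applies f to every integer in [a, b] and adds up the results
  def sum(f: Int => Int, a: Int, b: Int): Int =
    if (a > b) 0 else f(a) + sum(f, a + 1, b)

  def powerOfTwo(x: Int): Int = if (x == 0) 1 else 2 * powerOfTwo(x - 1)

  // each specific sum is now a one-line call that passes a different function
  def sumInts(a: Int, b: Int): Int = sum(x => x, a, b)
  def sumSquares(a: Int, b: Int): Int = sum(x => x * x, a, b)
  def sumPowersOfTwo(a: Int, b: Int): Int = sum(powerOfTwo, a, b)

  def main(args: Array[String]): Unit = {
    println(sumInts(1, 5))
    println(sumSquares(1, 3))
    println(sumPowersOfTwo(2, 4))
  }
}
```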
6.2 Traversal
Data structures such as List and Map often need to be traversed.
Traversing a list
A for loop can be used:
val list = List(1, 2, 3, 4, 5)
for (elem <- list) println(elem)
Alternatively, foreach can be used:
val list = List(1, 2, 3, 4, 5)
list.foreach(elem => println(elem)) // elem => println(elem) is an anonymous function
// foreach simply traverses the collection: list.foreach(f) visits each element of list in turn and passes it to the function f
Traversing a map
First create an immutable map:
val university = Map("XMU" -> "Xiamen University", "THU" -> "Tsinghua University","PKU"->"Peking University")
// iterate over all key-value pairs:
for ((k,v) <- university) printf("Code is : %s and name is: %s\n",k,v)
// iterate over the keys only:
for (k<-university.keys) println(k)
// iterate over the values only:
for (v<-university.values) println(v)
foreach can also be used to traverse a map, as follows (shown in the Scala interpreter):
scala> university.foreach({case (k,v) => println(k+":"+v)})
scala> university.foreach {kv => println(kv._1+":"+kv._2)}
6.3 map and flatMap
scala> val books = List("Hadoop", "Hive", "HDFS")
books: List[String] = List(Hadoop, Hive, HDFS)
scala> books.map(s => s.toUpperCase) // s => s.toUpperCase is an anonymous function: it applies s.toUpperCase (upper-casing) to each input s
res0: List[String] = List(HADOOP, HIVE, HDFS)
scala> books.map(s => s.length) // s => s.length is an anonymous function that returns the length of the string
res57: List[Int] = List(6, 4, 4)
scala> val books = List("Hadoop","Hive","HDFS")
books: List[String] = List(Hadoop, Hive, HDFS)
scala> books.flatMap(s => s.toList) // s.toList turns s into a list of characters, e.g. Hadoop becomes List(H, a, d, o, o, p); flatMap then concatenates these lists
res0: List[Char] = List(H, a, d, o, o, p, H, i, v, e, H, D, F, S)
6.4 filter
val university = Map("XMU" -> "Xiamen University", "THU" -> "Tsinghua University","PKU"->"Peking University","XMUT"->"Xiamen University of Technology")
// use filter to keep only the entries whose university name contains "Xiamen":
val universityOfXiamen = university.filter({kv => kv._2 contains "Xiamen"})
// then traverse universityOfXiamen and print every entry:
universityOfXiamen.foreach({kv => println(kv._1+":"+kv._2)})
val university = Map("XMU" -> "Xiamen University", "THU" -> "Tsinghua University","PKU"->"Peking University","XMUT"->"Xiamen University of Technology")
val universityOfP = university filter {kv => kv._2 startsWith "P"}
universityOfP foreach {kv => println(kv._1+":"+kv._2)}
6.5 reduce
scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)
scala> list.reduceLeft(_ - _)
res25: Int = -13
scala> list.reduceRight(_ - _)
res26: Int = 3
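The two results differ because subtraction is not associative and the two methods group the elements in opposite directions. The sketch below spells out the grouping by hand so it can be compared with the library calls; the object name ReduceDemo is only an illustration.

```scala
object ReduceDemo {
  val list: List[Int] = List(1, 2, 3, 4, 5)

  // reduceLeft groups from the left: (((1 - 2) - 3) - 4) - 5
  val left: Int = list.reduceLeft(_ - _)
  val leftExpanded: Int = (((1 - 2) - 3) - 4) - 5

  // reduceRight groups from the right: 1 - (2 - (3 - (4 - 5)))
  val right: Int = list.reduceRight(_ - _)
  val rightExpanded: Int = 1 - (2 - (3 - (4 - 5)))

  def main(args: Array[String]): Unit = {
    println(s"reduceLeft: $left, by hand: $leftExpanded")
    println(s"reduceRight: $right, by hand: $rightExpanded")
  }
}
```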
6.6 fold
scala> val list = List(1,2,3,4,5)
list: List[Int] = List(1, 2, 3, 4, 5)
scala> list.fold(10)(_*_) // start from the initial value 10 and multiply it by each element of the list in turn
res0: Int = 1200
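To make the role of the initial value concrete, this sketch writes the same product out by hand and adds a foldLeft whose accumulator has a different type from the elements. The object name FoldDemo and the string-building example are assumptions added for illustration.

```scala
object FoldDemo {
  val list: List[Int] = List(1, 2, 3, 4, 5)

  // fold starts from the initial value and combines it with every element
  val product: Int = list.fold(10)(_ * _)
  val productExpanded: Int = 10 * 1 * 2 * 3 * 4 * 5

  // foldLeft allows an accumulator of another type, here a String
  val digits: String = list.foldLeft("")((acc, n) => acc + n)

  def main(args: Array[String]): Unit = {
    println(product)
    println(productExpanded)
    println(digits)
  }
}
```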
6.7 Functional programming example: WordCount
import java.io.File // for listing the files in a directory
import scala.io.Source // for Source.fromFile
import collection.mutable.Map // for the mutable map of counts
object WordCount {
  def main(args: Array[String]): Unit = {
    val dirfile = new File("/usr/local/scala/mycode/wordcount")
    val files = dirfile.listFiles // all files in this directory
    val results = Map.empty[String, Int] // create an empty map
    for (file <- files) {
      val data = Source.fromFile(file) // open the file currently being visited
      val strs = data.getLines().flatMap { s => s.split(" ") } // getLines() reads data line by line; flatMap { s => s.split(" ") } splits every line into words and merges them into one collection
      // (for example, if line 1 is "I love Hadoop" and line 2 is "I love Spark", the result contains "I", "love", "Hadoop", "I", "love", "Spark")
      strs foreach { word =>
        if (results.contains(word))
          results(word) += 1 else results(word) = 1
      }
    }
    results foreach { case (k, v) => println(s"$k:$v") }
  }
}
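The version above updates a mutable map inside a loop. As a hedged alternative, the same counting can be sketched with immutable collections and the flatMap and groupBy operations, here on in-memory lines instead of files so that it runs without any directory setup. The names WordCountSketch and countWords are hypothetical.

```scala
object WordCountSketch {
  // Counts how often each word occurs in the given lines.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // split every line into words and flatten
      .filter(_.nonEmpty)      // drop empty strings caused by repeated spaces
      .groupBy(identity)       // word -> all occurrences of that word
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    val counts = countWords(Seq("I love Hadoop", "I love Spark"))
    counts.foreach { case (k, v) => println(s"$k:$v") }
  }
}
```

To apply this to the files of the original example, the lines of each file could be collected first (for instance with Source.fromFile(file).getLines().toList) and passed to countWords.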