Programming Clojure Notes, Part 3: Working with Sequences

cwt8805

Article link: https://blog.csdn.net/code_for_fun/article/details/51314263

Everything Is a Sequence

A seq is a logical list.
Collections that can be viewed as seqs are called seq-able.

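; Return the first element of the seq; returns nil if the seq is empty or nil
(first aseq)
; Return everything after the first element; returns an empty seq if the seq is empty or nil
(rest aseq)
; Build a new seq by adding an element to the front
(cons elem aseq)

; Turn a seq-able collection into a seq; returns nil if coll is empty or nil
(seq coll)
; Return a seq of everything after the first element, or nil if nothing is left.
; Equivalent to (seq (rest aseq)).
(next aseq)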
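
The difference between rest and next, and the role of seq as an emptiness test, is easiest to see on a very short collection. The following is a minimal sketch; the names one-elem, rest-result, next-result, and describe are made up for this illustration.

```clojure
;; Hypothetical one-element vector, used only for this illustration.
(def one-elem [:only])

(def rest-result (rest one-elem)) ; an empty seq, printed as ()
(def next-result (next one-elem)) ; nil, because nothing is left

;; seq returns nil for an empty (or nil) collection, so it can serve
;; as an "is there anything here?" test in a conditional.
(defn describe [coll]
  (if (seq coll) :non-empty :empty))
```

Because nil is logically false, (if (seq coll) ...) reads naturally and handles both nil and empty collections the same way.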

; Note the quote in front of the list literal
(first '(1 2 3))
-> 1
(rest '(1 2 3))
-> (2 3)
(cons 0 '(1 2 3))
-> (0 1 2 3)

; Passing a vector to rest or cons yields a seq, not a vector. At the REPL, a seq prints just like a list.
(first [1 2 3])
-> 1
(rest [1 2 3])
-> (2 3)
(cons 0 [1 2 3])
-> (0 1 2 3)

; Use class to see the actual type
(class (rest [1 2 3]))
-> clojure.lang.PersistentVector$ChunkedSeq

; A map can be treated as a seq of key/value entries (the entry order may differ in your Clojure version)
(first {:fname "Aaron" :lname "Bedra"})
-> [:lname "Bedra"]
(rest {:fname "Aaron" :lname "Bedra"})
-> ([:fname "Aaron"])
(cons [:mname "James"] {:fname "Aaron" :lname "Bedra"})
-> ([:mname "James"] [:lname "Bedra"] [:fname "Aaron"])

; The same goes for sets
(first #{:the :quick :brown :fox})
-> :brown
(rest #{:the :quick :brown :fox})
-> (:quick :fox :the)
(cons :jumped #{:the :quick :brown :fox})
-> (:jumped :brown :quick :fox :the)

; Do not rely on the element order of maps and sets. When you need a reliable order, use sorted-set and sorted-map.
(sorted-set :the :quick :brown :fox)
-> #{:brown :fox :quick :the}
(sorted-map :c 3 :b 2 :a 1)
-> {:a 1, :b 2, :c 3}

; conj and into add elements at the natural insertion point: the front for lists, the back for vectors
(conj '(1 2 3) :a)
-> (:a 1 2 3)
(into '(1 2 3) '(:a :b :c))
-> (:c :b :a 1 2 3)
(conj [1 2 3] :a)
-> [1 2 3 :a]
(into [1 2 3] [:a :b :c])
-> [1 2 3 :a :b :c]
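
Since into is built on conj, the target collection decides how the incoming elements are stored. A small sketch with other target types (the var names are illustrative only):

```clojure
;; into pours the elements of the second collection into the first
;; by repeated conj, so the target type decides where elements land.
(def as-map    (into {} [[:a 1] [:b 2]]))      ; pairs become map entries
(def as-set    (into #{} [1 2 2 3]))           ; duplicates collapse
(def as-sorted (into (sorted-set) [3 1 2]))    ; kept in sorted order
```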

Using the Sequence Library

Creating Sequences

; (range start? end step?)
(range 10)
-> (0 1 2 3 4 5 6 7 8 9)
(range 10 20)
-> (10 11 12 13 14 15 16 17 18 19)
(range 1 25 2)
-> (1 3 5 7 9 11 13 15 17 19 21 23)

; (repeat n x)
(repeat 5 1)
-> (1 1 1 1 1)
(repeat 10 "x")
-> ("x" "x" "x" "x" "x" "x" "x" "x" "x" "x")

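; (iterate f x) produces an infinite lazy sequence that starts at x; each following element is f applied to the previous one.
; (take n sequence) returns the first n elements, which makes an infinite sequence safe to inspect.
(take 10 (iterate inc 1))
-> (1 2 3 4 5 6 7 8 9 10)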

; (cycle coll) repeats a finite collection to produce an infinite sequence
(take 10 (cycle (range 3)))
-> (0 1 2 0 1 2 0 1 2 0)
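
iterate and cycle work with any function or collection, not only inc and range. A brief sketch, with powers-of-two, first-five, and alternating as illustrative names:

```clojure
;; An infinite lazy sequence 1, 2, 4, 8, ... built with iterate.
(defn powers-of-two [] (iterate #(* 2 %) 1))

;; take keeps only as many elements as requested.
(def first-five (take 5 (powers-of-two)))

;; cycle over a two-element vector alternates forever.
(def alternating (take 4 (cycle [:on :off])))
```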

; (interleave & colls) interleaves several collections, stopping as soon as one of them runs out
(defn whole-numbers [] (iterate inc 1))
-> #'user/whole-numbers
(interleave (whole-numbers) ["A" "B" "C" "D" "E"])
-> (1 "A" 2 "B" 3 "C" 4 "D" 5 "E")

; (interpose separator coll) places separator between the elements of coll
(interpose "," ["apples" "bananas" "grapes"])
-> ("apples" "," "bananas" "," "grapes")
; Combine interpose with (apply str ...) to produce a string
(apply str (interpose \, ["apples" "bananas" "grapes"]))
-> "apples,bananas,grapes"

; (apply str (interpose ...)) is such a common idiom that Clojure wraps it as clojure.string/join
(use '[clojure.string :only (join)])
(join \, ["apples" "bananas" "grapes"])
-> "apples,bananas,grapes"

; Everything above produces a seq. Lists, vectors, sets, and maps each have their own constructor function.
(list & elements)
(vector & elements)
(hash-set & elements)
(hash-map key-1 val-1 ...)

; set and vec are the counterparts of hash-set and vector; the difference is that they take a single collection argument instead of individual elements
(set [1 2 3])
-> #{1 2 3}
(hash-set 1 2 3)
-> #{1 2 3}
(vec (range 3))
-> [0 1 2]

Filtering Sequences

; Take the first 10 even numbers; even? is the predicate
(take 10 (filter even? (whole-numbers)))
-> (2 4 6 8 10 12 14 16 18 20)
; Take the first 10 odd numbers; odd? is the predicate
(take 10 (filter odd? (whole-numbers)))
-> (1 3 5 7 9 11 13 15 17 19)

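; take-while is the predicate-driven version of take
; A set can act as a function that tests whether its argument is a member
; complement takes a function and returns a function with the opposite truth value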
(take-while (complement #{\a\e\i\o\u}) "the-quick-brown-fox")
-> (\t \h)

; drop-while is the opposite of take-while:
; it drops elements from the start of the sequence while the predicate holds, then returns the rest
(drop-while (complement #{\a\e\i\o\u}) "the-quick-brown-fox")
-> (\e \- \q \u \i \c \k \- \b \r \o \w \n \- \f \o \x)

; split-at takes an index, while split-with takes a predicate
(split-at 5 (range 10))
-> [(0 1 2 3 4) (5 6 7 8 9)]
(split-with #(<= % 10) (range 0 20 2))
-> [(0 2 4 6 8 10) (12 14 16 18)]
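
split-with can be read as take-while and drop-while applied with the same predicate. The sketch below spells that out; my-split-with is a hypothetical name for this illustration, not a library function.

```clojure
;; A hand-written equivalent of split-with, for illustration only:
;; the first part is what take-while keeps, the second is what
;; drop-while leaves behind.
(defn my-split-with [pred coll]
  [(take-while pred coll) (drop-while pred coll)])

;; The split happens at the FIRST element that fails the predicate;
;; later elements that would pass (6 here) stay in the second part.
(def split-example (my-split-with even? [2 4 5 6]))
```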

Sequence Predicates

; every? checks whether the predicate holds for every element
(every? odd? [1 3 5])
-> true
(every? odd? [1 3 5 8])
-> false

; some returns the first logically true result of applying the predicate, or nil if there is none
(some even? [1 2 3])
-> true
(some even? [1 3 5])
-> nil
(some identity [nil false 1 nil 2])
-> 1

(not-every? even? (whole-numbers))
-> true
(not-any? even? (whole-numbers))
-> false
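
Because some returns the predicate's result rather than a plain boolean, it pairs well with a set used as a predicate, or with a function that returns the element itself. A short sketch (found, missing, and first-even are illustrative names):

```clojure
;; A set used as a predicate returns the matching element.
(def found   (some #{:c} [:a :b :c]))
(def missing (some #{:z} [:a :b :c]))

;; Returning the element from the predicate turns some into
;; "find the first element that satisfies a test".
(defn first-even [coll]
  (some #(when (even? %) %) coll))
```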

Transforming Sequences

; map with a single collection argument
(map #(format "<p>%s</p>" %) ["the" "quick" "brown" "fox"])
-> ("<p>the</p>" "<p>quick</p>" "<p>brown</p>" "<p>fox</p>")

; map with several collection arguments
(map #(format "<%s>%s</%s>" %1 %2 %1)
     ["h1" "h2" "h3" "h1"] ["the" "quick" "brown" "fox"])
-> ("<h1>the</h1>" "<h2>quick</h2>" "<h3>brown</h3>"
    "<h1>fox</h1>")

; reduce combines the elements step by step with a two-argument function
(reduce + (range 1 11))
-> 55
(reduce * (range 1 11))
-> 3628800
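
reduce also accepts an explicit initial value, which lets the accumulator have a different type from the elements. The sketch below counts words into a map; word-counts is a hypothetical helper written for this note (clojure.core/frequencies offers the same result ready-made).

```clojure
;; Start from an empty map and bump the count for each word.
(defn word-counts [words]
  (reduce (fn [counts w]
            (assoc counts w (inc (get counts w 0))))
          {}
          words))
```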

; Sort in natural order
(sort [42 1 7 11])
-> (1 7 11 42)
; Sort by the natural order of a function's result
(sort-by #(.toString %) [42 1 7 11])
-> (1 11 42 7)

; If natural order is not what you want, both functions accept a comparison function
(sort > [42 1 7 11])
-> (42 11 7 1)
(sort-by :grade > [{:grade 83} {:grade 90} {:grade 77}])
-> ({:grade 90} {:grade 83} {:grade 77})

; Most sequence filtering and transformation can also be written as a list comprehension.
; Clojure's for macro performs comprehension; its general form is:
(for [binding-form coll-expr filter-expr? ...] expr)
; A list comprehension in place of map
(for [word ["the" "quick" "brown" "fox"]]
  (format "<p>%s</p>" word))
-> ("<p>the</p>" "<p>quick</p>" "<p>brown</p>" "<p>fox</p>")

; A list comprehension with a :when clause, in place of filter
(take 10 (for [n (whole-numbers) :when (even? n)] n))
-> (2 4 6 8 10 12 14 16 18 20)

; A comprehension with several binding expressions
(for [file "ABCDEFGH" rank (range 1 9)] (format "%c%d" file rank))
-> ("A1" "A2" ... elided ... "H7" "H8")
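
With several bindings, the rightmost binding varies fastest, and :when or :let clauses can refer to any binding to their left. A minimal sketch with made-up names:

```clojure
;; All pairs [x y] from 0..2 with x strictly less than y.
;; y varies fastest, like the inner loop of a nested loop.
(def ordered-pairs
  (for [x (range 3)
        y (range 3)
        :when (< x y)]
    [x y]))

;; :let introduces a local that later clauses and the body can use.
(def odd-squares
  (for [x (range 4)
        :let [sq (* x x)]
        :when (odd? sq)]
    sq))
```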

Lazy and Infinite Sequences

Most sequences are lazy: elements are computed only when they are needed.

; Define a collection with a list comprehension; nothing is printed yet
(def x (for [i (range 1 3)] (do (println i) i)))
-> #'user/x

; Force realization with doall
(doall x)
| 1
| 2
-> (1 2)

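; Or use dorun, which walks the sequence without retaining the whole thing in memory.
; x is redefined first because the doall above already realized and cached its elements.
(def x (for [i (range 1 3)] (do (println i) i)))
(dorun x)
| 1
| 2
-> nil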
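
Laziness can also be observed without printing, by counting how often a function is actually called. This is a sketch under one stated assumption: the source sequence comes from iterate, which is not chunked. Chunked sources such as range or a vector are realized in blocks of 32 elements, so the counts would be larger there. All names below are illustrative.

```clojure
;; Counter that records how many times noisy-inc has run.
(def calls (atom 0))

(defn noisy-inc [n]
  (swap! calls inc)
  (inc n))

;; Building the lazy sequence calls nothing yet.
(def lazy-result (map noisy-inc (iterate inc 0)))
(def calls-before @calls)

;; Realizing three elements calls noisy-inc exactly three times.
(def first-three (doall (take 3 lazy-result)))
(def calls-after @calls)
```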

Using Seq-able Java Types

Java Collections

; String.getBytes returns a byte array
(first (.getBytes "hello"))
-> 104
(rest (.getBytes "hello"))
-> (101 108 108 111)
(cons (int \h) (.getBytes "ello"))
-> (104 101 108 108 111)

; Hashtables and Maps
; System.getProperties returns a Hashtable
(first (System/getProperties))
-> #<Entry java.runtime.name=Java(TM) SE Runtime Environment>
(rest (System/getProperties))
-> (#<Entry sun.boot.library.path=/System/Library/... etc. ...

Regular Expressions

; Discouraged (exposes the mutable nature of Matcher)
(let [m (re-matcher #"\w+" "the quick brown fox")]
  (loop [match (re-find m)]
    (when match
      (println match)
      (recur (re-find m)))))

; Preferred
(re-seq #"\w+" "the quick brown fox")
-> ("the" "quick" "brown" "fox")
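
Since re-seq returns an ordinary seq, its matches flow straight into map and the other sequence functions. A small sketch (extract-ints and pairs are hypothetical names for this note):

```clojure
;; Pull every run of digits out of a string and parse it.
(defn extract-ints [s]
  (map #(Long/parseLong %) (re-seq #"\d+" s)))

;; When the pattern contains groups, each match is a vector of
;; [whole-match group-1 group-2 ...].
(def pairs (re-seq #"(\w+)=(\d+)" "a=1 b=2"))
```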

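File System

(import '(java.io File))
; Verbose form
(map #(.getName %) (seq (.listFiles (File. "."))))
-> ("concurrency" "sequences" ...)
; map and similar functions call seq on their argument automatically, so this can be shortened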
(map #(.getName %) (.listFiles (File. ".")))
-> ("concurrency" "sequences" ...)

; Use file-seq for a depth-first walk of the file system
(defn minutes-to-millis [mins] (* mins 1000 60))

(defn recently-modified? [file]
  (> (.lastModified file)
     (- (System/currentTimeMillis) (minutes-to-millis 30))))

(filter recently-modified? (file-seq (File. ".")))
-> (./sequences ./sequences/sequences.clj)

Streams

(use '[clojure.java.io :only (reader)])
; The reader is never closed here
(take 2 (line-seq (reader "src/examples/utils.clj")))
-> ("(ns examples.utils" " (:import [java.io BufferedReader InputStreamReader]))")

; Use with-open to close the bound reader
(with-open [rdr (reader "src/examples/utils.clj")]
  (count (line-seq rdr)))
-> 64

; Filter out blank lines
(with-open [rdr (reader "src/examples/utils.clj")]
  (count (filter #(re-find #"\S" %) (line-seq rdr))))
-> 55
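
The same pattern can be tried without a file on disk by reading from a string. This is a sketch, assuming an in-memory StringReader stands in for the file; count-non-blank is a hypothetical helper.

```clojure
(import '(java.io BufferedReader StringReader))

;; Count the non-blank lines of a string. count realizes the lazy
;; line-seq while the reader is still open, before with-open closes it.
(defn count-non-blank [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    (count (filter #(re-find #"\S" %) (line-seq rdr)))))
```

The design point is that the lazy sequence must be consumed inside with-open. Returning the unrealized line-seq itself would leave the caller reading from a reader that has already been closed.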

; Count the non-blank lines of all Clojure files in a directory tree
(use '[clojure.java.io :only (reader)])
(defn non-blank? [line] (if (re-find #"\S" line) true false))

(defn non-svn? [file] (not (.contains (.toString file) ".svn")))

(defn clojure-source? [file] (.endsWith (.toString file) ".clj"))

(defn clojure-loc [base-file]
  (reduce
    +
    (for [file (file-seq base-file)
          :when (and (clojure-source? file) (non-svn? file))]
      (with-open [rdr (reader file)]
        (count (filter non-blank? (line-seq rdr)))))))

XML

; XML data
<compositions>
  <composition composer="J. S. Bach">
    <name>The Art of the Fugue</name>
  </composition>
  <composition composer="F. Chopin">
    <name>Fantaisie-Impromptu Op. 66</name>
  </composition>
  <composition composer="W. A. Mozart">
    <name>Requiem</name>
  </composition>
</compositions>

; Parse into a tree structure
(use '[clojure.xml :only (parse)])
(parse (java.io.File. "data/sequences/compositions.xml"))
-> {:tag :compositions,
    :attrs nil,
    :content [{:tag :composition, ... etc. ...

; Extract the composers
(for [x (xml-seq
          (parse (java.io.File. "data/sequences/compositions.xml")))
      :when (= :composition (:tag x))]
  (:composer (:attrs x)))
-> ("J. S. Bach" "F. Chopin" "W. A. Mozart")
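
To experiment without the data file, the XML can be parsed from a string by wrapping its bytes in an InputStream, which clojure.xml/parse accepts. This is a sketch; sample-xml, parse-xml-string, and composers-in are names invented for this note, and the sample document is a shortened version of the data above.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Shortened sample document, kept inline for illustration.
(def sample-xml
  (str "<compositions>"
       "<composition composer=\"J. S. Bach\"><name>The Art of the Fugue</name></composition>"
       "<composition composer=\"W. A. Mozart\"><name>Requiem</name></composition>"
       "</compositions>"))

;; Wrap the string's bytes in a stream so xml/parse can read them.
(defn parse-xml-string [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Walk every node with xml-seq and keep the composer attributes.
(defn composers-in [s]
  (for [x (xml-seq (parse-xml-string s))
        :when (= :composition (:tag x))]
    (:composer (:attrs x))))
```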

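Calling Structure-Specific Functions

The sequence functions above are general-purpose. Sometimes you need functions specific to lists, vectors, maps, structs, and sets.

Functions on Lists

; peek is the same as first
(peek '(1 2 3))
-> 1
; pop is not quite the same as rest: pop throws an exception on an empty list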
(pop '(1 2 3))
-> (2 3)

Functions on Vectors

; Vectors also support peek and pop, but they operate on the last element
(peek [1 2 3])
-> 3
(pop [1 2 3])
-> [1 2]
; get returns the element at an index, or nil if the index is out of range
(get [:a :b :c] 1)
-> :b
(get [:a :b :c] 5)
-> nil

; A vector is itself a function of its indices, but it throws on an out-of-range index
([:a :b :c] 1)
-> :b
([:a :b :c] 5)
-> java.lang.ArrayIndexOutOfBoundsException: 5

; Replace an element
(assoc [0 1 2 3 4] 2 :two)
-> [0 1 :two 3 4]

; Create a subvector
(subvec [1 2 3 4 5] 3)
-> [4 5]
(subvec [1 2 3 4 5] 1 3)
-> [2 3]
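; The same elements can be obtained with take and drop, though the result is a seq
(take 2 (drop 1 [1 2 3 4 5]))
-> (2 3)
; When a data structure offers its own specific function, it is generally there for efficiency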

Functions on Maps

; keys and vals
(keys {:sundance "spaniel", :darwin "beagle"})
-> (:sundance :darwin)
(vals {:sundance "spaniel", :darwin "beagle"})
-> ("spaniel" "beagle")

; get looks up the value for a key, returning nil if the key is absent
(get {:sundance "spaniel", :darwin "beagle"} :darwin)
-> "beagle"
(get {:sundance "spaniel", :darwin "beagle"} :snoopy)
-> nil

; A map is itself a function of its keys
({:sundance "spaniel", :darwin "beagle"} :darwin)
-> "beagle"
({:sundance "spaniel", :darwin "beagle"} :snoopy)
-> nil

; Keywords are functions too
(:darwin {:sundance "spaniel", :darwin "beagle"} )
-> "beagle"
(:snoopy {:sundance "spaniel", :darwin "beagle"} )
-> nil

; contains? tests whether a key is present, even when its value is nil
(def score {:stu nil :joey 100})
(contains? score :stu)
-> true
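
Two details of key lookup are easy to trip over: contains? checks keys (indices, for a vector) rather than values, and get accepts a default that is used only when the key is truly absent. A minimal sketch with illustrative names:

```clojure
;; Hypothetical score table: :stu is present but has a nil value.
(def scores {:stu nil :joey 100})

;; For vectors the "keys" are indices, not elements.
(def idx-check (contains? [:a :b :c] 0))   ; index 0 exists
(def val-check (contains? [:a :b :c] :a))  ; :a is a value, not an index

;; The third argument to get is returned only for a missing key.
(def with-default (get scores :snoopy :not-found))
(def nil-value    (get scores :stu :not-found))
```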

; Sample data
(def song {:name "Agnus Dei"
           :artist "Krzysztof Penderecki"
           :album "Polish Requiem"
           :genre "Classical"})
; Add a key/value pair
(assoc song :kind "MPEG Audio File")
-> {:name "Agnus Dei", :album "Polish Requiem",
:kind "MPEG Audio File", :genre "Classical",
:artist "Krzysztof Penderecki"}

; Remove a key/value pair
(dissoc song :genre)
-> {:name "Agnus Dei", :album "Polish Requiem",
:artist "Krzysztof Penderecki"}

; Keep only the given keys
(select-keys song [:name :artist])
-> {:name "Agnus Dei", :artist "Krzysztof Penderecki"}

; Merge maps
(merge song {:size 8118166, :time 507245})
-> {:name "Agnus Dei", :album "Polish Requiem",
:genre "Classical", :size 8118166,
:artist "Krzysztof Penderecki", :time 507245}

; Merge, combining the values of duplicate keys with a function
(merge-with
  concat
  {:rubble ["Barney"], :flintstone ["Fred"]}
  {:rubble ["Betty"], :flintstone ["Wilma"]}
  {:rubble ["Bam-Bam"], :flintstone ["Pebbles"]})
-> {:rubble ("Barney" "Betty" "Bam-Bam"),
:flintstone ("Fred" "Wilma" "Pebbles")}
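
The combining function passed to merge-with can be any two-argument function. A short sketch with made-up data: + sums numeric values, and into keeps the merged values as vectors where concat would produce lazy seqs.

```clojure
;; Sum the values of keys that appear in both maps.
(def totals
  (merge-with + {:apples 1 :pears 2} {:apples 10 :plums 3}))

;; into appends to the existing vector, so the result stays a vector.
(def families
  (merge-with into {:rubble ["Barney"]} {:rubble ["Betty"]}))
```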

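Functions on Sets

; These functions live in the clojure.set namespace. In a REPL session that already
; referred clojure.string/join, use a fresh session to avoid a name clash on join.
(use 'clojure.set)
; Sample data
(def languages #{"java" "c" "d" "clojure"})
(def beverages #{"java" "chai" "pop"})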

(union languages beverages)
-> #{"java" "c" "d" "clojure" "chai" "pop"}

(difference languages beverages)
-> #{"c" "d" "clojure"}

(intersection languages beverages)
-> #{"java"}

(select #(= 1 (.length %)) languages)
-> #{"c" "d"}

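; Union and difference belong to set theory, and they are also part of relational algebra,
; the foundation of SQL. Relational algebra has six primitive operations: set union and
; set difference, plus rename, selection, projection, and cross product.

Relational Algebra	Database	Clojure Type System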
Relation	Table	Anything set-like
Tuple	Row	Anything map-like

; Model a small relational database
(def compositions
  #{{:name "The Art of the Fugue" :composer "J. S. Bach"}
    {:name "Musical Offering" :composer "J. S. Bach"}
    {:name "Requiem" :composer "Giuseppe Verdi"}
    {:name "Requiem" :composer "W. A. Mozart"}})
(def composers
  #{{:composer "J. S. Bach" :country "Germany"}
    {:composer "W. A. Mozart" :country "Austria"}
    {:composer "Giuseppe Verdi" :country "Italy"}})
(def nations
  #{{:nation "Germany" :language "German"}
    {:nation "Austria" :language "German"}
    {:nation "Italy" :language "Italian"}})

; rename changes key names
(rename compositions {:name :title})
-> #{{:title "Requiem", :composer "Giuseppe Verdi"}
     {:title "Musical Offering", :composer "J. S. Bach"}
     {:title "Requiem", :composer "W. A. Mozart"}
     {:title "The Art of the Fugue", :composer "J. S. Bach"}}

; select keeps the maps that satisfy a predicate
(select #(= (:name %) "Requiem") compositions)
-> #{{:name "Requiem", :composer "W. A. Mozart"}
{:name "Requiem", :composer "Giuseppe Verdi"}}

; project keeps only the given keys of each map
(project compositions [:name])
-> #{{:name "Musical Offering"}
{:name "Requiem"}
{:name "The Art of the Fugue"}}

; join combines two relations on their shared keys
(join compositions composers)
-> #{{:name "Requiem", :country "Austria",
:composer "W. A. Mozart"}
{:name "Musical Offering", :country "Germany",
:composer "J. S. Bach"}
{:name "Requiem", :country "Italy",
:composer "Giuseppe Verdi"}
{:name "The Art of the Fugue", :country "Germany",
:composer "J. S. Bach"}}

; If the two relations share no key names, specify the key mapping explicitly
(join composers nations {:country :nation})
-> #{{:language "German", :nation "Austria",
:composer "W. A. Mozart", :country "Austria"}
{:language "German", :nation "Germany",
:composer "J. S. Bach", :country "Germany"}
{:language "Italian", :nation "Italy",
:composer "Giuseppe Verdi", :country "Italy"}}

; Putting it together: in which countries were the composers of a "Requiem" born?
(project
  (join
    (select #(= (:name %) "Requiem") compositions)
    composers)
  [:country])
-> #{{:country "Italy"} {:country "Austria"}}
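
The same query can be written with a namespace alias instead of referring every clojure.set function, which sidesteps the clash between clojure.set/join and clojure.string/join. This is a sketch with a reduced data set; works, people, and countries-for are hypothetical names for this note.

```clojure
(require '[clojure.set :as set])

;; Reduced sample relations, kept small for illustration.
(def works
  #{{:name "Requiem" :composer "W. A. Mozart"}
    {:name "Musical Offering" :composer "J. S. Bach"}})

(def people
  #{{:composer "W. A. Mozart" :country "Austria"}
    {:composer "J. S. Bach" :country "Germany"}})

;; select by title, natural join on :composer, then project :country.
(defn countries-for [title]
  (set/project
    (set/join (set/select #(= (:name %) title) works) people)
    [:country]))
```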