上一章有介绍到Lisp的基本原子态的数据格式,本章由题可知,讲的是由原子构建成的分子形态。
集合这个概念在现代计算机语言中有着广泛的应用(有不用的么?),比较常见的有数组、元组、列表、哈希表、字典等等。 而在Lisp中最为常用也最为人所知的估计就是“列表”了,list,lisp,还的确蛮像的~不知是这个原因还是其他,相当一部分的Lisp的书籍中干脆只见list不见其他了。而事实上,作为"智能化"语言,自然是不可能只有这么一种集合形式,故而本章顶着"集合"之名向大家伙介绍了Lisp中其他的一些集合形态:向量和哈希表。
向量算是Lisp的一个基本的集合了,以整数作索引,又分为固定向量(大小设定既不变)和可变向量(增删改查,爱咋咋的)两种。
创建向量很简单,使用VECTOR函数:
(vector) ==> #() (vector 1) ==> #(1) (vector 1 2) ==> #(1 2)
不过通常更常用的构建向量函数是MAKE-ARRAY
(make-array 5 :initial-element nil) ==> #(NIL NIL NIL NIL NIL)
make-array常用的一个点在于,它构建的向量是可变的,通常的用法是:
(defparameter *x* (make-array 5 :fill-pointer 0)) (vector-push 'a *x*) ==> 0 *x* ==> #(A) (vector-push 'b *x*) ==> 1 *x* ==> #(A B) (vector-push 'c *x*) ==> 2 *x* ==> #(A B C) (vector-pop *x*) ==> C *x* ==> #(A B) (vector-pop *x*) ==> B *x* ==> #(A) (vector-pop *x*) ==> A *x* ==> #()
通常来说如果搭配上:adjustable参数就使得向量可变:
(make-array 5 :fill-pointer 0 :adjustable t) ==> #()
当需要增大向量规模时,用VECTOR-PUSH-EXTEND,用法和vector-push一样(话说我用的sbcl,即使不用上面的
神奇参数也是可以扩展的..)
在一些深入业务的操作中,有时也会对数据格式有要求,那么:element-type参数就有了用武之地
(make-array 5 :fill-pointer 0 :adjustable t :element-type 'character)
如此构建的向量应该算是特殊向量吧。
向量作为序列的一个子集,也包含了序列的很多特性(这些特性同样适用于其他类型的序列包括列表)
序列最基础的两个函数:LENGTH
, ELT:
(defparameter *x* (vector 1 2 3)) (length *x*) ==> 3 (elt *x* 0) ==> 1 (elt *x* 1) ==> 2 (elt *x* 2) ==> 3 (elt *x* 3) ==> error
elt通常还会和setf合用:
(setf (elt *x* 0) 10) *x* ==> #(10 2 3)
上述的基础函数可以组合出n多花样,不过为了方便,lisp已经提供了一些常用、靠谱的函数用以迭代、遍历序列:
Name | Required Arguments | Returns |
COUNT | Item and sequence | Number of times item appears in sequence |
FIND | Item and sequence | Item or NIL |
POSITION | Item and sequence | Index into sequence or NIL |
REMOVE | Item and sequence | Sequence with instances of item removed |
SUBSTITUTE | New item, item, and sequence | Sequence with instances of item replaced with new item |
使用方式:
(count 1 #(1 2 1 2 3 1 2 3 4)) ==> 3 (remove 1 #(1 2 1 2 3 1 2 3 4)) ==> #(2 2 3 2 3 4) (remove 1 '(1 2 1 2 3 1 2 3 4)) ==> (2 2 3 2 3 4) (remove #\a "foobarbaz") ==> "foobrbz" (substitute 10 1 #(1 2 1 2 3 1 2 3 4)) ==> #(10 2 10 2 3 10 2 3 4) (substitute 10 1 '(1 2 1 2 3 1 2 3 4)) ==> (10 2 10 2 3 10 2 3 4) (substitute #\x #\b "foobarbaz") ==> "fooxarxaz" (find 1 #(1 2 1 2 3 1 2 3 4)) ==> 1 (find 10 #(1 2 1 2 3 1 2 3 4)) ==> NIL (position 1 #(1 2 1 2 3 1 2 3 4)) ==> 0
当然,上面只是标准的使用方式,鉴于大伙的不走寻常路,这些函数都支持一些通用的参数:
(count "foo" #("foo" "bar" "baz") :test #'string=) ==> 1 (find 'c #((a 10) (b 20) (c 30) (d 40)) :key #'first) ==> (C 30) (find 'a #((a 10) (b 20) (a 30) (b 40)) :key #'first) ==> (A 10) (find 'a #((a 10) (b 20) (a 30) (b 40)) :key #'first :from-end t) ==> (A 30) (remove #\a "foobarbaz" :count 1) ==> "foobrbaz" (remove #\a "foobarbaz" :count 1 :from-end t) ==> "foobarbz"
像:from-end这种参数对count函数似乎意义不大,不过,对于倒序什么还是有用处的:
CL-USER> (defparameter *v* #((a 10) (b 20) (a 30) (b 40))) *V* CL-USER> (defun verbose-first (x) (format t "Looking at ~s~%" x) (first x)) VERBOSE-FIRST CL-USER> (count 'a *v* :key #'verbose-first) Looking at (A 10) Looking at (B 20) Looking at (A 30) Looking at (B 40) 2 CL-USER> (count 'a *v* :key #'verbose-first :from-end t) Looking at (B 40) Looking at (A 30) Looking at (B 20) Looking at (A 10) 2
估计这么一堆代码下来,有些筒子要受不鸟,那么整理下,看张表格:
Argument | Default | Meaning |
:test | EQL | Two-argument function used to compare item (or value extracted by :key function) to element. |
:key | NIL | One-argument function to extract key value from actual sequence element. NIL means use element as is. |
:start | 0 | Starting index (inclusive) of subsequence. |
:end | NIL | Ending index (exclusive) of subsequence. NIL indicates end of sequence. |
:from-end | NIL | If true, the sequence will be traversed in reverse order, from end to start. |
:count | NIL | Number indicating the number of elements to remove or substitute or NIL to indicate all (REMOVE and SUBSTITUTE only). |
舒活下筋骨,咱对前面的方法做些许改造:
(count-if #'evenp #(1 2 3 4 5)) ==> 2 (count-if-not #'evenp #(1 2 3 4 5)) ==> 3 (position-if #'digit-char-p "abcd0001") ==> 4 (remove-if-not #'(lambda (x) (char= (elt x 0) #\f)) #("foo" "bar" "baz" "foom")) ==> #("foo" "foom") (count-if #'evenp #((1 a) (2 b) (3 c) (4 d) (5 e)) :key #'first) ==> 2 (count-if-not #'evenp #((1 a) (2 b) (3 c) (4 d) (5 e)) :key #'first) ==> 3 (remove-if-not #'alpha-char-p #("foo" "bar" "1baz") :key #'(lambda (x) (elt x 0))) ==> #("foo" "bar") (remove-duplicates #(1 2 1 2 3 1 2 3 4)) ==> #(1 2 3 4)
是不是似曾相识?(不然你就是舒活太久筋骨了~)加了些类似自然英语的后缀,就变成复合函数了。
这里原作者啰嗦了一堆,大家学编程的,都是聪明人,看这代码,哪还不能理解这个~
以上根本其实是对序列中的单数据操作,Lisp还有一些可以对整个序列操作的函数:
COPY-SEQ 拷贝完整序列
REVERSE
反向序列
CONCATENATE
连接多序列 (这个需要指明合成序列是什么集合)
(concatenate 'vector #(1 2 3) '(4 5 6)) ==> #(1 2 3 4 5 6) (concatenate 'list #(1 2 3) '(4 5 6)) ==> (1 2 3 4 5 6) (concatenate 'string "abc" '(#\d #\e #\f)) ==> "abcdef"
作为序列,如果连排序都做不到,实在是太说不过去了,lisp提供sort,stable-sort(靠谱加强版?):
(sort (vector "foo" "bar" "baz") #'string<) ==> #("bar" "baz" "foo")
这种排序是会影响到原数据的,故而又称为破坏性函数,虽然如此,仍然建议大家这么写:
(setf my-sequence (sort my-sequence #'string<))
而不是这样:
(sort my-sequence #'string<)
merge函数:类似concatenate,不过它支持排序哦:
(merge 'vector #(1 3 5) #(2 4 6) #'<) ==> #(1 2 3 4 5 6) (merge 'list #(1 3 5) #(2 4 6) #'<) ==> (1 2 3 4 5 6)
还有一种需求(需求无限多啊)是要操纵序列的子序列,subseq函数算是代表:
(subseq "foobarbaz" 3) ==> "barbaz" (subseq "foobarbaz" 3 6) ==> "bar"
在配合着setf:
(defparameter *x* (copy-seq "foobarbaz")) (setf (subseq *x* 3 6) "xxx") ; subsequence and new value are same length *x* ==> "fooxxxbaz" (setf (subseq *x* 3 6) "abcd") ; new value too long, extra character ignored. *x* ==> "fooabcbaz" (setf (subseq *x* 3 6) "xx") ; new value too short, only two characters changed *x* ==> "fooxxcbaz"
相对应的,查找匹配的子序列位置也很方便:
(position #\b "foobarbaz") ==> 3 (search "bar" "foobarbaz") ==> 3 (mismatch "foobarbaz" "foom") ==> 3 (mismatch "foobar" "bar" :from-end t) ==> 3
下面再来介绍下用于序列的判断函数:
(every #'evenp #(1 2 3 4 5)) ==> NIL (some #'evenp #(1 2 3 4 5)) ==> T (notany #'evenp #(1 2 3 4 5)) ==> NIL (notevery #'evenp #(1 2 3 4 5)) ==> T (every #'> #(1 2 3 4) #(5 4 3 2)) ==> NIL (some #'> #(1 2 3 4) #(5 4 3 2)) ==> T (notany #'> #(1 2 3 4) #(5 4 3 2)) ==> NIL (notevery #'> #(1 2 3 4) #(5 4 3 2)) ==> T
应该不难理解吧,想看作者啰嗦?
map函数:
(map 'vector #'* #(1 2 3 4 5) #(10 9 8 7 6)) ==> #(10 18 24 28 30) (map-into a #'+ a b c) 将值赋给a
map这东西怎么说捏,我一直当遍历用~
再给大家介绍一个强劲的递归函数reduce:
(reduce #'+ #(1 2 3 4 5 6 7 8 9 10)) ==> 55
reduce带上前面的那些个参数,无限可能性啊!!
终于,跳出不是那么友好的数字索引式向量,咱来看看哈希表:允许用你喜欢的字符串来索引值(什么叫人性化?)话不多说,哈希表的基本玩法:
(defparameter *h* (make-hash-table)) (gethash 'foo *h*) ==> NIL (setf (gethash 'foo *h*) 'quux) (gethash 'foo *h*) ==> QUUX
这里的gethash还有高级功能,在后面的章节会有给出,下面来个抢先看:
(defun show-value (key hash-table) (multiple-value-bind (value present) (gethash key hash-table) (if present (format nil "Value ~a actually present." value) (format nil "Value ~a because key not found." value)))) (setf (gethash 'bar *h*) nil) ; provide an explicit value of NIL (show-value 'foo *h*) ==> "Value QUUX actually present." (show-value 'bar *h*) ==> "Value NIL actually present." (show-value 'baz *h*) ==> "Value NIL because key not found."
看看就好,作为初学者,深究是伤脑筋的。如果会用到,自然而然~
与向量不大相同的是,由于并非数字索引的顺序结构,迭代、遍历哈希表也得换个方式:
(maphash #'(lambda (k v) (format t "~a => ~a~%" k v)) *h*) (maphash #'(lambda (k v) (when (< v 10) (remhash k *h*))) *h*)
还有种高级点的方法:
(loop for k being the hash-keys in *h* using (hash-value v) do (format t "~a => ~a~%" k v))
这个后面章节还有详解。
本章的目的在开始出就表明了。不过对于Lisp,List或许真的才是王道吧---敬请期待下一章节<因为List,它被称为Lisp>!
好吧,我承认,偷懒省略了很多文字(我仍然坚称作者略显啰嗦~),难道只有List才能正确以待?
(未完待续)