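Ruby's Kernel module already implements a set of I/O-related methods: gets, open, print, printf, putc, puts, readline, readlines, and test.
11.1 IO Objects
Ruby provides a base class, IO, whose subclasses include File and BasicSocket. An IO object is a bidirectional channel: one end is connected to the Ruby program, the other to an external resource.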
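To make the class relationship concrete, here is a minimal sketch. It uses IO.pipe, which the notes above do not mention; it is chosen only because it needs no external file. Each end of a pipe is one-directional, so this illustrates the "channel" idea rather than a single bidirectional object. The helper name through_pipe is hypothetical.

```ruby
# File is a subclass of IO, and the standard streams are IO objects too.
puts File.superclass        # IO
puts $stdout.is_a?(IO)      # true

# Sends +text+ through a pipe and returns what arrives at the other end.
# (through_pipe is a made-up helper name for this sketch.)
def through_pipe(text)
  reader, writer = IO.pipe
  writer.puts text
  writer.close              # closing the write end lets read see end-of-file
  reader.read
ensure
  reader.close if reader && !reader.closed?
end

p through_pipe("hello")     # "hello\n"
```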
11.2 Opening and Closing Files
file = File.new("testfile", "r")
#...process the file
file.close
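# The second argument is the mode: "r" (read), "w" (write), "r+" (read-write)
The new method above returns a File object. File.open is similar, except that if you give it a block, it calls the block with the opened file as the argument and automatically closes the file when the block finishes.
File.open("testfile", "r") do |file|
  # ... process the file
end # <- file automatically closed here
Prefer this form when working with files: if an exception is raised during processing, File.open closes the file before the exception propagates.
# Internally, File.open behaves roughly like this: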
class File
  def File.open(*args)
    result = f = File.new(*args)
    if block_given?
      begin
        result = yield f
      ensure
        f.close
      end
    end
    result
  end
end
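The effect of that ensure clause can be observed directly. The following is a small sketch, assuming a scratch file created in a temporary directory; the helper name closed_after_failure? is invented for this illustration.

```ruby
require 'tmpdir'

# Opens +path+ with a block, raises inside the block, and reports whether
# the File object was closed afterwards.
# (closed_after_failure? is a hypothetical helper for this demo.)
def closed_after_failure?(path)
  leaked = nil
  begin
    File.open(path, "r") do |file|
      leaked = file                      # keep a reference for inspection only
      raise "simulated failure while processing"
    end
  rescue RuntimeError
    # The exception still reaches the caller; File.open did not swallow it.
  end
  leaked.closed?
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "testfile")      # scratch file for this demo
  File.write(path, "This is line one\n")
  puts closed_after_failure?(path)       # true
end
```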
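11.3 Reading and Writing Files
gets reads a line from standard input. When the script is run with file names on the command line, gets reads from those files instead.
while line = gets
  puts line
end
$ ruby copy.rb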
These are lines
These are lines
that I am typing
that I am typing
^D
# Specifying a file when running the script
$ ruby copy.rb testfile
This is line one
This is line two
# Open the file explicitly and print it line by line
File.open("testfile") do |file|
  while line = file.gets
    puts line
  end
end
Reading with iterators
# IO#each_byte yields the file's contents one 8-bit byte at a time
# Integer#chr converts a byte value to the corresponding character
File.open("testfile") do |file|
  file.each_byte.with_index do |ch, index|
    print "#{ch.chr}:#{ch} "
    break if index > 10
  end
end
produces:
T:84 h:104 i:105 s:115  :32 i:105 s:115  :32 l:108 i:105 n:110 e:101
IO#each_line reads the file line by line
# String#dump makes the newline characters visible
File.open("testfile") do |file|
  file.each_line { |line| puts "Got #{line.dump}" }
end
produces:
Got "This is line one\n"
Got "This is line two\n"
Got "This is line three\n"
Got "And so on...\n"
# IO#each_line accepts a custom line separator; the example below uses "e" as the separator
File.open("testfile") do |file|
  file.each_line("e") { |line| puts "Got #{line.dump}" }
end
produces:
Got "This is line"
Got " one"
Got "\nThis is line"
Got " two\nThis is line"
Got " thre"
Got "e"
Got "\nAnd so on...\n"
# Using IO.foreach
IO.foreach("testfile") { |line| puts line }
You can also read a whole file into a single String, or into an array of Strings (one element per line).
# read into a string
str = IO.read("testfile")
str.length   #=> 66
str[0, 30]   #=> "This is line one\nThis is line "
# read into an array
arr = IO.readlines("testfile")
arr.length   #=> 4
arr[0]       #=> "This is line one\n"
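As a variation on the two calls above, the sketch below uses File.read and File.readlines (File inherits both from IO). The chomp: true option is an addition not covered in these notes and assumes Ruby 2.4 or later; the helper name slurp and the two-line sample file are made up for the example.

```ruby
require 'tmpdir'

# Returns [whole_text, lines_without_newlines] for the file at +path+.
# (slurp is a hypothetical helper name.)
def slurp(path)
  text  = File.read(path)                      # entire file as one String
  lines = File.readlines(path, chomp: true)    # one element per line, "\n" removed
  [text, lines]
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "testfile")
  File.write(path, "This is line one\nThis is line two\n")
  text, lines = slurp(path)
  puts text.length      # 34
  p lines               # ["This is line one", "This is line two"]
end
```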
Note: I/O operations often fail at run time. When calling these APIs, remember to use begin..rescue..end to catch the exceptions they raise.
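A minimal sketch of that advice follows. It rescues SystemCallError, the parent class of the Errno::* exceptions such as Errno::ENOENT; the helper name read_or_nil and the choice to return nil on failure are assumptions made for the example, not a recommendation from the source.

```ruby
# Returns the file's contents, or nil if the file cannot be read.
# (read_or_nil is a hypothetical helper name.)
def read_or_nil(path)
  begin
    File.read(path)
  rescue SystemCallError => e      # covers Errno::ENOENT, Errno::EACCES, ...
    warn "could not read #{path}: #{e.message}"
    nil
  end
end

p read_or_nil("no-such-file-for-this-demo.txt")   # nil (plus a warning on stderr)
```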
Writing files
# Note the "w", which opens the file for writing
File.open("output.txt", "w") do |file|
  file.puts "Hello"
  file.puts "1+2=#{1+2}"
end
When nil is written to a file, it is written as an empty string.
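The nil behaviour can be checked with a short sketch, assuming Ruby 1.9 or later and a scratch file in a temporary directory; write_nil is a made-up helper name.

```ruby
require 'tmpdir'

# Writes nil with print and with puts, then returns what ended up in the file.
# (write_nil is a hypothetical helper name.)
def write_nil(path)
  File.open(path, "w") do |file|
    file.print nil    # nil.to_s is "", so nothing is written
    file.puts nil     # an empty string followed by the line terminator
  end
  File.read(path)
end

Dir.mktmpdir do |dir|
  p write_nil(File.join(dir, "output.txt"))   # "\n"
end
```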
Doing I/O with Strings
The StringIO class is similar to Java's StringReader and StringWriter: it implements the same methods as IO, but reads from and writes to a String.
require 'stringio'
ip = StringIO.new("now is\nthe time\nto learn\nRuby!")
op = StringIO.new("", "w")
ip.each_line do |line|
  op.puts line.reverse
end
op.string #=> "\nsi won\n\nemit eht\n\nnrael ot\n!ybuR\n"
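Because StringIO responds to the same methods as IO, code written against an IO-like object can be exercised without touching the file system. The sketch below shows one way this might look; number_lines is a hypothetical method invented for the example.

```ruby
require 'stringio'

# Copies +input+ to +output+, prefixing each line with its line number.
# Works with any IO-like object: File, $stdin/$stdout, or StringIO.
# (number_lines is a hypothetical method for this sketch.)
def number_lines(input, output)
  input.each_line.with_index(1) do |line, number|
    output.puts "#{number}: #{line.chomp}"
  end
end

source = StringIO.new("now is\nthe time\n")
sink   = StringIO.new
number_lines(source, sink)
p sink.string   # "1: now is\n2: the time\n"
```

The same method could be called as number_lines($stdin, $stdout) or with two File objects.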
11.4 Network Communication
require 'socket'
client = TCPSocket.open('127.0.0.1', 'www')
client.send("OPTIONS /~dave/ HTTP/1.0\n\n", 0) # 0 means standard packet
puts client.readlines
client.close
produces:
HTTP/1.1 200 OK
Date: Mon, 27 May 2013 17:31:00 GMT
Server: Apache/2.2.22 (Unix) DAV/2 PHP/5.3.15 with Suhosin-Patch mod_ssl/2.2.22
OpenSSL/0.9.8r
Allow: GET,HEAD,POST,OPTIONS
Content-Length: 0
Connection: close
Content-Type: text/html
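The example above needs a web server listening on the local machine. As a self-contained alternative, the sketch below starts a throwaway TCPServer in the same process and talks to it with TCPSocket. The one-line echo protocol and the helper name round_trip are assumptions made purely for illustration.

```ruby
require 'socket'

# Starts a one-shot server on a free local port, sends +text+ to it from a
# client socket, and returns the server's reply.
# (round_trip is a hypothetical helper name.)
def round_trip(text)
  server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
  port   = server.addr[1]

  handler = Thread.new do
    connection = server.accept
    line = connection.gets                 # read one line from the client
    connection.puts "echo: #{line.chomp}"
    connection.close
  end

  client = TCPSocket.open('127.0.0.1', port)
  client.puts text
  reply = client.gets
  client.close
  handler.join
  server.close
  reply
end

puts round_trip("hello")   # echo: hello
```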
The libraries under lib/net provide higher-level wrappers for application protocols (FTP, HTTP, POP, SMTP, and Telnet).
require 'net/http'
http = Net::HTTP.new('pragprog.com', 80)
response = http.get('/book/ruby3/programming-ruby-1-9')
if response.message == "OK"
  puts response.body.scan(/<img alt=".*?" src="(.*?)"/m).uniq[0,3]
end
produces:
http://pragprog.com/assets/logo-c5c7f9c2f950df63a71871ba2f6bb115.gif
http://pragprog.com/assets/drm-free80-9120ffac998173dc0ba7e5875d082f18.png
http://imagery.pragprog.com/products/99/ruby3_xlargecover.jpg?1349967653
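Since the remote page may change or be unreachable, here is a sketch of the same Net::HTTP flow run against a canned local response. Everything about the local server (start_canned_server, the sample HTML, the port selection) is hypothetical scaffolding. The sketch also checks the response class with Net::HTTPSuccess instead of comparing the message to "OK"; that is an alternative to the check in the original, not what the original does.

```ruby
require 'net/http'
require 'socket'

# Serves a single canned HTTP response on a free local port (a stand-in for
# a real web server) and returns the port number. Hypothetical helper.
def start_canned_server(body)
  server = TCPServer.new('127.0.0.1', 0)
  port   = server.addr[1]
  Thread.new do
    connection = server.accept
    # Consume the request line and headers up to the blank line.
    while (line = connection.gets) && line != "\r\n"
    end
    connection.write "HTTP/1.1 200 OK\r\n" \
                     "Content-Type: text/html\r\n" \
                     "Content-Length: #{body.bytesize}\r\n" \
                     "Connection: close\r\n\r\n" \
                     "#{body}"
    connection.close
    server.close
  end
  port
end

# Returns the src attributes of the <img> tags served at +path+.
def image_sources(port, path = '/')
  http = Net::HTTP.new('127.0.0.1', port, nil)   # nil: do not use a proxy
  response = http.get(path)
  return [] unless response.is_a?(Net::HTTPSuccess)   # any 2xx status
  response.body.scan(/<img alt=".*?" src="(.*?)"/m).flatten.uniq
end

html = %(<img alt="logo" src="/logo.gif"><img alt="cover" src="/cover.jpg">)
p image_sources(start_canned_server(html))   # ["/logo.gif", "/cover.jpg"]
```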
At an even higher level
require 'open-uri'
# URI.open (Ruby 2.5+); Kernel#open no longer opens URLs as of Ruby 3.0
URI.open('http://pragprog.com') do |f|
  puts f.read.scan(/<img alt=".*?" src="(.*?)"/m).uniq[0,3]
end
produces:
http://pragprog.com/assets/logo-c5c7f9c2f950df63a71871ba2f6bb115.gif
http://pragprog.com/assets/drm-free80-9120ffac998173dc0ba7e5875d082f18.png
http://imagery.pragprog.com/products/353/jvrails2_xlargebeta.jpg?1368826914
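11.5 Parsing HTML
# Match with a regular expression, %r{...}m; the m option makes . match newlines too
require 'open-uri'
# URI.open (Ruby 2.5+); Kernel#open no longer opens URLs as of Ruby 3.0
page = URI.open('http://pragprog.com/titles/ruby3/programming-ruby-1-9').read
if page =~ %r{<title>(.*?)</title>}m
  puts "Title is #{$1.inspect}"
end
produces:
Title is "The Pragmatic Bookshelf | Programming Ruby 1.9"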
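The role of the m option is easiest to see on a literal string, with no network involved. In this sketch the sample HTML and the helper name page_title are invented for the example.

```ruby
# Extracts the text of the first <title> element, or nil if there is none.
# (page_title is a hypothetical helper name.)
def page_title(html)
  html[%r{<title>(.*?)</title>}m, 1]
end

html = "<html><head><title>\n  Programming Ruby 1.9\n</title></head></html>"

p page_title(html)                     # "\n  Programming Ruby 1.9\n"
p html[%r{<title>(.*?)</title>}, 1]    # nil: without m, . does not match "\n"
```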
The nokogiri library provides much more robust HTML parsing.
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::HTML(URI.open("http://pragprog.com/"))
puts "Page title is " + doc.xpath("//title").inner_html
# Output the first paragraph in the div with an id="copyright"
# (nokogiri supports both xpath and css-like selectors)
puts doc.css('div#copyright p')
# Output the second hyperlink in the site-links div using xpath and css
puts "\nSecond hyperlink is"
puts doc.xpath('id("site-links")//a[2]')
puts doc.css('#site-links a:nth-of-type(2)')
Nokogiri can also update and create HTML and XML.