I, Hu Hansan, am back again
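After the FLAG I set last time, I left this blog untouched for so long that the swelling on my face has gone down. My science and technology innovation project now needs data analysis to determine the motion state of goats. When we set up the project I had read that the k-means algorithm could do this, but a senior student recently told me the algorithm is too old and recommended that I look at the AP algorithm and Python instead. I suspect his little scheme is that once I write it, he can borrow from mine. Cheeky as I am, how could I give in? I have been reading about Ruby for a while, so this is a good chance to practice, and k-means is something I am determined to write. I followed this post [http://www.csdn.net/article/2012-07-03/2807073-k-means], wrote my own version, and it runs reasonably well. For how the algorithm works, read that post directly; I only give the code here.
The code follows.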
K-means.rb
#! /bin/ruby
# -*- coding: utf-8 -*-
require 'pg'
require './sport'

conn = PG.connect(:dbname => 'dbname', :port => 5432, :user => 'username', :password => 'password', :host => 'localhost')

# Read the motion data from PostgreSQL
sportArr = Array.new
res = conn.query("select * from goat where goatid='G1' and id < 4000 ")
res.each do |row|
  sportArr << Sport.new(row['id'], row['datatime'], row['sportx'], row['sporty'], row['sportz'])
end
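
# Choose the initial seeds (cluster centers) by record id
seedArr = Array.new      # current cluster centers
classArr = Array.new     # members of each cluster
classNumArr = Array.new  # cluster sizes from the previous pass
indexArr = Array[3252, 3311, 3381]
sportArr.each_with_index do |item, i|
  indexArr.each do |j|
    if item.id.to_i == j.to_i then
      # Copy the record, so that moving the seed later does not overwrite the sample itself
      seedArr << item.dup
      classArr << Array.new
      classNumArr << 0
    end
  end
end

# Print the initial seeds
seedArr.each do |i|
  i.printOut
end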

if seedArr.empty? then
  puts "seedArr is null"
else
  flag = true
  while flag do
    # Assignment step: put each sample into the cluster whose seed is most similar
    sportArr.each do |i|
      index = 0
      1.step(seedArr.size-1, 1) do |j|
        if i.similarity(seedArr[j]) > i.similarity(seedArr[index]) then
        # if i.distanceWithZ(seedArr[j]) < i.distanceWithZ(seedArr[index]) then
          index = j
        end
      end
      classArr[index] << i
      # puts index
    end
    classArr.size.times do |i|
      puts "num class#{i}:#{classArr[i].size}"
    end

    # Update step: move each seed to the mean of its cluster
    seedArr.each_with_index do |item, i|
      sum_x = 0; sum_y = 0; sum_z = 0
      classArr[i].each do |j|
        sum_x += j.x
        sum_y += j.y
        sum_z += j.z
      end
      if classArr[i].size > 0 then
        item.x = sum_x / classArr[i].size
        item.y = sum_y / classArr[i].size
        item.z = sum_z / classArr[i].size
      else
        # Empty cluster: reset the seed to the origin
        item.x = 0
        item.y = 0
        item.z = 0
      end
    end
    puts "Seeds repositioned"

    # Convergence check: count the clusters whose size is unchanged since the previous pass.
    # Only sizes are compared, not the members themselves.
    temp = 0
    classNumArr.each_with_index do |item, i|
      if item == classArr[i].size then
        temp += 1
      end
    end
    puts temp
    if temp == seedArr.size then
      puts "Clustering finished"
      flag = false
    else
      classNumArr.size.times do |i|
        classNumArr[i] = classArr[i].size
      end
    end

    # Empty the clusters before the next pass
    classArr.each do |i|
      i.clear
    end
  end
end

# Print the final seeds
seedArr.each do |i|
  i.printOut
end
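
The stopping test in K-means.rb compares cluster sizes only, so it could stop early if samples swap clusters while the counts stay the same. As a rough, self-contained sketch (not the script above), the same assign-then-update loop can be written on plain [x, y, z] arrays so that it stops only when no sample changes cluster. The names `cosine_similarity`, `centroid` and `kmeans`, the `max_iter` cap and the sample points are all made up for illustration, and `Array#sum` needs Ruby 2.4 or later.

```ruby
# Cosine similarity of two vectors; 0.0 if either one has zero length.
def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |p, q| p * q }
  norm = Math.sqrt(a.sum { |p| p * p }) * Math.sqrt(b.sum { |q| q * q })
  norm.zero? ? 0.0 : dot / norm
end

# Component-wise mean of a non-empty list of points.
def centroid(points)
  points.transpose.map { |coords| coords.sum / coords.size.to_f }
end

# Returns [centers, labels]; labels[i] is the cluster index of points[i].
def kmeans(points, seeds, max_iter: 100)
  centers = seeds.map(&:dup)  # copies, so the input points are never modified
  labels  = nil
  max_iter.times do
    # Assignment step: each point goes to the most similar center.
    new_labels = points.map do |p|
      (0...centers.size).max_by { |k| cosine_similarity(p, centers[k]) }
    end
    break if new_labels == labels  # no point changed cluster: converged
    labels = new_labels
    # Update step: each center moves to the mean of its members;
    # a center with no members stays where it was (an assumption of this sketch).
    centers = centers.each_with_index.map do |old, k|
      members = points.each_index.select { |i| labels[i] == k }.map { |i| points[i] }
      members.empty? ? old : centroid(members)
    end
  end
  [centers, labels]
end

# Made-up sample data: two points near the x axis, two near the y axis.
points = [[1.0, 0.0, 0.0], [2.0, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 3.0, 0.0]]
centers, labels = kmeans(points, [points[0], points[2]])
p labels  # => [0, 0, 1, 1]
```

Comparing the full label list costs a little more than comparing counts, but it cannot be fooled by two samples trading places.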
sport.rb
# -*- coding: utf-8 -*-
# 'mathn' is not required: every value here is a Float, and the library
# was removed from the standard library in Ruby 2.5.

# One motion sample read from the goat table.
class Sport
  attr_reader :id, :datatime
  attr_accessor :x, :y, :z, :state

  def initialize(inid, indatatime, inx, iny, inz)
    @id = inid
    @datatime = indatatime
    @x = inx.to_f
    @y = iny.to_f
    @z = inz.to_f
    @state = 'not analysis'
  end

  # Length of the (x, y, z) vector
  def len
    Math.sqrt(@x*@x + @y*@y + @z*@z)
  end

  # Similarity measure: cosine similarity. Rewrite this to suit your own needs.
  # Note: the result is NaN when either vector has zero length.
  def similarity another
    (@x*another.x + @y*another.y + @z*another.z)/(len()*another.len())
  end

  # Similarity that uses the z component only
  def similarityWithZ another
    temp = Math.sqrt(@z.abs*another.z.abs)
    if temp > 0 then
      # @z*another.z/Math.sqrt(@z.abs*another.z.abs)
      @z*another.z/temp
    else
      # puts "#{@id} and #{another.id}"
      0
    end
  end

  # Absolute difference of the z components
  def distanceWithZ another
    (@z - another.z).abs
  end

  def printOut
    puts "id:#{@id} datatime:#{@datatime} sportx:#{@x} sporty:#{@y} sportz:#{@z}"
  end
end
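
Since `similarity` is meant to be swapped for whatever measure you need, here is a small sketch of two alternatives: a cosine similarity that returns 0.0 instead of NaN for a zero-length vector, and a plain Euclidean distance. `Vec3` is a hypothetical stand-in for the Sport class that keeps only the three components, and the sample values are made up.

```ruby
# Hypothetical stand-in for Sport: just the x, y, z components.
Vec3 = Struct.new(:x, :y, :z) do
  def len
    Math.sqrt(x * x + y * y + z * z)
  end

  # Cosine similarity, guarded against zero-length vectors.
  def similarity(other)
    denom = len * other.len
    denom.zero? ? 0.0 : (x * other.x + y * other.y + z * other.z) / denom
  end

  # Euclidean distance: smaller means closer.
  def distance(other)
    Math.sqrt((x - other.x)**2 + (y - other.y)**2 + (z - other.z)**2)
  end
end

slow = Vec3.new(0.0, 0.0, 1.0)
fast = Vec3.new(0.0, 0.0, 8.0)  # same direction as slow, eight times the magnitude
zero = Vec3.new(0.0, 0.0, 0.0)

puts slow.similarity(fast)  # => 1.0
puts slow.distance(fast)    # => 7.0
puts slow.similarity(zero)  # => 0.0
```

Cosine similarity looks only at direction, so `slow` and `fast` count as identical. If the magnitude of the motion matters for telling states apart, a distance-based measure may be the better fit; remember that with a distance the comparison in the assignment step flips from `>` to `<`.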
Trial run results
This was written in only a few hours, so if you spot anything wrong, please point it out to me by email: horpoppy@gmail.com