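Recently I have been building a feature at my company that analyzes user behavior logs to find out, within a given city, which areas see the most app opens and which areas see the most hotel check-ins. With it we can identify a city's hot spots and the areas where hotels are doing best.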
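Our logs are stored in a MySQL database, and no field records which city a user was accessing. Each record has only the user's current latitude/longitude and the latitude/longitude that was searched for. Looking up the city for every record through the Baidu API is not practical, because Baidu does not allow that many calls. So I decided to compute a geohash from the latitude and longitude and to use a prefix of that hash, of varying length, to decide whether a point belongs to a given city.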
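To make the role of the prefix length concrete, here is a minimal sketch of how large a cell a geohash prefix of a given length covers. It relies only on the standard geohash layout (5 bits per character, bits alternating between longitude and latitude, longitude first). The class and method names are hypothetical and not part of the original code.

```java
public class GeohashCellSize {
    // Returns {longitude width, latitude height} in degrees for the cell
    // covered by a geohash prefix of the given length.
    public static double[] cellSizeDegrees(int prefixLength) {
        int totalBits = prefixLength * 5;      // each base-32 character carries 5 bits
        int lonBits = (totalBits + 1) / 2;     // longitude gets the extra bit when the total is odd
        int latBits = totalBits / 2;
        double lonWidth = 360.0 / (1L << lonBits);
        double latHeight = 180.0 / (1L << latBits);
        return new double[] { lonWidth, latHeight };
    }

    public static void main(String[] args) {
        for (int len = 1; len <= 6; len++) {
            double[] size = cellSizeDegrees(len);
            System.out.printf("prefix length %d: %.6f deg lon x %.6f deg lat%n",
                    len, size[0], size[1]);
        }
    }
}
```

A shorter prefix covers a larger cell, so the prefix length has to be chosen to fit the city. Since a city rarely lines up with a single cell, several prefixes are usually needed to cover one.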
The geohash code is as follows:
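import java.util.BitSet;
import java.util.HashMap;

public class Geohash {
    // Bits per coordinate: 30 for latitude and 30 for longitude,
    // 60 bits in total, which is 12 base-32 characters.
    private static int numbits = 6 * 5;
    // Geohash base-32 alphabet: digits and lowercase letters, without a, i, l and o.
    final static char[] digits = { '0', '1', '2', '3', '4', '5', '6', '7', '8',
            '9', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'm', 'n', 'p',
            'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z' };
    // Reverse mapping from a character to its 5-bit value, used by decode.
    final static HashMap<Character, Integer> lookup = new HashMap<Character, Integer>();
    static {
        int i = 0;
        for (char c : digits)
            lookup.put(c, i++);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new Geohash().encode(30.4066211031716, 118.36380581742704));
        System.out.println(new Geohash().encode(30.5814, 114.279));
        System.out.println(new Geohash().encode(30.47, 114.42));
        System.out.println(new Geohash().encode(30.61, 114.15));
        System.out.println(new Geohash().encode(30.4675290000, 114.5259580000));
    }

    // Converts a geohash back to an approximate {lat, lon} pair.
    public double[] decode(String geohash) {
        StringBuilder buffer = new StringBuilder();
        for (char c : geohash.toCharArray()) {
            // Adding 32 sets a sixth bit so the binary string keeps its leading zeros;
            // substring(1) then drops that extra bit again.
            int i = lookup.get(c) + 32;
            buffer.append(Integer.toString(i, 2).substring(1));
        }
        BitSet lonset = new BitSet();
        BitSet latset = new BitSet();
        // Even positions hold the longitude bits.
        int j = 0;
        for (int i = 0; i < numbits * 2; i += 2) {
            boolean isSet = false;
            if (i < buffer.length())
                isSet = buffer.charAt(i) == '1';
            lonset.set(j++, isSet);
        }
        // Odd positions hold the latitude bits.
        j = 0;
        for (int i = 1; i < numbits * 2; i += 2) {
            boolean isSet = false;
            if (i < buffer.length())
                isSet = buffer.charAt(i) == '1';
            latset.set(j++, isSet);
        }
        double lon = decode(lonset, -180, 180);
        double lat = decode(latset, -90, 90);
        return new double[] { lat, lon };
    }

    // Replays the binary search recorded in the bits to narrow [floor, ceiling].
    private double decode(BitSet bs, double floor, double ceiling) {
        double mid = 0;
        // Loop over all numbits positions. BitSet.length() is one past the highest
        // set bit, so an all-zero bit set would skip the loop and return 0.
        for (int i = 0; i < numbits; i++) {
            mid = (floor + ceiling) / 2;
            if (bs.get(i))
                floor = mid;
            else
                ceiling = mid;
        }
        return mid;
    }

    // Encodes a latitude/longitude pair as a 12-character geohash.
    public String encode(double lat, double lon) {
        BitSet latbits = getBits(lat, -90, 90);
        BitSet lonbits = getBits(lon, -180, 180);
        StringBuilder buffer = new StringBuilder();
        // Interleave the bits, longitude first.
        for (int i = 0; i < numbits; i++) {
            buffer.append((lonbits.get(i)) ? '1' : '0');
            buffer.append((latbits.get(i)) ? '1' : '0');
        }
        String hash = base32(Long.parseLong(buffer.toString(), 2));
        // base32 drops leading zero digits, so pad on the left to the full length.
        StringBuilder padded = new StringBuilder();
        for (int i = hash.length(); i < numbits * 2 / 5; i++)
            padded.append('0');
        return padded.append(hash).toString();
    }

    // Binary search over [floor, ceiling]: bit i is set when the value lies in the upper half.
    private BitSet getBits(double value, double floor, double ceiling) {
        BitSet buffer = new BitSet(numbits);
        for (int i = 0; i < numbits; i++) {
            double mid = (floor + ceiling) / 2;
            if (value >= mid) {
                buffer.set(i);
                floor = mid;
            } else {
                ceiling = mid;
            }
        }
        return buffer;
    }

    // Renders a long in the geohash base-32 alphabet. It works on the negated
    // value so that negating Long.MIN_VALUE cannot overflow.
    public static String base32(long i) {
        char[] buf = new char[65];
        int charPos = 64;
        boolean negative = (i < 0);
        if (!negative)
            i = -i;
        while (i <= -32) {
            buf[charPos--] = digits[(int) (-(i % 32))];
            i /= 32;
        }
        buf[charPos] = digits[(int) (-i)];
        if (negative)
            buf[--charPos] = '-';
        return new String(buf, charPos, (65 - charPos));
    }
}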
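As a rough sketch of how such hashes might be used for the city check, the snippet below keeps only the hashes that start with one of a city's prefixes. The class name, the prefixes and the sample hash strings are all made up for illustration. They are not values from the real logs and are not claimed to map to any particular city.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CityPrefixFilter {
    // True when the geohash starts with any of the prefixes that cover the city.
    public static boolean inCity(String geohash, List<String> cityPrefixes) {
        for (String prefix : cityPrefixes) {
            if (geohash.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Keeps only the hashes that fall inside the city.
    public static List<String> filter(List<String> hashes, List<String> cityPrefixes) {
        List<String> result = new ArrayList<String>();
        for (String hash : hashes) {
            if (inCity(hash, cityPrefixes)) {
                result.add(hash);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical prefixes standing in for the cells that cover one city.
        List<String> cityPrefixes = Arrays.asList("wt3k", "wt3m");
        // Made-up hashes standing in for values computed from log records.
        List<String> logHashes = Arrays.asList("wt3k8x2p", "wt3mf01z", "wtw3s5nv");
        System.out.println(filter(logHashes, cityPrefixes));
    }
}
```

If the hash were stored in its own indexed MySQL column, the same check could presumably be pushed into the query as a prefix match such as `LIKE 'wt3k%'`, though the original post does not say how the filtering was done.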
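Using the geohash, I pick out the user behavior log entries that fall within a given city, and then apply the KM algorithm to those entries to find the hot spots where users open our app to look at hotels in that city.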
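The clustering step is not shown in this post, and "KM" is not spelled out. Assuming it refers to k-means, a minimal sketch of that step could look like the following. The class name, the sample coordinates, the number of clusters and the iteration count are all hypothetical. The distance is plain Euclidean distance on degrees, which is only a rough approximation even within one city.

```java
import java.util.Arrays;

public class HotspotKMeans {
    // Runs Lloyd's k-means on {lat, lon} points and returns the k cluster centers.
    // Centers are seeded with the first k points, so the result is deterministic.
    // Requires points.length >= k.
    public static double[][] cluster(double[][] points, int k, int iterations) {
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sum = new double[k][2];
            int[] count = new int[k];
            // Assignment step: add each point to its nearest center's running sum.
            for (double[] p : points) {
                int best = nearest(p, centers);
                sum[best][0] += p[0];
                sum[best][1] += p[1];
                count[best]++;
            }
            // Update step: move each center to the mean of its assigned points.
            for (int c = 0; c < k; c++) {
                if (count[c] > 0) { // an empty cluster keeps its previous center
                    centers[c][0] = sum[c][0] / count[c];
                    centers[c][1] = sum[c][1] / count[c];
                }
            }
        }
        return centers;
    }

    // Index of the center closest to p, by squared Euclidean distance in degrees.
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dLat = p[0] - centers[c][0];
            double dLon = p[1] - centers[c][1];
            double dist = dLat * dLat + dLon * dLon;
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up {lat, lon} points standing in for filtered log records.
        double[][] points = {
            { 30.58, 114.27 }, { 30.47, 114.42 }, { 30.59, 114.28 },
            { 30.48, 114.43 }, { 30.57, 114.29 }, { 30.46, 114.41 }
        };
        for (double[] center : cluster(points, 2, 10)) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```

Each returned center is a candidate hot spot. In practice the number of clusters and the seeding strategy would need tuning against the real data.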