Matching fields in a MongoDB database

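The previous post covered how to parse a fairly large XML data set (about 50 GB) and import it into MongoDB. The question now is how to match fields against that much data quickly and efficiently.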

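Method 1:

First, fetch every value that needs to be matched from MongoDB database A; MongoDB database B holds the raw data to match against. Both sets of data are pulled out of MongoDB and compared in the client program, so the actual processing happens on a third party rather than inside the database. That is acceptable for a small database, but with large data it becomes much harder and very slow. The code is as follows: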

        import java.util.ArrayList;
        import java.util.List;
        import java.util.logging.Level;
        import java.util.logging.Logger;

        import com.mongodb.Bytes;
        import com.mongodb.DB;
        import com.mongodb.DBCollection;
        import com.mongodb.DBCursor;
        import com.mongodb.DBObject;
        import com.mongodb.Mongo;

        public class ClientSideMatch {

            // port: the MongoDB server port (left undefined in the original snippet)
            @SuppressWarnings("deprecation")
            public static void match(int port) {
                // Only log SEVERE messages from the driver
                Logger mongoLogger = Logger.getLogger("org.mongodb.driver");
                mongoLogger.setLevel(Level.SEVERE);

                Mongo mongo = new Mongo("localhost", port);
                DB db = mongo.getDB("databaseName");

                // Read the NAME_1 value of every document in the collection to match from
                DBCollection nameCollection = db.getCollection("collectionName");
                DBCursor nameCursor = nameCollection.find();
                nameCursor.addOption(Bytes.QUERYOPTION_NOTIMEOUT);
                List<String> names = new ArrayList<String>();
                while (nameCursor.hasNext()) {
                    Object name = nameCursor.next().get("NAME_1");
                    if (name == null) {
                        continue; // skip documents without NAME_1
                    }
                    // Normalize: lower-case and remove spaces, then keep the value in the list
                    names.add(name.toString().toLowerCase().replaceAll(" ", ""));
                }

                // Scan the raw data and copy every document whose title matches into resultdata
                DBCollection source = db.getCollection("testdata1");
                DBCollection result = db.getCollection("resultdata");
                DBCursor sourceCursor = source.find();
                sourceCursor.addOption(Bytes.QUERYOPTION_NOTIMEOUT);
                System.out.println("Field matching started");
                while (sourceCursor.hasNext()) {
                    DBObject doc = sourceCursor.next();
                    Object title = doc.get("title");
                    if (title == null) {
                        continue; // skip documents without title
                    }
                    String normalizedTitle = title.toString().replaceAll(" ", "").toLowerCase();
                    // Linear scan of the whole name list for each document: this is the slow part.
                    // The original loop inserted and broke out after the first comparison,
                    // so only the first name was ever checked; here every name is checked
                    // and each matching document is inserted exactly once.
                    if (names.contains(normalizedTitle)) {
                        result.insert(doc);
                    }
                }
                System.out.println("Field matching finished");
                mongo.close();
            }
        }
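The matching step of Method 1 can be tried out without a database. The following is only a minimal sketch under stated assumptions, not the author's code: the class name NormalizedMatchSketch and the helpers normalize and matchTitles are hypothetical, the data is a pair of in-memory lists, and a HashSet is swapped in for the list scan so that each lookup no longer walks every name. It also lower-cases with Locale.ROOT, which may differ from the default-locale toLowerCase() used above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NormalizedMatchSketch {

    // Same idea as Method 1: remove spaces, then lower-case (hypothetical helper).
    public static String normalize(String s) {
        return s.replace(" ", "").toLowerCase(Locale.ROOT);
    }

    // Returns the titles whose normalized form equals some normalized name.
    public static List<String> matchTitles(List<String> names, List<String> titles) {
        // Put the normalized names in a hash set so each lookup is a single probe
        Set<String> wanted = new HashSet<String>();
        for (String name : names) {
            wanted.add(normalize(name));
        }
        List<String> matched = new ArrayList<String>();
        for (String title : titles) {
            if (wanted.contains(normalize(title))) {
                matched.add(title);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only
        List<String> names = Arrays.asList("New York", "Sao Paulo");
        List<String> titles = Arrays.asList("new york", "Berlin", "SaoPaulo");
        System.out.println(matchTitles(names, titles)); // prints [new york, SaoPaulo]
    }
}
```

Even with a faster lookup, this approach still has to transfer every document from MongoDB to the client, which is probably the larger cost on a data set of this size.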
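The approach above runs quite slowly, so here is an improvement. Build an index on the field to be matched in MongoDB database B; I indexed the title field, since title is exactly what I want to match. Then send each value from MongoDB database A to database B as a query and let MongoDB do the matching. In my runs this was dramatically faster, a different order of magnitude altogether.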
        import java.util.ArrayList;
        import java.util.List;
        import java.util.logging.Level;
        import java.util.logging.Logger;

        import com.mongodb.BasicDBObject;
        import com.mongodb.Bytes;
        import com.mongodb.DB;
        import com.mongodb.DBCollection;
        import com.mongodb.DBCursor;
        import com.mongodb.DBObject;
        import com.mongodb.Mongo;

        public class ServerSideMatch {

            // port: the MongoDB server port (left undefined in the original snippet)
            @SuppressWarnings("deprecation")
            public static void match(int port) {
                // Only log SEVERE messages from the driver
                Logger mongoLogger = Logger.getLogger("org.mongodb.driver");
                mongoLogger.setLevel(Level.SEVERE);

                Mongo mongo = new Mongo("localhost", port);
                DB db = mongo.getDB("databaseName");

                // Read the NAME_1 value of every document in the collection to match from
                DBCollection nameCollection = db.getCollection("collectionName");
                DBCursor nameCursor = nameCollection.find();
                nameCursor.addOption(Bytes.QUERYOPTION_NOTIMEOUT);
                List<String> names = new ArrayList<String>();
                while (nameCursor.hasNext()) {
                    Object name = nameCursor.next().get("NAME_1");
                    if (name != null) {
                        names.add(name.toString());
                    }
                }

                // The index on title is assumed to exist already on testdata1
                DBCollection source = db.getCollection("testdata1");
                DBCollection result = db.getCollection("resultdata");
                System.out.println("Field matching started");
                for (String name : names) {
                    // Exact match of the name against title in testdata1
                    // (unlike Method 1, no lower-casing or space removal is applied)
                    BasicDBObject query = new BasicDBObject("title", name);
                    DBCursor matchCursor = source.find(query);
                    // Works around MongoCursorNotFoundException on long-lived cursors
                    matchCursor.addOption(Bytes.QUERYOPTION_NOTIMEOUT);
                    List<DBObject> matches = new ArrayList<DBObject>();
                    while (matchCursor.hasNext()) {
                        matches.add(matchCursor.next());
                    }
                    // Insert once per name, after the cursor is drained. The original inserted
                    // the growing list on every iteration, re-inserting earlier documents.
                    if (!matches.isEmpty()) {
                        result.insert(matches);
                    }
                }
                System.out.println("Field matching finished");
                mongo.close();
                System.out.println("MongoDB connection closed, field matching is done; add other cityName fields as needed");
            }
        }
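Why the indexed query is so much faster can be illustrated with an in-memory analogy. The sketch below is an assumption-laden illustration, not how MongoDB is implemented and not the author's code: the class TitleIndexSketch and the methods buildIndex and find are hypothetical, and a TreeMap stands in for the index only because it is also a sorted tree structure. The point is that a lookup walks the tree once instead of comparing the value against every document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TitleIndexSketch {

    // Builds a sorted map from each title to the positions of the documents carrying it.
    public static Map<String, List<Integer>> buildIndex(List<String> titles) {
        Map<String, List<Integer>> index = new TreeMap<String, List<Integer>>();
        for (int i = 0; i < titles.size(); i++) {
            String title = titles.get(i);
            List<Integer> positions = index.get(title);
            if (positions == null) {
                positions = new ArrayList<Integer>();
                index.put(title, positions);
            }
            positions.add(i);
        }
        return index;
    }

    // Exact-match lookup: one tree descent, no scan over all documents.
    public static List<Integer> find(Map<String, List<Integer>> index, String title) {
        List<Integer> positions = index.get(title);
        return positions == null ? Collections.<Integer>emptyList() : positions;
    }

    public static void main(String[] args) {
        // Made-up sample titles, for illustration only
        List<String> titles = Arrays.asList("Berlin", "Paris", "Berlin", "Oslo");
        Map<String, List<Integer>> index = buildIndex(titles);
        System.out.println(find(index, "Berlin")); // prints [0, 2]
        System.out.println(find(index, "Rome"));   // prints []
    }
}
```

The index is built once and then reused for every name, which mirrors the second version above: one index on title, many exact-match queries against it.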


 
