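My boss handed me a task: read a file whose records are delimited by "|", split each record into fields, and insert those fields into the matching columns of a database table.
1. The first part is simple: read the file through an IO stream, go through the lines one by one, and split each of them.
Here is the code: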
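// imports: java.io.*, java.sql.SQLException, java.util.ArrayList, java.util.List
public static void readTxtFile(String filePath) throws IOException, SQLException {
    String encoding = "UTF-8"; // read the file with an explicit encoding
    List<String> list = new ArrayList<>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader bufferedReader = new BufferedReader(
            new InputStreamReader(new FileInputStream(filePath), encoding))) {
        String lineTxt;
        while ((lineTxt = bufferedReader.readLine()) != null) {
            // "|" is a regex metacharacter, so it has to be escaped
            String[] temp = lineTxt.split("\\|");
            // Only the VALUES clause is built here; the INSERT INTO part
            // for the target table still has to be put in front of it.
            StringBuilder sqlValue = new StringBuilder("VALUES (");
            for (int i = 0; i < temp.length; i++) {
                String field = temp[i].trim();
                if (field.equals("null")) {
                    field = ""; // blank out only fields that are exactly "null"
                }
                if (i > 0) {
                    sqlValue.append(",");
                }
                sqlValue.append("'").append(field).append("'");
            }
            sqlValue.append(")");
            list.add(sqlValue.toString());
        }
    }
    doexcuteBatch(list); // hand the statements to the batch method from step 2
}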
public static void main(String[] argv) throws IOException, SQLException {
    String filePath = "E:\\缓存\\QQ缓存\\1023047818\\FileRecv\\part-00000";
    readTxtFile(filePath);
}
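One detail of the split step is easy to miss when the fields have to line up with table columns: String.split drops trailing empty strings by default, so a line that ends in empty fields yields fewer elements than there are columns. Passing a negative limit keeps them. A minimal sketch of the difference, using a made-up sample line and a hypothetical helper named splitFields (neither comes from the original data):

```java
import java.util.Arrays;

class PipeSplitDemo {
    // Splits one record on "|" and trims every field.
    // The -1 limit keeps trailing empty fields, which plain split("\\|") drops.
    static String[] splitFields(String line) {
        String[] fields = line.split("\\|", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical record: four fields, the last two empty
        String line = "1001| Alice ||";
        System.out.println(line.split("\\|").length); // prints 2
        System.out.println(splitFields(line).length); // prints 4
        System.out.println(Arrays.toString(splitFields(line)));
    }
}
```

If the table expects a fixed number of columns, checking the array length against that number before building the statement catches malformed lines early.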
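2. Now comes the hard part. Running an ordinary insert statement for each row is out of the question: there are 170,000 rows, and inserting them one at a time would grind Eclipse to a halt. That is where another approach comes in: executeBatch! It is a great tool for bulk execution. When you need to run tens or hundreds of thousands of statements in one go, as I do here, executing them in batches of a thousand is far more efficient. Enough talk, here is the code: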
private static void doexcuteBatch(List<String> list) {
    // conn, drive, dburl, DBUSER and password are fields of the enclosing class
    Statement pstmt = null;
    try {
        Class.forName(drive); // load the JDBC driver before connecting
        conn = DriverManager.getConnection(dburl, DBUSER, password); // connect once
        conn.setAutoCommit(false); // commit manually, once per batch
        pstmt = conn.createStatement(); // one Statement, created outside the loop
        for (int i = 0; i < list.size(); i++) {
            pstmt.addBatch(list.get(i));
            if ((i + 1) % 1000 == 0) { // flush every 1000 statements
                long startTime = System.currentTimeMillis();
                pstmt.executeBatch();
                conn.commit();
                pstmt.clearBatch();
                System.out.println("Committed: " + (i + 1));
                System.out.println("executeBatch took " + (System.currentTimeMillis() - startTime) / 1000 + " s");
            }
        }
        pstmt.executeBatch(); // flush whatever is left over
        conn.commit();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (pstmt != null) {
            try {
                pstmt.close(); // release the statement even after a failure
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}
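Things to watch out for:
1 conn.setAutoCommit(false); Auto-commit has to be turned off, so that each batch is committed as one transaction instead of row by row.
2 pstmt = conn.createStatement(); Create the statement outside the loop. If a new one is created on every iteration, each one holds only the statement just added to it, so only one statement ends up being executed.
3 pstmt.clearBatch(); Clear the batch after each execution, to keep the program from hanging and to avoid submitting the same statements twice. The JDBC specification says executeBatch already resets the batch when it returns, so with a compliant driver this call is a safeguard rather than a requirement.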
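The statements above are built by concatenating field values into SQL text, which breaks as soon as a field contains a single quote. A PreparedStatement can run the same batch loop while leaving the quoting to the driver. The following is a minimal sketch under assumptions, not the code used in this post: the class PreparedBatchInsert and the method insertAll are hypothetical names, and the caller is assumed to supply an open Connection and an INSERT statement with one "?" per column.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class PreparedBatchInsert {
    // Inserts all rows in batches of batchSize and returns the number of
    // executeBatch() calls made. Each String[] must have one element per "?"
    // placeholder in sql.
    static int insertAll(Connection conn, String sql, List<String[]> rows, int batchSize)
            throws SQLException {
        int flushes = 0;
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit once per batch, not once per row
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                for (int col = 0; col < row.length; col++) {
                    ps.setString(col + 1, row[col]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    pending = 0;
                    flushes++;
                }
            }
            if (pending > 0) { // flush the remainder
                ps.executeBatch();
                conn.commit();
                flushes++;
            }
        } catch (SQLException e) {
            conn.rollback(); // undo the batch that failed
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit); // restore the caller's setting
        }
        return flushes;
    }
}
```

A call might look like insertAll(conn, "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)", rows, 1000), where my_table, col_a and col_b are placeholder names. Batches that were already committed stay in the table if a later batch fails, which matches the per-batch commit used in the loop above.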