自我感觉对于全表扫描也有了小小的感觉.现将其总结于下.
功能介绍:假设内存中加载了1000条数据,仅有字段UrlPath (varchar),实现的功能等于sql语句:select urlpath,count(urlpath) from urldata group by urlpath ,勿喷,向下看.
个人使用的两种方法:
1,使用字段标记(select top 1000 urlpath,0 as Dealed from urldata)
2.使用Remove将处理过的DataRow移出DataTable
我要的是测试他们的运算时间是多少!!!
下面附上代码:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Data;
using System.Data.SqlClient;
namespace DBTest
{
class Program
{
private static DataTable table;
static void Main(string[] args)
{
GetTable();
Start();
Console.Read();
}
static void GetTable()
{
Stopwatch sw = new Stopwatch();
sw.Start();
string connstr = "Data Source=localhost;Initial Catalog=BigTable;Persist Security Info=True;User ID=***;Password=***";
SqlConnection conn = new SqlConnection(connstr);
SqlCommand command = new SqlCommand("select top 1000 * ,0 as dealed from urldata ",conn);
SqlDataAdapter sda = new SqlDataAdapter(command);
DataSet ds = new DataSet();
sda.Fill(ds);
table = ds.Tables[0];
sw.Stop();
Console.WriteLine("加载数据:{0}",sw.ElapsedMilliseconds);
}
static void Start()
{
Stopwatch sw = new Stopwatch();
sw.Start();
SignTest();
sw.Stop();
Console.WriteLine("字段标记:{0}",sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
RemoveTest();
sw.Stop();
Console.WriteLine("记录移出:{0}",sw.ElapsedMilliseconds);
}
static void RemoveTest()
{
DataRow[] rows;
string urlpath=string.Empty;
while (table.Rows.Count > 0)
{
//Console.Write(table.Rows.Count);
rows=null;
urlpath=table.Rows[0]["UrlPath"].ToString();
//将使用hashtable/Dictionary来处理
rows=table.Select("urlpath='"+urlpath+"'");
foreach(DataRow row in rows)
{
table.Rows.Remove(row);
}
}
}
static void SignTest()
{
DataRow[] rows;
string urlpath = string.Empty;
foreach (DataRow row in table.Rows)
{
if (row["dealed"].ToString() == "1")
continue;
urlpath = row["Urlpath"].ToString();
rows = table.Select("urlpath='" + urlpath + "'");
foreach (DataRow row2 in rows)
{
row2["dealed"] = 1;
}
}
}
}
}
现在看一下执行结果
到了这里,请容我说句脏话,好JB慢.
内容到此结束.
在此厚颜求高手指教更好的全表扫描的方法,只用SQL完成不行(1000W的记录分个组就搞死人家数据库了),只要有理,请喷!!!
转载请注明出处.