Text2Sql研究(二)- 大模型测试对比结果

一. 前言

基于之前的文章Text2SQL研究-Chat2DB体验与剖析-CSDN博客,我们了解了Text2Sql基本原理和实现要点,下一步我们就对市面各个商业大模型做了SQL生成测试对比,我们挑选了9条语句,

序号问题
12023年度,累计加油金额是多少元?
22023年,每个季度的加油金额各是多少元?
32023年,每个月的加油金额各是多少元?
42023年,每个月的加油金额总体的变化趋势是怎样的?
5请列出2023年每个月加油金额相比上个月的环比变化百分比
6请列出2023年加油金额最高的十台车,并显示每台车的年度加油费用
72023年加油金额最高的十台车,他们的使用性质各是什么
82023年,加油金额最高的月份是哪个月?这个月哪十辆车加油费用最高
9请计算2023年车辆的平均油耗并列出油耗较低的前十辆车

并针对千帆Qianfan-Chinese-Llama-2-13B、通义千问-Max、讯飞星火-3.5、智普GLM-4做了测试结果分别如下:

二. 测试结果

  1. 千帆Qianfan-Chinese-Llama-2-13B

    千帆Qianfan-Chinese-Llama-2-13B
    语句生成的SQL语句准确性
    1SELECT  SUM(ROUND(fuel_consumption * cost, 2)) AS total_refueling_cost,  SUM(ROUND(mileage * 100)) AS total_driven_distance FROM (SELECT     SUM(fuel_consumption * cost) AS fuel_consumption_cost,   SUM(mileage) AS mileage, organization_id,  vehicle_id,   license_plate,  year  FROM (SELECT  id AS refueling_order_id,  order_number,   license_plate,  organization_id,。。。错误
    2SELECT YEAR(refueling_order.refueling_time) as year,QUARTER(refueling_order.refueling_time) as quarter,SUM(refueling_order.cost) as total_amount FROM  refueling_order GROUP BY  YEAR(refueling_order.refueling_time),  QUARTER(refueling_order.refueling_time)ORDER BY  YEAR(refueling_order.refueling_time),  quarter正确
    3SELECT  DATE_FORMAT(refueling_order.refueling_time, '%Y-%m') AS year,  SUM(refueling_order.fuel_consumption * refueling_order.cost) AS total_amountFROM  refueling_orderGROUP BY  DATE_FORMAT(refueling_order.refueling_time, '%Y-%m')正确
    4SELECT YEAR(refueling_order.refueling_time) as ‘Year’, MONTH(refueling_order.refueling_time) as ‘Month’,SUM(refueling_order.cost) as ‘Total_Refueling_Amount’FROM refueling_order WHERE YEAR(refueling_order.refueling_time) = 2023 GROUP BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time) ORDER BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)错误
    5SELECT 
      YEAR(refueling_order.refueling_time) AS year, 
      MONTH(refueling_order.refueling_time) AS month, 
      SUM(refueling_order.cost) AS total_amount, 
      SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)) AS prev_month_amount) AS prev_month_total_amount,
      (SUM(refueling_order.cost) - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time))) / SUM(refueling_order.cost - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)))) * 100 AS percentage_change
    FROM refueling_order GROUP BY  YEAR(refueling_order.refueling_time), 
     MONTH(refueling_order.refueling_time)
    ORDER BY 
      YEAR(refueling_order.refueling_time), 
    错误
    6SELECT 
      YEAR(refueling_order.refueling_time) AS year, 
      MONTH(refueling_order.refueling_time) AS month, 
      SUM(refueling_order.cost) AS total_amount, 
      SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)) AS prev_month_amount) AS prev_month_total_amount,
      (SUM(refueling_order.cost) - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time))) / SUM(refueling_order.cost - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)))) * 100 AS percentage_change
    FROM 
      refueling_order
    GROUP BY 
      YEAR(refueling_order.refueling_time), 
      MONTH(refueling_order.refueling_time)
    ORDER BY 
      YEAR(refueling_order.refueling_time), 
      MONTH(refueling_order.refueling_time)
    正确
    7SELECT 
      YEAR(refueling_order.refueling_time) AS year, 
      MONTH(refueling_order.refueling_time) AS month, 
      SUM(refueling_order.cost) AS total_amount, 
      SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)) AS prev_month_amount) AS prev_month_total_amount,
      (SUM(refueling_order.cost) - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time))) / SUM(refueling_order.cost - SUM(refueling_order.cost - LAG(refueling_order.cost) OVER (PARTITION BY YEAR(refueling_order.refueling_time), MONTH(refueling_order.refueling_time)))) * 100 AS percentage_change
    FROM 
      refueling_order
    GROUP BY 
      YEAR(refueling_order.refueling_time), 
      MONTH(refueling_order.refueling_time)
    ORDER BY 
      YEAR(refueling_order.refueling_time), 
      MONTH(refueling_order.refueling_time)
    错误
    8SELECT YEAR(refueling_order.refueling_time) AS year,
    MONTH(refueling_order.refueling_time) AS month,
    SUM(refueling_order.cost) AS total_cost FROM
    refueling_order GROUP BY YEAR(refueling_order.refueling_time),
    MONTH(refueling_order.refueling_time)
    ORDER BY year DESC,month DESC LIMIT 10;
    错误
    9SELECT 
        r.license_plate,
        r.organization_name,
        r.vehicle_id,
        r.refueling_time,
        r.fuel_consumption,
        r.cost,
        r.year,
        AVG(r.mileage) AS avg_fuel_consumption
    FROM 
        refueling_order r
    JOIN 
        driven_distance d ON r.vehicle_id = d.vehicle_id
    WHERE 
        YEAR(r.refueling_time) = 2023
    GROUP BY 
        r.vehicle_id
    ORDER BY 
        avg_fuel_consumption ASC
    LIMIT 10;
    错误
    结果评价】6错3对,回答非常啰嗦,准确性不高
  2. SQLCoder-34B-Beta

    SQLCoder-34B-Beta
    语句生成的SQL语句准确性
    1SELECT SUM(r.cost) AS total_cost FROM refueling_order r WHERE EXTRACT(YEAR,FROM r.refueling_time) = 2023;正确
    2SELECT EXTRACT(QUARTER FROM r.refueling_time) AS QUARTER,SUM(r.cost) AS total_cost
    FROM refueling_order r WHERE EXTRACT(YEAR FROM r.refueling_time) = 2023 GROUP BY QUARTER ORDER BY QUARTER NULLS LAST;
    错误
    3SELECT date_trunc('month', r.refueling_time) AS MONTH,
           SUM(r.cost) AS total_cost
    FROM refueling_order r
    WHERE date_part('year', r.refueling_time) = 2023
    GROUP BY MONTH
    ORDER BY MONTH;
    错误
    4SELECT date_trunc('month', r.refueling_time) AS MONTH,
        SUM(r.cost) AS total_cost FROM refueling_order r
    WHERE date_part('year', r.refueling_time) = 2023
    GROUP BY MONTH ORDER BY MONTH;
    错误
    5SELECT date_trunc('month', r.refueling_time) AS MONTH,
           SUM(r.cost) AS total_cost,
           lag(SUM(r.cost)) OVER (
                                  ORDER BY date_trunc('month', r.refueling_time)) AS previous_month_total_cost
    FROM refueling_order r
    WHERE date_part('year', r.refueling_time) = 2023
    GROUP BY MONTH
    ORDER BY MONTH;
    错误
    6SELECT r.license_plate,SUM(r.cost) AS total_cost
    FROM refueling_order r
    WHERE EXTRACT(YEAR FROM r.refueling_time) = 2023
    GROUP BY r.license_plate ORDER BY total_cost DESC NULLS LAST LIMIT 10;
    错误
    7SELECT v.usage_type,SUM(r.cost) AS total_cost FROM refueling_order r JOIN vehicle v ON r.vehicle_id = v.id
    WHERE EXTRACT(YEAR FROM r.refueling_time) = 2023 GROUP BY v.usage_type ORDER BY total_cost DESC LIMIT 10;
    错误
    8SELECT date_trunc('month', o.refueling_time) AS MONTH,
           COUNT(*) AS number_of_orders,SUM(o.cost) AS total_cost
    FROM refueling_order o
    WHERE date_part('year', o.refueling_time) = 2023
    GROUP BY MONTH
    ORDER BY total_cost DESC
    LIMIT 1;
    错误
    9SELECT AVG(r.fuel_consumption) AS average_fuel_consumption
    FROM refueling_order r
    WHERE r.year = 2023
    ORDER BY r.fuel_consumption ASC NULLS LAST
    LIMIT 10;
    错误
    【结果评价】8错1对,返回SQL存在方言问题,且无法通过prompt解决
  3. 通义千问-Ma

    通义千问-Max
    语句生成的SQL语句准确性
    1SELECT v.license_plate, SUM(ro.fuel_consumption) AS total_fuel_consumption, SUM(dd.mileage) AS total_mileage
    FROM vehicle v JOIN refueling_order ro ON v.id = ro.vehicle_id
    JOIN driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份] WHERE v.organization_id = [某个组织ID]
    GROUP BY v.license_plate;
    错误
    2SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    3SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    4SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    MAX(dd.mileage) AS year_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = 2021WHERE    ro.refueling_time BETWEEN '2021-01-01' AND '2021-12-31'GROUP BY    v.license_plate;错误
    5SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    6SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    7SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    8SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    9SELECT    v.license_plate,    SUM(ro.fuel_consumption) AS total_fuel_consumption,    SUM(dd.mileage) AS total_mileageFROM    vehicle vJOIN    refueling_order ro ON v.id = ro.vehicle_idJOIN    driven_distance dd ON v.id = dd.vehicle_id AND dd.year = [指定年份]WHERE    v.organization_id = [某个组织ID]GROUP BY    v.license_plate;错误
    结果评价】9题全错,无论如何更换question,都无法准确识别prompt问题
  4. 讯飞星火-3.5

    讯飞星火-3.5
    语句生成的SQL语句准确性
    1SELECT SUM(cost) FROM refueling_order WHERE YEAR(refueling_time) = 2023;正确
    2SELECT QUARTER(refueling_time) AS quarter, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY quarter;正确
    3SELECT MONTH(refueling_time) as month, SUM(cost) as total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY month正确
    4SELECT MONTH(refueling_time) AS month, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY month错误
    5SELECT t1.month, ((t1.total_cost - t2.total_cost) / t2.total_cost) * 100 AS month_on_month_percentage FROM (SELECT MONTH(refueling_time) AS month, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY month) t1 JOIN (SELECT MONTH(refueling_time) AS month, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 AND MONTH(refueling_time) < 12 GROUP BY month) t2 ON t1.month = t2.month + 1;正确
    6SELECT vehicle_id, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY vehicle_id ORDER BY total_cost DESC LIMIT 10正确
    7SELECT v.license_plate, v.usage_type FROM vehicle v JOIN refueling_order r ON v.id = r.vehicle_id WHERE YEAR(r.refueling_time) = 2023 GROUP BY v.license_plate ORDER BY SUM(r.cost) DESC LIMIT 10正确
    8SELECT MONTH(refueling_time) AS month, license_plate, SUM(cost) AS total_cost FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY month, license_plate ORDER BY total_cost DESC LIMIT 10;错误
    9SELECT vehicle_id, AVG(fuel_consumption) as avg_fuel_consumption FROM refueling_order WHERE YEAR(refueling_time) = 2023 GROUP BY vehicle_id ORDER BY avg_fuel_consumption ASC LIMIT 10;错误
    【结果评价】6对7错,准确率较高,复合问题无法处理
  5. 智普GLM-4

智普GLM-4
语句生成的SQL语句准确性
1SELECT SUM(cost) AS total_cost
FROM refueling_order
WHERE YEAR(refueling_time) = 2023
正确
2SELECT  QUARTER(refueling_time) AS quarter,  SUM(cost) AS total_costFROM  refueling_orderWHERE  YEAR(refueling_time) = 2023GROUP BY  quarterORDER BY  quarter正确
3SELECT    DATE_FORMAT(refueling_time, '%Y-%m') AS month, -- 获取年-月    SUM(cost) AS total_cost -- 计算每个月的加油金额总和FROM    refueling_orderWHERE    YEAR(refueling_time) = 2023 -- 确保只选取2023年的数据GROUP BY    monthORDER BY    month正确
4SELECT  DATE_FORMAT(refueling_time, '%Y-%m') AS month,  SUM(cost) AS total_costFROM  refueling_orderWHERE  DATE_FORMAT(refueling_time, '%Y') = '2023' -- 仅选择2023年的数据GROUP BY  monthORDER BY  month正确
5SELECT  DATE_FORMAT(refueling_time, '%Y-%m') AS month, -- 格式化日期到年-月  SUM(cost) AS monthly_cost, -- 每个月的加油总金额  LAG(SUM(cost)) OVER (ORDER BY DATE_FORMAT(refueling_time, '%Y-%m')) AS previous_month_cost, -- 上个月的加油总金额  (    (SUM(cost) - LAG(SUM(cost)) OVER (ORDER BY DATE_FORMAT(refueling_time, '%Y-%m'))) / NULLIF(LAG(SUM(cost)) OVER (ORDER BY DATE_FORMAT(refueling_time, '%Y-%m')), 0)  ) * 100 AS percentage_change -- 计算环比变化百分比FROM  refueling_orderWHERE  YEAR(refueling_time) = 2023 -- 筛选2023年的数据GROUP BY  monthORDER BY  month正确
6SELECT    ro.license_plate,    SUM(ro.cost) AS annual_fueling_costFROM    refueling_order roJOIN    vehicle v ON ro.vehicle_id = v.idWHERE    YEAR(ro.refueling_time) = 2023GROUP BY    ro.license_plate ORDER BY    annual_fueling_cost DESCLIMIT 10正确
7SELECT    vo.license_plate,    v.usage_type,    SUM(vo.cost) AS total_costFROM    refueling_order voINNER JOIN    vehicle v ON vo.license_plate = v.license_plate WHERE  YEAR(vo.refueling_time) = 2023 GROUP BY    vo.license_plateORDER BY    total_cost DESCLIMIT 10正确
8SELECT  MAX(total_cost_by_month.month) AS max_cost_month,  vehicles_with_max_cost.license_plate,  vehicles_with_max_cost.total_costFROM (  SELECT    EXTRACT(MONTH FROM refueling_time) AS month,    SUM(cost) AS total_cost  FROM    refueling_order  WHERE    EXTRACT(YEAR FROM refueling_time) = 2023  GROUP BY    month) AS total_cost_by_monthJOIN (  SELECT    license_plate,    SUM(cost) AS total_cost,    EXTRACT(MONTH FROM refueling_time) AS month  FROM    refueling_order  WHERE    EXTRACT(YEAR FROM refueling_time) = 2023  GROUP BY    license_plate,    month) AS vehicles_with_max_cost ON total_cost_by_month.month = vehicles_with_max_cost.monthWHERE  total_cost_by_month.total_cost = (    SELECT      MAX(total_cost)    FROM (      SELECT        EXTRACT(MONTH FROM refueling_time) AS month,        SUM(cost) AS total_cost      FROM        refueling_order      WHERE        EXTRACT(YEAR FROM refueling_time) = 2023      GROUP BY        month    ) AS max_cost_per_month  )ORDER BY  vehicles_with_max_cost.total_cost DESCLIMIT 10错误
9SELECT     v.id AS vehicle_id,        v.license_plate,        AVG(ro.fuel_consumption) AS average_fuel_consumption    FROM        vehicle v    JOIN        refueling_order ro ON v.license_plate = ro.license_plate    WHERE        YEAR(ro.refueling_time) = 2023 -- 确保只考虑2023年的数据    GROUP BY        v.id, v.license_plate-- 然后选择油耗较低的前十辆车SELECT    vehicle_id,    license_plate,    average_fuel_consumptionFROM    VehicleAverageFuelConsumptionORDER BY    average_fuel_consumption ASC -- 升序排列,以获取油耗最低的车辆LIMIT 10正确

【结果评价】8对1错,准确率很高,有概率识别复合问题

三. 综合评价

根据您以上测试结果,对市面上几款商业大模型在SQL生成方面的性能进行了比较,可以看出各个模型的表现存在明显差异。

  • 千帆Qianfan-Chinese-Llama-2-13B的表现显示其在准确性方面存在挑战,只有3个问题回答正确,而且回答方式啰嗦,这可能表明其在理解复杂查询和准确生成SQL方面还有改进的空间。
  • SQLCoder-34B-Beta的表现则表明了一个极端的情况,其中8个问题回答错误,仅1个问题回答正确。这可能指出模型在理解特定数据库方言(如Oracle)方面遇到了困难,或者它在根据提示信息生成准确SQL方面存在局限性。
  • 通义千问-Max的测试结果显示了最大的挑战,所有问题都回答错误,这可能意味着模型在理解问题和生成对应SQL方面有根本性的缺陷,或者需要更多的定制化训练以提高其性能。
  • 讯飞星火-3.5展示了较好的性能,尽管在处理复杂问题时遇到了困难,但其准确率相对较高,这表明它在理解简单至中等复杂度问题方面具有一定的优势。
  • 智普GLM-4的性能最为突出,准确率非常高,仅1个问题回答错误。这表明该模型在理解问题和生成准确SQL方面具有较强能力,甚至能够一定程度上处理复合问题。

综合评价来看,各个模型在SQL生成任务上的表现各不相同,体现了在自然语言理解、问题转换、以及特定数据库知识方面的差异。智普GLM-4讯飞星火-3.5在测试中表现较好,显示出较高的准确性和一定程度上处理复杂问题的能力。而其他模型如SQLCoder-34B-Beta通义千问-Max则在这些测试中遇到了较大挑战。这些差异可能源于模型的设计、训练数据的丰富性、以及对特定数据库语言的理解等方面的不同。未来的发展可能会看到这些模型在性能上的进一步优化和改进。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值