-
异常场景。
-
训练阶段。
-
场景一:当超参数的设置超出取值范围,模型训练失败,返回ERROR,并提示错误,例如:
openGauss=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients WITH optimizer='aa'; ERROR: Invalid hyperparameter value for optimizer. Valid values are: gd, ngd.
-
场景二:当模型名称已存在,模型保存失败,返回ERROR,并提示错误原因,例如:
openGauss=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients; ERROR: The model name "patient_linear_regression" already exists in gs_model_warehouse.
-
场景三:FEATURE或者TARGETS列是*,返回ERROR,并提示错误原因,例如:
openGauss=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES * TARGET trait_anxiety FROM patients; ERROR: FEATURES clause cannot be * ----------------------------------------------------------------------------------------------------------------------- openGauss=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET * FROM patients; ERROR: TARGET clause cannot be *
-
场景四:对于无监督学习方法使用TARGET关键字,或者在监督学习方法中不适用TARGET关键字,均会返回ERROR,并提示错误原因,例如:
openGauss=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment FROM patients; ERROR: Supervised ML algorithms require TARGET clause ----------------------------------------------------------------------------------------------------------------------------- CREATE MODEL patient_linear_regression USING linear_regression TARGET trait_anxiety FROM patients; ERROR: Supervised ML algorithms require FEATURES clause
-
场景五:当进行分类任务时TARGET列的分类只有1种情况,会返回ERROR,并提示错误原因,例如:
openGauss=# CREATE MODEL ecoli_svmc USING multiclass FEATURES f1, f2, f3, f4, f5, f6, f7 TARGET cat FROM (SELECT * FROM db4ai_ecoli WHERE cat='cp'); ERROR: At least two categories are needed
-
场景六:DB4AI在训练过程中会过滤掉含有空值的数据,当参与训练的模型数据为空的时候,会返回ERROR,并提示错误原因,例如:
openGauss=# create model iris_classification_model using xgboost_regression_logistic features message_regular target error_level from error_code; ERROR: Training data is empty, please check the input data.
-
场景七:DB4AI的算法对于支持的数据类型是有限制的。当数据类型不在支持白名单中,会返回ERROR,并提示非法的oid,可通过pg_type查看OID确定非法的数据类型,例如:
openGauss=# CREATE MODEL ecoli_svmc USING multiclass FEATURES f1, f2, f3, f4, f5, f6, f7, cat TARGET cat FROM db4ai_ecoli ; ERROR: Oid type 1043 not yet supported
-
场景八:当GUC参数statement_timeout设置了时长,训练超时执行的语句将被终止:执行CREATE MODEL语句。训练集的大小、训练轮数(iteration)、提前终止条件(tolerance、max_seconds)、并行线程数(nthread)等参数都会影响训练时长。当时长超过数据库限制,语句被终止模型训练失败。
-
-
模型解析。
-
场景九:当模型名在系统表中查找不到,数据库会报ERROR,例如:
openGauss=# select gs_explain_model("ecoli_svmc"); ERROR: column "ecoli_svmc" does not exist
-
-
推断阶段。
-
场景十:当模型名在系统表中查找不到,数据库会报ERROR,例如:
openGauss=# select id, PREDICT BY patient_logistic_regression (FEATURES second_attack,treatment) FROM patients; ERROR: There is no model called "patient_logistic_regression".
-
场景十一:当做推断任务FEATURES的数据维度和数据类型与训练集存在不一致,将报ERROR,并提示错误原因,例如:
openGauss=# select id, PREDICT BY patient_linear_regression (FEATURES second_attack) FROM patients; ERROR: Invalid number of features for prediction, provided 1, expected 2 CONTEXT: referenced column: patient_linear_regression_pred ------------------------------------------------------------------------------------------------------------------------------------- openGauss=# select id, PREDICT BY patient_linear_regression (FEATURES 1,second_attack,treatment) FROM patients; ERROR: Invalid number of features for prediction, provided 3, expected 2 CONTEXT: referenced column: patient_linear_regression_pre
-
-
说明: DB4AI特性需要读取数据参与计算,不适用于密态数据库等情况。