QA Checks
QA checks operate on a List<TranslationUnit>.
ITranslationUnit
public interface ITranslationUnit
{
    // Interface members are implicitly public; an explicit modifier is an error before C# 8
    string ID { get; set; }
    string Source { get; set; }
    string Target { get; set; }
}
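Categories:
By scope: monolingual checks, bilingual checks
By type: untranslated segments, length, consistency, punctuation, spaces, repeated words, capitalization, brackets, numbers, dates, times, units of measurement, trademarks, etc.
Exclusions:
Exclusions reduce false positives: segments that meet certain conditions are skipped by the QA checks, for example translation="no", status="locked", length<3, and so on.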
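A minimal sketch of such an exclusion filter, assuming the unit carries two extra fields. `ExcludableUnit`, `Translate`, `Status`, and `ExclusionFilter` are hypothetical names introduced only for illustration; they are not part of the ITranslationUnit interface, and the three conditions simply mirror the examples listed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical unit type: Translate and Status are illustrative fields only.
public class ExcludableUnit
{
    public string ID { get; set; }
    public string Source { get; set; }
    public string Target { get; set; }
    public string Translate { get; set; } // e.g. "yes" or "no"
    public string Status { get; set; }    // e.g. "locked"
}

public static class ExclusionFilter
{
    // Returns true when the unit should be skipped by every QA check.
    public static bool IsExcluded(ExcludableUnit tu, int minLength = 3)
    {
        if (string.Equals(tu.Translate, "no", StringComparison.OrdinalIgnoreCase)) return true;
        if (string.Equals(tu.Status, "locked", StringComparison.OrdinalIgnoreCase)) return true;
        // Very short sources tend to produce false positives.
        return (tu.Source ?? string.Empty).Trim().Length < minLength;
    }

    // Keeps only the units that remain subject to QA checks.
    public static List<ExcludableUnit> Apply(IEnumerable<ExcludableUnit> units)
    {
        return units.Where(u => !IsExcluded(u)).ToList();
    }
}
```

Running the filter once, before any individual check, keeps the exclusion rules in one place instead of repeating them inside each check.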
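Untranslated segments:
// Source equals Target (ignoring surrounding whitespace)
if (TranslationUnit.Source.Trim() == TranslationUnit.Target.Trim())
// Source equals Target (ignoring case)
if (string.Equals(TranslationUnit.Source, TranslationUnit.Target, StringComparison.OrdinalIgnoreCase))
// Target is empty although Source is not
if (TranslationUnit.Source.Trim() != string.Empty)
{
    if (TranslationUnit.Target.Trim() == string.Empty)
    {
        // todo
    }
}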
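Length:
int sourceLength = TranslationUnit.Source.Length;
int targetLength = TranslationUnit.Target.Length;
// 0 < n < 100
// Shorter than: target is at most n% of the source length
// (divide by 100.0; with an integer n, n / 100 would always be 0)
if (targetLength <= sourceLength * (n / 100.0))
// m > 0
// Longer than: target is at least m% of the source length
if (targetLength >= sourceLength * (m / 100.0))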
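The percentage comparison is easy to get wrong when the threshold is an `int`, because `n / 100` is then integer division and evaluates to 0 for any n below 100. A minimal sketch that sidesteps this by taking the thresholds as `double`; `LengthCheck`, `minPercent`, and `maxPercent` are illustrative names, not from the original notes.

```csharp
using System;

public static class LengthCheck
{
    // True when the target is at most minPercent % of the source length.
    public static bool IsTooShort(string source, string target, double minPercent)
    {
        return target.Length <= source.Length * (minPercent / 100.0);
    }

    // True when the target is at least maxPercent % of the source length.
    public static bool IsTooLong(string source, string target, double maxPercent)
    {
        return target.Length >= source.Length * (maxPercent / 100.0);
    }
}
```

For example, `LengthCheck.IsTooShort(source, target, 50)` flags a target that is no more than half as long as its source. String length is counted in UTF-16 code units here, so thresholds for language pairs with very different character densities (Chinese to English, say) would need tuning.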
Consistency:
Consistency can be checked within a single file or across multiple files.
List<TranslationUnit> TUsToBeChecked = new List<TranslationUnit>();
TUsToBeChecked.AddRange(file1.TranslationUnits);
TUsToBeChecked.AddRange(file2.TranslationUnits);
...
foreach (TranslationUnit TUToBeChecked in TUsToBeChecked)
{
    // Optimisation (?): only compare units whose Source has the same length
    IEnumerable<TranslationUnit> optTUs = from v in TUsToBeChecked where v.Source.Length == TUToBeChecked.Source.Length select v;
    // The optimisation Trados uses is to extend the ITranslationUnit interface with an sdl:rep attribute:
    // <sdl:rep id="adTq+25OH3O1XA8tE4pq4yjlRyw=" />
    // A hash of TranslationUnit.Source (for example MD5), Base64-encoded, serves as the comparison key
    // Same source, different target
    // Principle:
    IEnumerable<TranslationUnit> tus = from v in optTUs
        where string.Equals(TUToBeChecked.Source.Trim(), v.Source.Trim(), StringComparison.OrdinalIgnoreCase)
           && TUToBeChecked.Target.Trim() != v.Target.Trim()
        select v;
    if (tus.Any())
    {
        // to do
    }
}
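The nested query above compares every unit against every other unit. A sketch of an alternative that groups units by their normalised source in a single pass, so the hash lookup inside `GroupBy` plays much the same role as a precomputed source key. `Unit` and `ConsistencyCheck` are hypothetical names used only for this example, and treating sources as equal regardless of case is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a class implementing ITranslationUnit.
public class Unit
{
    public string ID { get; set; }
    public string Source { get; set; }
    public string Target { get; set; }
}

public static class ConsistencyCheck
{
    // Returns one list per source text that has more than one distinct target.
    public static List<List<Unit>> FindInconsistent(IEnumerable<Unit> units)
    {
        return units
            // Sources are compared trimmed and case-insensitively.
            .GroupBy(u => u.Source.Trim(), StringComparer.OrdinalIgnoreCase)
            // Targets are compared trimmed but case-sensitively.
            .Where(g => g.Select(u => u.Target.Trim())
                         .Distinct(StringComparer.Ordinal)
                         .Count() > 1)
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Each returned list holds all units that share a source, which makes it straightforward to report every conflicting translation together rather than pair by pair.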
Punctuation:
// Assume Source is Chinese and Target is English
string CNPunc = "。:?!";
string ENPunc = ".:?!";
char[] CNPuncArr = CNPunc.ToCharArray();
char[] ENPuncArr = ENPunc.ToCharArray();
for (int i = 0; i < CNPuncArr.Length; i++)
{
    // If Source ends with a Chinese mark, Target must end with its English counterpart
    if (TranslationUnit.Source.Trim().EndsWith(CNPuncArr[i].ToString()))
    {
        if (!TranslationUnit.Target.Trim().EndsWith(ENPuncArr[i].ToString()))
        {
            // todo
        }
    }
}
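The two parallel strings must stay in the same order for the index-based pairing to hold. A sketch of the same end-punctuation check with an explicit mapping, which removes that ordering dependency; `PunctuationCheck` and its mapping table are illustrative, and the full-width Chinese marks are an assumption about the intended source punctuation.

```csharp
using System;
using System.Collections.Generic;

public static class PunctuationCheck
{
    // Chinese sentence-final mark -> expected English counterpart.
    private static readonly Dictionary<char, char> Expected = new Dictionary<char, char>
    {
        { '。', '.' }, { ':', ':' }, { '?', '?' }, { '!', '!' }
    };

    // Returns false only when Source ends with a mapped mark
    // and Target does not end with the corresponding English mark.
    public static bool EndPunctuationMatches(string source, string target)
    {
        string s = source.Trim();
        string t = target.Trim();
        char expected;
        if (s.Length == 0 || !Expected.TryGetValue(s[s.Length - 1], out expected))
        {
            return true; // nothing to verify
        }
        return t.Length > 0 && t[t.Length - 1] == expected;
    }
}
```

Adding a language pair then means adding another table rather than another pair of aligned strings.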
Double spaces:
Regex regexSpace = new Regex(@"\s{2,}");
if (regexSpace.IsMatch(TranslationUnit.Target.Trim()))
Leading and trailing spaces:
if (TranslationUnit.Target != TranslationUnit.Target.Trim())
Repeated words:
Regex regexRepeatedWords = new Regex(@"\b(\w+)\b\s\1\b");
if (regexRepeatedWords.IsMatch(TranslationUnit.Target))
{
MatchCollection matchCollection = regexRepeatedWords.Matches(TranslationUnit.Target);
List<string> rwords = new List<string>();
foreach (Match m in matchCollection)
{
rwords.Add(m.Value);
}
}
Numbers:
Regex regexNum = new Regex(@"\d+");
if (regexNum.IsMatch(TranslationUnit.Source.Trim()))
{
    // Collect every number that appears in Source
    MatchCollection matchCollectionSourceNumbers = regexNum.Matches(TranslationUnit.Source);
    List<string> sourceNumbers = new List<string>();
    foreach (Match m in matchCollectionSourceNumbers)
    {
        sourceNumbers.Add(m.Value);
    }
    // Record the source numbers that do not appear in Target
    List<string> targetMissedNumbers = new List<string>();
    foreach (string sourceNumber in sourceNumbers)
    {
        if (!TranslationUnit.Target.Contains(sourceNumber))
        {
            targetMissedNumbers.Add(sourceNumber);
        }
    }
    if (targetMissedNumbers.Count > 0)
    {
        // todo
    }
}
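A substring test has two blind spots: "12" is found inside "120", and a number that occurs twice in the source but once in the target goes unnoticed. A sketch of a stricter variant that extracts whole numbers from both sides and compares them as multisets; `NumberCheck` and `MissingNumbers` are illustrative names, and the simple `\d+` pattern deliberately ignores decimal and thousands separators.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NumberCheck
{
    private static readonly Regex Number = new Regex(@"\d+");

    // Returns the source numbers that have no unused occurrence in the target.
    public static List<string> MissingNumbers(string source, string target)
    {
        // Count how often each number occurs in the target.
        var remaining = new Dictionary<string, int>();
        foreach (Match m in Number.Matches(target))
        {
            int count;
            remaining.TryGetValue(m.Value, out count);
            remaining[m.Value] = count + 1;
        }

        // Consume one target occurrence per source occurrence.
        var missing = new List<string>();
        foreach (Match m in Number.Matches(source))
        {
            int count;
            if (remaining.TryGetValue(m.Value, out count) && count > 0)
            {
                remaining[m.Value] = count - 1;
            }
            else
            {
                missing.Add(m.Value);
            }
        }
        return missing;
    }
}
```

Numbers that are legitimately reformatted in translation (for example 1,000 versus 1000) would still be reported, so a real check would likely normalise separators first.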
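Spell checking:
Spell checking covers word spelling, grammatical errors, singular and plural usage, and the use of past and progressive tenses, among other things.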
https://www.grammarly.com/
Grammar Check Online - It’s a Free tool by NOUNPLUS
http://www.gingersoftware.com/zh/grammarcheck#
https://www.autocrit.com/
http://virtualwritingtutor.com/
http://www.hemingwayapp.com/
http://www.1checker.com/
https://prowritingaid.com/
http://www.whitesmoke.com
For advanced regular-expression checks, see "QA Checks - Advanced Regular Expression Checks".
Reference: standalone QA tools
Xbench https://www.xbench.net
ErrorSpy https://www.dog-gmbh.de/en/products/errorspy-quality-assurance
Verifika https://e-verifika.com