HTML encoding issue: "Â" character showing up instead of "&nbsp;"

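While generating PDF reports with ActivePDF, the non-breaking spaces in an HTML template are mis-encoded and show up as ISO-8859-1 characters. The non-UTF-8 characters cause ActivePDF to fail when parsing. Replacing all non-ASCII characters works as a stopgap, but a better solution is sought.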

I've got a legacy app that has just started misbehaving, and I'm not sure why. It generates a bunch of HTML that gets turned into PDF reports by ActivePDF.

The process works like this:

Pull an HTML template from a DB with tokens in it to be replaced (e.g. "~CompanyName~", "~CustomerName~", etc.)

Replace the tokens with real data

Tidy the HTML with a simple regex function that properly formats HTML tag attribute values (ensures quotation marks, etc., since ActivePDF's rendering engine hates anything but single quotes around attribute values)

Send off the HTML to a web service that creates the PDF.

Somewhere in that mess, the non-breaking spaces from the HTML template (the &nbsp;s) are encoding as ISO-8859-1 so that they show up incorrectly as an "Â" character when viewing the document in a browser (Firefox). ActivePDF pukes on these non-UTF8 characters.
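For reference, a minimal sketch of one common way an "Â" ends up in front of a non-breaking space: the character U+00A0 is written out as UTF-8 (two bytes, C2 A0) and those bytes are then read back as ISO-8859-1, one character per byte. Whether this is what happens in the pipeline described above is an assumption; the module and function names are hypothetical.

```vb
Imports System
Imports System.Text

Module MojibakeDemo
    ' Encode text as UTF-8, then decode the bytes as if they were ISO-8859-1.
    Function MisreadUtf8AsLatin1(ByVal text As String) As String
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(text)
        Return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes)
    End Function

    Sub Main()
        ' U+00A0 is the non-breaking space that &nbsp; stands for.
        Dim nbsp As String = ChrW(&HA0)

        ' UTF-8 encodes U+00A0 as the two bytes C2 A0.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(nbsp)))   ' C2-A0

        ' Read as ISO-8859-1, C2 becomes "Â" (U+00C2) and A0 becomes U+00A0 again.
        Dim misread As String = MisreadUtf8AsLatin1(nbsp)
        Console.WriteLine(misread.Length)                                        ' 2
        Console.WriteLine(AscW(misread(0)).ToString("X4"))                       ' 00C2
        Console.WriteLine(AscW(misread(1)).ToString("X4"))                       ' 00A0
    End Sub
End Module
```

If this is the mechanism, the underlying data is valid UTF-8 and the fault lies with whichever step reads or labels it as ISO-8859-1.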

My question: since I don't know where the problem stems from and don't have time to investigate it, is there an easy way to re-encode or find-and-replace the bad characters? I've tried sending it through this little function I threw together, but it doesn't change anything.

    Private Shared Function ConvertToUTF8(ByVal html As String) As String
        Dim isoEncoding As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim source As Byte() = isoEncoding.GetBytes(html)
        Return Encoding.UTF8.GetString(Encoding.Convert(isoEncoding, Encoding.UTF8, source))
    End Function
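A likely reason ConvertToUTF8 has no visible effect: it encodes the string to ISO-8859-1 bytes, converts those bytes to UTF-8, and then decodes them as UTF-8, which for characters in the ISO-8859-1 range is a round trip back to the same string. The sketch below checks that, and contrasts it with a hypothetical alternative, ReinterpretAsUtf8 (not part of the original code), which skips the conversion and decodes the ISO-8859-1 bytes directly as UTF-8. It assumes the string in memory really contains the two-character sequence "Â" plus a non-breaking space; if the string also holds genuine ISO-8859-1 characters such as a lone "é", those bytes are not valid UTF-8 and would come back as replacement characters.

```vb
Imports System
Imports System.Text

Module RoundTripDemo
    ' Same steps as ConvertToUTF8: String -> Latin-1 bytes -> UTF-8 bytes -> String.
    Function RoundTrip(ByVal html As String) As String
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim source As Byte() = latin1.GetBytes(html)
        Return Encoding.UTF8.GetString(Encoding.Convert(latin1, Encoding.UTF8, source))
    End Function

    ' Hypothetical repair: treat the Latin-1 bytes as UTF-8 instead of converting them.
    Function ReinterpretAsUtf8(ByVal html As String) As String
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Return Encoding.UTF8.GetString(latin1.GetBytes(html))
    End Function

    Sub Main()
        ' "Â" followed by a non-breaking space, as produced by misreading UTF-8 as ISO-8859-1.
        Dim garbled As String = "a" & ChrW(&HC2) & ChrW(&HA0) & "b"

        Console.WriteLine(RoundTrip(garbled) = garbled)          ' True: nothing changed

        Dim repaired As String = ReinterpretAsUtf8(garbled)
        Console.WriteLine(repaired.Length)                       ' 3
        Console.WriteLine(AscW(repaired(1)).ToString("X4"))      ' 00A0
    End Sub
End Module
```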

Any ideas?

EDIT:

I'm getting by with this for now, though it hardly seems like a good solution:

    Private Shared Function ReplaceNonASCIIChars(ByVal html As String) As String
        Return Regex.Replace(html, "[^\u0000-\u007F]", " ")
    End Function
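A less lossy variant of the same idea, sketched under assumptions: instead of replacing every non-ASCII character with a space, replace it with a decimal numeric character reference, so the HTML stays pure ASCII but the characters survive. EncodeNonAsciiChars is a hypothetical name, and whether ActivePDF's renderer accepts numeric character references is an assumption that would need checking. The sketch works per UTF-16 code unit, so characters outside the Basic Multilingual Plane would need extra surrogate-pair handling, and it only helps if the string is still correct at this point in the pipeline (an already garbled "Â" would simply be preserved as &#194;).

```vb
Imports System
Imports System.Text.RegularExpressions

Module EntityEncodeDemo
    ' Replace each non-ASCII UTF-16 code unit with a decimal numeric character reference.
    Function EncodeNonAsciiChars(ByVal html As String) As String
        Return Regex.Replace(html, "[^\u0000-\u007F]",
            Function(m As Match) "&#" & AscW(m.Value(0)).ToString() & ";")
    End Function

    Sub Main()
        ' A non-breaking space (U+00A0) becomes &#160;, which is equivalent to &nbsp;.
        Console.WriteLine(EncodeNonAsciiChars("a" & ChrW(&HA0) & "b"))   ' a&#160;b
    End Sub
End Module
```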
