How to convert an HTML file to text on Linux?
You can use html2text (on Fedora, it can be installed with yum install html2text):
$ html2text ${html_file}
${html_file} is the HTML file to be converted. The converted text is printed to STDOUT. You can redirect it to a file if needed.
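As a minimal sketch of the redirection mentioned above, the script below writes a small sample page and saves the converted text to a file. The file names sample.html and sample.txt are hypothetical, chosen only for illustration, and the script assumes html2text is on the PATH (it prints a notice and does nothing otherwise).

```shell
#!/bin/sh
# Hypothetical file names, used only for this illustration.
html_file="sample.html"
text_file="sample.txt"

# Create a tiny HTML page so the example is self-contained.
printf '%s\n' '<html><body><h1>Title</h1><p>Hello, world.</p></body></html>' > "$html_file"

if command -v html2text >/dev/null 2>&1; then
    # html2text prints to STDOUT; the shell redirect saves it to a file.
    html2text "$html_file" > "$text_file"
else
    echo "html2text is not installed; skipping conversion" >&2
fi
```

Because the conversion goes through STDOUT, the same command can also be piped into other tools, for example html2text "$html_file" | less.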
Adding -style pretty makes html2text print additional spaces and blank lines so that the text looks nicer.
The -width 100 option may help with pages that are rendered strangely.
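The two options can presumably be combined in one invocation. The sketch below assumes the html2text build described in this article, that is, one that accepts -style and -width; other programs that share the html2text name may use different flags, in which case the command reports an error instead. The file names page.html and page.txt are hypothetical.

```shell
#!/bin/sh
# Hypothetical file names, used only for this illustration.
html_file="page.html"
text_file="page.txt"

# Sample page with a paragraph long enough for line wrapping to matter.
printf '%s\n' '<html><body><h1>Report</h1><p>This is a fairly long paragraph of sample text that is meant to run past a narrow output width so that the wrapping column becomes visible in the converted result.</p></body></html>' > "$html_file"

if command -v html2text >/dev/null 2>&1; then
    # -style pretty: looser spacing; -width 100: wrap text at 100 columns.
    # Assumption: the installed html2text supports both options.
    html2text -style pretty -width 100 "$html_file" > "$text_file" \
        || echo "this html2text build rejected -style/-width" >&2
else
    echo "html2text is not installed; skipping conversion" >&2
fi
```

Running the command once with and once without the options and comparing the two outputs with diff is a quick way to see what each option changes on a given page.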
Translated from: https://www.systutorials.com/how-to-convert-html-file-to-text-on-linux/