Sorry for resurrecting an ancient thread, but I recently wanted to do this myself and wanted a 100% portable bash script to do it. So here's my solution using only curl, grep, tr, and sed.
The one-liner below was bashed out very quickly, so it could be made much more elegant, but I'm really just getting started with sed/awk etc...
curl "http://www.webpagewithtableinit.com/" 2>/dev/null | grep -i -e '</\?TABLE\|</\?TD\|</\?TR\|</\?TH' | sed 's/^[\ \t]*//g' | tr -d '\n\r' | sed 's/<\/TR[^>]*>/\n/Ig' | sed 's/<\/\?\(TABLE\|TR\)[^>]*>//Ig' | sed 's/^<T[DH][^>]*>\|<\/\?T[DH][^>]*>$//Ig' | sed 's/<\/T[DH][^>]*><T[DH][^>]*>/,/Ig'
As you can see I've got the page source using curl, but you could just as easily feed in the table source from elsewhere.
Here's the explanation:
Get the Contents of the URL using cURL, dump stderr to null (no progress meter)
curl "http://www.webpagewithtableinit.com/" 2>/dev/null
.
I only want table elements (return only lines containing TABLE, TR, TH, or TD tags, opening or closing).
| grep -i -e '</\?TABLE\|</\?TD\|</\?TR\|</\?TH'
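To see what this step keeps, here's a quick check on a made-up page fragment (the HTML and the variable names are just for illustration; also note that `\?` and `\|` in a basic regex are GNU grep extensions, so this isn't strictly POSIX):

```shell
# Hypothetical page fragment standing in for the curl output.
html='<html>
<body>
<p>not table data</p>
<table>
<tr><td>a</td><td>b</td></tr>
</table>
</body>
</html>'

# Keep only the lines containing TABLE/TR/TD/TH tags (open or close).
filtered=$(printf '%s\n' "$html" | grep -i -e '</\?TABLE\|</\?TD\|</\?TR\|</\?TH')
printf '%s\n' "$filtered"
```

The `<p>`, `<html>`, and `<body>` lines are dropped; only the three table lines survive.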
.
Remove any Whitespace at the beginning of the line.
| sed 's/^[\ \t]*//g'
.
Remove newlines
| tr -d '\n\r'
.
Replace closing </TR> tags with newlines (one table row per line).
| sed 's/<\/TR[^>]*>/\n/Ig'
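For example, on a single flattened line like the one the tr step produces (a sketch with made-up data; the `I` case-insensitivity flag and `\n` in the replacement are GNU sed extensions):

```shell
# One flattened line of hypothetical table source, as left by the tr step.
flat='<table><tr><td>1</td><td>2</td></tr><tr><td>3</td><td>4</td></tr></table>'

# Every closing </TR> becomes a newline, giving one table row per line.
rows=$(printf '%s' "$flat" | sed 's/<\/TR[^>]*>/\n/Ig')
printf '%s\n' "$rows"
```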
.
Remove TABLE and TR tags
| sed 's/<\/\?\(TABLE\|TR\)[^>]*>//Ig'
.
Remove <TD> and <TH> tags at the beginning of the line, and </TD> and </TH> tags at the end of the line.
| sed 's/^<T[DH][^>]*>\|<\/\?T[DH][^>]*>$//Ig'
.
Replace the remaining </TD><TD> (and </TH><TH>) cell boundaries with a comma.
| sed 's/<\/T[DH][^>]*><T[DH][^>]*>/,/Ig'
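These last two steps together turn a one-row-per-line stream into CSV. A quick sketch on a single made-up row:

```shell
# A single row line as it looks after the TABLE/TR tags are gone.
row='<TD>apple</TD><TD>banana</TD><TD>cherry</TD>'

# Strip the leading <TD> and trailing </TD>, then turn the remaining
# </TD><TD> cell boundaries into commas.
csv=$(printf '%s' "$row" \
  | sed 's/^<T[DH][^>]*>\|<\/\?T[DH][^>]*>$//Ig' \
  | sed 's/<\/T[DH][^>]*><T[DH][^>]*>/,/Ig')
printf '%s\n' "$csv"
```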
.
Note that if any of the table cells contain commas, you may need to escape them first, or use a different delimiter.
Hope this helps someone!
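In case it's useful, here's the whole pipeline run end-to-end on an inline sample instead of a live URL (the sample table is made up; everything after the variable is the pipeline exactly as above):

```shell
# Hypothetical page source standing in for the curl output.
page='<html><body>
  <table>
    <tr>
      <th>Name</th><th>Qty</th>
    </tr>
    <tr>
      <td>apples</td><td>3</td>
    </tr>
  </table>
</body></html>'

# Same pipeline as the one-liner, minus the curl fetch.
out=$(printf '%s\n' "$page" \
  | grep -i -e '</\?TABLE\|</\?TD\|</\?TR\|</\?TH' \
  | sed 's/^[\ \t]*//g' \
  | tr -d '\n\r' \
  | sed 's/<\/TR[^>]*>/\n/Ig' \
  | sed 's/<\/\?\(TABLE\|TR\)[^>]*>//Ig' \
  | sed 's/^<T[DH][^>]*>\|<\/\?T[DH][^>]*>$//Ig' \
  | sed 's/<\/T[DH][^>]*><T[DH][^>]*>/,/Ig')
printf '%s\n' "$out"
```

This prints a header row followed by one CSV line per table row.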