这几天一直在使用PHP开发一个不同文件类型转换的项目,清源这里将各种文件格式转换的方法分享给大家,有需要的朋友可以参考一下,谢谢!
1、将PDF转换成JPG - PDF2JPG
这是一个转换图片的很简单的方法,使用前你必须保证已经安装了Image Magic。
2、将HTML转换为PDF - html2ps
这是一个很实用的方法,在很多项目中都能用上。
3、将HTML换砖为TXT
如果你正在开发 搜索引擎程序,那么这段代码也许你能用的上。
1、将PDF转换成JPG - PDF2JPG
这是一个转换图片的很简单的方法,使用前你必须保证已经安装了Image Magic。
<?php
$pdf_file = './pdf/_folder/example.pdf';
$save_to = './image_folder/example.jpg'; //make sure that apache has permissions to write in this folder! (common problem)
//execute ImageMagick command 'convert' and convert PDF to JPG with applied settings
exec('convert "'.$pdf_file.'" -colorspace RGB -resize 800 "'.$save_to.'"', $output, $return_var);
if($return_var == 0) { //if exec successfuly converted pdf to jpg
print "Conversion OK";
} (PS:T不错的PHP Q扣峮:276167802,验证:csl)
else print "Conversion failed.<br />".$output;
?>
2、将HTML转换为PDF - html2ps
这是一个很实用的方法,在很多项目中都能用上。
<?php
function convert_to_pdf($url, $path_to_pdf) {
require_once(dirname(__FILE__).'/html2ps/config.inc.php');
require_once(HTML2PS_DIR.'pipeline.factory.class.php');
echo WRITER_TEMPDIR;
//error_reporting(E_ALL);
//ini_set("display_errors","1");
@set_time_limit(10000);
parse_config_file(HTML2PS_DIR.'html2ps.config');
/**
* Handles the saving of generated PDF to user-defined output file on server
*/
class MyDestinationFile extends Destination {
/**
* @var String result file name / path
* @access private
*/
var $_dest_filename;
function MyDestinationFile($dest_filename) {
$this->_dest_filename = $dest_filename;
}
function process($tmp_filename, $content_type) {
copy($tmp_filename, $this->_dest_filename);
}
}
$media = Media::predefined("A4");
$media->set_landscape(false);
$media->set_margins(array('left' => 5,
'right' => 5,
'top' => 10,
'bottom' => 10));
$media->set_pixels(800);
$pipeline = PipelineFactory::create_default_pipeline("", // Auto-detect encoding
"");
// Override HTML source
$pipeline->fetchers[] = new FetcherURL;
$pipeline->data_filters[] = new DataFilterHTML2XHTML;
$pipeline->parser = new ParserXHTML;
$pipeline->layout_engine = new LayoutEngineDefault;
$pipeline->output_driver = new OutputDriverFPDF($media);
//$filter = new PreTreeFilterHeaderFooter("HEADER", "FOOTER");
//$pipeline->pre_tree_filters[] = $filter;
// Override destination to local file
$pipeline->destination = new MyDestinationFile($path_to_pdf);
global $g_config;
$g_config = array(
'cssmedia' => 'screen',
'scalepoints' => '1',
'renderimages' => true,
'renderlinks' => true,
'renderfields' => true,
'renderforms' => false,
'mode' => 'html',
'encoding' => '',
'debugbox' => false,
'pdfversion' => '1.4',
'draw_page_border' => false
);
$pipeline->configure($g_config);
//$pipeline->add_feature('toc', array('location' => 'before'));
$pipeline->process($url, $media);
}
?>
3、将HTML换砖为TXT
如果你正在开发 搜索引擎程序,那么这段代码也许你能用的上。
<?php
// strip javascript, styles, html tags, normalize entities and spaces
// based on http://www.php.net/manual/en/function.strip-tags.php#68757
function html2text($html){
$text = $html;
static $search = array(
'@<script.+?</script>@usi', // Strip out javascript content
'@<style.+?</style>@usi', // Strip style content
'@<!--.+?-->@us', // Strip multi-line comments including CDATA
'@</?[a-z].*?\>@usi', // Strip out HTML tags
);
$text = preg_replace($search, ' ', $text);
// normalize common entities
$text = normalizeEntities($text);
// decode other entities
$text = html_entity_decode($text, ENT_QUOTES, 'utf-8');
// normalize possibly repeated newlines, tabs, spaces to spaces
$text = preg_replace('/\s+/u', ' ', $text);
$text = trim($text);
// we must still run htmlentities on anything that comes out!
// for instance:
// <<a>script>alert('XSS')//<<a>/script>
// will become
// <script>alert('XSS')//</script>
return $text;
}
// replace encoded and double encoded entities to equivalent unicode character
// also see /app/bookmarkletPopup.js
function normalizeEntities($text) {
static $find = array();
static $repl = array();
if (!count($find)) {
// build $find and $replace from map one time
$map = array(
array('\'', 'apos', 39, 'x27'), // Apostrophe
array('\'', ''', 'lsquo', 8216, 'x2018'), // Open single quote
array('\'', ''', 'rsquo', 8217, 'x2019'), // Close single quote
array('"', '"', 'ldquo', 8220, 'x201C'), // Open double quotes
array('"', '"', 'rdquo', 8221, 'x201D'), // Close double quotes
array('\'', ',', 'sbquo', 8218, 'x201A'), // Single low-9 quote
array('"', ',,', 'bdquo', 8222, 'x201E'), // Double low-9 quote
array('\'', '′', 'prime', 8242, 'x2032'), // Prime/minutes/feet
array('"', '′′', 'Prime', 8243, 'x2033'), // Double prime/seconds/inches
array(' ', 'nbsp', 160, 'xA0'), // Non-breaking space
array('-', '-', 8208, 'x2010'), // Hyphen
array('-', '-', 'ndash', 8211, 150, 'x2013'), // En dash
array('--', '--', 'mdash', 8212, 151, 'x2014'), // Em dash
array(' ', ' ', 'ensp', 8194, 'x2002'), // En space
array(' ', ' ', 'emsp', 8195, 'x2003'), // Em space
array(' ', ' ', 'thinsp', 8201, 'x2009'), // Thin space
array('*', 'o', 'bull', 8226, 'x2022'), // Bullet
array('*', '?', 8227, 'x2023'), // Triangular bullet
array('...', '...', 'hellip', 8230, 'x2026'), // Horizontal ellipsis
array('°', 'deg', 176, 'xB0'), // Degree
array('EUR', 'euro', 8364, 'x20AC'), // Euro
array('¥', 'yen', 165, 'xA5'), // Yen
array('£', 'pound', 163, 'xA3'), // British Pound
array('?', 'copy', 169, 'xA9'), // Copyright Sign
array('?', 'reg', 174, 'xAE'), // Registered Sign
array('(TM)', 'trade', 8482, 'x2122') // TM Sign
);
foreach ($map as $e) {
for ($i = 1; $i < count($e); ++$i) {
$code = $e[$i];
if (is_int($code)) {
// numeric entity
$regex = "/&(amp;)?#0*$code;/";
}
elseif (preg_match('/^.$/u', $code)/* one unicode char*/) {
// single character
$regex = "/$code/u";
}
elseif (preg_match('/^x([0-9A-F]{2}){1,2}$/i', $code)) {
// hex entity
$regex = "/&(amp;)?#x0*" . substr($code, 1) . ";/i";
}
else {
// named entity
$regex = "/&(amp;)?$code;/";
}
$find[] = $regex;
$repl[] = $e[0];
}
}
} // end first time build
return preg_replace($find, $repl, $text);
}
?>
以上是本文关于PHP 转换PDF、TXT、HTML以及图像等格式的方法,希望本文对广大php开发者有所帮助,感谢阅读本文。