References: 《详细程序注解学OpenCL一 环境配置和入门程序》 and 《VS2010 NVIDIA OpenCL 开发环境配置》
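I. Setting Up the Development Environment
1. Download and install the CUDA SDK
Download page: https://developer.nvidia.com/cuda-downloads
With the default installation path, the toolkit ends up in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5. This directory contains the include and lib folders, which are the directories we need to configure in Visual C++ 2008.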
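2. Configure Visual C++ 2008
a. Open Visual C++ 2008 and create a new empty project.
b. Right-click the "Source Files" folder in the left pane, choose "Add" --> "New Item", and create an empty "main.cpp" file. (This step is needed so that the "C/C++" node appears under "Configuration Properties" in the project's property pages, where the paths are configured.)
c. Right-click the project and choose "Properties".
d. Configure the property pages.
(a). "Configuration Properties" --> "C/C++" --> "General": in "Additional Include Directories" on the right, add "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include", as shown in the figure below.
(b). "Configuration Properties" --> "Linker" --> "General": in "Additional Library Directories" on the right, add "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\lib\Win32". For a 64-bit (x64) build, use "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\lib\x64" instead; the library directory has to match the project's target platform. See the figure below.
(c). "Configuration Properties" --> "Linker" --> "Input": in "Additional Dependencies" on the right, add OpenCL.lib, as shown in the figure below.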
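II. Getting-Started Example
1. OpenCL basic concepts and framework
OpenCL supports a wide range of applications. Whatever the application, a program targeting a heterogeneous platform has to carry out the following steps:
a. Discover the components that make up the heterogeneous system;
b. Probe the characteristics of these components so that the software can adapt to the specific features of the different hardware units;
c. Create the blocks of instructions (kernels) that will run on the platform;
d. Set up and manage the memory objects involved in the computation;
e. Execute the kernels in the right order on the right components of the system;
f. Collect the final results.
These steps are carried out through a set of OpenCL APIs plus a programming environment for kernels. The problem is broken down into the following models: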
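> Platform model: a description of the heterogeneous system;
> Execution model: an abstract representation of how instruction streams execute on the heterogeneous platform;
> Memory model: the set of memory regions in OpenCL and how they interact during an OpenCL computation;
> Programming model: the high-level description a programmer uses when designing algorithms to implement an application.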
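The OpenCL framework can be divided into the following parts:
> Host programming
Platform API
- Query compute devices
- Create contexts
Runtime API
- Create memory objects associated with a context
- Compile and create kernel program objects
- Issue commands to a command queue
- Synchronize commands
- Release OpenCL resources
> Kernels
Programming language
- C code with some restrictions and extensions
The basic workflow of an application in the OpenCL framework is illustrated in the figure below.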
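2. A first OpenCL program
Our example program performs the following operations:
a. Create an OpenCL context on the first available platform;
b. Create a command queue on the first available device;
c. Load a kernel file (HelloOpenCL.cl) and build it into a program object;
d. Create a kernel object for the kernel function HelloOpenCL();
e. Create memory objects for the kernel arguments;
f. Enqueue the kernel for execution;
g. Read the kernel results back into the result buffer.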
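2.1 Selecting a platform and creating a context
A system can have several OpenCL implementations installed, so the first step in an OpenCL program is to select a platform. Once a platform has been selected, a context is created on it. The code is as follows: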
/******** Part 1: Select an OpenCL platform and create a context ********/
cl_int errNum;                  // declared at the top of main() in the complete program
cl_uint numPlatforms;
cl_platform_id *platformIds;
cl_context context = 0;
// 1. Select an OpenCL platform to run on.
errNum = clGetPlatformIDs(0, NULL, &numPlatforms);            // 1. Query the number of OpenCL platforms
if (errNum != CL_SUCCESS || numPlatforms == 0) {
    fprintf(stderr, "Failed to find any OpenCL platforms.\n");
    exit(1);
}
printf("Platform Numbers: %u\n", numPlatforms);
platformIds = (cl_platform_id *)malloc(
    sizeof(cl_platform_id) * numPlatforms);
errNum = clGetPlatformIDs(numPlatforms, platformIds, NULL);   // 2. Retrieve the IDs of all OpenCL platforms
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Failed to find any OpenCL platforms.\n");
    exit(1);
}
// 2. Create an OpenCL context on the platform.
cl_context_properties contextProperties[] = {
    CL_CONTEXT_PLATFORM,
    (cl_context_properties)platformIds[0],                    // 3. Use the first OpenCL platform
    0
};
context = clCreateContextFromType(contextProperties,          // 4. Try to create a context for a GPU device
                                  CL_DEVICE_TYPE_GPU,
                                  NULL, NULL, &errNum);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Could not create GPU context, trying CPU...\n");
    context = clCreateContextFromType(contextProperties,      // 5. Fall back to a context for a CPU device
                                      CL_DEVICE_TYPE_CPU, NULL, NULL, &errNum);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Failed to create an OpenCL GPU or CPU context.\n");
        exit(1);
    }
}
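The error checks above print only a fixed message, which makes failures hard to diagnose. A small lookup such as the following sketch can turn the status code stored in errNum into a readable name. The function name cl_error_name is hypothetical (it is not part of OpenCL), only a handful of codes are covered, and the integer values are copied from the definitions in CL/cl.h so that the sketch compiles without the OpenCL headers.

```c
/* Hypothetical helper, not an OpenCL API: maps a few OpenCL status codes
 * to their names. The numeric values mirror the #defines in CL/cl.h and
 * are written as plain integers so the sketch needs no OpenCL headers. */
const char *cl_error_name(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -32: return "CL_INVALID_PLATFORM";
    case -33: return "CL_INVALID_DEVICE";
    case -34: return "CL_INVALID_CONTEXT";
    default:  return "unknown OpenCL error";
    }
}
```

For instance, printing cl_error_name(errNum) after a failed clCreateContextFromType call would report CL_DEVICE_NOT_FOUND when the platform has no device of the requested type.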
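2.2 Selecting a device and creating a command queue
After selecting a platform and creating a context, the next step is to select a device and create a command queue. The code is as follows: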
/******** Part 2: Select a device and create a command queue ********/
cl_device_id *devices;
cl_device_id device = 0;
cl_command_queue commandQueue = NULL;
size_t deviceBufferSize = 0;
// 3. Get the size of the device buffer.
errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL,   // 1. Query the buffer size needed to hold all device IDs in the context
                          &deviceBufferSize);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Failed to get context information.\n");
    exit(1);
}
if (deviceBufferSize == 0) {
    fprintf(stderr, "No devices available.\n");
    exit(1);
}
devices = new cl_device_id[deviceBufferSize / sizeof(cl_device_id)];
errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES,            // 2. Retrieve all devices in the context
                          deviceBufferSize, devices, NULL);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Failed to get device ID.\n");
    exit(1);
}
// 4. Choose the first device
commandQueue = clCreateCommandQueue(context,                      // 3. Create a command queue on the first device
                                    devices[0], 0, NULL);
if (commandQueue == NULL) {
    fprintf(stderr, "Failed to create commandQueue for device 0.\n");
    exit(1);
}
device = devices[0];
delete [] devices;
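2.3 Creating and building the program object
The next step is to load the OpenCL kernel source from the file HelloOpenCL.cl and create a program object from it. The program object is loaded with the kernel source and then compiled so that it can run on the devices associated with the context. The OpenCL kernel code in HelloOpenCL.cl is as follows: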
// OpenCL Kernel Function
__kernel void HelloOpenCL(__global const float* a,
                          __global const float* b,
                          __global float* result)
{
    // get index into global data array
    int iGID = get_global_id(0);
    // elements operation
    result[iGID] = a[iGID] * b[iGID];
}
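To make the execution model behind this kernel concrete, the following host-only sketch emulates a one-dimensional launch in plain C: the kernel body runs once per work-item, and the loop counter plays the role of get_global_id(0). The function names hello_opencl_work_item and hello_opencl_reference are hypothetical and not part of the example program. On a real device the work-items may run in parallel and in any order, so this loop only mirrors the result, not the scheduling.

```c
#include <assert.h>
#include <stddef.h>

/* Body of the kernel for a single work-item; gid stands in for the value
 * that get_global_id(0) would return inside the OpenCL kernel. */
static void hello_opencl_work_item(size_t gid, const float *a,
                                   const float *b, float *result)
{
    result[gid] = a[gid] * b[gid];
}

/* Hypothetical host-side stand-in for a 1-D NDRange launch:
 * runs the kernel body once for every global ID in [0, global_size). */
void hello_opencl_reference(const float *a, const float *b,
                            float *result, size_t global_size)
{
    assert(a != NULL && b != NULL && result != NULL);
    for (size_t gid = 0; gid < global_size; gid++) {
        hello_opencl_work_item(gid, a, b, result);
    }
}
```

A reference like this is also handy later on as a source of expected values when checking what the device returns.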
The source code for creating and building the program object is as follows:
/******** Part 3: Read the OpenCL C source, then create and build the program object ********/
cl_program program;
size_t szKernelLength;          // Byte size of kernel code
char* cSourceCL = NULL;         // Buffer to hold source for compilation
// 5. Read the OpenCL kernel in from source file
cSourceCL = oclLoadProgSource(                                  // 1. Read the source of HelloOpenCL.cl from an absolute path
    "C:/Users/xxx/Desktop/OpenCL/HelloOpenCL.cl", "",
    &szKernelLength);
if (cSourceCL == NULL) {
    fprintf(stderr, "Error in oclLoadProgSource\n");
    exit(1);
}
// 6. Create the program
program = clCreateProgramWithSource(context, 1,                 // 2. Create the program object from the source
                                    (const char **)&cSourceCL,
                                    &szKernelLength, &errNum);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Error in clCreateProgramWithSource\n");
    exit(1);
}
// 7. Build the program with the -cl-fast-relaxed-math optimization option
const char* flags = "-cl-fast-relaxed-math";
errNum = clBuildProgram(program, 0, NULL, flags, NULL, NULL);   // 3. Compile the kernel source
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Error in clBuildProgram.\n");
    exit(1);
}
The function that reads the kernel source comes from the sample programs on NVIDIA's official website. Its code is as follows:
//
//! Loads a Program file and prepends the cPreamble to the code.
//!
//! @return the source string if succeeded, 0 otherwise
//! @param cFilename program filename
//! @param cPreamble code that is prepended to the loaded file, typically a set of #defines or a header
//! @param szFinalLength returned length of the code string
//
char* oclLoadProgSource(const char* cFilename, const char* cPreamble, size_t* szFinalLength)
{
    // locals
    FILE* pFileStream = NULL;
    size_t szSourceLength;
    // open the OpenCL source code file
    #ifdef _WIN32   // Windows version
        if(fopen_s(&pFileStream, cFilename, "rb") != 0)
        {
            return NULL;
        }
    #else           // Linux version
        pFileStream = fopen(cFilename, "rb");
        if(pFileStream == 0)
        {
            return NULL;
        }
    #endif
    size_t szPreambleLength = strlen(cPreamble);
    // get the length of the source code
    fseek(pFileStream, 0, SEEK_END);
    szSourceLength = ftell(pFileStream);
    fseek(pFileStream, 0, SEEK_SET);
    // allocate a buffer for the source code string and read it in
    char* cSourceString = (char *)malloc(szSourceLength + szPreambleLength + 1);
    memcpy(cSourceString, cPreamble, szPreambleLength);
    if (fread((cSourceString) + szPreambleLength, szSourceLength, 1, pFileStream) != 1)
    {
        fclose(pFileStream);
        free(cSourceString);
        return 0;
    }
    // close the file and return the total length of the combined (preamble + source) string
    fclose(pFileStream);
    if(szFinalLength != 0)
    {
        *szFinalLength = szSourceLength + szPreambleLength;
    }
    cSourceString[szSourceLength + szPreambleLength] = '\0';
    return cSourceString;
}
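The example program passes an empty preamble, so it may not be obvious what the cPreamble parameter is for. The sketch below shows the same concatenation on strings that are already in memory: a block of #defines is placed in front of the kernel source, so the OpenCL compiler sees the definitions first. The function name prepend_preamble is hypothetical and is not part of the NVIDIA sample code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string consisting of
 * preamble followed by source, or NULL if the allocation fails.
 * If final_length is not NULL it receives the combined length
 * (excluding the terminating '\0'). The caller frees the result. */
char *prepend_preamble(const char *preamble, const char *source,
                       size_t *final_length)
{
    assert(preamble != NULL && source != NULL);
    size_t preamble_length = strlen(preamble);
    size_t source_length = strlen(source);
    char *combined = (char *)malloc(preamble_length + source_length + 1);
    if (combined == NULL) {
        return NULL;
    }
    memcpy(combined, preamble, preamble_length);
    /* The +1 also copies the terminating '\0' of source. */
    memcpy(combined + preamble_length, source, source_length + 1);
    if (final_length != NULL) {
        *final_length = preamble_length + source_length;
    }
    return combined;
}
```

Calling prepend_preamble("#define N 10\n", kernelText, &length) would yield a source string in which the kernel can use N as a compile-time constant; kernelText here is just a placeholder for whatever source string is at hand.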
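2.4 Creating the kernel and memory objects
Next, a kernel object is created from the program object that has just been built. In addition, the host arrays are allocated, and the data of the input arrays is copied into the storage allocated for the memory objects. The code is as follows: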
/******** Part 4: Create the kernel and memory objects ********/
#define ARRAY_SIZE 10
cl_kernel kernel = 0;
cl_mem memObjects[3] = {0, 0, 0};
float a[ARRAY_SIZE];
float b[ARRAY_SIZE];
float result[ARRAY_SIZE];
// 8. Create the kernel
kernel = clCreateKernel(program, "HelloOpenCL", NULL);         // 1. Create the kernel object
if (kernel == NULL) {
    fprintf(stderr, "Error in clCreateKernel.\n");
    exit(1);
}
// 9. Create memory objects
for (int i = 0; i < ARRAY_SIZE; i++) {
    a[i] = (float)i;
    b[i] = (float)i;
}
memObjects[0] = clCreateBuffer(context, CL_MEM_READ_ONLY |     // 2. Create the memory objects
                               CL_MEM_COPY_HOST_PTR,
                               sizeof(float) * ARRAY_SIZE,
                               a, NULL);
memObjects[1] = clCreateBuffer(context, CL_MEM_READ_ONLY |
                               CL_MEM_COPY_HOST_PTR,
                               sizeof(float) * ARRAY_SIZE,
                               b, NULL);
// The output buffer has no initial data, so nothing is copied from the host.
memObjects[2] = clCreateBuffer(context, CL_MEM_READ_WRITE,
                               sizeof(float) * ARRAY_SIZE,
                               NULL, NULL);
if (memObjects[0] == NULL || memObjects[1] == NULL ||
    memObjects[2] == NULL) {
    fprintf(stderr, "Error in clCreateBuffer.\n");
    exit(1);
}
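2.5 Executing the kernel
With the kernel and memory objects created, the kernel can be queued for execution. First the kernel arguments are set, then the kernel is enqueued on the command queue for execution on the device. Enqueuing a kernel does not mean that it runs immediately: the command is placed in the command queue and consumed by the device later. Because the command queue was created with default properties (in-order execution) and the read at the end is blocking (CL_TRUE), the results are complete by the time clEnqueueReadBuffer returns. The code is as follows: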
/******** Part 5: Execute the kernel ********/
size_t globalWorkSize[1] = { ARRAY_SIZE };
size_t localWorkSize[1] = { 1 };
// 10. Set the kernel arguments
errNum  = clSetKernelArg(kernel, 0, sizeof(cl_mem), &memObjects[0]);   // 1. Set the kernel arguments
errNum |= clSetKernelArg(kernel, 1, sizeof(cl_mem), &memObjects[1]);
errNum |= clSetKernelArg(kernel, 2, sizeof(cl_mem), &memObjects[2]);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Error in clSetKernelArg.\n");
    exit(1);
}
// 11. Queue the kernel up for execution across the array
errNum = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL,         // 2. Enqueue the kernel for execution
                                globalWorkSize, localWorkSize,
                                0, NULL, NULL);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Error in clEnqueueNDRangeKernel.\n");
    exit(1);
}
// 12. Read the output buffer back to the Host
errNum = clEnqueueReadBuffer(commandQueue, memObjects[2],              // 3. Read the results back to the host
                             CL_TRUE, 0,
                             ARRAY_SIZE * sizeof(float), result,
                             0, NULL, NULL);
if (errNum != CL_SUCCESS) {
    fprintf(stderr, "Error in clEnqueueReadBuffer.\n");
    exit(1);
}
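The example uses a local work size of 1, so any global work size is valid. With a larger work-group size, OpenCL 1.x requires the global work size to be evenly divisible by the local work size; otherwise clEnqueueNDRangeKernel fails with CL_INVALID_WORK_GROUP_SIZE. A common way to handle this is to round the global size up to the next multiple, as in the sketch below. The helper name round_up_global_size is hypothetical. Note that the padded work-items would index past the end of the arrays, so the kernel in this article would additionally need a length argument and a bounds check before it could be launched this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the smallest multiple of local_size
 * that is greater than or equal to n. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0);
    size_t remainder = n % local_size;
    return remainder == 0 ? n : n + (local_size - remainder);
}
```

With ARRAY_SIZE equal to 10 and a local size of 4, this gives a global size of 12, that is, three work-groups of four work-items. Passing NULL as the local work size instead lets the OpenCL implementation choose the work-group size itself.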
2.6 Testing the results
The purpose of this example program is to multiply the corresponding elements of two arrays. The test case and its output are as follows:
/******** Part 6: Test the results ********/
printf("\nTest: a * b = c\n\n");
printf("Input numbers:\n");
for (int i = 0; i < ARRAY_SIZE; i++)
    printf("a[%d] = %f, b[%d] = %f\n", i, a[i], i, b[i]);
printf("\nOutput numbers:\n");
for (int i = 0; i < ARRAY_SIZE; i++)
    printf("a[%d] * b[%d] = %f\n", i, i, result[i]);
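The final test output is shown in the figure below. The first four lines of the output print platform information, which is not covered in the code above; see the complete code in the appendix for details.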
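Printing the values is enough for ten elements, but for larger arrays an automatic check is more practical. The following sketch counts the elements whose device result differs from the product computed on the host by more than a relative tolerance. The function name count_mismatches and the choice of tolerance are assumptions for illustration; a tolerance is used because device arithmetic need not match the host bit for bit, particularly when relaxed-math build options are involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the elements where result[i] differs from
 * a[i] * b[i] by more than rel_tol, relative to max(1, |expected|). */
size_t count_mismatches(const float *a, const float *b,
                        const float *result, size_t n, float rel_tol)
{
    assert(a != NULL && b != NULL && result != NULL);
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        float expected = a[i] * b[i];
        float magnitude = expected < 0.0f ? -expected : expected;
        if (magnitude < 1.0f) {
            magnitude = 1.0f;   /* avoid a vanishing tolerance near zero */
        }
        float diff = result[i] - expected;
        if (diff < 0.0f) {
            diff = -diff;
        }
        if (diff > rel_tol * magnitude) {
            mismatches++;
        }
    }
    return mismatches;
}
```

In the example program, a call such as count_mismatches(a, b, result, ARRAY_SIZE, 1e-5f) after the read-back would be expected to return 0 when the kernel ran correctly.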
III. Appendix: Complete Example Program
1. Kernel code (HelloOpenCL.cl)
// OpenCL Kernel Function
__kernel void HelloOpenCL(__global const float* a,
                          __global const float* b,
                          __global float* result)
{
    // get index into global data array
    int iGID = get_global_id(0);
    // elements operation
    result[iGID] = a[iGID] * b[iGID];
}
2. Complete host program (main.cpp)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif
char* oclLoadProgSource(const char* cFilename, const char* cPreamble, size_t* szFinalLength);
int main()
{
    cl_int errNum;
    /******** Part 1: Select an OpenCL platform and create a context ********/
    cl_uint numPlatforms;
    cl_platform_id *platformIds;
    cl_context context = 0;
    // 1. Select an OpenCL platform to run on.
    errNum = clGetPlatformIDs(0, NULL, &numPlatforms);            // 1. Query the number of OpenCL platforms
    if (errNum != CL_SUCCESS || numPlatforms == 0) {
        fprintf(stderr, "Failed to find any OpenCL platforms.\n");
        exit(1);
    }
    printf("Platform Numbers: %u\n", numPlatforms);
    platformIds = (cl_platform_id *)malloc(
        sizeof(cl_platform_id) * numPlatforms);
    errNum = clGetPlatformIDs(numPlatforms, platformIds, NULL);   // 2. Retrieve the IDs of all OpenCL platforms
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Failed to find any OpenCL platforms.\n");
        exit(1);
    }
    //------------------ Print platform information (Start) ------------------/
    // Extension data
    size_t ext_size = 0;
    // Platform name
    errNum = clGetPlatformInfo(platformIds[0],
                               CL_PLATFORM_NAME,
                               0, NULL, &ext_size);
    if (errNum < 0) {
        fprintf(stderr, "Couldn't read CL_PLATFORM_NAME.\n");
        exit(1);
    }
    char *name = (char*)malloc(ext_size);
    clGetPlatformInfo(platformIds[0], CL_PLATFORM_NAME,
                      ext_size, name, NULL);
    printf("Platform Name: %s\n", name);
    // Platform vendor
    errNum = clGetPlatformInfo(platformIds[0],
                               CL_PLATFORM_VENDOR,
                               0, NULL, &ext_size);
    if (errNum < 0) {
        fprintf(stderr, "Couldn't read CL_PLATFORM_VENDOR.\n");
        exit(1);
    }
    char *vendor = (char*)malloc(ext_size);
    clGetPlatformInfo(platformIds[0], CL_PLATFORM_VENDOR,
                      ext_size, vendor, NULL);
    printf("Platform Vendor: %s\n", vendor);
    // Highest OpenCL version supported by the platform
    errNum = clGetPlatformInfo(platformIds[0],
                               CL_PLATFORM_VERSION,
                               0, NULL, &ext_size);
    if (errNum < 0) {
        fprintf(stderr, "Couldn't read CL_PLATFORM_VERSION.\n");
        exit(1);
    }
    char *version = (char*)malloc(ext_size);
    clGetPlatformInfo(platformIds[0], CL_PLATFORM_VERSION,
                      ext_size, version, NULL);
    printf("Platform Version: %s\n", version);
    // Profile: only two values are possible, FULL_PROFILE or EMBEDDED_PROFILE
    errNum = clGetPlatformInfo(platformIds[0],
                               CL_PLATFORM_PROFILE,
                               0, NULL, &ext_size);
    if (errNum < 0) {
        fprintf(stderr, "Couldn't read CL_PLATFORM_PROFILE.\n");
        exit(1);
    }
    char *profile = (char*)malloc(ext_size);
    clGetPlatformInfo(platformIds[0], CL_PLATFORM_PROFILE,
                      ext_size, profile, NULL);
    printf("Platform Full Profile or Embedded Profile?: %s\n", profile);
    //------------------ Print platform information (End) ------------------/
    // 2. Create an OpenCL context on the platform.
    cl_context_properties contextProperties[] = {
        CL_CONTEXT_PLATFORM,
        (cl_context_properties)platformIds[0],                    // 3. Use the first OpenCL platform
        0
    };
    context = clCreateContextFromType(contextProperties,          // 4. Try to create a context for a GPU device
                                      CL_DEVICE_TYPE_GPU,
                                      NULL, NULL, &errNum);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Could not create GPU context, trying CPU...\n");
        context = clCreateContextFromType(contextProperties,      // 5. Fall back to a context for a CPU device
                                          CL_DEVICE_TYPE_CPU, NULL, NULL, &errNum);
        if (errNum != CL_SUCCESS) {
            fprintf(stderr, "Failed to create an OpenCL GPU or CPU context.\n");
            exit(1);
        }
    }
    /******** Part 2: Select a device and create a command queue ********/
    cl_device_id *devices;
    cl_device_id device = 0;
    cl_command_queue commandQueue = NULL;
    size_t deviceBufferSize = 0;
    // 3. Get the size of the device buffer.
    errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL,   // 1. Query the buffer size needed to hold all device IDs in the context
                              &deviceBufferSize);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Failed to get context information.\n");
        exit(1);
    }
    if (deviceBufferSize == 0) {
        fprintf(stderr, "No devices available.\n");
        exit(1);
    }
    devices = new cl_device_id[deviceBufferSize / sizeof(cl_device_id)];
    errNum = clGetContextInfo(context, CL_CONTEXT_DEVICES,            // 2. Retrieve all devices in the context
                              deviceBufferSize, devices, NULL);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Failed to get device ID.\n");
        exit(1);
    }
    // 4. Choose the first device
    commandQueue = clCreateCommandQueue(context,                      // 3. Create a command queue on the first device
                                        devices[0], 0, NULL);
    if (commandQueue == NULL) {
        fprintf(stderr, "Failed to create commandQueue for device 0.\n");
        exit(1);
    }
    device = devices[0];
    delete [] devices;
    /******** Part 3: Read the OpenCL C source, then create and build the program object ********/
    cl_program program;
    size_t szKernelLength;          // Byte size of kernel code
    char* cSourceCL = NULL;         // Buffer to hold source for compilation
    // 5. Read the OpenCL kernel in from source file
    cSourceCL = oclLoadProgSource(                                  // 1. Read the source of HelloOpenCL.cl from an absolute path
        "C:/Users/xxx/Desktop/OpenCL/HelloOpenCL.cl", "",
        &szKernelLength);
    if (cSourceCL == NULL) {
        fprintf(stderr, "Error in oclLoadProgSource\n");
        exit(1);
    }
    // 6. Create the program
    program = clCreateProgramWithSource(context, 1,                 // 2. Create the program object from the source
                                        (const char **)&cSourceCL,
                                        &szKernelLength, &errNum);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Error in clCreateProgramWithSource\n");
        exit(1);
    }
    // 7. Build the program with the -cl-fast-relaxed-math optimization option
    const char* flags = "-cl-fast-relaxed-math";
    errNum = clBuildProgram(program, 0, NULL, flags, NULL, NULL);   // 3. Compile the kernel source
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Error in clBuildProgram.\n");
        exit(1);
    }
    /******** Part 4: Create the kernel and memory objects ********/
    #define ARRAY_SIZE 10
    cl_kernel kernel = 0;
    cl_mem memObjects[3] = {0, 0, 0};
    float a[ARRAY_SIZE];
    float b[ARRAY_SIZE];
    float result[ARRAY_SIZE];
    // 8. Create the kernel
    kernel = clCreateKernel(program, "HelloOpenCL", NULL);         // 1. Create the kernel object
    if (kernel == NULL) {
        fprintf(stderr, "Error in clCreateKernel.\n");
        exit(1);
    }
    // 9. Create memory objects
    for (int i = 0; i < ARRAY_SIZE; i++) {
        a[i] = (float)i;
        b[i] = (float)i;
    }
    memObjects[0] = clCreateBuffer(context, CL_MEM_READ_ONLY |     // 2. Create the memory objects
                                   CL_MEM_COPY_HOST_PTR,
                                   sizeof(float) * ARRAY_SIZE,
                                   a, NULL);
    memObjects[1] = clCreateBuffer(context, CL_MEM_READ_ONLY |
                                   CL_MEM_COPY_HOST_PTR,
                                   sizeof(float) * ARRAY_SIZE,
                                   b, NULL);
    // The output buffer has no initial data, so nothing is copied from the host.
    memObjects[2] = clCreateBuffer(context, CL_MEM_READ_WRITE,
                                   sizeof(float) * ARRAY_SIZE,
                                   NULL, NULL);
    if (memObjects[0] == NULL || memObjects[1] == NULL ||
        memObjects[2] == NULL) {
        fprintf(stderr, "Error in clCreateBuffer.\n");
        exit(1);
    }
    /******** Part 5: Execute the kernel ********/
    size_t globalWorkSize[1] = { ARRAY_SIZE };
    size_t localWorkSize[1] = { 1 };
    // 10. Set the kernel arguments
    errNum  = clSetKernelArg(kernel, 0, sizeof(cl_mem), &memObjects[0]);   // 1. Set the kernel arguments
    errNum |= clSetKernelArg(kernel, 1, sizeof(cl_mem), &memObjects[1]);
    errNum |= clSetKernelArg(kernel, 2, sizeof(cl_mem), &memObjects[2]);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Error in clSetKernelArg.\n");
        exit(1);
    }
    // 11. Queue the kernel up for execution across the array
    errNum = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL,         // 2. Enqueue the kernel for execution
                                    globalWorkSize, localWorkSize,
                                    0, NULL, NULL);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Error in clEnqueueNDRangeKernel.\n");
        exit(1);
    }
    // 12. Read the output buffer back to the Host
    errNum = clEnqueueReadBuffer(commandQueue, memObjects[2],              // 3. Read the results back to the host
                                 CL_TRUE, 0,
                                 ARRAY_SIZE * sizeof(float), result,
                                 0, NULL, NULL);
    if (errNum != CL_SUCCESS) {
        fprintf(stderr, "Error in clEnqueueReadBuffer.\n");
        exit(1);
    }
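    /******** Part 6: Test the results ********/
    printf("\nTest: a * b = c\n\n");
    printf("Input numbers:\n");
    for (int i = 0; i < ARRAY_SIZE; i++)
        printf("a[%d] = %f, b[%d] = %f\n", i, a[i], i, b[i]);
    printf("\nOutput numbers:\n");
    for (int i = 0; i < ARRAY_SIZE; i++)
        printf("a[%d] * b[%d] = %f\n", i, i, result[i]);
    // Release the OpenCL resources and the host memory
    for (int i = 0; i < 3; i++)
        clReleaseMemObject(memObjects[i]);
    clReleaseKernel(kernel);
    clReleaseProgram(program);
    clReleaseCommandQueue(commandQueue);
    clReleaseContext(context);
    free(cSourceCL);
    free(platformIds);
    free(name); free(vendor); free(version); free(profile);
    getchar();      // keep the console window open until Enter is pressed
    return 0;
}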
//
//! Loads a Program file and prepends the cPreamble to the code.
//!
//! @return the source string if succeeded, 0 otherwise
//! @param cFilename program filename
//! @param cPreamble code that is prepended to the loaded file, typically a set of #defines or a header
//! @param szFinalLength returned length of the code string
//
char* oclLoadProgSource(const char* cFilename, const char* cPreamble, size_t* szFinalLength)
{
    // locals
    FILE* pFileStream = NULL;
    size_t szSourceLength;
    // open the OpenCL source code file
    #ifdef _WIN32   // Windows version
        if(fopen_s(&pFileStream, cFilename, "rb") != 0)
        {
            return NULL;
        }
    #else           // Linux version
        pFileStream = fopen(cFilename, "rb");
        if(pFileStream == 0)
        {
            return NULL;
        }
    #endif
    size_t szPreambleLength = strlen(cPreamble);
    // get the length of the source code
    fseek(pFileStream, 0, SEEK_END);
    szSourceLength = ftell(pFileStream);
    fseek(pFileStream, 0, SEEK_SET);
    // allocate a buffer for the source code string and read it in
    char* cSourceString = (char *)malloc(szSourceLength + szPreambleLength + 1);
    memcpy(cSourceString, cPreamble, szPreambleLength);
    if (fread((cSourceString) + szPreambleLength, szSourceLength, 1, pFileStream) != 1)
    {
        fclose(pFileStream);
        free(cSourceString);
        return 0;
    }
    // close the file and return the total length of the combined (preamble + source) string
    fclose(pFileStream);
    if(szFinalLength != 0)
    {
        *szFinalLength = szSourceLength + szPreambleLength;
    }
    cSourceString[szSourceLength + szPreambleLength] = '\0';
    return cSourceString;
}