This post: paint.net plugin development, part 2
Last time, while explaining how to integrate OpenCL into a paint.net plugin, I wrote this kernel:
__kernel void mark_red(
    const int width,
    const int height,
    const int channel, // RGBA
    __global int* image
) {
    int x = get_global_id(0);
    int y = get_global_id(1);
    for (int c = 0; c < channel; c++) {
        int idx = x + y * width + c * width * height;
        if (c == 0 || c == 3) {
            image[idx] = 0xFF;
        } else {
            image[idx] = 0x00;
        }
    }
}
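Experienced programmers will already have spotted the problem: every channel occupies a 32-bit int, whereas a single channel in paint.net is really just one byte. With four channels, that means 4 × height × width × 32 bits of data have to be copied into video memory, which wastes VRAM and slows things down. So why not just use an 8-bit char?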
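To put a number on that waste, here is a small sketch in plain C (not part of the plugin; both function names are made up for illustration). It compares the buffer size when every channel gets its own 32-bit int against the size when the four 8-bit channels of a pixel share one 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed when every channel of every pixel is stored in its own
   32-bit int (the layout used by the kernel above). Hypothetical helper. */
size_t bytes_one_int_per_channel(size_t width, size_t height, size_t channels) {
    assert(channels > 0);
    return width * height * channels * sizeof(int32_t);
}

/* Bytes needed when the four 8-bit channels of a pixel share one
   32-bit word. Hypothetical helper. */
size_t bytes_packed_pixel(size_t width, size_t height) {
    return width * height * sizeof(uint32_t);
}
```

For a 1920 × 1080 RGBA image this gives 33,177,600 bytes for the one-int-per-channel layout versus 8,294,400 bytes for the packed layout, a factor of four.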
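Well, that is because in the OpenCL host code a buffer can only be created from an IntPtr; even UIntPtr does not work (converting a uint array to IntPtr is fine, as shown later). It does not throw an exception, it just crashes outright, without even producing a log.
I had also read that data alignment matters in OpenCL, so 32-bit values may actually suit the GPU better, and an 8-bit byte or uchar array might be worse. So I tweaked the program a little. In the host program: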
protected override void OnRender(IBitmapEffectOutput output)
{
    using IEffectInputBitmap<ColorBgra32> sourceBitmap = Environment.GetSourceBitmapBgra32();
    using IBitmapLock<ColorBgra32> sourceLock = sourceBitmap.Lock(new RectInt32(0, 0, sourceBitmap.Size));
    RegionPtr<ColorBgra32> sourceRegion = sourceLock.AsRegionPtr();
    RectInt32 outputBounds = output.Bounds;
    using IBitmapLock<ColorBgra32> outputLock = output.LockBgra32();
    RegionPtr<ColorBgra32> outputSubRegion = outputLock.AsRegionPtr();
    var outputRegion = outputSubRegion.OffsetView(-outputBounds.Location);
    //uint seed = RandomNumber.InitializeSeed(RandomNumberRenderSeed, outputBounds.Location);
    // Delete any of these lines you don't need
    ColorBgra32 primaryColor = Environment.PrimaryColor;
    ColorBgra32 secondaryColor = Environment.SecondaryColor;
    int canvasCenterX = Environment.Document.Size.Width / 2;
    int canvasCenterY = Environment.Document.Size.Height / 2;
    var selection = Environment.Selection.RenderBounds;
    int selectionCenterX = (selection.Right - selection.Left) / 2 + selection.Left;
    int selectionCenterY = (selection.Bottom - selection.Top) / 2 + selection.Top;
    // Loop through the output canvas tile
    uint[] buffer = new uint[outputBounds.Width * outputBounds.Height];
    //int[] buffer = new int[outputBounds.Width * outputBounds.Height * 4];
    for (int y = 0; y < outputBounds.Height; ++y)
    {
        if (IsCancelRequested) return;
        for (int x = 0; x < outputBounds.Width; ++x)
        {
            int _x = outputBounds.Left + x;
            int _y = outputBounds.Top + y;
            buffer[x + y * outputBounds.Width] =
                ( (uint)(sourceRegion[_x, _y].R) << 24 ) |
                ( (uint)(sourceRegion[_x, _y].G) << 16 ) |
                ( (uint)(sourceRegion[_x, _y].B) << 8 ) |
                ( (uint)(sourceRegion[_x, _y].A) );
        }
    }
    Opencl.Execute(outputBounds.Width, outputBounds.Height, 4, buffer);
    // save pixel
    for (int y = 0; y < outputBounds.Height; ++y)
    {
        if (IsCancelRequested) return;
        for (int x = 0; x < outputBounds.Width; ++x)
        {
            int _x = outputBounds.Left + x;
            int _y = outputBounds.Top + y;
            outputRegion[_x, _y].R = (byte)(buffer[x + y * outputBounds.Width] >> 24); // R
            outputRegion[_x, _y].G = (byte)(buffer[x + y * outputBounds.Width] >> 16); // G
            outputRegion[_x, _y].B = (byte)(buffer[x + y * outputBounds.Width] >> 8); // B
            outputRegion[_x, _y].A = (byte)(buffer[x + y * outputBounds.Width] ); // A
        }
    }
    // save pixel end
} // OnRender
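The packing and unpacking in OnRender boils down to a few shifts. Here it is isolated as a plain C sketch so the round trip is easy to check on its own (the names pack_rgba and unpack_channel are made up for illustration and are not part of the plugin):

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 8-bit channels into one word laid out as 0xRRGGBBAA,
   using the same shifts as OnRender. Hypothetical helper. */
uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* Extract one channel: shift is 24 for R, 16 for G, 8 for B, 0 for A.
   The cast to uint8_t discards everything above the low byte. */
uint8_t unpack_channel(uint32_t pixel, unsigned shift) {
    assert(shift == 0 || shift == 8 || shift == 16 || shift == 24);
    return (uint8_t)(pixel >> shift);
}
```

For example, pack_rgba(0x12, 0x34, 0x56, 0x78) yields 0x12345678, and unpacking with shifts 24, 16, 8 and 0 returns the four original bytes.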
The OpenCL wrapper is changed to:
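public static unsafe IntPtr ToIntPtr(this uint[] obj)
{
    // Note: the array is pinned only while the fixed statement runs, so the
    // returned address is not guaranteed to stay valid if the GC later moves obj.
    fixed (uint* Ap = obj) return new IntPtr(Ap);
}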
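This converts a uint[] into an IntPtr that can be passed to the kernel. I don't know why UIntPtr doesn't work and it has to be IntPtr; it may be some internal .NET behavior. But it doesn't matter: once inside the kernel, the data is treated as uint* anyway, and it is 32 bits either way, so there is no problem.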
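One way to see why the element type does not matter here: IntPtr is a pointer-sized integer that holds an address, not a "pointer to int". Whoever dereferences the address decides how to interpret the bytes. The plain C analogy below uses void* in that role. The function name is made up for illustration, and the sketch does not explain the UIntPtr crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the native side: it receives an untyped
   address (the role IntPtr plays in the C# code) and chooses on its own
   to read the memory as 32-bit unsigned words. */
uint32_t read_word_via_untyped_pointer(void *address, size_t index, size_t count) {
    assert(address != NULL && index < count);
    uint32_t *words = (uint32_t *)address;
    return words[index];
}
```

The caller's array type only affects how the host fills the memory. The receiver sees the same 32-bit patterns whether the host declared the elements as signed or unsigned.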
Then, in the kernel code:
__kernel void mark_red(
    const int width,
    const int height,
    const int channel, // from the high byte to the low byte: R, G, B, A
    __global uint* image
) {
    int x = get_global_id(0);
    int y = get_global_id(1);
    if (x >= width || y >= height) return;
    int idx = x + y * width;
    uint output = 0;
    for (int c = 0; c < channel; c++) {
        uchar buffer = (image[idx] >> (8 * c)) & 0xFF;
        if (c != 0 && c != 3) {
            buffer = (uchar)(buffer * 0.1);
        }
        output |= (uint)(buffer) << (8 * c); // set the new value
    }
    image[idx] = output;
}
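To check the kernel's logic without a GPU, the per-pixel body can be ported to plain C, as in the sketch below (mark_red_pixel is a made-up name, not part of the plugin). With the 0xRRGGBBAA packing, c == 0 selects the low byte (A) and c == 3 the high byte (R), so those two are kept while B (c == 1) and G (c == 2) are scaled to 10%.

```c
#include <assert.h>
#include <stdint.h>

/* CPU version of the per-pixel body of mark_red, for one pixel packed
   as 0xRRGGBBAA. Hypothetical helper for testing the logic on the host. */
uint32_t mark_red_pixel(uint32_t pixel, int channel) {
    assert(channel >= 0 && channel <= 4);
    uint32_t output = 0;
    for (int c = 0; c < channel; c++) {
        /* c = 0 is A, c = 1 is B, c = 2 is G, c = 3 is R */
        uint8_t value = (uint8_t)((pixel >> (8 * c)) & 0xFF);
        if (c != 0 && c != 3) {
            /* The cast truncates toward zero, like the kernel's (uchar) cast. */
            value = (uint8_t)(value * 0.1);
        }
        output |= (uint32_t)value << (8 * c);
    }
    return output;
}
```

For example, the pixel 0x80CD69FF (R = 0x80, G = 205, B = 105, A = 0xFF) becomes 0x80140AFF: R and A are unchanged, G drops to 20 and B to 10.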
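I won't post the result; it ran successfully and looks the same as last time.
Finally, a word on debugging. We don't have a debug build of paint.net, so we can't just attach a debugger. That leaves the old-fashioned way: logging.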
The simplest approach is to insert Console.WriteLine calls directly into the program, and then launch paint.net like this:
$process = Start-Process -FilePath "D:\Program Files\paint.net\paintdotnet.exe" -RedirectStandardOutput "log.txt" -NoNewWindow -PassThru
The standard output stream is then written straight to the file, which lets you track down the problem.