PHP MySQL similarity: finding similar images in (pure) PHP / MySQL

My users upload images to my website, and I would like to offer them already-uploaded images first. My idea is to:

1. create some kind of image "hash" of every existing image

2. create a hash of each newly uploaded image and compare it with the others in the database

I have found some interesting solutions such as http://www.pureftpd.org/project/libpuzzle or http://phash.org/, but each has one or more of these problems:

They need a nonstandard PHP extension (or are not written in PHP at all). That would be OK for me, but I would like to ship this as a plugin for my popular CMS, which runs on many hosting environments outside my control.

They compare two images, but I need to compare one image against many (e.g. thousands), and doing that one by one would be very inefficient and slow.

...

I would be happy to find only VERY similar images (e.g. the same image at a different size, a re-saved JPEG, or a different JPEG compression factor).

The only idea I have is to resize the image to e.g. 5px*5px with 256 colors, create a string representation of it, and then look for identical strings. But I guess resizing may introduce tiny color differences even between two copies of the same image at different sizes, so looking only for 100 % identical strings would be useless.

So I would need a good string representation of an image that could then be used with some SQL function to find similar ones, or some other nice way. For example, pHash creates perceptual hashes, so when two hash values are close the images should be close as well, and I would just need to find the smallest distances. But that is again an external library.

Is there any easy way?

Solution

I've had this exact same issue before.

Feel free to copy what I did, and hopefully it will help you / solve your problem.

How I solved it

My first idea, which failed and is similar to what you may be thinking, was to build a string for every single image at whatever size it had. I quickly worked out that this fills the database very fast and wasn't effective.

The next option (which works) was a smaller image, like your 5px idea. I did exactly that, but with 10px*10px images. I built the 'hash' for each image by reading its pixels with the imagecolorat() function.

When reading the RGB colours of the image, I rounded each channel to the nearest 50 so that the colours were less specific. That number (50) is what you want to change depending on how specific you want your searches to be.

For example:

// Pixel RGB

rgb(105, 126, 225) // Original

rgb(100, 150, 250) // After rounding numbers to nearest 50
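
The rounding step can be sketched in a few lines of plain PHP. The function names quantizeChannel() and quantizeRgb() and the $step parameter are illustrative assumptions, not taken from the original code:

```php
<?php
// Round one 0-255 colour channel to the nearest multiple of $step.
// quantizeChannel() and $step are hypothetical names used for illustration.
function quantizeChannel(int $value, int $step = 50): int
{
    return (int) (round($value / $step) * $step);
}

// Apply the same rounding to a full [r, g, b] triple.
function quantizeRgb(array $rgb, int $step = 50): array
{
    return array_map(fn (int $c): int => quantizeChannel($c, $step), $rgb);
}

// The example pixel from above: yields 100, 150, 250.
print_r(quantizeRgb([105, 126, 225]));
```

With other step sizes the rounded value can exceed 255 (for instance 255 rounds to 300 with a step of 100). That is harmless here, because the result is only used as a bucket label and never drawn.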

After doing this to every pixel (10px*10px gives you 100 rgb() values back), I put them into an array, ran it through serialize(), and stored the result in the database encoded with base64_encode().
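
A minimal sketch of that hashing step, assuming the 10px*10px thumbnail is already loaded as a truecolor GD image (for palette images imagecolorat() returns a palette index instead of a packed colour). The names packedToRoundedRgb() and buildImageHash() are made up for this example:

```php
<?php
// Split a packed truecolor value (as returned by imagecolorat()) into
// [r, g, b] and round each channel to the nearest multiple of $step.
// Both function names are hypothetical.
function packedToRoundedRgb(int $packed, int $step = 50): array
{
    $channels = [($packed >> 16) & 0xFF, ($packed >> 8) & 0xFF, $packed & 0xFF];
    return array_map(fn (int $c): int => (int) (round($c / $step) * $step), $channels);
}

// Walk every pixel of a (small) truecolor GD image row by row and return
// the rounded colours as a base64-encoded serialized array.
function buildImageHash($img, int $step = 50): string
{
    $pixels = [];
    for ($y = 0; $y < imagesy($img); $y++) {
        for ($x = 0; $x < imagesx($img); $x++) {
            $pixels[] = packedToRoundedRgb(imagecolorat($img, $x, $y), $step);
        }
    }
    return base64_encode(serialize($pixels));
}
```

For a 10px*10px image the encoded string is a few kilobytes, so the column holding it probably needs to be a TEXT type rather than a short VARCHAR.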

When searching for similar images, I ran the exact same process on the image the user wanted to upload, then pulled the image 'hashes' from the database and compared them all to see which ones had matching rounded RGB values.
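
The comparison could look roughly like the sketch below. It assumes each stored hash decodes to a list of [r, g, b] arrays in the same pixel order; decodeHash(), hashSimilarity(), and the idea of scoring by the fraction of matching pixels are assumptions for illustration, since the original comparison code is not shown:

```php
<?php
// Decode a stored hash back into its list of rounded [r, g, b] triples.
// allowed_classes => false keeps unserialize() from instantiating objects.
function decodeHash(string $stored): array
{
    $pixels = unserialize(base64_decode($stored), ['allowed_classes' => false]);
    return is_array($pixels) ? $pixels : [];
}

// Fraction (0.0 to 1.0) of pixel positions whose rounded colours match.
// Hashes of different lengths are treated as not comparable.
function hashSimilarity(array $a, array $b): float
{
    $n = count($a);
    if ($n === 0 || $n !== count($b)) {
        return 0.0;
    }
    $matches = 0;
    for ($i = 0; $i < $n; $i++) {
        if ($a[$i] === $b[$i]) {
            $matches++;
        }
    }
    return $matches / $n;
}
```

A caller would then keep only the stored images whose score is above some threshold, for example 0.9; that cut-off is a guess and needs tuning against real uploads.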

Tips

The bigger the rounding step (50 above), the less specific your search will be, and vice versa.

If you want your SQL query to be more selective, it may help to store extra information about each image in the database so that you can limit the rows you pull out. For example, if the aspect ratio of the upload is 4:3, only fetch images with an aspect ratio around 4:3.
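
One way to sketch the aspect-ratio pre-filter with PDO. The table image_hashes, its columns id, hash, and aspect_ratio, the 5 % tolerance, and both function names are hypothetical:

```php
<?php
// Lower and upper aspect-ratio bounds around the uploaded image's own ratio.
// $tolerance is a relative margin; 0.05 (5 %) is an arbitrary starting point.
function aspectRatioBounds(int $width, int $height, float $tolerance = 0.05): array
{
    $ratio = $width / $height;
    return [$ratio * (1 - $tolerance), $ratio * (1 + $tolerance)];
}

// Fetch only the stored hashes whose aspect ratio is close to the upload's.
// Assumes a table image_hashes(id, hash, aspect_ratio) that is not part of
// the original answer.
function findCandidateHashes(PDO $pdo, int $width, int $height): array
{
    [$min, $max] = aspectRatioBounds($width, $height);
    $stmt = $pdo->prepare(
        'SELECT id, hash FROM image_hashes WHERE aspect_ratio BETWEEN :min AND :max'
    );
    $stmt->execute([':min' => $min, ':max' => $max]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

An index on aspect_ratio would let MySQL narrow the candidate set before any hashes are decoded and compared in PHP.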

It can be difficult to shrink an image to exactly 5px*5px (or 10px*10px), so I suggest phpThumb. I used it with this syntax:

phpthumb.php?src=IMAGE_NAME_HERE.png&w=10&h=10&zc=1

// &w= width of your image

// &h= height of your image

// &zc= zoom-crop. 0: keep the aspect ratio, 1: crop the image so it fills exactly your width+height
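
If pulling in phpThumb is not an option, a similar thumbnail can probably be produced with PHP's bundled GD extension alone. This is a swapped-in alternative rather than what the answer used, and makeThumbnail() is a hypothetical name. Unlike zc=1 it stretches the image to a square instead of cropping it:

```php
<?php
// Load an image file and scale it down to a $size x $size truecolor image.
// Returns a GD image on success or false if the file cannot be read/decoded.
// Note: this stretches the source to a square; it does not crop.
function makeThumbnail(string $path, int $size = 10)
{
    $data = @file_get_contents($path);
    if ($data === false) {
        return false;
    }
    $src = @imagecreatefromstring($data);
    if ($src === false) {
        return false;
    }
    $dst = imagecreatetruecolor($size, $size);
    // Arguments: dst, src, dst_x, dst_y, src_x, src_y, dst_w, dst_h, src_w, src_h
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $size, $size, imagesx($src), imagesy($src));
    return $dst;
}
```

Because the result is created with imagecreatetruecolor(), imagecolorat() on it returns packed RGB values rather than palette indexes.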

Good luck mate, hope I could help.
