Using partition by to write alternative SQL to get effdt and effseq items

Using partition by to write alternative SQL to get effdt and effseq items

Paritioning data is really useful for grouping data prior to data manipulation (as opposed to group by which does grouping after manipulation the data). This is because group by is grouping resulting data while partition by is actually partitioning  the data and returning the appropriate partitioned data. Simply said; a normal query is run and the results are retrieves by the Database, then Analytics are applied to the results and the Analytic function columns are computed.

This can be very usefull in peoplesoft when retrieving MAX(effdt) and MAX(effseq). Normally we would write a SQL like this to get the MAX(effdt) and MAX(effseq) for each EMPLID and EMPL_RCD.


SELECT A.EMPLID, A.EMPL_RCD, A.EFFDT, A.EFFSEQ
FROM PS_JOB A
WHERE ( A.EFFDT =
(SELECT MAX(A_ED.EFFDT) FROM PS_JOB A_ED
WHERE A.EMPLID = A_ED.EMPLID
AND A.EMPL_RCD = A_ED.EMPL_RCD
AND A_ED.EFFDT <= sysdate)
AND A.EFFSEQ =
(SELECT MAX(A_ES.EFFSEQ) FROM PS_JOB A_ES
WHERE A.EMPLID = A_ES.EMPLID
AND A.EMPL_RCD = A_ES.EMPL_RCD
AND A.EFFDT = A_ES.EFFDT) )

As we can see in the explain plan, a lot is going on in the tables just to get the MAX(effdt) and MAX(effseq).

normal sql explain plan

In the next example I have use the Partition by function to get the MAX(effdt) and MAX(effseq) It partitions job by EMPLID and EMPL_RCD, and returns the maximum dated row for each partition.


SELECT J.EMPLID,
J.EMPL_RCD,
J.EFFDT,
J.EFFSEQ
FROM
(SELECT EMPLID,
EMPL_RCD,
EFFDT,
EFFSEQ,
MAX (EFFDT) over (partition BY EMPLID, EMPL_RCD) AS MAX_JOB_EFFDT,
MAX (EFFSEQ) over (partition BY emplid, EMPL_RCD, EFFDT) AS MAX_JOB_EFFSEQ
FROM PS_JOB
WHERE effdt <= SYSDATE) J
WHERE J.EFFDT = J.MAX_JOB_EFFDT
AND j.EFFSEQ = J.MAX_JOB_EFFSEQ;

In this explain plan, a lot less is going on but it returns the exact same data.

Partitioned SQL

Also you might notice that the CPU Cost is much lower. This doesn’t have to be the case with all partition by data. Because we are using analytic functions, when processing this on 300,000 rows in job the cost might be much higher then the cost of a normal query. The big difference will be in the execution time. For example; when executing the normal query this might cost 2.500 seconds, while in analythis this will be below 1.000 seconds. This is something to keep in mind.

Analytic functions  sometimes cost more CPU when processig large amounts of rows, but execution time will also be much, much faster.

 Viewed 1324 times by 542 visitors

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Here is a sample code that performs image thresholding using partition method first and then Otsu's method: ```c #include <stdio.h> #include <stdlib.h> #include <math.h> #define MAX_PIXEL_VALUE 255 int partition_thresholding(int *histogram, int num_pixels); int otsu_thresholding(int *histogram, int num_pixels); int main() { int width = 256, height = 256; unsigned char *image = malloc(width * height); // Generate a sample image for (int i = 0; i < height; ++i) { for (int j = 0; j < width; ++j) { image[i * width + j] = i % 256; } } // Compute histogram of the image int *histogram = calloc(MAX_PIXEL_VALUE + 1, sizeof(int)); for (int i = 0; i < height; ++i) { for (int j = 0; j < width; ++j) { histogram[image[i * width + j]]++; } } // Threshold using partition method int partition_threshold = partition_thresholding(histogram, width * height); printf("Partition threshold: %d\n", partition_threshold); // Threshold using Otsu's method int otsu_threshold = otsu_thresholding(histogram, width * height); printf("Otsu threshold: %d\n", otsu_threshold); free(image); free(histogram); return 0; } int partition_thresholding(int *histogram, int num_pixels) { int sum = 0, num_background = 0, num_foreground = 0; for (int i = 0; i <= MAX_PIXEL_VALUE; ++i) { sum += i * histogram[i]; num_foreground += histogram[i]; } int threshold = 0, max_variance = 0; for (int i = 0; i <= MAX_PIXEL_VALUE; ++i) { num_background += histogram[i]; num_foreground -= histogram[i]; if (num_background == 0 || num_foreground == 0) { continue; } int mean_background = sum - i * num_foreground; mean_background /= num_background; int mean_foreground = sum - mean_background * num_background; mean_foreground /= num_foreground; int variance = num_background * num_foreground * pow(mean_background - mean_foreground, 2); if (variance > max_variance) { max_variance = variance; threshold = i; } } return threshold; } int otsu_thresholding(int *histogram, int num_pixels) { float sum = 0, num_background = 0, num_foreground = 0; for (int i = 0; i <= MAX_PIXEL_VALUE; ++i) { sum += i * histogram[i]; } float sum_background = 0, sum_foreground = 0; int threshold = 0; float max_variance = 0; for (int i = 0; i <= MAX_PIXEL_VALUE; ++i) { num_background += histogram[i]; if (num_background == 0) { continue; } num_foreground = num_pixels - num_background; if (num_foreground == 0) { break; } sum_background += i * histogram[i]; sum_foreground = sum - sum_background; float mean_background = sum_background / num_background; float mean_foreground = sum_foreground / num_foreground; float variance = num_background * num_foreground * pow(mean_background - mean_foreground, 2); if (variance > max_variance) { max_variance = variance; threshold = i; } } return threshold; } ``` In this code, we first generate a sample image and compute its histogram. Then, we apply partition method and Otsu's method to compute the threshold value. Finally, we print the threshold values obtained from both methods. Note that this code assumes that the input image is a grayscale image with pixel values ranging from 0 to 255. You may need to modify the code if your input image has a different format.

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值