http://it.toolbox.com/blogs/sap-on-db2/for-all-entries-vs-db2-join-8912
1. duplicate rows are automatically removed
2. if the itab used in the clause is empty , all the rows in the source table will be selected .
3. performance degradation when using the clause on big tables.
In this post I'd like to shed some light on the third issue. Specifically i'll discuss the use of the "for all entries" clause as a means to join tables in the abap code instead of in db2.
Say for example you have the following abap code:
Select * from mara
For all entries in itab
Where matnr = itab-matnr.
If the actual source of the material list (represented here by itab) is actually another database table, like:
select matnr from mseg
into corresponding fields of table itab
where ….
Then you could have used one sql statement that joins both tables.
Select t1.*
From mara t1, mseg t2
Where t1.matnr = t2.matnr
And T2…..
So what are the drawbacks of using the "for all entires" instead of a join ?
At run time , in order to fulfill the "for all entries " request, the abap engine will generate several sql statements (for detailed information on this refer to note 48230). Regardless of which method the engine uses (union all, "or" or "in" predicates) If the itab is bigger then a few records, the abap engine will break the itab into parts, and rerun an sql statement several times in a loop. This rerun of the same sql statement , each time with different host values, is a source of resource waste because it may lead to re-reading of data pages.
returing to the above example , lets say that our itab contains 500 records and that the abap engine will be forced to run the following sql statement 50 times with a list of 10 values each time.
Select * from mara
Where matnr in ( ...)
Db2 will be able to perform this sql statement cheaply all 50 times, using one of sap standard indexes that contain the matnr column. But in actuality, if you consider the wider picture (all 50 executions of the statement), you will see that some of the data pages, especially the root and middle-tire index pages have been re-read each execution.
Even though db2 has mechanisms like buffer pools and sequential detection to try to minimize the i/o cost of such cases, those mechanisms can only minimize the actual i/o operations , not the cpu cost of re-reading them once they are in memory. Had you coded the join, db2 would have known that you actually need 500 rows from mara, it would have been able to use other access methods, and potentially consume less getpages i/o and cpu.
In other words , when you use the "for all entries " clause instead of coding a join , you are depriving the database of important information needed to select the best access path for your application. Moreover, you are depriving your DBA of the same vital information. When the DBA monitors & tunes the system, he (or she) is less likely to recognize this kind of resource waste. The DBA will see a simple statement that uses an index , he is less likely to realize that this statement is executed in a loop unnecessarily.
In conclusion I suggest to "think twice" before using the "for all entries" clause and to evaluate the use of database views as a means to:
a. simplify sql
b. simplify abap code
c. get around open sql limitations.
Omer Brandis
DB2 DBA & SAP Basis professional (and all around nice guy)
omerb@srl.co.il
Related White Papers
Mar 1, 2007
Really nice statement. All you've written is true. But you compare ABAP coding with native SQL. What's you intention? Should an ABAP developer use more native SQL? You should show an example using ABAP and the ABAP join clause. In my opinion the major problem of ABAP seems to be that it is not ANSI SQL compliant. It's not a good idead to compare ANSI SQL with ABAP. But you are right: We could make thinks pretty much better if SAP would support ANSI SQL. |
omer brandis
| Mar 4, 2007
Dear JD, In this post, i didn't mean to compare ABAP open sql with DB2 NATIVE SQL, I meant to compare the abap "for all entries" clause with a "simple" database join, i.e. a join that is executed completly at the database level. This simple join can be writen in abap using the regular open sql syntax (no need for native sql syntax), all you have to do is write a simple sql join in your abap program. In conclusion, in this post i tried to shed some light on one of the [performance related ] draw backs of using the "for all entries" clause. I don't mean to say that open sql should be more ansi sql complient, nor that, open sql or that the "for all entries" clasue is "bad" and shouldn't be used ,nor that developers should code more native sql and less open sql. I hope this has helped clearify this blog entry (if not , please feel free to post more questions) Omer. |
Jun 1, 2007
Hello Omer, very useful article. I have a question concerning the performance of such statement:SELECT f1 f2 FROM db_table FOR ALL ENTRIES IN itab WHERE f3 = itab-f3 INTO... The itab-f3 has about 5 000 different values and it is not possible to use join. Is it better to use SELECT SINGLE in a loop or there is a better way to achieve good performance? BR Peter |
omer brandis
| Jun 3, 2007
Hello peter,if you are 100% sure , you can't code a database join, then : Theoretically , the "for all entries" clause has an advantage over a loop of select singel by primary key. This is becuase the abap interpreter can group together several rows in the itab into one select (see note 48230) so it could get the requested result using less conversations with the database. The down side is that by doing so, the where predicates will not be a simple equation predicates, they may become "in" predicates and so on (see note 48230). This change in the sql may lead to a more costly access path. So, unfortunatly, you will probably be forced to write and test both options and see which costs less. * by costs i mean total getpages or "in db2 time" for the whole workload, and not transaction elapsed time.) ** as a rule of thumb I would lean to select single by primary key, over "for all entries" ... but as i said, its depends on several factors.... good luck. omer |
Jun 5, 2007
Thanks Omer,I should really do some more tests. Unfortunately test results are totally different on different system and I do not know what will be the system of the customer. Anyway, once again thank you for the advices! Peter |
Jun 22, 2007
Omer - very intersting reading. I wish I had seen it before I wrote:https://weblogs.sdn.sap.com/pub/wlg/6050 I took a more practical rather than theoretical approach and came to basically the same conclusion. (However, I found in a few cases, when selecting a large number of records (say around 100,000), FOR ALL ENTRIES actually outperformed the JOIN.) Rob |
Oct 1, 2007
Dear OmerUse sub query instead of "for all entries", if possible, bcos Network load is considerably less when used large db. Please check DWDM transaction for more info. Regards Vijayendran S |
Nov 2, 2007
Hi,I'd like to point out that there IS a way to ensure that duplicate entries are not eliminated, in "FOR ALL ENTRIES" - in the FROM clause of your select-statements, ensure that the entire primary key of the database table is selected; this will automatically ensure that all entries, even duplicates, get picked up. |
Sep 5, 2008
Omar ? thanks for an interesting article. However have I seen when you have large number of records you need to do performance tests on which of the alternatives to use. Have seen huge differences in performance for tables with over 10 million records, especially where several tables have been JOINed together. However is that another but related question on how many table you should JOIN together.
NOTE 48230 |