You Asked (Jump to Tom's latest followup)
and we said...
Tom Can you give an example carried out in pro* c and the same done in pl/sql, and show us which is faster and why. I guess pro*c also does array inserts which in pl/sql we call bulk binds. Which of them is faster. thank y
I can show you code that will be faster in Pro*C then in PLSQL. I can show the SAME EXACT code where PLSQL will be faster then Pro*C. You have to consider where in the world the processing takes place. Lets say you have a very fast network and a super fast client. Pro*C can easily pull the data out of the database, over the network, work on it quickly and put it back into the database. It might be plsql here simply because it has a fast network and a very fast machine to work on. Now lets say the pro*c client is on a busy network and the client machine its running on is very utilized (busy). Now, PLSQL in the database -- since it doesn't have the network overhead and perhaps the database is not as utilized -- beats the pants off of pro*c (or more specifically C). The answer (as always) is -- it depends. There is no cut and dry answer for this. I personally prefer to put all data processing stuff in the database. In that fashion, I can do the processing from Pro*C, Java, Vb, SQLPlus -- whatever I want. If I find a bug in the processing, I can fix the stored procedure right there and I don't have to track done all of the code that might have duplicated this processing (and fix all of the copies). If I want to tune it, I just get the stored procedure out of the database -- tune it and put it back in. So, neither is blanket "faster" then the other. There are many reasons to put the logic in the database and as long as it goes fast enough -- thats my approach.
Reviews | ||||||||||||||||||||||||||
Reviewer: Enrique Aviles from Orlando, FL Just wanted to use this forum to share my opinion. I've read many answers to many questions posted on this web site and it seems to me that Tom prefers PL/SQL and doing "data processing" on the database than in Pro*C. My experience has been just the opposite. I think that for most business applications, PL/SQL is the language of choice. For heavy batch processing (million of records) that require complex data manipulation/processing with advanced data structures (trees, hash tables, linked lists) Pro*C is the language of choice. I supported an in-house developed system of a Microelectronics manufacturing company and I can't even begin to imagine writing the batch programs in PL/SQL. At some point we had to convert a 2000 PL/SQL script into a Pro*C program because of performance issues. The script ran in 3 hours and the equivalent Pro*C program ran in 15 minutes. All the underlying tables used in the script had the proper indexes and all other stuff required for performance. Problem is that when you need to query some tables millions or billions of times nothing beats loading the data in memory and using the bsearch() function. If you can manage to have most of your data loaded in memory and do all the calculations there, it will be faster than database processing most of the time and you can easily do that with Pro*C. My intention is not to bash Tom's opinions or experiences. He is certainly an extremely competent and knowledgable Oracle professional. I just wanted to share my experience which seems to be quite different from the usual answers presented in this web site. <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: A reader > For heavy batch processing (million of records) that require complex data manipulation/processing with advanced data structures (trees, hash tables, linked lists) Pro*C is the language of choice Could you please explain it a bit further?
Reviewer: Enrique Aviles from Orlando, FL >> "as long as it goes fast enough". That is the crux of the situation. We needed the speed of C, that's why we had to use Pro*C. We couldn't run our batch process in a timely manner at night just using PL/SQL, that's where Pro*C made things possible. >> "I will most heartily disagree with you on the loading >> into memory and using bsearch(), i would have given the >> algorithms a look see and undoubtably (you can >> so "on no you wouldn't" -- but I have yet to see >> opposite) have seen lots of procedural code to >> simply "erase". Not sure what you're saying here. Bottom line is that I can bet the farm on bsearch(). If I need to run 10 million selects on a 500,000 row table I can do it faster by loading the table in memory and doing the lookups using bsearch(). I've seen the benefits of this many, many times. Converting a loop with one or more embedded selects into bsearch dramatically improves performance (of course, if that's the only bottleneck in the program). There is no disk I/O, no database interaction, no network traffic to worry about, just plain data in memory being accessed by perhaps the fastest search algorithm available. It's hard to beat that! <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: Pavan from India Enrique, I don't know Pro*C much nor PL/SQL well but I can see that your argument is "flawed". The poster asked which is fast and when. Tom explained it by saying ... put everything in database .. if it doesn't work go to pro*c. In my opinion you are trying to generalize using one particular example
Reviewer: Zoran Martic from Dublin, Ireland - currently Alabama :) Just to answer on the question about array inserts: I was very surprised that until 9.2.0.6 (that you have only on Windows and Solaris 32 for now) if you need to use array (bulk) inserts with save exceptions it is just not going to work because of 3 nasty bugs. Even with 10g (10.0.1.3) is not working. I did not realize that until a few weeks ago while I wanted to test again the speed of array inserts with total failing bulk insert for some customer. I wanted to see why bulk inserts are that much slower when you have bad inserts in the set and I wanted to confirm that fact through the PL/SQL first, but found these nasty bugs and could not progress with PL/SQL even for testing. That reveals me the fact about that a small majority of people doing bulk inserts in PL/SQL. Just to not talk without proves, bug numbers are: 3377931, 3566124, 2708874. From the bug description for some it stated it is confirmed to be found in 9.2.0.5 (for now all my databases are 9.2.0.5 or 10.1.0.3). Am I supposed to revert it back now :) Also 9.2.0.5 fixed some other bugs we needed to fix. Tom, please prove me wrong with this, because I was so dissapointing in this particular case. Also somebody said PL/SQL is using OCI interface (maybe UPI as Pro*C if anybody out of that Oracle group knows exactly about UPI) and because of that in C-OCI or Pro*C it is fixed first because it needs to be of the customer volume behind at least array inserts in C world. With Pro*C or C-OCI it works from 8i (cannot remember earlier, but could be in earlier releases of OCI or Pro*C interface). That means stright answer on your question about what is faster: PL/SQL or Pro*C is? it works properly only in Pro*C or C-OCI that we are using if you want to handle bulk exceptions :) Oracle recommendation as the workaround in not using SAVE EXCEPTIONS in PL/SQL is to not use SAVE EXCEPTIONS, that means if you have any bad insert in the array insert you will need to stop processing and do whatever recovery in the exception session. Of course if you do not have any errors then PL/SQL will work just fine with array inserts. Tom, I like PL/SQL because it is so easy to develop and maintain, but in the case as Enrique said where you need array processing capabilities faster is going to be in Pro*C or even C-OCI. Enrique, not sure that you ever used ASSOCIATIVE ARRRAYS in the PL/SQL. I did the tests with it on 5M rows tables (around 300M of PGA space, that you need to load) and associative arrays are this fast: 10g - 2M lookups per second 9i - 0.5 lookups per second When you are looking into this you are seeing "crazy" Oracle algoritham to do this. I tested many other fast memory lookups in C and they were not that fast. Not sure about that bsearch (going to investigate) but it looks Oracle has the "crazy" bsearch here. But again you have the overhead to load the table in ASSOCIATIVE ARRAY in PL/SQL at the beggining if the table is big (probably the same with doing that in Pro*C). Of course I am doing as much possible in single SQL, but sometimes the business logic is much more dynamic driven and it is crazy harder to do it with huge SQL statements. I already discussed this with Tom in some earlier thread. That concludes that again dependant on all relevant requirements you will choose one or other as Tom said. But for the keeping this discussion to the point of the first question Pro*C is for the moment more reliable or even the only working solution bearing in mind that in PL/SQL you need to use Oracle workarounds for PL/SQL array inserts if they work at all (for me even workarounds gave me internal errors when using SAVE EXCEPTIONS). At the end, thanks to Enrique who expressed the different view, in this case probably correct one. Without his input I will not be that intrigated to write my comments. Regards, Zoran
<script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: Enrique Aviles from Orlando, FL Tom said: "what I'm saying is -- you bet the farm on bsearch and I'll beat it with SQL." With all due respect... I highly doubt it. Pavan said: "In my opinion you are trying to generalize using one particular example". Agreed. I should have not generalized with a trivial example as a table lookup. My original post was motivated by the system I used to support. Maybe I should have used that as a more general/comprehensive example. Following that idea, Zoran hit the nail on the head when he said: "Of course I am doing as much possible in single SQL, but sometimes the business logic is much more dynamic driven and it is crazy harder to do it with huge SQL statements." That's closer to what I was trying to say. The planning system I supported was very complex due to the nature of the business (microelectronics manufacturing). The fabrication of integrated circuits is much more complex than building refrigerators or chairs. The planning system had to calculate an optimal delivery date for every order. The system had to consider the inventory available to build a particular product (many parts with its own characteristics), the factory capacity available (worldwide factories), all the different ways a product could be built (think permutations), and come up with the best date that the company could deliver the product. The memory map of such system looked like an operating system. We had arrays, linked lists, trees, hash tables and pointers galore. The system ran every night to see if it could improve on the previous day's results (due to cancelled orders, which released inventory/capacity and other factors). This is a system that I cannot even imagine being written in SQL. There was too much data manipulation and too many calculations to effectively do it in the database or with PL/SQL. <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: Zoran Martic from Dublin, Ireland Hi Tom, You are correct about Pro*C. There is not save exception method there at all. I extended my OCI behaviour to Pro*C by default assuming it is the similar. I was wrong, that means SAVE EXCEPTIONS feature is only useful in OCI and not in Pro*C and PL/SQL (is fixed in PL/SQL in 9.2.0.6, will be soon for all platforms). Tom, I am not saying PL/SQL is useless because I like to do everything in PL/SQL if possible. That is what I know better then C at the moment. If you can overcome all PL/SQL bugs with array inserts then great, use PL/SQL. Also, I agree that probably you can do almost everything in PL/SQL and SQL. But the question was what is faster??? Neither of us concentrated on the exact question. Because I did not test Pro*C against PL/SQL I cannot tell exactly. I suppose that Oracle should have some test results???? Because Tom is in Oracle he can maybe share some or find them somewhere (of course that it cannot apply to any scenario, but at least to get an idea). Then I can say that probably for array (bulk insert) performance the difference will be based on: a) the networking with Pro*C the networking overhead is something hard to find published in Oracle b) where the source data is: in the table or the client side (sometimes you will mix both Pro*C + PL/SQL for the best results) If looking only the bulk inserts then hard to say. Proabably to decide what is better for you it needs to be considered the whole use-case scenario + who is going to do it. If it is me that will be PL/SQL and not C :) But again Enrique and me just wanted to talk about the things from the different angle. Regards, Zoran <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: Zoran Martic from Dublin, Ireland Tom, I agree fully with you about using the most appropriate thing when needed. I am not sure that anybody here did not agree with you in previous responses. We just mentioned our experience with this thing. At the end I have just expressed the problems with PL/SQL because of bugs related to some aspects of array inserts while that thing is working with C-OCI for a long time. In PL/SQL is not yet fixed, even in 10g. Would you confirm that PL/SQL is based on OCI, please? Just curious, because you have SQL*Net waits while using PL/SQL :) Regards, Zoran
<script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: Enrique Aviles from Orlando, FL Zoran wrote: "I am not sure that anybody here did not agree with you in previous responses. We just mentioned our experience with this thing." I could not have said it better myself. I totally agree that PL/SQL, Pro*C, C or any other language are tools. Some of them are better suited for a particular application or problem than others.
Reviewer: Tim... from UK Just a few points: - Pulling everything back into arrays and processing it can use alot of memory and result in you processing out-of-date data. I've seen it cause problems many times. - The speed of procedural code can be improved using native compilation. Ease of PL/SQL, speed of machine code. Best of both worlds ;-) - I worked on real-time-control systems for a while. Some of the guys did everything in Pro*C, I coded most of my stuff in PL/SQL. My code kicked butt and everyone started to follow my lead. I'm not saying this is right for everyone, but in my experience, if data is involved let the database do it... Cheers Tim...
Reviewer: A reader from Sweden I have some fairly heavy batch jobs in my programs and I could, of course, do it the Tom way and put a lot of it in the db. But then, on the other hand, Oracle is doing the processing and calculations and will quicker reach the state of high load. I could buy the license for another cpu, but this will double the cost of the Oracle licenses and for an EE license and that hurts. Badly. This is my pragmatic view of it. If Oracle was to drop the per CPU license to strictly per USER I'd go with the Tom approach of putting as much in the db as possible. Generally that really is the best way, I think. Yes, I am aware of the new cheap ($150/user) license for small customers. Thank you for that, Oracle, and I really mean it. As running a startup company I feel a thousand times more safe basing my product on Oracle then an Open Source project and still being competitive at pricing. Open Source is good, but you cannot make volunteers get up in the middle of the night to fix an error if my customers scream. <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: A reader Yes, but you can have different massages. I don't do a quick rub on the shoulders and therefore I gain speed by doing it in C++. On the other points I agree or stand corrected. We have allready had one customer who turned us down of fear of costly oracle consultants and products. They go with a mySql solution. If I can cut some price by doing the above I will. But this situation, I guess, will change by you (oracle) now started targeting small & midsize companies as well. I've even tried Postgres as they have a pro*c-ish program, but it wont array read (I even went through the source) and Postgres lacks a busload of other stuff. I feel like I'm getting off topic now... Sorry. Have a nice weekend, Tom!
<script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: A reader I'm a developer of a program using Oracle as a database. The FUD numbers came from the customer and we failed to convince them of otherwise. Oracle has a reputation of being expensive and that reputation will not change over night, even if you have lowered your prices. Sql server started out quite cheap and has got a reputation as quite cheap, even if they are about same as you are now. Just like Volvo still has a reputation of being a slow tank, even if they nowdays are just like any other (european) car. Who do you think people associate the monster in the sybase(?) ads with? Come on! Shoot back, you've got some good ammo now. In know of several start-ups who stay away from Oracle because they have illusions about the cost. It was f.ex by pure luck I found the Oracle personal edition offer a year and a half ago, which I could do prototypes with. Now I'll stop torturing you and the readers of asktom with my nonsense. <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: A reader Sybase is running ads for their linux campaign:http://www.sybase.com/linuxpromo You can see the ads if you f.ex reload freshmeat.net a couple of times. If you read the highly intellectual comments on slashdot.org it also appears there from time to time. I found a small note about Oracle's $150/user offering in a computer magazine I subscribe to. But is this enough to spread the word about it? Most start-ups nowdays try to use good free libs & utils in their products to reduce time to market and initial r&d cost and search sourceforge.net (& co) open source portals frequently. Therefore I think it was a brilliant move of Sybase to put their ads there. As for the SS vs Oracle pricing, what I wrote is what I've heard from others, including guys basing their software around SS. I'm a programmer and not very excited about reading price charts. Few are, I guess. Please feel free to remove my comments as they are likely not of any general interrest. <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: A reader Hi I would like to know if we can use host structure for updates, in the doc it suggests this You can use a C structure to contain host variables. You reference a structure containing host variables in the INTO clause of a SELECT or a FETCH statement, and in the VALUES list of an INSERT statement. typedef struct { char emp_name[11]; /* one greater than column length */ int emp_number; int dept_number; float salary; } emp_record; ... /* define a new structure of type "emp_record" */ emp_record new_employee; strcpy(new_employee.emp_name, "CHEN"); new_employee.emp_number = 9876; new_employee.dept_number = 20; new_employee.salary = 4250.00; EXEC SQL INSERT INTO emp (ename, empno, deptno, sal) VALUES (:new_employee); can we do this with UPDATES? <script language="JavaScript1.1" type="text/javascript"> document.getElementById("latestLink").style.display = "inline" </script>
Reviewer: albert from china Hi tom you say you prefer to plsql rather than pro*C, But I have a case in our Data Warehouse system as following: 1/ a source table "sales_source" with 30+ million records : column -------- prod_id cust_id date_id sales_amount 2/ and two Dimension tables 2.1/ the "Customer" table with 2+ million records: column --------------------- cust_Surrogate_key cust_id cust_name cust_age 2.2/ the "product" table with 2000 records: column -------------------- prod_Surrogate_key prod_id prod_name prod_catalog 3/ the target fact table "cube_sales" is column -------------- cust_Surrogate_key prod_Surrogate_key date_id sales_amount We want to transfer sales data from the source table to the target one, meanwhile replacing the cust_id with a Surrogate key such that we could resolve the "Slowly Changing Dimensions" problem. Would you please show me a simple and effective PL/SQL solution to achive the that ? I wonder if PL/SQL can be enough powerful to do faster lookup then pro*C ? best regard
Reviewer: Pradip Kr Das from India It is so helpul that no word to say. Answer is very clear and understanding. |