Suck It Dry - Tuning Parallel Execution

Home Page

Back to Papers

Please note this page has moved to http://oracledoug.com/px.html. You will be redirected in 5 seconds but it would be better to update your bookmarks ;-)

Introduction

The first time I heard of Oracle’s Parallel Query Option was in 1993, when my boss returned from the IOUG conference. He was an experienced database guy and I still remember his comment that ‘this could be as big a step forward as B-tree indexing’.

I heard nothing about it for another 2 or 3 years, working at various sites as a contractor and the facility wasn't released until Oracle 7.1.6 Then I worked at a site where one of the DBAs decided we should ‘give it a try’ to improve the performance of our overnight batch schedule. The results were disastrous so it was switched off again because no one had the time to investigate why. Much of the time, that’s the the day-to-day reality of life as a working DBA. Periodically I would hear about someone trying this wonderful feature with negative results so it was never used as much as Oracle must have hoped.

As Oracle extended the server's capabilities over time to include parallel DML and the like, the name of this functionality has been changed to Parallel Execution, but you'll hear people using both names.

I think that one of the reasons why Parallel Execution is not used at many sites is that Oracle’s basic Instance Architecture works so well. One of the foundations of the architecture is multiple processes running concurrently that use multi-CPU servers effectively. In a typical Production OLTP system there will be a large number of concurrent sessions. In most configurations, each will have their own dedicated server process to communicate with the instance. When all of these server connection processes are considered, the reality is likely to be that you are using your available CPU resource quite effectively, particularly when you consider the various background processes as well!

However, what about those situations when you have a resource intensive job to run, whether it be a long running report query, large data load or perhaps an index creation? That's where parallel execution can prove extremely useful. It allows the server to take a single large task, break into separate streams of work and pass those streams to parallel execution (PX) slave processes for completion. Because the PX slaves are separate processes (or threads in a Windows environment), the operating system is able to schedule them and provide timely CPU resource in the same way that it would schedule individual user sessions. In fact, each PX slave is just like a normal dedicated server connection process in many ways so it’s like setting four or eight normal user sessions to work on one problem. Of course, those ‘users’ need to behave in a co-ordinated manner (probably unlike most real users!).

As Oracle's parallel capabilities have been developed, most tasks can be executed in parallel now. According to the 9.2.0.5 documentation, Oracle supports parallel execution of the following operations: -

Access methods - For example, table scans, index fast full scans, and partitioned index range scans.
Join methods - For example, nested loop, sort merge, hash, and star transformation.
DDL statements - CREATE TABLE AS SELECT, CREATE INDEX, REBUILD INDEX, REBUILD INDEX PARTITION, and MOVE SPLIT COALESCE PARTITION
DML statements - For example, INSERT AS SELECT, updates, deletes, and MERGE operations.
Miscellaneous SQL operations - For example, GROUP BY, NOT IN, SELECT DISTINCT, UNION, UNION ALL, CUBE, and ROLLUP, as well as aggregate and table functions.

However, Unless a task is likely to run for a fair amount of time (minutes or hours, rather than seconds) there's not much point in splitting it into parallel streams. The process initialisation, synchronisation and messaging effort might take longer than the original single-threaded operation!

In practice, you can simplify the possibilities for the SELECT statements that I’m going to focus on to those that include one of the following: -

Full Table Scans
Partition Scans (Table or Indexes)

Or, to put it another way, those operations that are likely to process significant volumes of data. There wouldn’t be any significant benefits to having multiple processes retrieving a handful of rows via a selective index.

Technical Papers Utilities and Scripts Book Reviews Links
My Resume Fun & Games Email Home