


Session Tracing and Wait Events

Tracing an application that uses parallel execution is a little more complicated than tracing non-parallel statements, for a few reasons.

  • A trace file will be generated for each slave process as well as for the query co-ordinator.
  • As I’ve just mentioned, prior to 10g it’s time-consuming to identify precisely which application operation a PX slave is processing.
  • The trace files for the slave processes may be created in background_dump_dest, rather than the standard user_dump_dest. This is version-dependent; the trace file for the query coordinator will be in user_dump_dest in any case.
  • As a result of all the synchronisation and message passing that occurs between the different processes, there are a number of additional wait events.

You won’t find too much specific information about tracing Parallel Execution because it’s based on exactly the same principles as standard tracing, with the few differences mentioned above. The biggest problem tends to be the large number of trace files that you have to analyse! However, here's an excellent guide to PX tracing and 10g from Mark Rittman's Blog.
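To give a feel for what that looks like in practice, here is a minimal sketch of tracing a parallel query with event 10046 at level 8, so that wait events are recorded. The table name, degree and identifier are made up for illustration; the useful point is that the trace setting is inherited by the slaves the session spawns, so expect one trace file per slave on top of the query co-ordinator's own file.

  ALTER SESSION SET tracefile_identifier = 'px_test';  -- gives the QC trace file a recognisable name (doesn't rename the slave files)
  ALTER SESSION SET events '10046 trace name context forever, level 8';

  -- Any parallel statement will do; big_table and the DOP of 4 are hypothetical
  SELECT /*+ PARALLEL(t, 4) */ COUNT(*) FROM big_table t;

  ALTER SESSION SET events '10046 trace name context off';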

Parallel-specific Wait Events

The first thing to get used to when monitoring PX, whether it be using Statspack at the high level or event tracing at the low level, is that you are going to see a lot more wait events, including types that you won’t have seen before. Here are some of the Parallel-specific wait events.

Events indicating Consumers are waiting for data from Producers

  • PX Deq: Execute Reply
  • PX Deq: Table Q Normal

Oracle’s documentation states that these are idle events because they indicate the normal behaviour of a process waiting for another process to do its work, so it’s easy to ignore them. However, if you have excessive wait times on these events it could indicate a problem in the slaves. To give you a real-world example, here is the top timed events section of a Statspack report from a production system I worked on.

  Event                        Waits  Timeouts  Time (s)  Avg wait (ms)  Waits/txn
  direct path read         2,249,666         0   115,813             51       25.5
  PX Deq: Execute Reply      553,797    22,006    75,910            137        6.3
  PX qref latch               77,461    39,676    42,257            546        0.9
  library cache pin           27,877    10,404    31,422           1127        0.3
  db file scattered read   1,048,135         0    25,144             24       11.9

The absolute times aren’t important here, just the events. First, it’s worth knowing that PX slaves perform direct path reads rather than db file scattered reads. You may already be used to direct path reads because they’re used with temporary segments for example. On this system, which was a European-wide Data Warehouse, we were performing long-running SELECT statements as part of the overnight batch run, so a high level of disk I/O was inevitable. (Whether an average wait time of 51 ms is acceptable when you’ve spent a small fortune on a Hitachi SAN is another matter!)
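If you want to see how the same events look on your own system without running a Statspack report, something along these lines gives the raw cumulative figures. It's only a sketch; in these versions v$system_event reports times in centiseconds, hence the arithmetic.

  SELECT event, total_waits, total_timeouts,
         ROUND(time_waited / 100)  time_waited_secs,
         ROUND(average_wait * 10)  avg_wait_ms
  FROM   v$system_event
  WHERE  event IN ('direct path read', 'PX Deq: Execute Reply',
                   'PX qref latch', 'library cache pin',
                   'db file scattered read')
  ORDER  BY time_waited DESC;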

The next event is PX Deq: Execute Reply, which Oracle considers to be an idle event as I’ve mentioned. So we ignore that and move down to the next event. The PX qref latch event can often mean that the Producers are producing data quicker than the Consumers can consume it. On this particular system, very high degrees of parallelism were being used during an overnight batch run so a great deal of messaging was going on. Maybe we could increase parallel_execution_message_size to try to eliminate some of these waits or we might decrease the DOP.

But the real problem that we were able to solve was the next event – library cache pin. This event represents Oracle trying to load code into the Library Cache so you wouldn’t normally expect to see a significant percentage of wait time for this event unless the Shared Pool is really struggling (which it was on this system).

So next we drill down and try session tracing to establish the source of these events. Initially I was unsuccessful in tracking them down until I realised that the PX Deq: Execute Reply was a useful hint. The fact is that many of these wait events were happening in the PX slaves, and many of the PX Deq: Execute Reply events were caused by the QC waiting for the PX slaves, which were themselves waiting on library cache pins! So sometimes idle events are important.
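One way of seeing that pattern while it's happening is to line up what the query co-ordinator and its slaves are waiting on at the same moment. This is only a sketch against v$px_session and v$session_wait (v$px_session includes a row for the QC itself, hence the filter).

  SELECT p.qcsid,
         qw.event                 qc_event,
         p.sid                    slave_sid,
         sw.event                 slave_event,
         sw.seconds_in_wait
  FROM   v$px_session   p,
         v$session_wait qw,
         v$session_wait sw
  WHERE  qw.sid = p.qcsid
  AND    sw.sid = p.sid
  AND    p.sid <> p.qcsid        -- slave rows only
  ORDER  BY p.qcsid, p.sid;

In the situation I've just described, you would expect to see PX Deq: Execute Reply against the QC while the slave rows show library cache pin.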

Eventually it turned out to be a pretty bad bug in earlier versions of 9.2 (fixed in 9.2.0.5) that caused some of our 2-minute SQL statements to occasionally take 2 hours to run. (Yes, that really does say 2 hours.) Anyway, back to more wait events.

Events indicating Producers are quicker than Consumers (or the QC)

  • PX qref latch

I’ve found that PX qref latch is one of the events that a system can spend a lot of time waiting on when using Parallel Execution extensively (as you can see from the earlier Statspack example). Oracle suggest that you could try to increase parallel_execution_message_size as this might reduce the communications overhead, but this could make things worse if the consumer is just taking time to process the incoming data.
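For what it's worth, checking and changing the parameter looks something like this. Treat it as a sketch: parallel_execution_message_size is a static parameter, so the change needs an spfile and an instance restart, and 4096 is just an illustrative value.

  SELECT name, value
  FROM   v$parameter
  WHERE  name = 'parallel_execution_message_size';

  ALTER SYSTEM SET parallel_execution_message_size = 4096 SCOPE = SPFILE;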

Synchronisation Message Events

  • PX Deq Credit: need buffer
  • PX Deq: Signal Ack
  • PX Deq: Join Ack

Although you will see a lot of waits on these synchronisation events (the slaves and the QC need to communicate with each other), the time spent should not be a problem. If it is, perhaps you have an extremely busy server that is struggling to cope, in which case reducing the Degree of Parallelism and parallel_max_servers would be the best approach.
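If you do decide to rein things in, the changes themselves are straightforward; the table name, degree and server limit below are purely illustrative.

  -- Reduce the default degree on a (hypothetical) heavily-used table
  ALTER TABLE big_table PARALLEL (DEGREE 4);

  -- Cap the total number of PX slaves the instance will start
  ALTER SYSTEM SET parallel_max_servers = 16;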

Query Coordinator waiting for the slaves to parse their SQL statements

  • PX Deq: Parse Reply

Long waits on this event would tend to indicate problems with the Shared Pool as the slaves are being delayed while trying to parse their individual SQL statements. (Indeed, this was the event I would have expected to see as a result of the bug I was talking about earlier but the library cache pin waits were appearing in the Execute phase of the PX slave’s work.) Again, the best approach is to examine the trace files of the PX slaves and track down the problem there.
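To track the slave trace files down, the first step is simply to find the dump destinations; as noted earlier, the slave files may be in background_dump_dest rather than user_dump_dest, depending on version. After that, each slave's trace file can be run through tkprof in the usual way.

  SELECT name, value
  FROM   v$parameter
  WHERE  name IN ('user_dump_dest', 'background_dump_dest');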

Partial Message Event

  • PX Deq: Msg Fragment

This event indicates that parallel_execution_message_size may be too small. Maybe the rows that are being passed between the processes are particularly long and the messages are being broken up into multiple fragments. It’s worth experimenting with message size increases to reduce or eliminate the impact of this.
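A simple way to judge whether a change has helped is to compare the cumulative figures for the event before and after the experiment, for example (times in centiseconds, as before):

  SELECT event, total_waits, ROUND(time_waited / 100) time_waited_secs
  FROM   v$system_event
  WHERE  event = 'PX Deq: Msg Fragment';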
