GotoDBA Troubleshooting Analyzing Raw Trace Files

Analyzing Raw Trace Files

Many times we need to understand what our server process does. When we run an application flow or in cases we cannot get or analyze a specific query, we turn tracing on and analyze the trace file.
When tracing a session, we usually use the tkprof command line utility to format the file so it is much more readable and easy to follow. The tkprof utility also allows us to sort the content of the trace file and filter recursive calls.
However, there are some things that exist in the raw trace file and are missing from the formatted file after tkprof. In this post I’ll explain the main things we see in the raw trace file and how they can help us.

There are several things that are missing from the formatted file, I’ll address two of them here: wait events and bind variables.
To trace a session we use the DBMS_MONITOR package. It allows us to enable and disable SQL trace for our session, other sessions or the entire database. It also has parameters to allow the capture of wait events (on by default) and the values of bind variables (off by default). The wait events information is shown in the formatted file after running tkprof, but the bind variables don’t.
The reason for that (and the second thing missing from the formatted file) is that tkprof aggregates the queries from the trace. If a query was executed 3 times, in the trace we’ll see it 3 times, but in the tkprof formatted file we will see the query once with 3 executions. So we don’t really see the queries one by one in the order of execution, and I don’t see the binds (because the query is only shown once). The same is true about the wait events during the query. We will see the total waits per wait event and the number of waits, but we won’t see the details (such as file# and block# for “db file sequential read”).
If we need to analyze the bind variables, the details of the wait events or simply see the real chronological flow of the session, we’ll have to dig in the trace file itself.
For demonstration, I created a table called tracetbl containing 2 columns: id (number) and name (varchar2) and inserted 1000 rows with id 1 to 1000. I enabled trace, and executed these commands (the parameters for dbms_monitor are SID and SERIAL#, nulls mean the current session, and the following trues are to enable bind and waits capture):

SQL> exec dbms_monitor.session_trace_enable(null,null,true,true);
PL/SQL procedure successfully completed.
SQL> variable id number;
SQL> exec :id:=5;
PL/SQL procedure successfully completed.
SQL> select * from tracetbl where id=:id;
        ID NAME
---------- --------------------
        5 name=5
SQL> exec :id:=2000;
PL/SQL procedure successfully completed.
SQL> select * from tracetbl where id=:id;
no rows selected
SQL> exec dbms_monitor.session_trace_disable;
PL/SQL procedure successfully completed.

The trace file generated by this session was about 2KB and 161 lines long, so I won’t post all of it, but when I look for the tracetbl table, this is the part in the trace which includes the query (in line 51, before that I could see the commands to enable the trace and the PL/SQL to set the bind variable value):

PARSING IN CURSOR #3084330624 len=35 dep=0 uid=68 oct=3 lid=68 tim=1434479573653924 hv=1169955350 ad='38d0ba78' sqlid='9d2mr0d2vs6hq'
select * from tracetbl where id=:id
END OF STMT
PARSE #3084330624:c=1000,e=402,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1434479573653922

The first thing to look for is the cursor#. This number will represent this specific SQL cursor until it is closed (and then the number can be reused). In our case it is 3084330624. You can also see the SQL_ID of this query at the end of the line. The “tim” represent the time. I don’t know how to convert it to real time, but it doesn’t really matter, as we continue reading the file we will see the “tim” value for every command. Subtract the “tim” of the later command from the earlier one to find the time difference in microseconds.
After we have the cursor#, we can skip recursive calls that might appear (for dynamic sampling for example) and look for the next time this cursor appears:

BINDS #3084330624:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=00 csi=00 siz=24 off=0
kxsbbbfp=b7cacbb8  bln=22  avl=02  flg=05
value=5
EXEC #3084330624:c=3000,e=43002,p=0,cr=12,cu=0,mis=0,r=0,dep=0,og=1,plh=564910428,tim=1434479573696967
WAIT #3084330624: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1434479573697014
FETCH #3084330624:c=0,e=42,p=0,cr=7,cu=0,mis=0,r=1,dep=0,og=1,plh=564910428,tim=1434479573697079
WAIT #3084330624: nam='SQL*Net message from client' ela= 215 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1434479573697321
FETCH #3084330624:c=0,e=14,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=564910428,tim=1434479573697356
STAT #3084330624 id=1 cnt=1 pid=0 pos=1 obj=81718 op='TABLE ACCESS FULL TRACETBL (cr=8 pr=0 pw=0 time=34 us cost=3 size=30 card=1)'
WAIT #3084330624: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1434479573697391
*** 2015-06-16 21:33:11.020
WAIT #3084330624: nam='SQL*Net message from client' ela= 17323489 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1434479591020890
CLOSE #3084330624:c=0,e=9,dep=0,type=0,tim=1434479591020979

As you can see, the first part is the binding of the bind variable. If there are more than one, we will see them one after the other, starting with Bind#0 for the first bind variable. In this case we see that Bind#0 (which is the only one) gets the value 5.
After that we see EXEC for this cursor, and this is the actual execution. The other lines are wait events with information and FETCH of the data back to the client. The last line here is closing the cursor. Now the cursor is closed and the next execution will parse the SQL once again with a new cursor# (in my demo it got the same number).
So If I look for the cursor number again, I see this:

PARSING IN CURSOR #3084330624 len=35 dep=0 uid=68 oct=3 lid=68 tim=1434479596296705 hv=1169955350 ad='38d0ba78' sqlid='9d2mr0d2vs6hq'
select * from tracetbl where id=:id
END OF STMT
PARSE #3084330624:c=0,e=62,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=564910428,tim=1434479596296704
BINDS #3084330624:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=00 csi=00 siz=24 off=0
kxsbbbfp=b7cacbb8  bln=22  avl=02  flg=05
value=2000
EXEC #3084330624:c=0,e=56,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=564910428,tim=1434479596296801

As you can see, the trace shows the SQL text, the PARSE operation and the BIND (this time the value is 2000). Then Oracle executes the SQL (and after that the wait events and fetch of the rows which I didn’t copy here).
Regarding wait events, during the execution we might see wait events. As I said, in the tkprof output file we will see only the event, how many times we were waiting on it and how much time. In the trace file we will see something like this:

WAIT #3083894392: nam='db file sequential read' ela= 12 file#=1 block#=385 blocks=1 obj#=42 tim=1434480464660869
WAIT #3083894392: nam='db file sequential read' ela= 5 file#=1 block#=62770 blocks=1 obj#=42 tim=1434480464660899
WAIT #3083190816: nam='db file sequential read' ela= 7 file#=1 block#=577 blocks=1 obj#=62 tim=1434480464661452

Here you can see that every event appears in a different line and we have the event parameters (such as file number, block number and number of blocks in this case). This information can be very useful, and an example of that and an investigation I did lately will be in a different post.
To sum things up, the raw trace file contains all the SQL and PL/SQL statements in the actual order they were executed and without any aggregation. It also contains all bind variable data for each execution (providing we enabled it when we enabled the trace) and all wait events with the relevant parameters. And this helps us in certain cases to investigate the application and Oracle behavior.
Hope you enjoyed it,
Liron

2 thoughts on “Analyzing Raw Trace Files”

  1. Are there any tools, awk scripts to parse the .trc file for a specific SQL, cursorid or for all in a trc file that differences the tim entry so longer request can be investigated further.

    1. Hi Kevin,
      For parsing the trace files you can use tkprof. It aggregates the information per query and gives quite a lot of information about the time, buffer gets, physical read, etc.
      Liron

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Post