Saturday, November 3, 2012

ODI Monitoring scripts

I have included some usefule ODI monitoring scripts (if you want the abridged version of this blog post).

I haven't been blogging in a while (it's been crazy), but I wanted to share some information on ODI.

I have been working on trying to monitor ODI (Oracle Data Integrator).  ODI is a somewhat recent Oracle purchase, and it has a client GUI that is used by the developers.

Me (like many of you), are DBA's, and we want to go into the database to see what is happening.  We eithor don't have access to the GUI, or we don't want access. 

ODI is a great tool used for transforming data.  It can be used to build sql statments that are executed directly in the database.  This makes it a bit different from a tool like Datastage that runs sql remotely.

Here is the first sql I was able to come up with.  It will tell you information about the load plans that have been run. You need to qualify the tables with the owner of the ODI repository..

SELECT SLI.I_LP_INST AS "Load Plan Instance #"
     , SLR.NB_RUN AS "Load Plan Run #"
     , SLI.LOAD_PLAN_NAME AS "Load Plan Name"
     , SLR.CONTEXT_CODE AS "Source System"
     , SLR.STATUS AS "Load Plan Status"
     , SLR.RETURN_CODE AS "error code"
     , CASE WHEN SLR.END_DATE IS NULL 
            THEN TRUNC(ROUND((NVL(SLR.END_DATE , SYSDATE) - SLR.START_DATE)*86400) / 3600) || ':' || 
                 LPAD(TRUNC(MOD(ROUND((NVL(SLR.END_DATE , SYSDATE) - SLR.START_DATE)*86400), 3600) / 60), 2, 0) || ':' || 
                 LPAD(MOD(ROUND((NVL(SLR.END_DATE , SYSDATE) - SLR.START_DATE)*86400), 60), 2, 0)
            ELSE TRUNC(SLR.DURATION / 3600) || ':' || LPAD(TRUNC(MOD(SLR.DURATION, 3600) / 60), 2, 0) || ':' || LPAD(MOD(SLR.DURATION, 60), 2, 0) 
       END AS "Load Time"
     , SLR.START_DATE
     , SLR.END_DATE
     , substr(to_char(SLR.START_DATE,'mm/dd/yy:hh24'),1,11) start_date_hour
  FROM SNP_LP_INST SLI
  JOIN SNP_LPI_RUN SLR ON SLI.I_LP_INST = SLR.I_LP_INST
where 'JRNL_LOAD'=sli.load_plan_name

I was able to include this in an apex report, and it display some of the history.

The next SQL igives you detail of the scenarios, for the load plan. The input is the Load plan



SELECT SLI.Load_plan_name as "Load Plan Name",
      SUBSTR(SLR.CONTEXT_CODE, 9, 5) AS "Source System",
      SLS.LP_STEP_NAME AS "Target Table",
      SLS.scen_name as "scenario name",
     TRUNC(SUM(SSTL.TASK_DUR) / 3600) || ':' ||
             LPAD(TRUNC(MOD(SUM(SSTL.TASK_DUR), 3600) / 60), 2, 0) || 
            ':' || LPAD(MOD(SUM(SSTL.TASK_DUR), 60), 2, 0) AS "Load Time"
     , SST.SESS_NO AS "Session Number"
     , SLSL.start_date as "Start Time"
     , SLSL.End_date as "End Time"
     , sum(sstl.nb_ins) as "Rows Inserted"
     , sum(sstl.nb_upd) as "Rows Updated"
     , sum(sstl.nb_del) as "Rows Deleted"
     , sum(sstl.nb_err) as "Rows Errors"
     , case 
        when (sum(sstl.nb_ins) + sum(sstl.nb_upd)) > 0 then trunc(sum(sstl.task_dur)/(sum(sstl.nb_ins) + sum(sstl.nb_upd)) ,4)
        else 0
       end as "Throughput"
  FROM SNP_LP_INST SLI
  JOIN SNP_LPI_STEP SLS 
                  ON SLI.I_LP_INST = SLS.I_LP_INST
  JOIN SNP_LPI_STEP_LOG SLSL
        ON SLS.I_LP_STEP = SLSL.I_LP_STEP
      AND SLS.I_LP_INST = SLSL.I_LP_INST
  JOIN SNP_SESS_TASK SST
            ON SST.SESS_NO = SLSL.SESS_NO
  JOIN SNP_SESS_TASK_LOG SSTL
            ON SSTL.SCEN_TASK_NO = SST.SCEN_TASK_NO 
           AND SST.SESS_NO = SSTL.SESS_NO
  JOIN SNP_LPI_RUN  SLR 
            on SLI.I_LP_INST = SLR.I_LP_INST
 WHERE (1=1)
  AND SLSL.I_LP_INST = :P6_SCENARIO
   AND SLS.LP_STEP_TYPE = 'RS'
--   AND SLSL.STATUS IN ('M','D')
 GROUP BY SUBSTR(SLR.CONTEXT_CODE, 9, 5),
           SLSL.start_date,SLSL.end_date,SLI.load_plan_name,
          SLS.scen_name,SLS.LP_STEP_NAME, SST.SESS_NO

Finally, this is the last query. This query takes the task number as an input, and will display the detail for all the tasks contained in a scenario.
SELECT SST.TASK_NAME2 AS "Session Name"
     , SST.TASK_NAME3 AS "Task Name"
     , CASE WHEN SSTL.TASK_END IS NULL 
            THEN TRUNC(ROUND((NVL(SSTL.TASK_END , SYSDATE) - SSTL.TASK_BEG)*86400) / 3600) || ':' || 
                 LPAD(TRUNC(MOD(ROUND((NVL(SSTL.TASK_END , SYSDATE) - SSTL.TASK_BEG)*86400), 3600) / 60), 2, 0) || ':' || 
                 LPAD(MOD(ROUND((NVL(SSTL.TASK_END , SYSDATE) - SSTL.TASK_BEG)*86400), 60), 2, 0)
            ELSE TRUNC(TASK_DUR / 3600) || ':' || LPAD(TRUNC(MOD(TASK_DUR, 3600) / 60), 2, 0) || ':' || LPAD(MOD(TASK_DUR, 60), 2, 0) 
       END AS "Load Time"
     , substr(sst.def_context_code,9,5) "Context"
     , SSTL.TASK_BEG AS "Start Time"
     , SSTL.TASK_END AS "End Time"
     , SSTL.NB_DEL AS "Rows Deleted"
     , SSTL.NB_UPD AS "Rows Updated"
     , SSTL.NB_INS AS "Rows Inserted"
     , SSTL.NB_ERR AS "# Of Errors"
     , SST.SESS_NO 
     , sst.scen_task_no
/* UNCOMMENT TO GET SQL EXECUTED FOR THIS STEP */
  FROM SNP_SESS_TASK SST,
      SNP_SESS_TASK_LOGv SSTL 
 WHERE (1=1)
   AND SST.SESS_NO =:P7_TASK   
   AND SSTL.TASK_STATUS IN ('D','M','R')
   AND SSTL.SCEN_TASK_NO = SST.SCEN_TASK_NO 
   AND SST.SESS_NO = SSTL.SESS_NO
 

Friday, October 5, 2012

Anayzing query Cost example

Well,
  Here is the issue I've been dealing with.  The query cost doesn't stay consistent, and I was wondering if a profile would help keep it stable, and if it keeps things better or worse.

  I'm sure you have all run into this.  For some reason your cardinality can't be well estimated, the optimizer changes plans on you, and you want to know if a profile will help or hurt.

  The good news I had the perfect test case.  The query was part of a data load, and there was a driving table with a varying number of rows in it.  This is how I went about analyzing.

First I created a table to store the results.

create table bgrenn.mytable
 (ROW_COUNT                                          NUMBER,
 PLAN_HASH_VALUE                                    VARCHAR2(15)
 COST                                               number);




Next I took a copy of my driving table ( a full size table). I used this to create a smaller copy of the table.



declare


row_count number:= 0;
v_statement_id varchar2(10);
v_plan_hash_value varchar2(15);
v_cost number;
v_statement varchar2(4000);

begin

for row_count in 0..100000  loop


execute immediate 'drop table BGRENN.TMP_DRIVER purge';
execute immediate 'create table   BGRENN.TMP_DRIVERT as select * from  BGRENN.TMP_DRIVERB where rownum<' ||  to_char(row_count,'9999999');

dbms_stats.gather_table_stats('BGRENN','TMP_DRIVERT');

v_statement_id := to_char(row_count,'99999');

v_statement := 'explain plan SET STATEMENT_ID = ''' || v_statement_id || '''  for ' ||
'select   * ' ||
' from BGRENN.TMP_DRIVERT TMP_DRIVERA  ' ||
' INNER JOIN BGRENN.TABA TABA ON TMP_DRIVERA.ID=TABA.ID  ' ||
' INNER JOIN BGRENN.TABB ON TABA.GCC_ID=TABA.ID AND TABA.LOCN <> 8031431  ' ||
' INNER JOIN BGRENN.TABC TABC ON TABA.ID=TABC.ID AND TABC.id = 1168583  ' ||
' INNER JOIN BGRENN.TABD TABD ON TABA.ID=TABD.ID  ' ||
' INNER JOIN BGRENN.TABE TABE ON TABD.ID=TABE.ID  ' ||
' INNER JOIN BGRENN.TABF TABF ON TABD.ID=TABF.ID  ' ||
' INNER JOIN BGRENN.TABG TABG ON TABF.ID=TABG.ID  ' ||
' INNER JOIN BGRENN.TABH TABH ON TABG.ID=TABH.ID AND TABH.SEQ_NBR < 500  ' ||
' INNER JOIN BGRENN.TABI ON TABC.ID=ID  ' ||
' INNER JOIN BGRENN.TABJ TABJ ON TABC.ID=TABJ.ID  ' ||
' INNER JOIN BGRENN.TABK TABK ON TABJ.ID=TABK.ID and TABK.id in ( 1221589, 1219009, 1191882, 1221590, 1171956)  ' ||
' LEFT OUTER JOIN ERD.TABL TABL ON TABH.ID=TABL.ID  ' ||
' LEFT OUTER JOIN ERD.TABM TABM ON TABE.ID=TABM.ID  ' ||
' where (1=1)';

dbms_output.put_line(v_statement);

execute immediate v_statement;



SELECT substr(plan_table_output,18,12) into v_plan_hash_value  FROM TABLE(dbms_xplan.display(statement_id => v_statement_id)) where rownum <2;
select cost into v_cost from plan_table where id=0 and rownum<2 and statement_id=v_statement_id;

insert into bgrenn.mytable values(row_count,v_plan_hash_value,v_cost);

delete from plan_table where statement_id=v_statement_id;
commit;
end loop;

end;
/                                            number);




This produced a set of rows in the table with the cost.

I then copied the table, installed a profile and reran.

After joining the 2 tables on row count I created an "r" program and analyzed the results.
Here is the program.


# open Libarary psych for functions
library(psych)

#  open file
query_data <- read.table("c:/r/data/query_output.txt", header=T)

#what are the variables
describe (query_data)


pdf("c:/r/data/querydata.pdf")
plot(query_data$ROW_COUNT,query_data$orig_cost,type='l',col="red")
lines(query_data$ROW_COUNT,query_data$new_cost,type='l',col="green")
dev.off()


And here is the output ..  The red is the orginal plan, and the green is the plan with the profile. I can see that the cost of the profile plan remains more consistent. and is probably a better choice.

Wednesday, September 5, 2012

"enq: CR - block range reuse ckpt" and the recycle bin

I decided to write this blog entry because we ran into an issue over the weekend.  The process that normally runs fine, was running slow.. "like molasses" was the comment from one of the folks.  After digging into an AWR report I found one of the top waits for the sql was


enq: CR - block range reuse ckpt

OK.. what is that, and why ?   I didn't find a whole lot except that it indiciated contention on blocks from multiple processes trying to update blocks.. hmmm..

I looked further in the AWR and saw the top Segment for Logical reads was "RECYCLEBIN$"

and one of the top queries was.

"delete from RecycleBin$ where bo=:1"


Well since the process was finished I did my own test.. I created a tablespace and created an object in it.. dropped the object, added the to table, then dropped the table.. over and over again, until the number of objects in  the tablespace remained constant.  I then created the table again, and started to let it grow (so it would have to free up the recycle bin to get space)... And what I saw in the top 5 wait events was again.....

enq: CR - block range reuse ckpt

I wanted to document this for others that may hit this.. If a search for this wait event brought you to my blog, Please check your recycle bin and make sure that it isn't cleaning itself out to make room causing this wait event ...