Monday, December 8, 2014

Insert slows down with 'cell single block physical read' and 'read by other session' events.

Recently I got a chance to work on a performance tuning issue for an application whose performance suddenly degraded on a database running on an Exadata machine. On investigation I found that a large number of concurrent sessions from the application were trying to insert into the same table, but were spending most of their time on the 'read by other session' and 'cell single block physical read' wait events. The strange thing was that the insert was a simple statement, with no underlying SELECT query and no triggers on the table being inserted into. So what could have caused an insert that had earlier been completing in fractions of a second to suddenly perform so poorly?

I compared AWR snapshots from periods of good and bad performance. One thing was clear: on the day the performance issue started, the application was issuing far more concurrent inserts against the same table than before. But should Oracle really limit the concurrency of an application? Yes, under certain scenarios.

1.) If the table has a primary key and the application tries to insert the same record from multiple sessions, the second session waits on 'enq: TX - row lock contention' until the blocking session commits, at which point the blocked session immediately gets a unique key violation (ORA-00001). This was not the case in our issue, as there were no symptoms of such row lock contention (sketches of the checks I used for this and the other scenarios appear right after this list).

2.) An index leaf block becomes hot because of a right-hand growing index. This normally happens when the index is defined on a sequence-generated value. Such contention can be reduced by hash partitioning the index, because the hash algorithm splits the single index tree into multiple separate trees and spreads the inserts across them (a DDL sketch appears after this list). Even that was not our case, as there were no such symptoms.

3.) In an Exadata environment you would also want to check whether the storage cell disks are performing fine. We examined this and there was no issue with the Exadata storage.

4.) 'gc buffer busy' waits in a RAC environment can cause severe performance degradation when multiple sessions from different nodes request the same block. In RAC, if an instance does not find the buffer in its local SGA, the block has to be transferred from the holding instance's SGA through cache fusion. If the instance holding the block is busy, or the block is already in use, the requesting instance has to wait and 'gc buffer busy' waits show up. I did not find any such high waits during the time we had the performance issue.
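To rule out scenarios 1 and 4, and to get a database-side view of cell read latency for scenario 3, I used checks along the lines of the sketches below. The views are standard, but the one-hour window is arbitrary and should be adjusted to your problem period:

-- Sketch: any row lock contention (scenario 1) or gc buffer busy waits (scenario 4) in the last hour?
SELECT event, COUNT(*) samples
  FROM gv$active_session_history
 WHERE (event = 'enq: TX - row lock contention' OR event LIKE 'gc buffer busy%')
   AND sample_time > SYSTIMESTAMP - INTERVAL '60' MINUTE
 GROUP BY event
 ORDER BY samples DESC;

-- Sketch: latency distribution of single block reads served by the cells (scenario 3, database-side view only)
SELECT inst_id, wait_time_milli, wait_count
  FROM gv$event_histogram
 WHERE event = 'cell single block physical read'
 ORDER BY inst_id, wait_time_milli;

For scenario 2, the hash partitioning remedy would look something like the sketch below; the table, column and partition count are made up purely for illustration:

-- Sketch: split a right-hand growing index into multiple trees to spread leaf block contention
CREATE INDEX orders_id_idx ON orders (order_id)
  GLOBAL PARTITION BY HASH (order_id) PARTITIONS 16;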

Now we need to drill down into how Oracle works when a session requests to insert a record into a data block. Each time a session inserts a record, it has to find a block with enough free space to hold that record. Under ASSM, Oracle tracks the free space state of blocks in bitmap blocks stored within the segment (under MSSM it uses freelists in the segment header). To further speed up the search for free blocks, Oracle also caches this information in the space search cache.

But what if a block marked as free does not contain enough free space to hold the record being inserted? The user process then spends time searching for a block with sufficient space, and every other session inserting into the same table ends up scanning the same large set of blocks. If those blocks are not already in the buffer cache, the sessions perform unnecessary physical reads. And if a block is currently being read into the buffer cache by another session, everyone else has to wait for that read to complete, which shows up as the 'read by other session' wait (a sketch for confirming this is shown below). In our case we finally found that the sporadically occurring issue matched 'INSERT slow on ASSM (Doc ID 1263496.1)'.
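A sketch of how to confirm that multiple sessions are piling up on the same blocks; for 'read by other session', P1 is the file number and P2 is the block number, and the 30 minute window is arbitrary:

-- Sketch: blocks with many concurrent waiters in recent ASH samples
SELECT p1 file_id, p2 block_id, current_obj#, COUNT(*) waiters
  FROM gv$active_session_history
 WHERE event = 'read by other session'
   AND sample_time > SYSTIMESTAMP - INTERVAL '30' MINUTE
 GROUP BY p1, p2, current_obj#
 ORDER BY waiters DESC;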

By performing the steps below I was able to establish that the issue occurred because of a bulk load: subsequent inserts from multiple sessions slowed down because they had to spend a long time finding a free block with enough space.

1.) Find all the database sessions and instances for which the insert is slow. We need to enable trace for 2-3 such sessions to collect more diagnostic information and rule out any possibility of an Oracle bug. All the commands below need to be executed as the SYS user. The sessions can be found with the query below (replace <owner> and <OBJECT_NAME> with your schema and table name), or the same information can be retrieved from OEM:
SELECT sess.inst_id, sess.sid, sess.serial#, sess.sql_id, sess.last_call_et,
       to_char(sess.logon_time, 'YYYY-MM-DD HH24:MI:SS') logon_since,
       sess.event,
       sess.status,
       sess.blocking_session,
       sess.blocking_instance,
       sess.program,
       sess.module,
       sess.machine
  FROM gv$session sess, gv$sql sq
 WHERE sess.status = 'ACTIVE'
   AND sess.username = '<owner>'
   AND sess.sql_id = sq.sql_id
   AND sess.inst_id = sq.inst_id
   AND upper(sq.sql_text) LIKE '%<OBJECT_NAME>%'
   AND sess.last_call_et > 1;

2.) Enable trace for one of the sessions undergoing poor performance (i.e. showing a lot of 'read by other session' and 'cell single block physical read' events). First find its OS process ID:

select spid from gv$process where addr in (select paddr from gv$session where sid = <session_id> and inst_id = <instance_number>);
Log in as SYSDBA to the node where the above session is currently running and enable the ASSM trace using the commands below:
ORADEBUG SETOSPID <spid value from above query>;
ORADEBUG UNLIMIT;
ORADEBUG EVENT 10320 TRACE NAME CONTEXT FOREVER, LEVEL 3;
ORADEBUG EVENT 10612 TRACE NAME CONTEXT FOREVER, LEVEL 1;
ORADEBUG TRACEFILE_NAME;

Keep the trace active for around 2-3 minutes and then disable it using the commands below:

ORADEBUG EVENT 10320 TRACE NAME CONTEXT OFF;
ORADEBUG EVENT 10612 TRACE NAME CONTEXT OFF;
EXIT;

Download the trace file whose name was shown by ORADEBUG TRACEFILE_NAME.

3.) Also collect an error stack for the session for which the insert is running slow. Log in as SYSDBA to the instance where the insert is slow and run the commands below:

sqlplus '/ as sysdba'
oradebug setospid <spid value>
oradebug unlimit
oradebug dump errorstack 3
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 3
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 1
!sleep 30
oradebug dump errorstack 3 
oradebug tracefile_name
quit

4.) Run the PL/SQL block below as SYSDBA from any instance (the owner and table name here are from our system; substitute your own) and capture the output:
set serveroutput on size unlimited
declare
  total_bl                  number;
  total_by                  number;
  unused_bl                 number;
  unused_by                 number;
  last_used_extent_file_id  number;
  last_used_extent_block_id number;
  last_used_block           number;
  unformatted_blocks        number;
  unformatted_bytes         number;
  fs1_bl                    number;
  fs1_by                    number;
  fs2_bl                    number;
  fs2_by                    number;
  fs3_bl                    number;
  fs3_by                    number;
  fs4_bl                    number;
  fs4_by                    number;
  full_bl                   number;
  full_by                   number;
begin
  dbms_space.unused_space('PDA01PRD', 'DM_SYSOBJECT_S', 'TABLE',
                          total_bl, total_by, unused_bl, unused_by,
                          last_used_extent_file_id, last_used_extent_block_id, last_used_block);
  dbms_space.space_usage('PDA01PRD', 'DM_SYSOBJECT_S', 'TABLE',
                         unformatted_blocks, unformatted_bytes,
                         fs1_bl, fs1_by, fs2_bl, fs2_by, fs3_bl, fs3_by, fs4_bl, fs4_by,
                         full_bl, full_by);
  dbms_output.put_line('| total_bl| total_by| unused_bl| unused_by| last_used_extent_file_id|last_used_extent_block_id| last_used_block');
  dbms_output.put_line('| '||lpad(total_bl,8)||'|'||
                       lpad(total_by,9)||'|'||
                       lpad(unused_bl,10)||'|'||
                       lpad(unused_by,10)||'|'||
                       lpad(last_used_extent_file_id,25)||'|'||
                       lpad(last_used_extent_block_id,26)||'|'||
                       lpad(last_used_block,16));
  dbms_output.put_line(' .');
  dbms_output.put_line('| unformatted_blocks| unformatted_bytes| fs1_bl| fs1_by| fs2_bl|fs2_by| fs3_bl| fs3_by| fs4_bl| fs4_by| full_bl| full_by');
  dbms_output.put_line('| '||lpad(unformatted_blocks,18)||'|'||
                       lpad(unformatted_bytes,18)||'|'||
                       lpad(fs1_bl,7)||'|'||
                       lpad(fs1_by,7)||'|'||
                       lpad(fs2_bl,7)||'|'||
                       lpad(fs2_by,7)||'|'||
                       lpad(fs3_bl,7)||'|'||
                       lpad(fs3_by,7)||'|'||
                       lpad(fs4_bl,7)||'|'||
                       lpad(fs4_by,7)||'|'||
                       lpad(full_bl,8)||'|'||
                       lpad(full_by,8));
end;
/
prompt FS1 means 0-25% free space within a block
prompt FS2 means 25-50% free space within a block
prompt FS3 means 50-75% free space within a block
prompt FS4 means 75-100% free space within a block

5.) If the value of FS1 is much higher than FS2, FS3 and FS4, then the slowness could be caused by a large number of blocks that do not have enough free space to accommodate the inserts. In this case, explicitly marking those blocks as full and increasing PCTFREE should resolve the issue. Increasing PCTFREE makes blocks get marked as FULL at a higher rate, but it also makes the segment grow faster and wastes space, so it should be set to an optimum value after weighing performance against space usage.

 EXECUTE DBMS_REPAIR.SEGMENT_FIX_STATUS('PDA01PRD','DM_SYSOBJECT_S');
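If you also decide to raise PCTFREE, the change itself is a one-liner. The value 20 below is only an assumption and needs to be tuned against your typical row size, keeping in mind that it applies to blocks formatted from then on, not retroactively:

-- Sketch: make blocks come off the free list earlier for future inserts
ALTER TABLE PDA01PRD.DM_SYSOBJECT_S PCTFREE 20;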

The same can also be confirmed by examining the ASSM trace file, where blocks from the FS1 category being marked as FULL can be clearly seen:
[ktspsrch] adddba dba:0x0100636b curl1:0x01006280 curl2:0x01002601
ktspfsrch: Returns: BlockDBA:0x0100637b
kdt_bseg_srch_cbk: examine dba=4.0x0100637b
kdt_bseg_srch_cbk: failed dba=4.0x0100637b avs=837 afs=0 tosp=837 full=1
ktspfupdst: state change tsn 4, objd 96400, objn 71401, blksz 8192
ktspfupdst: oldstate 2, newstate 1, delta bytes 1065
ktspfupdst: fdba: 0x01006280 offset: 251 nblks: 1 newstate: 1
kdt_bseg_srch_cbk:      mark full   --> the block is marked as full
ktspfsrch:Cbk didnot like 0x010063 

6.) If the above is not the case, then it could be an issue with the space search cache, which can be determined by examining the error stack traces taken in step 3.

Short stack dump:
ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1876<-sspuser()+112<-__sighandler()<-__poll()+47<-ssskgxp_poll()+40<-sskgxp_select()+263<-skgxpiwait()+3901<-skgxpwaiti()+1832<-skgxpwait()+162<-ossnet_wait_all()+241<-ossnet_wait()+146<-oss_wait()+991<-kfk_wait_on_oss_io()+406<-kfk_submit_oss_io_wtevt()+285<-kfk_submit_oss_io()+415<-kfk_submit_io()+105<-kfk_io1()+1068<-kfkRequest()+14<-kfk_transitIO()+1696<-kfioSubmitIO()+4777<-kfioRequestPriv()+199<-kfioRequest()+706<-ksfd_kfioRequest()+649<-ksfd_osmio()+1114<-ksfd_io()+2761<-ksfdread()+581<-kcfrbd1()+609<-kcbzib()+1888<-kcbgcur()+8274<-ktbgcur()+123<-ktspfsrch()+2569<-ktspscan_bmb()+898<-ktspgsp_main()+1817<-kdtgsp()+1576<-kdtgsph()+107<-kdtgrs()+1013<-kdtSimpleInsRow()+320<-qerltcSimpleSingleInsRowCBK()+46<-qerltcSingleRowLoad()+279<-qerltcFetch()+379<-insexe()+682<-opiexe()+5622<-kpoal8()+2076<-opiodr()+916<-ttcpip()+2210<-opitsk()+1717<-opiino()+966<-opiodr()+916<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<-__libc_start_main()+244<-_start()+36

If you see the space search functions (such as ktspscan_bmb and ktspfsrch) in stacks like the one above, then try disabling the space search cache and see whether performance improves. This action should be taken only after consultation with Oracle Support.
alter system set event="10262 trace name context forever, level 1";
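If feasible, the event can first be tried at session level for one of the affected sessions before changing it system wide (again, only under Oracle Support guidance; whether session scope is sufficient here is an assumption on my part):

-- Sketch: try the event for a single session first
alter session set events '10262 trace name context forever, level 1';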

Hope you find this article useful. Please let me know if you have any questions or suggestions.

References: INSERT Slow on ASSM (Doc ID 1263496.1), https://support.oracle.com
