Ever run into the dreaded ORA-01555 error?  It usually happens eight hours into your ten hour ETL run.  With automatic undo management in Oracle 9i, this happens less often, but can still be a major headache.  So you look at your ETL to try to find ways to improve and discover code like this:

declare
  commit_cnt    NUMBER := 0;
begin

for i IN (SELECT 10,000,000 rows FROM random_table)
loop

    << DO SOME COMPLEX PL/SQL WORK >>

    commit_cnt := commit_cnt + 1;

    if ( MOD (commit_cnt, 10000) = 0 ) then

      COMMIT;
    end if;
  end loop;
end;

It only takes a little while to process 10,000 rows, so why does the ORA-01555 error keep occurring.  It’s because Oracle maintains read consistency for the entire duration of the execution.  So putting the COMMIT within the loop is actually making the ORA-01555 more likely than just running through the 10,000,000 rows with a single COMMIT.

Say the ETL kicks off at SCN 100.  After 10,000 rows are processed, the transaction is committed at SCN 200.  At this point, other transactions and the ETL session itself are now free to overwrite the rollback information.  Later in the ETL, a row needs to be updated that is part of a block that was already updated during the SCN 200 commit.  If Oracle cannot derive the version of the block that existed at SCN 100 through existing rollback information, the dreaded ORA-01555 error occurs.

How can this be avoided?  One, the size of the undo tablespace can always be increased.  With today’s demand on IT departments to do more with less, this might not be the best option.  Two, ensure that all of the rows in the block are processed at the same time.  If you don’t revist the same block, this ORA-01555 cannot occur.  Three, remove the commits within the loop.  If the commit does not occur, that rollback information is still active, so the data at SCN 100 can always be retrieved.

Generally, I opt for the third option.  It makes coding easier.  It prevents having to restart or undo ETL in the middle of the process, which can be tricky depending on the complexity of the PL/SQL.  And it keeps commits tied to business requirements, which is where they should be.

Tom Kyte has an more thorough explanation of the ORA-01555 error: http://asktom.oracle.com/pls/asktom/f?p=100:11:7980017829814105::::P11_QUESTION_ID:275215756923

 

 

 

 

Comment