Hi All,
Today, I got a case related to transaction behavior on SQL Server.
One of our Informatica ETL developers came back saying there is some problem with the SQL database.
I got on a call to try to understand what she was saying.
Problem description:
They have one table and are trying to load some data into it using the Informatica ETL tool.
In Informatica, they have written a simple transformation to load the data (truncate and load).
Table structure
==============
create table test_tbl
(
id varchar(100) null,
last_update_date datetime2(7) null
)
create unique clustered index pk_test_dbl on test_tbl(id asc);
go
For the sake of testing, they are loading 1000 records plus 2 additional records (intentionally, to reproduce a duplicate record).
Something like the below:
Truncate table test_tbl;
Insert into test_tbl
select top 1000
c1,c2 from srcdb.dbo.stg_tbl
union
select '9999','2019-10-30 00:00:00'
union
select '9999','2019-10-30 00:00:00' --- simulate a duplicate record
Questions:
Scenario 1: When they run the ETL tool pointing to the dev server and dev db, they see the expected behaviour, i.e. 1001 rows are loaded and 1 record is rejected as a duplicate (i.e. 9999).
Scenario 2: When they change the connection to point to the prod server and prod db, they see different behaviour. Only 700+ rows get inserted into the prod table, not 1001.
They were asking why. It's the same truncate and load, it works perfectly in dev, so why does it not work the same way for the prod db?
To isolate the issue, I wanted to run some tests in SSMS, to take the Informatica tool out of the picture.
CASE 1:
I ran the below statements against dev and prod, and I see the same transaction behaviour on both (i.e. SQL Server treats the whole INSERT as a single transaction; if even 1 record fails, the entire transaction is rolled back).
truncate table test_tbl;
go
Insert into test_tbl
select top 1000
c1,c2 from srcdb.dbo.stg_tbl
union
select '9999','2019-10-30 00:00:00'
union
select '9999','2019-10-30 00:00:00' --- simulate a duplicate record
Msg 2601, Level 14, State 1, Line 11
Cannot insert duplicate key row in object 'dbo.test_tbl' with unique index 'pk_test_dbl'. The duplicate key value is (9999).
select * from test_tbl
--no rows
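One check I can think of for CASE 1 is to look for ids that occur more than once in the exact set being inserted. This is only a sketch: it assumes c1 maps to id and c2 to last_update_date (as in the insert above), and the aliases s and cnt are just names I made up. Note that top 1000 without an order by may not return the same rows on every run.

```sql
-- List ids that appear more than once in the combined set.
-- UNION only removes rows that are identical in ALL columns, so two rows
-- with the same id but different dates would both survive and then
-- violate the unique index on id.
select s.id, count(*) as cnt
from (
    select top 1000 c1 as id, c2 as last_update_date
    from srcdb.dbo.stg_tbl
    union
    select '9999', '2019-10-30 00:00:00'
) as s
group by s.id
having count(*) > 1;
```

If this returns any rows, the duplicate key is coming from the source data itself rather than from the two hand-written 9999 rows.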
CASE 2: I truncated the table and tried to insert only 2 rows. This time 1 row gets inserted and 1 row is rejected. In fact, I was expecting the same error as in CASE 1.
truncate table test_tbl
go
Insert into test_tbl
select '9999','2019-10-30 00:00:00'
union
select '9999','2019-10-30 00:00:00' --- simulate a duplicate record
(1 row(s) affected)
select * from test_tbl
id last_update_date
9999 2019-10-30 00:00:00.0000000
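A possible explanation for CASE 2 that I still need to confirm: UNION removes duplicate rows from the combined result before the insert happens, so only one 9999 row ever reaches the table. The sketch below compares UNION with UNION ALL on the same two rows. The column aliases and the final count query are my own additions for the test.

```sql
-- UNION de-duplicates identical rows: this returns 1 row.
select '9999' as id, '2019-10-30 00:00:00' as last_update_date
union
select '9999', '2019-10-30 00:00:00';

-- UNION ALL keeps every row: this returns 2 rows.
select '9999' as id, '2019-10-30 00:00:00' as last_update_date
union all
select '9999', '2019-10-30 00:00:00';
go

-- With UNION ALL both rows reach the unique index, so I would expect
-- this insert to fail with Msg 2601 and leave the table empty.
truncate table test_tbl;
go
insert into test_tbl (id, last_update_date)
select '9999', '2019-10-30 00:00:00'
union all
select '9999', '2019-10-30 00:00:00';
go

select count(*) as row_count from test_tbl;
```

If that holds, the two-row test with UNION never actually simulated a duplicate, which would explain why no error was raised.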
Can anyone explain this transaction behavior of SQL Server and SSMS? What is the batch size in SQL Server for a commit or rollback? The reason I am asking is that if I try to insert 2 rows, I don't see any error. Why?
Once I know the exact SQL Server behaviour, I can go back to the team and tell them this is how SQL Server treats batches, and if something needs to be tweaked in terms of batch size or any db setting, I can inform them accordingly.
Environment details
==================
select @@version
Microsoft SQL Server 2012 (SP4) (KB4018073) - 11.0.7001.0 (X64)
Aug 15 2017 10:23:29
Copyright (c) Microsoft Corporation
Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)
Thanks,
Sam