MS SQL Lesson 12 – Identity Columns | Dataplexa

Identity Columns

Every table in DataplexaStore has a primary key that SQL Server generates automatically — UserID, ProductID, OrderID, OrderItemID, ReviewID. You never supply these values in an INSERT statement. SQL Server owns them entirely, incrementing a counter with each new row and guaranteeing that no two rows in the table ever receive the same value. This mechanism is the IDENTITY property, and while it looks simple on the surface, understanding exactly how it works — how values are generated, what happens to gaps, how to retrieve the last generated value safely, and what to do when the counter needs resetting — is essential knowledge for every developer who writes INSERT statements against a SQL Server database.

How IDENTITY Works

The IDENTITY property is attached to a column definition with two parameters: seed and increment. The seed is the value assigned to the very first row inserted into the table. The increment is the amount added to the previous value for each subsequent row. IDENTITY(1,1) — the most common form — starts at 1 and counts up by 1. IDENTITY(1000,10) would start at 1000 and count 1000, 1010, 1020, and so on. SQL Server maintains the current IDENTITY value in memory and in the system catalog. When a row is inserted, SQL Server reserves the next value before the INSERT executes — which means even if the INSERT fails and is rolled back, that value is consumed and will never be used again. This is the source of gaps in IDENTITY sequences, and it is by design — IDENTITY guarantees uniqueness, not contiguity.

-- Inspect IDENTITY settings for all DataplexaStore tables
-- last_value shows the most recently generated value — NULL if no rows inserted yet

SELECT
    OBJECT_NAME(object_id)          AS table_name,
    name                            AS column_name,
    seed_value,                     -- starting value (first row gets this)
    increment_value,                -- added to previous value for each new row
    last_value,                     -- most recently generated IDENTITY value
    is_not_for_replication          -- 1 = IDENTITY skipped during replication inserts
FROM sys.identity_columns
ORDER BY table_name;

-- Confirm the column definition — IDENTITY columns show as NOT NULL in schema
SELECT
    TABLE_NAME,
    COLUMN_NAME,
    COLUMNPROPERTY(OBJECT_ID(TABLE_NAME), COLUMN_NAME, 'IsIdentity') AS is_identity
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME IN ('Users','Products','Orders','OrderItems','Reviews')
  AND COLUMNPROPERTY(OBJECT_ID(TABLE_NAME), COLUMN_NAME, 'IsIdentity') = 1;

table_name column_name seed_value increment_value last_value is_not_for_replication
─────────── ──────────── ────────── ─────────────── ────────── ──────────────────────
OrderItems OrderItemID 1 1 10 0
Orders OrderID 1 1 5 0
Products ProductID 1 1 10 0
Reviews ReviewID 1 1 5 0
Users UserID 1 1 10 0

TABLE_NAME COLUMN_NAME is_identity
─────────── ──────────── ───────────
OrderItems OrderItemID 1
Orders OrderID 1
Products ProductID 1
Reviews ReviewID 1
Users UserID 1

last_value confirms exactly how many rows have been successfully inserted — UserID last_value of 10 means the last user inserted received UserID 10, and the next INSERT will receive 11
is_not_for_replication = 0 means the IDENTITY counter fires normally — setting it to 1 allows replication to insert rows with their original source IDENTITY values rather than generating new ones on the subscriber
COLUMNPROPERTY with IsIdentity is the programmatic way to find IDENTITY columns without relying on INFORMATION_SCHEMA, which does not expose this property directly

Inserting Rows and Retrieving the Generated Value

When you insert a row into a table with an IDENTITY column, you simply omit that column from the INSERT statement entirely. SQL Server generates the value and stores it. The challenge comes immediately after — your application almost always needs to know what value was generated, because the next step is typically inserting child rows that reference the new parent row as a foreign key. SQL Server provides three ways to retrieve the last generated IDENTITY value: @@IDENTITY, SCOPE_IDENTITY(), and IDENT_CURRENT('tablename'). Only SCOPE_IDENTITY() is safe for production use. @@IDENTITY returns the last identity value generated in the current session regardless of scope — if a trigger on your table inserts a row into another table with its own IDENTITY column, @@IDENTITY returns that trigger's value, not your original insert's value. IDENT_CURRENT is not session-scoped at all — it returns the last value generated for a named table by any session, making it completely unreliable in a multi-user environment.

-- Insert a new user and immediately retrieve the generated UserID
-- This is the pattern used every time a parent row is needed for child inserts

INSERT INTO Users (FullName, Email, City, Country, MembershipTier)
VALUES ('Carlos Mendoza', 'carlos.mendoza@example.com', 'Mexico City', 'Mexico', 'Premium');

-- SCOPE_IDENTITY() returns the last IDENTITY value generated in THIS scope
-- Safe even if triggers fire and insert into other identity tables
DECLARE @NewUserID INT = SCOPE_IDENTITY();
SELECT @NewUserID AS new_user_id;

-- Now use the new UserID immediately to insert a child row
INSERT INTO Orders (UserID, TotalAmount, Status)
VALUES (@NewUserID, 0.00, 'processing');

-- Confirm the link
SELECT
    u.UserID, u.FullName,
    o.OrderID, o.Status, o.OrderDate
FROM Users u
JOIN Orders o ON u.UserID = o.UserID
WHERE u.UserID = @NewUserID;

new_user_id
───────────
11

UserID FullName OrderID Status OrderDate
────── ─────────────── ─────── ────────── ───────────────────────────
11 Carlos Mendoza 6 processing 2024-03-28 14:22:31.0000000

SCOPE_IDENTITY() is called immediately after the INSERT — it returns the value generated in the current batch and scope, unaffected by any concurrent inserts from other sessions
Storing the result in a variable (@NewUserID) lets you reuse it for multiple child inserts — for example creating the order, then creating order items, all in one transaction referencing the same new UserID
Never use SELECT MAX(UserID) FROM Users to find the last inserted ID — in a concurrent environment another session could have inserted a higher UserID between your INSERT and your SELECT, returning the wrong value

OUTPUT Clause — The Modern Approach

The OUTPUT clause is a cleaner alternative to SCOPE_IDENTITY() that works at scale, handles multi-row inserts, and returns the generated values as a result set rather than a scalar. When multiple rows are inserted in a single INSERT statement, SCOPE_IDENTITY() only returns the last generated value — if you inserted 100 rows you only know the identity of the last one. OUTPUT INSERTED returns every generated value for every inserted row, making it the correct choice for bulk inserts where you need to track all generated IDs.

-- Use OUTPUT to capture generated IDENTITY values from a multi-row INSERT
-- INSERTED is a virtual table containing the rows just inserted

-- Create a temporary table to hold the captured IDs
DECLARE @InsertedUsers TABLE (
    NewUserID   INT,
    FullName    NVARCHAR(100)
);

-- Insert multiple users and capture all generated UserIDs simultaneously
INSERT INTO Users (FullName, Email, City, Country)
OUTPUT INSERTED.UserID, INSERTED.FullName     -- capture from the virtual INSERTED table
    INTO @InsertedUsers (NewUserID, FullName)  -- redirect output into our temp table
VALUES
    ('Fatima Al-Hassan',  'fatima@example.com',  'Dubai',    'UAE'),
    ('Kenji Tanaka',      'kenji@example.com',   'Tokyo',    'Japan'),
    ('Amara Okafor',      'amara@example.com',   'Lagos',    'Nigeria');

-- Review all three generated IDs
SELECT NewUserID, FullName FROM @InsertedUsers;

NewUserID FullName
───────── ──────────────────
12 Fatima Al-Hassan
13 Kenji Tanaka
14 Amara Okafor

OUTPUT INSERTED gives you every generated value in one operation — no need to query the table again, no risk of a concurrent insert interfering with your ID retrieval
The INSERTED virtual table mirrors the rows that were just written — every column is available, not just the IDENTITY column; you could capture FullName, Email, and JoinDate simultaneously
OUTPUT also works on UPDATE and DELETE — INSERTED contains the new values, DELETED contains the old values, enabling audit logging of changes directly from DML statements

IDENTITY_INSERT — Supplying Explicit Values

There are legitimate scenarios where you need to insert a specific value into an IDENTITY column rather than letting SQL Server generate one — the most common being data migration, where you are moving rows from a source database and need to preserve the original IDs so that all foreign key relationships remain intact. SQL Server blocks explicit IDENTITY inserts by default. To allow them temporarily you run SET IDENTITY_INSERT tablename ON, perform the inserts, then turn it off again. Only one table per session can have IDENTITY_INSERT enabled at a time. After finishing an explicit insert session you should always run DBCC CHECKIDENT to reseed the counter to the current maximum, otherwise the next automatic insert will start from where the counter was before your explicit inserts — which may now conflict with the explicitly inserted values.

-- Demonstrate IDENTITY_INSERT for a migration scenario
-- Insert a user with a specific UserID preserved from a source system

SET IDENTITY_INSERT Users ON;    -- allow explicit inserts into the IDENTITY column

INSERT INTO Users (UserID, FullName, Email, City, Country)  -- UserID now in the column list
VALUES (500, 'Migrated User', 'migrated@example.com', 'Sydney', 'Australia');

SET IDENTITY_INSERT Users OFF;   -- always turn off immediately after use

-- The IDENTITY counter did not advance — last_value is still 14 from the previous inserts
-- The next automatic insert would try to use 15, skipping the gap up to 500
SELECT last_value FROM sys.identity_columns WHERE OBJECT_NAME(object_id) = 'Users';

-- Reseed the counter to the current maximum to avoid future conflicts
DBCC CHECKIDENT ('Users', RESEED);  -- RESEED without a value sets counter to current MAX(UserID)

-- Confirm the counter is now at 500
SELECT last_value FROM sys.identity_columns WHERE OBJECT_NAME(object_id) = 'Users';

-- last_value before RESEED:
last_value
──────────
14

-- DBCC CHECKIDENT output:
Checking identity information: current identity value '14', current column value '500'.
IDENTITY_INSERT is set to OFF for table 'Users'.

-- last_value after RESEED:
last_value
──────────
500

Without the DBCC CHECKIDENT RESEED the next automatic INSERT would try to use UserID 15, which would succeed — but if more explicit inserts were done with values between 15 and 500 in a future migration, those values would collide
DBCC CHECKIDENT with RESEED and no explicit value is the safe form — it scans the table and sets the counter to the current maximum, ensuring the next automatic insert gets max + increment
DBCC CHECKIDENT ('tablename', RESEED, 0) resets the counter to 0 so the next insert gets 1 — useful when truncating a development table and wanting IDs to start from 1 again

Gaps in IDENTITY Sequences

IDENTITY sequences are not guaranteed to be contiguous. Gaps appear for three reasons: a rolled-back transaction consumed a value before failing; the server restarted while a cached block of IDENTITY values was in memory but not yet used; or rows were deleted. None of these are problems — IDENTITY guarantees uniqueness, not a gapless sequence. If your application's business logic requires a truly gapless sequence — invoice numbers that must be consecutive with no missing values — IDENTITY is the wrong tool. That requirement demands a separate sequence management table with a transaction wrapping the read-and-increment, which is significantly more complex and carries a concurrency cost. For primary keys, session IDs, order IDs, and almost every other internal identifier, gaps are completely harmless and the simplicity of IDENTITY is the right trade-off.

-- Demonstrate that a rolled-back INSERT creates a gap in the IDENTITY sequence

-- Current last value
SELECT last_value AS before_rollback
FROM sys.identity_columns
WHERE OBJECT_NAME(object_id) = 'Products';

-- Insert and immediately roll back — the IDENTITY value is consumed but the row is not saved
BEGIN TRANSACTION;
    INSERT INTO Products (ProductName, Category, Price, StockQty)
    VALUES ('Ghost Product', 'Test', 1.00, 0);   -- IDENTITY advances to reserve this value
ROLLBACK TRANSACTION;                             -- row disappears, but value is gone forever

-- Insert a real product — it gets the value AFTER the rolled-back one
INSERT INTO Products (ProductName, Category, Price, StockQty)
VALUES ('Smart Speaker', 'Electronics', 79.99, 25);

-- Show the gap
SELECT ProductID, ProductName
FROM Products
ORDER BY ProductID DESC;

before_rollback
───────────────
10

ProductID ProductName
───────── ─────────────────
12 Smart Speaker
10 4K Monitor
9 Desk Lamp
8 Standing Desk
...

ProductID 11 was consumed by the rolled-back transaction and is gone — the next successful INSERT got 12, creating a gap at 11 that will never be filled
This is expected and correct behaviour — attempting to reuse gap values would require locking the entire IDENTITY mechanism and would destroy the concurrent insert performance that makes IDENTITY fast
If gaps in a sequence are visible to end users and cause confusion — for example customer-facing order numbers — consider using a formatted display number separate from the internal IDENTITY primary key, keeping the two concerns independent

Summary Table

Concept	Detail	Key Point
IDENTITY(seed, increment)	Auto-generated column value	Never include in INSERT column list
SCOPE_IDENTITY()	Last value in current scope	Safe in all scenarios — use this, not @@IDENTITY
OUTPUT INSERTED	Capture generated values from DML	Only option that works for multi-row inserts
SET IDENTITY_INSERT ON	Allow explicit IDENTITY values	Turn off immediately after — one table at a time
DBCC CHECKIDENT RESEED	Reset the IDENTITY counter	Run after explicit inserts to avoid future conflicts
Gaps	Rollbacks, restarts, deletes	By design — IDENTITY guarantees uniqueness, not contiguity

Practice Questions

Practice 1. Why is SCOPE_IDENTITY() safer than @@IDENTITY for retrieving the last generated IDENTITY value?

Practice 2. When should you use OUTPUT INSERTED instead of SCOPE_IDENTITY() to capture generated IDENTITY values?

Practice 3. Why does a rolled-back transaction create a permanent gap in an IDENTITY sequence?

Practice 4. After using SET IDENTITY_INSERT ON to insert rows with explicit values, what should you do before turning it off?

Practice 5. A business requirement states that customer invoice numbers must be consecutive with no gaps. Is IDENTITY the right tool for this? Explain your answer.

Quiz

Quiz 1. What does IDENTITY(100, 5) mean?

First row gets 100, each subsequent row adds 5
First row gets 5, each subsequent row adds 100
Column stores values between 100 and 5
Column resets to 100 every 5 rows

Quiz 2. How many tables can have SET IDENTITY_INSERT ON at the same time within a single session?

One
Two
Unlimited
One per database

Quiz 3. Which function returns the last IDENTITY value generated for a named table by any session?

IDENT_CURRENT('tablename')
SCOPE_IDENTITY()
@@IDENTITY
LAST_INSERT_ID()

Quiz 4. What does DBCC CHECKIDENT ('Users', RESEED) do when no explicit reseed value is provided?

Sets the counter to the current maximum value in the table
Resets the counter to 1
Deletes all rows and resets to seed
Raises an error — value required

Quiz 5. An INSERT statement inserts 3 rows. What value does SCOPE_IDENTITY() return?

The value generated for the last inserted row only
The first row's value
All three values as a list
NULL — only works for single-row inserts

Next up - Insert Data - Put your first real rows into DataplexaStore and master every form of the INSERT statement.

← Previous Course Index Next →

MS SQL Server