Page Free Space: Undocumented

Showing posts with label Undocumented. Show all posts

Thursday 20 June 2024

SQL Server Parallel Index Builds

Parallel Index Building Execution Plan

SQL Server doesn't support parallel modifications to a b-tree index.

That might sound surprising. After all, you can certainly write to the same b-tree index from multiple sessions concurrently. For example, two sessions can happily write alternating odd and even numbers to the same integer b-tree index. So long as both sessions execute on different schedulers and take row locks, there will be no blocking and you'll get true concurrency.

No, what I mean is: A single session can't write to a b-tree index using more than one thread. No parallel plan modifications of a b-tree index, in other words. It's a bit like the lack of parallel backward ordered scans. There's no reason it couldn't be implemented, but it hasn't been so far.

You may have thought SQL Server would use a regular parallel scan to read the index source data, optionally sort it into index key order, then add those rows to the index in parallel. This would indeed work, even without sorting, but SQL Server just can't do it.

In case you're wondering, sorting into destination key order is an optimization. The resulting index would still be correct without it, but you'd be inserting rows essentially at random into a b-tree, with all the random I/O and page splitting that would entail.

Ok, you say, but what about parallel index builds? They've been around for a long time in premium editions and certainly seem to modify a single b-tree in parallel. Yes, they do seem to, but SQL Server cheats.

Read the full article on 𝕏.

https://t.co/0f59kSlLzQ
— Paul White (@SQL_Kiwi) June 20, 2024

Friday 31 May 2024

Impossible Execution Plan Timings

Erik Darling (@erikdarlingdata) shared an interesting SQL Server execution plan with me recently. The demo script is at the end of this article.

The important section is shown below:

Impossible timings?

The Gather Streams operator appears to execute for less time (2.16s) than the Sort operator below it (5.431s). This seems impossible on the face of it.

The Parallelism (Gather Streams) operator runs in row mode (as always), while the Sort and Hash Match (Inner Join) operators both run in batch mode. This mixed mode plan adds a little complexity to interpreting plan timings because:

A batch mode operator reports CPU and elapsed times for that operator alone
A row mode operator reports times for itself and all its children

I've written about those aspects before in Understanding Execution Plan Operator Timings, which also covers a confusing situation that can arise in exclusively row mode parallel plans.

I showed a hidden option to make all operators report only their individual times in More Consistent Execution Plan Timings in SQL Server 2022. That feature isn't complete yet, so the results aren't perfect, and it's not documented or supported.

I mention all that in case you are interested in the background. None of the foregoing explains what we see in this mixed mode plan. The row mode Gather Streams elapsed time ought to include its children. The batch mode Sort should just be reporting its own elapsed time. With that understanding in mind, there's no way the Sort could run for longer than the Gather Streams. What's going on here?

Read the full article on 𝕏.

https://t.co/DrANqdwngA
— Paul White (@SQL_Kiwi) May 30, 2024

Friday 17 November 2023

Setting a Fixed Size for Transaction Log Virtual Log Files (VLFs)

Setting a Fixed Size for Transaction Log VLFs

The documentation has this to say about virtual log file (VLF) sizes:

The SQL Server Database Engine divides each physical log file internally into several virtual log files (VLFs). Virtual log files have no fixed size, and there’s no fixed number of virtual log files for a physical log file. The Database Engine chooses the size of the virtual log files dynamically while it’s creating or extending log files. The Database Engine tries to maintain a few virtual files. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. The size or number of virtual log files can’t be configured or set by administrators.

It then goes on to describe the problems having too many VLFs can cause, and how the database owner can arrange things so a reasonable number of VLFs are created. There’s even a (mostly accurate) formula for the number and size of VLFs SQL Server will create when asked to extend a transaction log file.

This is all very familiar, of course, but it is also dumb. Why on earth should we have to worry about internal formulas? It seems ridiculous to have to provision or grow a transaction log in pieces just to get a reasonable VLF outcome.

Wouldn’t it be better to be able to specify a fixed size for VLFs instead?

Starting with SQL Server 2022, there is now a way though it is undocumented and unsupported for the time being at least.

You can’t use it in a production database and there’s a real risk of it damaging your database beyond repair. Aside from those warnings, there’s no reason not to play around with it in a development environment. Or, if you’re simply curious to know more, read on.

About This Blog

Thursday 20 June 2024

Friday 31 May 2024

Friday 17 November 2023

Monday 13 November 2023

Friday 20 October 2023

Tuesday 30 August 2022

Saturday 23 July 2022

Thursday 18 November 2021

Wednesday 24 March 2021

Sunday 11 October 2020

Tuesday 4 August 2020

Sunday 26 July 2020

Background

Sunday 5 July 2020

Sunday 31 May 2020

Tuesday 2 May 2017

Saturday 31 August 2013

Wednesday 24 July 2013

Monday 8 July 2013

Friday 8 March 2013

Wednesday 6 February 2013