Monday, 6 July 2015

MSSQL - Performance tuning of Scalar-Value User Defined Functions (UDFs)

In most of cases the scalar-value User Defined Functions (Scalar UDF) are very slow.
It is enough to move the scalar expression in to UDF and query performance will be several times reduced.
Let us check ...

For testing, we need a test table with the dates from about a thousand years:
  1. -- Recreate test table
  2. IF OBJECT_ID('dbo.TestUDFs') IS NOT NULL
  3. DROP TABLE dbo.TestUDFs;
  4. GO
  5. CREATE TABLE dbo.TestUDFs(
  6. Datum Date
  7. );
  8. GO
  9. -- SET NOCOUNT ON to prevent extra log messages returned as part of the result set:
  10. -- (See: https://msdn.microsoft.com/en-us/library/ms189837%28v=sql.110%29.aspx)
  11. SET NOCOUNT ON;
  12. -- Fill test-table
  13. DECLARE
  14. @Date Date = GetDate(),
  15. @i INT = 0;
  16. WHILE @i < 1000 * 365 BEGIN
  17. SET @Date = DATEADD(day, 1, @Date);
  18. INSERT INTO dbo.TestUDFs VALUES (@Date);
  19. SET @i = @i + 1;
  20. END;
  21. -- restore NOCOUNT
  22. SET NOCOUNT OFF;
  23. GO
  24. SELECT CNT = COUNT(*) FROM dbo.TestUDFs;
  25. GO


We will create four different UDFs:

FIRST:
Scalar-value UDF, which uses variables for intermediate calculations.
The body of function is placed into BEGIN-END container.
  1. -- Recreate scalar-value UDF with variables in body
  2. IF OBJECT_ID('dbo.fn_scalar_with_vars') IS NOT NULL
  3. DROP FUNCTION dbo.fn_scalar_with_vars;
  4. GO
  5. CREATE FUNCTION dbo.fn_scalar_with_vars (@date DATETIME)
  6. RETURNS INT
  7. AS BEGIN
  8. DECLARE @DateKey INT;
  9. SELECT @DateKey = CONVERT(INT, CONVERT(VARCHAR(8), @date, 112));
  10. RETURN @DateKey;
  11. END;
  12. GO
  13. SELECT dbo.fn_scalar_with_vars(GetDate());
  14. GO

SECOND:
Scalar-value UDF, which does not use intermediate calculations, and almost immediately returns the result of the calculation.
The body of function is placed into BEGIN-END container.
  1. -- Recreate scalar-value UDF with direct expression (without variables) in body
  2. IF OBJECT_ID('dbo.fn_scalar_direct_expression') IS NOT NULL
  3. DROP FUNCTION dbo.fn_scalar_direct_expression;
  4. GO
  5. CREATE FUNCTION dbo.fn_scalar_direct_expression (@date DATETIME)
  6. RETURNS INT
  7. AS BEGIN
  8. RETURN CONVERT(INT, CONVERT(VARCHAR(8), @date, 112));
  9. END;
  10. GO
  11. SELECT DateKey = dbo.fn_scalar_direct_expression(GetDate());
  12. GO
Scalar-value UDF which returns not scalar value, but a table-result.
For data storage uses UDF a table variable.
The body of function is placed into BEGIN-END container.
  1. -- Recreate table-value UDF which returns table-variable
  2. IF OBJECT_ID('dbo.fn_non_inline_table') IS NOT NULL
  3. DROP FUNCTION dbo.fn_non_inline_table;
  4. GO
  5. CREATE FUNCTION dbo.fn_non_inline_table (@date DATETIME)
  6. RETURNS @DateKey TABLE(DateKey INT)
  7. AS BEGIN
  8. INSERT INTO @DateKey
  9. VALUES (CONVERT(INT, CONVERT(VARCHAR(8), @date, 112)));
  10. RETURN;
  11. END
  12. GO
  13. SELECT DateKey FROM dbo.fn_non_inline_table(GetDate());
  14. GO
Finally the UDF, which returns a non-scalar value, but a table.
There is no any BEGIN-END container, and the expression is placed immediately after the AS expression.
Such UDFs are called "Inline Table-Value UDF".
  1. -- Recreate inlide table-value UDF
  2. IF OBJECT_ID('dbo.fn_inline_table') IS NOT NULL
  3. DROP FUNCTION dbo.fn_inline_table;
  4. GO
  5. CREATE FUNCTION dbo.fn_inline_table (@date DATETIME)
  6. RETURNS TABLE
  7. AS
  8. RETURN SELECT DateKey = CONVERT(INT, CONVERT(VARCHAR(8), @date, 112));
  9. GO
  10. SELECT DateKey FROM dbo.fn_inline_table(GetDate());
  11. GO


Let us test the direct evaluation of the expression in the query, and calls of all four UDFs.
Tests were performed on a virtual machine running on my NoteBook-e.
On other computers the results may differ.

The test script is defined so:
- in the first step we will store the current date and time in the variable
- for each row of the test table we will performed a calculation
- the test request substitute this text: / * Put calculation there * /.
We will count the number of milliseconds between start and time.

  1. -- Script to test calculations
  2. DECLARE
  3. @DateKey INT,
  4. @StartedAt DATETIME2 = GETDATE();
  5.  
  6. /* Put calculation there */
  7.  
  8. PRINT 'Elapsed time (Direct calculation): '
  9. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';

Calculation directly in query:
300 ms.
  1. SELECT @DateKey = CONVERT(INT, CONVERT(VARCHAR(8), Datum, 112))
  2. FROM dbo.TestUDFs;

Scalar-value UDF with variables and body:
1570 ms.
  1. SELECT @DateKey = dbo.fn_scalar_with_vars(Datum)
  2. FROM dbo.TestUDFs;

Scalar-value UDF with direct expression and body:
1274 ms.
  1. SELECT @DateKey = dbo.fn_scalar_direct_expression(Datum)
  2. FROM dbo.TestUDFs;

Non-inline table-value UDF with table-variable and body:
29623 ms.
  1. SELECT @DateKey = DateKey
  2. FROM dbo.TestUDFs CROSS APPLY dbo.fn_non_inline_table(Datum);

And now surprise - Inline table-value UDF:
210 ms!!! - at least the same result as in the first case, where a computation request occurs immediately.
  1. SELECT @DateKey = DateKey
  2. FROM dbo.TestUDFs CROSS APPLY dbo.fn_inline_table(Datum);



All-in-one Script (an execution takes about one minute):
  1. -- SET NOCOUNT ON to prevent extra log messages:
  2. -- See: https://msdn.microsoft.com/en-us/library/ms189837%28v=sql.110%29.aspx
  3. SET NOCOUNT ON;
  4. GO
  5. IF OBJECT_ID('dbo.TestUDFs') IS NOT NULL
  6. DROP TABLE dbo.TestUDFs;
  7. CREATE TABLE dbo.TestUDFs(
  8. [Date] Date
  9. );
  10. -- Fill test-table
  11. DECLARE
  12. @Date Date = GetDate(),
  13. @i INT = 0;
  14. WHILE @i < 1000 * 365 BEGIN
  15. SET @Date = DATEADD(day, 1, @Date);
  16. INSERT INTO dbo.TestUDFs VALUES (@Date);
  17. SET @i = @i + 1;
  18. END;
  19. -- Check if table is filled
  20. DECLARE @CNT INT = (SELECT COUNT(*) FROM dbo.TestUDFs);
  21. PRINT 'Rows in TestUDFs table: ' + CAST(@CNT AS VARCHAR(20));
  22. GO
  23. IF OBJECT_ID('dbo.fn_scalar_with_vars') IS NOT NULL
  24. DROP FUNCTION dbo.fn_scalar_with_vars;
  25. GO
  26. CREATE FUNCTION dbo.fn_scalar_with_vars (@Date DATETIME)
  27. RETURNS INT
  28. AS BEGIN
  29. DECLARE @DateKey INT;
  30. SELECT @DateKey = CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112));
  31. RETURN @DateKey;
  32. END;
  33. GO
  34. IF OBJECT_ID('dbo.fn_scalar_direct_expression') IS NOT NULL
  35. DROP FUNCTION dbo.fn_scalar_direct_expression;
  36. GO
  37. CREATE FUNCTION dbo.fn_scalar_direct_expression (@Date DATETIME)
  38. RETURNS INT
  39. AS BEGIN
  40. RETURN CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112));
  41. END;
  42. GO
  43. IF OBJECT_ID('dbo.fn_non_inline_table') IS NOT NULL
  44. DROP FUNCTION dbo.fn_non_inline_table;
  45. GO
  46. CREATE FUNCTION dbo.fn_non_inline_table (@Date DATETIME)
  47. RETURNS @DateKey TABLE(DateKey INT)
  48. AS BEGIN
  49. INSERT INTO @DateKey
  50. VALUES (CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112)));
  51. RETURN;
  52. END
  53. GO
  54. IF OBJECT_ID('dbo.fn_inline_table') IS NOT NULL
  55. DROP FUNCTION dbo.fn_inline_table;
  56. GO
  57. -- Recreate scalar-value UDF with direct expression (without variables) in body
  58. IF OBJECT_ID('dbo.fn_scalar_direct_expression') IS NOT NULL
  59. DROP FUNCTION dbo.fn_scalar_direct_expression;
  60. GO
  61. CREATE FUNCTION dbo.fn_scalar_direct_expression (@Date DATETIME)
  62. RETURNS INT
  63. AS BEGIN
  64. RETURN CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112));
  65. END;
  66. GO
  67. -- Recreate table-value UDF which returns table-variable
  68. IF OBJECT_ID('dbo.fn_non_inline_table') IS NOT NULL
  69. DROP FUNCTION dbo.fn_non_inline_table;
  70. GO
  71. CREATE FUNCTION dbo.fn_non_inline_table (@Date DATETIME)
  72. RETURNS @DateKey TABLE(DateKey INT)
  73. AS BEGIN
  74. INSERT INTO @DateKey
  75. VALUES (CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112)));
  76. RETURN;
  77. END
  78. GO
  79. -- Recreate inlide table-value UDF
  80. IF OBJECT_ID('dbo.fn_inline_table') IS NOT NULL
  81. DROP FUNCTION dbo.fn_inline_table;
  82. GO
  83. CREATE FUNCTION dbo.fn_inline_table (@Date DATETIME)
  84. RETURNS TABLE
  85. AS
  86. RETURN SELECT DateKey = CONVERT(INT, CONVERT(VARCHAR(8), @Date, 112));
  87. GO
  88. /********** Test all methods **********/
  89. DECLARE
  90. @DateKey INT,
  91. @StartedAt DATETIME2 = GETDATE();
  92.  
  93. PRINT '
  94. Calculation directly in query:'
  95. SET @StartedAt = GETDATE();
  96. SELECT @DateKey = CONVERT(INT, CONVERT(VARCHAR(8), [Date], 112))
  97. FROM dbo.TestUDFs;
  98. PRINT ' Elapsed time: '
  99. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';
  100.  
  101. PRINT '
  102. Scalar-value UDF with variables and body:'
  103. SET @StartedAt = GETDATE();
  104. SELECT @DateKey = dbo.fn_scalar_with_vars([Date])
  105. FROM dbo.TestUDFs;
  106. PRINT ' Elapsed time: '
  107. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';
  108.  
  109. PRINT '
  110. Scalar-value UDF with direct expression and body:'
  111. SET @StartedAt = GETDATE();
  112. SELECT @DateKey = dbo.fn_scalar_direct_expression([Date])
  113. FROM dbo.TestUDFs;
  114. PRINT ' Elapsed time: '
  115. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';
  116.  
  117. PRINT '
  118. Non-inline table-value UDF with table-variable and body:'
  119. SET @StartedAt = GETDATE();
  120. SELECT @DateKey = DateKey
  121. FROM dbo.TestUDFs CROSS APPLY dbo.fn_non_inline_table([Date]);
  122. PRINT ' Elapsed time: '
  123. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';
  124.  
  125. PRINT '
  126. Inline table-value UDF:'
  127. SET @StartedAt = GETDATE();
  128. SELECT @DateKey = DateKey
  129. FROM dbo.TestUDFs CROSS APPLY dbo.fn_inline_table([Date]);
  130. PRINT ' Elapsed time: '
  131. + CAST(DATEDIFF(ms, @StartedAt, GETDATE()) AS VarChar(20)) + ' ms';
  132.  
  133. GO
  134. -- Restore SET NOCOUNT OFF
  135. SET NOCOUNT OFF;

An execution of this Script generates this result message:

  1. Rows in TestUDFs table: 365000
  2.  
  3. Calculation directly in query:
  4. Elapsed time: 300 ms
  5.  
  6. Scalar-value UDF with variables and body:
  7. Elapsed time: 1570 ms
  8.  
  9. Scalar-value UDF with direct expression and body:
  10. Elapsed time: 1274 ms
  11.  
  12. Non-inline table-value UDF with table-variable and body:
  13. Elapsed time: 29623 ms
  14.  
  15. Inline table-value UDF:
  16. Elapsed time: 210 ms

No comments:

Post a Comment

Note: only a member of this blog may post a comment.