Commit 62589e2

Introduce new DeleteConsecutiveExtendedAttributes maintenance procedure
In older versions of the OnTopic Library, a new extended attribute version was _always_ written, even if its composite XML value had not changed. The OnTopic Library doesn't need these extra versions, since it takes the latest version of each attribute when loading the topic graph (or restoring a topic). As such, these _consecutive duplicates_ only add clutter to the database, and can slow down some queries by introducing more data to sort through. This is especially concerning for extended attributes, since they are stored in large XML fields. The `DeleteConsecutiveExtendedAttributes` procedure identifies and deletes these duplicates from the database, thus making it more efficient. This corresponds to #fc3e27a, which does the same for indexed attributes.
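To make "consecutive duplicates" concrete, here is a minimal Python sketch of the rule, not the procedure itself. The function name `versions_to_delete` and the sample history are hypothetical, and versions are simplified to integers. A version is redundant only when it repeats the value of the version immediately before it for the same topic; a value that reappears after a different value is kept.

```python
from itertools import groupby


def versions_to_delete(history):
    """Return the versions that merely repeat the preceding version's XML.

    `history` is a list of (version, xml) tuples for a single topic.
    This helper is illustrative only; it is not part of OnTopic.
    """
    ordered = sorted(history, key=lambda item: item[0])
    redundant = []
    # groupby() splits the ordered history into runs of identical XML.
    for _, run in groupby(ordered, key=lambda item: item[1]):
        run = list(run)
        # Keep the first (oldest) version in each run; the rest are duplicates.
        redundant.extend(version for version, _ in run[1:])
    return redundant


history = [
    (1, "<a>1</a>"),
    (2, "<a>1</a>"),  # repeats version 1
    (3, "<a>2</a>"),
    (4, "<a>1</a>"),  # same value as version 1, but not consecutive
    (5, "<a>1</a>"),  # repeats version 4
]
print(versions_to_delete(history))  # [2, 5]
```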
1 parent fc3e27a commit 62589e2

2 files changed

Lines changed: 72 additions & 0 deletions

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+--------------------------------------------------------------------------------------------------------------------------------
+-- DELETE CONSECUTIVE EXTENDED ATTRIBUTES
+--------------------------------------------------------------------------------------------------------------------------------
+-- Current versions of the OnTopic Library evaluate whether or not the composite XML for the extended attribute values has
+-- changed since the previous version, and only create a new version if it has. This wasn't true in previous versions, however.
+-- As a result, there are some cases, especially in older databases, where unnecessary duplicates occur for attribute
+-- values. These dramatically increase the size of the database and can slow down the processing time of certain queries. This
+-- procedure will detect consecutive duplicates and remove them from the database. This reduces the size of the database, without
+-- interfering with the data integrity.
+--------------------------------------------------------------------------------------------------------------------------------
+-- NOTE: Because this query must cast the XML values as NVARCHAR(MAX) in order to compare them, it takes a LONG time to run.
+-- Please be patient!
+--------------------------------------------------------------------------------------------------------------------------------
+
+CREATE PROCEDURE [dbo].[DeleteConsecutiveExtendedAttributes]
+AS
+
+SET NOCOUNT ON;
+
+--------------------------------------------------------------------------------------------------------------------------------
+-- CHECK INITIAL VALUES
+--------------------------------------------------------------------------------------------------------------------------------
+DECLARE @Count INT;
+
+SELECT @Count = COUNT(TopicID)
+FROM ExtendedAttributes;
+
+PRINT('Initial Count: ' + CAST(@Count AS VARCHAR(12)) + ' Extended Attributes in the database.');
+
+--------------------------------------------------------------------------------------------------------------------------------
+-- IDENTIFY GROUPS OF CONSECUTIVE DUPLICATES
+--------------------------------------------------------------------------------------------------------------------------------
+WITH GroupedValues AS (
+  SELECT TopicID,
+         AttributesXml,
+         DateModified,
+         Version,
+         ValueGroup = ROW_NUMBER() OVER(PARTITION BY TopicID ORDER BY TopicID, Version)
+                    - ROW_NUMBER() OVER(PARTITION BY TopicID, CAST(AttributesXml AS NVARCHAR(MAX)) ORDER BY TopicID, Version)
+  FROM ExtendedAttributes
+),
+
+--------------------------------------------------------------------------------------------------------------------------------
+-- RANK DUPLICATES BY VERSION
+--------------------------------------------------------------------------------------------------------------------------------
+RankedValues AS (
+  SELECT TopicID,
+         AttributesXml,
+         DateModified,
+         Version,
+         ValueGroup,
+         ValueRank = ROW_NUMBER() OVER(PARTITION BY ValueGroup, TopicID, CAST(AttributesXml AS NVARCHAR(MAX)) ORDER BY TopicID, Version)
+  FROM GroupedValues
+)
+
+--------------------------------------------------------------------------------------------------------------------------------
+-- DELETE NEWER DUPLICATES
+--------------------------------------------------------------------------------------------------------------------------------
+DELETE
+FROM RankedValues
+WHERE ValueRank > 1;
+
+PRINT('Consecutive duplicates have been deleted.');
+
+--------------------------------------------------------------------------------------------------------------------------------
+-- CHECK FINAL VALUES
+--------------------------------------------------------------------------------------------------------------------------------
+SELECT @Count = @Count - COUNT(TopicID)
+FROM ExtendedAttributes;
+
+PRINT('Final Count: ' + CAST(@Count AS VARCHAR(12)) + ' duplicate Extended Attributes were identified and deleted.');
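The procedure relies on a row-number difference (often called "gaps and islands"). Within a topic, the overall row number minus the per-value row number stays constant across a run of identical values, so that difference identifies each run. The sketch below reproduces the idea with Python's built-in `sqlite3` module so it can be run without SQL Server. The simplified table, integer versions, sample rows, and the function name `delete_consecutive_duplicates` are all assumptions for illustration. SQLite cannot delete through a CTE, so the sketch deletes by `rowid` instead, and it compares `TEXT` directly rather than casting XML.

```python
import sqlite3


def delete_consecutive_duplicates(conn):
    """Delete rows that repeat the previous version's value for the same topic."""
    conn.execute("""
        WITH GroupedValues AS (
            SELECT rowid AS RowKey, TopicID, AttributesXml, Version,
                   -- Constant within each run of identical consecutive values
                   ROW_NUMBER() OVER (PARTITION BY TopicID ORDER BY Version)
                 - ROW_NUMBER() OVER (PARTITION BY TopicID, AttributesXml ORDER BY Version)
                   AS ValueGroup
            FROM ExtendedAttributes
        ),
        RankedValues AS (
            SELECT RowKey,
                   -- 1 for the oldest row in each run, 2+ for its duplicates
                   ROW_NUMBER() OVER (
                       PARTITION BY TopicID, AttributesXml, ValueGroup
                       ORDER BY Version
                   ) AS ValueRank
            FROM GroupedValues
        )
        DELETE FROM ExtendedAttributes
        WHERE rowid IN (SELECT RowKey FROM RankedValues WHERE ValueRank > 1)
    """)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExtendedAttributes (TopicID INTEGER, AttributesXml TEXT, Version INTEGER)"
)
conn.executemany(
    "INSERT INTO ExtendedAttributes VALUES (?, ?, ?)",
    [
        (1, "<a>1</a>", 1),
        (1, "<a>1</a>", 2),  # consecutive duplicate of version 1
        (1, "<a>2</a>", 3),
        (1, "<a>1</a>", 4),  # same value as version 1, but a new run: kept
        (1, "<a>1</a>", 5),  # consecutive duplicate of version 4
        (2, "<a>1</a>", 1),  # different topic: kept
    ],
)

delete_consecutive_duplicates(conn)
remaining = conn.execute(
    "SELECT TopicID, Version FROM ExtendedAttributes ORDER BY TopicID, Version"
).fetchall()
print(remaining)  # [(1, 1), (1, 3), (1, 4), (2, 1)]
```

Note that version 4 survives even though its value matches version 1: the intervening version 3 starts a new run, which is why the procedure partitions by `ValueGroup` as well as by the value itself.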

OnTopic.Data.Sql.Database/OnTopic.Data.Sql.Database.sqlproj

Lines changed: 1 addition & 0 deletions
@@ -100,6 +100,7 @@
     <Build Include="Types\AttributeValues.sql" />
     <Build Include="Types\TopicList.sql" />
     <Build Include="Views\AttributeIndex.sql" />
+    <Build Include="Maintenance\DeleteConsecutiveExtendedAttributes.sql" />
     <Build Include="Maintenance\DeleteConsecutiveAttributes.sql" />
   </ItemGroup>
   <ItemGroup>
