Skip to main content

Shortest Common Supersequence (SCS)

Definition​

The Shortest Common Supersequence (SCS) algorithm finds the shortest supersequence that contains all the given sequences as subsequences. It's a dynamic programming-based approach, commonly used in bioinformatics, data compression, and version control systems. This algorithm efficiently computes the length of the SCS and can also reconstruct the SCS itself

Practice​

shortest_common_supersequence(sequence1, sequence2):
m = length(sequence1)
n = length(sequence2)
table = create_table(m+1, n+1)

// Step 1: Initialize the table
for i from 0 to m:
table[i][0] = i
for j from 0 to n:
table[0][j] = j

// Step 2: Fill the table
for i from 1 to m:
for j from 1 to n:
if sequence1[i-1] == sequence2[j-1]:
table[i][j] = 1 + table[i-1][j-1]
else:
table[i][j] = max(table[i-1][j], table[i][j-1])

// Step 3: Reconstruct the SCS
scs = ""
i = m
j = n
while i > 0 and j > 0:
if sequence1[i-1] == sequence2[j-1]:
scs = sequence1[i-1] + scs
i = i - 1
j = j - 1
else:
if table[i-1][j] > table[i][j-1]:
scs = sequence1[i-1] + scs
i = i - 1
else:
scs = sequence2[j-1] + scs
j = j - 1

// Step 4: Add remaining characters
while i > 0:
scs = sequence1[i-1] + scs
i = i - 1
while j > 0:
scs = sequence2[j-1] + scs
j = j - 1

return scs