LeetCode 115 Distinct Subsequences

Given a string S and a string T, count the number of distinct subsequences of T in S.

A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, "ACE" is a subsequence of "ABCDE" while "AEC" is not).

Here is an example:
S = "rabbbit", T = "rabbit"

Return 3.

Diffculty
Hard

Similar Problems

Analysis

这题题意是要求在 S 中的所有包含 T 的不同子序列的数量。比如题例中就包含三个："bbbit", "abbbit", "rabbbit"，这三个子序列均包含了 T 。

看到有关字符串的子序列或者配准类的问题，首先应该考虑的就是用动态规划 Dynamic Programming 来求解，而且，要考虑空字符串的情况，因为空串也是任意字符串的一个子序列。 DP 类型的题惯例的三步走：定义状态，推导递推公式，确定状态计算方向和起始状态。把 DP[i][j] 定义为 S[0:j-1] 中存在包含 T[0:i-1] 的 distinct subsequence 的个数。显然如果 j < i，DP[i][j] = 0。

也可以这样想，设母串的长度为 j，子串的长度为 i，我们要求的就是长度为 i 的子串在长度为 j 的母串的所有子串中出现的次数，设为 dp[i][j]。

考虑下列两种情况：

若母串的最后一个字符与子串的最后一个字符不同（S[j] != T[i]），则长度为 i 的子串在长度为 j 的母串中出现的次数就是母串的前 j - 1 个字符中子串出现的次数，即 dp[i][j] = dp[i][j - 1]；
若母串的最后一个字符与子串的最后一个字符相同（S[j] == T[i]），那么最终数量是下列两种情况相加，也即 dp[i][j] = dp[i][j - 1] + t[i - 1][j - 1]。
- 母串的前 j - 1 个字符也可能出现整个子串，T[0:i]
- 母串的前 j - 1 个字符出现子串的前 i - 1个字符，T[0:i-1]

转换方程：

DP[i][j] = DP[i][j-1],                 S[j-1] != T[i-1]
DP[i][j] = DP[i-1][j-1] + DP[i][j-1],  S[j-1] == T[i-1]

初始值：

先设置第 0 行、第 0 列的值。

第 0 列：DP[i][0] = 0，i > 0。 S 是空字符串，T 不是空字符串，显然 T 不可能成为 S 的 subsequence。
第 0 行：DP[0][j] = 1，j >= 0。T 是空字符串，那么显然是 S 的 1 个 subsequence

优化由于只跟上一次结果有关，可以使用滚动数组来减小空间复杂度。

对于二维数组的解集矩阵来说，实际上只有上三角（或者下三角）的数据是有效的，因此可以通过求矩阵上三角（或者下三角）的方式来缩减空间，比如：

  — r a b b i t
- 1 0 0 0 0 0 0
r 1 1 0 0 0 0 0
a 1 1 1 0 0 0 0
b 1 1 1 1 0 0 0
b 1 1 1 2 1 0 0
b 1 1 1 3 3 0 0
i 1 1 1 3 3 3 0
t 1 1 1 3 3 3 3 
    图（一）

或者

  — r a b b b i t
- 1 1 1 1 1 1 1 1
r 0 1 1 1 1 1 1 1
a 0 0 1 1 1 1 1 1
b 0 0 0 1 2 3 3 3
b 0 0 0 0 1 3 3 3
i 0 0 0 0 0 0 3 3
t 0 0 0 0 0 0 0 3 
    图（二）

知道了解集的结构之后，我们就可以使用一维数组，然后按照上图所示的方式来进行遍历。图一应该是从右往左遍历，图二则是从左往右遍历。因为当发生 S[i] == T[j] 的时候，dp[j] = dp[j] + dp[j - 1] 。

Solutions

DP With 2D Array

class Solution {
public:
    int numDistinct(string s, string t) {
        int m = s.length(), n = t.length();
        vector<vector<int>> dp(n + 1, vector<int>(m + 1, 0));

        for (int i = 0; i <= m; ++i) dp[0][i] = 1;
        for (int i = 1; i <= n; ++i) {
            for (int j = 1; j <= m; ++j) {
                if (s[j - 1] != t[i - 1]) dp[i][j] = dp[i][j - 1];
                else {
                    dp[i][j] = dp[i - 1][j - 1] + dp[i][j - 1];
                }
            }
        }
        return dp[n][m];
    }
};

正序

class Solution {
public:
    int numDistinct(string s, string t) {
        int m = s.length(), n = t.length();
        vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
        for (int i = 0; i <= m; ++i) dp[i][0] = 1;
        for (int i = 1; i <= m; ++i) {
            for (int j = 1; j <= n; ++j) {
                if (s[i - 1] != t[j - 1]) dp[i][j] = dp[i - 1][j];
                else {
                    dp[i][j] = dp[i - 1][j] + dp[i - 1][j - 1];
                }
            }
        }
        return dp[m][n];
    }
};

DP With 1D Array j 从尾到头，因为每次要使用上一次 loop 的值。如果从头往尾扫的话，重复计算了。

class Solution {
public:
    int numDistinct(string s, string t) {
        int m = s.length(), n = t.length();
        if (m < n) return 0;
        vector<int> dp(n + 1, 0);
        dp[0] = 1;      // 相当于 dp[0][0]，肯定是 1
        for (int i = 1; i <= m; ++i) {
            for (int j = n; j >= 1; --j) { // 从后往前
                if (s[i - 1] == t[j - 1]) {
                    dp[j] += dp[j - 1];
                }
            }
        }
        return dp[n];
    }
};

DP With 1D Array

class Solution {
public:
    int numDistinct(string s, string t) {
        int n = s.size(), m = t.size();
        vector<int> dp(n + 1, 1);

        for(int i = 1; i <= m; i++) {
            int upLeft = dp[0];
            dp[0] = 0;
            for(int j = 1; j <= n; j++) {
                int temp = dp[j];
                dp[j] = dp[j - 1];
                if(s[j - 1] == t[i - 1]) 
                    dp[j] += upLeft;
                upLeft = temp;
            }
        }

        return dp[n];
    }
};

15.10 (115) Distinct Subsequences

LeetCode 115 Distinct Subsequences

Analysis

Solutions

Reference

results matching ""

No results matching ""