阅读视图

发现新文章，点击刷新页面。

[Python3/Java/C++/Go/TypeScript] 一题一解：贪心（清晰题解）

LeetCode 每日一题题解

lcbin

2025年7月23日 06:54

方法一：贪心

我们不妨假设子字符串 "ab" 的得分总是不低于子字符串 "ba" 的得分，如果不是，我们可以交换 "a" 和 "b"，同时交换 $x$ 和 $y$。

接下来，我们只需要考虑字符串中只包含 "a" 和 "b" 的情况。如果字符串中包含其他字符，我们可以将其视为一个分割点，将字符串分割成若干个只包含 "a" 和 "b" 的子字符串，然后分别计算每个子字符串的得分。

我们观察发现，对于一个只包含 "a" 和 "b" 的子字符串，无论采取什么样的操作，最后一定只剩下一种字符，或者空串。由于每次操作都会同时删除一个 "a" 和一个 "b"，因此总的操作次数一定是固定的。我们可以贪心地先删除 "ab"，再删除 "ba"，这样可以保证得分最大。

因此，我们可以使用两个变量 $\textit{cnt1}$ 和 $\textit{cnt2}$ 分别记录 "a" 和 "b" 的数量，然后遍历字符串，根据当前字符的不同情况更新 $\textit{cnt1}$ 和 $\textit{cnt2}$，并计算得分。

对于当前遍历到的字符 $c$：

如果 $c$ 是 "a"，由于要先删除 "ab"，因此此时我们不消除该字符，只增加 $\textit{cnt1}$；
如果 $c$ 是 "b"，如果此时 $\textit{cnt1} > 0$，我们可以消除一个 "ab"，并增加 $x$ 分，否则我们只能增加 $\textit{cnt2}$；
如果 $c$ 是其他字符，那么对于该子字符串，我们剩下了一个 $\textit{cnt2}$ 个 "b" 和 $\textit{cnt1}$ 个 "a"，我们可以消除 $\min(\textit{cnt1}, \textit{cnt2})$ 个 "ab"，并增加 $y$ 分。

遍历结束后，我们还需要额外处理一下剩余的 "ab"，增加若干个 $y$ 分。

###python

class Solution:
    def maximumGain(self, s: str, x: int, y: int) -> int:
        a, b = "a", "b"
        if x < y:
            x, y = y, x
            a, b = b, a
        ans = cnt1 = cnt2 = 0
        for c in s:
            if c == a:
                cnt1 += 1
            elif c == b:
                if cnt1:
                    ans += x
                    cnt1 -= 1
                else:
                    cnt2 += 1
            else:
                ans += min(cnt1, cnt2) * y
                cnt1 = cnt2 = 0
        ans += min(cnt1, cnt2) * y
        return ans

###java

class Solution {
    public int maximumGain(String s, int x, int y) {
        char a = 'a', b = 'b';
        if (x < y) {
            int t = x;
            x = y;
            y = t;
            char c = a;
            a = b;
            b = c;
        }
        int ans = 0, cnt1 = 0, cnt2 = 0;
        int n = s.length();
        for (int i = 0; i < n; ++i) {
            char c = s.charAt(i);
            if (c == a) {
                cnt1++;
            } else if (c == b) {
                if (cnt1 > 0) {
                    ans += x;
                    cnt1--;
                } else {
                    cnt2++;
                }
            } else {
                ans += Math.min(cnt1, cnt2) * y;
                cnt1 = 0;
                cnt2 = 0;
            }
        }
        ans += Math.min(cnt1, cnt2) * y;
        return ans;
    }
}

###cpp

class Solution {
public:
    int maximumGain(string s, int x, int y) {
        char a = 'a', b = 'b';
        if (x < y) {
            swap(x, y);
            swap(a, b);
        }

        int ans = 0, cnt1 = 0, cnt2 = 0;
        for (char c : s) {
            if (c == a) {
                cnt1++;
            } else if (c == b) {
                if (cnt1) {
                    ans += x;
                    cnt1--;
                } else {
                    cnt2++;
                }
            } else {
                ans += min(cnt1, cnt2) * y;
                cnt1 = 0;
                cnt2 = 0;
            }
        }
        ans += min(cnt1, cnt2) * y;
        return ans;
    }
};

###go

func maximumGain(s string, x int, y int) (ans int) {
a, b := 'a', 'b'
if x < y {
x, y = y, x
a, b = b, a
}

var cnt1, cnt2 int
for _, c := range s {
if c == a {
cnt1++
} else if c == b {
if cnt1 > 0 {
ans += x
cnt1--
} else {
cnt2++
}
} else {
ans += min(cnt1, cnt2) * y
cnt1, cnt2 = 0, 0
}
}
ans += min(cnt1, cnt2) * y
return
}

###ts

function maximumGain(s: string, x: number, y: number): number {
    let [a, b] = ['a', 'b'];
    if (x < y) {
        [x, y] = [y, x];
        [a, b] = [b, a];
    }

    let [ans, cnt1, cnt2] = [0, 0, 0];
    for (let c of s) {
        if (c === a) {
            cnt1++;
        } else if (c === b) {
            if (cnt1) {
                ans += x;
                cnt1--;
            } else {
                cnt2++;
            }
        } else {
            ans += Math.min(cnt1, cnt2) * y;
            cnt1 = 0;
            cnt2 = 0;
        }
    }
    ans += Math.min(cnt1, cnt2) * y;
    return ans;
}

###rust

impl Solution {
    pub fn maximum_gain(s: String, mut x: i32, mut y: i32) -> i32 {
        let (mut a, mut b) = ('a', 'b');
        if x < y {
            std::mem::swap(&mut x, &mut y);
            std::mem::swap(&mut a, &mut b);
        }

        let mut ans = 0;
        let mut cnt1 = 0;
        let mut cnt2 = 0;

        for c in s.chars() {
            if c == a {
                cnt1 += 1;
            } else if c == b {
                if cnt1 > 0 {
                    ans += x;
                    cnt1 -= 1;
                } else {
                    cnt2 += 1;
                }
            } else {
                ans += cnt1.min(cnt2) * y;
                cnt1 = 0;
                cnt2 = 0;
            }
        }

        ans += cnt1.min(cnt2) * y;
        ans
    }
}

###js

function maximumGain(s, x, y) {
    let [a, b] = ['a', 'b'];
    if (x < y) {
        [x, y] = [y, x];
        [a, b] = [b, a];
    }

    let [ans, cnt1, cnt2] = [0, 0, 0];
    for (let c of s) {
        if (c === a) {
            cnt1++;
        } else if (c === b) {
            if (cnt1) {
                ans += x;
                cnt1--;
            } else {
                cnt2++;
            }
        } else {
            ans += Math.min(cnt1, cnt2) * y;
            cnt1 = 0;
            cnt2 = 0;
        }
    }
    ans += Math.min(cnt1, cnt2) * y;
    return ans;
}

###cs

public class Solution {
    public int MaximumGain(string s, int x, int y) {
        char a = 'a', b = 'b';
        if (x < y) {
            (x, y) = (y, x);
            (a, b) = (b, a);
        }

        int ans = 0, cnt1 = 0, cnt2 = 0;
        foreach (char c in s) {
            if (c == a) {
                cnt1++;
            } else if (c == b) {
                if (cnt1 > 0) {
                    ans += x;
                    cnt1--;
                } else {
                    cnt2++;
                }
            } else {
                ans += Math.Min(cnt1, cnt2) * y;
                cnt1 = 0;
                cnt2 = 0;
            }
        }

        ans += Math.Min(cnt1, cnt2) * y;
        return ans;
    }
}

时间复杂度 $O(n)$，其中 $n$ 为字符串 $s$ 的长度。空间复杂度 $O(1)$。

有任何问题，欢迎评论区交流，欢迎评论区提供其它解题思路（代码），也可以点个赞支持一下作者哈😄~

每日一题-删除子字符串的最大得分🟡

LeetCode 每日一题题解

2025年7月23日 00:00

给你一个字符串 s 和两个整数 x 和 y 。你可以执行下面两种操作任意次。

删除子字符串 "ab" 并得到 x 分。
- 比方说，从 "cabxbae" 删除 ab ，得到 "cxbae" 。
删除子字符串"ba" 并得到 y 分。
- 比方说，从 "cabxbae" 删除 ba ，得到 "cabxe" 。

请返回对 s 字符串执行上面操作若干次能得到的最大得分。

示例 1：

输入：s = "cdbcbbaaabab", x = 4, y = 5
输出：19
解释：
- 删除 "cdbcbbaaabab" 中加粗的 "ba" ，得到 s = "cdbcbbaaab" ，加 5 分。
- 删除 "cdbcbbaaab" 中加粗的 "ab" ，得到 s = "cdbcbbaa" ，加 4 分。
- 删除 "cdbcbbaa" 中加粗的 "ba" ，得到 s = "cdbcba" ，加 5 分。
- 删除 "cdbcba" 中加粗的 "ba" ，得到 s = "cdbc" ，加 5 分。
总得分为 5 + 4 + 5 + 5 = 19 。

示例 2：

输入：s = "aabbaaxybbaabb", x = 5, y = 4
输出：20

提示：

1 <= s.length <= 10⁵
1 <= x, y <= 10⁴
s 只包含小写英文字母。

【无需额外空间】贪心算法

LeetCode 每日一题题解

Arsenal-591

2021年1月10日 00:11

方法一：贪心

首先，不妨假设 $"ab"$ 的得分总是不低于 $"ba"$；否则，我们将字符串中的字符 $a$ 换成 $b$，$b$ 换成 $a$，再交换 $x$ 和 $y$ 即可。

随后，我们也只需考虑字符串中只包含 $a,b$ 的情形。如果字符串中含有其他的字符，就以该字符为分隔，分别考虑左右两个字符串即可。

注意到：对于一个只包含 $a,b$ 的字符串而言，无论采取怎样的方案进行消除，最后一定只剩下一种字符（或者为空字符串）；而由于每次消除操作都同等地将 $a,b$ 的出现次数减 $1$，因此总的消除操作数量也是固定的。既然消除操作的数量是固定值，那么最优的策略一定是：尽可能地多消除 $ab$。

因此，我们维护两个计数器 $c_a, c_b$，分别代表着 $a,b$ 两种字符剩余的数目。

如果当前字符为 $a$，由于贪心策略要求多消除 $ab$，因此此时不消除该字符 $a$，而是将 $c_a$ 递增
如果当前字符为 $b$，
- 如果 $c_a > 0$，说明此前有剩余的字符 $a$，因此我们利用这个 $a$ 消除当前的 $b$，于是将 $c_a$ 递减，并记录一次得分 $x$
- 如果 $c_a = 0$，说明没有剩余的字符 $a$ 了，此时我们无法将这个 $b$ 消除掉，于是将 $c_b$ 递增。

最后，我们留下了 $c_a$ 个字符 $a$，$c_b$ 个字符 $b$。此时我们终于可以消除 $ba$ 了，消除的次数为 $\min{c_a, c_b}$，故记录得分 $y\cdot min{c_a, c_b}$.

class Solution {
public:
    int maximumGain(string s, int x, int y) {
        int n = s.length();
        if (x < y) {
            swap(x, y);
            for (int i = 0; i < n; i++) {
                if (s[i] == 'a') s[i] = 'b';
                else if (s[i] == 'b') s[i] = 'a';
            }
        }

        int ret = 0;
        int i = 0;
        while (i < n) {
            while (i < n && s[i] != 'a' && s[i] != 'b') i++;
            
            int ca = 0, cb = 0;
            while (i < n && (s[i] == 'a' || s[i] == 'b')) {
                if (s[i] == 'a') {
                    ca++;
                } else {
                    if (ca > 0) {
                        ca--;
                        ret += x;
                    } else {
                        cb++;
                    }
                }
                i++;
            }
            
            ret += min(ca, cb) * y;
        }

        return ret;
    }
};

C++ 贪心超多细节，手把手讲明白噢！

LeetCode 每日一题题解

已注销

2021年1月10日 00:11

题目2：5634. 删除子字符串的最大得分

思路：贪心

贪心算法的正确性

我们不妨假设 $x <= y$，由于本题只涉及到 $a, b$ 两个字符，且不论是删除 $ab$ 还是 $ba$，两个字符都是同等数量的减少，不会凭空产生，所以一定不会出现优先删除了一个 $ba$，导致无法删除多个 $ab$ ，从而失去了获得更高分数的可能性。故利用贪心算法优先删除得分较高的子字符串是正确的。
问题处理的优化

对于如下的两个字符串互为逆序的示例，示例 $1$ 将先删除 $ab$ ，再删除 $ba$ 得到 $15$ 分，而示例 $2$ 将先删除 $ba$ ，再删除 $ab$ 得到 $15$ 分，我们发现，字符串逆序后，将对应的得分也进行互换，最终的最大得分是相同的，所以，我们可以将 $x > y$ 的情况经过预处理转化为 $x <= y$ 的情况。

###cpp

示例1：s = "abcdba", x = 10, y = 5
示例2：s = "abdcba", x = 5, y = 10

具体实现的细节

在用栈结构优先处理完所有 $ba$ 子字符串后，要第二次处理栈中剩余的 $ab$ 子字符串。由于栈具有“后入先出”的特点，该栈中 $ab$ 子字符串弹出元素的顺序实际上为 $ba$，所以在代码实现中，利用了第二个栈 $t$ 作为载体，基本复制第一次处理的代码即可。

代码：

###c++

class Solution {
public:
    int maximumGain(string a, int x, int y) {
        stack<char> s, t;
        int ret = 0;
        // 处理优化
        if(x > y) {
            swap(x, y);
            reverse(a.begin(), a.end());
        } 
        // 先处理 ba
        for(char c : a) {
            if(c != 'a') s.push(c);
            else {
                // 形成 ba 子字符串
                if(!s.empty() && s.top() == 'b') {
                    s.pop();
                    ret += y;
                } else {
                    s.push(c);
                }
            }
        }
        // 再处理 ab
        while(!s.empty()) {
            char c = s.top();
            s.pop();
            if(c != 'a') t.push(c);
            else {
                if(!t.empty() && t.top() == 'b') {
                    t.pop();
                    ret += x;
                } else {
                    t.push(c);
                }
            }
        }
        return ret;
    }
};

复杂度分析：

时间复杂度为 $O(n)$，相当于遍历了两次字符串。
空间复杂度为 $O(n)$，利用了两个辅助栈。

关注GTAlgorithm，专注周赛、面经题解分享，陪大家一起攻克算法难关~

层序遍历？套模板就够了

LeetCode 第 222 场周赛题解

每日一题-删除子数组的最大得分🟡

LeetCode 每日一题题解

2025年7月22日 00:00

给你一个正整数数组 nums ，请你从中删除一个含有 若干不同元素 的子数组。删除子数组的得分就是子数组各元素之和。

返回 只删除一个 子数组可获得的 最大得分 。

如果数组 b 是数组 a 的一个连续子序列，即如果它等于 a[l],a[l+1],...,a[r] ，那么它就是 a 的一个子数组。

示例 1：

输入：nums = [4,2,4,5,6]
输出：17
解释：最优子数组是 [2,4,5,6]

示例 2：

输入：nums = [5,2,1,2,5,2,1,2,5]
输出：8
解释：最优子数组是 [5,2,1] 或 [1,2,5]

提示：

1 <= nums.length <= 10⁵
1 <= nums[i] <= 10⁴

滑动窗口 + 数组 / 哈希集合（Python/Java/C++/Go/JS/Rust）

LeetCode 每日一题题解

endlesscheng

2025年7月10日 13:50

本题和 3. 无重复字符的最长子串是一样的，额外维护窗口中的元素和即可。见滑动窗口【基础算法精讲 03】。

本题可以用布尔数组实现（更快），也可以用哈希集合实现（更通用，适用于 $\textit{nums}[i]\le 10^9$ 这种值域更大的场景）。

数组写法

###py

class Solution:
    def maximumUniqueSubarray(self, nums: List[int]) -> int:
        has = [False] * (max(nums) + 1)
        ans = s = left = 0
        for x in nums:
            while has[x]:
                has[nums[left]] = False
                s -= nums[left]
                left += 1
            has[x] = True
            s += x
            ans = max(ans, s)
        return ans

###java

class Solution {
    public int maximumUniqueSubarray(int[] nums) {
        int mx = 0;
        for (int x : nums) {
            mx = Math.max(mx, x);
        }

        boolean[] has = new boolean[mx + 1];
        int ans = 0, s = 0, left = 0;
        for (int x : nums) {
            while (has[x]) {
                has[nums[left]] = false;
                s -= nums[left];
                left++;
            }
            has[x] = true;
            s += x;
            ans = Math.max(ans, s);
        }
        return ans;
    }
}

###cpp

class Solution {
public:
    int maximumUniqueSubarray(vector<int>& nums) {
        int mx = ranges::max(nums);
        vector<int8_t> has(mx + 1);
        int ans = 0, s = 0, left = 0;
        for (int x : nums) {
            while (has[x]) {
                has[nums[left]] = false;
                s -= nums[left];
                left++;
            }
            has[x] = true;
            s += x;
            ans = max(ans, s);
        }
        return ans;
    }
};

###go

func maximumUniqueSubarray(nums []int) (ans int) {
mx := slices.Max(nums)
has := make([]bool, mx+1)
s, left := 0, 0
for _, x := range nums {
for has[x] {
has[nums[left]] = false
s -= nums[left]
left++
}
has[x] = true
s += x
ans = max(ans, s)
}
return
}

###js

var maximumUniqueSubarray = function(nums) {
    const mx = Math.max(...nums);
    const has = Array(mx + 1).fill(false);
    let ans = 0, s = 0, left = 0;
    for (const x of nums) {
        while (has[x]) {
            has[nums[left]] = false;
            s -= nums[left];
            left++;
        }
        has[x] = true;
        s += x;
        ans = Math.max(ans, s);
    }
    return ans;
};

###rust

impl Solution {
    pub fn maximum_unique_subarray(nums: Vec<i32>) -> i32 {
        let mx = *nums.iter().max().unwrap();
        let mut has = vec![false; (mx + 1) as usize];
        let mut ans = 0;
        let mut s = 0;
        let mut left = 0;
        for &x in &nums {
            while has[x as usize] {
                has[nums[left] as usize] = false;
                s -= nums[left];
                left += 1;
            }
            has[x as usize] = true;
            s += x;
            ans = ans.max(s);
        }
        ans
    }
}

哈希集合写法

###py

class Solution:
    def maximumUniqueSubarray(self, nums: List[int]) -> int:
        st = set()
        ans = s = left = 0
        for x in nums:
            while x in st:
                st.remove(nums[left])
                s -= nums[left]
                left += 1
            st.add(x)
            s += x
            ans = max(ans, s)
        return ans

###java

class Solution {
    public int maximumUniqueSubarray(int[] nums) {
        Set<Integer> set = new HashSet<>();
        int ans = 0, s = 0, left = 0;
        for (int x : nums) {
            while (set.contains(x)) {
                set.remove(nums[left]);
                s -= nums[left];
                left++;
            }
            set.add(x);
            s += x;
            ans = Math.max(ans, s);
        }
        return ans;
    }
}

###cpp

class Solution {
public:
    int maximumUniqueSubarray(vector<int>& nums) {
        unordered_set<int> st;
        int ans = 0, s = 0, left = 0;
        for (int x : nums) {
            while (st.contains(x)) {
                st.erase(nums[left]);
                s -= nums[left];
                left++;
            }
            st.insert(x);
            s += x;
            ans = max(ans, s);
        }
        return ans;
    }
};

###go

func maximumUniqueSubarray(nums []int) (ans int) {
has := map[int]bool{}
s, left := 0, 0
for _, x := range nums {
for has[x] {
delete(has, nums[left])
s -= nums[left]
left++
}
has[x] = true
s += x
ans = max(ans, s)
}
return
}

###js

var maximumUniqueSubarray = function(nums) {
    const set = new Set();
    let ans = 0, s = 0, left = 0;
    for (const x of nums) {
        while (set.has(x)) {
            set.delete(nums[left]);
            s -= nums[left];
            left++;
        }
        set.add(x);
        s += x;
        ans = Math.max(ans, s);
    }
    return ans;
};

###rust

use std::collections::HashSet;

impl Solution {
    pub fn maximum_unique_subarray(nums: Vec<i32>) -> i32 {
        let mut set = HashSet::new();
        let mut ans = 0;
        let mut s = 0;
        let mut left = 0;
        for &x in &nums {
            while set.contains(&x) {
                set.remove(&nums[left]);
                s -= nums[left];
                left += 1;
            }
            set.insert(x);
            s += x;
            ans = ans.max(s);
        }
        ans
    }
}

复杂度分析

时间复杂度：$\mathcal{O}(n)$，其中 $n$ 是 $\textit{nums}$ 的长度。虽然写了个二重循环，但是内层循环中对 $\textit{left}$ 加一的总执行次数不会超过 $n$ 次，所以总循环次数为 $\mathcal{O}(n)$。
空间复杂度：$\mathcal{O}(n)$。

思考题

如果 $\textit{nums}$ 中有负数，要怎么做？

欢迎在评论区分享你的思路/代码。

分类题单

如何科学刷题？

我的题解精选（已分类）

欢迎关注 B站@灵茶山艾府

前缀和 + 滑动窗口

LeetCode 每日一题题解

yixingzhang

2020年12月20日 14:39

解法：前缀和 + 滑动窗口

首先由题意知，我们需要返回一个连续的子数组的最大和，该连续子数组中不能有相同的元素。所以我们有三个功能要实现：

找出所有的连续子数组
判断该子数组中是否有相同的元素
求出该子数组的和，和其他子数组的和进行比较

滑动窗口

我们需要的是一个连续的子数组，所以可以用双指针来实现滑动窗口，指针不断向后移动直到越界为止。

如何判断窗口是否有相同的元素呢？

大家的第一想法肯定是 Hash 表，而为了提升效率，我们可使用计数排序的思想，来用空间换时间的方法来提升效率。对于窗口中的 nums[i]，我们使用 sign[nums[i]] 来判断它是否已存在。

如何快速求出窗口中的元素和呢？

如果我们要求 nums[i] ~ nums[j] 的和，最暴力的方法就是使用循环挨个累加。而我们可以对数组进行预处理，求出前缀和。创建新的数组 array，数组 array[i+1] 存放 nums[0] ~ nums[i] 的和。nums[i] ~ nums[j] 的和为 array[j+1] - array[i]，这样就＊＊＊缩短了求和的时间。

执行用时：7 ms, 在所有 Java 提交中击败了94.20%的用户
内存消耗：47.2 MB, 在所有 Java 提交中击败了98.07%的用户

###java

class Solution {
    public int maximumUniqueSubarray(int[] nums) {
        // 存放前缀和
        int[] array = new int[nums.length + 1];
        // 存放数组中的最大值，用来确定标记数组的长度
        int maxlength = 0;
        // 首先对数组进行预处理并求出最大值
        for (int i = 0; i < nums.length; i++) {
            array[i + 1] = array[i] + nums[i];
            maxlength = Math.max(maxlength, nums[i]);
        }
        int max = 0;
        // 创建标记数组
        boolean[] sign = new boolean[maxlength + 1];
        // 记录起始下标
        int start = 0;
        // 记录结束下标
        int end = 0;
        while (end < nums.length) {
            // 窗口扩大向后移动
            for (; end < nums.length; end++) {
                // 如果该数已被访问过则结束循环
                if (sign[nums[end]]) {
                    break;
                }
                // 对于访问过的数做标记
                sign[nums[end]] = true;
            }
            // 计算窗口的和
            int sum = array[end] - array[start];
            // 求出最大的和
            if (max < sum) {
                max = sum;
            }
            if (end < nums.length) {
                // 窗口缩小，向右移动
                while (start < end && nums[end] != nums[start]) {
                    // 回溯标记过的下标
                    sign[nums[start]] = false;
                    start++;
                }
                sign[nums[start]] = false;
                start++;
            }
        }
        return max;
    }
}

文中若有不恰当的地方，请您一定要告诉我。前路崎岖，望我们可以互相帮助，并肩前行！

滑动窗口，双百

LeetCode 每日一题题解

nian-zi-zai-zi

2020年12月20日 13:12

开始用map记录下标和数字，结果超时了。后来改用数组记录，隐藏用例超时了。心态崩了。其实只用set记录就好了。

###java

class Solution {
    public int maximumUniqueSubarray(int[] nums) {
        Set<Integer> set = new HashSet<>();
        int max = 0, sum = 0, start = 0;
        for (int i = 0; i < nums.length; i++) {
            if (!set.contains(nums[i])) {
                set.add(nums[i]);
                sum += nums[i];
                max = Math.max(sum, max);
            } else {
                while (nums[i] != nums[start]) {
                    sum -= nums[start];
                    set.remove(nums[start]);
                    start++;
                }
                start++;
            }
        }
        return max;
    }
}

每日一题-删除字符使字符串变好🟢

LeetCode 每日一题题解

2025年7月21日 00:00

一个字符串如果没有 三个连续 相同字符，那么它就是一个 好字符串 。

给你一个字符串 s ，请你从 s 删除最少的字符，使它变成一个 好字符串 。

请你返回删除后的字符串。题目数据保证答案总是 唯一的 。

示例 1：

输入：s = "leeetcode"
输出："leetcode"
解释：
从第一组 'e' 里面删除一个 'e' ，得到 "leetcode" 。
没有连续三个相同字符，所以返回 "leetcode" 。

示例 2：

输入：s = "aaabaaaa"
输出："aabaa"
解释：
从第一组 'a' 里面删除一个 'a' ，得到 "aabaaaa" 。
从第二组 'a' 里面删除两个 'a' ，得到 "aabaa" 。
没有连续三个相同字符，所以返回 "aabaa" 。

示例 3：

输入：s = "aab"
输出："aab"
解释：没有连续三个相同字符，所以返回 "aab" 。

提示：

1 <= s.length <= 10⁵
s 只包含小写英文字母。

简单题，简单做（Python/Java/C++/C/Go/JS/Rust）

LeetCode 每日一题题解

endlesscheng

2025年7月5日 13:06

把 $s$ 按照连续相同字母分成若干段，每段保留至多 $2$ 个字母。

示例 2 的 $s=\texttt{aaabaaaa}$，分成三段 $\texttt{aaa} + \texttt{b} + \texttt{aaaa}$，其中第一段和第三段不符合要求（有三个连续相同字符），保留 $2$ 个字母，变成 $\texttt{aa} + \texttt{b} + \texttt{aa} = \texttt{aabaa}$。

用一个计数器 $\textit{cnt}$ 统计每一段的当前长度，如果 $\textit{cnt}<3$ 就把当前字母加入答案。

如果当前字母和下一个字母不同，则重置 $\textit{cnt}=0$，统计下一段的长度。

###py

class Solution:
    def makeFancyString(self, s: str) -> str:
        ans = []
        cnt = 0
        for i, ch in enumerate(s):
            cnt += 1
            if cnt < 3:
                ans.append(ch)
            if i < len(s) - 1 and ch != s[i + 1]:
                cnt = 0  # 当前字母和下个字母不同，重置计数器
        return ''.join(ans)

###py

class Solution:
    def makeFancyString(self, s: str) -> str:
        ans = []
        for _, group in groupby(s):
            ans += list(group)[:2]
        return ''.join(ans)

###java

class Solution {
    public String makeFancyString(String s) {
        StringBuilder ans = new StringBuilder();
        int cnt = 0;
        for (int i = 0; i < s.length(); i++) {
            cnt++;
            if (cnt < 3) {
                ans.append(s.charAt(i));
            }
            if (i < s.length() - 1 && s.charAt(i) != s.charAt(i + 1)) {
                cnt = 0; // 当前字母和下个字母不同，重置计数器
            }
        }
        return ans.toString();
    }
}

###cpp

class Solution {
public:
    string makeFancyString(string s) {
        string ans;
        int cnt = 0;
        for (int i = 0; i < s.size(); i++) {
            cnt++;
            if (cnt < 3) {
                ans += s[i];
            }
            if (i < s.size() - 1 && s[i] != s[i + 1]) {
                cnt = 0; // 当前字母和下个字母不同，重置计数器
            }
        }
        return ans;
    }
};

###c

char* makeFancyString(char* s) {
    int cnt = 0, j = 0;
    for (int i = 0; s[i]; i++) {
        cnt++;
        if (cnt < 3) {
            s[j++] = s[i];
        }
        if (s[i] != s[i + 1]) {
            cnt = 0; // 当前字母和下个字母不同，重置计数器
        }
    }
    s[j] = '\0';
    return s;
}

###go

func makeFancyString(s string) string {
ans := []byte{}
cnt := 0
for i, ch := range s {
cnt++
if cnt < 3 {
ans = append(ans, byte(ch))
}
if i < len(s)-1 && byte(ch) != s[i+1] {
cnt = 0 // 当前字母和下个字母不同，重置计数器
}
}
return string(ans)
}

###js

var makeFancyString = function(s) {
    const ans = [];
    let cnt = 0;
    for (let i = 0; i < s.length; i++) {
        cnt++;
        if (cnt < 3) {
            ans.push(s[i]);
        }
        if (i < s.length - 1 && s[i] !== s[i + 1]) {
            cnt = 0; // 当前字母和下个字母不同，重置计数器
        }
    }
    return ans.join('');
};

###rust

impl Solution {
    pub fn make_fancy_string(s: String) -> String {
        let s = s.into_bytes();
        let mut ans = vec![];
        let mut cnt = 0;
        for (i, &ch) in s.iter().enumerate() {
            cnt += 1;
            if cnt < 3 {
                ans.push(ch);
            }
            if i + 1 < s.len() && ch != s[i + 1] {
                cnt = 0; // 当前字母和下个字母不同，重置计数器
            }
        }
        unsafe { String::from_utf8_unchecked(ans) }
    }
}

复杂度分析

时间复杂度：$\mathcal{O}(n)$，其中 $n$ 是 $s$ 的长度。
空间复杂度：$\mathcal{O}(n)$ 或 $\mathcal{O}(1)$，取决于实现。

分类题单

如何科学刷题？

我的题解精选（已分类）

欢迎关注 B站@灵茶山艾府

5193. 删除字符使字符串变好 - 模拟

LeetCode 每日一题题解

MGAronya

2021年8月8日 20:13

5193. 删除字符使字符串变好

第一道题。

要我们删掉字符串中一部分的字母，使这个字符串能没有一个片段连续三个字母相同。

在string上一个一个的删会超时，所以改成了用char记录新的字符串。

###c++

class Solution {
public:
    string makeFancyString(string s) {
        const int n = s.size();
        char ans[n + 1];  //用char存储新字符串
        ans[0] = s[0];  //记录第一个字符
        int j = 1;
        for(int i = 1; i < n; ++i, ++j){
            ans[j] = s[i];    //记录下这个字符
            if(ans[j] == ans[j - 1]){  //如果和先前字符相等
                while(i < n && s[i] == ans[j]) ++i;//一直遍历到与该字符不等的地方
                --i;  //后面还有++i，这里先减一下
            }
        }
        ans[j] = '\0';  //最后的中止符
        return ans;
    }
};

删除字符使字符串变好

LeetCode 每日一题题解

LeetCode-Solution

2021年8月8日 00:18

方法一：模拟

思路与算法

如果想使得好字符串对应的删除字符数量最少，那么最佳的删除策略是：对于 $s$ 中每一个长度为 $k (k \ge 3)$ 的连续相同字符子串，删去其中任意 $k - 2$ 个字符。

我们可以用一个新字符串 $\textit{res}$ 来维护删除最少字符后得到的好字符串，并从左至右遍历字符串 $s$ 模拟删除过程。每当遍历至一个新的字符时，我们检查 $\textit{res}$ 中的最后两个字符（如有）是否均等于当前字符：

如果是，则该字符应被删除，我们不将该字符添加进 $\textit{res}$；
如果不是，则不需要删除该字符，我们应当将该字符添加进 $\textit{res}$。

当遍历完成 $s$ 后，$\textit{res}$ 即为删除最少字符后得到的好字符串，我们返回 $\textit{res}$ 作为答案。

代码

###C++

class Solution {
public:
    string makeFancyString(string s) {
        string res;   // 删除后的字符串
        // 遍历 s 模拟删除过程
        for (char ch : s){
            int n = res.size();
            if (n >= 2 && res[n-1] == ch && res[n-2] == ch){
                // 如果 res 最后两个字符与当前字符均相等，则不添加
                continue;
            }
            // 反之则添加
            res.push_back(ch);
        }
        return res;
    }
};

###Python

class Solution:
    def makeFancyString(self, s: str) -> str:
        res = []   # 删除后的字符串
        # 遍历 s 模拟删除过程
        for ch in s:
            if len(res) >= 2 and res[-1] == res[-2] == ch:
                # 如果 res 最后两个字符与当前字符均相等，则不添加
                continue
            # 反之则添加
            res.append(ch)
        return "".join(res)

###Java

class Solution {
    public String makeFancyString(String s) {
        StringBuilder res = new StringBuilder();   // 删除后的字符串
        // 遍历 s 模拟删除过程
        for (char ch : s.toCharArray()) {
            int n = res.length();
            if (n >= 2 && res.charAt(n - 1) == ch && res.charAt(n - 2) == ch) {
                // 如果 res 最后两个字符与当前字符均相等，则不添加
                continue;
            }
            // 反之则添加
            res.append(ch);
        }
        return res.toString();
    }
}

###C#

public class Solution {
    public string MakeFancyString(string s) {
        StringBuilder res = new StringBuilder();   // 删除后的字符串
        // 遍历 s 模拟删除过程
        foreach (char ch in s) {
            int n = res.Length;
            if (n >= 2 && res[n-1] == ch && res[n-2] == ch) {
                // 如果 res 最后两个字符与当前字符均相等，则不添加
                continue;
            }
            // 反之则添加
            res.Append(ch);
        }
        return res.ToString();
    }
}

###Go

func makeFancyString(s string) string {
    res := []rune{}   // 删除后的字符串
    // 遍历 s 模拟删除过程
    for _, ch := range s {
        n := len(res)
        if n >= 2 && res[n - 1] == ch && res[n - 2] == ch {
            // 如果 res 最后两个字符与当前字符均相等，则不添加
            continue
        }
        // 反之则添加
        res = append(res, ch)
    }
    return string(res)
}

###C

char* makeFancyString(char* s) {
    int len = strlen(s);
    char* res = (char*)malloc(len + 1);   // 删除后的字符串
    int pos = 0;
    // 遍历 s 模拟删除过程
    for (int i = 0; i < len; i++) {
        char ch = s[i];
        if (pos >= 2 && res[pos-1] == ch && res[pos-2] == ch) {
            // 如果 res 最后两个字符与当前字符均相等，则不添加
            continue;
        }
        // 反之则添加
        res[pos++] = ch;
    }
    res[pos] = '\0';
    return res;
}

###JavaScript

var makeFancyString = function(s) {
    let res = [];   // 删除后的字符串
    // 遍历 s 模拟删除过程
    for (const ch of s) {
        const n = res.length;
        if (n >= 2 && res[n - 1] === ch && res[n - 2] === ch) {
            // 如果 res 最后两个字符与当前字符均相等，则不添加
            continue;
        }
        // 反之则添加
        res.push(ch);
    }
    return res.join('');
};

###TypeScript

function makeFancyString(s: string): string {
    const res: string[] = [];   // 删除后的字符串
    // 遍历 s 模拟删除过程
    for (const ch of s) {
        const n = res.length;
        if (n >= 2 && res[n - 1] === ch && res[n - 2] === ch) {
            // 如果 res 最后两个字符与当前字符均相等，则不添加
            continue;
        }
        // 反之则添加
        res.push(ch);
    }
    return res.join('');
};

###Rust

impl Solution {
    pub fn make_fancy_string(s: String) -> String {
        let mut res = Vec::new();   // 删除后的字符串
        // 遍历 s 模拟删除过程
        for ch in s.chars() {
            let n = res.len();
            if n >= 2 && res[n - 1] == ch && res[n - 2] == ch {
                // 如果 res 最后两个字符与当前字符均相等，则不添加
                continue;
            }
            // 反之则添加
            res.push(ch);
        }
        res.into_iter().collect()
    }
}

复杂度分析

时间复杂度：$O(n)$。
空间复杂度：由于不同语言的字符串实现与方法不同，空间复杂度也有所不同。对于 $\texttt{C++}$ 解法，空间复杂度为 $O(1)$；而对于 $\texttt{Python}$ 解法，空间复杂度为 $O(n)$。

每日一题-删除系统中的重复文件夹🔴

LeetCode 每日一题题解

2025年7月20日 00:00

由于一个漏洞，文件系统中存在许多重复文件夹。给你一个二维数组 paths，其中 paths[i] 是一个表示文件系统中第 i 个文件夹的绝对路径的数组。

例如，["one", "two", "three"] 表示路径 "/one/two/three" 。

如果两个文件夹（不需要在同一层级）包含 非空且相同的 子文件夹集合并具有相同的子文件夹结构，则认为这两个文件夹是相同文件夹。相同文件夹的根层级不需要相同。如果存在两个（或两个以上）相同文件夹，则需要将这些文件夹和所有它们的子文件夹标记为待删除。

例如，下面文件结构中的文件夹 "/a" 和 "/b" 相同。它们（以及它们的子文件夹）应该被全部标记为待删除：
- /a
- /a/x
- /a/x/y
- /a/z
- /b
- /b/x
- /b/x/y
- /b/z
然而，如果文件结构中还包含路径 "/b/w" ，那么文件夹 "/a" 和 "/b" 就不相同。注意，即便添加了新的文件夹 "/b/w" ，仍然认为 "/a/x" 和 "/b/x" 相同。

一旦所有的相同文件夹和它们的子文件夹都被标记为待删除，文件系统将会删除所有上述文件夹。文件系统只会执行一次删除操作。执行完这一次删除操作后，不会删除新出现的相同文件夹。

返回二维数组 ans ，该数组包含删除所有标记文件夹之后剩余文件夹的路径。路径可以按 任意顺序 返回。

示例 1：

输入：paths = [["a"],["c"],["d"],["a","b"],["c","b"],["d","a"]]
输出：[["d"],["d","a"]]
解释：文件结构如上所示。
文件夹 "/a" 和 "/c"（以及它们的子文件夹）都会被标记为待删除，因为它们都包含名为 "b" 的空文件夹。

示例 2：

输入：paths = [["a"],["c"],["a","b"],["c","b"],["a","b","x"],["a","b","x","y"],["w"],["w","y"]]
输出：[["c"],["c","b"],["a"],["a","b"]]
解释：文件结构如上所示。
文件夹 "/a/b/x" 和 "/w"（以及它们的子文件夹）都会被标记为待删除，因为它们都包含名为 "y" 的空文件夹。
注意，文件夹 "/a" 和 "/c" 在删除后变为相同文件夹，但这两个文件夹不会被删除，因为删除只会进行一次，且它们没有在删除前被标记。

示例 3：

输入：paths = [["a","b"],["c","d"],["c"],["a"]]
输出：[["c"],["c","d"],["a"],["a","b"]]
解释：文件系统中所有文件夹互不相同。
注意，返回的数组可以按不同顺序返回文件夹路径，因为题目对顺序没有要求。

示例 4：

输入：paths = [["a"],["a","x"],["a","x","y"],["a","z"],["b"],["b","x"],["b","x","y"],["b","z"]]
输出：[]
解释：文件结构如上所示。
文件夹 "/a/x" 和 "/b/x"（以及它们的子文件夹）都会被标记为待删除，因为它们都包含名为 "y" 的空文件夹。
文件夹 "/a" 和 "/b"（以及它们的子文件夹）都会被标记为待删除，因为它们都包含一个名为 "z" 的空文件夹以及上面提到的文件夹 "x" 。

示例 5：

输入：paths = [["a"],["a","x"],["a","x","y"],["a","z"],["b"],["b","x"],["b","x","y"],["b","z"],["b","w"]]
输出：[["b"],["b","w"],["b","z"],["a"],["a","z"]]
解释：本例与上例的结构基本相同，除了新增 "/b/w" 文件夹。
文件夹 "/a/x" 和 "/b/x" 仍然会被标记，但 "/a" 和 "/b" 不再被标记，因为 "/b" 中有名为 "w" 的空文件夹而 "/a" 没有。
注意，"/a/z" 和 "/b/z" 不会被标记，因为相同子文件夹的集合必须是非空集合，但这两个文件夹都是空的。

提示：

1 <= paths.length <= 2 * 10⁴
1 <= paths[i].length <= 500
1 <= paths[i][j].length <= 10
1 <= sum(paths[i][j].length) <= 2 * 10⁵
path[i][j] 由小写英文字母组成
不会存在两个路径都指向同一个文件夹的情况
对于不在根层级的任意文件夹，其父文件夹也会包含在输入中

【合集】一些有漏洞的哈希函数（更新中）

LeetCode 每日一题题解

hqztrue

2021年7月30日 11:09

这里介绍一些常见的错误哈希函数，并给出反例。这些哈希函数可能易于实现，并对于随机数据效果不错，所以广受人民群众喜爱(包括我)，但对于某些确定的树结构会出问题。
(“正确”的哈希函数应当对于任何树结构都可以保证极小的出错概率，出错与否是无法由测试数据设计者控制的。)
记号：以下用 $x$ 表示字母树上的一个结点，$y_1,\dots,y_d$ 表示结点 $x$ 的孩子集合，$s(x)$ 表示结点 $x$ 的文件夹名，$h(x)$ 表示结点 $x$ 的哈希值，$h(s)$ 表示字符串 $s$ 的哈希值。令 $M$ 代表随便选的一个数，$P$ 代表大质数。

一些错误的哈希函数：

$h(x)=M\cdot \sum_{1\leq i\leq d}h(y_i)+h'(s(x))~~(\bmod~P)$.
这个函数的问题是太线性了。当我们把不同儿子的哈希值加起来时，如果对一个儿子的哈希值加1，对另一个儿子的哈希值减1，那么结果不变。对于比较简单的 $h'(\cdot)$ 这很容易做到。
实现举例1 举例2 变种3(周赛CN第8名的代码 @JTJL)
反例：
[["a"],["b"],["a","b"],["a","d"],["b","a"],["b","e"]]
$h(x)=1+M\cdot \sum_{1\leq i\leq d}h(y_i)\cdot h'(s(y_i))~~(\bmod~P)$.
这是上一个函数的变种，因为有 $h'$ 的存在，看起来更复杂了。但即使 $h'$ 很复杂，如果我们把所有的 $h'(s(y_i))$ 看成变量，那么最终结果是关于它们的多项式。我们可以用多种方法(对应多棵不同的树)凑出同一个多项式，例如 $1+M((1+Ma)\cdot b+1\cdot a)=1+M((1+Mb)\cdot a+1\cdot b)$。此时即使并行使用多个模数也不能带来任何帮助。
实现举例1(更新后) 举例2(这个例子实现时有个小错误，反而卡不掉了。修正之后可以卡掉。)
反例：
[["y"],["y","a"],["y","a","b"],["y","b"],["z"],["z","b"],["z","b","a"],["z","a"]]
即使混合使用 $\bmod P$ 与自然溢出，当$P$较大时我们也可以用类似的办法构造数据使得出错概率较大(这可能有点反直觉，$P$大了反而容易出错)：
实现举例3 反例：
[["xtiiadgbdw"], ["xtiiadgbdw", "monetnwzju"], ["xtiiadgbdw", "monetnwzju", "uqqiqmeoqw"], ["xtiiadgbdw", "uqqiqmeoqw"], ["djjdtgpgmf"], ["djjdtgpgmf", "uqqiqmeoqw"], ["djjdtgpgmf", "uqqiqmeoqw", "monetnwzju"], ["djjdtgpgmf", "monetnwzju"]]
实现举例4 反例：
[["ztnorzuybq"], ["ztnorzuybq", "pyjmbyiqtv"], ["ztnorzuybq", "pyjmbyiqtv", "vfsicaayya"], ["ztnorzuybq", "vfsicaayya"], ["omvuigvvjx"], ["omvuigvvjx", "vfsicaayya"], ["omvuigvvjx", "vfsicaayya", "pyjmbyiqtv"], ["omvuigvvjx", "pyjmbyiqtv"]]
它们跟前一个反例的树结构是相同的，但字符串随机选取。
令 $h_i(x)=(h_{i-1}(x)\cdot M+h(y_i)\cdot M+h'(s(y_i)))\bmod P$ 为前 $i$ 个儿子的 hash 值, 而 $h(x)=h_d(x)$.
这个哈希函数类似Rabin-Karp，但有个重要的漏洞：如果我们把取模前的最终结果看成一个 $M$ 进制大数，则多个结点可能共享同一位。如果选取的 $h'(s)$ 函数过于简单，例如使用线性函数，则我们可以容易地对一个后代结点的哈希值加1，对相应的另一个后代结点的哈希值减1，并保持结果不变。
实现举例
反例：
[["y"],["z"],["a"],["i"],["j"],["e"],["y","a"],["y","a","b"],["y","d"],["y","d","e"],["z","i"],["z","i","b"],["z","d"],["z","d","j"]]
正确的做法应当记录子树大小$\mathrm{size}(y_i)$，合并一个子结点时乘上 $M^{\mathrm{size}(y_i)}$ 而不是 $M$。
在例三的基础上，即使选取较为复杂的函数 $h'(s)$，例如std::hash()，我们也可以换一种攻击方式：采用类似例二的方法，把所有的 $h'(s(y_i))$ 看成变量，并用多种方法凑出同一个多项式。常见的办法是找到两个共享答案同一位的点，并交换两个点在树中的位置。
以下例子使用的是 $h_i(x)=(h_{i-1}(x)\cdot M^2+h(y_i)\cdot M+h'(s(y_i)))\bmod P$，原理是一样的。
实现举例
反例：
[["y"],["y","b"],["y","b","a"],["y","d"],["y","d","c"],["z"],["z","d"],["z","d","c"],["z","d","c","b"],["z","d","c","b","a"]]
在例四的基础上，令 $h(x)=(h_d(x)\cdot M+B)\bmod P$，即在尾巴上添加一个随便选的常数 $B$. 注意如果在头上添加常数 $B$，即令 $h_0(x)=B$，则反例4已经能卡掉了。这个反例的构造方式是类似的。
实现举例1 变种2(周赛CN第4名的代码 @wifiii)
反例：
[["y"],["y","b"],["y","b","a"],["y","d"],["y","d","c"],["z"],["z","c"],["z","c","a"],["z","d"],["z","d","b"]]

删除系统中的重复文件夹

LeetCode 每日一题题解

LeetCode-Solution

2021年7月25日 15:46

方法一：子树的序列化表示

思路

我们可以想出这道题在抽象层面（也就是省去了所有实现细节）的解决方法，即：

第一步，我们通过给定的 $\textit{paths}$，简历出文件系统的树型表示。这棵树是一棵多叉树，根节点为 $\texttt{/}$，每个非根节点表示一个文件夹。
第二步，我们对整棵树从根节点开始进行一次遍历。根据题目中的描述，如果两个节点 $x$ 和 $y$ 包含的子文件夹的「结构」（即这些子文件夹、子文件夹的子文件夹等等，递归直到空文件夹为止）完全相同，我们就需要将 $x$ 和 $y$ 都进行删除。那么，为了得到某一个节点的子文件夹的「结构」，我们应当首先遍历完成该节点的所有子节点，再回溯遍历该节点本身。这就对应着多叉树的后序遍历。

在回溯到某节点时，我们需要将该节点的「结构」存储下来，记录在某一「数据结构」中，以便于与其它节点的「结构」进行比较。
第三步，我们再次对整棵树从根节点开始进行一次遍历。当我们遍历到节点 $x$ 时，如果 $x$ 的「结构」在「数据结构」中出现了超过 $1$ 次，那就说明存在于 $x$ 相同的文件夹，我们就需要将 $x$ 删除并回溯，否则 $x$ 是唯一的，我们将从根节点开始到 $x$ 的路径计入答案，并继续向下遍历 $x$ 的子节点。

在遍历完成后，我们就删除了所有重复的文件夹，并且得到了最终的答案。

算法

对于上面的三个步骤，我们依次尝试进行解决。

对于第一步而言，我们只需要定义一个表示树结构的类，建立一个根节点，随后遍历 $\textit{paths}$ 中的每一条表示文件夹的路径，将路径上的所有节点加入树中即可。如果读者已经掌握了字典树（Trie）这一数据结构，就可以较快地实现这一步。

对于第二步而言，难点不在于对树进行后序遍历，而在于如何表示一个节点的「结构」。我们可以参考「297. 二叉树的序列化与反序列化」，实现一个多叉树的序列化表示。我们用 $\text{serial}(x)$ 记录节点 $x$ 的序列化表示，具体地：

如果节点 $x$ 是子节点，那么 $\text{serial}(x)$ 为空字符串，这是因为节点 $x$ 中不包含任何文件夹，它没有「结构」。例如示例 $1$ 中，三个叶节点 $b, a, a$ 对应的序列化表示均为空字符串；
如果节点 $x$ 不是子节点，它的子节点分别为 $y_1, y_2, \cdots, y_k$ 那么 $\text{serial}(x)$ 递归定义为：

$$
\text{serial}(x) = y_1(\text{serial}(y_1))y_2(\text{serial}(y_2))\cdots y_k(\text{serial}(y_k))
$$

也就是说，我们首先递归地求出 $y_1, y_2, \cdots, y_k$ 的序列化表示，随后将它们连通本身的文件夹名拼接在一起，并在外层使用括号 $()$ 将它们之间进行区分（或者说隔离）。但如果只是随意地进行拼接，会产生顺序的问题，即如果有节点 $x_1$ 和 $x_2$，它们有相同的子节点 $y_1$ 和 $y_2$，但在 $x_1$ 的子节点中 $y_1$ 先出现 $y_2$ 后出现，而在 $x_2$ 的子节点中 $y_2$ 先出现而 $y_1$ 后出现，这样尽管 $x_1$ 和 $x_2$ 的「结构」是完全相同的，但会因为子节点的出现顺序不同，导致序列化的字符串不同。

因此，在将 $y_1, y_2, \cdots, y_k$ 的序列化表示进行拼接之前，我们可以对它们进行排序（字典序顺序），再将排序后的结果进行拼接，就可以保证具有相同「结构」的节点的序列化表示是完全相同的了。例如示例 $4$ 中，根节点下方的两个子节点 $a, b$，它们的序列化表示均为 $\texttt{x(y())z()}$。

这样一来，通过一次树的后序遍历，我们就可以求出每一个节点「结构」的序列化表示。由于序列化表示都是字符串，因此我们可以使用一个哈希映射，记录每一种序列化表示以及其对应的出现次数。

对于第三步而言，我们从根节点开始对树进行深度优先遍历，并使用一个数组 $\textit{path}$ 记录从根节点到当前遍历到的节点 $x$ 的路径。如果 $x$ 的序列化表示在哈希映射中出现了超过 $1$ 次，就进行回溯，否则将 $\textit{path}$ 加入答案，并向下递归遍历 $x$ 的所有子节点。

代码

下面的 $\texttt{C++}$ 代码没有析构树的空间。如果在面试中遇到了本题，可以和面试官进行沟通，询问是否需要析构对应的空间。

###C++

struct Trie {
    // 当前节点结构的序列化表示
    string serial;
    // 当前节点的子节点
    unordered_map<string, Trie*> children;
};

class Solution {
public:
    vector<vector<string>> deleteDuplicateFolder(vector<vector<string>>& paths) {
        // 根节点
        Trie* root = new Trie();

        for (const vector<string>& path: paths) {
            Trie* cur = root;
            for (const string& node: path) {
                if (!cur->children.count(node)) {
                    cur->children[node] = new Trie();
                }
                cur = cur->children[node];
            }
        }

        // 哈希表记录每一种序列化表示的出现次数
        unordered_map<string, int> freq;
        // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
        function<void(Trie*)> construct = [&](Trie* node) {
            // 如果是叶节点，那么序列化表示为空字符串，无需进行任何操作
            if (node->children.empty()) {
                return;
            }

            vector<string> v;
            // 如果不是叶节点，需要先计算子节点结构的序列化表示
            for (const auto& [folder, child]: node->children) {
                construct(child);
                v.push_back(folder + "(" + child->serial + ")");
            }
            // 防止顺序的问题，需要进行排序
            sort(v.begin(), v.end());
            for (string& s: v) {
                node->serial += move(s);
            }
            // 计入哈希表
            ++freq[node->serial];
        };

        construct(root);

        vector<vector<string>> ans;
        // 记录根节点到当前节点的路径
        vector<string> path;

        function<void(Trie*)> operate = [&](Trie* node) {
            // 如果序列化表示在哈希表中出现了超过 1 次，就需要删除
            if (freq[node->serial] > 1) {
                return;
            }
            // 否则将路径加入答案
            if (!path.empty()) {
                ans.push_back(path);
            }
            for (const auto& [folder, child]: node->children) {
                path.push_back(folder);
                operate(child);
                path.pop_back();
            }
        };

        operate(root);
        return ans;
    }
};

###Python

class Trie:
    # 当前节点结构的序列化表示
    serial: str = ""
    # 当前节点的子节点
    children: dict

    def __init__(self):
        self.children = dict()

class Solution:
    def deleteDuplicateFolder(self, paths: List[List[str]]) -> List[List[str]]:
        # 根节点
        root = Trie()

        for path in paths:
            cur = root
            for node in path:
                if node not in cur.children:
                    cur.children[node] = Trie()
                cur = cur.children[node]

        # 哈希表记录每一种序列化表示的出现次数
        freq = Counter()
        # 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
        def construct(node: Trie) -> None:
            # 如果是叶节点，那么序列化表示为空字符串，无需进行任何操作
            if not node.children:
                return

            v = list()
            # 如果不是叶节点，需要先计算子节点结构的序列化表示
            for folder, child in node.children.items():
                construct(child)
                v.append(folder + "(" + child.serial + ")")
            
            # 防止顺序的问题，需要进行排序
            v.sort()
            node.serial = "".join(v)
            # 计入哈希表
            freq[node.serial] += 1

        construct(root)

        ans = list()
        # 记录根节点到当前节点的路径
        path = list()

        def operate(node: Trie) -> None:
            # 如果序列化表示在哈希表中出现了超过 1 次，就需要删除
            if freq[node.serial] > 1:
                return
            # 否则将路径加入答案
            if path:
                ans.append(path[:])

            for folder, child in node.children.items():
                path.append(folder)
                operate(child)
                path.pop()

        operate(root)
        return ans

###Java

class Solution {
    class Trie {
        String serial; // 当前节点结构的序列化表示
        Map<String, Trie> children = new HashMap<>(); // 当前节点的子节点
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        Trie root = new Trie(); // 根节点

        // 构建字典树
        for (List<String> path : paths) {
            Trie cur = root;
            for (String node : path) {
                cur.children.putIfAbsent(node, new Trie());
                cur = cur.children.get(node);
            }
        }

        Map<String, Integer> freq = new HashMap<>(); // 哈希表记录每一种序列化表示的出现次数
        // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
        construct(root, freq);
        List<List<String>> ans = new ArrayList<>();
        List<String> path = new ArrayList<>();
        // 操作字典树，删除重复文件夹
        operate(root, freq, path, ans);
        return ans;
    }

    private void construct(Trie node, Map<String, Integer> freq) {
        if (node.children.isEmpty()) return; // 如果是叶节点，无需操作

        List<String> v = new ArrayList<>();
        for (Map.Entry<String, Trie> entry : node.children.entrySet()) {
            construct(entry.getValue(), freq);
            v.add(entry.getKey() + "(" + entry.getValue().serial + ")");
        }

        Collections.sort(v);
        StringBuilder sb = new StringBuilder();
        for (String s : v) {
            sb.append(s);
        }
        node.serial = sb.toString();
        freq.put(node.serial, freq.getOrDefault(node.serial, 0) + 1);
    }

    private void operate(Trie node, Map<String, Integer> freq, List<String> path, List<List<String>> ans) {
        if (freq.getOrDefault(node.serial, 0) > 1) return; // 如果序列化表示出现超过1次，需要删除

        if (!path.isEmpty()) {
            ans.add(new ArrayList<>(path));
        }

        for (Map.Entry<String, Trie> entry : node.children.entrySet()) {
            path.add(entry.getKey());
            operate(entry.getValue(), freq, path, ans);
            path.remove(path.size() - 1);
        }
    }
}

###C#

public class Solution {
    class Trie {
        public string Serial { get; set; } = ""; // 当前节点结构的序列化表示
        public Dictionary<string, Trie> Children { get; } = new Dictionary<string, Trie>(); // 当前节点的子节点
    }

    public IList<IList<string>> DeleteDuplicateFolder(IList<IList<string>> paths) {
        // 根节点
        Trie root = new Trie();
        // 构建字典树
        foreach (var p in paths) {
            Trie current = root;
            foreach (var node in p) {
                if (!current.Children.ContainsKey(node)) {
                    current.Children[node] = new Trie();
                }
                current = current.Children[node];
            }
        }

        // 哈希表记录每一种序列化表示的出现次数
        var freq = new Dictionary<string, int>();
        
        // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
        void Construct(Trie node) {
            // 如果是叶节点，那么序列化表示为空字符串，无需进行任何操作
            if (node.Children.Count == 0) {
                return;
            }
            var v = new List<string>();
            // 如果不是叶节点，需要先计算子节点结构的序列化表示
            foreach (var entry in node.Children) {
                Construct(entry.Value);
                v.Add($"{entry.Key}({entry.Value.Serial})");
            }
            // 防止顺序的问题，需要进行排序
            v.Sort();
            node.Serial = string.Join("", v);
            // 计入哈希表
            if (!freq.ContainsKey(node.Serial)) {
                freq[node.Serial] = 0;
            }
            freq[node.Serial]++;
        }

        Construct(root);
        var ans = new List<IList<string>>();
        // 记录根节点到当前节点的路径
        var path = new List<string>();

        void Operate(Trie node) {
            // 如果序列化表示在哈希表中出现了超过 1 次，就需要删除
            if (freq.TryGetValue(node.Serial, out int count) && count > 1) {
                return;
            }
            // 否则将路径加入答案
            if (path.Count > 0) {
                ans.Add(new List<string>(path));
            }

            foreach (var entry in node.Children) {
                path.Add(entry.Key);
                Operate(entry.Value);
                path.RemoveAt(path.Count - 1);
            }
        }

        Operate(root);
        return ans;
    }
}

###Go

type Trie struct {
serial   string             // 当前节点结构的序列化表示
children map[string]*Trie   // 当前节点的子节点
}

func deleteDuplicateFolder(paths [][]string) [][]string {
root := &Trie{children: make(map[string]*Trie)} // 根节点
// 构建字典树
for _, path := range paths {
cur := root
for _, node := range path {
if _, ok := cur.children[node]; !ok {
cur.children[node] = &Trie{children: make(map[string]*Trie)}
}
cur = cur.children[node]
}
}

freq := make(map[string]int) // 哈希表记录每一种序列化表示的出现次数
// 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
var construct func(*Trie)
construct = func(node *Trie) {
if len(node.children) == 0 {
return // 如果是叶节点，无需操作
}
v := make([]string, 0, len(node.children))
for folder, child := range node.children {
construct(child)
v = append(v, folder + "(" + child.serial + ")")
}
sort.Strings(v)
node.serial = strings.Join(v, "")
freq[node.serial]++
}
construct(root)
ans := make([][]string, 0)
path := make([]string, 0)
// 操作字典树，删除重复文件夹
var operate func(*Trie)
operate = func(node *Trie) {
if freq[node.serial] > 1 {
return // 如果序列化表示出现超过1次，需要删除
}

if len(path) > 0 {
tmp := make([]string, len(path))
copy(tmp, path)
ans = append(ans, tmp)
}

for folder, child := range node.children {
path = append(path, folder)
operate(child)
path = path[:len(path) - 1]
}
}
operate(root)

return ans
}

###JavaScript

var deleteDuplicateFolder = function(paths) {
    class Trie {
        constructor() {
            this.serial = ""; // 当前节点结构的序列化表示
            this.children = new Map(); // 当前节点的子节点
        }
    }

    const root = new Trie(); // 根节点
    // 构建字典树
    for (const path of paths) {
        let cur = root;
        for (const node of path) {
            if (!cur.children.has(node)) {
                cur.children.set(node, new Trie());
            }
            cur = cur.children.get(node);
        }
    }

    const freq = new Map(); // 哈希表记录每一种序列化表示的出现次数
    // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
    function construct(node) {
        if (node.children.size === 0) return; // 如果是叶节点，无需操作
        const v = [];
        for (const [folder, child] of node.children) {
            construct(child);
            v.push(`${folder}(${child.serial})`);
        }
        v.sort();
        node.serial = v.join("");
        freq.set(node.serial, (freq.get(node.serial) || 0) + 1);
    }
    construct(root);

    const ans = [];
    const path = [];
    // 操作字典树，删除重复文件夹
    function operate(node) {
        if ((freq.get(node.serial) || 0) > 1) return; // 如果序列化表示出现超过1次，需要删除
        if (path.length > 0) {
            ans.push([...path]);
        }
        for (const [folder, child] of node.children) {
            path.push(folder);
            operate(child);
            path.pop();
        }
    }
    operate(root);

    return ans;
}

###TypeScript

function deleteDuplicateFolder(paths: string[][]): string[][] {
    class Trie {
        serial: string = ""; // 当前节点结构的序列化表示
        children: Map<string, Trie> = new Map(); // 当前节点的子节点
    }

    const root = new Trie(); // 根节点

    // 构建字典树
    for (const path of paths) {
        let cur = root;
        for (const node of path) {
            if (!cur.children.has(node)) {
                cur.children.set(node, new Trie());
            }
            cur = cur.children.get(node)!;
        }
    }

    const freq = new Map<string, number>(); // 哈希表记录每一种序列化表示的出现次数
    // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
    function construct(node: Trie) {
        if (node.children.size === 0) return; // 如果是叶节点，无需操作

        const v: string[] = [];
        for (const [folder, child] of node.children) {
            construct(child);
            v.push(`${folder}(${child.serial})`);
        }

        v.sort();
        node.serial = v.join("");
        freq.set(node.serial, (freq.get(node.serial) || 0) + 1);
    }
    construct(root);
    const ans: string[][] = [];
    const path: string[] = [];

    // 操作字典树，删除重复文件夹
    function operate(node: Trie) {
        if ((freq.get(node.serial) || 0) > 1) return; // 如果序列化表示出现超过1次，需要删除

        if (path.length > 0) {
            ans.push([...path]);
        }

        for (const [folder, child] of node.children) {
            path.push(folder);
            operate(child);
            path.pop();
        }
    }
    operate(root);
    return ans;
}

###Rust

use std::collections::HashMap;

#[derive(Default)]
struct Trie {
    serial: String, // 当前节点结构的序列化表示
    children: HashMap<String, Trie>, // 当前节点的子节点
}

impl Solution {
    pub fn delete_duplicate_folder(paths: Vec<Vec<String>>) -> Vec<Vec<String>> {
        let mut root = Trie::default(); // 根节点
        // 构建字典树
        for path in paths {
            let mut cur = &mut root;
            for node in path {
                cur = cur.children.entry(node.clone()).or_default();
            }
        }

        let mut freq = HashMap::new(); // 哈希表记录每一种序列化表示的出现次数

        // 基于深度优先搜索的后序遍历，计算每一个节点结构的序列化表示
        fn construct(node: &mut Trie, freq: &mut HashMap<String, usize>) {
            if node.children.is_empty() {
                return; // 如果是叶节点，无需操作
            }

            let mut v = Vec::new();
            for (folder, child) in node.children.iter_mut() {
                construct(child, freq);
                v.push(format!("{}({})", folder, child.serial));
            }

            v.sort();
            node.serial = v.join("");
            *freq.entry(node.serial.clone()).or_default() += 1;
        }
        construct(&mut root, &mut freq);
        let mut ans = Vec::new();
        let mut path = Vec::new();

        // 操作字典树，删除重复文件夹
        fn operate(node: &Trie, freq: &HashMap<String, usize>, path: &mut Vec<String>, ans: &mut Vec<Vec<String>>) {
            if freq.get(&node.serial).unwrap_or(&0) > &1 {
                return; // 如果序列化表示出现超过1次，需要删除
            }

            if !path.is_empty() {
                ans.push(path.clone());
            }

            for (folder, child) in &node.children {
                path.push(folder.clone());
                operate(child, freq, path, ans);
                path.pop();
            }
        }
        operate(&root, &freq, &mut path, &mut ans);

        ans
    }
}

复杂度分析

这里我们只考虑计算所有节点结构的序列化表示需要的时间，以及哈希映射需要使用的空间。对于其它的项（无论是时间项还是空间项），它们在渐近意义下一定都小于计算以及存储序列化表示的部分，因此可以忽略。

在最坏情况下，节点结构的序列化表示的字符串两两不同，那么需要的时间和空间级别均为「所有节点结构的序列化表示的字符串的长度之和」。如何求出这个长度之和的上界呢？

这里我们需要用到一个很重要的结论：

设 $T$ 为无权多叉树。对于 $T$ 中的节点 $x$，记 $\textit{dist}[x]$ 为从根节点到 $x$ 经过的节点个数，$\textit{size}[x]$ 为以 $x$ 为根的子树的大小，那么有：

$$
\sum_{x \in T} \textit{dist}[x] = \sum_{x \in T} \textit{size}[x]
$$

证明也较为直观。对于任意的节点 $x'$，在右侧的 $\sum_{x \in T} \textit{size}[x]$ 中，$x'$ 被包含在 $\textit{size}[x]$ 中的次数就等于 $x'$ 的祖先节点的数目（也包括 $x'$ 本身），其等于根节点到 $x'$ 的经过的节点个数，因此得证。

回到本题，$\textit{path}$ 中给出了根节点到所有节点的路径，其中最多包含 $2\times 10^5$ 个字符，那么 $\sum_{x \in T} \textit{dist}[x]$ 不超过 $2\times 10^5$，$\sum_{x \in T} \textit{size}[x]$ 同样也不超过 $2\times 10^5$。

对于任意的节点 $x$，$x$ 结构的序列化表示的字符串长度包含两部分，第一部分为其中所有子文件夹名的长度之和，其不超过 $10 \cdot \textit{size}[x]$，第二部分为额外添加的用来区分的括号，由于一个子文件夹会恰好被添加一对括号，因此其不超过 $2 \cdot \textit{size}[x]$。这样一来，「所有节点结构的序列化表示的字符串的长度之和」的上界为：

$$
12 \cdot \sum_{x \in T} \textit{size}[x] = 2.4 \times 10^6
$$

即空间的数量级为 $10^6$。而对于时间，即使算上排序的额外 $\log$ 的时间复杂度，也在 $10^7$ 的数量级，可以在规定的时间内通过本题。并且需要指出的是，在上述估算上界的过程中，我们作出的许多假设是非常极端的，因此实际上该方法的运行时间很快。

括号表示法 + 字典树（Python/Java/C++/Go）

LeetCode 每日一题题解

endlesscheng

2021年7月25日 12:06

核心思路：把相同的子树映射为相同的字符串，就能用哈希表去重了。

如何把子树转化成字符串？为了准确判断两棵子树的结构是否相同，需要做到两点：

字符串需要包含子树所有节点的文件夹名。
字符串要能够表达节点的父子关系。

考察树的递归过程，把向下「递」的动作用一个非字母字符表示，向上「归」的动作用另一个非字母字符表示，就可以描述一棵树的形状了（用非字母字符是为了与文件夹名区分开）。比如说，用左括号表示递，用右括号表示归。从节点 $\texttt{x}$ 向下递到节点 $\texttt{y}$，再归回 $\texttt{x}$，就可以表示为 $\texttt{x(y)}$。如果 $\texttt{x}$ 有两个儿子 $\texttt{y}$ 和 $\texttt{z}$（并且这两个儿子都是叶子），那么子树 $\texttt{x}$ 就可以表示为 $\texttt{x(y)(z)}$。

一般地，定义如下括号表达式：

对于叶子节点，设其文件夹名为 $S$，则其括号表达式就是 $S$。
对于任意子树 $x$，设 $x$ 的儿子为 $y_1,y_2,\ldots,y_k$，则子树 $x$ 的括号表达式为
$$
x 的文件夹名 + (子树 y_1 的表达式) + (子树 y_2 的表达式) + \cdots + (子树 y_k 的表达式)
$$

{:width=300}

看示例 4，子树 $\texttt{x}$ 的括号表达式为 $\texttt{x(y)}$，子树 $\texttt{a}$ 的括号表达式为 $\texttt{a(x(y))(z)}$，子树 $\texttt{b}$ 的括号表达式为 $\texttt{b(x(y))(z)}$。

根据题意，我们不关心子树根节点的文件夹名，在去掉子树根节点后，子树 $\texttt{a}$ 和子树 $\texttt{b}$ 的括号表达式都是 $\texttt{(x(y))(z)}$，所以这两个文件夹「包含非空且相同的子文件夹集合，并具有相同的子文件夹结构」。

括号表达式既包含了文件夹名，又通过括号的嵌套关系表达了父子关系，因此可用于判断两个文件夹是否为相同文件夹。

你可能会问：如果子树 $\texttt{b}$ 的两棵子树是 $\texttt{z}$ 在左，$\texttt{x(y)}$ 在右呢？得到的表达式为 $\texttt{(z)(x(y))}$，这样没法判断两棵子树相同呀？

解决办法：在构造括号表达式时，先把子树 $y_1,y_2,\ldots,y_k$ 的表达式按照字典序排序，再把表达式依次拼接，就可以避免出现上述情况了。

代码实现时，用字典树表示这个文件系统，节点保存文件夹名称。注意这与一般的字典树不同，不是二十六叉树那种用单个字母对应节点，而是用一整个字符串（文件夹名）对应节点。这棵字典树一个节点（文件夹）最多能有 $20000$ 个儿子（子文件夹）。

用 $\textit{paths}$ 构建完字典树后，DFS 这棵树，按照前文的规则生成括号表达式 $\textit{subTreeExpr}$：

如果首次遇到 $\textit{subTreeExpr}$，那么把 $\textit{subTreeExpr}$ 及其对应的子树根节点保存到哈希表中。
否则我们找到了重复的文件夹，把哈希表中 $\textit{subTreeExpr}$ 对应的节点，以及当前节点，都标记为待删除。

最后，再次 DFS（回溯）这棵字典树，仅访问未被删除的节点，同时用一个列表 $\textit{path}$ 记录路径上的文件夹名。每次递归到一个节点，就把 $\textit{path}$ 的一个拷贝加到答案中。做法类似 257. 二叉树的所有路径。

class TrieNode:
    __slots__ = 'son', 'name', 'deleted'

    def __init__(self):
        self.son = {}
        self.name = ''  # 文件夹名称
        self.deleted = False  # 删除标记


class Solution:
    def deleteDuplicateFolder(self, paths: List[List[str]]) -> List[List[str]]:
        root = TrieNode()
        for path in paths:
            # 把 path 插到字典树中，见 208. 实现 Trie
            cur = root
            for s in path:
                if s not in cur.son:
                    cur.son[s] = TrieNode()
                cur = cur.son[s]
                cur.name = s

        expr_to_node = {}  # 子树括号表达式 -> 子树根节点

        def gen_expr(node: TrieNode) -> str:
            if not node.son:  # 叶子
                return node.name  # 表达式就是文件夹名

            # 每个子树的表达式外面套一层括号
            expr = sorted('(' + gen_expr(son) + ')' for son in node.son.values())
            sub_tree_expr = ''.join(expr)  # 按字典序拼接所有子树的表达式
            if sub_tree_expr in expr_to_node:  # 哈希表中有 sub_tree_expr，说明有重复的文件夹
                expr_to_node[sub_tree_expr].deleted = True  # 哈希表中记录的节点标记为删除
                node.deleted = True  # 当前节点标记为删除
            else:
                expr_to_node[sub_tree_expr] = node

            return node.name + sub_tree_expr

        for son in root.son.values():
            gen_expr(son)

        ans = []
        path = []

        # 在字典树上回溯，仅访问未被删除的节点，并将路径记录到答案中
        # 类似 257. 二叉树的所有路径
        def dfs(node: TrieNode) -> None:
            if node.deleted:
                return
            path.append(node.name)
            ans.append(path.copy())  # path[:]
            for child in node.son.values():
                dfs(child)
            path.pop()  # 恢复现场

        for son in root.son.values():
            dfs(son)

        return ans

class Solution {
    private static class TrieNode {
        Map<String, TrieNode> son = new HashMap<>();
        String name; // 文件夹名称
        boolean deleted = false; // 删除标记
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        TrieNode root = new TrieNode();
        for (List<String> path : paths) {
            // 把 path 插到字典树中，见 208. 实现 Trie
            TrieNode cur = root;
            for (String s : path) {
                if (!cur.son.containsKey(s)) {
                    cur.son.put(s, new TrieNode());
                }
                cur = cur.son.get(s);
                cur.name = s;
            }
        }

        Map<String, TrieNode> exprToNode = new HashMap<>(); // 子树括号表达式 -> 子树根节点
        for (TrieNode son : root.son.values()) {
            genExpr(son, exprToNode);
        }

        List<List<String>> ans = new ArrayList<>();
        List<String> path = new ArrayList<>();
        for (TrieNode son : root.son.values()) {
            dfs(son, path, ans);
        }
        return ans;
    }

    private String genExpr(TrieNode node, Map<String, TrieNode> exprToNode) {
        if (node.son.isEmpty()) { // 叶子
            return node.name; // 表达式就是文件夹名
        }

        List<String> expr = new ArrayList<>();
        for (TrieNode son : node.son.values()) {
            // 每个子树的表达式外面套一层括号
            expr.add("(" + genExpr(son, exprToNode) + ")");
        }
        Collections.sort(expr);

        String subTreeExpr = String.join("", expr); // 按字典序拼接所有子树的表达式
        TrieNode n = exprToNode.get(subTreeExpr);
        if (n != null) { // 哈希表中有 subTreeExpr，说明有重复的文件夹
            n.deleted = true; // 哈希表中记录的节点标记为删除
            node.deleted = true; // 当前节点标记为删除
        } else {
            exprToNode.put(subTreeExpr, node);
        }

        return node.name + subTreeExpr;
    }

    // 在字典树上回溯，仅访问未被删除的节点，并将路径记录到答案中
    // 类似 257. 二叉树的所有路径
    private void dfs(TrieNode node, List<String> path, List<List<String>> ans) {
        if (node.deleted) {
            return;
        }
        path.add(node.name);
        ans.add(new ArrayList<>(path)); // 记录路径
        for (TrieNode son : node.son.values()) {
            dfs(son, path, ans);
        }
        path.removeLast(); // 恢复现场
    }
}

struct TrieNode {
    unordered_map<string, TrieNode*> son;
    string name; // 文件夹名称
    bool deleted = false; // 删除标记
};

class Solution {
public:
    vector<vector<string>> deleteDuplicateFolder(vector<vector<string>>& paths) {
        TrieNode* root = new TrieNode();
        for (auto& path : paths) {
            // 把 path 插到字典树中，见 208. 实现 Trie
            TrieNode* cur = root;
            for (auto& s : path) {
                if (!cur->son.contains(s)) {
                    cur->son[s] = new TrieNode();
                }
                cur = cur->son[s];
                cur->name = s;
            }
        }

        unordered_map<string, TrieNode*> expr_to_node; // 子树括号表达式 -> 子树根节点

        auto gen_expr = [&](this auto&& gen_expr, TrieNode* node) -> string {
            if (node->son.empty()) { // 叶子
                return node->name; // 表达式就是文件夹名
            }

            vector<string> expr;
            for (auto& [_, son] : node->son) {
                // 每个子树的表达式外面套一层括号
                expr.emplace_back("(" + gen_expr(son) + ")");
            }
            ranges::sort(expr);

            string sub_tree_expr;
            for (auto& e : expr) {
                sub_tree_expr += e; // 按字典序拼接所有子树的表达式
            }

            if (expr_to_node.contains(sub_tree_expr)) { // 哈希表中有 sub_tree_expr，说明有重复的文件夹
                expr_to_node[sub_tree_expr]->deleted = true; // 哈希表中记录的节点标记为删除
                node->deleted = true; // 当前节点标记为删除
            } else {
                expr_to_node[sub_tree_expr] = node;
            }

            return node->name + sub_tree_expr;
        };

        for (auto& [_, son] : root->son) {
            gen_expr(son);
        }

        vector<vector<string>> ans;
        vector<string> path;

        // 在字典树上回溯，仅访问未被删除的节点，并将路径记录到答案中
        // 类似 257. 二叉树的所有路径
        auto dfs = [&](this auto&& dfs, TrieNode* node) -> void {
            if (node->deleted) {
                return;
            }
            path.push_back(node->name);
            ans.push_back(path);
            for (auto& [_, son] : node->son) {
                dfs(son);
            }
            path.pop_back(); // 恢复现场
        };

        for (auto& [_, son] : root->son) {
            dfs(son);
        }

        return ans;
    }
};

type trieNode struct {
son     map[string]*trieNode
name    string // 文件夹名称
deleted bool   // 删除标记
}

func deleteDuplicateFolder(paths [][]string) (ans [][]string) {
root := &trieNode{}
for _, path := range paths {
// 把 path 插到字典树中，见 208. 实现 Trie
cur := root
for _, s := range path {
if cur.son == nil {
cur.son = map[string]*trieNode{}
}
if cur.son[s] == nil {
cur.son[s] = &trieNode{}
}
cur = cur.son[s]
cur.name = s
}
}

exprToNode := map[string]*trieNode{} // 子树括号表达式 -> 子树根节点
var genExpr func(*trieNode) string
genExpr = func(node *trieNode) string {
if node.son == nil { // 叶子
return node.name // 表达式就是文件夹名
}

expr := make([]string, 0, len(node.son)) // 预分配空间
for _, son := range node.son {
// 每个子树的表达式外面套一层括号
expr = append(expr, "("+genExpr(son)+")")
}
slices.Sort(expr)

subTreeExpr := strings.Join(expr, "") // 按字典序拼接所有子树的表达式
n := exprToNode[subTreeExpr]
if n != nil { // 哈希表中有 subTreeExpr，说明有重复的文件夹
n.deleted = true    // 哈希表中记录的节点标记为删除
node.deleted = true // 当前节点标记为删除
} else {
exprToNode[subTreeExpr] = node
}

return node.name + subTreeExpr
}
for _, son := range root.son {
genExpr(son)
}

// 在字典树上回溯，仅访问未被删除的节点，并将路径记录到答案中
// 类似 257. 二叉树的所有路径
path := []string{}
var dfs func(*trieNode)
dfs = func(node *trieNode) {
if node.deleted {
return
}
path = append(path, node.name)
ans = append(ans, slices.Clone(path))
for _, son := range node.son {
dfs(son)
}
path = path[:len(path)-1] // 恢复现场
}
for _, son := range root.son {
dfs(son)
}
return
}

复杂度分析

时间复杂度：$\mathcal{O}(\ell\cdot m\log m)$，其中 $m$ 是字符串的总个数，$\ell\le 10$ 是单个字符串的长度，题目保证 $\ell\cdot m\le 2\cdot 10^5$。瓶颈在排序上。字符串拼接的复杂度是 $\mathcal{O}(\ell\cdot m)$。
- 排序：最坏情况下一个节点有 $\mathcal{O}(m)$ 个儿子，我们会对 $\mathcal{O}(m)$ 个字符串排序。会发生 $\mathcal{O}(m\log m)$ 次字符串比较，每次 $\mathcal{O}(\ell)$ 时间，所以排序的时间复杂度为 $\mathcal{O}(\ell\cdot m\log m)$。
- 字符串拼接：可能读者会认为当树退化成链时，代码会跑到 $\mathcal{O}((\ell\cdot m)^2)$ 时间，但这是不可能的。注意题目的这句话：「对于不在根层级的任意文件夹，其父文件夹也会包含在输入中。」这意味着如果一棵树的高度是 $d$，那么至少要 $1+2+3+\cdots+d = \mathcal{O}(d^2)$ 个字符串才能生成这棵树（可以参考示例 2），所以树的高度只有 $\mathcal{O}(\sqrt m)$。相应地，代码中的 node.name + subTreeExpr 会对长为 $\mathcal{O}(\ell\sqrt m)$ 的字符串复制拼接 $\mathcal{O}(\sqrt m)$ 次，所以时间复杂度为 $\mathcal{O}(\ell(\sqrt m)^2) = \mathcal{O}(\ell\cdot m)$。
空间复杂度：$\mathcal{O}(\ell\cdot m)$。

专题训练

297. 二叉树的序列化与反序列化
下面数据结构题单的「六、字典树（trie）」。

分类题单

如何科学刷题？

我的题解精选（已分类）

欢迎关注 B站@灵茶山艾府

排序（Python/Java/C++/C/Go/JS/Rust）

LeetCode 每日一题题解

endlesscheng

2025年7月19日 08:58

如果 $B$ 是 $A$ 的子文件夹，那么：

$A + \texttt{/}$ 是 $B$ 的前缀。

比如 $\texttt{/a/b}$ 是 $\texttt{/a}$ 的子文件夹，其中 $\texttt{/a/}$ 是 $\texttt{/a/b}$ 的前缀。（当然，$\texttt{/a}$ 也是 $\texttt{/a/b}$ 的前缀。）

暴力的想法是，枚举路径 $B$，枚举其他路径 $A$，判断是否满足上述要求，这需要 $\mathcal{O}(\ell\cdot n^2)$ 时间，其中 $\ell\le 100$ 为单个字符串的长度。这太慢了。

可以不枚举 $A$ 吗？

如果 $\textit{folder}$ 列表是有序的（按照字典序从小到大排序），那么当 $B$ 是 $A$ 的子文件夹时，$A$ 是 $B$ 的前缀，所以 $A$ 一定排在 $B$ 的左边。

比如 $\textit{folder} = [\texttt{/a},\texttt{/a/b},\texttt{/c},\texttt{/d}]$：

$\texttt{/a}$ 不可能是任何文件夹 $A$ 的子文件夹（否则 $A$ 会排在 $\texttt{/a}$ 前面）。
$\texttt{/a/b}$ 是 $\texttt{/a}$ 的子文件夹，删除。
$\texttt{/c}$ 不是 $\texttt{/a}$ 的子文件夹。
$\texttt{/d}$ 不是 $\texttt{/c}$ 的子文件夹。注意，不需要判断 $\texttt{/d}$ 是否为 $\texttt{/a}$ 的子文件夹。因为一定不是，为什么？如果 $\texttt{/d}$ 是 $\texttt{/a}$ 的子文件夹，由于列表是有序的，$\texttt{/a}$ 的下一个字符串到 $\texttt{/d}$ 之间的字符串都是以 $\texttt{/a/}$ 开头的，所以上一个文件夹 $\texttt{/c}$ 也是以 $\texttt{/a/}$ 开头的，一定是 $\texttt{/a}$ 的子文件夹，一定会被删除，但这与实际矛盾。

因此，排序后，只需比较当前路径与上一个未被删除的路径。

代码实现时，为避免 $A + \texttt{/}$ 生成新字符串花费额外时间和空间，判断 $A + \texttt{/}$ 是否为 $B$ 的前缀，可以分解为：

判断 $A$ 是否为 $B$ 的前缀。
判断 $B[m]$ 是否为 $\texttt{/}$，其中 $m$ 是 $A$ 的长度。注意题目保证所有字符串互不相同，所以如果 $A$ 是 $B$ 的前缀，那么 $B$ 的长度一定大于 $A$ 的长度。所以 $B[m]$ 不会下标越界。

###py

class Solution:
    def removeSubfolders(self, folder: List[str]) -> List[str]:
        folder.sort()
        ans = [folder[0]]
        for s in folder[1:]:  # 切片可以用 for i in range(1, n) 代替，做到 O(1) 空间
            last = ans[-1]
            if not s.startswith(last) or s[len(last)] != '/':
                ans.append(s)
        return ans

###java

class Solution {
    public List<String> removeSubfolders(String[] folder) {
        Arrays.sort(folder);
        List<String> ans = new ArrayList<>();
        ans.add(folder[0]);
        for (int i = 1; i < folder.length; i++) {
            String s = folder[i];
            String last = ans.getLast();
            if (!s.startsWith(last) || s.charAt(last.length()) != '/') {
                ans.add(s);
            }
        }
        return ans;
    }
}

###cpp

class Solution {
public:
    vector<string> removeSubfolders(vector<string>& folder) {
        ranges::sort(folder);
        vector<string> ans = {folder[0]};
        for (int i = 1; i < folder.size(); i++) {
            string& s = folder[i];
            string& last = ans.back();
            if (!s.starts_with(last) || s[last.size()] != '/') {
                ans.emplace_back(s);
            }
        }
        return ans;
    }
};

###c

int cmp(const void* a, const void* b) {
    return strcmp(*(char**)a, *(char**)b);
}

char** removeSubfolders(char** folder, int folderSize, int* returnSize) {
    qsort(folder, folderSize, sizeof(char*), cmp);

    *returnSize = 1;
    for (int i = 1; i < folderSize; i++) {
        char* s = folder[i];
        char* last = folder[*returnSize - 1];
        int len = strlen(last);
        if (strncmp(last, s, len) != 0 || s[len] != '/') {
            folder[(*returnSize)++] = s; // 原地保存
        }
    }

    return folder;
}

###go

func removeSubfolders(folder []string) []string {
    slices.Sort(folder)
    ans := folder[:1]
    for _, s := range folder[1:] {
        last := ans[len(ans)-1]
        if !strings.HasPrefix(s, last) || s[len(last)] != '/' {
            ans = append(ans, s)
        }
    }
    return ans
}

###js

var removeSubfolders = function(folder) {
    folder.sort();
    const ans = [folder[0]];
    for (let i = 1; i < folder.length; i++) {
        const s = folder[i];
        const last = ans[ans.length - 1];
        if (!s.startsWith(last) || s[last.length] !== '/') {
            ans.push(s);
        }
    }
    return ans;
};

###rust

impl Solution {
    pub fn remove_subfolders(mut folder: Vec<String>) -> Vec<String> {
        folder.sort_unstable();
        let mut ans = vec![folder[0].clone()];
        for s in folder.into_iter().skip(1) {
            let last = ans.last().unwrap();
            if !s.starts_with(last) || s.as_bytes()[last.len()] != b'/' {
                ans.push(s);
            }
        }
        ans
    }
}

复杂度分析

时间复杂度：$\mathcal{O}(\ell\cdot n\log n)$ 时间，$n$ 是 $\textit{folder}$ 的长度，$\ell\le 100$ 为单个字符串的长度。排序有 $\mathcal{O}(n\log n)$ 次比较，每次 $\mathcal{O}(\ell)$ 时间。
空间复杂度：$\mathcal{O}(1)$。返回值不计入。忽略排序的栈开销。

分类题单

如何科学刷题？

我的题解精选（已分类）

欢迎关注 B站@灵茶山艾府

每日一题-删除子文件夹🟡

LeetCode 每日一题题解

2025年7月19日 00:00

你是一位系统管理员，手里有一份文件夹列表 folder，你的任务是要删除该列表中的所有 子文件夹，并以 任意顺序 返回剩下的文件夹。

如果文件夹 folder[i] 位于另一个文件夹 folder[j] 下，那么 folder[i] 就是 folder[j] 的 子文件夹 。folder[j] 的子文件夹必须以 folder[j] 开头，后跟一个 "/"。例如，"/a/b" 是 "/a" 的一个子文件夹，但 "/b" 不是 "/a/b/c" 的一个子文件夹。

文件夹的「路径」是由一个或多个按以下格式串联形成的字符串：'/' 后跟一个或者多个小写英文字母。

例如，"/leetcode" 和 "/leetcode/problems" 都是有效的路径，而空字符串和 "/" 不是。

示例 1：

输入：folder = ["/a","/a/b","/c/d","/c/d/e","/c/f"]
输出：["/a","/c/d","/c/f"]
解释："/a/b" 是 "/a" 的子文件夹，而 "/c/d/e" 是 "/c/d" 的子文件夹。

示例 2：

输入：folder = ["/a","/a/b/c","/a/b/d"]
输出：["/a"]
解释：文件夹 "/a/b/c" 和 "/a/b/d" 都会被删除，因为它们都是 "/a" 的子文件夹。

示例 3：

输入: folder = ["/a/b/c","/a/b/ca","/a/b/d"]
输出: ["/a/b/c","/a/b/ca","/a/b/d"]

提示：

1 <= folder.length <= 4 * 10⁴
2 <= folder[i].length <= 100
folder[i] 只包含小写字母和 '/'
folder[i] 总是以字符 '/' 起始
folder 每个元素都是唯一的

【爪哇缪斯】图解LeetCode

LeetCode 每日一题题解

muse-77

2023年2月8日 13:03

感慨万千，伟大的king！！！

解题思路

根据题目描述，我们要删除所有的子目录，然后将文件夹列表 folder中剩余的目录输出即可。那么假设我们有一个目录/a，那么所有以/a开头的路径都是它的子目录，如下所示：

【主目录】/a
【子目录】/a/a，/a/b，/a/b/c，/a/b/d/e/f/g，……

那么针对如上规则，我们首先需要做的就是对无序的文件夹列表 folder执行排序操作，当排序完毕后，相关的主目录和子目录就会被排列在一起。遍历排序后的文件列表folder，只将主目录保存到result中即可。但是，再描述具体操作之前，我们先来说明一下流程介绍所涉及的变量：

【folder[i]】表示遍历的每个folder元素；
【result】用于存储最终结果集合；
【result(last) 】表示result中存储的最后一个元素；

具体处理逻辑，如下所示：

【case1】如果result为空，则直接将folder[0]保存到result中；
【case2】如果result(last)满足folder[i]的前缀，则说明folder[i]属于子目录，i执行加1，遍历下一个目录；
【case3】如果result(last)不满足folder[i]的前缀，则说明folder[i]属于主目录，将folder[i]保存到result中，然后i执行加1，遍历下一个目录；
【结果】当遍历完所有folder列表后，result中存储的就是所有主目录。

以上就是这道题的解题思路，下面我们以输入folder是["/c/d","/a/ab","/a","/c/d/e","/a/b","/c/f/d","/c/f"]为例，看一下具体的处理过程：

代码实现

###java

public class Solution {
    public List<String> removeSubfolders(String[] folder) {
        Arrays.sort(folder);
        List<String> result = new ArrayList<>();
        result.add(folder[0]);
        for (int i = 1; i < folder.length; i++) {
            StringBuilder sb = new StringBuilder(result.get(result.size() - 1)).append("/");
            if (!folder[i].startsWith(sb.toString())) result.add(folder[i]);
        }
        return result;
    }
}

今天的文章内容就这些了：

写作不易，笔者几个小时甚至数天完成的一篇文章，只愿换来您几秒钟的点赞 & 分享。