面试刷题-Trie树-单词搜索
题目描述
给定一个 m x n 二维字符网格 board 和一个单词(字符串)列表 words,找出所有同时在二维网格和字典中出现的单词。
单词必须按照字母顺序,通过 相邻的单元格 内的字母构成,其中“相邻”单元格是那些水平相邻或垂直相邻的单元格。同一个单元格内的字母在一个单词中不允许被重复使用。
(从当前位置开始,向前、向后、向上、向下搜索,查看搜索路径组成的单词是否在words中出现过。)
示例 1:

输入:board = [[“o”,”a”,”a”,”n”],[“e”,”t”,”a”,”e”],[“i”,”h”,”k”,”r”],[“i”,”f”,”l”,”v”]], words = [“oath”,”pea”,”eat”,”rain”]
输出:[“eat”,”oath”]
基于前缀树(Trie树)的回溯
解决这个问题的关键在于我们如何从字典中找到单词的匹配项,最直观的思路就是将字典中的所有单词都放入到Map或者Dict中,然后一边遍历网格,一边查询搜索过的字符组成的单词是否在Map/Dict中。显然这种效率比较低。
改进的思路:1、采用回溯的算法,避免每次都从头开始搜索;2、如果我们知道给定前缀在字典中不存在任何单词匹配,那么就不需要在这个方向上继续搜索,Trie树可以很方便的实现前缀的查询。

代码实现如下:
1、TrieNode结构
我们在TrieNode中记录了以该Node结尾的Word,这样就避免了在网格遍历过程中还要记录走过的路径。只要搜索到该TrieNode就可以直接取到搜索路径的所有字符组成的Word。
class TrieNode { public: bool is_end{false}; std::unordered_map<char, std::shared_ptr<TrieNode>> childs; std::string word; };
2、构建Trie树
将所有的words构建Trie树,后面用这个Trie树查询前缀组成的字符串是否在字典中。
auto root = std::make_shared<TrieNode>(); for (const auto& word : words) { auto node = root; for (size_t i = 0; i < word.size(); i++) { const auto& ch = word.at(i); if (node->childs.count(ch) <= 0) { node->childs[ch] = std::make_shared<TrieNode>(); } node = node->childs.at(ch); } node->word = word; node->is_end = true; }
3、回溯遍历搜索
代码中有几个技巧:
1)当首次查询到word之后,清空word内容。这样就解决了结果集中出现重复项的问题,避免了要先搜索匹配结果再去重的过程。
if (node->childs.at(ch)->is_end == true) { if (!node->childs[ch]->word.empty()) { rets.emplace_back(node->childs[ch]->word); node->childs[ch]->word.clear(); } }
2)在搜索过程中引入回溯机制
在搜索过程中,Trie树遍历与回溯过程一起进行,提高了搜索效率。
3)回溯过程中对Trie树进行剪枝

对于 Trie 中的叶节点,一旦遍历它(即找到匹配的单词),就不需要再遍历它了,因此我们可以把它从树上剪下来。逐渐地,这些非叶节点可以成为叶节点以后。在极端情况下,一旦我们找到字典中所有单词的匹配项,Trie 就会变成空的,极大的提升在线检索的效率。
剪枝的代码如下:
if (node->childs[ch]->childs.size() <= 0) { node->childs.erase(ch); }
剪枝优化最典型的是下面这个测试用例,剪枝之前耗时1516ms
,剪枝之后耗时4ms。
[["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"],["a","a","a","a","a","a","a","a","a","a","a","a"]] ["lllllll","fffffff","ssss","s","rr","xxxx","ttt","eee","ppppppp","iiiiiiiii","xxxxxxxxxx","pppppp","xxxxxx","yy","jj","ccc","zzz","ffffffff","r","mmmmmmmmm","tttttttt","mm","ttttt","qqqqqqqqqq","z","aaaaaaaa","nnnnnnnnn","v","g","ddddddd","eeeeeeeee","aaaaaaa","ee","n","kkkkkkkkk","ff","qq","vvvvv","kkkk","e","nnn","ooo","kkkkk","o","ooooooo","jjj","lll","ssssssss","mmmm","qqqqq","gggggg","rrrrrrrrrr","iiii","bbbbbbbbb","aaaaaa","hhhh","qqq","zzzzzzzzz","xxxxxxxxx","ww","iiiiiii","pp","vvvvvvvvvv","eeeee","nnnnnnn","nnnnnn","nn","nnnnnnnn","wwwwwwww","vvvvvvvv","fffffffff","aaa","p","ddd","ppppppppp","fffff","aaaaaaaaa","oooooooo","jjjj","xxx","zz","hhhhh","uuuuu","f","ddddddddd","zzzzzz","cccccc","kkkkkk","bbbbbbbb","hhhhhhhhhh","uuuuuuu","cccccccccc","jjjjj","gg","ppp","ccccccccc","rrrrrr","c","cccccccc","yyyyy","uuuu","jjjjjjjj","bb","hhh","l","u","yyyyyy","vvv","mmm","ffffff","eeeeeee","qqqqqqq","zzzzzzzzzz","ggg","zzzzzzz","dddddddddd","jjjjjjj","bbbbb","ttttttt","dddddddd","wwwwwww","vvvvvv","iii","ttttttttt","ggggggg","xx","oooooo","cc","rrrr","qqqq","sssssss","oooo","lllllllll","ii","tttttttttt","uuuuuu","kkkkkkkk","wwwwwwwwww","pppppppppp","uuuuuuuu","yyyyyyy","cccc","ggggg","ddddd","llllllllll","tttt","pppppppp","rrrrrrr","nnnn","x","yyy","iiiiiiiiii","iiiiii","llll","nnnnnnnnnn","aaaaaaaaaa","eeeeeeeeee","m","uuu","rrrrrrrr","h","b","vvvvvvv","ll","vv","mmmmmmm","zzzzz","uu","ccccccc","xxxxxxx","ss","eeeeeeee","llllllll","eeee","y","ppppp","qqqqqq","mmmmmm","gggg","yyyyyyyyy","jjjjjj","rrrrr","a","bbbb","ssssss","sss","ooooo","ffffffffff","kkk","xxxxxxxx","wwwwwwwww","w","iiiiiiii","ffff","dddddd","bbbbbb","uuuuuuuuu","kkkkkkk","gggggggggg","qqqqqqqq","vvvvvvvvv","bbbbbbbbbb","nnnnn","tt","wwww","iiiii","hhhhhhh","zzzzzzzz","ssssssssss","j","fff","bbbbbbb","aaaa","mmmmmmmmmm","jjjjjjjjjj","sssss","yyyyyyyy","hh","q","rrrrrrrrr","mmmmmmmm","wwwww","www","rrr","lllll","uuuuuuuuuu","oo","jjjjjjjjj","dddd","pppp","hhhhhhhhh","kk","gggggggg","xxxxx","vvvv","d","qqqqqqqqq","dd","ggggggggg","t","yyyy","bbb","yyyyyyyyyy","tttttt","ccccc","aa","eeeeee","llllll","kkkkkkkkkk","sssssssss","i","hhhhhh","oooooooooo","wwwwww","ooooooooo","zzzz","k","hhhhhhhh","aaaaa","mmmmm"]
4)避免循环搜索
在搜索过程中,搜索过的节点不能重复搜索。为了解决这个问题,需要给搜索过的节点打个标记。
打标记的方法很多,这里只说两种,一种是定义一个与网格大小一样的布尔类型数组,访问过的节点标记为True,未访问过的节点标记为False;另一种是把节点的值修改为标志变量,搜索完成之后再还原回去。
board[i][j] = '.'; .... board[i][j] = ch;
完整的代码
class Solution { private: class TrieNode { public: bool is_end{false}; std::unordered_map<char, std::shared_ptr<TrieNode>> childs; std::string word; }; void search(std::shared_ptr<TrieNode> node, std::vector<std::vector<char>>& board, const size_t i, const size_t j, std::vector<std::string>& rets) { const auto ch = board[i][j]; if (node->childs.count(ch) <= 0) { return; } // std::cout << "ch:" << ch << std::endl; if (node->childs.at(ch)->is_end == true) { // std::cout << "ch 1:" << ch << std::endl; if (!node->childs[ch]->word.empty()) { rets.emplace_back(node->childs[ch]->word); node->childs[ch]->word.clear(); } } board[i][j] = '.'; if (i > 0) { search(node->childs[ch], board, i - 1, j, rets); } if (j > 0) { search(node->childs[ch], board, i, j - 1, rets); } if (i + 1 < board.size()) { search(node->childs[ch], board, i + 1, j, rets); } if (j + 1 < board[0].size()) { search(node->childs[ch], board, i, j + 1, rets); } board[i][j] = ch; if (node->childs[ch]->childs.size() <= 0) { node->childs.erase(ch); } } public: vector<string> findWords(vector<vector<char>>& board, vector<string>& words) { std::vector<std::string> rets; if (words.size() <= 0) { return rets; } if (board.size() <= 0 || board[0].size() <= 0) { return rets; } auto root = std::make_shared<TrieNode>(); for (const auto& word : words) { auto node = root; for (size_t i = 0; i < word.size(); i++) { const auto& ch = word.at(i); if (node->childs.count(ch) <= 0) { node->childs[ch] = std::make_shared<TrieNode>(); } node = node->childs.at(ch); } node->word = word; node->is_end = true; } for (size_t i = 0; i < board.size(); i++) { for (size_t j = 0; j < board[0].size(); j++) { const auto& ch = board[i][j]; if (root->childs.count(ch) > 0) { search(root, board, i, j, rets); } } } return rets; } };
参考材料
https://leetcode-cn.com/problems/word-search-ii/
除非注明,否则均为[半杯茶的小酒杯]原创文章,转载必须以链接形式标明本文链接