Trie : Data Structure

Trie
This Trie has Three words, BUDDHA, SUN and SUPER.
 
Each node can have multiple child nodes, max children is decided by number of alphabets Trie has,  in this case 26 (English Alphabets).
 
node representation:
C++
struct trieNode
{
     struct trieNode *childrenArr[ALPHABET_SIZE];
     bool isEndOfWord;
}
 
Java
public classtrieNode
{
     trieNode[] childrenArr;
     Boolean isEndOfWord;
     trieNode()
     {
          childrenArr = new trieNode[26];
     }
     //methods ...
}
If child present at index i then that child contains (i+1)th alphabet eg. root node children array will have child at only at index 1 and 19 representing B and S respectively. All other index will be storing null.

Operations:

  1. String[] autoComplete(String str)
  2. Boolean contains(String str)
  3. void insert(String str)
  4. void delete(String str)

Boolean contains(String str)

This function is simple, start from root with first alphabet of string and keep traversing down the tree if you encounter a node which does not have child corresponding to the current char then complete word is not present, otherwise present.
eg:
Considering Trie represented in Figure.
contains("sun")
  root   if (null != childrenArr[indexOf('s')]) True
   |
   s     if (null != childrenArr[indexOf('u')]) True
   |
   u     if (null != childrenArr[indexOf('n')]) True
   |
   n     if (True == isEndOfWord) True

  return True
contains("song")
 root   if (null != childrenArr[indexOf('s')]) True
  |
  s     if (null != childrenArr[indexOf('o')]) False
return False  

void insert(String str)

This function add child nodes corresponding to the chars in string only if they are present already, if already present keep traversing and if not present add child node corresponding to chars and keep on adding.

eg: Considering Trie represented in Figure.
insert("buddy")
root              if (null != childrenArr[indexOf('b')]) True
/ \
b  s   at b node: if (null != childrenArr[indexOf('u')]) True
|  .
u  .              if (null != childrenArr[indexOf('d')]) True
|
d                 if (null != childrenArr[indexOf('d')]) True
|
d                 if (null != childrenArr[indexOf('y')]) False
|\
| \       y child is not present so add it.
h  y   ---added a y child at indexOf('y') with isEndOfWord True
|
a

insert("bud")
root              if (null != childrenArr[indexOf('b')]) True
/ \
b  s   at b node: if (null != childrenArr[indexOf('u')]) True
|  .
u  .              if (null != childrenArr[indexOf('d')]) True
|
d                 set isEndOfWord True

String[] autoComplete(Stringstr)

autoComplete(“bu”) output: [“bud”, “buddy”, “buddha”] autoComplete(“su”) output: [“sun”, “super”]

Its simple first reach node representing last char of prefix, and then start traversing
all children recursively in other words traverse all paths till leaf nodes. During traversal
isEndOfWord can be used to find all possible words.

void delete(String str)

                  root
                    / \
                    b  s
                    |  |
                    u  u
                    |  |\
     end of word ---d  n p
                    |    |
                    d    e
                    |\   |
                    h y  r
                    |
                    a

Lets understand expected behavior of delete().
contains("buddy") True contains("bud") True delete("bud") This will set isEndOfWord False in in b->u->(d).
contains("bud") False contains("buddy") True delete("buddy") This will remove y node and free memory of node.
contains("buddy") False contains("buddy") False contains("buddha") True delete("buddha") It will remove all nodes from b->u->d->d->h->a.
contains("buddha") False

Delete operation is tricky one and there are Three scenarios:

First Scenario: Deleting a word which is a prefix of other existing longer words.
eg: deletion of bud.
In this case we need to set isEndOfWord to Flase only.

Second Scenario: Deleting a word which shares a part of it with other words and one part is not shared by any other words.
eg: deletion of buddy, “y” is not shared with anyone so node “y” will be deleted.

Third Scenario: Deleting a word which does not share any char with other words in trie.
eg: we have deleted bud and buddy now buddha is not shared with any word.
delete(“buddha”) This will delete all nodes b to a.

 

Applications of Trie:

  1. Lexicographical sorting
  2. Auto Complete feature.
  3. Spell Check.
  4. Longest Prefix Matching
  5. Replacement for Binary Search Tree and Hash Lookup.

code:

Please visit github link given below for full implementation.
https://github.com/abhi1kush/coding/blob/master/Trie_Java/src/Trie.java
c++, c, python, javascript, go, scalla, clojure implementation coming soon.

Leave a comment