LeetCode 49: Group Anagrams

Question

I'm exceeding the time limit for a 10,000 word test case provided on LeetCode:

Given an array of strings, group anagrams together.

For example, given: ["eat", "tea", "tan", "ate", "nat", "bat"],
Return:
[
  ["ate", "eat","tea"],
  ["nat","tan"],
  ["bat"]
]
Note: All inputs will be in lower-case.

Any suggestions on how I could speed this up a little bit?

public HashMap<String, List<Integer>> createMap(String[] strs)
{
  HashMap<String, List<Integer>> map = new HashMap<String, List<Integer>>();

    for(String s: strs)
    {
     char[] charArray = s.toCharArray();
     Arrays.sort(charArray);
     String sortedString = new String(charArray);

     if(map.containsKey(sortedString))
     {
        List<Integer> pos = map.get(sortedString);
        pos.add(Arrays.asList(strs).indexOf(s));
        map.put(sortedString, pos);
     }

    else
    {
        List<Integer> pos = new ArrayList<Integer>();
        pos.add(Arrays.asList(strs).indexOf(s));
        map.put(sortedString, pos);
    }
  }

  return map;
}

public List<List<String>> groupAnagrams(String[] strs) {

  HashMap hm = createMap(strs);

  List<List<String>> anagrams = new ArrayList<List<String>>();

  for(Object k: hm.keySet())
  {
    String key = k.toString();
    ArrayList pos = (ArrayList)hm.get(key);

    List<String> a = new ArrayList<String>();
    for(Object p : pos)
    {
      int idx = Integer.parseInt(p.toString());
      a.add(strs[idx]);
    }
    anagrams.add(a);
  }
  return anagrams;
}

Please include the description of the programming-challenge into the question so the reviewers can see what you're trying to do and why. — Mast, Commented Feb 4, 2017 at 8:24

janos · Accepted Answer · 2017-02-04 13:28:44Z

Why so slow

It's important to consider the time complexity of all the operations in your program.

Take a closer looks at this part I extracted from your code:

for(String s: strs)
{
 // ...

    pos.add(Arrays.asList(strs).indexOf(s));

For each s, converting a String[] to a List<String>, in order to use the indexOf method to find the index of s? That doesn't sound so good. How about making that a counting loop?

for (int i = 0; i < strs.length; i++) {
    // ...

    pos.add(i);

But why use indexes at all? Why not make this method return a Map<String, List<String>> instead?

for (String s : strs) {
    // ...

    pos.add(s);

If you do that, then the implementation of the other method becomes simply:

public List<List<String>> groupAnagrams(String[] strs) {
    return new ArrayList<>(createMap(strs).values());
}

Good practices

The code violates many good practices and common conventions:

Never use raw types, for example use HashMap<String, List<String>> instead of just HashMap without type parameters
Prefer to declare variables with interface types, for example Map<String, List<String>> instead of HashMap<String, List<String>>
Use the diamond operator <> when possible, for example new HashMap<> instead of new HashMap<String, List<String>>
Use consistent indentation, and please place braces they way I did in my examples above

Community · Accepted Answer · 2020-06-10 13:24:26Z

Jianmin commented on my answer to a similar question here: Grouping anagrams and it got me thinking about the problem, and also reading this solution.

Janos has some good comments, but I wanted to point out three additional things:

computeIfAbsent

Java Map instances now have the computeIfAbsent function. Code like:

 if(map.containsKey(sortedString))
 {
    List<Integer> pos = map.get(sortedString);
    pos.add(Arrays.asList(strs).indexOf(s));
    map.put(sortedString, pos);
 }

else
{
    List<Integer> pos = new ArrayList<Integer>();
    pos.add(Arrays.asList(strs).indexOf(s));
    map.put(sortedString, pos);
}

should instead be:

List<Integer> pos = map.computeIfAbsent(sortedString, k -> new ArrayList<>());
pos.add(Arrays.asList(strs).indexOf(s));

Of course, you should be using String values and not Integers, as Janos indicated, but the idea is to only have one call in to the Map. See the docs here: computeIfAbsent

function extraction

Your code to compute the "key" from the word, should be extracted in to a separate function. There should be a function:

public static final String keyOf(String word) {
    char[] chars = word.toCharArray();
    Arrays.sort(chars);
    return new String(chars);
}

It clearly is separate, isolated logic, and should be maintained as such.

Streams

Streams have been around for a while, and you should become familiar with them.

I tool the liberty of "solving" the problem using streams, and the code is relatively neat, and concise:

public static final List<List<String>> groupAnagrams(Stream<String> stream) {
    Map<String, List<String>> mapped = stream.collect(Collectors.groupingBy(w -> keyOf(w)));
    return new ArrayList<>(mapped.values());
}

I have put that in to an ideone page here: https://ideone.com/qLiA8c (that ideone code also has code to format it "pretty" ... like the examples above).

These are all very good and helpful points. I'm not so familiar with Streams and Generics functions, but I will incorporate these changes. Thank you very much! — boltthrower, Commented Feb 5, 2017 at 22:21

Jianmin Chen · Accepted Answer · 2017-02-04 22:53:30Z

Make Java code more readable, ready to review

I strongly agree with code review conducted by Janos, code review is not just to share you a workable solution. You also have to learn ways to make code more readable, practice better way using Java language in your case.

Use O(N) solution to generate key, avoid \$O(NLogN)\$ Sorting algorithm

The idea to avoid timeout issue is to generate a key for each anagram group using \$O(N)\$ time complexity, instead of naive one by sorting the string. To take advantage of alphabetic number only has constant of size \$26\$, go through the string once, one char a time, to record the number of occurrence, like a counting sort. Here is the C# code, pass all test cases on leetcode online judge.

Here is the anagram hashed key algorithm in C#, most of important decision is to choose a more efficient sort - counting sort instead of comparison based sorting:

    /* O(n) solution 
     * abc 
     * hashed to key:
     * 01-11-11
     * 0 stands for a, 1 is count of a
     * further simplify the key:
     * 1-1-1
     * first 1 is count of a, 
     * second 1 is count of b, 
     * third 1 is count of c
     * 
     */
    public static string ConvertHashKeys(string input)
    {
        if (input == null || input.Length == 0)
        {
            return string.Empty;
        }

        int[] countAlphabetic = new int[26];

        foreach (char c in input)
        {
            countAlphabetic[c - 'a']++;
        }

        return String.Join("-", countAlphabetic);
    }

Stack Exchange Network

LeetCode 49: Group Anagrams

3 Answers 3

Why so slow

Good practices

computeIfAbsent

function extraction

Streams

Not the answer you're looking for? Browse other questions tagged
java
programming-challenge
time-limit-exceeded
hash-map
or ask your own question.

Linked

Hot Network Questions

LeetCode 49: Group Anagrams

3 Answers 3

Why so slow

Good practices

computeIfAbsent

function extraction

Streams

Not the answer you're looking for? Browse other questions tagged javaprogramming-challengetime-limit-exceededhash-map or ask your own question.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
java
programming-challenge
time-limit-exceeded
hash-map
or ask your own question.