3
$\begingroup$

Does anyone know how to improve this for speed performance? The code works fine, but DeleteDuplicates makes it really slow for large lists.

removeDuplicatesWithRules[mylist_] := Module[{rules, isDuplicateQ},
   
   rules = 
    Flatten /@ (Subsets[{{1 -> 2, 2 -> 1}, {3 -> 4, 4 -> 3}, {5 -> 6, 
         6 -> 5}, {7 -> 8, 8 -> 7}, {9 -> 10, 10 -> 9}}]);
   
   (*Function to check if two elements are considered duplicates \
based on rules*)
   isDuplicateQ[element1_, element2_] := 
    AnyTrue[rules, ((element1 /. #) /. 
         List :> Composition[Sort, List]) == (element2 /. 
         List :> Composition[Sort, List]) &];
   
   (*Function to remove duplicates from a list*)
   DeleteDuplicates[mylist, isDuplicateQ[#1, #2] &]
   
   ];

(*Example test list*)
mylist = {{1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6, {7, 8, 2}}, {5, 
    6, {8, 7, 1}}, {7, 8}, {9, 10}, {10, 9}, {11, 12}};

(*Applying the function*)
removeDuplicatesWithRules[mylist]

(* {{1, 2}, {3, 4}, {5, 6, {7, 8, 2}}, {7, 8}, {9, 10}, {11, 12}} *)
$\endgroup$
4
  • 1
    $\begingroup$ I don't really understand your rule of duplicates, but removeDuplicatesWithRules seems to be the main reason for slowing down. DeleteDuplicates[Sort /@ mylist] produces the same output, and runs much faster. Does that match your need? $\endgroup$
    – A. Kato
    Commented Jul 5 at 1:46
  • $\begingroup$ @A.Kato nope, that is not the same. Two elements are considered duplicate if #1/. ruleX= #2 where ruleX could be any rule from rules. If you use Sort only, these two elements {5, 6, {7, 8, 2}}, {5, 6, {8, 7, 1}} would not be duplicates. $\endgroup$
    – internet
    Commented Jul 5 at 1:48
  • $\begingroup$ I see. thank you. $\endgroup$
    – A. Kato
    Commented Jul 5 at 2:20
  • 1
    $\begingroup$ FYI: DeleteDuplicates[data, sameFunc] has much worse performance than DeleteDuplicates[data] or DeleteDuplicatesBy[data, hashFunc] in general. $\endgroup$
    – Michael E2
    Commented Jul 5 at 20:29

1 Answer 1

7
$\begingroup$

In checking duplicates, you are considering 2, 4, 6, 8, 10 are the same as 1, 3 ,5, 7, 9, respectively. Then, instead of testing all possible replacement, I recommend you to convert into standard forms:

mylist = {{1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6, {7, 8, 2}}, {5, 
    6, {8, 7, 1}}, {7, 8}, {9, 10}, {10, 9}, {11, 12}}; 

makestandard = i_?EvenQ :> (i - 1);
DeleteDuplicatesBy[mylist, (# /. makestandard) &]

(* {{1, 2}, {3, 4}, {5, 6, {7, 8, 2}}, {7, 8}, {9, 10}, {11, 12}} *)
$\endgroup$
3
  • 2
    $\begingroup$ (+1) Also: DeleteDuplicatesBy[list, # /. i_?EvenQ :> i - 1 &] $\endgroup$
    – eldo
    Commented Jul 5 at 6:18
  • $\begingroup$ That's a creative solution! $\endgroup$
    – internet
    Commented Jul 5 at 11:09
  • 2
    $\begingroup$ Not really unique enough to add as seperate answer, but you could also do DeleteDuplicatesBy[mylist, # + Mod[#, 2] &] $\endgroup$
    – ydd
    Commented Jul 5 at 14:16

Not the answer you're looking for? Browse other questions tagged or ask your own question.