I want to find the unique elements in a large R vector. What function should I use?

Best answer

You can use either the **unique()** or **duplicated()** function to find the unique values in any R vector.

The **duplicated()** function returns a logical vector: FALSE for the first occurrence of each element and TRUE for every later occurrence, i.e., the duplicates. To get the unique elements, keep the positions where the result is FALSE.

Here is an example using both functions.

> a <- c(1,0,11,2,1,5,0,1,2,3,4,5,6,7,1,2,3,4,5,3,1)

> unique(a)

[1] 1 0 11 2 5 3 4 6 7

> duplicated(a)

[1] FALSE FALSE FALSE FALSE TRUE FALSE TRUE TRUE TRUE FALSE FALSE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

> a[duplicated(a)==FALSE]

[1] 1 0 11 2 5 3 4 6 7
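
As a small follow-up sketch, the same filtering can be written with the negation operator, `!duplicated(a)`, which is equivalent to `duplicated(a)==FALSE`. The variable name `first_seen` is only an illustrative choice and not part of the original answer. The snippet also checks that both approaches agree and shows one thing the logical vector offers that `unique()` does not: the positions of the first occurrences.

```r
a <- c(1, 0, 11, 2, 1, 5, 0, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 3, 1)

# TRUE at the first occurrence of each value, FALSE at every repeat
first_seen <- !duplicated(a)

# Both approaches keep first occurrences in their original order
same_result <- identical(a[first_seen], unique(a))
print(same_result)

# Positions of the first occurrences in the original vector
first_positions <- which(first_seen)
print(first_positions)

# Number of distinct values
n_distinct <- length(unique(a))
print(n_distinct)
```

If only the values are needed, `unique(a)` is the more direct call; the `duplicated()` form is mainly useful when the positions or the logical mask itself are needed.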