uniq Command in Linux: Remove and Count Duplicate Lines
uniq is a command-line utility that filters adjacent duplicate lines from sorted input and writes the result to standard output. It is most commonly used together with sort to remove or count duplicates across an entire file.
This guide explains how to use the uniq command with practical examples.
uniq Command Syntax
The syntax for the uniq command is as follows:
```shell
uniq [OPTIONS] [INPUT [OUTPUT]]
```

If no input file is specified, uniq reads from standard input. The optional OUTPUT argument redirects the result to a file instead of standard output.
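For example, with a file named fruits.txt (the sample file used throughout this guide), the OUTPUT argument writes the filtered result to a second file:

```shell
# Read from fruits.txt and write deduplicated lines to deduped.txt
# (filenames are illustrative)
uniq fruits.txt deduped.txt

# Equivalent form using shell redirection
uniq < fruits.txt > deduped.txt
```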
Removing Duplicate Lines
By default, uniq removes adjacent duplicate lines, keeping one copy of each. Because it only compares consecutive lines, the input must be sorted first — otherwise only back-to-back duplicates are removed.
Given a file fruits.txt with repeated entries:

```
apple
apple
banana
cherry
banana
cherry
cherry
```

Running uniq on the unsorted file removes only the back-to-back duplicates (apple apple and cherry cherry), but leaves the second banana and the second cherry intact:
```shell
uniq fruits.txt
```
```
apple
banana
cherry
banana
cherry
```
To remove all duplicates regardless of position, sort the file first:
```shell
sort fruits.txt | uniq
```
```
apple
banana
cherry
```
Counting Occurrences
To prefix each output line with the number of times it appears, use the -c (--count) option:
```shell
sort fruits.txt | uniq -c
```
```
      2 apple
      2 banana
      3 cherry
```
The count and the line are separated by whitespace. This is especially useful for finding the most frequent lines in a file. To rank them from most to least common, pipe the output back to sort:
```shell
sort fruits.txt | uniq -c | sort -rn
```
```
      3 cherry
      2 banana
      2 apple
```
Showing Only Duplicate Lines
To print only lines that appear more than once (one copy per group), use the -d (--repeated) option:
```shell
sort fruits.txt | uniq -d
```
```
apple
banana
cherry
```
To print every instance of a repeated line rather than just one, use -D:
```shell
sort fruits.txt | uniq -D
```
```
apple
apple
banana
banana
cherry
cherry
cherry
```
Showing Only Unique Lines
To print only lines that appear exactly once (no duplicates), use the -u (--unique) option:
```shell
sort fruits.txt | uniq -u
```

If every line appears more than once, the output is empty. In the example above, all three fruits are duplicated, so nothing is printed.
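To see -u produce output, add a line that occurs only once. A quick sketch using printf (sample data invented for illustration):

```shell
# "date" appears exactly once, so it is the only line -u prints
printf 'apple\napple\nbanana\nbanana\ndate\n' | uniq -u
```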
Ignoring Case
By default, uniq comparisons are case-sensitive, so Apple and apple are treated as different lines. To compare lines case-insensitively, use the -i (--ignore-case) option:
```shell
sort -f file.txt | uniq -i
```

Skipping Fields and Characters
uniq can be told to skip leading fields or characters before comparing lines.
To skip the first N whitespace-separated fields, use -f N (--skip-fields=N). This is useful when lines share a common prefix (such as a timestamp) that should not be part of the comparison:
```
2026-01-01 ERROR disk full
2026-01-02 ERROR disk full
2026-01-03 WARNING low memory
```
```shell
uniq -f 1 log.txt
```
```
2026-01-01 ERROR disk full
2026-01-03 WARNING low memory
```
The first field (the date) is skipped, so the two ERROR disk full lines are treated as duplicates.
To skip the first N characters instead of fields, use -s N (--skip-chars=N). To limit the comparison to the first N characters per line, use -w N (--check-chars=N).
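A quick sketch of both options, using invented sample data piped in with printf:

```shell
# -s 4 skips the 4-character ID prefix ("001 "), so the two
# "error" lines compare equal and collapse into one
printf '001 error\n002 error\n003 warning\n' | uniq -s 4

# -w 4 compares only the first 4 characters (the year here),
# so all lines sharing a year collapse into one
printf '2026-01-01\n2026-01-02\n2027-06-15\n' | uniq -w 4
```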
Combining uniq with Other Commands
uniq works well in pipelines with grep, cut, sort, and wc.
To count the number of unique lines in a file:
```shell
sort file.txt | uniq | wc -l
```

To find the top 10 most common words in a text file:

```shell
grep -Eo '[[:alnum:]]+' file.txt | sort | uniq -c | sort -rn | head -10
```

To list unique IP addresses from an access log:

```shell
awk '{print $1}' /var/log/nginx/access.log | sort | uniq
```

To find lines that appear in one file but not another (using -u against merged sorted files, assuming neither file contains internal duplicates):

```shell
sort file1.txt file2.txt | uniq -u
```

Quick Reference
| Command | Description |
|---|---|
| `sort file.txt \| uniq` | Remove all duplicate lines |
| `sort file.txt \| uniq -c` | Count occurrences of each line |
| `sort file.txt \| uniq -c \| sort -rn` | Rank lines by frequency |
| `sort file.txt \| uniq -d` | Show only duplicate lines (one per group) |
| `sort file.txt \| uniq -D` | Show all instances of duplicate lines |
| `sort file.txt \| uniq -u` | Show only lines that appear exactly once |
| `uniq -i file.txt` | Compare lines case-insensitively |
| `uniq -f 2 file.txt` | Skip first 2 fields when comparing |
| `uniq -s 5 file.txt` | Skip first 5 characters when comparing |
| `uniq -w 10 file.txt` | Compare only first 10 characters |
Troubleshooting
Duplicates are not removed
uniq only removes adjacent duplicate lines. If the file is not sorted, non-consecutive duplicates are not detected. Always sort the input first: sort file.txt | uniq.
-c output has inconsistent spacing
The count is right-aligned and padded with spaces. If you need to process the output further, use awk to normalize spacing while preserving the full line text:

```shell
sort file.txt | uniq -c | awk '{count=$1; $1=""; sub(/^ +/, ""); print count, $0}'
```
Case variants are treated as different lines
Use the -i option to compare case-insensitively. Also sort with -f so that case-insensitive duplicates are adjacent before uniq processes them: sort -f file.txt | uniq -i.
FAQ
What is the difference between sort -u and sort | uniq?
Both produce the same output for simple deduplication. sort -u is slightly more efficient. sort | uniq is more flexible because uniq supports options like -c (count occurrences) and -d (show only duplicates) that sort -u does not.
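A quick check of the equivalence, using invented sample data:

```shell
# Identical deduplicated output from both forms
printf 'b\na\nb\na\n' | sort -u
printf 'b\na\nb\na\n' | sort | uniq

# Only the uniq form supports counting
printf 'b\na\nb\na\n' | sort | uniq -c
```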
Does uniq modify the input file?
No. uniq reads from a file or standard input and writes to standard output. The original file is never modified. To save the result, use redirection: sort file.txt | uniq > deduped.txt.
How do I count unique lines in a file?
Pipe through sort, uniq, and wc: sort file.txt | uniq | wc -l.
How do I find lines that only appear in one of two files?
Merge both files and use uniq -u: sort file1.txt file2.txt | uniq -u. Lines shared by both files become duplicates in the merged sorted stream and are suppressed. Lines that exist in only one file remain unique and are printed.
Can uniq work on columns instead of full lines?
Yes. Use -f N to skip the first N fields, -s N to skip the first N characters, or -w N to limit comparison to the first N characters. This lets you deduplicate based on a portion of each line.
Conclusion
The uniq command is a focused tool for filtering and counting duplicate lines. It is most effective when used after sort in a pipeline and pairs naturally with wc, grep, and head for text analysis tasks.
If you have any questions, feel free to leave a comment below.
