How to remove duplicated lines

Using awk

Sometimes lines are duplicated in your text file. You can easily remove these duplicates using awk:

awk '!seen[$0]++' files.txt
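
The expression !seen[$0]++ counts how often each full line ($0) has been encountered and is true only the first time, so only the first occurrence of every line is printed and the original order is preserved. For example, with a hypothetical files.txt containing:

apple
banana
apple

the command above outputs:

apple
banana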

By adding the -i inplace option (available in GNU Awk 4.1 and later), the original file is modified directly.

awk -i inplace '!seen[$0]++' files.txt
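
If your awk does not provide the GNU-specific inplace extension, a common portable alternative (the temporary file name here is just an example) is to write to a temporary file and then replace the original:

awk '!seen[$0]++' files.txt > files.txt.tmp && mv files.txt.tmp files.txt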

To remove duplicate lines based on a specific column, such as the second column, replace !seen[$0]++ with !seen[$2]++ (by default, awk splits each line into whitespace-separated fields).

awk -i inplace '!seen[$2]++' files.txt
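
For example, with a hypothetical files.txt containing:

1 apple
2 banana
3 apple

the command keeps only the first line for each distinct value in the second field:

1 apple
2 banana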

Using sort

You can also use sort to remove duplicates. The following command sorts the file and keeps one line per unique value in the second column; unlike the awk approach, the output comes back in sorted order rather than the original order. A worked example follows the flag descriptions below.

sort -u -t' ' -k2,2 file

The flags used are:

  • -u: outputs only the first line of each run of lines whose sort key compares equal.
  • -t: specifies the delimiter (in this case, a space).
  • -k: specifies the field(s) to use as the sort key, written as start[,end]. For example:
    • -k2,2: sorts on the second field only (as in the command above).
    • -k1,3: sorts on fields 1 through 3.
    • -k1: sorts from field 1 through the end of the line.
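
For example, with the same hypothetical three-line file as above (fields separated by a single space), the command prints one line per distinct second field, ordered by that field; when several lines share a key, which one is kept is up to sort (here it is the first):

sort -u -t' ' -k2,2 file
1 apple
2 banana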