Linux Tips and Tricks

Basic operations using Awk and Sed

March 15

awk and sed

Swap two columns using sed and awk

$ echo "A B" | sed 's/\(.\) \(.\)/\2 \1/'
B A
 
$ echo "A B" | awk '{print $2,$1}'
B A
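The sed pattern above swaps single characters only, because each . matches exactly one character. Here is a minimal sketch for whole words, assuming the input has two space-separated words and that your sed supports -E (extended regular expressions, as GNU and BSD sed do). The function name swap_words is only an illustrative choice.

```shell
# swap_words is a hypothetical helper: it swaps the first two
# space-separated words of every line read from standard input.
swap_words() {
    sed -E 's/([^ ]+) ([^ ]+)/\2 \1/'
}

echo "hello world" | swap_words    # prints: world hello
```

The awk version needs no change, since $1 and $2 are already whole fields.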

Replace the first occurrence of “AB” with “ZZ”

$ echo "ABCDAB" | awk '{sub("AB","ZZ",$0)}1'
ZZCDAB
 
$ echo "ABCDAB" | sed 's/AB/ZZ/'
ZZCDAB

Replace all occurrences of “AB” with “ZZ”

$ echo "ABCDAB" | awk '{gsub("AB","ZZ",$0)}1'
ZZCDZZ
 
$ echo "ABCDAB" | sed 's/AB/ZZ/g'
ZZCDZZ
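Between "first" and "all" there is a third case: replacing only the Nth occurrence. A small sketch using the numeric flag of the sed s command; the function name replace_second is made up for this example.

```shell
# A number after the last slash tells sed to replace only that
# occurrence on each line (here: the 2nd "AB").
replace_second() {
    sed 's/AB/ZZ/2'
}

echo "ABCDAB" | replace_second    # prints: ABCDZZ
```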

Print all the characters separated by spaces

$ echo "ABCDAB" | awk -v FS= '{for(i=1;i<=NF;i++){printf("%s ",$i)}}'
A B C D A B 
 
$ echo "ABCDAB" | sed 's/./& /g'                                                
A B C D A B

Print the 2nd line of the test.txt file

$ cat test.txt
one
two
three
four
five
 
$ sed -n '2p' test.txt
two
 
$ awk 'NR==2' test.txt
two
 
$ sed '2!d' test.txt
two
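The commands above still read the whole file after printing line 2. For large files it can help to stop early. This is a sketch only: it builds its own five-line sample in a temporary file (the variable name sample is just for this example) so that it runs anywhere.

```shell
# Build a five-line sample file in a temporary location.
sample=$(mktemp)
printf '%s\n' one two three four five > "$sample"

# Print line 2, then quit immediately instead of reading the rest.
second_sed=$(sed -n '2{p;q;}' "$sample")
second_awk=$(awk 'NR==2{print; exit}' "$sample")

echo "$second_sed"    # prints: two
echo "$second_awk"    # prints: two

rm -f "$sample"
```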

Print all the lines except the 2nd line

$ sed '2d' test.txt
one
three
four
five
 
$ awk 'NR!=2' test.txt
one
three
four
five

Print the 2nd,3rd and 4th line in the test.txt file

$ sed -n '2,4p' test.txt
two
three
four
 
$ awk 'NR>1&&NR<5' test.txt
two
three
four

Replace the word "one" with "two" in the 2nd line of the file
$ sed -e '2 s/one/two/' test.txt
one
two
three
four
five
 
$ awk 'NR==2{sub("one","two",$0)}1' test.txt
one
two
three
four
five
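Note that line 2 of test.txt is "two", so the commands above find nothing to replace and print the file unchanged. Here is a sketch where the addressed line does contain the search text, plus a variant that selects the line by a regular expression instead of a number. The sample file and variable names are assumptions made for this example.

```shell
# Three-line sample file in a temporary location.
sample=$(mktemp)
printf '%s\n' one two three > "$sample"

# Address by line number: substitute only on line 2.
by_number=$(sed '2 s/two/TWO/' "$sample")

# Address by pattern: substitute only on lines starting with "th".
by_pattern=$(sed '/^th/ s/e/E/g' "$sample")

echo "$by_number"     # one, TWO, three (one per line)
echo "$by_pattern"    # one, two, thrEE (one per line)

rm -f "$sample"
```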

Exploring echo and Colorful echo

March 13

echo command

The echo command writes text to the terminal or console.

1) Create a file using the echo command.

$ echo "The Linux Tips" > myfile.txt

2) Append text to an existing file

$ cat myfile.txt
The Linux Tips
 
$ echo "The Linux Tips - Second Line" >> myfile.txt
 
$ cat myfile.txt
The Linux Tips
The Linux Tips - Second Line

3) echo and its arguments

-n -> It will not output the trailing newline

$ echo -n "The Linux Tips"
The Linux Tips$

Default behavior ( with the trailing newline )

$ echo "The Linux Tips"
The Linux Tips
$

-e -> It will enable the interpretation of backslash escapes

# In the example below, \n (newline) is printed literally because -e is not used
 
$ echo "The Linux \n Tips "
The Linux \n Tips
$ echo -e "The Linux \n Tips"
The Linux
 Tips
# In the example below, \t (tab) is printed literally unless -e is used
$ echo "The Linux\tTips"
The Linux\tTips
 
$ echo -e "The Linux\tTips"
The Linux    Tips

With the -e option enabled, the following escape sequences are available.

# Print a \ (backslash)
 
$ echo -e "\\"

\

# \b is a backspace. In the example below, the d is overwritten on screen by the e
 
$ echo -e "abcd\bef"
abcef

 

# \c produces no further output. In the example below, ef is not printed
 
$ echo -e "abcd\cef"
abcd$

 

# \e is the escape character (ESC)
 
$ echo -e "abcd\eef"
abcdef

 

# \f is for form feed.
 
$ echo -e "abcd\fef"
abcd
    ef

 

# \r is for carriage return
 
$ echo -e "abcd\ref"
efcd

 

# \t is for horizontal tab
 
$ echo -e "abcd\tef"
abcd    ef

 

# \v is for vertical tab
 
$ echo -e "abcd\vef"
abcd
    ef

-E -> disable interpretation of backslash escapes (default)

$ echo -E "The Linux \n Tips"
The Linux \n Tips
 
$ echo -e "The Linux \n Tips"
The Linux
 Tips
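How echo treats -e and backslashes differs between shells, so scripts that must behave the same everywhere often use printf instead. A minimal sketch of that alternative (it is not part of the echo examples above; the variable names are only for illustration):

```shell
# printf always interprets backslash escapes in its format string.
tabbed=$(printf 'The Linux\tTips\n')
printf '%s\n' "$tabbed"

# With %s the argument is printed literally, so \n stays as two characters.
literal=$(printf '%s\n' 'The Linux \n Tips')
printf '%s\n' "$literal"    # prints: The Linux \n Tips
```

The first form corresponds to echo -e, the second to echo -E.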

echo and its colors

The program below uses ANSI escape code SGR (Select Graphic Rendition) sequences. ( For more about SGR sequences, search for “ANSI escape code” on Wikipedia )

Type the program below and save it as colors.sh

$ cat colors.sh
#!/bin/sh
# printf interprets \033 (ESC) in every shell; plain echo does so only in some
FGRED=$(printf '\033[31m')
FGCYAN=$(printf '\033[36m')
BGRED=$(printf '\033[41m')
FGBLUE=$(printf '\033[34m')     # 34 is blue; 35 would be magenta
BGGREEN=$(printf '\033[42m')

NORMAL=$(printf '\033[m')       # reset all attributes

echo "${FGBLUE} Text in blue ${NORMAL}"
echo "Text normal"
echo "${BGRED} Background in red"
echo "${BGGREEN} Background in Green and back to Normal ${NORMAL}"

Give the script execute permission ( chmod +x colors.sh ) and run it. The listing below cannot show the colors on this page 🙁

$ ./colors.sh  
 Text in blue
Text normal
 Background in red
 Background in Green and back to Normal

After executing the script, you can see some colorful text and background colors

Change the numbers after the “[” ( opening square bracket ) and try other numeric combinations to see different colors 🙂
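To experiment with the numbers quickly, the sequence can be wrapped in a small function. This is only a sketch: color_text is a name invented here, and it assumes a terminal that understands ANSI SGR codes.

```shell
# color_text is a hypothetical helper.
#   $1 = SGR number (for example 31 = red text, 42 = green background)
#   $2 = text to print
# It emits ESC[<number>m, the text, then ESC[0m to reset the attributes.
color_text() {
    printf '\033[%sm%s\033[0m\n' "$1" "$2"
}

color_text 31 "red text"
color_text 42 "green background"
```

Because the reset is built in, a color never leaks into the following lines.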

 

bye
kamaraj


Awk Basics & Tutorial – 2

March 3

Today, we will look at some awk pattern matching.

Create an input.txt file with the contents below. We are going to use input.txt for all our awk commands.

$ cat input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
scripting language consisting of a set of actions to be taken against
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
string datatype, associative arrays (that is, arrays indexed by key strings),
and regular expressions.

To print only the lines of input.txt that contain the word “awk”, we can use the commands below.

# In the command below, /pattern/ is a plain word or a regular expression.
awk '/pattern/' input.txt

$ awk '/awk/' input.txt
producing formatted reports. The language used by awk extensively uses the
$ awk '/awk/{print}' input.txt
producing formatted reports. The language used by awk extensively uses the
$ awk '/awk/{print $0}' input.txt
producing formatted reports. The language used by awk extensively uses the

How to ignore case when matching a pattern?

We can use the tolower or toupper function in awk to match regardless of case and print the lines.

$ awk 'tolower($0)~/awk/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the
$ awk 'toupper($0)~/AWK/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the

# GNU awk has a special variable called IGNORECASE. By default it is set to 0 (matching is case-sensitive); any non-zero value makes matching case-insensitive.
# In the command below, -v defines a variable and sets its initial value.

$ awk -v IGNORECASE=1 '/awk/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the
$ awk 'BEGIN{IGNORECASE=1}/awk/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the

We can also list the upper-case and lower-case form of each letter in square brackets to match the pattern.

$ awk '/[Aa][Ww][Kk]/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the

Now, let us see how to match two or more patterns in a file.

We can combine patterns with && and || in the condition.

How to search for the words “the” and “awk” in the same line of a file?

$ awk '/the/ && /awk/' input.txt
producing formatted reports. The language used by awk extensively uses the
# Ignore the case (GNU awk only)
$ awk 'BEGIN{IGNORECASE=1} /the/ && /AWK/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
producing formatted reports. The language used by awk extensively uses the

To match lines that contain any one of the patterns, use ||

# Ignore the case (GNU awk only)
$ awk 'BEGIN{IGNORECASE=1} /the/ || /AWK/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
$ awk '/the/ ||  /awk/' input.txt
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
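Besides && and ||, a pattern can be negated with !. A short sketch on a made-up three-line sample (the file contents and variable names are assumptions for this example, not the input.txt used above):

```shell
# Small sample file in a temporary location.
sample=$(mktemp)
printf '%s\n' 'the awk book' 'the sed book' 'plain text' > "$sample"

# Lines that contain "the" but not "awk".
only_the=$(awk '/the/ && !/awk/' "$sample")

# Lines that contain neither "the" nor "awk".
neither=$(awk '!(/the/ || /awk/)' "$sample")

echo "$only_the"    # prints: the sed book
echo "$neither"     # prints: plain text

rm -f "$sample"
```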

How to find the lines which start with “t” or “T”?

# Here ^ anchors the match to the start of the line
$ awk '/^[tT]/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
textual data (either in files or data streams) for the purpose of
# GNU awk only
$ awk 'BEGIN{IGNORECASE=1} /^t/' input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
textual data (either in files or data streams) for the purpose of

How to find the lines which end with a specific word?

The special character “$” anchors the pattern to the end of the line.

$ cat one.txt
one two three
two three one
three two one
one three two
# Find the line which ends with the word "one"
$ awk '/one$/' one.txt
two three one
three two one
# Find the line which ends with the word "two"
$ awk '/two$/' one.txt
one three two
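One thing to keep in mind: /one$/ matches any line whose last characters are "one", so a line ending in "none" or "someone" also matches. If the last word must be exactly "one", comparing the last field is one option. A sketch with a made-up sample file:

```shell
# Sample file in which one line ends with "none".
sample=$(mktemp)
printf '%s\n' 'two three one' 'all or none' 'one three two' > "$sample"

# The regular expression matches both "one" and "none" at the end.
regex_hits=$(awk '/one$/' "$sample")

# Comparing the last field ($NF) matches only the whole word "one".
field_hits=$(awk '$NF == "one"' "$sample")

echo "$regex_hits"    # two lines: "two three one" and "all or none"
echo "$field_hits"    # prints: two three one

rm -f "$sample"
```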

Hope you enjoyed this post and learned something about the basics of pattern matching.

We will see some more basics in the next post.

bye
Kamaraj


Awk Basics & Tutorial – 1

March 1

What is AWK ?

The AWK utility is a data extraction and reporting tool that uses a data-driven scripting language consisting of a set of actions to be taken against textual data (either in files or data streams) for the purpose of producing formatted reports. The language used by awk extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions.

For more theory about AWK, just google it.

Create a text file (input.txt) with the below contents

$ cat input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
scripting language consisting of a set of actions to be taken against
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
string datatype, associative arrays (that is, arrays indexed by key strings),
and regular expressions.

AWK Structure

pattern {action}

BEGIN and END patterns in the AWK

BEGIN { print "BEGIN" }
{ print }
END   { print "END" }

Example:

$ awk 'BEGIN{print "BEGIN"}{print}END{print "END"}' input.txt
BEGIN
The AWK utility is a data extraction and reporting tool that uses a data-driven
scripting language consisting of a set of actions to be taken against
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
string datatype, associative arrays (that is, arrays indexed by key strings),
and regular expressions.
END

In the above example, the word “BEGIN” is printed on the first line and the word “END” on the last line.

BEGIN and END are special patterns; they are not matched against the input records.

The BEGIN block always executes before any input is read. In the example below, I pass a file that does not exist in my current path. Even so, the BEGIN block executes properly.

$ awk 'BEGIN{print "BEGIN"}{print "TEST"}' aaaaaaaaaa.txt
BEGIN
awk: fatal: cannot open file `aaaaaaaaaa.txt' for reading (No such file or directory)

The END block executes once all the input has been read ( the file is processed fully ).
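Two small consequences of this, shown as a sketch: a program that has only a BEGIN block needs no input at all, and END is a natural place to report something collected while reading. The counter name n and the three-line input are only illustrative.

```shell
# A program with only a BEGIN block never reads input, so no file is needed.
answer=$(awk 'BEGIN { print 6 * 7 }')
echo "$answer"    # prints: 42

# The main block counts the lines; END reports the count after the last line.
summary=$(printf '%s\n' one two three |
    awk 'BEGIN { print "start" } { n++ } END { print n, "lines read" }')
echo "$summary"   # prints "start", then "3 lines read"
```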

Now, let us see how to print particular columns of the input file.

By default, awk splits each line into fields on whitespace ( spaces and tabs ).

$N refers to the column ( field ) at position N.

# Contents of the in.txt
 
$ cat in.txt    
AAA 123
BBB 234
CCC 456
# Print the First Column in the in.txt
 
$ awk '{print $1}' in.txt
AAA
BBB
CCC
# Print the Second Column in the in.txt
 
$ awk '{print $2}' in.txt
123
234
456

 

# Swap the columns and print the in.txt
 
$ awk '{print $2,$1}' in.txt
123 AAA
234 BBB
456 CCC

If the fields are separated by some other delimiter, how do we print the columns?

awk has an option called -F. We can use this option to specify the delimiter.

In the example below, the input file uses the pipe ( | ) as the delimiter.

$ cat in.txt
AAA|123
BBB|234
CCC|456

Why do we write -F\| ( backslash + | )?
The pipe is a special character for the shell, so it must be escaped with a backslash or quoted ( for example -F'|' ) to reach awk unchanged.

$ awk -F\| '{print $1}' in.txt
AAA
BBB
CCC

 

$ awk -F\| '{print $2}' in.txt
123
234
456

 

$ awk -F\| '{print $2,$1}' in.txt
123 AAA
234 BBB
456 CCC
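In the last output the swapped columns are joined by a space, because print separates its arguments with the output field separator OFS, which defaults to a single space. Here is a sketch that keeps the pipe in the output by setting OFS as well; the sample file is recreated in a temporary location so the example runs on its own.

```shell
# Recreate the pipe-delimited sample in a temporary file.
sample=$(mktemp)
printf '%s\n' 'AAA|123' 'BBB|234' 'CCC|456' > "$sample"

# Quoting the delimiter works the same as escaping it with a backslash.
quoted=$(awk -F'|' '{print $2}' "$sample")

# FS sets the input delimiter, OFS the delimiter used by print.
swapped=$(awk 'BEGIN { FS = "|"; OFS = "|" } { print $2, $1 }' "$sample")

echo "$quoted"     # 123, 234, 456 (one per line)
echo "$swapped"    # 123|AAA, 234|BBB, 456|CCC (one per line)

rm -f "$sample"
```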

In some cases, we do not know how many fields ( columns ) there are in the input file. In that case, how do we print the last column or the second-to-last column?

awk has a special variable called NF ( number of fields ).

So, we can print the last field using $NF and the second-to-last field using $(NF-1).

$ cat input.txt
The AWK utility is a data extraction and reporting tool that uses a data-driven
scripting language consisting of a set of actions to be taken against
textual data (either in files or data streams) for the purpose of
producing formatted reports. The language used by awk extensively uses the
string datatype, associative arrays (that is, arrays indexed by key strings),
and regular expressions.

 

#Prints the number of fields in each line
 
$ awk '{print NF}' input.txt
14
12
12
11
11
3

 

#Prints the last field in the line
 
$ awk '{print $NF}' input.txt
data-driven
against
of
the
strings),
expressions.

 

#Prints the second-to-last field in the line.
 
$ awk '{print $(NF-1)}' input.txt
a
taken
purpose
uses
key
regular
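Since NF is an ordinary number, it can also drive a loop. A small sketch that prints the fields of each line in reverse order; reverse_fields is a name made up for this example.

```shell
# reverse_fields is a hypothetical helper: it walks from the last field (NF)
# down to the first and prints them separated by single spaces.
reverse_fields() {
    awk '{ for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? " " : "\n") }'
}

echo "and regular expressions." | reverse_fields    # prints: expressions. regular and
```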

How to print the line number in awk?

awk has a special variable called NR. It holds the number of the line ( record ) currently being processed.

$ awk '{print NR}' input.txt
1
2
3
4
5
6

 

$ awk '{print NR,$0}' input.txt
1 The AWK utility is a data extraction and reporting tool that uses a data-driven
2 scripting language consisting of a set of actions to be taken against
3 textual data (either in files or data streams) for the purpose of
4 producing formatted reports. The language used by awk extensively uses the
5 string datatype, associative arrays (that is, arrays indexed by key strings),
6 and regular expressions.
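NR keeps counting across all the input files given on the command line. Its companion FNR restarts at 1 for every file. A sketch with two tiny temporary files (the file contents and variable names are assumptions for this example):

```shell
# Two small files in temporary locations.
first=$(mktemp)
second=$(mktemp)
printf '%s\n' a b > "$first"
printf '%s\n' c d > "$second"

# NR = line number overall, FNR = line number within the current file.
numbered=$(awk '{ print NR, FNR, $0 }' "$first" "$second")
echo "$numbered"

rm -f "$first" "$second"
```

The four output lines are "1 1 a", "2 2 b", "3 1 c" and "4 2 d".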

Notice the $0 in the above command. What is that?

$0 holds the whole line.

If print is used alone in the block, it prints the whole line.

$ awk '{print}' in.txt
AAA|123
BBB|234
CCC|456

So, today you learned the following things about awk.

1) AWK pattern
2) BEGIN block
3) END block
4) Printing particular columns
5) -F argument
6) NF variable
7) NR variable

I will cover some other basics in the next post.

– Kamaraj
