
sed command: A comprehensive guide


Category: Linux

Date: November 2021
Views: 1.37K


sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.

In this article I will show you the power of sed and how easy and convenient it makes editing text files. So let's get to it.

Search and replace

This is one of the most basic uses of sed, and you will find yourself using it a lot.

The general form is: sed 's/pattern/replacement/'. The delimiter / can be replaced by almost any other character, but whichever one you choose must be escaped with \ wherever it appears inside the pattern or the replacement, otherwise sed reports an error. So if you are dealing with text full of paths, i.e. a lot of / characters, it is easier to pick another delimiter, like so: sed 's|day|night|'. You can also add a "g" flag at the end of the command to make it global, meaning it changes all occurrences of the pattern on each line and not just the first one, like so:

    
echo "good day to you Ser, and good day to you Madame" | sed 's/day/night/'
# the result will be:
 good night to you Ser, and good day to you Madame
echo "good day to you Ser, and good day to you Madame" | sed 's/day/night/g'
# the result will be:
 good night to you Ser, and good night to you Madame
    
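To make the point about delimiters concrete, here is a small sketch. The path is a made-up value used only for illustration; both commands do the same substitution, once with | as the delimiter and once with / escaped everywhere.

```shell
# Hypothetical path, used only for illustration.
path="/usr/local/bin/tool"

# With "|" as the delimiter, the slashes in the pattern need no escaping.
result=$(echo "$path" | sed 's|/usr/local|/opt|')
echo "$result"    # /opt/bin/tool

# The same substitution with "/" as the delimiter: every slash must be escaped.
escaped=$(echo "$path" | sed 's/\/usr\/local/\/opt/')
echo "$escaped"   # /opt/bin/tool
```

The second form works, but it is much harder to read, which is why switching delimiters is usually the better choice for paths.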

Search and replace: regular expressions

You can search using literal words or regular expressions. Here are a few symbols in case you are not familiar with regex: ^ means the beginning of the line, $ the end of the line, and . (dot) matches any single character.

The following will wrap every line of the file in double quotes by inserting " at the beginning and at the end of each line. The last example uses .{20} to match exactly 20 characters.

    
# insert double quote " at the beginning of the line
# -i means change the actual file: perform the changes in place instead of printing the result to stdout
sed -i 's/^/"/'  myfile
# insert double quote " at the end of the line, this can be used to append to a line
sed -i 's/$/"/'  myfile
echo {a..z} | sed -E 's/^.{20}//'
# remove the first 20 characters: k l m n o p q r s t u v w x y z
    
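As a quick illustration of how ^, $ and . work together, here is a sketch on a made-up word list. The pattern ^c.t$ only matches lines that are exactly three characters long, starting with c and ending with t.

```shell
# Sample words, invented for this example.
result=$(printf 'cat\ncut\ncart\nscat\n' | sed 's/^c.t$/MATCH/')
echo "$result"
# MATCH
# MATCH
# cart
# scat
```

"cart" is left alone because . stands for exactly one character, and "scat" is left alone because ^ anchors the match to the start of the line.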

Search and replace: keep and reuse a pattern

You can capture part of a match and reuse it in the replacement as many times as you like. We do this by surrounding the part we want to keep with parentheses. With the argument -E (extended regular expressions) the group is written ( ); without -E, in basic regular expressions, it is written \( \). The examples here use -E, like so:

    
echo "day and night" | sed -E 's/(and)/\1 \1 \1/'
# day and and and night
    
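For comparison, here is the same kind of group written both ways. This is only a sketch to show that the two syntaxes are equivalent; the input string is the one used above.

```shell
# Extended syntax (-E): plain parentheses create the group.
ere=$(echo "day and night" | sed -E 's/(and)/\1 \1/')

# Basic syntax (no -E): the parentheses must be escaped to create the group.
bre=$(echo "day and night" | sed 's/\(and\)/\1 \1/')

echo "$ere"   # day and and night
echo "$bre"   # day and and night
```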

consider a line containing the following :

hello 1234 ladies and gentlmen, welcome

We will apply the following sed command to it:

sed -E 's/(^[^ ]*) ([0-9]*) (.*)and (.*)$/\2 \1 \3 AND :\2: \4 \1 + \3/'

Explanation:

  • ^[^ ]* : starting at the beginning of the line (^), zero or more (*) characters that are not a space ([^ ]; inside the brackets ^ means "not"). This matches "hello" at the beginning of the line
  • then a space, and then
  • [0-9]* : zero or more (*) digits ([0-9]). This matches 1234
  • then a space, and then
  • (.*) : zero or more (*) of any character (.). This matches "ladies ", notice the space after ladies
  • then "and", then a space
  • (.*)$ : finally, any characters until the end of the line ($). This matches "gentlmen, welcome"

With -E, the parentheses ( ) are not part of the matched text; to match a literal parenthesis we escape it as \( or \). Each pair of parentheses forms a group, and the groups are numbered from \1 in the order of their opening parentheses. In the example above we have 4 of them, and we can use them however we want in the replacement. This will produce:

1234 hello ladies  AND :1234: gentlmen, welcome hello + ladies
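A more everyday use of numbered groups is reordering fields. The sketch below rewrites a date from year-month-day to day/month/year; the date is an arbitrary sample, and # is used as the delimiter so that the slashes in the replacement need no escaping.

```shell
# Arbitrary sample date, used only for illustration.
result=$(echo "2021-11-05" | sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#')
echo "$result"   # 05/11/2021
```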

Operating on lines : by line numbers

sed commands affect all the lines of the file unless told otherwise. The "s" command for search and replace, for instance, can be told to operate on a single line or a range of lines like so:

    
# replace day by night only on the first line
sed '1s/day/night/' file
# do the same but on the last line
sed '$s/day/night/' file
# replace day by night in the lines from line 3 to line 7
sed '3,7s/day/night/' file
    
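Here is a small sketch of line addresses that you can run without a file. The three input lines are invented for the example.

```shell
# Substitute only on line 2.
second=$(printf 'day one\nday two\nday three\n' | sed '2s/day/night/')
echo "$second"
# day one
# night two
# day three

# Substitute only on the last line ($ inside single quotes is not a shell variable).
last=$(printf 'day one\nday two\nday three\n' | sed '$s/day/night/')
echo "$last"
# day one
# day two
# night three
```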

Let's say we want to print only specific lines of a file. We need to tell sed to suppress its normal output with the argument -n and then use the command "p" to print the lines we want:

    
# only print the 2nd line:
sed -n '2p' file
# only print lines 3 to 7
sed -n '3,7p' file
# print from line 10 to the last line
sed -n '10,$p' file
    
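The -n matters more than it looks. The sketch below runs the same "p" command with and without it on a made-up five-line input: without -n, sed still prints every line by default, so the selected lines come out twice.

```shell
# With -n: only lines 2 to 4 are printed.
with_n=$(printf '%s\n' a b c d e | sed -n '2,4p')
echo "$with_n"
# b
# c
# d

# Without -n: every line is printed once, and lines 2 to 4 a second time.
without_n=$(printf '%s\n' a b c d e | sed '2,4p')
echo "$without_n"
# a
# b
# b
# c
# c
# d
# d
# e
```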

There is also the "d" command to delete lines by their numbers; it takes line numbers and ranges just like the "p" command above. PS: do not suppress the output with -n when you are deleting lines, otherwise the lines that remain will not be printed either.
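A short sketch of "d" on a made-up five-line input, including what happens if -n is added by mistake:

```shell
# Delete lines 2 to 4; the remaining lines are printed as usual.
kept=$(printf '%s\n' a b c d e | sed '2,4d')
echo "$kept"
# a
# e

# With -n nothing is printed at all: the deleted lines are gone
# and the automatic printing of the other lines is switched off.
nothing=$(printf '%s\n' a b c d e | sed -n '2,4d')
echo "[$nothing]"   # []
```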

Operating on lines : by patterns

What if you want to operate on specific lines but you don't know their line numbers? sed also lets you select lines using patterns. Consider a file containing the following lines:

dear diary.
day 1 : we went camping outside to spend 7 days in the forest

and then we needed to go fishing in the forest

day 2 : it was raining all day
day 3 : trip canceled

operating on lines by patterns:

    
# print lines containing "forest"
sed -n '/forest/p' file
# delete everything from the line matching "day 1" through the line matching "day 2", both included
sed '/day 1/,/day 2/d' file
# delete empty lines, ie lines that have nothing between the start ^ and the end $
sed '/^$/d' file
# or lines containing only spaces
sed '/^[ ]*$/d' file
    
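If you want to try these commands yourself, the sketch below recreates the diary file in a temporary location (the use of mktemp is my choice for the example, any file name works) and runs three of them.

```shell
# Recreate the sample file in a temporary location.
diary=$(mktemp)
cat > "$diary" <<'EOF'
dear diary.
day 1 : we went camping outside to spend 7 days in the forest

and then we needed to go fishing in the forest

day 2 : it was raining all day
day 3 : trip canceled
EOF

# Lines containing "forest": the "day 1" line and the "fishing" line.
forest=$(sed -n '/forest/p' "$diary")

# Delete from the "day 1" line through the "day 2" line, both included.
trimmed=$(sed '/day 1/,/day 2/d' "$diary")
echo "$trimmed"
# dear diary.
# day 3 : trip canceled

# Delete the two empty lines, leaving five lines.
compact=$(sed '/^$/d' "$diary")

rm -f "$diary"
```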

When we select lines by a pattern or a range of patterns, the default delimiter is "/". If we need to change it, we have to put "\" before the first occurrence of the new delimiter, like so: sed -n '\XhelloXp' file. Here the delimiter is "X", and the command will only print lines containing "hello".
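This is most useful when the pattern itself contains slashes. A minimal sketch, with two made-up paths as input and | as the address delimiter:

```shell
# The leading backslash tells sed that "|" delimits the address pattern.
result=$(printf '/usr/bin\n/etc/passwd\n' | sed -n '\|/etc|p')
echo "$result"   # /etc/passwd
```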

The following example will replace "fishing" by "hunting" in the lines from "day 1" through "day 3":

    
sed '/day 1/,/day 3/s/fishing/hunting/' file
# combining the above with p to only print these lines after the change
sed -n '/day 1/,/day 3/{s/fishing/hunting/;p}' file
    

In the second example above, we can combine as many commands as we want inside the {} and separate them with a semicolon ";". These commands can be multiple search and replace commands, "p" for printing, "d" for deleting, or any of the other commands that we'll show later.
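Here is a sketch of a block with three commands applied to a pattern range. The input lines are invented for the example, and the brace syntax on one line is written the way GNU sed accepts it.

```shell
# Inside the range "day 1" .. "day 3": two substitutions, then print.
result=$(printf 'day 0 fishing\nday 1 fishing\nnote\nday 3 fishing\nday 4 fishing\n' |
  sed -n '/day 1/,/day 3/{s/fishing/hunting/;s/day/DAY/;p}')
echo "$result"
# DAY 1 hunting
# note
# DAY 3 hunting
```

Lines outside the range ("day 0" and "day 4") are neither changed nor printed, because -n suppresses the default output and "p" only runs inside the block.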

Inserting text before or after a line: by line number

sed can also insert text at a specific line of our file, and here is how we can do it:

    
# insert before line 1, ie at the beginning of the file
sed '1ithis text will be inserted at the beginning\nand this is a new line' file
# insert after the 3rd line
sed '3athis will be added after the 3rd line' file
    

Notice that there is no need for spaces or anything: sed will read the number, then the "i" to insert before or the "a" to append after, and then your text. Writing the text on the same line as the command is a GNU sed extension. When I used this in a shell script with a string variable containing text and some special characters, it was not consistent and sometimes threw errors, so for complicated and multiline text that needs to be inserted I use "ed, the standard editor", which is an awesome tool and the ancestor of tools such as sed, grep and vi. The ed command needs its own article; maybe it will be next.
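For reference, the traditional way to write "i" puts a backslash after the command and the text on the following line. The sketch below uses that form on a made-up two-line input; the text line is deliberately not indented so that no extra whitespace ends up in the output.

```shell
# "i\" followed by a newline, then the text to insert before line 1.
result=$(printf 'one\ntwo\n' | sed '1i\
header line')
echo "$result"
# header line
# one
# two
```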

Inserting text before or after a line: by pattern

Inserting before or after a line can also be done by finding the line by a specific pattern or regular expression.

    
# insert before the line containing "day 2"
sed -i '/day 2/i these lines describe actions done before day 2' file
# how about the crazy idea of inserting after every line containing the word "forest"
sed -i '/forest/a hello nature\nhello world' file
    
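To see the effect without touching a file, the sketch below drops -i and pipes a made-up three-line input through the same kind of command. It assumes GNU sed, since the text is written on the same line as "a".

```shell
# Append a line after every line containing "forest" (GNU sed one-line form).
result=$(printf 'forest\ntown\nforest\n' | sed '/forest/a hello nature')
echo "$result"
# forest
# hello nature
# town
# forest
# hello nature
```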

Advanced stuff:

At this point we need an in-depth understanding of the inner workings of the sed command. sed iterates over the file line by line. When a line is read, it goes into a buffer called the "pattern space". It is in this pattern space that sed applies the instructions and commands we provide, be it s, p, d or any other command. After the commands have been applied to the pattern space, the result is printed to the output (unless -n was given). Then the pattern space is emptied for the next line of the file, and the cycle continues until the end of the file.
There is another buffer in sed, called the "hold space". Any line stored in the hold space is not operated upon by our commands. As the name suggests, the lines in the hold space are on hold, and they stay as they are until we bring them back to the pattern space.

[Figure: the pattern space and the hold space]

The following commands work on the pattern and hold spaces, and are illustrated in the picture above:

  • n N Read/append the next line of input into the pattern space
  • h H Copy/append pattern space to hold space
  • g G Copy/append hold space to pattern space
  • x Exchange the contents of the hold and pattern spaces.
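Two classic one-liners give a feel for these commands. This is only a sketch on made-up input: the first uses G and h to reverse the order of the lines, the second uses N to join lines in pairs.

```shell
# Reverse the order of the lines:
#   1!G  on every line except the first, append the hold space to the pattern space
#   h    copy the pattern space (all lines read so far, newest first) to the hold space
#   $p   on the last line, print the accumulated result
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
echo "$reversed"
# c
# b
# a

# Join each pair of lines: N appends the next line to the pattern space,
# then the newline between the two lines is replaced by a space.
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/ /')
echo "$pairs"
# 1 2
# 3 4
```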

In order to clearly show you examples of using the hold and pattern spaces, I will introduce the concept of sed scripts. Up until now we used sed one-liners, ie a block of sed commands inside single ' ' or double quotes " ". But it is possible to write a separate sed script and provide it to sed via the argument "-f".
The following script will not do anything fancy. It will just move the first line of our file to the end, making it the last line. In every other line it will also change the first occurrence of "day" to "night" and the first occurrence of "and" to "AND"; the moved line itself is left untouched, because it waits in the hold space while the substitutions run. It is run with the command:
sed -f scriptfilename ourfilename

    
# this is a comment in the sed script
# this is a block of commands
1{
    # this block will operate on the first line
    # copy the line from pattern space to hold space
    h
    # then Delete pattern space.  Start next cycle.
    d
}
{
    # do the following for all lines
    s/day/night/
    s/and/AND/
}
${
    # this block will operate on the last line $
    # we get back the first line we stored in the hold space
    G
    # uppercase G appends the hold space to the pattern space
    # lowercase g erases the pattern space and replaces it
    # with what is stored in the hold space
}
    
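To try the script end to end, the sketch below writes a condensed copy of it to a temporary file and runs it with -f on three made-up lines. The temporary file name comes from mktemp and is only an assumption of this example. Note the quoted 'EOF': it stops the shell from trying to expand ${ inside the script.

```shell
# Write the sed script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
# first line: save it in the hold space, then delete it and start the next cycle
1{
    h
    d
}
# all remaining lines
s/day/night/
s/and/AND/
# last line: append the saved first line after it
${
    G
}
EOF

result=$(printf 'first line\nday and night\nlast day\n' | sed -f "$script")
echo "$result"
# night AND night
# last night
# first line

rm -f "$script"
```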

Labels and branching

If we are in a situation where we need to repeat a set of commands or instructions, we can put them after a label and then instruct sed to "branch to this label", ie go to the label and run those commands again. The following script shows how to read all the lines of a file into the pattern space and then replace all the newline characters with spaces, thus making it a single-line file. If you are familiar with goto in Windows batch scripting, this is the same concept.

    
# we make a new label using the colon ":" then a character as the name
# of the label
:a
# append the next line of the file into the pattern space
N
# unless this is the last line ("$!" means not the last line),
# branch to label a ("ba"): go to label a and repeat the commands after the label,
# in this example a single instruction: N
$!ba
# when we exit the loop, ie we reach the last line and the end of the file
# we substitute the new line character "\n" by space
s/\n/ /g
    

The above script can be compressed into a one-liner: sed ':a;N;$!ba;s/\n/ /g' file
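Here is the loop run on a made-up three-line input, as a sketch. It assumes GNU sed, which accepts a label followed by a semicolon on the same line.

```shell
# Read every line into the pattern space, then turn the newlines into spaces.
joined=$(printf 'a\nb\nc\n' | sed ':a;N;$!ba;s/\n/ /g')
echo "$joined"   # a b c
```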

A few useful sed examples

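remove leading whitespace (\s is a GNU extension): sed -E 's/^\s+//' file

the same with a bracket expression (spaces and tabs): sed 's/^[ \t]*//' file

remove html tags (when each tag opens and closes on the same line): sed 's/<[^>]*>//g' file.html

change in place and keep a backup of the original as file.bak: sed -i.bak 's/day/night/' file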
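The sketch below tries two of these on throwaway data: stripping tags from a made-up HTML snippet, and an in-place edit with a backup inside a temporary directory. The file name notes.txt is hypothetical.

```shell
# Remove html tags from a single line.
stripped=$(echo '<p>Hello <b>world</b></p>' | sed 's/<[^>]*>//g')
echo "$stripped"   # Hello world

# In-place edit with a backup, done in a temporary directory.
workdir=$(mktemp -d)
printf '  indented day\n' > "$workdir/notes.txt"
sed -i.bak 's/^[ ]*//' "$workdir/notes.txt"

new=$(cat "$workdir/notes.txt")       # edited file: leading spaces removed
old=$(cat "$workdir/notes.txt.bak")   # backup: original content, untouched
echo "[$new] [$old]"

rm -r "$workdir"
```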

And there you have it. I believe I covered the most useful parts of the sed command. If I overlooked or forgot something, please tell me in the comments.


