Trimming White Space in Bash
Last updated
Strings and structured data usually need consistent formatting before they can be interpreted correctly. A common requirement is to remove leading and trailing white space from strings so that the text is uniformly formatted. In this tutorial, we will look at some of the common ways of trimming excess whitespace in Bash.
The sed (stream editor) command is a useful shell command for text editing and parsing. It allows us to perform a variety of operations on strings, such as searching, replacing, appending, or deleting text.
Let’s see how we can use the sed command for trimming extra white spaces from a Bash string.
#!/bin/bash
str=" Hello world!"
echo "Untrimmed string: $str"
trim_str="$(echo "$str" | sed -r "s/[[:space:]]+/ /g")"
echo "Trimmed string: $trim_str"
The ‘-r’ flag enables extended regular expressions, which the ‘+’ quantifier relies on (‘-E’ is the more portable spelling).
The first ‘s’ denotes that we are using sed’s substitute command.
Pattern after the first ‘/’: [[:space:]]+ matches one or more whitespace characters.
String after the second ‘/’: ‘ ’ is the replacement, a single space.
The trailing ‘g’ flag applies the substitution to every match on the line rather than only the first.
On executing the script:
All the spaces are reduced to a single blank space.
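Since whitespace is invisible in terminal output, the following minimal sketch wraps the values in square brackets to make the effect easier to see. The sample string with extra spaces is an assumption chosen for illustration, and ‘-E’ is used as the portable spelling of ‘-r’.

```shell
#!/bin/bash
# Illustrative input with leading, inner, and trailing runs of spaces.
str="   Hello    world!   "

# Collapse every run of whitespace into a single space.
squeezed="$(echo "$str" | sed -E 's/[[:space:]]+/ /g')"

# Brackets make the remaining leading and trailing space visible.
printf '[%s]\n' "$str"
printf '[%s]\n' "$squeezed"
```

The second line of output is `[ Hello world! ]`: the runs are squeezed, but one space survives at each end.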
The sed command can also be used to trim white spaces from large text files. Let’s see this with an example.
Observe the file contents after the sed command is run:
It squeezes every run of white space, but the leading and trailing runs are only reduced to a single space each rather than removed completely.
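One way to handle a file and also strip the white space at both ends of each line is to chain several substitutions in a single sed call. The sketch below is a hedged illustration, not the original example: the temporary file and its contents are made up, and the result is printed rather than written back to the file.

```shell
#!/bin/bash
# Create a throwaway sample file (hypothetical contents).
file="$(mktemp)"
printf '   first    line   \n\tsecond \t line\n' > "$file"

# Three substitutions, applied to every line in order:
#   1. strip leading whitespace
#   2. strip trailing whitespace
#   3. squeeze the remaining runs to a single space
trimmed="$(sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//; s/[[:space:]]+/ /g' "$file")"

printf '%s\n' "$trimmed"
rm -f "$file"
```

The anchors ‘^’ and ‘$’ restrict the first two substitutions to the start and end of each line, so only the third one touches the spaces between words.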
We can use the awk command to trim leading, trailing and extra white spaces from the strings as follows:
#!/bin/bash
str=" Hello world!"
echo "Untrimmed string: $str"
trim_str="$(awk '{$1=$1}1' <<< "$str")"
echo "Trimmed string: $trim_str"
The first part of the awk command, ‘{$1=$1}’, assigns the first field to itself. The assignment changes nothing, but it forces awk to rebuild ‘$0’ by joining all fields with OFS (a single space by default). Because awk’s default field splitting ignores leading and trailing white space, the rebuilt line has none.
The ‘1’ following {$1=$1} is an always-true pattern with no action, which is shorthand for printing the rebuilt line; the output is stored in the trim_str variable.
The here-string `<<< "$str"` supplies the string to awk on standard input.
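To see the field rebuild at work, here is a small sketch that runs the same awk program on a single string and on a two-line string containing tabs. Both sample values are assumptions made up for illustration.

```shell
#!/bin/bash
str="   Hello    world!   "
multi=$'  a   b  \n\t c \t d '

# Assigning to a field forces awk to rebuild $0 from its fields,
# joined by OFS (a single space by default); the trailing 1 prints it.
trim_str="$(awk '{$1=$1}1' <<< "$str")"

# awk applies the same program to each input line independently.
trim_multi="$(awk '{$1=$1}1' <<< "$multi")"

printf '[%s]\n' "$trim_str"
printf '%s\n' "$trim_multi"
```

The first line printed is `[Hello world!]`, with no space left at either end, and tabs are treated the same way as spaces.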
Executing the script:
We can also use the awk command to trim white spaces in the text file.
Using the text file from the previous example:
See the modified contents:
The ‘-i inplace’ argument, a GNU awk (gawk) extension, modifies the file in place.
All the leading, trailing and extra spaces between the words are trimmed.
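Where the installed awk does not offer an in-place option, a common workaround is to write the cleaned output to a temporary file and then move it over the original. The sketch below assumes a throwaway sample file created with mktemp; the file contents are hypothetical.

```shell
#!/bin/bash
# Create a throwaway sample file (hypothetical contents).
file="$(mktemp)"
printf '   first    line   \n  second   line\n' > "$file"

tmp="$(mktemp)"
# Write the cleaned lines to a temporary file, then replace the
# original only if awk succeeded.
awk '{$1=$1}1' "$file" > "$tmp" && mv "$tmp" "$file"

result="$(cat "$file")"
printf '%s\n' "$result"
rm -f "$file"
```

Redirecting awk's output straight back into the file it is reading would truncate the file before awk reads it, which is why the temporary file is needed.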
Although xargs isn’t meant for this purpose, by default it splits its input on white space and passes the pieces to echo, which removes the leading, trailing, and any extra white spaces between the words in a string.
For example:
#!/bin/bash
str=" Hello world!"
echo "Untrimmed string: $str"
trim_str="$(echo "$str" | xargs)"
echo "Trimmed string: $trim_str"
We pipe the untrimmed string to xargs and store the result in a new variable trim_str.
Observe the output on executing the script:
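A side effect worth knowing is that xargs interprets quote characters in its input. The following sketch, using two made-up sample strings, compares plain text with text that contains double quotes.

```shell
#!/bin/bash
plain="   Hello    world!   "
quoted='   say   "hi    there"   '

# Plain text: xargs splits on blanks and echo re-joins with single spaces.
trim_plain="$(echo "$plain" | xargs)"

# Quoted text: xargs treats the double quotes as quoting characters, so
# they disappear and the spaces inside them are preserved.
trim_quoted="$(echo "$quoted" | xargs)"

printf '[%s]\n' "$trim_plain" "$trim_quoted"
```

The plain string comes out as `[Hello world!]`, while the quoted one loses its quotes and keeps the inner spaces. Input with an unmatched quote, such as an apostrophe, typically makes xargs stop with an error, so this approach is best kept for strings known to contain no quotes or backslashes.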
Let’s take an example to see how we can use the tr command to trim the white spaces in a string.
#!/bin/bash
str=" Hello world!"
echo "Untrimmed string: $str"
trim_str="$(echo "$str" | tr -d " ")"
echo "Trimmed string: $trim_str"
The ‘-d’ flag deletes all occurrences of the specified characters from the string.
Note that this even removes the spaces between the words.
If the string has an inconsistent amount of spaces between the words, we can use the ‘-s’ (squeeze) flag to replace them with a single space.
Replacing -d with -s in the previous example:
Each run of leading, trailing, or in-between spaces is converted into a single space, so one leading and one trailing space may remain.
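The difference between the two flags can be sketched side by side. The sample string with extra spaces is an assumption for illustration, and the brackets make the leftover spaces visible.

```shell
#!/bin/bash
str="   Hello    world!   "

# -d deletes every space, including the ones between words.
deleted="$(echo "$str" | tr -d ' ')"

# -s squeezes each run of spaces down to a single space.
squeezed="$(echo "$str" | tr -s ' ')"

printf '[%s]\n' "$deleted" "$squeezed"
```

This prints `[Helloworld!]` followed by `[ Hello world! ]`, showing that tr alone cannot strip only the ends of a string.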
When dealing with strings and text, formatting and consistency are important. Bash provides a number of methods to trim white space.
The sed command provides an easy-to-use syntax for trimming whitespaces. Multiple filters may be used in conjunction to trim leading, trailing and any extra whitespaces between the words.
The awk command can also be easily used to process the input line by line and remove white spaces from variables or a text file.
The tr (translate) command can be used with the delete flag (-d) or the squeeze flag (-s) to delete the specified whitespace characters or to squeeze repeated ones into a single character.
The xargs command can also format the strings to remove excess white spaces but it isn’t recommended for this purpose.
The awk command is another commonly used text-processing tool for working with streams of text: it processes the input line by line and performs operations such as parsing, editing, and appending.
The xargs command reads items from standard input and passes them as arguments to another command, which is useful when that command doesn’t accept piped (|) input directly.
The tr command is used in Bash to translate, squeeze, or delete characters read from stdin and write the result to stdout.