Extract Extension from File Path in Bash
Last updated
In Bash scripting, extracting file extensions from file paths is a common requirement for various tasks. Being able to retrieve the file extension provides valuable information about the file type and enables users to perform actions like renaming files, filtering by extension, or applying specific operations based on the file format.
This tutorial shows different techniques in Bash for extracting file extensions, offering practical solutions and example scripts to empower users in efficiently handling file manipulation tasks.
We saw how parameter expansions work. We have used them to get the file name with its extension, as well as just the file name. But we might only be interested in the file extension, and parameter expansion can handle that as well.
The above example can be generalized as follows:
In our example, we have used the wildcard (*) as the pattern. Note that this is a shell glob pattern, not a regular expression. In the pattern we have also added the dot (.), which marks the beginning of the extension. Bash matches all the characters up to and including the last dot and trims them. Hence, we get the characters after the final dot, which is the extension of the file.
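The general form described above can be sketched as follows. The path and the variable names filepath and extension are assumptions chosen for illustration; they are not taken from the original example.

```bash
#!/bin/bash
# Hypothetical path used only for illustration.
filepath="/home/user/docs/report.final.pdf"

# ${variable##pattern} removes the longest match of pattern from the start.
# With the pattern *. everything up to and including the last dot is removed.
extension="${filepath##*.}"

echo "$extension"   # prints: pdf
```

Because the longest match is removed, only the text after the final dot survives, even though the file name contains two dots.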
We have used the awk command in the previous article to extract the name of the file with its extension. We can just take out the extension from the provided full path as well.
Example:
The -F option of the awk command sets the field separator, which may be a single character or a regular expression. With the dot (.) as the separator, awk splits the path into fields at each dot, so the text after the final dot ends up in the last field.
NF is a built-in awk variable that holds the number of fields in the current record, so $NF refers to the last field, which here is the file extension.
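A minimal sketch of this approach is shown here. The path is a made-up example, and a plain dot as the separator is an assumption about how the original command was written.

```bash
#!/bin/bash
# Hypothetical path used only for illustration.
filepath="/home/user/docs/report.pdf"

# -F. sets the field separator to a literal dot; $NF is the last field.
extension=$(echo "$filepath" | awk -F. '{print $NF}')

echo "$extension"   # prints: pdf
```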
Let us consider another example:
Not the output we were hoping for? We were expecting the awk command to give us the extension as "tar.gz". However, we only got "gz". This is because $NF returns only the text after the final dot (gz), not everything after the first dot of the file name (tar.gz).
As a result, we should be careful with awk when working with extensions that contain more than one dot. It will also produce unexpected results if a directory name in the path contains a dot and the file itself has no extension.
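Both pitfalls can be reproduced with a short sketch. The paths are hypothetical, and the workaround at the end (basename followed by a shortest-match parameter expansion) is one possible approach, not necessarily the one the article intends.

```bash
#!/bin/bash
# Hypothetical paths used only for illustration.
archive="/home/user/backup.tar.gz"
no_ext="/home/user/project.v2/README"

# Only the text after the final dot is returned, so "tar." is lost.
archive_ext=$(echo "$archive" | awk -F. '{print $NF}')
echo "$archive_ext"     # prints: gz

# The only dot is in a directory name, so the result is not an extension.
no_ext_result=$(echo "$no_ext" | awk -F. '{print $NF}')
echo "$no_ext_result"   # prints: v2/README

# One possible workaround: strip the directories first with basename,
# then remove everything up to the first dot of the file name.
name=$(basename "$archive")
full_ext="${name#*.}"
echo "$full_ext"        # prints: tar.gz
```

The workaround assumes that everything after the first dot of the file name counts as the extension, which is not true for names such as report.final.pdf.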
The script prompts the user for the directory path. After checking its existence it asks the user to provide the extension. Based on that it will create 5 sequential files from 1 to 5 with the provided extension.
Example:
#!/bin/bash
read -p "Enter the directory path: " dpath
if [ ! -d "$dpath" ]; then
    echo "Invalid directory. Please check the input."
else
    read -p "Enter the new extension you want: " ext
    for i in $(seq 1 5); do
        touch "${i}.${ext}"
    done
fi
1. The script begins by using the read command to prompt the user to enter the directory path. The entered value is stored in the variable named dpath.
2. The script checks whether the value stored in the variable dpath is a directory using the -d test condition. If the directory doesn't exist, it prints "Invalid directory. Please check the input."
3. If the directory exists, the script prompts the user to enter the new extension. This response is stored in the ext variable.
4. The script runs a for loop 5 times. The iterator is the variable i which starts from the value 1 and goes till 5.
5. In every iteration, the script creates a file $i with the extension $ext.
6. By the end of the script, 5 files will be created in the working directory, with filename as 1, 2, 3, 4 and 5 respectively. The extension of all these five files will come from input provided by the user, which is stored in the ext variable.
The script prompts the user for the directory path. After checking its existence, it lists the extensions of all the files present in that directory.
Example:
#!/bin/bash
read -p "Enter the directory path: " dpath
if [ ! -d "$dpath" ]; then
    echo "Invalid directory. Please check the input."
else
    # List the contents of the directory the user entered
    contentsOfDir=`ls "$dpath"`
    for content in $contentsOfDir; do
        if [ -d "$dpath/$content" ]; then
            echo "$content is a directory and will not have any extension"
        else
            ext=`echo ${content#*.}`
            echo "$content has extension: $ext"
        fi
    done
fi
1. The execution of the script starts by prompting the user to enter the directory path. This is done by the read command and it stores the response in the variable named dpath.
2. The script verifies whether the user input is a directory using the -d test condition on the variable dpath. If the path is non-existent, it prints "Invalid directory. Please check the input."
3. If the directory exists, the script fetches all the contents of that directory and stores them in the variable contentsOfDir using the backticks technique.
4. The script runs a for loop for all the contents present in that directory. The for loop iterator is named content.
5. In every iteration, the script checks if the content is a file or directory. If it is a directory, it displays that the current value of the iterator is a directory and will not have any extension.
6. If it is a file (not a directory), the script will extract the extension using the parameter expansion and store the result in the ext variable.
7. For each iteration, it will finally print the name of the file and the extension it has.
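The steps above can also be sketched without splitting the output of ls, which breaks on names that contain spaces. This variant is an assumption about a reasonable alternative, not the article's script: list_extensions is a hypothetical helper name, and the extra branch for names without a dot is an addition.

```bash
#!/bin/bash
# list_extensions is a hypothetical helper name, not part of the original script.
list_extensions() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "Invalid directory. Please check the input."
        return 1
    fi
    # The glob keeps names with spaces intact, unlike splitting ls output.
    for path in "$dir"/*; do
        [ -e "$path" ] || continue   # skip the literal pattern in an empty directory
        name=$(basename "$path")
        if [ -d "$path" ]; then
            echo "$name is a directory and will not have any extension"
        elif [ "${name#*.}" = "$name" ]; then
            # No dot in the name, so the expansion left it unchanged.
            echo "$name has no extension"
        else
            echo "$name has extension: ${name#*.}"
        fi
    done
}
```

It could be called as list_extensions "$dpath" after the read prompt. Like ls, the glob skips hidden files by default.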
This trims the longest match of the pattern from the beginning of the value and displays whatever comes after it. The only drawback of this method is that we have to use a variable, since this is a parameter expansion.
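To make the "longest match" behaviour concrete, the sketch below contrasts it with the shortest-match form on the same value. The file name is a hypothetical example.

```bash
#!/bin/bash
# Hypothetical file name used only for illustration.
# The value must live in a variable: a literal string cannot be
# placed inside ${...}.
filename="backup.tar.gz"

# ## removes the longest match of *. (everything up to the last dot).
longest="${filename##*.}"
echo "$longest"    # prints: gz

# # removes the shortest match of *. (everything up to the first dot).
shortest="${filename#*.}"
echo "$shortest"   # prints: tar.gz
```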
In this example, we have taken a variable to hold the entire file path. We are free to supply that string directly to the echo command. We pipe the result of this echo command so that Bash can send it as an input to the command.