Bash String Manipulation

Adam Gordon Bell

One thing that bash is excellent at is manipulating strings of text. If you’re at the command line or writing a small script, then knowing some bash string idioms can be a lot of help.

So in this article, I’m going to go over techniques for working with strings in bash.

You can run any of the examples at the bash prompt:

> echo "test"
test

Or you can put the same commands into a file with a bash shebang.

#!/bin/bash

echo "test"

And then make it executable using chmod and run it at the command line:

> chmod +x strings.sh
> ./strings.sh
test

Background

If you need a review of the basics of bash scripting, check out Understanding Bash. If not, know that everything covered will work in bash version 3.2 and greater. Much of what is covered will also work in ZSH. That means everything here will work on macOS, Windows under WSL and WSL 2, and most Linux distributions. Of course, if you’re on Alpine, or some other minimal Linux distribution, you will need to install bash first.

Let’s start at the beginning.

Bash Concatenate Strings

In bash, I can declare a variable like this:

> one="1"

and then I can refer to it in a double-quoted string like this:

> echo "$one"
1

Concatenating strings follows easily from this same pattern:

#!/bin/bash

one="1"
two="2"
three="$one$two"
echo "$three"
12

Side Note: Globs and File Expansions

You can, in theory, refer to variables directly like this:

echo $one

but if you do that, you might have unexpected things happen.

#!/bin/bash

comment="/* begin comment block"
echo $comment
/Applications /Library /System /Users /Volumes /bin /cores /dev /etc 
/home /opt /private /sbin /tmp /usr /var begin comment block

Without quotes, bash splits your string on whitespace and then does a pathname expansion on /*.

I’m going to use whitespace splitting later on, but for now remember: You should always use double quotes if you want the literal value of a variable.
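To see the splitting for yourself, here is a minimal sketch. The count_args function is a hypothetical helper I made up for illustration (it is not part of bash); all it does is report how many arguments it received:

```bash
#!/bin/bash

# count_args is a hypothetical helper: it prints the number of arguments it was given.
count_args() {
  echo "$#"
}

spaced="a  b   c"

unquoted_count=$(count_args $spaced)   # unquoted: split on whitespace into 3 arguments
quoted_count=$(count_args "$spaced")   # quoted: passed through as a single argument

echo "Unquoted: $unquoted_count"
echo "Quoted: $quoted_count"
```

Running that should print Unquoted: 3 and then Quoted: 1, because only the quoted version reaches the function as one string.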

Another Concatenate Method +=

Another way to combine strings is using +=:

#!/bin/bash

concat=""
concat+="1"
concat+="2"
echo "$concat"
12
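Where += tends to earn its keep is in a loop. As a rough sketch (the csv variable and the fruit names are just example values), this builds a comma-separated list one piece at a time:

```bash
#!/bin/bash

csv=""
for item in apple banana cherry; do
  # Add a separator before every item except the first.
  if [[ -n "$csv" ]]; then
    csv+=","
  fi
  csv+="$item"
done

echo "$csv"
```

That should print apple,banana,cherry.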

Next, let’s do string length.

Bash String Length

The "$var" syntax is called parameter expansion in bash (you will also see it called variable expansion), and you can also write it as "${var}". This expansion syntax allows you to do some powerful things. One of those things is getting the length of a string:

> words="here are some words"
> echo "'$words' is ${#words} characters long"
'here are some words' is 19 characters long
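The length expansion also works inside arithmetic tests, which is handy for simple validation. A small sketch, where the username and the limits are made-up example values:

```bash
#!/bin/bash

# Example values only; pick whatever limits make sense for your script.
username="adam"
min_length=3
max_length=8

# ${#username} expands to the number of characters in $username.
if (( ${#username} < min_length || ${#username} > max_length )); then
  length_check="invalid"
else
  length_check="ok"
fi

echo "'$username' (${#username} characters): $length_check"
```

Since adam is four characters long, this should report ok.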

Bash SubString

If I need to get a portion of an existing string, then the substring syntax of parameter expansion is here to help.

The format you use for this is ${string:position:length}. Here are some examples.

Bash First Character

You can get the first character of a string like this:

> word="bash"
> echo "${word:0:1}"
b

Since I’m asking to start at position zero and return a string of length one, I can also shorten this a bit:

> word="bash"
> echo "${word::1}"
b

However, this won’t work in ZSH (where you must provide the 0):

> word="zsh"
> echo "${word::1}"
zsh: closing brace expected

You can get the inverse of this string, the portion starting after the first character, using an alternate substring syntax ${string:position} (Note the single colon and single parameter). It ends up looking like this:

#!/bin/bash

word="bash"
echo "Head: ${word:0:1}"
echo "Rest: ${word:1}"
Head: b
Rest: ash

This substring works by telling the parameter expansion to return a new string starting at position one, which drops the first character.
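The two-argument form is not limited to the start of the string. Here is a sketch that slices a fixed-width date apart; the date_string value is just an example, and it assumes the input always has the YYYY-MM-DD layout:

```bash
#!/bin/bash

# Assumes a fixed YYYY-MM-DD layout; positions are counted from zero.
date_string="2021-03-15"

year="${date_string:0:4}"    # 4 characters starting at position 0
month="${date_string:5:2}"   # 2 characters starting at position 5
day="${date_string:8:2}"     # 2 characters starting at position 8

echo "Year: $year, Month: $month, Day: $day"
```

That should print Year: 2021, Month: 03, Day: 15.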

Bash Last Character

To return the last character of a string in bash, I can use the same single-argument substring parameter expansion but use negative indexes, which will start from the end.

#!/bin/bash

word="bash" 
echo "${word:(-1)}"
echo "${word:(-2)}"
echo "${word:(-3)}"
h
sh
ash
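The parentheses around the negative number matter. Without them (or a space), bash reads :- as the "use a default value" expansion instead of a substring. A short sketch of the difference, using example variable names of my own:

```bash
#!/bin/bash

word="bash"
empty=""

with_parens="${word:(-1)}"   # substring: last character
with_space="${word: -1}"     # same thing, using a space instead of parentheses
no_space="${word:-1}"        # NOT a substring: default-value expansion, word is set so it stays "bash"
fallback="${empty:-1}"       # default-value expansion kicks in because empty is empty
middle="${word:(-3):2}"      # start three from the end, take two characters

echo "$with_parens $with_space $no_space $fallback $middle"
```

This should print h h bash 1 as.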

To drop the last character, I can combine this with the string length expansion (${#var}):

#!/bin/bash

word="bash" 
echo "${word:0:${#word}-1}"
echo "${word:0:${#word}-2}"
echo "${word:0:${#word}-3}"
bas
ba
b

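That is a bit long, though, so you could also use the parameter expansion for removing a pattern from the end of a string (${var%pattern}) together with the pattern for any single character (?). Note that these are glob patterns, the same kind used for matching file names, not regular expressions: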

#!/bin/bash

word="bash"
echo "${word%?}"
echo "${word%??}"
echo "${word%???}"
bas
ba
b

This trim feature only removes the pattern match if it finds one. If the pattern doesn’t match, it doesn’t remove anything.

> word="one"
> echo "${word%????}" # try to remove the last four characters
one

You can also use patterns to remove characters from the beginning of a string by using # like this:

#!/bin/bash

word="bash"
echo "${word#?}"
echo "${word#??}"
echo "${word#???}"

Running that I get characters dropped from the beginning of the string if they match the pattern:

ash
sh
h
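These trims also come in doubled forms, ## and %%, which remove the longest match instead of the shortest. They go a little beyond the examples above, but they are a common way to pick a file path apart. A sketch, with a made-up path:

```bash
#!/bin/bash

# Example path; nothing here touches the file system.
path="/home/user/archive.tar.gz"

filename="${path##*/}"        # longest match of */ from the front: drops the directories
directory="${path%/*}"        # shortest match of /* from the end: drops the file name
extension="${filename##*.}"   # longest match of *. from the front: keeps the last extension
stem="${filename%%.*}"        # longest match of .* from the end: drops every extension

echo "File: $filename"
echo "Directory: $directory"
echo "Extension: $extension"
echo "Stem: $stem"
```

That should print archive.tar.gz, /home/user, gz, and archive on their respective lines.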

Bash String Replace

Another common task I run into when working with strings in bash is replacing parts of an existing string.

Let’s say I want to change the word create to make in this quote:

When you don’t create things, you become defined by your tastes rather than ability. Your tastes only narrow & exclude people. So create.

Why The Lucky Stiff

There is a parameter expansion for string replacement:

#!/bin/bash

phrase="When you don't create things, you become defined by your tastes 
rather than ability. Your tastes only narrow & exclude people. So create."
echo "${phrase/create/make}"
When you don't make things, you become defined by your tastes 
rather than ability. Your tastes only narrow & exclude people. So create.

You can see that my script only replaced the first create.

To replace all, I can change it from ${text/find/replace} to ${text//find/replace} (note the double slash //):

#!/bin/bash

phrase="When you don't create things, you become defined by your tastes 
rather than ability. Your tastes only narrow & exclude people. So create."
echo "${phrase//create/make}"
When you don't make things, you become defined by your tastes 
rather than ability. Your tastes only narrow & exclude people. So make.

You can do more complicated string replacements using patterns, like redacting a phone number:

#!/bin/bash

number="Phone Number: 234-234-2343"
echo "${number//[0-9]/X}"
Phone Number: XXX-XXX-XXXX

If the substitution logic is complex, this pattern replacement format can become hard to understand, and you may want to consider using regex match (below) or pulling in an outside tool like sed.
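Before reaching for sed, there are a few more variations of the replacement expansion worth knowing: an empty replacement deletes the match, /# only matches at the start of the string, and /% only matches at the end. A minimal sketch, with example values of my own:

```bash
#!/bin/bash

number="Phone Number: 234-234-2343"
file="txt-notes.txt"

no_dashes="${number//-/}"       # empty replacement: delete every dash
new_suffix="${file/%txt/md}"    # /% anchors the match to the end of the string
new_prefix="${file/#txt/md}"    # /# anchors the match to the start of the string

echo "$no_dashes"
echo "$new_suffix"
echo "$new_prefix"
```

This should print Phone Number: 2342342343, then txt-notes.md, then md-notes.txt.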

Bash String Conditionals

You can compare strings for equality (==), inequality (!=), and ordering (> or <):

if [[ "one" == "one" ]]; then
    echo "Strings are equal."
fi

if [[ "one" != "two" ]]; then
    echo "Strings are not equal."
fi

if [[ "aaa" < "bbb" ]]; then
    echo "aaa is smaller."
fi
Strings are equal.
Strings are not equal.
aaa is smaller.
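One thing to keep in mind is that < and > inside [[ ]] compare text, not numbers. A quick sketch of where that bites, using (( )) for the numeric comparison:

```bash
#!/bin/bash

# String comparison: "1" sorts before "9", so "10" comes first.
if [[ "10" < "9" ]]; then
  string_order="10 sorts before 9"
else
  string_order="10 sorts after 9"
fi

# Arithmetic comparison: 10 is the larger number.
if (( 10 < 9 )); then
  numeric_order="10 is less than 9"
else
  numeric_order="10 is not less than 9"
fi

echo "As strings: $string_order"
echo "As numbers: $numeric_order"
```

This should report that 10 sorts before 9 as a string, but is not less than 9 as a number.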

You can also use = to compare strings to globs:

#!/bin/bash
file="todo.gz"
if [[ "$file" = *.gz ]]; then
    echo "Found gzip file: $file"
fi
if [[ "$file" = todo.* ]]; then
    echo "Found file named todo: $file"
fi
Found gzip file: todo.gz
Found file named todo: todo.gz

(Note that in the above the glob is not quoted.)

You can see it matched in both cases. Glob patterns have their limits, though. And so when I need to confirm a string matches a specific format, I usually move right to a regular expression match (=~).

Bash String Regex Match

Here is how I would write a regex match for checking if a string starts with aa:

#!/bin/bash
name="aardvark"
if [[ "$name" =~ ^aa ]]; then
    echo "Starts with aa: $name"
fi
Starts with aa: aardvark

And here using the regex match to find if the string contains a certain substring:

#!/bin/bash
name="aardvark"
if [[ "$name" =~ dvark ]]; then
    echo "Contains dvark: $name"
fi
Contains dvark: aardvark

Unfortunately, this match operator does not support all of modern regular expression syntax: you can’t use positive or negative lookbehind, and the character classes are a little different than you would find in most modern programming languages. But it does support capture groups.
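To give a feel for those character classes, here is a sketch that checks whether a string looks like a YYYY-MM-DD date. The is_iso_date function and the date_regex variable are names I made up for this example. Keeping the regex in a variable and leaving it unquoted on the right-hand side of =~ is a common way to sidestep quoting surprises:

```bash
#!/bin/bash

# POSIX classes such as [[:digit:]] stand in for shorthand like \d.
date_regex='^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$'

# Hypothetical helper: succeeds when its argument looks like YYYY-MM-DD.
# It only checks the shape, not whether the date really exists.
is_iso_date() {
  [[ "$1" =~ $date_regex ]]
}

for candidate in "2021-03-15" "03/15/2021"; do
  if is_iso_date "$candidate"; then
    echo "Looks like a date: $candidate"
  else
    echo "Not in YYYY-MM-DD format: $candidate"
  fi
done
```

The first candidate should pass and the second should be rejected.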

Bash Split Strings

What if I want to use regexes to split a string on pipes and pull out the values? This is possible using capture groups:

if [[ "1|tom|1982" =~ (.*)\|(.*)\|(.*) ]]; then
  echo "ID = ${BASH_REMATCH[1]}"
  echo "Name = ${BASH_REMATCH[2]}"
  echo "Year = ${BASH_REMATCH[3]}"
else
  echo "Not proper format"
fi
ID = 1
Name = tom
Year = 1982
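The same approach works for other small formats. As a sketch, here is a version string being pulled apart; the version value and the variable names are just examples. Index 0 of BASH_REMATCH holds the entire match, and the groups start at index 1:

```bash
#!/bin/bash

version="v1.12.3"
version_regex='^v([0-9]+)\.([0-9]+)\.([0-9]+)$'

if [[ "$version" =~ $version_regex ]]; then
  full_match="${BASH_REMATCH[0]}"   # the whole matched text
  major="${BASH_REMATCH[1]}"
  minor="${BASH_REMATCH[2]}"
  patch="${BASH_REMATCH[3]}"
  echo "Matched $full_match: major=$major minor=$minor patch=$patch"
else
  echo "Not a version string: $version"
fi
```

This should print Matched v1.12.3: major=1 minor=12 patch=3.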

Capture groups can be convenient for doing some light-weight string parsing in bash. However, there is a better method for splitting strings by a delimiter. It requires a little explanation, though.

Internal Field Separator Split

By default, bash treats whitespace (spaces, tabs, and newlines) as the delimiter between separate elements. This can lead to problems, though, and is one of the reasons I mentioned earlier for double quoting your variables when you use them. However, this delimiting can also be a helpful feature:

list="foo bar baz"
for word in $list; do # <-- $list is not in double quotes
  echo "Word: $word"
done
Word: foo
Word: bar
Word: baz

If you wrap space-delimited items in parentheses, you get an array.

You can access this array like this:

list="foo bar baz"
array=($list)
echo "Zeroth: ${array[0]}"  
echo "First: ${array[1]}"  
echo "Whole Array: ${array[*]}" 
Zeroth: foo
First: bar
Whole Array: foo bar baz
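A couple of other array expansions are useful once you have split a string this way. This sketch (variable names are my own) counts the elements, grabs the last one, and loops over them with "${array[@]}", which keeps each element as its own word:

```bash
#!/bin/bash

list="foo bar baz"
array=($list)

count="${#array[@]}"              # number of elements
last_index=$(( count - 1 ))
last="${array[$last_index]}"      # computing the index keeps this working on bash 3.2

echo "Count: $count"
echo "Last: $last"
for item in "${array[@]}"; do
  echo "Item: $item"
done
```

This should print Count: 3, Last: baz, and then one Item line per element.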

I can use this feature to split a string on a delimiter. All I need to do is change the internal field separator (IFS), create my array, and then change it back.

#!/bin/bash

text="1|tom|1982"

IFS='|'         # split on pipes instead of whitespace
array=($text)
unset IFS       # go back to the default whitespace splitting

And now I have an array split on my chosen delimiter:

echo "ID = ${array[0]}"
echo "Name = ${array[1]}"
echo "Year = ${array[2]}"
ID = 1
Name = tom
Year = 1982
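A common alternative, which is not the approach used above, is to let read do the splitting. Putting the IFS assignment on the same line as read limits the change to that one command, so there is nothing to restore afterwards, and because $text stays quoted it is never glob expanded. A sketch, using fields as an example array name:

```bash
#!/bin/bash

text="1|tom|1982"

# IFS is only changed for this single read command.
# -r keeps backslashes literal, -a stores the pieces in an array.
IFS='|' read -r -a fields <<< "$text"

echo "ID = ${fields[0]}"
echo "Name = ${fields[1]}"
echo "Year = ${fields[2]}"
```

This should print the same three lines as the version above. Note that read only looks at the first line of its input, so this form suits single-line strings.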

Reaching Outside of Bash

Many things are hard to do directly in pure bash but easy to do with the right supporting tools. For example, trimming the whitespace from a string is verbose in pure bash, but it’s simple to do by piping to existing POSIX tools like xargs:

> echo "   lol  " | xargs
lol
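For comparison, here is roughly what the pure bash version looks like. This is one widely used idiom built from the trim expansions covered earlier, not the only way to do it, and trim is simply the name I gave the function:

```bash
#!/bin/bash

# Hypothetical helper: prints its argument without leading or trailing whitespace.
trim() {
  local s="$1"
  # ${s%%[![:space:]]*} is the leading whitespace; strip it from the front.
  s="${s#"${s%%[![:space:]]*}"}"
  # ${s##*[![:space:]]} is the trailing whitespace; strip it from the end.
  s="${s%"${s##*[![:space:]]}"}"
  printf '%s' "$s"
}

trimmed=$(trim "   lol  ")
echo "[$trimmed]"
```

This should print [lol]. Unlike xargs, it leaves whitespace and quotes inside the string alone.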

Bash regular expressions have some limitations, but sed, grep, and awk make it easy to do whatever you need, and if you have to deal with JSON data, jq will make your life easier.

Conclusion

I hope this overview of string manipulation in bash gave you enough details to cover most of your use cases.

Also, if you’re the type of person who’s not afraid to solve problems in bash, then take a look at Earthly. It’s a great tool for creating repeatable builds in an approachable syntax.

Feedback

If you have any clever tricks for handling strings in bash, or spot any problems with my examples, let me know on Twitter @AdamGordonBell.

Adam Gordon Bell

Spreading the word about Earthly. Host of CoRecursive podcast. Physical Embodiment of Cunningham’s Law