Removing files called X and containing Y in Bash

I recently needed to remove all files with a particular name that also contained a particular string from some production servers. Below is the Gist I used to do it. The first part of the script is basic validation; the main work is done by the `find` pipeline near the end. We first `find` all files named `example.html`, then pass those to `grep` to check for the string, and finally pass the paths of the matching files to `rm`, which removes them verbosely so that we have a record of what was removed.
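Before pointing a pipeline like this at production, it can help to see what it would match. Here is a minimal sketch of a dry run, assuming GNU `find`, `xargs` and `grep`. The `list_matches` helper and the throwaway directory are hypothetical names for illustration and are not part of the Gist.

```shell
#!/bin/bash
# Dry run: list the files the pipeline would remove, without deleting anything.
# list_matches is a hypothetical helper name, not part of the original script.
list_matches() {
    local root=$1
    # -r (GNU xargs) skips grep entirely when find matches nothing.
    find "$root" -name example.html -print0 |
        xargs -0 -r grep -lI "<title>Genericons</title>"
}

# Demo against a throwaway directory that is cleaned up on exit.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/site-a" "$demo/site-b"
echo '<title>Genericons</title>' > "$demo/site-a/example.html"
echo '<title>Something else</title>' > "$demo/site-b/example.html"

matches=$(list_matches "$demo")
echo "$matches"
```

Only the file under `site-a` should be listed, and both files are left in place.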

While checking this script, my colleague Sean introduced me to ShellCheck, which in turn pointed to two ways I could tighten the original script:

Firstly, and most simply, I wasn’t quoting `$PATH_TO_INSTALLS` in the `find` command; without the quotes, Bash splits the value into separate arguments wherever it contains a space, so a file path with spaces would have caused issues. This was fixed pretty easily.
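The effect of the missing quotes can be seen in a small sketch. The `count_args` helper and the example path are hypothetical, and the default `IFS` is assumed.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

PATH_TO_INSTALLS="/var/www/my sites"   # hypothetical path containing a space

# Unquoted: Bash word-splits the value at the space into two arguments.
unquoted=$(count_args $PATH_TO_INSTALLS)

# Quoted: the value is passed as a single argument.
quoted=$(count_args "$PATH_TO_INSTALLS")

echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 2, quoted: 1"
```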

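Secondly, and along similar lines, I was not aware that `xargs` splits its input on whitespace by default, so it might interpret said spaces in unexpected and inglorious ways. To fix this, I’ve added `-print0` to the `find` command and `-0` to my `xargs` commands, so that file names are separated by NUL characters instead. The `-Z` flag on `grep` does the same for the file names it prints, which is what lets the second `xargs -0` read them safely.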
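A rough sketch of the difference, assuming GNU `find` and `xargs` and a temporary directory whose own path contains no spaces. The directory name `my site` is made up for the demonstration.

```shell
#!/bin/bash
# Throwaway directory with a space in one path component; cleaned up on exit.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/my site"
touch "$demo/my site/example.html"

# Whitespace-delimited: xargs splits ".../my site/example.html" at the space,
# so echo is invoked once per fragment.
plain=$(find "$demo" -name example.html | xargs -n 1 echo | wc -l)

# NUL-delimited: the whole path arrives as a single argument.
nul=$(find "$demo" -name example.html -print0 | xargs -0 -n 1 echo | wc -l)

echo "arguments without -0: $plain"
echo "arguments with -0: $nul"
```

Without `-0` the single path is counted as two arguments, and with `-print0` and `-0` it is counted as one.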


#!/bin/bash
#
# Remove all files called example.html and
# containing the string
# "<title>Genericons</title>"
#
# Copyright Automattic Inc, 2015
#
# This script is free software, and is released under the
# terms of the GPL version 2 or (at your option) any
# later version.
#
# See
# ---
#
# https://cftp.zendesk.com/agent/tickets/187
#
# Usage
# -----
#
# # Remove matching example.html files under /var/www/html/
# ./remove-genericons-example-html.sh /var/www/html/
#
PATH_TO_INSTALLS=$1
RED='\e[0;31m'
GREEN='\e[0;32m'
NC='\e[0m' # No Color
# VALIDATIONS
if [ -z "$PATH_TO_INSTALLS" ]; then
    echo -e "${RED}Please provide a path to the installs, e.g. './remove-genericons-example-html.sh /var/www/html/'.${NC}"
    exit 1
fi
if [ ! -d "$PATH_TO_INSTALLS" ]; then
    echo -e "${RED}The directory ${PATH_TO_INSTALLS} does not exist.${NC}"
    exit 20
fi
find "$PATH_TO_INSTALLS" -name example.html -print0 | xargs -0 grep -lIZ "<title>Genericons</title>" | xargs -0 rm -fv --
echo -e "${GREEN}Recursively removed all example.html files containing '<title>Genericons</title>', starting at ${PATH_TO_INSTALLS}. See output above for details.${NC}"
exit 0

Join the Conversation

1 Comment

  1. You can use -exec with find, so you don’t need to pipe into the first xargs;
    find "$PATH" -name example.html -exec grep -lIZ "Genericons" {} \; | xargs -0 rm -fv --
    Not tested it, or spent much time, but should work :)
