I recently had cause to remove all files with a particular name and containing a particular string from some production servers. Below is the Gist I used to do it. The first part of the script is some basic validations, but line 43 is where the main work is done. We first `find` all files named `example.html`, we then pass those to `grep` to check for the string, and, if the string is found, finally we pass the file paths to `rm` to verbosely (so we have a record of what was removed) remove them.
While checking this script, my colleague Sean introduced me to ShellCheck, which in turn pointed to two ways I could tighten the original script:
Firstly, and most simply, I wasn’t quoting the `$PATH_TO_INSTALLS` on line 43 in the script; this could have caused issues if my file path contained spaces. This was fixed pretty easily.
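Secondly, and along similar lines, I was not aware that `xargs` splits its input on whitespace by default, so it might interpret said spaces in unexpected and inglorious ways. To fix this, I’ve added `-print0` to the `find` command, `-Z` to `grep`, and `-0` to both `xargs` commands, so that file paths are passed along separated by NUL characters rather than whitespace.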
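To make the whitespace problem concrete, here is a minimal sketch that counts how many arguments `xargs` sees for a single path containing a space. The scratch directory and the `my site` folder name are assumptions made up for the demonstration; they are not part of the original script.

```shell
#!/bin/bash
# Hypothetical scratch directory, created only for this demonstration.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/my site"
touch "$demo_dir/my site/example.html"

# Default behaviour: xargs splits on whitespace, so the one path
# ".../my site/example.html" arrives as two separate arguments.
naive_count=$(find "$demo_dir" -name example.html | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited: -print0 and -0 keep the path together as one argument.
safe_count=$(find "$demo_dir" -name example.html -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "naive: $naive_count argument(s), safe: $safe_count argument(s)"

# Clean up the scratch directory.
rm -rf -- "$demo_dir"
```

With `rm` in place of `echo`, the naive version would try to remove two paths that do not exist and leave the real file behind.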
#!/bin/bash
#
# Remove all files called example.html and
# containing the string
# "<title>Genericons</title>"
#
# Copyright Automattic Inc, 2015
#
# This script is free software, and is released under the
# terms of the GPL version 2 or (at your option) any
# later version.
#
# See
# ---
#
# https://cftp.zendesk.com/agent/tickets/187
#
# Usage
# -----
#
# # Remove matching example.html files under /var/www/html/
# ./remove-genericons-example-html.sh /var/www/html/
#

PATH_TO_INSTALLS=$1

RED='\e[0;31m'
GREEN='\e[0;32m'
NC='\e[0m' # No Color

# VALIDATIONS

if [ -z "$PATH_TO_INSTALLS" ]; then
    echo -e "${RED}Please provide a path to the installs, e.g. './remove-genericons-example-html.sh /var/www/html/'.${NC}"
    exit 1
fi

if [ ! -d "$PATH_TO_INSTALLS" ]; then
    echo -e "${RED}The directory ${PATH_TO_INSTALLS} does not exist.${NC}"
    exit 20
fi

find "$PATH_TO_INSTALLS" -name example.html -print0 | xargs -0 grep -lIZ "<title>Genericons</title>" | xargs -0 rm -fv --
echo -e "${GREEN}Recursively removed all example.html files containing '<title>Genericons</title>', starting at ${PATH_TO_INSTALLS}. See output above for details.${NC}"
exit 0
You can use -exec with find, so you don’t need to pipe into the first xargs:
find "$PATH_TO_INSTALLS" -name example.html -exec grep -lIZ "<title>Genericons</title>" {} \; | xargs -0 rm -fv --
I haven’t tested it or spent much time on it, but it should work :)
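For anyone wanting to try the `-exec` suggestion before pointing it at real data, here is a rough sketch that runs it against a throwaway directory. It makes a few assumptions that are not in the comment above: the scratch tree and its folder names are invented, `{} +` is swapped in for `\;` so that `grep` is called with batches of files rather than once per file, and `--null` (the long form of `-Z` in GNU grep) is used for the NUL-terminated output.

```shell
#!/bin/bash
# Hypothetical scratch tree for a safe trial run; nothing outside it is touched.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/site one" "$demo_dir/site-two"
printf '<title>Genericons</title>\n' > "$demo_dir/site one/example.html"
printf '<title>Something else</title>\n' > "$demo_dir/site-two/example.html"

# "{} +" hands many paths to each grep call instead of one grep per file.
# -l prints only the names of matching files, -I skips binary files, and
# --null terminates each printed name with a NUL byte for xargs -0.
find "$demo_dir" -type f -name example.html \
  -exec grep -lI --null "<title>Genericons</title>" {} + |
  xargs -0 rm -fv --

# Only the file without the Genericons title should be left.
remaining=$(find "$demo_dir" -type f -name example.html)
echo "Remaining: $remaining"

# Clean up the scratch tree.
rm -rf -- "$demo_dir"
```

The file under `site one` is removed even though its path contains a space, and the one under `site-two` survives because it does not contain the string.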