File2Genre: a file-sorting shell script

This bash script (a command-line tool for Linux or macOS) helps you arrange your movies, books, music or any other documents into genres/categories. You enter your genres/categories and a search string in the script. For each file, the script runs a Google search and picks the genre whose name appears on the most lines of the results page. The example script categorizes movies into genres.
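
The counting idea can be tried offline before pointing the script at real files. The following is a minimal sketch, not the script itself: pick_genre is a hypothetical helper, and the sample text merely stands in for a downloaded results page.

```shell
#!/bin/bash
# Sketch only: pick_genre is a hypothetical helper, not part of File2Genre.
# It prints the genre that appears on the most lines of a text file.
pick_genre() {
    local textfile="$1"
    shift
    local best="" best_count=0 genre count
    for genre in "$@"; do
        # grep -c counts matching lines; it prints 0 (and returns non-zero)
        # when nothing matches, hence the "|| true"
        count=$(grep -c -- "$genre" "$textfile" || true)
        if [ "$count" -gt "$best_count" ]; then
            best="$genre"
            best_count="$count"
        fi
    done
    echo "$best"
}

# Made-up text standing in for the search results of one movie
sample=$(mktemp)
printf '%s\n' "A Comedy classic" "Part Comedy, part Drama" "Pure Comedy" > "$sample"
pick_genre "$sample" Action Comedy Drama
rm -f "$sample"
```

With this sample the sketch prints Comedy, because that word appears on three lines while Drama appears on one and Action on none. When no genre matches at all, it prints an empty line, which corresponds to the files the script leaves in place.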

Download the script from the attachment at the end of this page; the full source is listed below.

PS: If you are sure your collection does not contain certain genres (like Musical in the example script), remove them from the GENRES variable to improve the results.
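
If you would rather not edit the long GENRES line by hand, the list can also be trimmed programmatically. This is a minimal sketch under assumptions: UNWANTED and KEPT are hypothetical variable names that do not exist in the original script, and the genre list is shortened for the example.

```shell
#!/bin/bash
# Sketch only: UNWANTED and KEPT are hypothetical names, not part of File2Genre.
GENRES="Action Comedy Drama Musical Western"
UNWANTED="Musical Western"

KEPT=""
for GENRE in $GENRES; do
    case " $UNWANTED " in
        *" $GENRE "*) ;;                      # listed as unwanted: drop it
        *) KEPT="${KEPT:+$KEPT }$GENRE" ;;    # otherwise keep it
    esac
done
GENRES="$KEPT"

echo "$GENRES"
```

This prints Action Comedy Drama. Because the script builds the OR list of its search query from GENRES, a shorter list also gives a shorter query.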

#!/bin/bash
#
# File2Genre version 0.04
# Organize files into directories based on the results of a Google search
#
# Copyright (c) 2009-2011 JF Nutbroek
# Visit http://www.mywebmymail.com for more information
#
# Permission to use, copy, modify, and/or distribute this software for any
# purpose without fee is hereby granted, provided that the above
# copyright notice and this permission notice appear in all copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
# Usage:
#
# This script should be located in the same directory as the files you want to sort
# The script will try to determine the genre/category of each file by using the Google search engine
# The script will create the genre/category directory if it does not exist and move the file there
# When the script is finished all files should be placed in the correct directory
# Remaining files need to be moved to the correct directories manually (or a new genre needs to be added)
# To run the script in test mode use option -t: bash file2genre.sh -t
#
# Working principle:
#
# The script will look on Google and count the hits per genre/category based on the searchwords and genres given below
# The file will be moved to the highest ranking genre/category
# The example genres and keywords below will categorize movies; change GENRES and SEARCHWORDS to adapt the script to other kinds of files
#
# The Genres/Categories and search keywords (space separated)

GENRES="Action Adventure Animation Biography Comedy Crime Disaster Documentary Drama Epic Family Fantasy Film-Noir History Horror Musical Mystery Romance Sci-Fi Sport Thriller War Western"
SEARCHWORDS="movie genre"

# Function to clean up old workfiles
clean_up() {
    # -f keeps rm quiet when a workfile does not exist
    rm -f .search.html .search.txt .count
}

# Get all files in this directory
echo ""
echo " Initializing"
# Spaces in filenames become underscores so the list can be split on whitespace
FILENAMES="$(find . -maxdepth 1 -type f -name '[^.]*' | sed 's/ /_/g; s/^\.\///')"
# Leave the script itself out of the list
FILENAMES="$(echo $FILENAMES | sed 's/file2genre\.sh//g')"
ELEMENTS=($FILENAMES)
TOTAL=${#ELEMENTS[@]}
COUNTFILE=1

# Prepare search string
SEARCHGENRE="+"${GENRES// /+OR+}
SEARCHWORDS="+"${SEARCHWORDS// /+}"+"

# Determine the genre/category from each filename and move to genre/category directory
for FILENAME in $FILENAMES
do

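    clean_up
    # Build the query term: strip the extension and join the words with +
    FILE="$(echo "$FILENAME" | sed 's/%/ /g; s/_/+/g; s/\.[^.]*$//')"
    FILE=${FILE// /+}
    curl --user-agent "Mozilla/4.73 [en] (X11; U; Linux 2.2.15 i686)" --silent "http://www.google.com/search?q=${FILE}${SEARCHWORDS}\"${FILE}\"${SEARCHGENRE}&num=50" > .search.html
    # textutil is a macOS tool and writes .search.txt; on Linux use another HTML-to-text converter here
    textutil -convert txt .search.html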

    for GENRE in $GENRES
    do
        # Count the lines of the result text that mention this genre
        COUNT="$(grep "$GENRE" .search.txt | wc -l)"
        echo "${COUNT}@${GENRE}" >> .count
    done

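    # Numeric sort, so that 10 hits rank above 9
    FILEGENRE="$(sort -n .count | tail -1 | cut -d '@' -f2)"
    HITS="$(sort -n .count | tail -1 | cut -d '@' -f1)"
    HITS=${HITS// /}
    # Restore the name on disk; underscores go back to spaces, so names that contained real underscores will not be found
    FILE="$(echo "$FILENAME" | sed 's/%/ /g; s/_/ /g')"
    MOVE=$(expr "$HITS" \> 0)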

    if [ "$MOVE" = "1" ]; then
        if [ "$1" != "-t" ]; then
            # -p creates the directory only when it is missing
            mkdir -p "$FILEGENRE"
            mv "$FILE" "${FILEGENRE}/${FILE}"
        else
            echo " Testing: Will move ${FILE} to ${FILEGENRE}"
        fi
    fi

    # Progress bar
    if [ "$1" != "-t" ]; then
        PERCENT=$((COUNTFILE*100/TOTAL))
        BAR=""
        DONE=$((PERCENT/2))
        for ((c=0;c<=DONE;c++))
        do
            BAR="${BAR}="
        done
        BAR="${BAR}>"
        for ((c=DONE;c<50;c++))
        do
            BAR="${BAR}."
        done
        # Pass the bar and the percentage as arguments instead of embedding them in the format
        printf "\r Progress: [ %s ] %d%%" "$BAR" "$PERCENT"
        COUNTFILE=$((COUNTFILE+1))
    fi

done

clean_up
echo " Completed"
echo " Done!"
echo " Move any remaining files manually or add a missing genre to the GENRES variable in the script"
echo ""

exit 0
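
To see what the script asks Google for, the query construction can be reproduced on its own. The sketch below mirrors the string handling in the script using sed and tr; build_query is a hypothetical wrapper added for illustration, the genre list is shortened, and nothing is sent over the network.

```shell
#!/bin/bash
# Sketch only: build_query is a hypothetical wrapper around the script's
# string handling. It prints the q= value without contacting Google.
GENRES="Action Comedy Drama"
SEARCHWORDS="movie genre"

build_query() {
    local name genres words
    # Same transformations as the script: % to space, _ to +,
    # drop the extension, then turn remaining spaces into +
    name=$(echo "$1" | sed 's/%/ /g; s/_/+/g; s/\.[^.]*$//' | tr ' ' '+')
    genres="+$(echo "$GENRES" | sed 's/ /+OR+/g')"
    words="+$(echo "$SEARCHWORDS" | tr ' ' '+')+"
    echo "${name}${words}\"${name}\"${genres}"
}

build_query "The_Matrix.avi"
```

For The_Matrix.avi this prints The+Matrix+movie+genre+"The+Matrix"+Action+OR+Comedy+OR+Drama. The title appears twice: once as loose words and once as a quoted phrase, followed by the genres joined with OR.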

Attachment: file2genre.sh_.zip (2.18 KB)