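Just some tips and tricks for HTML-related technology.

Remove HTML tags from a file:

    sed -e 's/<[^>]*>//g' file.html

Note that sed works one line at a time, so a tag that is split across two lines is not removed. The command also leaves entities such as &amp; untouched and keeps the contents of script and style elements.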
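A quick way to try the sed expression on a small sample. This is only a sketch: the function name strip_tags and the sample strings are made up for illustration and are not part of the original tip.

```shell
# Hypothetical helper wrapping the sed expression from the tip.
# With no arguments it reads stdin; otherwise it reads the named files.
strip_tags() {
    sed -e 's/<[^>]*>//g' "$@"
}

# Tags that open and close on the same line are removed.
printf '%s\n' '<p>Hello <b>world</b>!</p>' | strip_tags
# prints: Hello world!

# A tag broken across two lines is not matched, because sed
# applies the expression to each line separately.
printf '%s\n' '<a' 'href="x">link</a>' | strip_tags
# prints:
# <a
# href="x">link
```

For real-world pages, a proper HTML parser is likely a safer choice than a regular expression; the one-liner is best kept for quick cleanups of simple, well-formed input.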