Bash: Convert HTML to Markdown Recursively with Pandoc
You can recursively convert all your HTML files to Markdown in Bash using Pandoc:
find . -type f \( -name "*.html" -o -name "*.htm" \) | while IFS= read -r i; do pandoc -f html -t markdown "$i" -o "${i%.*}.md"; done
I used this when migrating a WordPress site to a static site generator: I had hundreds of exported HTML files that needed to become Markdown. The find command grabs all .html and .htm files recursively, and Pandoc handles the conversion for each one. The ${i%.*}.md part strips the original extension and replaces it with .md, so each Markdown file is written next to its HTML source.
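To see what the ${i%.*} expansion does on its own, here is a small sketch. The function name md_path and the sample paths are made up for illustration; only the parameter expansion itself comes from the one-liner above.

```shell
#!/usr/bin/env bash
# md_path is a hypothetical helper, not part of Bash or Pandoc.
# ${1%.*} removes the shortest trailing match of ".*", which is
# the last extension only; ".md" is then appended.
md_path() {
  printf '%s\n' "${1%.*}.md"
}

md_path "./posts/hello.html"     # prints ./posts/hello.md
md_path "./posts/about.us.htm"   # prints ./posts/about.us.md
```

Because only the last extension is removed, a name like about.us.htm keeps its inner dot.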
You’ll want to review the output since Pandoc’s HTML-to-Markdown conversion isn’t always perfect — complex tables, inline styles, and embedded scripts can produce messy results. For cleaner output, add --wrap=none to prevent Pandoc from inserting hard line breaks. Make sure Pandoc is installed first (brew install pandoc on Mac or apt install pandoc on Ubuntu).
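Putting those two tips together, a slightly more defensive version might look like the sketch below. It swaps the pipe-and-read loop for find -exec with an inline sh -c loop, which is a different technique from the one-liner above, chosen here because it copes with spaces and other odd characters in file names. The function name convert_tree is an assumption for illustration; find, pandoc, and the --wrap=none flag are the real pieces.

```shell
#!/usr/bin/env bash
# convert_tree is a hypothetical helper name; only find and pandoc
# are real commands here.
convert_tree() {
  local root="${1:-.}"

  # Stop early with a clear message if pandoc is not on PATH.
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc not found; install it first" >&2
    return 1
  fi

  # Match only .html and .htm files. -exec hands the paths straight
  # to the inner shell as arguments, so no line-based parsing of
  # file names is involved.
  find "$root" -type f \( -name '*.html' -o -name '*.htm' \) -exec sh -c '
    for f in "$@"; do
      pandoc -f html -t markdown --wrap=none "$f" -o "${f%.*}.md"
    done
  ' sh {} +
}

# Usage: convert_tree .
```

Running convert_tree with no argument starts from the current directory, the same as the original command.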