
Document2HTML converter

Licence: MIT

Documents to HTML converter


Extension  Text  Styles extraction  Images extraction
XML        Yes   Not applicable     Not applicable
DOCX       Yes   Yes                Yes
DOC        Yes   No                 No
RTF        Yes   Yes                Yes
ODT        Yes   Yes                Yes
XLSX       Yes   Yes                Yes
XLS        Yes   Yes                No
CSV        Yes   Not applicable     Not applicable
TXT/MD     Yes   Yes                Yes
JSON       Yes   Not applicable     Not applicable
EPUB       Yes   Yes                Yes
PDF        Yes   No                 Yes
PPT        Yes   No                 No
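
When scripting conversions, it can help to know in advance which extraction flags are worth passing for a given file. The sketch below transcribes the support table into a lookup. The names `SUPPORT` and `useful_flags` are hypothetical helpers and not part of Document2HTML, and the lookup assumes the file extension reflects the real format.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# (text, styles, images) per extension, transcribed from the support table.
# None stands for "Not applicable".
SUPPORT: Dict[str, Tuple[bool, Optional[bool], Optional[bool]]] = {
    "xml": (True, None, None),
    "docx": (True, True, True),
    "doc": (True, False, False),
    "rtf": (True, True, True),
    "odt": (True, True, True),
    "xlsx": (True, True, True),
    "xls": (True, True, False),
    "csv": (True, None, None),
    "txt": (True, True, True),
    "md": (True, True, True),
    "json": (True, None, None),
    "epub": (True, True, True),
    "pdf": (True, False, True),
    "ppt": (True, False, False),
}


def useful_flags(path: str) -> List[str]:
    """Return the extraction flags (-s, -i) that the table marks as supported."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in SUPPORT:
        raise ValueError(f"extension not listed in the support table: {ext!r}")
    _, styles, images = SUPPORT[ext]
    flags = []
    if styles:
        flags.append("-s")
    if images:
        flags.append("-i")
    return flags
```

For example, `useful_flags("report.docx")` yields both flags, while a CSV file yields none because styles and images are not applicable to it.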


Dependencies

cURL for downloading images:

sudo apt-get install libcurl4-openssl-dev   # Debian/Ubuntu
brew install curl                           # macOS (Homebrew)

iconv for encoding conversion:

sudo apt-get install libc6                  # Debian/Ubuntu
brew install libiconv                       # macOS (Homebrew)

Tidy for cleaning and repairing HTML:

sudo apt-get install libtidy-dev            # Debian/Ubuntu
brew install tidy-html5                     # macOS (Homebrew)

file for determining file extension



Compiling

Make sure the Qt (>= 5.6) development libraries are installed:

  • In Ubuntu/Debian: apt-get install qt5-default qttools5-dev-tools zlib1g-dev
  • In Fedora: sudo dnf builddep tiled
  • In Arch Linux: pacman -S qt
  • In Mac OS X with Homebrew:
    • brew install qt5
    • brew link qt5 --force
  • Or you can download Qt from:

Now you can compile by running:

qmake (or qmake-qt5 on some systems)
make

To do a shadow build, run qmake from a different directory and point it at the source directory, for example:

mkdir build
cd build
qmake ../src/

If you have ideas on how to build the project with CMake instead of qmake, please contact me.

Tool usage


    document2html -f|-d <input file|dir> -o <output dir> [-si]
    document2html -h
    document2html -v


Short Flag  Long Flag  Description
-f          --file     Input file
-d          --dir      Input directory
-o          --out      Output directory
-s          --style    Extract styles
-i          --image    Extract images
-h          --help     Display help message
-v          --version  Display package version
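
To call the tool from a script, the flags above can be assembled programmatically. The following is a minimal sketch, assuming `document2html` is on the PATH. `build_command` and `convert` are hypothetical helper names, not part of the project. The combined `-si` form mirrors the usage line, and `check=True` assumes the tool returns a non-zero exit code on failure, which the documentation does not state.

```python
import subprocess
from typing import List


def build_command(source: str, out_dir: str, *, is_dir: bool = False,
                  styles: bool = False, images: bool = False,
                  exe: str = "document2html") -> List[str]:
    """Assemble the argument list for one document2html run."""
    # -d for a directory of documents, -f for a single file.
    cmd = [exe, "-d" if is_dir else "-f", source, "-o", out_dir]
    # Combine the optional flags the way the usage line shows them: [-si].
    extras = ("s" if styles else "") + ("i" if images else "")
    if extras:
        cmd.append("-" + extras)
    return cmd


def convert(source: str, out_dir: str, **options) -> None:
    """Run the converter; raises CalledProcessError on a non-zero exit
    (assumption: the tool signals failure through its exit code)."""
    subprocess.run(build_command(source, out_dir, **options), check=True)
```

For instance, `build_command("report.docx", "out", styles=True, images=True)` produces the equivalent of `document2html -f report.docx -o out -si`.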



If you have questions regarding the libraries, I invite you to open an issue on GitHub. Please describe your request, problem, or question in as much detail as possible, and mention the version of the libraries you are using as well as the versions of your compiler and operating system. Opening an issue on GitHub allows other users and contributors to these libraries to collaborate.

You're welcome! :)
