Awesome Open Source

Apache Tika for TYPO3


A TYPO3 CMS extension that provides Apache Tika functionality, including:

  • text extraction
  • metadata extraction
  • language detection (from strings or files)

Tika can be used in three ways: as a standalone Tika app (jar), as a Tika server, or via SolrCell integrated in Apache Solr.
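As a rough illustration of the three modes listed above, the sketch below wraps one text-extraction call per mode in a shell function. It talks to Tika directly rather than through this extension. The jar path, host names and the Solr core name `core_en` are assumptions for a local setup (9998 and 8983 are the usual default ports of Tika server and Solr), and `run` with its `DRY_RUN` switch is a hypothetical helper that prints a command instead of executing it.

```shell
#!/bin/sh
# Sketch only: jar path, hosts and the Solr core name are assumptions.
TIKA_JAR="${TIKA_JAR:-./tika-app.jar}"
TIKA_SERVER="${TIKA_SERVER:-http://localhost:9998}"
SOLR_CORE_URL="${SOLR_CORE_URL:-http://localhost:8983/solr/core_en}"

# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Standalone app/jar: extract plain text from a file.
tika_app_text() {
  run java -jar "$TIKA_JAR" --text "$1"
}

# Tika server: PUT the file to the /tika endpoint and ask for plain text.
tika_server_text() {
  run curl -sS -T "$1" -H "Accept: text/plain" "$TIKA_SERVER/tika"
}

# SolrCell: send the file to Solr's extracting handler without indexing it.
tika_solr_text() {
  run curl -sS -F "file=@$1" "$SOLR_CORE_URL/update/extract?extractOnly=true&extractFormat=text"
}
```

For example, `DRY_RUN=1 tika_server_text doc.pdf` prints the curl command without contacting a server, which is a convenient way to check the URL before pointing it at a real instance.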

We welcome contributions!

Further information about Apache Tika is available on the project's homepage.

Continuous Integration

We use GitHub Actions for continuous integration.

To run the test suite locally, please use our DDEV Docker environment.

Note: This requires a matching combination of branches:

  • solr-ddev-site on master branch:
    • packages/ext-solr on master
    • packages/ext-tika on master
  • solr-ddev-site on release-11.0.x branch:
    • packages/ext-solr on release-11.0.x
    • packages/ext-tika on release-6.0.x
  • Please refer to the version matrix for the proper combination of branches.

ddev enable tika
ddev tests-unit-tika
ddev tests-integration-tika
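The branch combinations and test commands above could be tied together roughly as follows. This is a sketch that assumes it is run from the root of a solr-ddev-site checkout and that `packages/ext-solr` and `packages/ext-tika` are separate Git checkouts. The function names and the `DRY_RUN` switch are hypothetical helpers, not part of the DDEV setup.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Map a solr-ddev-site branch to the ext-tika branch listed above.
tika_branch_for() {
  case "$1" in
    master)         echo master ;;
    release-11.0.x) echo release-6.0.x ;;
    *)              echo "unknown site branch: $1 (see the version matrix)" >&2
                    return 1 ;;
  esac
}

# Check out a matching branch combination, then run the Tika test suites.
# In both listed combinations ext-solr uses the same branch name as the site.
run_tika_tests() {
  site_branch=$1
  tika_branch=$(tika_branch_for "$site_branch") || return 1
  run git checkout "$site_branch" &&
  run git -C packages/ext-solr checkout "$site_branch" &&
  run git -C packages/ext-tika checkout "$tika_branch" &&
  run ddev enable tika &&
  run ddev tests-unit-tika &&
  run ddev tests-integration-tika
}
```

Calling `DRY_RUN=1 run_tika_tests release-11.0.x` lists the commands without touching any checkout. Only the two combinations named above are encoded, so any other branch is rejected rather than guessed.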

Contributions

  1. Fork the repository
  2. Clone the repository
  3. Create a new branch
  4. Make your changes
  5. Commit your changes to your fork. In your commit message, refer to the issue number if one already exists, e.g. [BUGFIX] short description of fix (resolves #4711)
  6. Submit a pull request (see the hints in "How to write the perfect pull request")
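On the command line, steps 2 to 5 might look like the sketch below. The fork URL, branch name and commit message are supplied by the caller, the target directory `ext-tika` is an assumption, and the function names and `DRY_RUN` switch are hypothetical helpers. Forking (step 1) and opening the pull request (step 6) happen on the hosting platform and are not scripted here.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Steps 2 and 3: clone your fork and create a working branch.
start_contribution() {
  fork_url=$1
  branch=$2
  run git clone "$fork_url" ext-tika &&
  run git -C ext-tika checkout -b "$branch"
}

# Step 5: stage everything, commit with the given message, push to your fork.
publish_contribution() {
  branch=$1
  message=$2
  run git -C ext-tika add -A &&
  run git -C ext-tika commit -m "$message" &&
  run git -C ext-tika push origin "$branch"
}
```

Between the two calls you make your changes (step 4). A message such as `[BUGFIX] short description of fix (resolves #4711)` follows the pattern shown in step 5.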

Keep your fork in sync with the original repository

  1. git remote add upstream
  2. git fetch upstream
  3. git checkout master
  4. git merge upstream/master
  5. git push origin master
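The five steps can also be wrapped in one function, sketched below. The upstream URL stays a parameter because it is not given above. The check that skips `git remote add` when an `upstream` remote already exists is an addition of this sketch, and `sync_fork`, `run` and `DRY_RUN` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Sync the local and remote master of a fork with the upstream repository.
sync_fork() {
  upstream_url=$1
  # "git remote add" fails if the remote already exists, so add it only once.
  if ! git remote get-url upstream >/dev/null 2>&1; then
    run git remote add upstream "$upstream_url" || return 1
  fi
  run git fetch upstream &&
  run git checkout master &&
  run git merge upstream/master &&
  run git push origin master
}
```

Run it from inside your clone and pass the URL of the original repository as the only argument.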
