q is a command-line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files).
q treats ordinary files as database tables and supports all SQL constructs, including JOINs.
It detects column names and types automatically and fully supports multiple character encodings.
q's web site is http://harelba.github.io/q/. It contains everything you need to download and use q immediately.
Installation instructions for all operating systems are available on the web site.
q "SELECT COUNT(*) FROM ./clicks_file.csv WHERE c3 > 32.3"
ps -ef | q -H "SELECT UID, COUNT(*) cnt FROM - GROUP BY UID ORDER BY cnt DESC LIMIT 3"
More examples are available on the web site.
I have created a preliminary benchmark that compares q's speed under python2 and python3, and compares both against textql and octosql.
Your input about the validity of the benchmark and about the results would be greatly appreciated. More details are here.
Any feedback/suggestions/complaints regarding this tool would be much appreciated. Contributions are most welcome as well, of course.
LinkedIn: Harel Ben Attia
Email: [email protected]
q on Twitter: #qtextasdata