Spark Cassandra Connector

DataStax Spark Cassandra Connector


Quick Links

What | Where
Community | Chat with us at Datastax and Cassandra Q&A
Scala Docs | Most Recent Release (3.3.0): Spark-Cassandra-Connector, Spark-Cassandra-Connector-Driver
Latest Production Release | 3.3.0


Lightning-fast cluster computing with Apache Spark™ and Apache Cassandra®.

This library lets you expose Cassandra tables as Spark RDDs and Datasets/DataFrames, write Spark RDDs and Datasets/DataFrames to Cassandra tables, and execute arbitrary CQL queries in your Spark applications.

  • Compatible with Apache Cassandra version 2.1 or higher (see table below)
  • Compatible with Apache Spark 1.0 through 3.3 (see table below)
  • Compatible with Scala 2.11 and 2.12
  • Exposes Cassandra tables as Spark RDDs and Datasets/DataFrames
  • Maps table rows to CassandraRow objects or tuples
  • Offers customizable object mapper for mapping rows to objects of user-defined classes
  • Saves RDDs back to Cassandra by implicit saveToCassandra call
  • Deletes rows and columns from Cassandra by implicit deleteFromCassandra call
  • Joins with a subset of Cassandra data using joinWithCassandraTable call for RDDs, and optimizes joins with data in Cassandra when using Datasets/DataFrames
  • Partitions RDDs according to Cassandra replication using repartitionByCassandraReplica call
  • Converts data types between Cassandra and Scala
  • Supports all Cassandra data types including collections
  • Filters rows on the server side via the CQL WHERE clause
  • Allows for execution of arbitrary CQL statements
  • Plays nice with Cassandra Virtual Nodes
  • Can be used from any language that supports the Datasets/DataFrames API: Python, R, etc.
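
A few of the features listed above can be sketched in one small program. This is a minimal illustration rather than an official example: the keyspace `test`, the table `kv`, the object name `ConnectorSketch`, the helper `doubled`, local mode, and the contact point `127.0.0.1` are all assumptions made for the sketch, and it expects a reachable Cassandra node plus Spark and the connector on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._                       // adds cassandraTable / saveToCassandra
import com.datastax.spark.connector.cql.CassandraConnector

object ConnectorSketch {
  // Hypothetical pure helper, kept separate so it can be checked without a cluster.
  def doubled(kv: (String, Int)): (String, Int) = (kv._1, kv._2 * 2)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("connector-sketch")
      .setMaster("local[*]")                                 // assumption: Spark in local mode
      .set("spark.cassandra.connection.host", "127.0.0.1")   // assumption: local Cassandra node
    val sc = new SparkContext(conf)

    // Execute arbitrary CQL through a connector-managed session.
    CassandraConnector(conf).withSessionDo { session =>
      session.execute(
        "CREATE KEYSPACE IF NOT EXISTS test WITH replication = " +
          "{'class': 'SimpleStrategy', 'replication_factor': 1}")
      session.execute(
        "CREATE TABLE IF NOT EXISTS test.kv (key text PRIMARY KEY, value int)")
    }

    // Save an RDD of tuples with the implicit saveToCassandra call.
    sc.parallelize(Seq(("key1", 1), ("key2", 2)))
      .saveToCassandra("test", "kv", SomeColumns("key", "value"))

    // Expose the table as an RDD of typed tuples, transform it, and write it back.
    val pairs = sc.cassandraTable[(String, Int)]("test", "kv").select("key", "value")
    pairs.map(doubled).saveToCassandra("test", "kv", SomeColumns("key", "value"))

    // Filter on the server side via a CQL WHERE clause; rows arrive as CassandraRow objects.
    sc.cassandraTable("test", "kv").where("key = ?", "key1").collect().foreach(println)

    sc.stop()
  }
}
```

The write-back step overwrites existing rows because Cassandra writes are upserts keyed on the primary key.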

Version Compatibility

The connector project has several branches, each of which maps to different supported versions of Spark and Cassandra. For previous releases the branch is named "bX.Y", where X.Y is the major+minor version; for example, the "b1.6" branch corresponds to the 1.6 release. The "master" branch normally contains development for the next connector release in progress.

Currently, the following branches are actively supported: 3.3.x (master), 3.2.x (b3.2), 3.1.x (b3.1), 3.0.x (b3.0) and 2.5.x (b2.5).

Connector | Spark | Cassandra | Cassandra Java Driver | Minimum Java Version | Supported Scala Versions
3.3 | 3.3 | 2.1.5*, 2.2, 3.x, 4.x | 4.13 | 8 | 2.12
3.2 | 3.2 | 2.1.5*, 2.2, 3.x, 4.0 | 4.13 | 8 | 2.12
3.1 | 3.1 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.12
3.0 | 3.0 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.12
2.5 | 2.4 | 2.1.5*, 2.2, 3.x, 4.0 | 4.12 | 8 | 2.11, 2.12
2.4.2 | 2.4 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11, 2.12
2.4 | 2.4 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11
2.3 | 2.3 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.11
2.0 | 2.0, 2.1, 2.2 | 2.1.5*, 2.2, 3.x | 3.0 | 8 | 2.10, 2.11
1.6 | 1.6 | 2.1.5*, 2.2, 3.0 | 3.0 | 7 | 2.10, 2.11
1.5 | 1.5, 1.6 | 2.1.5*, 2.2, 3.0 | 3.0 | 7 | 2.10, 2.11
1.4 | 1.4 | 2.1.5* | 2.1 | 7 | 2.10, 2.11
1.3 | 1.3 | 2.1.5* | 2.1 | 7 | 2.10, 2.11
1.2 | 1.2 | 2.1, 2.0 | 2.1 | 7 | 2.10, 2.11
1.1 | 1.1, 1.0 | 2.1, 2.0 | 2.1 | 7 | 2.10, 2.11
1.0 | 1.0, 0.9 | 2.0 | 2.0 | 7 | 2.10, 2.11

*Compatible with 2.1.X where X >= 5
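
As a reading aid for the table above, the Spark-to-connector mapping can be written as a small lookup. This is only a sketch: `ConnectorCompat` and `connectorFor` are hypothetical names that are not part of the connector, and where several connector lines list the same Spark version the sketch picks the newest one.

```scala
object ConnectorCompat {
  // Spark major.minor -> newest connector line that lists it in the compatibility table.
  private val bySpark: Map[String, String] = Map(
    "3.3" -> "3.3", "3.2" -> "3.2", "3.1" -> "3.1", "3.0" -> "3.0",
    "2.4" -> "2.5", "2.3" -> "2.3",
    "2.2" -> "2.0", "2.1" -> "2.0", "2.0" -> "2.0",
    "1.6" -> "1.6", "1.5" -> "1.5", "1.4" -> "1.4", "1.3" -> "1.3",
    "1.2" -> "1.2", "1.1" -> "1.1", "1.0" -> "1.1", "0.9" -> "1.0"
  )

  // Accepts versions such as "3.3" or "3.3.0"; only major.minor is used for the lookup.
  def connectorFor(sparkVersion: String): Option[String] = {
    val majorMinor = sparkVersion.split('.').take(2).mkString(".")
    bySpark.get(majorMinor)
  }
}
```

For example, `ConnectorCompat.connectorFor("2.4.8")` yields `Some("2.5")`, and a Spark version that is absent from the table yields `None`.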

Hosted API Docs

API documentation for the Scala and Java interfaces is available online; links to the most recent Scala docs are listed under Quick Links.








This project is available on the Maven Central Repository. For SBT to download the connector binaries, sources and javadoc, put this in your project SBT config:

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "3.3.0"
  • The default Scala version for Spark 3.0+ is 2.12; please choose the appropriate build. See the FAQ for more information.
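
For context, a complete SBT build around that dependency line might look like the fragment below. It is a sketch under assumptions: the project name, the exact Scala patch version, and the choice to mark Spark as "provided" (so the cluster's own Spark jars are used at runtime) are illustrative and not taken from this README.

```scala
// build.sbt (sketch)
name := "cassandra-spark-example"        // hypothetical project name

// Spark 3.x builds default to Scala 2.12, which matches the connector 3.3 line.
scalaVersion := "2.12.15"                // assumption: any 2.12.x matching your Spark build

libraryDependencies ++= Seq(
  // assumption: Spark is supplied by the cluster at runtime
  "org.apache.spark" %% "spark-sql" % "3.3.0" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "3.3.0"
)
```

The `%%` operator appends the Scala binary version to the artifact name, which is why `scalaVersion` has to agree with the Scala versions listed for the connector release.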


See Building And Artifacts


Online Training

DataStax Academy

DataStax Academy provides free online training for Apache Cassandra and DataStax Enterprise. In DS320: Analytics with Spark, you will learn how to effectively and efficiently solve analytical problems with Apache Spark, Apache Cassandra, and DataStax Enterprise. You will learn about the Spark API, the Spark Cassandra Connector, Spark SQL, Spark Streaming, and crucial performance optimization techniques.


Reporting Bugs

New issues may be reported using JIRA. Please include all relevant details including versions of Spark, Spark Cassandra Connector, Cassandra and/or DSE. A minimal reproducible case with sample code is ideal.

Mailing List

Questions and requests for help may be submitted to the user mailing list.

Q/A Exchange

The DataStax Community provides a free question and answer website for any and all questions relating to any DataStax-related technology, including the Spark Cassandra Connector. Both DataStax engineers and community members frequent this board and answer questions.


To protect the community, all contributors are required to sign the DataStax Spark Cassandra Connector Contribution License Agreement. The process is completely electronic and should only take a few minutes.

To develop this project, we recommend using IntelliJ IDEA. Make sure you have installed and enabled the Scala Plugin. Open the project with IntelliJ IDEA and it will automatically create the project structure from the provided SBT configuration.

Tips for Developing the Spark Cassandra Connector

Checklist for contributing changes to the project:

  • Create a SPARKC JIRA
  • Make sure that all unit tests and integration tests pass
  • Add an appropriate entry at the top of CHANGES.txt
  • If the change has any end-user impacts, also include changes to the ./doc files as needed
  • Prefix the pull request description with the JIRA number, for example: "SPARKC-123: Fix the ..."
  • Open a pull-request on GitHub and await review


To run unit and integration tests:

./sbt/sbt test
./sbt/sbt it:test

Note that the integration tests require CCM to be installed on your machine. See Tips for Developing the Spark Cassandra Connector for details.

By default, integration tests start up a separate, single Cassandra instance and run Spark in local mode. It is possible to run integration tests with your own Cassandra and/or Spark cluster. First, prepare a jar with testing code:

./sbt/sbt test:package

Then copy the generated test jar to your Spark nodes and run:

export IT_TEST_CASSANDRA_HOST=<IP of one of the Cassandra nodes>
export IT_TEST_SPARK_MASTER=<Spark Master URL>
./sbt/sbt it:test

Generating Documents

To generate the Reference Document, run:

./sbt/sbt spark-cassandra-connector-unshaded/run (outputLocation)

outputLocation defaults to doc/


Copyright 2014-2022, DataStax, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
