Phantom Tips: Tip #3: Connecting to Cassandra
Flavian, Scala Developer
Flavian is a Scala engineer with many years of experience and the author of phantom and morpheus.

Phantom connects using the mini-connectors framework, which is a small abstraction layer around the ClusterBuilder found in the Datastax Java driver. It does quite a bit more than that. Phantom aims to be user friendly, but the way it achieves things isn't necessarily easy, and that's why we try our best to mask away all of that complexity from the end user.

The auto-magical part comes down to two things: the implicit session and the implicit keyspace. We chose implicits to keep the DSL as smooth as possible, to help users prevent scope leaks, and to minimise the surface area each layer of the application is exposed to, which gives a proper separation of concerns.

The session obviously has to touch every single query executed at some point. In Java you would be dealing with the slightly less convenient session.execute(query), but fortunately Scala gives us implicit scope.
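To make the difference concrete, here is a minimal plain-Scala sketch of the two styles. `Session`, `KeySpace`, `execute` and `Defaults` are hypothetical stand-ins defined inside the example. They are not the phantom or Datastax driver types, and the "query execution" only builds a string.

```scala
// Hypothetical stand-ins: these are NOT phantom or Datastax driver types.
object ImplicitSessionSketch {
  final case class Session(name: String)
  final case class KeySpace(name: String)

  // Java style: the caller threads the session and keyspace through every call.
  def executeExplicit(query: String, session: Session, keySpace: KeySpace): String =
    s"[${session.name}/${keySpace.name}] $query"

  // Scala style: the compiler fills in both values from implicit scope.
  def execute(query: String)(implicit session: Session, keySpace: KeySpace): String =
    executeExplicit(query, session, keySpace)

  // One place that defines the implicit values.
  object Defaults {
    implicit val session: Session = Session("local")
    implicit val keySpace: KeySpace = KeySpace("my_app")
  }

  def demo(): String = {
    import Defaults._
    // No session or keyspace is mentioned at the call site.
    execute("SELECT * FROM users")
  }
}
```

The call site in `demo()` carries no connection details at all, which is the property the rest of this article relies on.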

If you are new to Phantom, here's a very important rule of thumb: You will never ever have to provide the session explicitly. If you are, you are doing it wrong. We have built the connectors framework and phantom in such a way that you will never ever need to do anything like the following:

database.table.insert.value(_.id, id).future(session, keyspace, executionContext)

The right way is actually very simple: every execution call you make, meaning any query method that actually runs the query and returns a future, should not contain an explicit session, keySpace or executionContext. Below are examples of what you might expect to write when using phantom. The same holds for all the API methods that trigger query execution, such as fetch(), fetchRecord(), one(), future(), iterator(), iteratorRecord() and so on.

database.table.insert.value(_.id, id).future()
database.table.select.where(_.id eqs id).one()
database.table.update.where(_.id eqs id).modify(_.name setTo "newValue").future()


This is a better, simpler way. Not only does it let you write less code and achieve better separation of concerns, it also lets you exploit all the native constructs in phantom that should just do the work for you. We are about to discover exactly what those constructs are and how to use them.

1. Creating a connector lets you specify any properties you normally would.

  • Connectors ensure sessions are initialised when they need to be and that keyspaces are created when they need to be. They deal primarily with thread safety and performance, which are easy to get wrong.
  • By containing an inner mixin trait, they can propagate the necessary implicits, `session` and `keySpace`, to every other method. Since every database call in phantom expects a `session` and a `keySpace` implicitly, connectors just bridge the gap here.
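The two points above can be modelled in a few lines of plain Scala. This is only a sketch of the idea under simplifying assumptions, not phantom's actual source: the types below merely borrow names used in this article, and `run` and `Users` are invented for illustration.

```scala
// Plain-Scala model of the connector idea. Every name below is a hypothetical
// stand-in; none of this is phantom's real implementation.
object ConnectorSketch {
  final case class Session(id: String)
  final case class KeySpace(name: String)

  // What a table needs in implicit scope, without saying who supplies it.
  trait RootConnector {
    implicit def session: Session
    implicit def space: KeySpace
  }

  // The connector owns the session and the keyspace name.
  class KeySpaceDef(name: String) { outer =>
    // Lazily initialised, so the session only exists once something needs it.
    lazy val session: Session = Session(s"session-for-$name")
    val space: KeySpace = KeySpace(name)

    // The inner mixin trait: it fills in both implicits from the enclosing connector.
    trait Connector extends RootConnector {
      implicit def session: Session = outer.session
      implicit def space: KeySpace = outer.space
    }
  }

  // Stand-in for query execution: it only needs the two implicits.
  def run(query: String)(implicit session: Session, space: KeySpace): String =
    s"${space.name} via ${session.id}: $query"

  abstract class Users extends RootConnector {
    // No session or keyspace mentioned here; both come from the inherited implicits.
    def count(): String = run("SELECT count(*) FROM users")
  }

  val connector = new KeySpaceDef("my_app")

  // Mixing in connector.Connector is what makes the abstract table concrete.
  object users extends Users with connector.Connector
}
```

Because `Connector` is an inner trait, `connector.Connector` is tied to one specific connector instance, so a table can only ever see the session and keyspace of the connector it was mixed with.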

To create a connector:

import com.datastax.driver.core.SocketOptions

// This is a fictional series of IP addresses
val hosts = Seq("10.41.2.2", "10.42.12.12")
val port = 9042

// withClusterBuilder gives access to the underlying Datastax Cluster.Builder
val connector = ContactPoints(hosts, port).withClusterBuilder(
  _.withSocketOptions(
    new SocketOptions().setConnectTimeoutMillis(10000)
  )
).noHeartbeat().keySpace("my_app")

Connectors usually get created in a singleton object wrapper:

object DefaultConnector {
  // hosts and port are the values defined in the previous snippet
  val connector = ContactPoints(hosts, port).withClusterBuilder(
    _.withSocketOptions(
      new SocketOptions().setConnectTimeoutMillis(10000)
    )
  ).noHeartbeat().keySpace("my_app")
}

And the general point is to pass them to a Database object as constructor arguments.

class MyDb(override val connector: KeySpaceDef) extends Database(connector) {
  // The connector.Connector trait is the secret sauce here.
  object firstTable extends FirstTable with connector.Connector
}

Tables usually mix in `RootConnector`, which is a trait that tells the table a connector with a `session` and `keySpace` will be injected later on.

class MyTable extends CassandraTable[MyTable, MyRec] {
  object id extends UUIDColumn(this) with PartitionKey[UUID]
}

In general we prefer having two classes per table: the table definition itself and a second one traditionally called `Concrete$TableName`, or in this case `ConcreteMyTable`. The reason is that we often want to override the methods available to specific parts of the application and have full cake pattern DI, and we also want full scope enclosure of the implicits we need.

// This will now need to be abstract
// because we haven't yet defined a session and a keySpace
// but want to pretend that we do,
// as we don't want to manually add 2 implicit params to every method.

abstract class ConcreteMyTable extends MyTable with RootConnector {
  // So instead of having to type something like:
  //
  //   def store(rec: MyRec)(implicit session: Session, space: KeySpace): Future[ResultSet] = {
  //     insert.value(_.id, rec.id).future()
  //   }
  //
  // we can simply skip typing the same two implicits.
  // This works because implicits propagate by inheritance,
  // so the RootConnector is simply saying that when this class is instantiated
  // "someone" will need to provide a valid session and keySpace implicitly.
  def store(rec: MyRec): Future[ResultSet] = {
    insert.value(_.id, rec.id).future()
  }
}

The most important goal of defining a Database object like that and providing a connector is to also provide full encapsulation.

The requirement for an implicit session or keyspace should never ever leak outside of the Database object itself. That should be the final frontier. Not all components of your application should care that you are using Cassandra, and that is why we've designed this cutoff point.

Let's imagine you are in a reasonably sized company/team. If you let your implicits leak, the "end" users of your Cassandra-based API risk having to be aware of its internals. Here's a perfect example. Your domain model being stored in a Cassandra table is Record, and it looks like this:

case class Record(id: UUID, name: String, timestamp: DateTime)


Now let's have a look at the common methods you might define. If any method that is part of your end-user API looks like this:

def storeRecord(record: Record)(
  implicit session: Session,
  keySpace: KeySpace
): Future[ResultSet]

Then you have a problem, because now all your end users need to be aware of the database internals and understand how Cassandra connects and so on. What you want is full encapsulation: none of the Cassandra-specific implicits should leak scope. The database abstraction is the perfect way to achieve that, because you can wrap services around databases simply by using a database provider trait.

That will guarantee all DB calls will look like this:

// Remember the Record class is something you specify, coming from your own domain, not something
// generic or phantom specific. Record should be the directly usable result you want.

def storeRecord(record: Record): Future[ResultSet]
def getById(id: UUID): Future[Option[Record]]

When an end user calls such a method, they should only ever pass in a known domain object, with no implicits or any Cassandra-specific information. They should only ever get back either a known domain type, such as Record in the above example, or a ResultSet, which is common to all Cassandra operations. You can even choose to further abstract `ResultSet` into something you feel is more appropriate for your particular use cases.
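As a rough illustration of that cutoff point, the sketch below wraps a service around a database through a provider trait. It is a plain-Scala model under loud assumptions: `AppDatabase` is an in-memory stand-in rather than a phantom `Database`, `AppDatabaseProvider` is a hypothetical trait rather than phantom's own provider, `Record` is simplified to two fields, and `Future[Unit]` stands in for a further-abstracted `ResultSet`.

```scala
import java.util.UUID
import scala.concurrent.Future

// Hypothetical stand-ins only; nothing here is part of phantom's API.
object EncapsulationSketch {
  // Simplified domain model (the timestamp field is left out).
  final case class Record(id: UUID, name: String)

  // In-memory stand-in for the database object that would hold the tables.
  class AppDatabase {
    private val rows = scala.collection.concurrent.TrieMap.empty[UUID, Record]

    def store(record: Record): Future[Unit] =
      Future.successful { rows.put(record.id, record); () }

    def findById(id: UUID): Future[Option[Record]] =
      Future.successful(rows.get(id))
  }

  // The provider trait: services only ask for "a database", nothing more.
  trait AppDatabaseProvider {
    def database: AppDatabase
  }

  // The public API: domain types in, domain types out, no implicits.
  trait RecordService extends AppDatabaseProvider {
    def storeRecord(record: Record): Future[Unit] = database.store(record)
    def getById(id: UUID): Future[Option[Record]] = database.findById(id)
  }

  // Wiring: the choice of database is made once, in one place.
  object ProductionDb extends AppDatabase

  trait ProductionDatabase extends AppDatabaseProvider {
    def database: AppDatabase = ProductionDb
  }

  object Records extends RecordService with ProductionDatabase
}
```

A caller only ever writes `Records.storeRecord(record)` or `Records.getById(id)`. Swapping `ProductionDatabase` for a test provider would change the storage without touching any of the service methods.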

As of phantom 1.26.0, phantom pre-bundles a trait called DatabaseProvider, so you can achieve a very powerful degree of encapsulation and separation of concerns. Our next series on phantom will include a full tutorial on how to use database providers to achieve the highest degree of API cleanliness and code quality, and the best strategy for structuring your application code with phantom.

Hopefully that gives some much needed insight into how phantom connects and why things are the way they are. If you enjoyed this series of tips, subscribe to our newsletter below and follow us on twitter at Outworkers for more!
