Cassandra timeout during read query at consistency ALL #916

@serragnoli

  • Phantom is the only Cassandra driver this project uses

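  • The connector code (`cassandra` is the application's own configuration object, not shown):

// Imports assumed from the DataStax Java driver 3.x and phantom-connectors
import com.datastax.driver.core.{Cluster, HostDistance, PoolingOptions, SocketOptions}
import com.outworkers.phantom.connectors.{CassandraConnection, ContactPoint}

object PhantomConnector {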

  private val username: String   = cassandra.username
  private val password: String   = cassandra.password
  private val keyspace: String   = cassandra.keyspace
  private val hosts: Seq[String] = cassandra.hosts
  private val port: Int          = cassandra.port.value

  private val databasePooling: PoolingOptions = new PoolingOptions()
    .setCoreConnectionsPerHost(HostDistance.REMOTE, 2)
    .setMaxConnectionsPerHost(HostDistance.REMOTE, 4)
    .setMaxRequestsPerConnection(HostDistance.REMOTE, 2000)
    .setMaxQueueSize(16192)
    .setPoolTimeoutMillis(120000)

  private val cluster: Cluster.Builder = new Cluster.Builder()
    .addContactPoints(hosts: _*)
    .withPort(port)
    .withCredentials(username, password)
    .withPoolingOptions(databasePooling)
    .withoutJMXReporting()
    .withoutMetrics()
    .withSocketOptions(
      new SocketOptions()
        .setReadTimeoutMillis(900000)
        .setConnectTimeoutMillis(900000)
    )

  val connection: CassandraConnection = ContactPoint(port)
    .noHeartbeat()
    .withClusterBuilder(_ => cluster)
    .keySpace(keyspace)
}
  • Every read query explicitly sets its consistency level to LOCAL_QUORUM:
    select.where(_.territory eqs territory)
      .and(_.source eqs source)
      .and(_.lookupKey eqs lookupKey)
      .and(_.fragmentType eqs fragmentType)
      .consistencyLevel_=(ConsistencyLevel.LOCAL_QUORUM)
      .fetch()
  • Reads sporadically fail with a timeout that reports consistency ALL, as shown below:
2020-08-19T18:00:01.739 [cel-api-akka.actor.default-dispatcher-8060] INFO  com.myapp.cel.api.Main$ - Transforming for GB at 1597860001739
2020-08-19T18:00:02.159 [scala-execution-context-global-19974] INFO  c.myapp.cel.api.service.MapperService$ - Classification fetched [62280]
[ERROR] [08/19/2020 18:00:13.574] [cel-api-akka.actor.default-dispatcher-8059] [akka.actor.ActorSystemImpl(cel-api)] Error during processing of request: 'Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)
	at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:124)
	at com.datastax.driver.core.Responses$Error.asException(Responses.java:169)
	at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:646)
	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1233)
	at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1151)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (6 responses were required but only 5 replica responded)
	at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:91)
	at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:66)
	at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:297)
	at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:268)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
	... 22 more
  • The Cassandra cluster topology is 2 DCs of 7 nodes each, with a replication factor of 3
  • Where could Phantom be picking up consistency level ALL, given that every read is explicitly set to LOCAL_QUORUM?
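
For reference, the "6 responses were required" in the error lines up with the topology above if the replication factor of 3 applies to each of the 2 DCs (an assumption, since the keyspace definition is not shown). A minimal sketch of the arithmetic follows; the object name, method names and DC names are hypothetical and are not part of Phantom or the DataStax driver:

```scala
// Sketch only: how many replica responses a coordinator waits for
// at different consistency levels, under the assumed topology.
object RequiredResponses {
  // Assumed: replication factor 3 in each of the 2 DCs (DC names are placeholders).
  val replicationPerDc: Map[String, Int] = Map("dc1" -> 3, "dc2" -> 3)

  // A quorum is a strict majority of the replicas considered.
  def quorum(replicas: Int): Int = replicas / 2 + 1

  // LOCAL_QUORUM: majority of the replicas in the coordinator's own DC.
  def localQuorum(localDc: String): Int = quorum(replicationPerDc(localDc))

  // QUORUM: majority of the replicas across all DCs.
  def globalQuorum: Int = quorum(replicationPerDc.values.sum)

  // ALL: every replica in every DC.
  def all: Int = replicationPerDc.values.sum

  def main(args: Array[String]): Unit = {
    println(s"LOCAL_QUORUM requires ${localQuorum("dc1")} responses")
    println(s"QUORUM requires $globalQuorum responses")
    println(s"ALL requires $all responses")
  }
}
```

Under that assumption a LOCAL_QUORUM read waits for 2 responses, while 6 matches only ALL across both DCs, so the failing read does not appear to be running at the level set in the query above.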