CASSANDRA-3533

TimeoutException when there is a firewall issue.

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • Fix Version/s: 2.0 beta 1
    • None
    • None

    Description

      When one node in the cluster cannot talk to the other DC/RAC because of a firewall or other network issue (StorageProxy calls fail), and the remote nodes are NOT marked down because at least one node in the cluster can still talk to the other DC/RAC, we get a TimeoutException instead of an UnavailableException.

      The problems with this:
      1) These errors are hard to monitor and identify.
      2) From the client, it is hard to differentiate a bad node from a bad query.
      3) When this happens, we have to wait at least the RPC timeout before we know that the query won't succeed.

      Possible solution: when marking a node down, we might want to check whether the node is actually alive by trying to communicate with it, so that we can be sure of its real state.
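
The idea can be illustrated with a minimal sketch. It is not the attached patch and does not use Cassandra's real classes: `FailFastCoordinator`, `ReachabilityProbe`, and `UnavailableSketchException` are hypothetical names invented for this example. The assumption is that the coordinator combines the cluster-wide liveness view (gossip) with its own direct reachability check, and fails fast when too few replicas are usable from this node instead of waiting for the RPC timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in Cassandra.
class FailFastCoordinator {

    /** Raised immediately when too few replicas are reachable from this node. */
    static class UnavailableSketchException extends RuntimeException {
        final int required;
        final int reachable;

        UnavailableSketchException(int required, int reachable) {
            super("required " + required + " replicas, only " + reachable
                    + " reachable from this coordinator");
            this.required = required;
            this.reachable = reachable;
        }
    }

    /** Hypothetical probe: can THIS node open a connection to the endpoint? */
    interface ReachabilityProbe {
        boolean canReach(String endpoint);
    }

    private final ReachabilityProbe probe;

    FailFastCoordinator(ReachabilityProbe probe) {
        this.probe = probe;
    }

    /**
     * Returns the replicas that gossip reports alive AND that this node can
     * reach directly. Throws before any request is sent if fewer than
     * `required` replicas qualify.
     */
    List<String> selectReplicas(List<String> replicas, Set<String> gossipAlive, int required) {
        List<String> usable = new ArrayList<>();
        for (String replica : replicas) {
            if (gossipAlive.contains(replica) && probe.canReach(replica)) {
                usable.add(replica);
            }
        }
        if (usable.size() < required) {
            throw new UnavailableSketchException(required, usable.size());
        }
        return usable;
    }
}
```

With a probe that reports the remote DC as blocked, a request needing two replicas fails at once with the unavailable-style error, while a request needing one replica proceeds using only the local replica. A real implementation would likely cache probe results rather than probing on every request.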

      Attachments

        1. 3533.txt
          4 kB
          Brandon Williams
        2. 0001-CASSANDRA-3533.patch
          9 kB
          Vijay
        3. 0001-3533-v2.patch
          10 kB
          Vijay

          People

            Assignee: Vijay (vijay2win@yahoo.com)
            Reporter: Vijay (vijay2win@yahoo.com)
            Authors: Vijay
            Reviewers: Jonathan Ellis
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: