Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-2216

OpenBitSet#hashCode() may return false for identical sets.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 2.9, 2.9.1, 3.0
    • 2.9.4, 3.0.3
    • core/other
    • None
    • New, Patch Available

    Description

      OpenBitSet uses an internal buffer of long variables to store set bits and an additional 'wlen' index that points
      to the highest used component inside

      {@link #bits}

      buffer.

      Unlike in JDK, the wlen field is not continuously maintained (on clearing bits, for example). This leads to a situation when wlen may point
      far beyond the last set bit.

      The hashCode implementation iterates over all long components of the bits buffer, rotating the hash even for empty components. This is against the contract of hashCode-equals. The following test case illustrates this:

      // initialize two bitsets with different capacity (bits length).
      BitSet bs1 = new BitSet(200);
      BitSet bs2 = new BitSet(64);
      // set the same bit.
      bs1.set(3);
      bs2.set(3);
              
      // equals returns true (passes).
      assertEquals(bs1, bs2);
      // hashCode returns false (against contract).
      assertEquals(bs1.hashCode(), bs2.hashCode());
      

      Fix and test case attached.

      Attachments

        1. openbitset.patch
          2 kB
          Dawid Weiss
        2. LUCENE-2216.patch
          0.9 kB
          Yonik Seeley

        Activity

          People

            Unassigned Unassigned
            dawidweiss Dawid Weiss
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: