  1. Chukwa (retired)
  2. CHUKWA-462

Store the cluster in the key for performance and easier customization on mappers

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Data Processors
    • Labels: None

    Description

      Right now the Chukwa framework stores the destination cluster as a tag in the Chunk. The tags are then copied to each ChukwaRecord, and before a record is stored, the cluster is parsed out of its tags with a regular expression.

      • Applying a regular expression to each record is slow.
      • Changing the destination cluster from the mapper is harder, because you have to tweak the tags field.
      • Storing the cluster on every record takes unneeded space.
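
      To make the first point concrete, here is a minimal sketch of a per-record lookup. It is not Chukwa's actual code: the class name, the method names, and the assumed tag layout (space-separated key="value" pairs such as cluster="demo") are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; names and tag layout are assumptions, not Chukwa code.
final class PerRecordClusterLookup {
    // Assumed tag layout: space-separated key="value" pairs, e.g. cluster="demo" rack="r1"
    private static final Pattern CLUSTER = Pattern.compile("\\bcluster=\"([^\"]*)\"");

    // Parses the cluster out of one record's tags; returns null if there is no cluster tag.
    static String clusterOf(String recordTags) {
        Matcher m = CLUSTER.matcher(recordTags);
        return m.find() ? m.group(1) : null;
    }

    // Runs the regular expression once per record, even though every record
    // that came from the same chunk carries the same tags.
    static List<String> clustersFor(List<String> tagsOfEachRecord) {
        List<String> clusters = new ArrayList<>();
        for (String tags : tagsOfEachRecord) {
            clusters.add(clusterOf(tags));
        }
        return clusters;
    }
}
```

      In this sketch the regex work grows with the number of records, while the information being extracted only changes per chunk.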

      The proposed patch:

      • Extracts the cluster from the chunk tags just once per chunk, which is much faster.
      • Stores the cluster in the key, so it's easy to recover.
      • Makes the cluster easy to tweak from the mapper: just alter it with key.setClusterName(String clusterName).
      • Strips the cluster from the tags field of the resulting Chukwa records. If the tags field ends up empty, the patch skips setting it in the record entirely.
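
      The steps above can be sketched roughly as follows. This is not the attached patch: only setClusterName(String clusterName) is named in this issue, and the Key, ChunkInfo, and parseChunkTags names, as well as the key="value" tag layout, are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; only setClusterName(String) comes from the issue text.
final class ClusterKeySketch {
    // Assumed tag layout: space-separated key="value" pairs, e.g. cluster="demo" rack="r1"
    private static final Pattern CLUSTER =
            Pattern.compile("\\s*\\bcluster=\"([^\"]*)\"\\s*");

    // Stand-in for a record key that carries the cluster name.
    static final class Key {
        private String clusterName;
        void setClusterName(String clusterName) { this.clusterName = clusterName; }
        String getClusterName() { return clusterName; }
    }

    // Result of parsing a chunk's tags a single time.
    static final class ChunkInfo {
        final String clusterName;   // null if the chunk has no cluster tag
        final String remainingTags; // null if nothing is left once the cluster is stripped

        ChunkInfo(String clusterName, String remainingTags) {
            this.clusterName = clusterName;
            this.remainingTags = remainingTags;
        }
    }

    // Called once per chunk: extracts the cluster and strips it from the tags.
    static ChunkInfo parseChunkTags(String chunkTags) {
        Matcher m = CLUSTER.matcher(chunkTags);
        if (!m.find()) {
            String tags = chunkTags.trim();
            return new ChunkInfo(null, tags.isEmpty() ? null : tags);
        }
        String rest = (chunkTags.substring(0, m.start()) + " "
                + chunkTags.substring(m.end())).trim();
        return new ChunkInfo(m.group(1), rest.isEmpty() ? null : rest);
    }
}
```

      Under these assumptions, a processor would call parseChunkTags once per chunk, pass the result to key.setClusterName(...) for each record, and set the record's tags field only when remainingTags is not null. A mapper that wants a different destination can simply call key.setClusterName(...) again.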

      Attachments

        1. cluster_in_ChukwaRecordKey.v3.diff
          10 kB
          Guille -bisho-
        2. cluster_in_ChukwaRecordKey.v4.diff
          38 kB
          Guille -bisho-

          People

            Assignee: Unassigned
            Reporter: Guille -bisho- (bisho)