  1. HBase
  2. HBASE-14340

Add second bulk load option to Spark Bulk Load to send puts as the value

Details

    • Reviewed

    Description

      The initial bulk load option for Spark bulk load sends values through the shuffle one at a time. This is similar to how the original MR bulk load worked.

      However, the MR bulk loader has more than one bulk load option. A second option allows all the Column Families, Qualifiers, and Values of a row to be combined on the map side.

      This only works if the row is not super wide.

      But if the row is not super wide, this method of sending values through the shuffle reduces the amount of data and work the shuffle has to deal with.
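
      The difference between the two options can be illustrated with a minimal sketch in plain Python. It does not use Spark or the HBase API; the sample cells and the function names `per_cell_records` and `per_row_records` are hypothetical and exist only to show how many records each approach would hand to the shuffle.

```python
from collections import defaultdict

# Hypothetical input: (row_key, column_family, qualifier, value) cells.
cells = [
    ("row1", "cf1", "a", "1"),
    ("row1", "cf1", "b", "2"),
    ("row1", "cf2", "c", "3"),
    ("row2", "cf1", "a", "4"),
]


def per_cell_records(cells):
    """First option: one shuffle record per cell value."""
    return [((row, fam, qual), val) for row, fam, qual, val in cells]


def per_row_records(cells):
    """Second option: combine all cells of a row into one record before the shuffle."""
    rows = defaultdict(list)
    for row, fam, qual, val in cells:
        rows[row].append((fam, qual, val))
    # Sort rows and the cells inside each row so the output order is deterministic.
    return [(row, sorted(vals)) for row, vals in sorted(rows.items())]


thin = per_cell_records(cells)
wide = per_row_records(cells)
print(len(thin), "records when shuffling cell by cell")
print(len(wide), "records when shuffling row by row")
```

      In this sketch each combined record holds every cell of its row at once, which is one plausible reason the second option is only suitable when rows are not super wide.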

      Attachments

        1. HBASE-14340.1.patch
          65 kB
          Theodore michael Malaska
        2. HBASE-14340.2.patch
          64 kB
          Theodore michael Malaska

            People

              ted.m Theodore michael Malaska
              Votes: 0
              Watchers: 6
