Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: examples
    • Labels: None

    Description

      Hi,

      I recently wrote some code to find the K largest integers for each key (group).

      Given one or more input files containing lines of the following form:

      "key",value

      where key is a string and value is any integer,
      the program prints the K largest values for each key.

      For example, given the input:

      "a",1
      "b",1
      "a",2
      "a",5
      "b",17
      "c",5
      "b",6

      If K = 2, the program prints:

      "a" [2,5]
      "b" [6,17]
      "c" [5]

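      The attached code is not reproduced here, so the following is only a minimal single-process sketch of the behaviour described above, not the actual Hadoop job. It assumes well-formed lines of the form "key",value and keeps a min-heap of at most K values per key. The class name TopKPerKey and the method topK are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the attached k-ranking code.
public class TopKPerKey {

    // Returns, for each key, its K largest values in ascending order.
    // Assumes every non-empty line has the form "key",value with an integer value.
    public static Map<String, List<Integer>> topK(List<String> lines, int k) {
        // One min-heap per key; TreeMap keeps keys sorted for stable output.
        Map<String, PriorityQueue<Integer>> heaps = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Split on the last comma so that commas inside the key are tolerated.
            int comma = trimmed.lastIndexOf(',');
            String key = trimmed.substring(0, comma); // quotes are kept as part of the key
            int value = Integer.parseInt(trimmed.substring(comma + 1).trim());

            PriorityQueue<Integer> heap =
                    heaps.computeIfAbsent(key, unused -> new PriorityQueue<>());
            heap.add(value);
            if (heap.size() > k) {
                heap.poll(); // drop the smallest so only the K largest remain
            }
        }

        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, PriorityQueue<Integer>> entry : heaps.entrySet()) {
            List<Integer> values = new ArrayList<>(entry.getValue());
            Collections.sort(values); // heap iteration order is not sorted
            result.put(entry.getKey(), values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "\"a\",1", "\"b\",1", "\"a\",2", "\"a\",5",
                "\"b\",17", "\"c\",5", "\"b\",6");
        // Print one line per key, followed by its list of top values.
        topK(lines, 2).forEach((key, values) -> System.out.println(key + " " + values));
    }
}
```

      Bounding each heap at K entries keeps memory at O(K) per key. In a MapReduce version, the same per-key heap could presumably live in the reducer, but that is an assumption about the design rather than a description of the attachment.
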
      Compile steps:
      mvn clean
      mvn package javadoc:javadoc

      Run steps:

      hadoop jar <ranking jar file> <main class> <K> <input directory> <output directory>
      e.g. hadoop jar target/ranking-1.0-SNAPSHOT.jar org.ml.MaxKRanker 5 data/input data/output

      I wanted to know whether there is a component (examples, maybe) to which this code could be contributed. I am also open to any suggestions for improvement.

      Thanks,
      Arnab

      Attachments

        1. k-ranking.tgz
          76 kB
          Arnab

          People

            Assignee: Unassigned
            Reporter: Arnab (arnabguin)