  1. Nutch
  2. NUTCH-1256

WebGraph to dump host + score

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.5
    • Component/s: None
    • Labels: None

    Description

      WebGraph's NodeDumper tool can dump url,score information, but a host|domain,score output would also be useful. This is likely to require a new MapReduce job, as the NodeDumper's anatomy is not suited to returning max or summed scores. The code could also be merged with the existing tool.
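
      The aggregation described above can be sketched in plain Java, without any Hadoop or Nutch classes. This is only an illustration of the idea, not the code in the attached patch: the class name HostScoreSketch and the methods hostOf and aggregate are hypothetical, and an in-memory map stands in for the MapReduce job. The "map" step keys each score by host, and the "reduce" step folds scores per host with either max or sum.

```java
import java.net.URI;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Hypothetical sketch: group url,score pairs by host and combine the scores.
public class HostScoreSketch {

    // "Map" step: derive the grouping key (the host) from a URL.
    static String hostOf(String url) {
        String host = URI.create(url).getHost();
        return host == null ? "" : host.toLowerCase();
    }

    // "Reduce" step: fold all scores that share a host using the given
    // combiner, e.g. Float::max for max scores or Float::sum for summed scores.
    static Map<String, Float> aggregate(Map<String, Float> urlScores,
                                        BinaryOperator<Float> combiner) {
        Map<String, Float> hostScores = new TreeMap<>();
        for (Map.Entry<String, Float> entry : urlScores.entrySet()) {
            hostScores.merge(hostOf(entry.getKey()), entry.getValue(), combiner);
        }
        return hostScores;
    }

    public static void main(String[] args) {
        // Made-up example data, not taken from a real WebGraph.
        Map<String, Float> urlScores = new TreeMap<>();
        urlScores.put("http://example.org/a", 0.5f);
        urlScores.put("http://example.org/b", 0.25f);
        urlScores.put("http://example.com/", 1.0f);

        System.out.println("max: " + aggregate(urlScores, Float::max));
        System.out.println("sum: " + aggregate(urlScores, Float::sum));
    }
}
```

      Passing the combiner as a parameter keeps the max and sum variants in one code path. Grouping by domain instead of host would only require a different key function.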

      Attachments

        1. NUTCH-1256-1.5-1.patch
          8 kB
          Markus Jelsma

            People

              Assignee: Markus Jelsma (markus17)
              Reporter: Markus Jelsma (markus17)