  1. Apache Gora
  2. GORA-23

Limit result set in store reads

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1-incubating
    • Component/s: storage
    • Labels: None
    • Environment: MySQL

    Description

      Once again, however large our system's capacity, the amount of RAM is limited. Sooner or later we will run out of memory.

      Please refer to http://techvineyard.blogspot.com/2010/12/build-nutch-20.html#Gora for the description of the issue:

      When MySQL is the Gora backend, the parse command hangs and then crashes with an out-of-memory error. The cause is this query:

      SELECT id,content,status,outlinks,baseUrl,typ,parseStatus,metadata,signature,markers FROM webpage;

      This is exactly the same issue as GORA-20, except that we are reading from the store rather than writing to it. Currently the code loads the entire webpage table into memory. We want to limit the number of rows fetched by each call that pulls data from the database.
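
To illustrate the kind of limit requested here, the following is a minimal sketch of page-by-page reading, in which at most one page of rows is held in memory at a time. It is not the change made in gora.patch. The class `PagedReads`, the `PageFetcher` callback, and the `ORDER BY id` clause are assumptions made for this example. In a real store the callback would run the bounded query through JDBC; here it is left abstract so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of Gora: reads a table one bounded page at a time.
final class PagedReads {

    /** Hypothetical callback: returns at most `limit` rows starting at `offset`. */
    interface PageFetcher {
        List<String> fetch(int limit, long offset);
    }

    /** Builds a bounded query; ORDER BY keeps the pages stable between calls. */
    static String pageQuery(String columns, String table, int limit, long offset) {
        if (limit <= 0 || offset < 0) {
            throw new IllegalArgumentException("limit must be > 0 and offset >= 0");
        }
        return "SELECT " + columns + " FROM " + table
                + " ORDER BY id LIMIT " + limit + " OFFSET " + offset;
    }

    /** Iterates over all rows while holding at most one page in memory. */
    static Iterator<String> rows(PageFetcher fetcher, int pageSize) {
        return new Iterator<String>() {
            private List<String> page = new ArrayList<>();
            private int index = 0;
            private long offset = 0;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                if (index < page.size()) {
                    return true;
                }
                if (exhausted) {
                    return false;
                }
                // Current page is consumed: fetch the next bounded page.
                page = fetcher.fetch(pageSize, offset);
                offset += page.size();
                index = 0;
                // A short (or empty) page means the table has no more rows.
                exhausted = page.size() < pageSize;
                return !page.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(index++);
            }
        };
    }
}
```

With a page size of 1000, the first query built by `pageQuery` for the columns `id,content` would read `SELECT id,content FROM webpage ORDER BY id LIMIT 1000 OFFSET 0`. Large offsets can become slow on big tables, so a filter on the last key seen (`WHERE id > ?`) or a streaming result set in the JDBC driver may be a better fit; which of these suits Gora is left open here.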

      Attachments

        1. gora.patch
          6 kB
          Alexis
        2. gora.patch
          4 kB
          Alexis
        3. mapred-site.xml
          0.5 kB
          Alexis

        Activity

          People

            Assignee: Unassigned
            Reporter: Alexis (alexis779)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: