Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
- MySQL
Description
Once again, whatever the capacity of our system, the amount of RAM is limited, so sooner or later we will run out of memory.
Please refer to http://techvineyard.blogspot.com/2010/12/build-nutch-20.html#Gora for the description of the issue:
When MySQL is used as the Gora backend, the parse command hangs and then crashes with an out-of-memory error because of this query:
SELECT id,content,status,outlinks,baseUrl,typ,parseStatus,metadata,signature,markers FROM webpage;
We are running into exactly the same issue as GORA-20, except that we are reading from the store rather than writing to it. Currently the code loads the entire webpage table into memory. We want to set a limit on the call that pulls data from the database.
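
One possible way to bound the read is to fetch the table one page at a time instead of in a single SELECT. The sketch below is only an illustration under stated assumptions, not the fix that was actually committed: the class `PagedWebpageScan`, the `RowHandler` interface, and the `pageSize` parameter are hypothetical names, and it assumes `id` is a unique, non-null string key so it can serve as the paging cursor.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; not part of Gora or Nutch.
class PagedWebpageScan {

    // Column list copied from the query quoted in this issue.
    static final String COLUMNS =
        "id,content,status,outlinks,baseUrl,typ,parseStatus,metadata,signature,markers";

    /** Hypothetical callback that receives the ResultSet positioned on each row. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /** Query for one page: rows whose id sorts after the cursor, at most LIMIT rows. */
    static String pageQuery() {
        return "SELECT " + COLUMNS + " FROM webpage WHERE id > ? ORDER BY id LIMIT ?";
    }

    /**
     * Visits every row of webpage, fetching at most pageSize rows per query.
     * Assumption: id is a unique, non-null, non-empty string key.
     * Returns the number of rows visited.
     */
    static long scan(Connection conn, int pageSize, RowHandler handler) throws SQLException {
        String lastId = "";   // the empty string sorts before any non-empty id
        long total = 0;
        try (PreparedStatement ps = conn.prepareStatement(pageQuery())) {
            while (true) {
                ps.setString(1, lastId);
                ps.setInt(2, pageSize);
                int rowsInPage = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getString("id");   // advance the cursor
                        handler.handle(rs);
                        rowsInPage++;
                    }
                }
                total += rowsInPage;
                if (rowsInPage < pageSize) {
                    return total;   // a short page means the table is exhausted
                }
            }
        }
    }
}
```

With this approach only one page of rows is buffered by the driver at any time, so memory use depends on `pageSize` rather than on the size of the table. Another option with MySQL Connector/J is to keep the single query but create the statement as forward-only and read-only with a fetch size of `Integer.MIN_VALUE`, which makes the driver stream rows instead of buffering the whole result set.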