CONNECTORS-1555: Apache ManifoldCF 2.5 Job Scheduling Issues

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: ManifoldCF 2.5
    • Fix Version/s: None
    • Component/s: Solr 6.x component
    • Labels: None

    Description

      Hi,

      My team is using Apache ManifoldCF 2.5 with SolrCloud to index data. We currently have 400 jobs that need to run simultaneously. We index JSON data using the File System connector, with PostgreSQL as the backend database.

      We are facing several issues:
      1. Scheduling works for some jobs but not for others.
      2. Some jobs complete, while others hang and never complete.
      3. Earlier, a single job indexed 60,000 documents in 15 minutes; now even a path containing 5 documents takes 20 minutes, or sometimes never completes.
      4. The "List all jobs" page sometimes fails to load. In pg_stat_activity we see that 2 queries are in a waiting state, which prevents the page from loading. If we kill those queries or restart ManifoldCF, the issue is resolved and the page loads properly.
      Queries getting stuck:
      1. SELECT ID,FAILTIME, FAILCOUNT, SEEDINGVERSION, STATUS FROM JOBS WHERE (STATUS=$1 OR STATUS=$2) FOR UPDATE
      2. UPDATE JOBS SET ERRORTEXT=NULL, ENDTIME=NULL, WINDOWEND=NULL, STATUS=$1 WHERE ID=$2
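
      For reference, the waiting sessions described in item 4 can be listed with a query along these lines. This is only a sketch: it assumes PostgreSQL 9.6 or later (where pg_stat_activity has wait_event_type and pg_blocking_pids() exists), and the server version is not stated in this report. On older servers the boolean "waiting" column would have to be used instead.

```sql
-- Sketch only; assumes PostgreSQL 9.6+ (server version is not given in this report).
-- Lists sessions that are waiting on a heavyweight lock and the PIDs blocking them.
SELECT a.pid,
       a.state,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid)  AS blocked_by,    -- PIDs holding the conflicting locks
       now() - a.query_start    AS running_for,   -- how long the statement has been running
       a.query
FROM pg_stat_activity AS a
WHERE a.wait_event_type = 'Lock'
ORDER BY a.query_start;
```

      The blocked_by column shows which sessions hold the conflicting locks, which may help tell whether the SELECT ... FOR UPDATE and the UPDATE on JOBS are blocking each other or are both waiting on a third session.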

      Note: we have deployed ManifoldCF on Linux.

      Please help us fine-tune ManifoldCF so that it runs smoothly and reliably.

      Thanks in advance. Looking forward to your solution.

          People

            Assignee: Unassigned
            Reporter: VVR Pavan Kumar (pavankvvr)