ManifoldCF / CONNECTORS-430

An error should be returned if invalid seeds are typed into the seeds list for the web connector

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3, ManifoldCF 0.4, ManifoldCF 0.5
    • Fix Version/s: ManifoldCF 0.6
    • Component/s: Web connector
    • Labels: None

    Description

      If you create a job for the web connector and enter an invalid URL into the seeds list, any value is accepted. An error message should be returned to the user in order to prevent invalid seeds.

      Attachments

        1. CONNECTORS-430.patch
          3 kB
          Karl Wright
        2. CONNECTORS-430.patch
          3 kB
          Erlend Garåsen

        Activity

          erlendfg Erlend Garåsen added a comment -

          This can be fixed with JavaScript that validates the URLs. I have added the following to editjob.jsp (crawler-ui), which seems to work. The code is not finished, since some URLs (e.g. localhost) are not accepted.

          function checkSeedsList()
          {
            if (typeof(editjob.seeds) != "undefined")
            {
              var regexp = /http(s)?:\/\/[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?/;
              var lines = editjob.seeds.value.toString().split("\n");
              var invalidUrlList = "";
              for (var i = 0; i < lines.length; i++) {
                if (lines[i].length > 0 && !regexp.test(lines[i]))
                {
                  invalidUrlList += lines[i] + "\n";
                }
              }
              if (invalidUrlList.length > 0)
              {
                alert(<%=Messages.getString(pageContext.getRequest().getLocale(),"editjob.InvalidUrlInSeedsList")%> + "\n" + invalidUrlList);
                editjob.seeds.focus();
                return false;
              }
            }
            return true;
          }

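To see which seeds the pattern in the comment above rejects, a minimal sketch can run it outside the JSP. The regular expression is copied from the comment; the helper name `rejectedSeeds` and the sample input are hypothetical, and the sketch targets a regular JavaScript engine rather than the browser simulator.

```javascript
// Pattern copied from the comment above; everything else is illustrative.
var seedPattern = /http(s)?:\/\/[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?/;

// Hypothetical helper: returns the non-empty lines that the pattern does not match.
function rejectedSeeds(seedsText) {
  var lines = seedsText.split("\n");
  var rejected = [];
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].length > 0 && !seedPattern.test(lines[i])) {
      rejected.push(lines[i]);
    }
  }
  return rejected;
}

// Sample input (an assumption): one ordinary domain, localhost, an IP address, plain text.
var sample = "http://www.uio.no/\nhttp://localhost:8345/mcf-crawler-ui/\nhttp://127.0.0.1/\nnot a url";
console.log(rejectedSeeds(sample));
```

With this input, the localhost URL and the IP address are rejected along with the plain text, because the host part must end in a dot followed by two to four letters. The pattern is also unanchored, so a line only needs to contain a matching URL somewhere to pass.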
          kwright@metacarta.com Karl Wright added a comment -

          Some comments.

          First, it is not a good idea to include quotation marks in translated messages. The quotation marks belong as part of the Javascript.

          Second, the following Javascript constructs are unsupported by the browser simulator:

          • typeof() - it looks like you could just remove this, however.
          • toString() - once again, looks like this could be just removed.
          • split() - easy to add.
          • the "+=" operator - can be added, or you might more readily just rephrase the expression to not use it
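One way to follow both comments is sketched below, under stated assumptions: the check avoids typeof(), toString() and the "+=" operator, and the message is passed in as plain text so that the quotation marks live in the JavaScript source rather than in the translated string. The function names and the sample message are hypothetical. split() is kept because the comment describes it as easy to add, and nothing here confirms that the simulator accepts the remaining constructs.

```javascript
// Hypothetical rewrite of the seed check without typeof(), toString() or "+=".
function collectInvalidSeeds(seedsValue, pattern) {
  var lines = seedsValue.split("\n");
  var invalidUrlList = "";
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].length > 0 && !pattern.test(lines[i])) {
      // Plain concatenation instead of the "+=" operator.
      invalidUrlList = invalidUrlList + lines[i] + "\n";
    }
  }
  return invalidUrlList;
}

// The translated message contains no quotation marks of its own;
// the caller supplies it as an ordinary JavaScript string.
function buildAlertText(message, invalidUrlList) {
  return message + "\n" + invalidUrlList;
}

// Usage with a placeholder pattern and a placeholder message (both assumptions).
var invalid = collectInvalidSeeds("http://www.uio.no/\nfoo\n\nbar", /^https?:\/\//);
console.log(buildAlertText("Invalid URLs in seeds list:", invalid));
```

In the JSP, this would correspond to wrapping the message expression in quotes within the script itself instead of storing the quotes in the message bundle.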

          erlendfg Erlend Garåsen added a comment -

          If we shouldn't include quotation marks in translated messages, we should change similar occurrences as well, for instance:
          editjob.ExpirationIntervalMustBeAValidIntegerOrNull="Expiration interval must be a valid integer or null"

          Do you think we should create another ticket about these quotation marks as well?

          Is it difficult to add more functionality to the browser simulator, or is it a better approach to follow some strict rules when writing JavaScript? If there are so many restrictions, I think it is a good idea to provide a list of safe JavaScript functions to use. My apologies if such a list exists.

          It would be great if the browser simulator could support the split function. Otherwise the JavaScript code will be harder to read because it needs many more lines.

          kwright@metacarta.com Karl Wright added a comment -

          If we shouldn't include quotation marks in translated messages, we should change similar occurrences as well, for instance:

          editjob.ExpirationIntervalMustBeAValidIntegerOrNull="Expiration interval must be a valid integer or null"

          Do you think we should create another ticket about these quotation marks as well?

          Yes, I thought I'd fixed all of those, but I must have missed some.

          kwright@metacarta.com Karl Wright added a comment -

          Is it difficult to add more functionality to the browser simulator or is it a better approach to follow some strict rules when writing JavaScript? If there are so many restrictions, I think it is a good idea to provide a list of safe JavaScript functions to use. My apologies if such a list exists.

          I haven't created a list, because one adds things to the simulator as one needs them. If you have time you can certainly look at it. In general it is not difficult to add certain things (methods of already existing objects), but it is harder to add others (such as whole new operations).

          kwright@metacarta.com Karl Wright added a comment -

          It would be great if the browser simulator could support the split function. Otherwise the JavaScript code will be harder to read due to a lot more lines of code.

          I can try adding that this evening.


          erlendfg Erlend Garåsen added a comment -

          I can try adding that this evening.

          Great! It is possible to rewrite everything in my JavaScript code except the split function, so this will make it much easier and the code will be more readable.

          kwright@metacarta.com Karl Wright added a comment -

          Good news; I added it a while ago and forgot about it. So you should be all set.


          erlendfg Erlend Garåsen added a comment -

          Adds an error message with a list of invalid URLs in the seeds list, based on JavaScript.


          erlendfg Erlend Garåsen added a comment -

          I have created a patch which adds URL validation to the seeds list. The regular expression for URL validation is not perfect, but IP addresses and localhost are accepted in addition to domains. If it looks OK, I can commit my changes with a Japanese translation of the error message as well.

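The pattern in the attached patch is not shown in this thread. As an illustration only, a pattern that accepts domains, localhost and dotted IPv4 addresses could look like the sketch below. The name `relaxedSeedPattern`, the anchoring, the case-insensitive flag and the optional port are all assumptions, and the sketch targets a regular JavaScript engine rather than the browser simulator.

```javascript
// Hypothetical pattern, NOT the one from the attached patch.
// The host is a dotted domain name, "localhost", or a dotted-quad IPv4 address,
// optionally followed by a port and a path without whitespace.
// Note: octet ranges are not checked, so 999.999.999.999 would also pass.
var relaxedSeedPattern = /^https?:\/\/([a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}|localhost|\d{1,3}(\.\d{1,3}){3})(:\d+)?(\/\S*)?$/i;

// Hypothetical helper: true when a single seed line matches the relaxed pattern.
function isAcceptableSeed(line) {
  return relaxedSeedPattern.test(line);
}

console.log(isAcceptableSeed("http://www.uio.no/"));
console.log(isAcceptableSeed("http://localhost:8345/mcf-crawler-ui/editjob.jsp"));
console.log(isAcceptableSeed("http://127.0.0.1/"));
console.log(isAcceptableSeed("not a url"));
```

Anchoring with `^` and `$` means the whole line has to be a URL, which is stricter than an unanchored test; whether the patch does the same is not stated in the thread.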
          kwright@metacarta.com Karl Wright added a comment -

          Hi Erlend,

          All looks good except the check for editjob.seeds == undefined. That's not supported. The whole conditional should be unneeded because there is always an "editjob.seeds.value" available, either as a hidden field or as a text box.


          erlendfg Erlend Garåsen added a comment -

          The whole conditional should be unneeded because there is always an "editjob.seeds.value" available, either as a hidden field or as a text box.

          I'm afraid not, and that's the reason why I added this check. Sorry, I thought it was the typeof function that wasn't supported by the browser simulator. Anyway, I'll double-check later whether this check is needed. If it is, I'll add the necessary hidden element instead.

          kwright@metacarta.com Karl Wright added a comment -

          I'm afraid not...

          All tabs are coded so that either the displayed form element is output, or instead a corresponding hidden is output. The web connector's tabs are no different. The code is:

              if (tabName.equals(Messages.getString(locale,"WebcrawlerConnector.Seeds")))
              {
                out.print(
          "<table class=\"displaytable\">\n"+
          "  <tr><td class=\"separator\" colspan=\"2\"><hr/></td></tr>\n"+
          "  <tr>\n"+
          "    <td class=\"value\" colspan=\"2\">\n"+
          "      <textarea rows=\"25\" cols=\"80\" name=\"seeds\">"+org.apache.manifoldcf.ui.util.Encoder.bodyEscape(seeds)+"</textarea>\n"+
          "    </td>\n"+
          "  </tr>\n"+
          "</table>\n"
                );
              }
              else
              {
                out.print(
          "<input type=\"hidden\" name=\"seeds\" value=\""+org.apache.manifoldcf.ui.util.Encoder.attributeEscape(seeds)+"\"/>\n"
                );
              }
          

          erlendfg Erlend Garåsen added a comment -

          If I comment out the if check, I get the following error in my Error console (Firefox):

          Timestamp: 22.03.12 19.25.43
          Error: editjob.seeds is undefined
          Source File: http://localhost:8345/mcf-crawler-ui/editjob.jsp
          Line: 243

          This happens when I press the Connection tab just after I have named my new job. I think it is due to a missing hidden element, something I will figure out tomorrow.

          kwright@metacarta.com Karl Wright added a comment -

          Ah, I see the problem - you are putting your check in the main UI javascript, not in the javascript that the web connector is supposed to provide.

          You should be editing the method WebCrawlerConnector.outputSpecificationHeader(), not editing the UI JSP.

          So please don't commit this change in its current form.


          erlendfg Erlend Garåsen added a comment -

          Thanks for clarifying. This was also the reason why I created a patch in the first place.


          erlendfg Erlend Garåsen added a comment -

          Is the replace function supported by the browser simulator? I found a problem while testing my changes. Leading and trailing whitespace should be removed so that encoded whitespace does not end up in the hidden seeds field. Otherwise, an annoying alert box pops up if you first enter a blank line in the seeds list, select another tab, and then try to select another tab again. The following seems to work inside my function:

          editjob.seeds.value = editjob.seeds.value.replace(/^\s*/, "").replace(/\s*$/, "");

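The effect of the two replace() calls can be checked in isolation. The sketch below wraps them in a hypothetical helper, `trimSeeds`, and feeds it a value that starts with a carriage return and line feed, similar to a seeds field with a leading blank line; the sample strings are assumptions.

```javascript
// Same two replace() calls as in the comment above, wrapped in a hypothetical helper.
function trimSeeds(seedsValue) {
  // The first call strips leading whitespace, the second strips trailing whitespace.
  return seedsValue.replace(/^\s*/, "").replace(/\s*$/, "");
}

// A leading blank line (CR LF) and trailing spaces are removed.
console.log(JSON.stringify(trimSeeds("\r\nhttp://www.uio.no/\n  ")));

// Blank lines between seeds are left alone; only the ends of the value are trimmed.
console.log(JSON.stringify(trimSeeds("  http://www.uio.no/\n\nhttp://localhost/  ")));
```

Because neither pattern uses the multiline flag, `^` and `$` refer to the start and end of the whole value, so interior blank lines survive and still have to be skipped by the per-line length check.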
          kwright@metacarta.com Karl Wright added a comment -

          It looks like replace is supported:

              def get_value( self, member_name ):
                  if member_name == "indexOf":
                      return JSIndexOf( self )
                  if member_name == "charAt":
                      return JSCharAt( self )
                  if member_name == "length":
                      return JSNumber( len(self.value) )
                  if member_name == "search":
                      return JSSearch( self )
                  if member_name == "replace":
                      return JSReplace( self )
                  if member_name == "split":
                      return JSSplit( self )
                  if member_name == "substring":
                      return JSSubstring( self )
                  return JSObject.get_value( self, member_name )
          

          Can you attach the trace you are seeing?


          erlendfg Erlend Garåsen added a comment -

          You mean the content of the hidden seeds field if it contains white spaces?

          <input type="hidden" name="seeds" value="&#13;&#10;http://www.uio.no/"/>

          kwright@metacarta.com Karl Wright added a comment -

          You mean the content of the hidden seeds field if it contains white spaces?

          No, I think I misunderstood you when you said:

          I found a problem while testing my changes.

          I thought by "problem" you meant that the UI test failed. Clearly you meant that you wanted to use replace, but hadn't yet.


          erlendfg Erlend Garåsen added a comment -

          Clearly you meant that you wanted to use replace, but hadn't yet.

          Exactly. I will probably commit my changes within a few hours unless I find another problem. Then I will reassign the ticket to Hitoshi so that a Japanese translation of the error message can be added.

          kwright@metacarta.com Karl Wright added a comment -

          Before you commit, can you try:

          ant run-webcrawler-UI-tests-derby

          ... to make sure the UI tests continue to pass? You will first need to install the latest version of python 2.x on your system of course.

          Thanks!


          erlendfg Erlend Garåsen added a comment -

          I will.


          erlendfg Erlend Garåsen added a comment -

          The UI tests actually failed. I will take a closer look tomorrow to find the reason. BTW, I already have Python 2.6 installed. Do you think it's necessary to upgrade to version 2.7?

          Everything seems to work fine in Firefox, Opera and Safari.

              [junit]   File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/httplib.py", line 714, in send
              [junit] TypeError: fromfile() takes exactly 2 arguments (1 given)
              [junit] 2012-03-26 21:49:26.260:INFO::Stopped SocketConnector@0.0.0.0:8346
              [junit] ------------- ---------------- ---------------
              [junit] Testcase: createConnectionsAndJob(org.apache.manifoldcf.webcrawler_tests.NavigationDerbyUI):	Caused an ERROR
              [junit] UI test failed; error code: 1
              [junit] java.lang.Exception: UI test failed; error code: 1
              [junit] 	at org.apache.manifoldcf.core.tests.HTMLTester.executeTest(HTMLTester.java:183)
              [junit] 	at org.apache.manifoldcf.webcrawler_tests.NavigationDerbyUI.createConnectionsAndJob(NavigationDerbyUI.java:282)

          kwright@metacarta.com Karl Wright added a comment -

          I have never seen this before; the tester was originally developed on python 2.3 and has worked ever since. I use 2.7 at the moment.


          erlendfg Erlend Garåsen added a comment -

          The same error occurs if I check out a fresh copy of trunk and run the UI test, so the failure is not caused by my changes. I will try to upgrade to Python 2.7 later this evening and/or try to run the same test on Linux instead of OS X.

          kwright@metacarta.com Karl Wright added a comment -

          I think it is possible that you have more than one version of python rattling around and it is picking up the wrong one. Try "which python" to see if you are getting the one you think.

          kwright@metacarta.com Karl Wright added a comment -

          You can also attach a new patch and I can see if the UI tests complete here on my system.


          erlendfg Erlend Garåsen added a comment -

          The UI test ran successfully on Linux with version 2.4.3 of Python.

          I have attached a new patch. If it looks OK and the UI test passes on your computer, I suggest that I commit my changes and later try to find out why these tests do not run on my Mac. I don't think I have several versions of Python installed.

          erlend-garasens-macbook-pro:~ erlendfg$ which python
          /usr/bin/python

          Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
          [GCC 4.2.1 (Apple Inc. build 5646)] on darwin

          kwright@metacarta.com Karl Wright added a comment - edited

          For me it fails:

             [junit] Multipart posting url 'http://localhost:8346/mcf-crawler-ui/execute.jsp' with parameters 'outputname=MyOutputConnection&index=&recrawlinterval=1440&description=MyJob&startmethod=2&expirationinterval=&jobid=1332885945254&priority=5&reseedinterval=60&tabname=Connection&connectionname=MyRepositoryConnection&schedulerecords=0&scheduletype=1&type=job&op=Continue' and 0 files...
              [junit] Traceback (most recent call last):
              [junit]   File "test.py", line 166, in <module>
              [junit]     var124.click()
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 120, in click
              [junit]     self.get_form( ).execute_javascript_expression( self.onclick )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 672, in execute_javascript_expression
              [junit]     return self.window_instance.execute_javascript_expression( javascript )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 920, in execute_javascript_expression
              [junit]     return tokenstream.evaluate_expr( self.jscontext, "HTML" )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1085, in evaluate_expr
              [junit]     rval = self.evaluate_expr1( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1120, in evaluate_expr1
              [junit]     rval = self.evaluate_expr2( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1164, in evaluate_expr2
              [junit]     return self.evaluate_expr3( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1176, in evaluate_expr3
              [junit]     rval = self.evaluate_expr4( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1264, in evaluate_expr4
              [junit]     rval = self.evaluate_expr5( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1299, in evaluate_expr5
              [junit]     rval = self.evaluate_expr6( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1341, in evaluate_expr6
              [junit]     return self.evaluate_expr7( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1438, in evaluate_expr7
              [junit]     return reference_object.call( arguments, context )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 554, in call
              [junit]     return self.get_referenced_object().call(argset,context)
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 186, in call
              [junit]     response = ts.evaluate_statement( context, "method %s" % self.name )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 752, in evaluate_statement
              [junit]     result = self.evaluate_statement( newscope, place )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 988, in evaluate_statement
              [junit]     if self.evaluate_expr( context, place ) == None:
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1085, in evaluate_expr
              [junit]     rval = self.evaluate_expr1( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1120, in evaluate_expr1
              [junit]     rval = self.evaluate_expr2( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1164, in evaluate_expr2
              [junit]     return self.evaluate_expr3( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1176, in evaluate_expr3
              [junit]     rval = self.evaluate_expr4( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1264, in evaluate_expr4
              [junit]     rval = self.evaluate_expr5( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1299, in evaluate_expr5
              [junit]     rval = self.evaluate_expr6( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1341, in evaluate_expr6
              [junit]     return self.evaluate_expr7( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1438, in evaluate_expr7
              [junit]     return reference_object.call( arguments, context )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 554, in call
              [junit]     return self.get_referenced_object().call(argset,context)
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 186, in call
              [junit]     response = ts.evaluate_statement( context, "method %s" % self.name )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 752, in evaluate_statement
              [junit]     result = self.evaluate_statement( newscope, place )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 810, in evaluate_statement
              [junit]     rval = self.evaluate_statement( context, place )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 752, in evaluate_statement
              [junit]     result = self.evaluate_statement( newscope, place )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 988, in evaluate_statement
              [junit]     if self.evaluate_expr( context, place ) == None:
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1085, in evaluate_expr
              [junit]     rval = self.evaluate_expr1( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1120, in evaluate_expr1
              [junit]     rval = self.evaluate_expr2( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1164, in evaluate_expr2
              [junit]     return self.evaluate_expr3( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1176, in evaluate_expr3
              [junit]     rval = self.evaluate_expr4( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1264, in evaluate_expr4
              [junit]     rval = self.evaluate_expr5( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1299, in evaluate_expr5
              [junit]     rval = self.evaluate_expr6( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1341, in evaluate_expr6
              [junit]     return self.evaluate_expr7( context, place, parse_only )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1438, in evaluate_expr7
              [junit]     return reference_object.call( arguments, context )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 554, in call
              [junit]     return self.get_referenced_object().call(argset,context)
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1291, in call
              [junit]     self.form_instance.submit( )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 706, in submit
              [junit]     self.window_instance.execute_action( self.method, variables, files, self.action_url )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 928, in execute_action
              [junit]     return self.browser_instance.execute_action( self.window_name, method, parameters, files, self.resolve( url ) )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1069, in execute_action
              [junit]     self.reload_window( window_name, window_data, url )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1032, in reload_window
              [junit]     self.build_window( window_name, window_data, old_window.get_parent_window( ), full_url, old_window.get_dialog_answers( ) )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1038, in build_window
              [junit]     new_window = VirtualWindow( self, window_name, window_data, parent_window, current_url )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 785, in __init__
              [junit]     parser.feed( data )
              [junit]   File "/usr/lib/python2.7/HTMLParser.py", line 109, in feed
              [junit]     self.goahead(0)
              [junit]   File "/usr/lib/python2.7/HTMLParser.py", line 153, in goahead
              [junit]     k = self.parse_endtag(i)
              [junit]   File "/usr/lib/python2.7/HTMLParser.py", line 327, in parse_endtag
              [junit]     self.handle_endtag(tag.lower())
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1497, in handle_endtag
              [junit]     self.end_script( )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 1823, in end_script
              [junit]     self.window_instance.accept_javascript( javascript_text )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 965, in accept_javascript
              [junit]     jstokens.evaluate_statement_list( self.jscontext )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 720, in evaluate_statement_list
              [junit]     self.evaluate_statement( context, place )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 979, in evaluate_statement
              [junit]     self.skip_statement( )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1017, in skip_statement
              [junit]     self.skip_statement( )
              [junit]   File "/home/kwright/wip/trunk/tests/webcrawler/test-derby-output/Javascript.py", line 1053, in skip_statement
              [junit]     raise Exception("Unexpected end of statement; need semicolon")
              [junit] Exception: Unexpected end of statement; need semicolon
              [junit] 2012-03-27 18:05:50.226:INFO::Stopped SocketConnector@0.0.0.0:8346
              [junit] ------------- ---------------- ---------------
              [junit] Testcase: createConnectionsAndJob(org.apache.manifoldcf.webcrawler_tests.NavigationDerbyUI):	Caused an ERROR
              [junit] UI test failed; error code: 1
              [junit] java.lang.Exception: UI test failed; error code: 1
              [junit] 	at org.apache.manifoldcf.core.tests.HTMLTester.executeTest(HTMLTester.java:183)
              [junit] 	at org.apache.manifoldcf.webcrawler_tests.NavigationDerbyUI.createConnectionsAndJob(NavigationDerbyUI.java:282)
              [junit] 
              [junit] 
          
          BUILD FAILED
          

          This looks like a Javascript syntax issue - there's a missing semicolon.

          kwright@metacarta.com Karl Wright added a comment -

          Further research shows that the problem is due to the tester and is twofold:

          (1) Only while loops are supported
          (2) The ++ operator is not supported

          So if you change the for loop to a while, and use i = i+1 instead of i++, the test should pass.
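A minimal sketch of what that rewrite could look like, assuming the tester limitations are exactly the two listed above. The helper name `collectInvalidSeeds` and its parameters are hypothetical and only for illustration; this is not the committed code. It keeps the logic of the original loop but uses `while` and `i = i + 1`:

```javascript
// Hypothetical helper, not the committed code: returns the seed lines that
// do not match the given pattern, one per line.
// Uses only a while loop and "i = i + 1" (no for loop, no ++ operator).
function collectInvalidSeeds(seedsText, regexp)
{
	var lines = seedsText.split("\n");
	var invalidUrlList = "";
	var i = 0;
	while (i < lines.length)
	{
		// Empty lines are skipped, as in the original check.
		if (lines[i].length > 0 && !regexp.test(lines[i]))
		{
			invalidUrlList += lines[i] + "\n";
		}
		i = i + 1;
	}
	return invalidUrlList;
}
```

In the page itself the result would still feed the existing alert() and focus() calls; passing the pattern in as an argument just keeps the sketch independent of which regular expression ends up being used.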

          kwright@metacarta.com Karl Wright added a comment -

          After fixing the above problems, I'm still getting errors. I improved the error output and now I see this:

          [junit] Exception: Unexpected end of statement; unknown tokens: '['Symbol: var', 'Symbol: regexp', 'Punctuation: =', 'Regexp: http(s)?:\\/\\/([a-z0-9+!*(),;?&=\\$_.-]+(\\:[a-z0-9+!*(),;?&=\\$_.-]+)?@)?[a-z0-9+\\$_-]+(\\.[a-z0-9+\\$_-]+)*(\\:[0-9]{2,5})?(\\/([a-z0-9+\\$_-]\\.?)+)*\\/?(\\?[a-z+&\\$_.-][a-z0-9;:@()', 'Punctuation: &', 'Punctuation: %', 'Punctuation: =', 'Punctuation: +', 'Punctuation: \\', 'Punctuation: $', 'Symbol: _', 'Punctuation: .', 'Punctuation: -', 'Punctuation: ]', 'Punctuation: *', 'Punctuation: )', 'Punctuation: ?', 'Punctuation: (', 'Punctuation: #', 'Punctuation: [', 'Symbol: a', 'Punctuation: -', 'Symbol: z_', 'Punctuation: .', 'Punctuation: -', 'Punctuation: ]', 'Punctuation: [', 'Symbol: a', 'Punctuation: -', 'Symbol: z0', 'Punctuation: -', 'Int: 9', 'Punctuation: +', 'Punctuation: \\', 'Punctuation: $', 'Symbol: _', 'Punctuation: .', 'Punctuation: -', 'Punctuation: ]', 'Punctuation: *', 'Punctuation: )', 'Punctuation: ?', 'Regexp: ;\n  var lines = editjob.seeds.value.split("\\n");\n  var ...
          

          It looks like the escaping of the large regular expression is incorrect, or the parsing of the regular expression is incorrect in the tester. Looking further...

          kwright@metacarta.com Karl Wright added a comment -

          Fixed a problem in the tester. Now I get this, which could well be just a test error:

              [junit] ALERT: Invalid URLs in seeds list:n.comn
              [junit] FOCUS: On field 'seeds'
          
          kwright@metacarta.com Karl Wright added a comment -

          It's not a test error. The test line uses a full URL, which is not appropriately flagged:

          textarea.setValue(testerInstance.createStringDescription("http://www.cnn.com"));

          I'm going to attach the patch as I currently have it and debug some more tomorrow.


          erlendfg Erlend Garåsen added a comment -

          Thanks, I will also try to debug some more today if I get sufficient time. By the way, the === operator might not be supported either? I can change it to == if that helps.
          kwright@metacarta.com Karl Wright added a comment -

          I changed the === to == already in the updated patch. The remaining problem is not parsing related, but rather related to the regular expression, I think. Either the tester is not processing the regular expression properly or the actual regular expression is incorrect. But in any case I won't be able to look at this again until tonight.


          erlendfg Erlend Garåsen added a comment -

          I got rid of the annoying error after upgrading to Python 2.7.2. Now I get a different error message, but I'm afraid it is not related to an invalid regular expression. I simplified the regexp just in case, but it still fails.

          var regexp = /http(s)?:\/\/.*/;

          I think the problem is related to the variable declaration above, i.e. the browser simulator thinks it is an invalid variable declaration.

              [junit]   File "/Users/erlendfg/tmp/mcf_2012/tests/webcrawler/test-derby-output/Javascript.py", line 790, in evaluate_statement
              [junit]     raise Exception("Didn't find expected ';' at end of var statement, saw %s, in %s" % (unicode(token),place))
              [junit] Exception: Didn't find expected ';' at end of var statement, saw Punctuation: ., in method check_seedsList
              [junit] 2012-03-30 15:40:22.530:INFO::Stopped SocketConnector@0.0.0.0:8346

          My intention is to solve the problem before I fly to Azerbaijan tomorrow morning.
          kwright@metacarta.com Karl Wright added a comment -

          Did you svn update? I fixed several issues in the simulator a while back, and this was one of them.


          erlendfg Erlend Garåsen added a comment -

          I have now done an svn up and noticed a few things. If I keep the simplified regexp mentioned above, I get the following error:

              [junit]   File "/Users/erlendfg/tmp/mcf_2012/tests/webcrawler/test-derby-output/VirtualBrowser.py", line 870, in find_button
              [junit]     raise Exception("Can't find button %s on page %s" % (alt,self.current_url))
              [junit] Exception: Can't find button Add url regexp on page http://localhost:8346/mcf-cr ...

          If I remove the regexp var and do the following:

          if (! /http(s)?:\/\/.*/.test(line))

          I get:

              [junit]   File "/Users/erlendfg/tmp/mcf_2012/tests/webcrawler/test-derby-output/Javascript.py", line 72, in get_referenced_object
              [junit]     raise Exception("Object %s has no legal object reference" % unicode(self))
              [junit] Exception: Object <Javascript.JSRegexp instance at 0x1005a03b0> has no legal object reference
              [junit] 2012-03-30 18:12:52.429:INFO::Stopped SocketConnector@0.0.0.0:8346

          If I comment out the regexp check (two lines), the test passes. I guess something about the test() function does not play well with the browser simulator.
          kwright@metacarta.com Karl Wright added a comment -

          The first error is the same as what I get, but if you go back further in the output of the test you will see this:

          [junit] ALERT: Invalid URLs in seeds list:n.comn
              [junit] FOCUS: On field 'seeds'
          

          That's the smoking gun of what is actually going wrong: the page is not reloading because there's an alert (which outputs that busted message) and there's a subsequent focus(), but no submit occurs. So the problem is that the regexp is not matching.

          About the only way to chase this down will be to dump the regexp as the browser simulator sees it during the "test" method. The appropriate place to look at the code is in framework/core/src/test/resources/org/apache/manifoldcf/core/tests/Javascript.py.
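For comparison, here is a small check that can be run in any standard JavaScript engine, independent of the browser simulator. It is only a sketch: the pattern is the simplified regexp from earlier in this thread and the URL is the one from the test line; the variable names are made up for illustration.

```javascript
// Simplified pattern and test URL, both taken from this thread.
var regexp = /http(s)?:\/\/.*/;
var seed = "http://www.cnn.com";

// The two constructions discussed above: test() called through a variable,
// and test() called directly on the regexp literal.
var matchesViaVariable = regexp.test(seed);
var matchesViaLiteral = /http(s)?:\/\/.*/.test(seed);

console.log(matchesViaVariable, matchesViaLiteral); // prints: true true
```

If a standard engine reports a match for both forms, a failure to match in the UI test would point at how the simulator parses or evaluates the regexp rather than at the pattern. Note that this simplified pattern is unanchored, so it also accepts any line that merely contains "http://" somewhere.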

          kwright@metacarta.com Karl Wright added a comment -

          I just added another fix to Javascript.py that should allow your second syntactical construction above to work properly.

          kwright@metacarta.com Karl Wright added a comment -

          After yet a third fix to the virtual browser, the test passed, so I committed the code.

          r1309631


          erlendfg Erlend Garåsen added a comment -

          Great! Perhaps you should reassign the issue to Hitoshi in order to add the Japanese error message before we close it.
          kwright@metacarta.com Karl Wright added a comment -

          Add Japanese translation


          erlendfg Erlend Garåsen added a comment -

          Japanese translation added and tested.
          r1325649

          People

            erlendfg Erlend Garåsen
            erlendfg Erlend Garåsen