Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.4, 1.5
    • Fix Version/s: None
    • Component/s: protocol
    • Labels: None
    • Patch Info: Patch Available

    Description

      Some URLs consistently time out when fetched with protocol-http but not with protocol-httpclient; the cause is unknown. The stack trace is always the same:

      2012-04-20 11:25:44,275 ERROR http.Http - Failed to get protocol output
      java.net.SocketTimeoutException: Read timed out
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.read(SocketInputStream.java:129)
              at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
              at java.io.FilterInputStream.read(FilterInputStream.java:116)
              at java.io.PushbackInputStream.read(PushbackInputStream.java:169)
              at java.io.FilterInputStream.read(FilterInputStream.java:90)
              at org.apache.nutch.protocol.http.HttpResponse.readPlainContent(HttpResponse.java:228)
              at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:157)
              at org.apache.nutch.protocol.http.Http.getResponse(Http.java:64)
              at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
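
The trace above ends in a blocking socket read that never receives data. As a rough way to reproduce that failure mode outside Nutch, the sketch below sends a bare HTTP/1.0 GET over a plain socket with a read timeout and reports whether the first read times out. It is not the protocol-http implementation: the class name `ReadTimeoutDemo`, both helper methods, and the request headers are assumptions made for illustration, and the local "silent server" only stands in for a remote host that accepts the connection but does not answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Nutch.
class ReadTimeoutDemo {

    /**
     * Sends a minimal HTTP/1.0 GET and tries to read the first response byte.
     * Returns true if the read hit the socket timeout, false if a byte or
     * end-of-stream arrived first.
     */
    static boolean readTimesOut(String host, int port, String path, int timeoutMillis)
            throws IOException {
        try (Socket socket = new Socket(host, port)) {
            // SO_TIMEOUT bounds each blocking read on this socket.
            socket.setSoTimeout(timeoutMillis);

            // Assumed request shape; the real plugin sends more headers.
            String request = "GET " + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            try {
                in.read(); // blocks until data, end-of-stream, or timeout
                return false;
            } catch (SocketTimeoutException e) {
                return true; // same exception type as in the trace above
            }
        }
    }

    /**
     * Runs readTimesOut against a local listener that never accepts or
     * answers, so the client read can only end in a timeout.
     */
    static boolean demoAgainstSilentServer(int timeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            return readTimesOut(loopback.getHostAddress(), server.getLocalPort(),
                    "/", timeoutMillis);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + demoAgainstSilentServer(500));
    }
}
```

Calling `readTimesOut` with one of the affected hosts, port 80, and a timeout matching the crawler's configured value would show whether the timeout also occurs with a plain socket, which helps separate server behaviour from plugin behaviour.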
      

      Some example URLs:

      Attachments

        1. NUTCH-1342-1.6-1.patch (1 kB, Markus Jelsma)

            People

              Assignee: markus17 Markus Jelsma
              Reporter: markus17 Markus Jelsma
              Votes: 0
              Watchers: 5