Pig / PIG-1824

Support import modules in Jython UDF

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0, 0.9.0
    • Fix Version/s: 0.10.0
    • None
    • None
    • Reviewed
    • Module import state is captured before and after the user code is executed. The modules resolved in between are inspected and added to the pigContext, and then added to the job jar.

      This patch addresses the following import modes:
      - import re, which will (if configured) find re on the filesystem in the Jython install root
      - import foo (where foo can in turn import bar), which now works provided bar is resolvable via JYTHON_HOME, JYTHONPATH, the current directory, etc.
      - from pkg import *, which works when the cachedir is writable
      - import non.jvm.class, which works when the cachedir is writable
      - the directly imported module may use schema decorators, but recursively imported modules cannot until PIG-1943 is addressed
    • jython, import
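
The before/after comparison of import state described in the note above can be sketched in plain Python. This is only an illustration of the idea, not the patch's actual code (which lives in Pig's script engine): the function name `modules_imported_by` is hypothetical, and shipping the files to a job jar is left out.

```python
import sys


def modules_imported_by(source, filename="<udf>"):
    """Run `source` and return {module name: file path} for modules it newly imported."""
    before = set(sys.modules)            # import state before the user code runs
    namespace = {"__name__": "__udf__"}  # hypothetical module name for the script
    exec(compile(source, filename, "exec"), namespace)
    after = set(sys.modules)             # import state after the user code runs

    resolved = {}
    for name in sorted(after - before):
        module = sys.modules.get(name)
        path = getattr(module, "__file__", None)
        if path:                         # modules without a file have nothing to ship
            resolved[name] = path
    return resolved
```

Modules that were already loaded before the script ran do not appear in the result, which is why the comparison has to bracket the user code. Transitive imports (foo importing bar) are picked up as well, since they also land in `sys.modules`.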

    Description

      Currently, a Jython UDF script doesn't support the Jython import statement, as in the following example:

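      #!/usr/bin/python

      import re

      # outputSchema is the decorator Pig makes available to Jython UDF scripts
      @outputSchema("word:chararray")
      def resplit(content, regex, index):
          # Split content on the regex and return the piece at the given index
          return re.compile(regex).split(content)[index]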
      

      Can Pig automatically locate the Jython module file and ship it to the backend? Or should we add a ship clause to let the user explicitly specify the module to ship?
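
      To try the example UDF outside Pig, a stand-in for the decorator is needed, because outputSchema is normally supplied by Pig at load time. The sketch below defines a hypothetical replacement that only records the schema string on the function; it is an assumption for local testing, not Pig's implementation.

```python
import re


def outputSchema(schema):
    """Hypothetical stand-in for Pig's decorator: records the schema string on the function."""
    def decorate(func):
        func.outputSchema = schema
        return func
    return decorate


@outputSchema("word:chararray")
def resplit(content, regex, index):
    # Split content on the regex and return the piece at the given index
    return re.compile(regex).split(content)[index]


print(resplit("a,b,c", ",", 1))  # prints "b"
```

      With the stand-in in place, the only remaining dependency of the script is the re module, which is exactly the import the issue asks Pig to resolve and ship.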

      Attachments

        1. 1824_final.patch
          30 kB
          Woody Anderson
        2. TEST-org.apache.pig.test.TestScriptUDF.txt
          192 kB
          Alan Gates
        3. TEST-org.apache.pig.test.TestScriptLanguage.txt
          898 kB
          Alan Gates
        4. TEST-org.apache.pig.test.TestGrunt.txt
          1.09 MB
          Alan Gates
        5. 1824x.patch
          30 kB
          Woody Anderson
        6. 1824d.patch
          28 kB
          Woody Anderson
        7. 1824c.patch
          28 kB
          Woody Anderson
        8. 1824b.patch
          28 kB
          Woody Anderson
        9. 1824a.patch
          23 kB
          Woody Anderson
        10. 1824.patch
          24 kB
          Woody Anderson

            People

              Assignee: Woody Anderson (woody.anderson@gmail.com)
              Reporter: Richard Ding (rding)
              Votes: 1
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: