TOREE-344

No module named pyspark


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • None
    • 0.2.0
    • None
    • None

    Description

      I installed Toree into my Jupyter environment (https://github.com/apache/incubator-toree) and wrote a piece of code that uses PySpark. YARN starts properly, and I can see the containers running in the queue.

      When I run the code, I get the following error:

      Error from python worker:
      /usr/local/bin/python2.7: No module named pyspark

      The kernel is set up as follows:

      {
        "language": "python",
        "display_name": "Apache Toree - PySpark",
        "env": {
          "__TOREE_SPARK_OPTS__": " --master yarn",
          "SPARK_HOME": "/usr/hdp/2.4.2.0-258/spark",
          "__TOREE_OPTS__": "",
          "DEFAULT_INTERPRETER": "PySpark",
          "PYTHONPATH": "/usr/hdp/2.4.2.0-258/spark/python:/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip",
          "PYTHON_EXEC": "python",
          "PYTHONSTARTUP": "/usr/hdp/2.4.2.0-258/spark/python/pyspark/shell.py",
          "PYSPARK_PYTHON": "/usr/local/bin/python2.7",
          "PYSPARK_DRIVER_PYTHON": "/usr/local/bin/python2.7"
        },
        "argv": [
          "/usr/local/share/jupyter/kernels/apache_toree_pyspark/bin/run.sh",
          "--profile",
          "{connection_file}"
        ]
      }
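
The error comes from a Python worker, so one thing worth checking is whether the PYTHONPATH set in the kernel spec is also visible on the YARN nodes. The sketch below is a diagnostic aid under stated assumptions, not a confirmed fix for this issue. It assumes the kernel spec has been loaded into a dict, that the `__TOREE_SPARK_OPTS__` value is passed through to `spark-submit`, and that Spark's `spark.executorEnv.PYTHONPATH` setting is an acceptable way to forward the path to executors. The helper names (`pythonpath_entries`, `missing_entries`, `with_executor_pythonpath`) are hypothetical and made up for illustration.

```python
import os

# Values copied from the kernel spec in this report (only the keys used below).
kernel_spec = {
    "env": {
        "__TOREE_SPARK_OPTS__": " --master yarn",
        "PYTHONPATH": "/usr/hdp/2.4.2.0-258/spark/python:"
                      "/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip",
    }
}


def pythonpath_entries(spec):
    """Split the kernel's PYTHONPATH into individual entries (':'-separated)."""
    raw = spec.get("env", {}).get("PYTHONPATH", "")
    return [entry for entry in raw.split(":") if entry]


def missing_entries(spec):
    """Return the PYTHONPATH entries that do not exist on the machine running this."""
    return [entry for entry in pythonpath_entries(spec) if not os.path.exists(entry)]


def with_executor_pythonpath(spec):
    """Build a Spark options string that also forwards PYTHONPATH to executors.

    Assumption: forwarding via spark.executorEnv.PYTHONPATH is appropriate here.
    """
    env = spec.get("env", {})
    opts = env.get("__TOREE_SPARK_OPTS__", "").strip()
    conf = "--conf spark.executorEnv.PYTHONPATH=" + env.get("PYTHONPATH", "")
    return (opts + " " + conf).strip()


if __name__ == "__main__":
    # Run this on a worker node, not only on the notebook host.
    print("Missing on this machine:", missing_entries(kernel_spec))
    print("Candidate __TOREE_SPARK_OPTS__:", with_executor_pythonpath(kernel_spec))
```

Running `missing_entries` on each worker node shows whether the Spark Python directory and the py4j archive are present where the workers start. The candidate options string only matters if the paths exist there.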


          People

            Assignee: Unassigned
            Reporter: haniar (hani1814)
            Votes: 4
            Watchers: 6
