Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.4.6
Fix Version/s: None
Component/s: None
Description
I am trying to run Sqoop from a Java program, using the SqoopOptions and ImportTool classes, to import a table from a PostgreSQL database into a Hive table. Running the equivalent sqoop command from the command line works fine and imports the table into Hive. When I use the Sqoop API instead, the map tasks finish and load the data into the HDFS target directory, but the next step, which loads the data into the Hive managed table, fails: Hive complains that "file:/user/hive/warehouse/lineitem" is not a directory and cannot be created. The "file:" scheme in the error message shows that Hive is looking for "/user/hive/warehouse/lineitem" on my local filesystem instead of HDFS, even though I supplied all the necessary configuration files before invoking Sqoop.
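The filesystem a location refers to can be read from its URI scheme. The following is a minimal sketch using only the JDK; the class name SchemeCheck and its helper are hypothetical and exist only for illustration. It compares the location from the error message with a fully qualified HDFS location of the kind Sqoop writes to in this report.

```java
import java.net.URI;

// Illustrative helper (hypothetical name): report which filesystem a location string points at.
class SchemeCheck {
    /** Returns the URI scheme of a location, or "(none)" when the location is a bare path. */
    static String schemeOf(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme == null ? "(none)" : scheme;
    }

    public static void main(String[] args) {
        // Location reported in the MetaException: resolved against the local filesystem.
        System.out.println(schemeOf("file:/user/hive/warehouse/lineitem"));
        // Fully qualified HDFS location, as used for the Sqoop target directory.
        System.out.println(schemeOf("hdfs://localhost:9000/user/unifi/tmp/testsqoop/1"));
        // A bare path carries no scheme; Hadoop qualifies it with the configured fs.defaultFS.
        System.out.println(schemeOf("/user/hive/warehouse/lineitem"));
    }
}
```

A "file" scheme on the warehouse path suggests that whatever configuration was active at that moment had no HDFS default filesystem set.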
Here is a minimal version of the Sqoop program I am using:
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.tool.ImportTool;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class sqoopexperiments {

    protected static Configuration getConfiguration() {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/core-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/yarn-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/hdfs-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/mapred-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hive/1.2.1/libexec/conf/hive-site.xml"));
        return conf;
    }

    private static SqoopOptions SqoopOptions = new SqoopOptions(getConfiguration());
    private static final String connectionString = "jdbc:postgresql://127.0.0.1:5432/sales";
    private static final String username = "unifi";
    private static final String password = "unifi";

    private static int executeSqoop() {
        int retCode;
        retCode = new ImportTool().run(SqoopOptions);
        if (retCode != 0) {
            throw new RuntimeException("Sqoop execution failure. Return code : " + Integer.toString(retCode));
        }
        return retCode;
    }

    public static void main(String[] args) {
        SqoopOptions.setConnectString(connectionString);
        SqoopOptions.setUsername(username);
        SqoopOptions.setPassword(password);
        SqoopOptions.setTableName("lineitem");
        SqoopOptions.setTargetDir("/user/unifi/tmp/testsqoop/1");
        SqoopOptions.setHiveImport(true);
        executeSqoop();
    }
}
Here is the output of my program execution:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:64590,suspend=y,server=n -Dfile.encoding=UTF-8 -classpath "/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/charsets.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/deploy.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/cldrdata.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/dnsns.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/jfxrt.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/localedata.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/nashorn.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunec.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/zipfs.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/javaws.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jce.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfr.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfxswt.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jsse.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/management-agent.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/plugin.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/resources.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/rt.jar: 
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/ant-javafx.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/dt.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/javafx-mx.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/jconsole.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/packager.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/sa-jdi.jar: /Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/tools.jar: /Users/bharath/dev/sqoopexperiments/out/production/sqoopexperiments: /usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/hadoop-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/activation-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/asm-3.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-collections-3.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/had
oop/common/lib/commons-digester-1.8.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-io-2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-client-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-framework-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop
/common/lib/jaxb-api-2.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/junit-4.11.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xz-1.0.j
ar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/sqoop-1.4.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-node
manager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-core-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-fate-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-start-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-trace-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/activation-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-launcher-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-2.7.7.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-runtime-3.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/apache-log4j-extras-1.2.17.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-commons-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-tree-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/avro-1.7.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/bonecp-0.8.0.RELEASE.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-avatica-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-core-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-linq4j-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-cli-1.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-codec-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-collections-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-compiler-2.7.6.jar:/usr/
local/Cellar/hive/1.2.1/libexec/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-dbcp-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-digester-1.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-httpclient-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-io-2.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-lang-2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-math-2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-pool-1.5.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-vfs2-2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-client-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-framework-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-recipes-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-core-3.2.10.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/derby-10.10.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/eigenbase-properties-1.1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/groovy-all-2.1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/guava-14.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hamcrest-core-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-accumulo-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-ant-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-beeline-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar:/usr/local/Cellar
/hive/1.2.1/libexec/lib/hive-contrib-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-exec-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hbase-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hwi-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1-standalone.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-metastore-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-serde-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-service-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.20S-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.23-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-common-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-scheduler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-testutils-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpclient-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpcore-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ivy-2.4.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/janino-2.7.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jcommander-1.32.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jdo-api-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-server-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jline-2.12.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/joda-time-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jpam-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/json-20090211.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jta-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/junit-4.11.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libfb303-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libthrift-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/log4j-1.2.16.jar:/usr/local/Cellar/hiv
e/1.2.1/libexec/lib/mail-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-api-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svn-commons-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svnexe-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/netty-3.7.0.Final.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/opencsv-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/oro-2.0.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/paranamer-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/parquet-hadoop-bundle-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/plexus-utils-1.5.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/regexp-1.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/servlet-api-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/snappy-java-1.0.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ST4-4.0.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stax-api-1.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stringtemplate-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/super-csv-2.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/tempus-fugit-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/velocity-1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/xz-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/zookeeper-3.4.6.jar:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar" sqoopexperiments
Connected to the target VM, address: '127.0.0.1:64590', transport: 'socket'
2017-09-03 12:25:28,072 WARN [main] sqoop.ConnFactory (ConnFactory.java:loadManagersFromConfDir(273)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2017-09-03 12:25:28,199 INFO [main] manager.SqlManager (SqlManager.java:initOptionDefaults(98)) - Using default fetchSize of 1000
2017-09-03 12:25:30,085 INFO [main] tool.CodeGenTool (CodeGenTool.java:generateORM(92)) - Beginning code generation
2017-09-03 12:25:30,198 INFO [main] manager.SqlManager (SqlManager.java:execute(757)) - Executing SQL statement: SELECT t.* FROM "lineitem" AS t LIMIT 1
2017-09-03 12:25:30,232 INFO [main] orm.CompilationManager (CompilationManager.java:findHadoopJars(85)) - $HADOOP_MAPRED_HOME is not set
Note: /tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-09-03 12:25:31,464 INFO [main] orm.CompilationManager (CompilationManager.java:jar(330)) - Writing jar file: /tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.jar
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(119)) - It looks like you are importing from postgresql.
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(120)) - This transfer can be faster! Use the --direct
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager (PostgresqlManager.java:importTable(121)) - option to exercise a postgresql-specific fast path.
2017-09-03 12:25:31,475 WARN [main] manager.CatalogQueryManager (CatalogQueryManager.java:getPrimaryKey(239)) - The table lineitem contains a multi-column primary key. Sqoop will default to the column l_orderkey only for this job.
2017-09-03 12:25:31,476 WARN [main] manager.CatalogQueryManager (CatalogQueryManager.java:getPrimaryKey(239)) - The table lineitem contains a multi-column primary key. Sqoop will default to the column l_orderkey only for this job.
2017-09-03 12:25:31,476 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runImport(235)) - Beginning import of lineitem
2017-09-03 12:25:31,651 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-09-03 12:25:31,680 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-09-03 12:25:32,243 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-09-03 12:25:32,244 WARN [main] mapreduce.JobBase (JobBase.java:cacheJars(179)) - SQOOP_HOME is unset. May not be able to find all job dependencies.
2017-09-03 12:25:32,308 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2017-09-03 12:25:32,676 WARN [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(64)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2017-09-03 12:25:32,916 INFO [main] db.DBInputFormat (DBInputFormat.java:setTxIsolation(192)) - Using read commited transaction isolation
2017-09-03 12:25:32,917 INFO [main] db.DataDrivenDBInputFormat (DataDrivenDBInputFormat.java:getSplits(147)) - BoundingValsQuery: SELECT MIN("l_orderkey"), MAX("l_orderkey") FROM "lineitem"
2017-09-03 12:25:32,947 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198)) - number of splits:4
2017-09-03 12:25:33,022 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287)) - Submitting tokens for job: job_1504463611328_0004
2017-09-03 12:25:33,291 INFO [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(273)) - Submitted application application_1504463611328_0004
2017-09-03 12:25:33,334 INFO [main] mapreduce.Job (Job.java:submit(1294)) - The url to track the job: http://Bharaths-MacBook-Pro.local:8088/proxy/application_1504463611328_0004/
2017-09-03 12:25:33,335 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1339)) - Running job: job_1504463611328_0004
2017-09-03 12:25:39,430 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1360)) - Job job_1504463611328_0004 running in uber mode : false
2017-09-03 12:25:39,431 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
2017-09-03 12:25:43,479 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 25% reduce 0%
2017-09-03 12:25:45,495 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 50% reduce 0%
2017-09-03 12:25:46,504 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 75% reduce 0%
2017-09-03 12:25:47,513 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 100% reduce 0%
2017-09-03 12:25:47,520 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1378)) - Job job_1504463611328_0004 completed successfully
2017-09-03 12:25:47,597 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1385)) - Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=486880
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=485
        HDFS: Number of bytes written=8508828
        HDFS: Number of read operations=16
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=8
    Job Counters
        Launched map tasks=4
        Other local map tasks=4
        Total time spent by all maps in occupied slots (ms)=12656
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=12656
        Total vcore-milliseconds taken by all map tasks=12656
        Total megabyte-milliseconds taken by all map tasks=12959744
    Map-Reduce Framework
        Map input records=60175
        Map output records=60175
        Input split bytes=485
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=204
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=553648128
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=8508828
2017-09-03 12:25:47,602 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(184)) - Transferred 8.1147 MB in 15.3528 seconds (541.2293 KB/sec)
2017-09-03 12:25:47,604 INFO [main] mapreduce.ImportJobBase (ImportJobBase.java:runJob(186)) - Retrieved 60175 records.
2017-09-03 12:25:47,611 INFO [main] manager.SqlManager (SqlManager.java:execute(757)) - Executing SQL statement: SELECT t.* FROM "lineitem" AS t LIMIT 1
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_quantity had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_extendedprice had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_discount had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_tax had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_shipdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_commitdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter (TableDefWriter.java:getCreateTableStmt(188)) - Column l_receiptdate had to be cast to a less precise type in Hive
2017-09-03 12:25:47,637 INFO [main] hive.HiveImport (HiveImport.java:importTable(194)) - Loading uploaded data into Hive
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:file:/user/hive/warehouse/lineitem is not a directory or unable to create one)
Disconnected from the target VM, address: '127.0.0.1:64590', transport: 'socket'
Exception in thread "main" java.lang.RuntimeException: Sqoop execution failure. Return code : 1
    at sqoopexperiments.executeSqoop(sqoopexperiments.java:30)
    at sqoopexperiments.main(sqoopexperiments.java:44)

Process finished with exit code 1
Notice that everything works fine until the "Loading uploaded data into Hive" stage. Stepping through the call stack in a debugger, I found where the failure originates: the "executeScript" method of org/apache/sqoop/hive/HiveImport.class. When control reaches this point, the temporary Hive script contains exactly what I would expect:
CREATE TABLE IF NOT EXISTS `lineitem` (
    `l_orderkey` INT,
    `l_partkey` INT,
    `l_suppkey` INT,
    `l_linenumber` INT,
    `l_quantity` DOUBLE,
    `l_extendedprice` DOUBLE,
    `l_discount` DOUBLE,
    `l_tax` DOUBLE,
    `l_returnflag` STRING,
    `l_linestatus` STRING,
    `l_shipdate` STRING,
    `l_commitdate` STRING,
    `l_receiptdate` STRING,
    `l_shipinstruct` STRING,
    `l_shipmode` STRING,
    `l_comment` STRING)
COMMENT 'Imported by sqoop on 2017/09/03 12:25:47'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012'
STORED AS TEXTFILE;
LOAD DATA INPATH 'hdfs://localhost:9000/user/unifi/tmp/testsqoop/1' INTO TABLE `lineitem`;
The following block in the executeScript method of HiveImport.class decides how this temporary Hive script is executed:
try {
    Class ite = Class.forName("org.apache.hadoop.hive.cli.CliDriver");
    LOG.debug("Using in-process Hive instance.");
    subprocessSM = new SubprocessSecurityManager();
    subprocessSM.install();
    String[] cause1 = new String[]{"-f", filename};
    Method ese1 = ite.getMethod("main", new Class[]{cause1.getClass()});
    ese1.invoke((Object)null, new Object[]{cause1});
} catch (ClassNotFoundException var14) {
    LOG.debug("Using external Hive process.");
    this.executeExternalHiveScript(filename, env);
}
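The branch above depends only on whether the CliDriver class can be loaded from the classpath. Here is a minimal sketch of that selection logic in isolation, using only the JDK; the class ExecutionModeProbe and its method names are hypothetical and are not part of Sqoop.

```java
// Illustrative sketch (hypothetical names): pick an execution mode by probing the classpath.
class ExecutionModeProbe {
    /** Returns true when the named class can be loaded from the current classpath. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Same signal the decompiled code relies on to fall back to the external process.
            return false;
        }
    }

    /** "in-process" when the driver class is present, "external" otherwise. */
    static String chooseMode(String driverClass) {
        return isLoadable(driverClass) ? "in-process" : "external";
    }

    public static void main(String[] args) {
        // The result depends entirely on whether the Hive CLI jar is on the classpath.
        System.out.println(chooseMode("org.apache.hadoop.hive.cli.CliDriver"));
    }
}
```

Because the choice is made purely by classpath contents, adding or removing the Hive CLI jar switches between the two code paths without any other configuration change.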
If the Hive CLI driver jars are on my classpath, the program invokes the line "ese1.invoke((Object)null, new Object[]{cause1});". From that point on it loses all the Hadoop configuration context (the hdfs-site, yarn-site, mapred-site and hive-site settings) it had carried so far and falls back to a default configuration, because of this code in /usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar!/org/apache/hadoop/hive/cli/CliDriver.class:
public CliDriver() {
    SessionState ss = SessionState.get();
    this.conf = (Configuration)(ss != null?ss.getConf():new Configuration());
    Log LOG = LogFactory.getLog("CliDriver");
    if(LOG.isDebugEnabled()) {
        LOG.debug("CliDriver inited with classpath " + System.getProperty("java.class.path"));
    }
    this.console = new LogHelper(LOG);
}
I don't fully understand what this "SessionState" is or why it is null here. Because it is null, a new Configuration() is created, so all my Hadoop configuration is lost, and Hive therefore looks for the "/user/hive/warehouse/lineitem" directory on my local filesystem instead of HDFS.
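The effect can be reproduced without Hadoop at all. The sketch below is an analogy under stated assumptions, not Hive's actual code: java.util.Properties stands in for Configuration, FakeDriver stands in for CliDriver, and "script.q" is a made-up file name. The only Hadoop fact it relies on is that fs.defaultFS defaults to file:/// when no site file sets it.

```java
import java.lang.reflect.Method;
import java.util.Properties;

// Analogy only (hypothetical names): a reflectively invoked static main cannot see the caller's config object.
class ConfigLossDemo {
    /** Stand-in for a CLI driver whose entry point builds its own configuration. */
    static class FakeDriver {
        static String lastDefaultFs; // recorded so the caller can inspect what the driver saw

        public static void main(String[] args) {
            Properties fresh = new Properties(); // plays the role of `new Configuration()`
            // With nothing loaded, the lookup falls back to the built-in default.
            lastDefaultFs = fresh.getProperty("fs.defaultFS", "file:///");
        }
    }

    /** Invokes the driver the same way HiveImport does and returns the filesystem the driver resolved. */
    static String invokeDriver(Properties callerConf) {
        try {
            Method main = FakeDriver.class.getMethod("main", String[].class);
            // Only the argument array crosses the boundary; callerConf is never passed along.
            main.invoke(null, (Object) new String[] {"-f", "script.q"});
            return FakeDriver.lastDefaultFs;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Properties callerConf = new Properties();
        callerConf.setProperty("fs.defaultFS", "hdfs://localhost:9000");
        System.out.println("caller sees: " + callerConf.getProperty("fs.defaultFS"));
        System.out.println("driver sees: " + invokeDriver(callerConf));
    }
}
```

The only data handed to the driver is the String[] of arguments, so anything added to the caller's configuration object with addResource has no way to reach it.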
If I remove "hive-cli-1.2.1.jar" from my classpath, HiveImport instead executes the Hive script using the hive binary on my system, and in this mode the Hive table is created properly:
private void executeExternalHiveScript(String filename, List<String> env) throws IOException {
    String hiveExec = this.getHiveBinPath();
    ArrayList args = new ArrayList();
    args.add(hiveExec);
    args.add("-f");
    args.add(filename);
    LoggingAsyncSink logSink = new LoggingAsyncSink(LOG);
    int ret = Executor.exec((String[])args.toArray(new String[0]), (String[])env.toArray(new String[0]), logSink, logSink);
    if(0 != ret) {
        throw new IOException("Hive exited with status " + ret);
    }
}
My intention is to run a Sqoop import into Hive from Java by prepackaging all the necessary Hadoop jars, without needing the Hadoop binaries (hdfs, mapred, hive, etc.) to be present on my system. Stepping through the debugger, it appears to me that a bug causes all the Hadoop configuration to be lost when Sqoop reaches the Hive execution stage and runs the script via org.apache.hadoop.hive.cli.CliDriver.
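One thing that may be worth checking, offered as a diagnostic sketch and not a confirmed fix: a freshly constructed Hadoop Configuration picks up core-site.xml from the classpath, and Hive looks for hive-site.xml on the classpath as well. The classpath in the log above appears to contain only jar files and the project output directory, with no Hadoop or Hive conf directory. The class below (hypothetical name ConfOnClasspathCheck, JDK only) reports whether those files are visible as classpath resources.

```java
import java.net.URL;

// Diagnostic sketch (hypothetical name): are the site files visible to the class loader?
class ConfOnClasspathCheck {
    /** Returns the URL of a classpath resource, or null when it is not visible. */
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        String[] names = {"core-site.xml", "hive-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

If both files are reported as missing, adding the directories that contain them to the classpath is an experiment that could show whether the in-process CliDriver then resolves the warehouse path on HDFS.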
Hoping that somebody has attempted to do hive imports via sqoop in this fashion and figured out a solution or a workaround.