Description
I have a Pig script of the following nature. It reads a table that is partitioned on gridname and dt (datestamp):
A = LOAD 'mytable' USING org.apache.hcatalog.pig.HCatLoader();
B = FILTER A BY gridname == 'XY' and dt != '2012_03_21';
C = foreach B generate job_id, user;
store C into '/user/viraj/test/XY' using PigStorage();
I use the dt != filter because some partitions of the table have not been populated.
When I run the script, I get this error:
Backend error message during job submission
-------------------------------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://namenode/warehouse/database_confs/gridname=XY/dt=2012_03_21
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1082)
I suspect that the filter clause is not pushed up to the loader, so the unpopulated partition dt=2012_03_21 is still included in the input paths.
Regards
Viraj
Issue Links
- depends upon HIVE-2975: Filter parsing does not recognize '!=' as operator and silently ignores invalid tokens (Closed)