ORC-1053

Timestamp values read in Hive are different when the ORC file is created with the CSV-to-ORC converter tools

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2
    • Fix Version/s: 1.7.2
    • Component/s: Java
    • Labels: None

    Description

      I have a CSV file with a column that contains the timestamp value 0001-01-01 00:00:00.0. I convert the CSV file to an ORC file using a CSV-to-ORC converter and place the ORC file in a Hive table backed by ORC files. Querying the data with Hive beeline and Spark SQL returns a different value depending on which converter produced the file, and neither matches the source:

      If converted using the C++ tool, the value read by both Hive beeline and Spark SQL is 0001-01-03 00:00:00

      If converted using the Java tool, the value read by both Hive beeline and Spark SQL is 0001-01-02 23:56:02.0
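
      The report does not state a cause. One thing worth noting is that the C++ result is exactly two days later than the source value, which is the size of the gap in year 1 between the proleptic Gregorian calendar (used by java.time) and the hybrid Julian/Gregorian calendar (used by the legacy java.util classes). The Java sketch below only illustrates that kind of calendar mismatch under this assumption. It is not the converter's or Hive's code, the class and method names are hypothetical, and it does not reproduce the additional 3 minutes 58 seconds seen with the Java tool.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: not part of ORC, Hive, or Spark.
final class CalendarShiftDemo {

    // Computes the epoch milliseconds of `isoLocal` in the proleptic Gregorian
    // calendar (java.time, UTC), then renders that same instant with the legacy
    // hybrid Julian/Gregorian calendar used by SimpleDateFormat.
    static String renderWithHybridCalendar(String isoLocal) {
        long epochMillis = LocalDateTime.parse(isoLocal)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
        // Locale.ROOT yields a java.util.GregorianCalendar with the default 1582 cutover.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // Year 1: the two calendars disagree by two days.
        System.out.println(renderWithHybridCalendar("0001-01-01T00:00:00")); // 0001-01-03 00:00:00
        // A modern date: both calendars agree.
        System.out.println(renderWithHybridCalendar("2022-01-01T00:00:00")); // 2022-01-01 00:00:00
    }
}
```

      UTC is used on both sides so that only the calendar difference shows up. A reader or writer that also applies a local time zone for such old dates could add a further offset, but whether that explains the Java tool's value is not established by this report.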

      Attachments

        1. converted_by_cpp.orc
          0.3 kB
          Varun Raval
        2. converted_by_hive.orc
          0.2 kB
          Varun Raval
        3. converted_by_java.orc
          0.3 kB
          Varun Raval
        4. hive_table_desc.jpg
          83 kB
          Varun Raval
        5. timestamp.csv
          0.0 kB
          Varun Raval

            People

              Assignee: Yiqun Zhang (Guiyankuang)
              Reporter: Varun Raval (vraval48)
              Votes: 0
              Watchers: 3
