  1. Atlas
  2. ATLAS-492 Hive Hook Improvements
  3. ATLAS-702

"ALTER TABLE .. NOT STORED AS DIRECTORIES" metadata update is not sent to Atlas.

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.7-incubating

    Description

      1. CREATE TABLE list_bucket_single (key STRING, value STRING) SKEWED BY (key) ON (1,5,6) STORED AS DIRECTORIES;

      0: jdbc:hive2://localhost:10000/default> describe formatted list_bucket_single2;
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      |           col_name            |                           data_type                            |        comment        |
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      | # col_name                    | data_type                                                      | comment               |
      |                               | NULL                                                           | NULL                  |
      | key                           | string                                                         |                       |
      | value                         | string                                                         |                       |
      |                               | NULL                                                           | NULL                  |
      | # Detailed Table Information  | NULL                                                           | NULL                  |
      | Database:                     | default                                                        | NULL                  |
      | Owner:                        | apathan                                                        | NULL                  |
      | CreateTime:                   | Mon Apr 25 16:27:21 IST 2016                                   | NULL                  |
      | LastAccessTime:               | UNKNOWN                                                        | NULL                  |
      | Protect Mode:                 | None                                                           | NULL                  |
      | Retention:                    | 0                                                              | NULL                  |
      | Location:                     | hdfs://localhost:9000/user/hive/warehouse/list_bucket_single2  | NULL                  |
      | Table Type:                   | MANAGED_TABLE                                                  | NULL                  |
      | Table Parameters:             | NULL                                                           | NULL                  |
      |                               | transient_lastDdlTime                                          | 1461581841            |
      |                               | NULL                                                           | NULL                  |
      | # Storage Information         | NULL                                                           | NULL                  |
      | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe             | NULL                  |
      | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                       | NULL                  |
      | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat     | NULL                  |
      | Compressed:                   | No                                                             | NULL                  |
      | Num Buckets:                  | -1                                                             | NULL                  |
      | Bucket Columns:               | []                                                             | NULL                  |
      | Sort Columns:                 | []                                                             | NULL                  |
      | Stored As SubDirectories:     | Yes                                                            | NULL                  |
      | Skewed Columns:               | [key]                                                          | NULL                  |
      | Skewed Values:                | [[1], [5], [6]]                                                | NULL                  |
      | Storage Desc Params:          | NULL                                                           | NULL                  |
      |                               | serialization.format                                           | 1                     |
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      30 rows selected (0.115 seconds)
      

      2. ALTER TABLE list_bucket_single NOT STORED AS DIRECTORIES;

      0: jdbc:hive2://localhost:10000/default> describe formatted list_bucket_single2;
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      |           col_name            |                           data_type                            |        comment        |
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      | # col_name                    | data_type                                                      | comment               |
      |                               | NULL                                                           | NULL                  |
      | key                           | string                                                         |                       |
      | value                         | string                                                         |                       |
      |                               | NULL                                                           | NULL                  |
      | # Detailed Table Information  | NULL                                                           | NULL                  |
      | Database:                     | default                                                        | NULL                  |
      | Owner:                        | apathan                                                        | NULL                  |
      | CreateTime:                   | Mon Apr 25 16:27:21 IST 2016                                   | NULL                  |
      | LastAccessTime:               | UNKNOWN                                                        | NULL                  |
      | Protect Mode:                 | None                                                           | NULL                  |
      | Retention:                    | 0                                                              | NULL                  |
      | Location:                     | hdfs://localhost:9000/user/hive/warehouse/list_bucket_single2  | NULL                  |
      | Table Type:                   | MANAGED_TABLE                                                  | NULL                  |
      | Table Parameters:             | NULL                                                           | NULL                  |
      |                               | COLUMN_STATS_ACCURATE                                          | false                 |
      |                               | last_modified_by                                               | apathan               |
      |                               | last_modified_time                                             | 1461581898            |
      |                               | numFiles                                                       | 0                     |
      |                               | numRows                                                        | -1                    |
      |                               | rawDataSize                                                    | -1                    |
      |                               | totalSize                                                      | 0                     |
      |                               | transient_lastDdlTime                                          | 1461581898            |
      |                               | NULL                                                           | NULL                  |
      | # Storage Information         | NULL                                                           | NULL                  |
      | SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe             | NULL                  |
      | InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                       | NULL                  |
      | OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat     | NULL                  |
      | Compressed:                   | No                                                             | NULL                  |
      | Num Buckets:                  | -1                                                             | NULL                  |
      | Bucket Columns:               | []                                                             | NULL                  |
      | Sort Columns:                 | []                                                             | NULL                  |
      | Skewed Columns:               | [key]                                                          | NULL                  |
      | Skewed Values:                | [[1], [5], [6]]                                                | NULL                  |
      | Storage Desc Params:          | NULL                                                           | NULL                  |
      |                               | serialization.format                                           | 1                     |
      +-------------------------------+----------------------------------------------------------------+-----------------------+--+
      36 rows selected (0.231 seconds)
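
The two `describe formatted` outputs above differ in the "Stored As SubDirectories" row, which is easy to miss by eye. The following is a minimal sketch, not part of the original report, of how the two outputs could be compared programmatically. The helper names `parse_describe_formatted` and `changed_keys` are hypothetical, and the sample strings are shortened excerpts of the tables above.

```python
def parse_describe_formatted(output: str) -> dict:
    """Map each labelled row (first cell ending in ':') to its second cell."""
    props = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and the "N rows selected" footer
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) >= 2 and cells[0].endswith(":"):
            props[cells[0].rstrip(":")] = cells[1]
    return props


def changed_keys(before: dict, after: dict) -> list:
    """Return the labels whose value differs or that exist on one side only."""
    return sorted(k for k in before.keys() | after.keys()
                  if before.get(k) != after.get(k))


# Shortened excerpts of the outputs from steps 1 and 2.
before = """
| Sort Columns:                 | []               | NULL |
| Stored As SubDirectories:     | Yes              | NULL |
| Skewed Columns:               | [key]            | NULL |
"""
after = """
| Sort Columns:                 | []               | NULL |
| Skewed Columns:               | [key]            | NULL |
"""

print(changed_keys(parse_describe_formatted(before),
                   parse_describe_formatted(after)))
# prints ['Stored As SubDirectories']
```

Only rows whose first cell ends in a colon are collected, so the table parameter rows (which have an empty first cell) are ignored in this sketch.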
      

      3. A GET request on the table's hive_storagedesc entity in Atlas still reports "storedAsSubDirectories": true.

      curl 'http://localhost:21000/api/atlas/entities/57b601be-696f-4954-9d21-1dfb817f6661' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36' -H 'Accept: application/json, text/plain, */*' -H 'Referer: http://localhost:21000/index.html' -H 'Cookie: JSESSIONID=ppqjvb94mb2y1rqcj3ky46648' -H 'Connection: keep-alive' --compressed | python -m json.tool
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  1108    0  1108    0     0  45307      0 --:--:-- --:--:-- --:--:-- 46166
      {
          "GUID": "57b601be-696f-4954-9d21-1dfb817f6661",
          "definition": {
              "id": {
                  "id": "57b601be-696f-4954-9d21-1dfb817f6661",
                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                  "state": "ACTIVE",
                  "typeName": "hive_storagedesc",
                  "version": 0
              },
              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
              "traitNames": [],
              "traits": {},
              "typeName": "hive_storagedesc",
              "values": {
                  "bucketCols": null,
                  "compressed": false,
                  "inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                  "location": "hdfs://localhost:9000/user/hive/warehouse/list_bucket_single",
                  "numBuckets": -1,
                  "outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                  "parameters": null,
                  "qualifiedName": "default.list_bucket_single@primary_storage",
                  "serdeInfo": {
                      "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
                      "typeName": "hive_serde",
                      "values": {
                          "name": null,
                          "parameters": {
                              "serialization.format": "1"
                          },
                          "serializationLib": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
                      }
                  },
                  "sortCols": null,
                  "storedAsSubDirectories": true
              }
          },
          "requestId": "qtp835142742-8676 - 5fb43471-1a33-4d1d-ae29-94aca3d726f5"
      }
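
The check in step 3 can also be scripted instead of reading the JSON by hand. The following is a minimal sketch, assuming the response shape shown above; the helper name `stored_as_subdirectories` is hypothetical, and the sample is a trimmed copy of the response from this report rather than a live call to Atlas.

```python
import json


def stored_as_subdirectories(entity_response: dict):
    """Return the storedAsSubDirectories value from an entity response, or None."""
    return entity_response["definition"]["values"].get("storedAsSubDirectories")


# Trimmed copy of the response shown in step 3 (most attributes omitted).
sample = json.loads("""
{
    "GUID": "57b601be-696f-4954-9d21-1dfb817f6661",
    "definition": {
        "typeName": "hive_storagedesc",
        "values": {
            "qualifiedName": "default.list_bucket_single@primary_storage",
            "storedAsSubDirectories": true
        }
    }
}
""")

print(stored_as_subdirectories(sample))
# prints True
```

If the Hive hook had forwarded the ALTER TABLE from step 2, the value would presumably be false here; the report shows it is still true.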
      

            People

              Assignee: Unassigned
              Reporter: Ayub Pathan (ayubpathan)