  1. Atlas
  2. ATLAS-772

Column ordering is not maintained in the schema query response, whereas the hive table entity response maintains it

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Resolved
    • 0.7-incubating
    • None

    Description

      The schema query response does not return the columns in the order defined on the Hive table, whereas the hive table entity response does maintain that order.

      Table schema

      0: jdbc:hive2://localhost:10000/default> describe formatted table_pbrscdldkm;
      +-------------------------------+------------------------------------------------------------------------------+--------------------------------+--+
      |           col_name            |                                  data_type                                   |            comment             |
      +-------------------------------+------------------------------------------------------------------------------+--------------------------------+--+
      | # col_name                    | data_type                                                                    | comment                        |
      |                               | NULL                                                                         | NULL                           |
      | viewtime                      | int                                                                          |                                |
      | userid                        | bigint                                                                       |                                |
      | page_url                      | string                                                                       |                                |
      | referrer_url                  | string                                                                       |                                |
      | ip                            | string                                                                       |                                |
      |                               | NULL                                                                         | NULL                           |
      | # Partition Information       | NULL                                                                         | NULL                           |
      | # col_name                    | data_type                                                                    | comment                        |
      |                               | NULL                                                                         | NULL                           |
      | dt                            | string                                                                       |                                |
      | country                       | string                                                                       | partitioned columns comments.  |
      |                               | NULL                                                                         | NULL                           |
      | # Detailed Table Information  | NULL                                                                         | NULL                           |
      | Database:                     | db2pbrscdldkm                                                                | NULL                           |
      | Owner:                        | apathan                                                                      | NULL                           |
      | CreateTime:                   | Tue May 10 16:36:56 IST 2016                                                 | NULL                           |
      | LastAccessTime:               | UNKNOWN                                                                      | NULL                           |
      | Protect Mode:                 | None                                                                         | NULL                           |
      | Retention:                    | 0                                                                            | NULL                           |
      | Location:                     | hdfs://localhost:9000/user/hive/warehouse/db2pbrscdldkm.db/table_pbrscdldkm  | NULL                           |
      | Table Type:                   | MANAGED_TABLE                                                                | NULL                           |
      | Table Parameters:             | NULL                                                                         | NULL                           |
      |                               | last_modified_by                                                             | apathan                        |
      |                               | last_modified_time                                                           | 1462878417                     |
      |                               | transient_lastDdlTime                                                        | 1462878417                     |
      |                               | NULL                                                                         | NULL                           |
      | # Storage Information         | NULL                                                                         | NULL                           |
      | SerDe Library:                | org.apache.hadoop.hive.serde2.avro.AvroSerDe                                 | NULL                           |
      | InputFormat:                  | org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat                   | NULL                           |
      | OutputFormat:                 | org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat                  | NULL                           |
      | Compressed:                   | No                                                                           | NULL                           |
      | Num Buckets:                  | -1                                                                           | NULL                           |
      | Bucket Columns:               | []                                                                           | NULL                           |
      | Sort Columns:                 | []                                                                           | NULL                           |
      | Storage Desc Params:          | NULL                                                                         | NULL                           |
      |                               | serialization.format                                                         | 1                              |
      +-------------------------------+------------------------------------------------------------------------------+--------------------------------+--+
      38 rows selected (0.691 seconds)
      

      Hive table entity query response, which returns the columns in the same order as the table schema above

      curl http://admin:admin@localhost:21000/api/atlas/entities/2d63c256-aee1-47f6-abdc-9db472764585 | python -m json.tool
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  6625    0  6625    0     0  65784      0 --:--:-- --:--:-- --:--:-- 66250
      {
          "GUID": "2d63c256-aee1-47f6-abdc-9db472764585",
          "definition": {
              "id": {
                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                  "state": "ACTIVE",
                  "typeName": "hive_table",
                  "version": 0
              },
              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
              "traitNames": [],
              "traits": {},
              "typeName": "hive_table",
              "values": {
                  "columns": [
                      {
                          "id": {
                              "id": "f0115d35-c768-476b-917c-3a243085d1ff",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "viewtime",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.viewtime@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "int"
                          }
                      },
                      {
                          "id": {
                              "id": "642b6b3a-1e5a-4a06-844e-6fd71ae036b2",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "userid",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.userid@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "bigint"
                          }
                      },
                      {
                          "id": {
                              "id": "9b14560e-6471-4a2e-b495-1f08bfad37d3",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "page_url",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.page_url@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "string"
                          }
                      },
                      {
                          "id": {
                              "id": "8ca2072f-2b98-4b19-9a17-2e3d125ebbd6",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "referrer_url",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.referrer_url@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "string"
                          }
                      },
                      {
                          "id": {
                              "id": "effd4c89-8795-4e54-bd26-f7a182d58c79",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "ip",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.ip@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "string"
                          }
                      }
                  ],
                  "comment": null,
                  "createTime": "2016-05-10T11:06:57.000Z",
                  "db": {
                      "id": "8abe3108-cd0a-42cb-a9f4-54b2256d9ef0",
                      "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                      "state": "ACTIVE",
                      "typeName": "hive_db",
                      "version": 0
                  },
                  "description": null,
                  "lastAccessTime": "2016-05-10T11:06:57.000Z",
                  "name": "db2pbrscdldkm.table_pbrscdldkm@primary",
                  "owner": "apathan",
                  "parameters": {
                      "last_modified_by": "apathan",
                      "last_modified_time": "1462878417",
                      "transient_lastDdlTime": "1462878417"
                  },
                  "partitionKeys": [
                      {
                          "id": {
                              "id": "baa21d89-e899-4f97-8164-7c811cd0b44b",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": null,
                              "name": "dt",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.dt@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "string"
                          }
                      },
                      {
                          "id": {
                              "id": "b5e6ce71-21e4-4814-a5da-82b25a71d27c",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_column",
                              "version": 0
                          },
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                          "traitNames": [],
                          "traits": {},
                          "typeName": "hive_column",
                          "values": {
                              "comment": "partitioned columns comments.",
                              "name": "country",
                              "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.country@primary",
                              "table": {
                                  "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                                  "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                                  "state": "ACTIVE",
                                  "typeName": "hive_table",
                                  "version": 0
                              },
                              "type": "string"
                          }
                      }
                  ],
                  "retention": 0,
                  "sd": {
                      "id": {
                          "id": "6a7ce759-6dfa-4130-bde6-9bdeff64da39",
                          "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                          "state": "ACTIVE",
                          "typeName": "hive_storagedesc",
                          "version": 0
                      },
                      "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                      "traitNames": [],
                      "traits": {},
                      "typeName": "hive_storagedesc",
                      "values": {
                          "bucketCols": null,
                          "compressed": false,
                          "inputFormat": "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat",
                          "location": "hdfs://localhost:9000/user/hive/warehouse/db2pbrscdldkm.db/table_pbrscdldkm",
                          "numBuckets": -1,
                          "outputFormat": "org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat",
                          "parameters": null,
                          "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm@primary_storage",
                          "serdeInfo": {
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
                              "typeName": "hive_serde",
                              "values": {
                                  "name": null,
                                  "parameters": {
                                      "serialization.format": "1"
                                  },
                                  "serializationLib": "org.apache.hadoop.hive.serde2.avro.AvroSerDe"
                              }
                          },
                          "sortCols": null,
                          "storedAsSubDirectories": false,
                          "table": {
                              "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                              "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                              "state": "ACTIVE",
                              "typeName": "hive_table",
                              "version": 0
                          }
                      }
                  },
                  "tableName": "table_pbrscdldkm",
                  "tableType": "MANAGED_TABLE",
                  "temporary": false,
                  "viewExpandedText": null,
                  "viewOriginalText": null
              }
          },
          "requestId": "qtp1576861390-13 - 7211cfb0-ad04-48d9-948a-4e8dbab65e17"
      }
      

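      Hive schema query response, in which the columns come back in a different order (viewtime, referrer_url, page_url, ip, userid)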

      curl http://admin:admin@localhost:21000/api/atlas/lineage/hive/table/db2pbrscdldkm.table_pbrscdldkm@primary/schema | python -m json.tool
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  2946    0  2946    0     0   4349      0 --:--:-- --:--:-- --:--:--  4351
      {
          "requestId": "qtp1576861390-14 - b74c229b-02fc-4115-9f3f-187e949f3966",
          "results": {
              "dataType": {
                  "attributeDefinitions": [
                      {
                          "dataTypeName": "string",
                          "isComposite": false,
                          "isIndexable": true,
                          "isUnique": false,
                          "multiplicity": {
                              "isUnique": false,
                              "lower": 1,
                              "upper": 1
                          },
                          "name": "name",
                          "reverseAttributeName": null
                      },
                      {
                          "dataTypeName": "string",
                          "isComposite": false,
                          "isIndexable": true,
                          "isUnique": false,
                          "multiplicity": {
                              "isUnique": false,
                              "lower": 1,
                              "upper": 1
                          },
                          "name": "type",
                          "reverseAttributeName": null
                      },
                      {
                          "dataTypeName": "string",
                          "isComposite": false,
                          "isIndexable": true,
                          "isUnique": false,
                          "multiplicity": {
                              "isUnique": false,
                              "lower": 0,
                              "upper": 1
                          },
                          "name": "comment",
                          "reverseAttributeName": null
                      },
                      {
                          "dataTypeName": "hive_table",
                          "isComposite": false,
                          "isIndexable": true,
                          "isUnique": false,
                          "multiplicity": {
                              "isUnique": false,
                              "lower": 0,
                              "upper": 1
                          },
                          "name": "table",
                          "reverseAttributeName": "columns"
                      }
                  ],
                  "hierarchicalMetaTypeName": "org.apache.atlas.typesystem.types.ClassType",
                  "superTypes": [
                      "Referenceable"
                  ],
                  "typeDescription": null,
                  "typeName": "hive_column"
              },
              "query": "hive_table where (name = \"db2pbrscdldkm.table_pbrscdldkm@primary\") columns",
              "rows": [
                  {
                      "$id$": {
                          "$typeName$": "hive_column",
                          "id": "f0115d35-c768-476b-917c-3a243085d1ff",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "$typeName$": "hive_column",
                      "comment": null,
                      "name": "viewtime",
                      "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.viewtime@primary",
                      "table": {
                          "$typeName$": "hive_table",
                          "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "type": "int"
                  },
                  {
                      "$id$": {
                          "$typeName$": "hive_column",
                          "id": "8ca2072f-2b98-4b19-9a17-2e3d125ebbd6",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "$typeName$": "hive_column",
                      "comment": null,
                      "name": "referrer_url",
                      "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.referrer_url@primary",
                      "table": {
                          "$typeName$": "hive_table",
                          "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "type": "string"
                  },
                  {
                      "$id$": {
                          "$typeName$": "hive_column",
                          "id": "9b14560e-6471-4a2e-b495-1f08bfad37d3",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "$typeName$": "hive_column",
                      "comment": null,
                      "name": "page_url",
                      "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.page_url@primary",
                      "table": {
                          "$typeName$": "hive_table",
                          "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "type": "string"
                  },
                  {
                      "$id$": {
                          "$typeName$": "hive_column",
                          "id": "effd4c89-8795-4e54-bd26-f7a182d58c79",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "$typeName$": "hive_column",
                      "comment": null,
                      "name": "ip",
                      "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.ip@primary",
                      "table": {
                          "$typeName$": "hive_table",
                          "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "type": "string"
                  },
                  {
                      "$id$": {
                          "$typeName$": "hive_column",
                          "id": "642b6b3a-1e5a-4a06-844e-6fd71ae036b2",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "$typeName$": "hive_column",
                      "comment": null,
                      "name": "userid",
                      "qualifiedName": "db2pbrscdldkm.table_pbrscdldkm.userid@primary",
                      "table": {
                          "$typeName$": "hive_table",
                          "id": "2d63c256-aee1-47f6-abdc-9db472764585",
                          "state": "ACTIVE",
                          "version": 0
                      },
                      "type": "bigint"
                  }
              ]
          },
          "tableName": "db2pbrscdldkm.table_pbrscdldkm@primary"
      }
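
The mismatch between the two responses above can be checked programmatically. The following is a minimal sketch, not part of the attached patch: it assumes only the JSON shapes shown in this report (`definition.values.columns` for the entity response, `results.rows` for the schema response). The function names and the `_sample` helper are hypothetical, and the sample data is a trimmed copy of the responses above. `reorder_schema_rows` is one possible client-side workaround and says nothing about how the server-side fix works.

```python
def entity_columns(entity_response):
    """(guid, name) pairs in the order stored on the hive_table entity."""
    columns = entity_response["definition"]["values"]["columns"]
    return [(col["id"]["id"], col["values"]["name"]) for col in columns]


def schema_columns(schema_response):
    """(guid, name) pairs in the order returned by the schema query."""
    rows = schema_response["results"]["rows"]
    return [(row["$id$"]["id"], row["name"]) for row in rows]


def reorder_schema_rows(entity_response, schema_response):
    """Hypothetical client-side workaround: sort schema rows into entity order."""
    position = {guid: i for i, (guid, _) in enumerate(entity_columns(entity_response))}
    rows = schema_response["results"]["rows"]
    return sorted(rows, key=lambda row: position[row["$id$"]["id"]])


# GUIDs and names copied from the responses in this report.
GUIDS = {
    "viewtime": "f0115d35-c768-476b-917c-3a243085d1ff",
    "userid": "642b6b3a-1e5a-4a06-844e-6fd71ae036b2",
    "page_url": "9b14560e-6471-4a2e-b495-1f08bfad37d3",
    "referrer_url": "8ca2072f-2b98-4b19-9a17-2e3d125ebbd6",
    "ip": "effd4c89-8795-4e54-bd26-f7a182d58c79",
}


def _sample(entity_order, schema_order):
    """Build trimmed stand-ins for the two responses (illustration only)."""
    entity = {"definition": {"values": {"columns": [
        {"id": {"id": GUIDS[n]}, "values": {"name": n}} for n in entity_order
    ]}}}
    schema = {"results": {"rows": [
        {"$id$": {"id": GUIDS[n]}, "name": n} for n in schema_order
    ]}}
    return entity, schema


entity, schema = _sample(
    ["viewtime", "userid", "page_url", "referrer_url", "ip"],
    ["viewtime", "referrer_url", "page_url", "ip", "userid"],
)

# Same set of columns, different order: this is the reported bug.
print(entity_columns(entity) == schema_columns(schema))                  # False
print(sorted(entity_columns(entity)) == sorted(schema_columns(schema)))  # True

# After the workaround the schema rows follow the entity order.
print([row["name"] for row in reorder_schema_rows(entity, schema)])
```

Matching on the column GUID rather than the name keeps the check valid even if two responses were to disagree on a column's display name.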
      

      Attachments

        1. ATLAS-772.patch
          3 kB
          Sarath Subramanian

            People

              Assignee: Sarath Subramanian
              Reporter: Ayub Pathan