IMPALA-8210

Support reading/writing tiny RDBMS tables

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate

    Description

      It'd be quite helpful if Impala could read and write tiny RDBMS (MySQL/PostgreSQL/SQL Server) tables. Parallelism and efficiency can be ignored since the target tables are all tiny. Some use cases:

      • Some dimension tables in Hive are snapshots of RDBMS tables. Users want to query the difference between the snapshot in Hive and the latest data in the RDBMS.
      • Users want to run queries that join Hive fact tables with the latest data in the RDBMS.
      • Users want their query results to be ingested into MySQL directly.

      Implementing an "External Data Source" as a generic JDBC wrapper for RDBMS data sources could be a solution. The drawback is that an "External Data Source" requires users to create a table in Impala for each RDBMS table they want to access. Users also can't list the tables (SHOW TABLES) of an RDBMS schema (database).
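
      As a rough illustration of the wrapper idea and of the first use case, the sketch below reads a tiny table in one unparallelized scan and diffs it against a snapshot. It is not Impala's External Data Source API: Python's DB-API stands in for JDBC, an in-memory SQLite database stands in for MySQL/PostgreSQL/SQL Server, and the names read_tiny_table, snapshot_diff and dim_region are hypothetical.

```python
import sqlite3


def read_tiny_table(conn, table, columns):
    """Fetch every row of a small table in a single, unparallelized scan."""
    # Assumption: ANSI double-quoted identifiers (fine for SQLite and
    # PostgreSQL; MySQL uses backticks unless ANSI_QUOTES is enabled).
    quoted_cols = ", ".join('"' + c.replace('"', '""') + '"' for c in columns)
    quoted_table = '"' + table.replace('"', '""') + '"'
    cur = conn.cursor()
    cur.execute(f"SELECT {quoted_cols} FROM {quoted_table}")
    return cur.fetchall()


def snapshot_diff(snapshot_rows, latest_rows):
    """Return (rows only in latest, rows only in snapshot), each sorted."""
    snapshot, latest = set(snapshot_rows), set(latest_rows)
    return sorted(latest - snapshot), sorted(snapshot - latest)


# Stand-in for the live RDBMS table (hypothetical dimension table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_region (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO dim_region VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC"), (3, "LATAM")],
)

# Stand-in for the older snapshot that was copied into Hive.
snapshot = [(1, "EMEA"), (2, "AMER")]

latest = read_tiny_table(conn, "dim_region", ["id", "name"])
added, removed = snapshot_diff(snapshot, latest)
print("only in latest:", added)
print("only in snapshot:", removed)
```

      Note that the caller has to name each table and its columns explicitly, which mirrors the per-table setup drawback described above.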

      There are other solutions that support RDBMS directly. For example: https://www.slideshare.net/liuknag/cloudera-impala-postgre-sql-29025605

            People

              Assignee: Unassigned
              Reporter: Quanlong Huang (stigahuang)
              Votes: 0
              Watchers: 3
