Description
Today, a query like this:
INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
spins up a TezAM and containers. I believe this is not optimal, even if a Tez application is already running. It is worse in setups where only HiveServer2 is alive and TezAMs + LLAP executors are spun up on demand, e.g. Cloudera's Data Warehouse, and I'm assuming other companies might do a similar thing in the cloud.
A possible risk of this optimization is overwhelming HiveServer2 with such queries, so this scenario should be handled with care.
My proposal is to maintain a local Tez session pool in HS2 (default size 0, recommended 1 to 4) and to identify, at compile time, "trivial queries" that currently need a Tez application (like the INSERT INTO above).
The first implementation can cover only simple INSERT INTO queries, and we can decide on the rest later.
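To make the proposal concrete, here is a minimal sketch of the routing decision, not actual Hive code. Every name in it (LocalSessionPoolSketch, isTrivial, route, Route) is hypothetical. It assumes the pool is a fixed number of slots and that a query falls back to the normal cluster path when no slot is free, which is one way to keep HS2 from being overwhelmed. The regex is only a stand-in for the compile-time check; a real implementation would presumably inspect the compiled plan instead of the SQL text.

```java
import java.util.concurrent.Semaphore;
import java.util.regex.Pattern;

public class LocalSessionPoolSketch {
    // Hypothetical stand-in for the compile-time check: matches only
    // INSERT INTO [TABLE] <name> VALUES (...) statements.
    private static final Pattern TRIVIAL_INSERT = Pattern.compile(
        "^\\s*INSERT\\s+INTO\\s+(TABLE\\s+)?[\\w.]+\\s+VALUES\\s*\\(.*\\)\\s*;?\\s*$",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    enum Route { LOCAL, CLUSTER }

    // One permit per local session; size 0 disables local execution.
    private final Semaphore slots;

    LocalSessionPoolSketch(int poolSize) {
        this.slots = new Semaphore(poolSize);
    }

    static boolean isTrivial(String sql) {
        return TRIVIAL_INSERT.matcher(sql).matches();
    }

    // Route a query to a local session only if it is trivial and a slot is
    // free; otherwise use the regular TezAM path, so HS2 never runs more
    // than poolSize local queries at once.
    Route route(String sql) {
        if (isTrivial(sql) && slots.tryAcquire()) {
            return Route.LOCAL;
        }
        return Route.CLUSTER;
    }

    // Must be called once a LOCAL query finishes, to return its slot.
    void release() {
        slots.release();
    }
}
```

With a pool size of 0 (the proposed default) every query takes the existing cluster path, so the behavior is unchanged unless the pool is explicitly enabled.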