Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Spark 1.5 doc/QA sprint
Description
Create a list of the functions that are on this page but not in SQL/DataFrame:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
Here's the list of missing stuff:
basic
between: added in 1.4
bitwiseAND: added in 1.4
bitwiseOR: added in 1.4
bitwiseXOR: added in 1.4
bitwiseNOT: added in 1.4
math
round(DOUBLE a)
round(DOUBLE a, INT d): returns a rounded to d decimal places
log2
sqrt(string column name)
bin
hex(long), hex(string), hex(binary)
unhex(string) -> binary
conv
pmod
factorial
toDeg -> toDegrees: added in 1.4
toRad -> toRadians: added in 1.4
e()
pi()
shiftleft(int or long)
shiftright(int or long)
shiftrightunsigned(int or long)
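For reference while implementing, here is a rough pure-Python sketch of what three of the less obvious math functions (pmod, conv, shiftrightunsigned) are expected to compute. It is not Spark or Hive code. The helper names are hypothetical, and the behavior shown (positive divisor only for pmod, non-negative input and upper-case digits for conv, 32-bit int with a Java-style masked shift count for shiftrightunsigned) is an assumption based on the Hive manual and Java semantics.

```python
# Illustrative only: hypothetical helpers mimicking Hive math UDFs.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def pmod(a: int, b: int) -> int:
    """Positive modulus for a positive divisor b (assumption: b > 0)."""
    if b <= 0:
        raise ValueError("this sketch only covers b > 0")
    # Python's % already returns a result in [0, b) when b > 0.
    return a % b


def conv(num: str, from_base: int, to_base: int) -> str:
    """Convert a non-negative number string between bases 2..36."""
    value = int(num, from_base)
    if value < 0:
        raise ValueError("this sketch only covers non-negative input")
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, to_base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def shift_right_unsigned_32(x: int, n: int) -> int:
    """Unsigned right shift on a 32-bit int, like Java's >>> operator."""
    shifted = (x & 0xFFFFFFFF) >> (n & 31)  # Java masks the shift count
    # Reinterpret the 32-bit pattern as a signed value.
    return shifted - (1 << 32) if shifted >= (1 << 31) else shifted
```

The long variants of the shift functions would use a 64-bit mask and a shift count masked with 63 instead.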
collection functions
sort_array(array)
size(map, array)
map_values(map<k,v>): array<v>
map_keys(map<k,v>):array<k>
array_contains(array<t>, value): boolean
date functions
from_unixtime(long, string): string
unix_timestamp(): long
unix_timestamp(date): long
year(date): int
month(date): int
day(date): int
dayofmonth(date): int
hour(timestamp): int
minute(timestamp): int
second(timestamp): int
weekofyear(date): int
date_add(date, int)
date_sub(date, int)
from_utc_timestamp(timestamp, string timezone): timestamp
current_date(): date
current_timestamp(): timestamp
add_months(string start_date, int num_months): string
last_day(string date): string
next_day(string start_date, string day_of_week): string
trunc(string date[, string format]): string
months_between(date1, date2): double
date_format(date/timestamp/string ts, string fmt): string
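As a quick illustration of two of the date functions above, the following pure-Python sketch shows the behavior I would expect from last_day and next_day. It is not Spark code. It works on datetime.date values rather than the string arguments in the Hive signatures, and the DAY_INDEX table and the rule of matching on the first two letters of the day name are assumptions made for the sketch.

```python
import calendar
from datetime import date, timedelta

# Hypothetical lookup: first two letters of the day name -> Python weekday().
DAY_INDEX = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}


def last_day(d: date) -> date:
    """Last day of the month that d falls in."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


def next_day(start: date, day_of_week: str) -> date:
    """First date strictly later than start that falls on day_of_week."""
    target = DAY_INDEX[day_of_week[:2].upper()]
    # 1..7 days ahead; a start date already on the target day moves a full week.
    delta = (target - start.weekday() - 1) % 7 + 1
    return start + timedelta(days=delta)
```

For example, next_day(date(2015, 1, 14), "TU") gives date(2015, 1, 20), since 2015-01-14 is a Wednesday.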
conditional functions
if(boolean testCondition, T valueTrue, T valueFalseOrNull): T
nvl(T value, T default_value): T
greatest(T v1, T v2, …): T
least(T v1, T v2, …): T
string functions
ascii(string str): int
base64(binary): string
concat(string|binary A, string|binary B…): string | binary
concat_ws(string SEP, string A, string B…): string
concat_ws(string SEP, array<string>): string
decode(binary bin, string charset): string
encode(string src, string charset): binary
find_in_set(string str, string strList): int
format_number(number x, int d): string
length(string): int
instr(string str, string substr): int
locate(string substr, string str[, int pos]): int
lower(string), lcase(string)
lpad(string str, int len, string pad): string
ltrim(string): string
parse_url(string urlString, string partToExtract [, string keyToExtract]): string
printf(String format, Obj... args): string
regexp_extract(string subject, string pattern, int index): string
regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT): string
repeat(string str, int n): string
reverse(string A): string
rpad(string str, int len, string pad): string
space(int n): string
split(string str, string pat): array
str_to_map(text[, delimiter1, delimiter2]): map<string, string>
trim(string A): string
unbase64(string str): binary
upper(string A), ucase(string A): string
levenshtein(string A, string B): int
soundex(string A): string
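To make the expected results concrete for two of the string functions, here is a small pure-Python sketch of levenshtein and find_in_set. It is not Spark code. The null handling (None in, None out) and the rule that a search string containing a comma yields 0 follow my reading of the Hive manual and should be treated as assumptions.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def find_in_set(s: Optional[str], str_list: Optional[str]) -> Optional[int]:
    """1-based position of s in a comma-separated list, 0 if absent."""
    if s is None or str_list is None:
        return None
    if "," in s:
        return 0
    for pos, item in enumerate(str_list.split(","), start=1):
        if item == s:
            return pos
    return 0
```

With these definitions, levenshtein("kitten", "sitting") is 3 and find_in_set("ab", "abc,b,ab,c,def") is 3, matching the examples given in the Hive manual.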
Misc
hash(a1[, a2…]): int
text
context_ngrams(array<array<string>>, array<string>, int K, int pf): array<struct<string,double>>
ngrams(array<array<string>>, int N, int K, int pf): array<struct<string,double>>
sentences(string str, string lang, string locale): array<array<string>>
UDAF
var_samp
stddev_pop
stddev_samp
covar_pop
covar_samp
corr
percentile: array<double>
percentile_approx: array<double>
histogram_numeric: array<struct {'x','y'}>
collect_set (we have hashset)
collect_list
ntile
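For the statistical aggregates, the textbook definitions can be sketched in pure Python as below. This is only meant to pin down which denominator each function uses (n - 1 for the sample variants, n for the population variants). It is not Spark code, it uses a naive two-pass computation rather than whatever an actual implementation would do, and it does not handle nulls, empty input, or single-row input.

```python
import math
from typing import Sequence


def _mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def var_samp(xs: Sequence[float]) -> float:
    """Sample variance: squared deviations divided by n - 1."""
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def stddev_pop(xs: Sequence[float]) -> float:
    """Population standard deviation: divides by n."""
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def covar_pop(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Population covariance of paired values: divides by n."""
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def corr(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of paired values."""
    return covar_pop(xs, ys) / (stddev_pop(xs) * stddev_pop(ys))
```

stddev_samp and covar_samp follow the same pattern with n - 1 in the denominator.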
Issue Links
blocks: SPARK-8159 Improve expression function coverage (Spark 1.5), Resolved