hive 1.1.0版内置函数大全

在hive中使用
show functions 查看所有hive支持的函数
describe function xxx 查看具体xxx函数的定义

以下表格是hive1.1.0支持的所有函数及定义,
其实主要用到的函数并不多,后续另外详细列举平时常用的函数例子。

hive functiondescribe
!! a - Logical not
!=a != b - Returns TRUE if a is not equal to b
%a % b - Returns the remainder when dividing a by b
&a & b - Bitwise and
*a * b - Multiplies a by b
+a + b - Returns a+b
-a - b - Returns the difference a-b
/a / b - Divide a by b
<a < b - Returns TRUE if a is less than b
<=a <= b - Returns TRUE if a is not greater than b
<=>a <=> b - Returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of the them is NULL
<>a <> b - Returns TRUE if a is not equal to b
=a = b - Returns TRUE if a equals b and false otherwise
==a == b - Returns TRUE if a equals b and false otherwise
>a > b - Returns TRUE if a is greater than b
>=a >= b - Returns TRUE if a is not smaller than b
^a ^ b - Bitwise exclusive or
absabs(x) - returns the absolute value of x
acosacos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise
add_monthsadd_months
anda and b - Logical and
arrayarray(n0, n1…) - Creates an array with the given elements
array_containsarray_contains(array, value) - Returns TRUE if the array contains value.
asciiascii(str) - returns the numeric value of the first character of str
asinasin(x) - returns the arc sine of x if -1<=x<=1 or NULL otherwise
assert_trueassert_true(condition) - Throw an exception if ‘condition’ is not true.
atanatan(x) - returns the atan (arctan) of x (x is in radians)
avgavg(x) - Returns the mean of a set of numbers
base64base64(bin) - Convert the argument from binary to a base 64 string
betweenbetween a [NOT] BETWEEN b AND c - evaluate if a is [not] in between b and c
binbin(n) - returns n in binary
caseCASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END - When a = b, returns c; when a = d, return e; else return f
cbrtcbrt(double) - Returns the cube root of a double value.
ceilceil(x) - Find the smallest integer not smaller than x
ceilingceiling(x) - Find the smallest integer not smaller than x
coalescecoalesce(a1, a2, …) - Returns the first non-null argument
collect_listcollect_list(x) - Returns a list of objects with duplicates
collect_setcollect_set(x) - Returns a set of objects with duplicate elements eliminated
compute_statscompute_stats(x) - Returns the statistical summary of a set of primitive type values.
concatconcat(str1, str2, … strN) - returns the concatenation of str1, str2, … strN or concat(bin1, bin2, … binN) - returns the concatenation of bytes in binary data bin1, bin2, … binN
concat_wsconcat_ws(separator, [string
context_ngramscontext_ngrams(expr, array<string1, string2, …>, k, pf) estimates the top-k most frequent n-grams that fit into the specified context. The second parameter specifies a string of words that specify the positions of the n-gram elements, with a null value standing in for a ‘blank’ that must be filled by an n-gram element.
convconv(num, from_base, to_base) - convert num from from_base to to_base
corrcorr(x,y) - Returns the Pearson coefficient of correlation between
a set of number pairs
coscos(x) - returns the cosine of x (x is in radians)
countcount(*) - Returns the total number of retrieved rows, including rows containing NULL values.
count(expr) - Returns the number of rows for which the supplied expression is non-NULL.
count(DISTINCT expr[, expr…]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.
covar_popcovar_pop(x,y) - Returns the population covariance of a set of number pairs
covar_sampcovar_samp(x,y) - Returns the sample covariance of a set of number pairs
crc32crc32(str or bin) - Computes a cyclic redundancy check value for string or binary argument and returns bigint value.
create_unioncreate_union(tag, obj1, obj2, obj3, …) - Creates a union with the object for given tag
cume_distThere is no documentation for function ‘cume_dist’
current_databasecurrent_database() - returns currently using database name
current_datecurrent_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value.
current_timestampcurrent_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value.
current_usercurrent_user() - Returns current user name
date_adddate_add(start_date, num_days) - Returns the date that is num_days after start_date.
date_formatdate_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt.
date_subdate_sub(start_date, num_days) - Returns the date that is num_days before start_date.
datediffdatediff(date1, date2) - Returns the number of days between date1 and date2
dayday(date) - Returns the date of the month of date
dayofmonthdayofmonth(date) - Returns the date of the month of date
dayofweekdayofweek(param) - Returns the day of the week of date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday)
decodedecode(bin, str) - Decode the first argument using the second argument character set
degreesdegrees(x) - Converts radians to degrees
dense_rankThere is no documentation for function ‘dense_rank’
diva div b - Divide a by b rounded to the long integer
ee() - returns E
eltelt(n, str1, str2, …) - returns the n-th string
encodeencode(str, str) - Encode the first argument using the second argument character set
ewah_bitmapewah_bitmap(expr) - Returns an EWAH-compressed bitmap representation of a column.
ewah_bitmap_andewah_bitmap_and(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise AND of two bitmaps.
ewah_bitmap_emptyewah_bitmap_empty(bitmap) - Predicate that tests whether an EWAH-compressed bitmap is all zeros
ewah_bitmap_orewah_bitmap_or(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise OR of two bitmaps.
expexp(x) - Returns e to the power of x
explodeexplode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns
fieldfield(str, str1, str2, …) - returns the index of str in the str1,str2,… list or 0 if not found
find_in_setfind_in_set(str,str_array) - Returns the first occurrence of str in str_array where str_array is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument has any commas.
first_valueThere is no documentation for function ‘first_value’
floorfloor(x) - Find the largest integer not greater than x
format_numberformat_number(X, D) - Formats the number X to a format like ‘#,###,###.##’, rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL’s FORMAT
from_unixtimefrom_unixtime(unix_time, format) - returns unix_time in the specified format
from_utc_timestampfrom_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone (as of Hive 0.8.0)
get_json_objectget_json_object(json_txt, path) - Extract a json object from path
greatestgreatest(v1, v2, …) - Returns the greatest value in a list of values
hashhash(a1, a2, …) - Returns a hash value of the arguments
hexhex(n, bin, or str) - Convert the argument to hexadecimal
histogram_numerichistogram_numeric(expr, nb) - Computes a histogram on numeric ‘expr’ using nb bins.
hourhour(date) - Returns the hour of date
ifIF(expr1,expr2,expr3) - If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used.
intest in(val1, val2…) - returns true if test equals any valN
in_filein_file(str, filename) - Returns true if str appears in the file
indexindex(a, n) - Returns the n-th element of a
initcapinitcap(str) - Returns str, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space.
inlineinline( ARRAY( STRUCT()[,STRUCT()] - explodes and array and struct into a table
instrinstr(str, substr) - Returns the index of the first occurance of substr in str
isnotnullisnotnull a - Returns true if a is not NULL and false otherwise
isnullisnull a - Returns true if a is NULL and false otherwise
java_methodjava_method(class,method[,arg1[,arg2…]]) calls method with reflection
json_tuplejson_tuple(jsonStr, p1, p2, …, pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string.
lagLAG (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LAG function is used to access data from a previous row.
last_daylast_day(date) - Returns the last day of the month which the date belongs to.
last_valueThere is no documentation for function ‘last_value’
lcaselcase(str) - Returns str with all characters changed to lowercase
leadThere is no documentation for function ‘last_value’
leastleast(v1, v2, …) - Returns the least value in a list of values
lengthlength(str
levenshteinlevenshtein(str1, str2) - This function calculates the Levenshtein distance between two strings.
likelike(str, pattern) - Checks if str matches pattern
lnln(x) - Returns the natural logarithm of x
locatelocate(substr, str[, pos]) - Returns the position of the first occurance of substr in str after position pos
loglog([b], x) - Returns the logarithm of x with base b
log10log10(x) - Returns the logarithm of x with base 10
log2log2(x) - Returns the logarithm of x with base 2
logged_in_userlogged_in_user() - Returns logged in user name
lowerlower(str) - Returns str with all characters changed to lowercase
lpadlpad(str, len, pad) - Returns str, left-padded with pad to a length of len
ltrimltrim(str) - Removes the leading space characters from str
mapmap(key0, value0, key1, value1…) - Creates a map with the given key/value pairs
map_keysmap_keys(map) - Returns an unordered array containing the keys of the input map.
map_valuesmap_values(map) - Returns an unordered array containing the values of the input map.
matchpathThere is no documentation for function ‘last_value’
maxmax(expr) - Returns the maximum value of expr
md5md5(str or bin) - Calculates an MD5 128-bit checksum for the string or binary.
minmin(expr) - Returns the minimum value of expr
minuteminute(date) - Returns the minute of date
monthmonth(date) - Returns the month of date
months_betweenmonths_between(date1, date2) - returns number of months between dates date1 and date2
named_structnamed_struct(name1, val1, name2, val2, …) - Creates a struct with the given field names and values
negativenegative a - Returns -a
next_daynext_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.
ngramsngrams(expr, n, k, pf) - Estimates the top-k n-grams in rows that consist of sequences of strings, represented as arrays of strings, or arrays of arrays of strings. ‘pf’ is an optional precision factor that controls memory usage.
noopThere is no documentation for function ‘row_number’
noopstreamingThere is no documentation for function ‘row_number’
noopwithmapThere is no documentation for function ‘row_number’
noopwithmapstreamingThere is no documentation for function ‘row_number’
notnot a - Logical not
ntileThere is no documentation for function ‘row_number’
nvlnvl(value,default_value) - Returns default value if value is null else returns value
ora or b - Logical or
parse_urlparse_url(url, partToExtract[, key]) - extracts a part from a URL
parse_url_tupleparse_url_tuple(url, partname1, partname2, …, partnameN) - extracts N (N>=1) parts from a URL.
percent_rankThere is no documentation for function ‘percent_rank’
percentilepercentile(expr, pc) - Returns the percentile(s) of expr at pc (range: [0,1]).pc can be a double or double array
percentile_approxpercentile_approx(expr, pc, [nb]) - For very large data, computes an approximate percentile value from a histogram, using the optional argument [nb] as the number of histogram bins to use. A higher value of nb results in a more accurate approximation, at the cost of higher memory usage.
pipi() - returns pi
pmoda pmod b - Compute the positive modulo
posexplodeposexplode(a) - behaves like explode for arrays, but includes the position of items in the original array
positivepositive a - Returns a
powpow(x1, x2) - raise x1 to the power of x2
powerpower(x1, x2) - raise x1 to the power of x2
printfprintf(String format, Obj… args) - function that can format strings according to printf-style format strings
radiansradians(x) - Converts degrees to radians
randrand([seed]) - Returns a pseudorandom number between 0 and 1
rankThere is no documentation for function ‘row_number’
reflectThere is no documentation for function ‘row_number’
reflect2There is no documentation for function ‘row_number’
regexpstr regexp regexp - Returns true if str matches regexp and false otherwise
regexp_extractregexp_extract(str, regexp[, idx]) - extracts a group that matches regexp
regexp_replaceregexp_replace(str, regexp, rep) - replace all substrings of str that match regexp with rep
repeatrepeat(str, n) - repeat str n times
reversereverse(str) - reverse str
rlikestr rlike regexp - Returns true if str matches regexp and false otherwise
roundround(x[, d]) - round x to d decimal places
row_numberThere is no documentation for function ‘row_number’
rpadrpad(str, len, pad) - Returns str, right-padded with pad to a length of len
rtrimrtrim(str) - Removes the trailing space characters from str
secondsecond(date) - Returns the second of date
sentencessentences(str, lang, country) - Splits str into arrays of sentences, where each sentence is an array of words. The ‘lang’ and’country’ arguments are optional, and if omitted, the default locale is used.
sha2sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512).
signsign(x) - returns the sign of x )
sinsin(x) - returns the sine of x (x is in radians)
sizesize(a) - Returns the size of a
sort_arraysort_array(array(obj1, obj2,…)) - Sorts the input array in ascending order according to the natural ordering of the array elements.
soundexsoundex(string) - Returns soundex code of the string.
spacespace(n) - returns n spaces
splitsplit(str, regex) - Splits str around occurances that match regex
sqrtsqrt(x) - returns the square root of x
stackstack(n, cols…) - turns k columns into n rows of size k/n each
stdstd(x) - Returns the standard deviation of a set of numbers
stddevstddev(x) - Returns the standard deviation of a set of numbers
stddev_popstddev_pop(x) - Returns the standard deviation of a set of numbers
stddev_sampstddev_samp(x) - Returns the sample standard deviation of a set of numbers
str_to_mapstr_to_map(text, delimiter1, delimiter2) - Creates a map by parsing text
structstruct(col1, col2, col3, …) - Creates a struct with the given field values
substrsubstr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
substringsubstring(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstring(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
sumsum(x) - Returns the sum of a set of numbers
tantan(x) - returns the tangent of x (x is in radians)
to_dateto_date(expr) - Extracts the date part of the date or datetime expression expr
to_unix_timestampto_unix_timestamp(date[, pattern]) - Returns the UNIX timestamp
to_utc_timestampto_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC (as of Hive 0.8.0)
translatetranslate(input, from, to) - translates the input string by replacing the characters present in the from string with the corresponding characters in the to string
trimtrim(str) - Removes the leading and trailing space characters from str
trunctrunc(date, fmt) - Returns returns date with the time portion of the day truncated to the unit specified by the format model fmt. If you omit fmt, then date is truncated to the nearest day. It now only supports ‘MONTH’/‘MON’/‘MM’ and ‘YEAR’/‘YYYY’/‘YY’ as format.
ucaseucase(str) - Returns str with all characters changed to uppercase
unbase64unbase64(str) - Convert the argument from a base 64 string to binary
unhexunhex(str) - Converts hexadecimal argument to binary
unix_timestampunix_timestamp(date[, pattern]) - Converts the time to a number
upperupper(str) - Returns str with all characters changed to uppercase
uuiduuid() - Returns a universally unique identifier (UUID) string.
var_popvar_pop(x) - Returns the variance of a set of numbers
var_sampvar_samp(x) - Returns the sample variance of a set of numbers
variancevariance(x) - Returns the variance of a set of numbers
versionversion() - Returns the Hive build version string - includes base version and revision.
weekofyearweekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
whenCASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e
windowingtablefunctionThere is no documentation for function ‘windowingtablefunction’
xpathxpath(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression
xpath_booleanxpath_boolean(xml, xpath) - Evaluates a boolean xpath expression
xpath_doublexpath_double(xml, xpath) - Returns a double value that matches the xpath expression
xpath_floatxpath_float(xml, xpath) - Returns a float value that matches the xpath expression
xpath_intxpath_int(xml, xpath) - Returns an integer value that matches the xpath expression
xpath_longxpath_long(xml, xpath) - Returns a long value that matches the xpath expression
xpath_numberxpath_number(xml, xpath) - Returns a double value that matches the xpath expression
xpath_shortxpath_short(xml, xpath) - Returns a short value that matches the xpath expression
xpath_stringxpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression
yearyear(date) - Returns the year of date
|a | b - Bitwise or
~~ n - Bitwise not
  • 0
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值