hive.join.cache.size
Default Value: 25000
Added In:
How many rows in the joining tables (except the streaming table) should be cached in memory.
hive.map.aggr
Default Value: true
Added In:
Whether to use map-side aggregation in Hive Group By queries.
mapred.reduce.tasks
Default Value: -1
Added In: 0.1
The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is “local”. Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value. When this property is set to -1, Hive automatically determines the number of reducers.
hive.exec.reducers.bytes.per.reducer
Default Value: 1000000000
Added In:
Size per reducer, in bytes. The default is 1 GB (1,000,000,000 bytes); for example, if the input size is 10 GB, Hive will use 10 reducers.
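
The arithmetic behind this estimate, and how it interacts with mapred.reduce.tasks, can be sketched as follows. This is a simplified illustration and not Hive's actual implementation: the function name estimate_reducers and its parameters are hypothetical, and it assumes that a positive mapred.reduce.tasks value is used as given, that -1 triggers the size-based estimate, and that at least one reducer is always used. Any upper limit Hive may apply is left out.

```python
def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,
                      mapred_reduce_tasks=-1):
    """Illustrative estimate of the reducer count (hypothetical helper).

    bytes_per_reducer mirrors hive.exec.reducers.bytes.per.reducer and
    mapred_reduce_tasks mirrors mapred.reduce.tasks.
    """
    # Assumption: an explicit positive setting is used as-is.
    if mapred_reduce_tasks > 0:
        return mapred_reduce_tasks

    # Integer ceiling division: one reducer per bytes_per_reducer of input.
    reducers = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer

    # Assumption: never fewer than one reducer, even for empty input.
    return max(1, reducers)


if __name__ == "__main__":
    # 10 GB of input at the default 1 GB per reducer.
    print(estimate_reducers(10_000_000_000))
    # The same input with mapred.reduce.tasks explicitly set to 4.
    print(estimate_reducers(10_000_000_000, mapred_reduce_tasks=4))
```

With the defaults, 10 GB of input yields 10 reducers, matching the example in the description; lowering bytes_per_reducer increases the estimate proportionally.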
hive.exec.compress.output
Default Value: false
Added In:
This controls whether the final outputs of a query (to a local/HDFS file or a Hive table) are compressed. The compression codec and other options are determined from the Hadoop configuration variables mapred.output.compress*.
hive.exec.compress.intermediate
Default Value: false
Added In:
This controls whether intermediate files produced by Hive between multiple map-reduce jobs are compressed. The compression codec and other options are determined from the Hadoop configuration variables mapred.output.compress*.
hive.exec.parallel
Default Value: false
Added In:
Whether to execute jobs in parallel. This applies to jobs of a query that do not depend on one another.