HIVE-29667: [hiveACIDRepl] User is unable to assign yarn queue name t…#6543

Open
Rajesh1255 wants to merge 4 commits into apache:master from Rajesh1255:HIVE-29667

@Rajesh1255 Rajesh1255 commented Jun 15, 2026

What changes were proposed in this pull request?

When a queue is specified for a MapReduce task, the task should run in that queue only, provided it is a valid queue.

Why are the changes needed?

A user specifies a custom scheduler pool for Hive ACID, but DistCp was running only on the default queue, which is not the expected behaviour. This PR fixes that bug.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit Tests

@Neer393 Neer393 left a comment

Contributor

+1 (Non-binding)

ayushtkn commented Jul 3, 2026

Member

Not sure this is correct. There were some changes around this; you can check here:

 * This method ensures if there is an explicit tez.queue.name set, the hadoop shim will submit jobs
 * to the same yarn queue. This solves a security issue where e.g settings have the following values:
 * tez.queue.name=sample
 * hive.server2.tez.queue.access.check=true
 * In this case, when a query submits Tez DAGs, the tez client layer checks whether the end user has access to
 * the yarn queue 'sample' via YarnQueueHelper, but this is not respected in case of MR jobs that run
 * even if the query execution engine is Tez. E.g. an EXPORT TABLE can submit DistCp MR jobs at some stages when
 * certain criteria are met. We tend to restrict the setting of mapreduce.job.queuename in order to bypass this
 * security flaw, and even the default queue is unexpected if we explicitly set tez.queue.name.
 * Under the hood the desired behavior is to have DistCp jobs in the same yarn queue as other parts
 * of the query. Most of the time, the user isn't aware that a query involves DistCp jobs, hence isn't aware
 * of these details.
 */
protected void ensureMapReduceQueue(Configuration conf) {
  String queueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
  boolean isTez = "tez".equalsIgnoreCase(conf.get("hive.execution.engine"));
  boolean shouldMapredJobsFollowTezQueue = conf.getBoolean("hive.mapred.job.follow.tez.queue", false);
  LOG.debug("Checking tez.queue.name {}, isTez: {}, shouldMapredJobsFollowTezQueue: {}", queueName, isTez,
      shouldMapredJobsFollowTezQueue);
  if (isTez && shouldMapredJobsFollowTezQueue && queueName != null && queueName.length() > 0) {
    LOG.info("Setting mapreduce.job.queuename (current: '{}') to become tez.queue.name: '{}'",
        conf.get(MRJobConfig.QUEUE_NAME), queueName);
    conf.set(MRJobConfig.QUEUE_NAME, queueName);
  }
}
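
To make the interaction concrete, here is a rough standalone sketch of the override logic quoted above. It is not the Hive code: a plain `Map<String, String>` stands in for Hadoop's `Configuration`, `Boolean.parseBoolean` only approximates `Configuration.getBoolean`, and the class name `QueueFollowSketch` and the queue name `custom` are made up for illustration. The property keys are the ones that appear in the snippet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the shim method; a Map replaces Hadoop's Configuration.
public class QueueFollowSketch {
    static final String TEZ_QUEUE = "tez.queue.name";
    static final String MR_QUEUE = "mapreduce.job.queuename";
    static final String ENGINE = "hive.execution.engine";
    static final String FOLLOW = "hive.mapred.job.follow.tez.queue";

    // Mirrors the quoted condition: only overwrite the MR queue when the engine is Tez,
    // the follow flag is enabled, and tez.queue.name is non-empty.
    static void ensureMapReduceQueue(Map<String, String> conf) {
        String queueName = conf.get(TEZ_QUEUE);
        boolean isTez = "tez".equalsIgnoreCase(conf.get(ENGINE));
        boolean follow = Boolean.parseBoolean(conf.getOrDefault(FOLLOW, "false"));
        if (isTez && follow && queueName != null && !queueName.isEmpty()) {
            conf.put(MR_QUEUE, queueName);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ENGINE, "tez");
        conf.put(TEZ_QUEUE, "sample");
        conf.put(MR_QUEUE, "custom"); // hypothetical user-chosen MR queue

        // Flag absent (defaults to false): the explicit MR queue is left alone.
        ensureMapReduceQueue(conf);
        System.out.println(conf.get(MR_QUEUE)); // prints "custom"

        // Flag enabled: the MR queue is replaced by the Tez queue.
        conf.put(FOLLOW, "true");
        ensureMapReduceQueue(conf);
        System.out.println(conf.get(MR_QUEUE)); // prints "sample"
    }
}
```

So with `hive.mapred.job.follow.tez.queue=true` and a non-empty `tez.queue.name`, any explicitly set `mapreduce.job.queuename` is overwritten, which seems worth checking against what this PR changes.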
