
Use FileSystem.openFile with FileStatus to reduce NameNode RPCs#6460

Open
dlmarion wants to merge 1 commit into
apache:2.1from
dlmarion:fs-open-use-file-status


@dlmarion dlmarion commented Jul 2, 2026

Contributor

This commit changes how we open Hadoop files for reading. Instead of calling FileSystem.open, the code now uses FileSystem.openFile. The openFile method returns a builder that has a setter for a FileStatus object. HDFS-17593 adds logic to the DFSClient to use the located blocks in the FileStatus, which reduces the NameNode RPCs needed to get the block locations. This is useful in code where we already have the FileStatus object for the file we want to open.
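To illustrate why passing the status along saves a round trip, here is a rough toy model. It does not use the Hadoop classes: `ToyNameNode`, `ToyStatus`, `ToyClient`, and `rpcsFor` are hypothetical stand-ins invented for this sketch, and the RPC counting is a simplification of what the DFSClient does, not a measurement of it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OpenFileRpcSketch {

  /** Hypothetical stand-in for a FileStatus that carries located blocks. */
  static final class ToyStatus {
    final String path;
    final List<String> blocks;

    ToyStatus(String path, List<String> blocks) {
      this.path = path;
      this.blocks = blocks;
    }
  }

  /** Hypothetical stand-in for the NameNode; counts each simulated RPC. */
  static final class ToyNameNode {
    private final Map<String, List<String>> files = new HashMap<>();
    int rpcCount = 0;

    // Test setup only, not counted as an RPC.
    void addFile(String path, List<String> blocks) {
      files.put(path, blocks);
    }

    // One simulated RPC: returns a status that includes block locations.
    ToyStatus getStatus(String path) {
      rpcCount++;
      return new ToyStatus(path, files.get(path));
    }

    // One simulated RPC: returns block locations for a path.
    List<String> getBlockLocations(String path) {
      rpcCount++;
      return files.get(path);
    }
  }

  /** Hypothetical client with both ways of opening a file. */
  static final class ToyClient {
    private final ToyNameNode nameNode;

    ToyClient(ToyNameNode nameNode) {
      this.nameNode = nameNode;
    }

    // Opening by path alone: the client must ask for block locations.
    List<String> open(String path) {
      return nameNode.getBlockLocations(path);
    }

    // Opening with a status in hand: reuse the blocks it already carries.
    List<String> open(ToyStatus status) {
      return status.blocks;
    }
  }

  /** Counts simulated RPCs for "get status, then open the file". */
  static int rpcsFor(boolean reuseStatus) {
    ToyNameNode nameNode = new ToyNameNode();
    nameNode.addFile("/toy/file.rf", List.of("blk_1", "blk_2"));
    ToyClient client = new ToyClient(nameNode);

    ToyStatus status = nameNode.getStatus("/toy/file.rf");
    if (reuseStatus) {
      client.open(status);
    } else {
      client.open(status.path);
    }
    return nameNode.rpcCount;
  }

  public static void main(String[] args) {
    System.out.println("open by path:   " + rpcsFor(false) + " simulated RPCs");
    System.out.println("open by status: " + rpcsFor(true) + " simulated RPCs");
  }
}
```

In this model the saving only appears when the caller already holds the status, which matches the case described above.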

@dlmarion dlmarion added this to the 2.1.6 milestone Jul 2, 2026
@dlmarion dlmarion self-assigned this Jul 2, 2026

dlmarion commented Jul 2, 2026

Contributor Author

The places in the 2.1 code that can currently take advantage of this are in the bulk import code, FileUtils, and RecoveryLogsIterator.

FYI: HDFS-17593 has been merged into the unreleased Hadoop 3.5.1 and 3.6.0 versions. The changes here are compatible with the current Hadoop dependency, but not with older versions. We would need to raise our minimum compatibility level for Hadoop from 3.0.3 to 3.3.0.


dlmarion commented Jul 2, 2026

Contributor Author

This change requires #6461 to pass the build checks.
