
feat(storage): Add HDFS backend via opendal services-hdfs-native #2441

Open

jordepic wants to merge 1 commit into apache:main from jordepic:main

Conversation


@jordepic jordepic commented May 12, 2026

Which issue does this PR close?

What changes are included in this PR?

Add an opendal-hdfs-native cargo feature plus
OpenDalStorageFactory::Hdfs and OpenDalStorage::Hdfs variants in
iceberg-storage-opendal. The variant uses OpenDAL's
services-hdfs-native, which talks the HDFS RPC protocol directly in
pure Rust (no JNI / libhdfs).
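
As a rough sketch, the new variants slot in next to the existing ones; the variant names come from this description, but the field layout (notably the operator cache on `OpenDalStorage::Hdfs`, assumed from the cache discussion below) is illustrative, not the actual diff:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use opendal::Operator;

pub enum OpenDalStorageFactory {
    // ... existing variants (S3, Gcs, Oss, Azdls) ...
    #[cfg(feature = "opendal-hdfs-native")]
    Hdfs,
}

pub enum OpenDalStorage {
    // ... existing variants ...
    #[cfg(feature = "opendal-hdfs-native")]
    Hdfs {
        /// Per-name-node operator cache, keyed by `Some("hdfs://<authority>")`
        /// for authority-bearing paths and `None` for authority-less paths.
        operators: Arc<RwLock<HashMap<Option<String>, Operator>>>,
    },
}
```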

The wire-up in crates/storage/opendal/src/lib.rs and resolving.rs
follows the same factory + variant + scheme-routing pattern used by
every other OpenDAL-backed storage in that crate (s3, gcs, oss,
azdls), to the letter.
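
The scheme routing amounts to one more match arm. Here is a minimal self-contained sketch; the helper name `resolve_scheme`, the scheme strings for the other backends, and the string labels (standing in for the real factory type) are all illustrative:

```rust
// Minimal sketch of scheme-based routing; only the "hdfs" arm is what this
// PR adds. Returns a label instead of the real factory to stay self-contained.
fn resolve_scheme(location: &str) -> Result<&'static str, String> {
    let scheme = location
        .split_once("://")
        .map(|(scheme, _)| scheme)
        .ok_or_else(|| format!("no scheme in `{location}`"))?;
    match scheme {
        "s3" | "s3a" => Ok("s3"),
        "gs" => Ok("gcs"),
        "oss" => Ok("oss"),
        "abfss" | "abfs" => Ok("azdls"),
        "hdfs" => Ok("hdfs"), // new in this PR
        other => Err(format!("unsupported scheme `{other}`")),
    }
}
```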

Mirroring the Java HadoopFileIO, no iceberg-level HDFS configuration
is exposed: there is no HdfsConfig type and no `hdfs.*` property
constants. HDFS topology (HA name services, namenode RPC addresses)
and Kerberos authentication are entirely delegated to hdfs-native
and its environment. hdfs-native reads core-site.xml / hdfs-site.xml
from HADOOP_CONF_DIR (or HADOOP_HOME/etc/hadoop); Kerberos auth is
delegated to libgssapi_krb5 via the standard KRB5CCNAME /
KRB5_CONFIG env. These deployment requirements are documented in the
opendal backend module (crates/storage/opendal/src/hdfs.rs). The
opendal-hdfs-native feature is added to opendal-all but not to the
default set; it requires libgssapi_krb5 installed at the OS level
for the runtime dlopen (brew install krb5 on macOS,
apt install libgssapi-krb5-2 on Debian).
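
The module documentation reads roughly like this (a sketch assembled from the requirements above, not the file's literal contents):

```rust
//! HDFS storage backend via OpenDAL's `services-hdfs-native` (pure-Rust HDFS
//! RPC, no JNI / libhdfs).
//!
//! Deployment requirements:
//! - Hadoop config: `core-site.xml` / `hdfs-site.xml` are read from
//!   `HADOOP_CONF_DIR` (or `HADOOP_HOME/etc/hadoop`).
//! - Kerberos: delegated to `libgssapi_krb5` via the standard `KRB5CCNAME` /
//!   `KRB5_CONFIG` environment variables; the library must be installed at
//!   the OS level (`brew install krb5` on macOS, `apt install
//!   libgssapi-krb5-2` on Debian) since it is loaded via `dlopen` at runtime.
```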

Path parsing uses url::Url. Paths with an authority
(`hdfs://nameservice1/foo`) build an operator targeted at that name
node; authority-less paths (`hdfs:///foo`) build an operator without
an explicit name node, so hdfs-native picks up `fs.defaultFS` from
the loaded Hadoop config - the same behavior Java HadoopFileIO gets
via Hadoop's Configuration. OpenDalStorage::Hdfs holds a
per-name-node operator cache
(Arc<RwLock<HashMap<Option<String>, Operator>>> keyed by
`Some("hdfs://<authority>")` for authority-bearing paths and `None`
for authority-less paths). The cache uses stdlib primitives
(Arc + RwLock + HashMap) rather than a third-party concurrent map
crate; for the typical workload (one to a few nameservices,
read-heavy after warmup) the RwLock-guarded map is more than
adequate and avoids an additional workspace dependency. OpenDAL's
RetryLayer is applied uniformly to every operator at the call site;
no bespoke retry logic.
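
Put together, the parse-then-cache flow looks roughly like the following. This is a sketch assuming a recent opendal with owned-style builders; the struct and method names (`HdfsOperators`, `operator_for`) are illustrative:

```rust
use std::collections::HashMap;
use std::error::Error;
use std::sync::{Arc, RwLock};

use opendal::layers::RetryLayer;
use opendal::services::HdfsNative;
use opendal::Operator;
use url::Url;

/// Sketch of the per-name-node operator cache described above.
#[derive(Default)]
struct HdfsOperators {
    cache: Arc<RwLock<HashMap<Option<String>, Operator>>>,
}

impl HdfsOperators {
    fn operator_for(&self, location: &str) -> Result<Operator, Box<dyn Error>> {
        let url = Url::parse(location)?;
        // `hdfs://nameservice1/foo` -> Some("hdfs://nameservice1");
        // `hdfs:///foo` has no authority -> None, so hdfs-native falls back
        // to fs.defaultFS from the loaded Hadoop config.
        let key = url
            .host_str()
            .filter(|host| !host.is_empty())
            .map(|host| match url.port() {
                Some(port) => format!("hdfs://{host}:{port}"),
                None => format!("hdfs://{host}"),
            });

        if let Some(op) = self.cache.read().unwrap().get(&key) {
            return Ok(op.clone());
        }

        let mut builder = HdfsNative::default().root("/");
        if let Some(name_node) = &key {
            builder = builder.name_node(name_node);
        }
        // RetryLayer applied uniformly; no bespoke retry logic.
        let op = Operator::new(builder)?.layer(RetryLayer::new()).finish();

        // Two racing callers may both build; the second insert wins, which is
        // acceptable for a cache of cheap-to-clone operators.
        self.cache.write().unwrap().insert(key, op.clone());
        Ok(op)
    }
}
```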

The integration-test docker-compose fixture mirrors apache/opendal's
own HDFS fixture (fixtures/hdfs/docker-compose-hdfs-cluster.yml):
the same bde2020 hadoop-namenode and hadoop-datanode images, both
running in host network mode. Host networking is required because
hdfs-native 0.13.5 connects to the DataNode by IP from
DatanodeIdProto.ip_addr; on a docker bridge the DN would register
with an unroutable bridge IP. Host networking works on Linux CI
runners but has known issues on macOS / Windows Docker Desktop, so
the integration tests are marked `#[ignore]` and CI explicitly opts
them in via `cargo nextest --run-ignored=only -E 'test(file_io_hdfs)'`
in the existing Linux-only tests job in .github/workflows/ci.yml.
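
The opt-in shape is roughly the following; the test body and exact test name are illustrative (only the `#[ignore]` marker and the `file_io_hdfs` name filter come from the description above):

```rust
// Sketch of an ignored HDFS integration test; CI opts it back in with
// `cargo nextest --run-ignored=only -E 'test(file_io_hdfs)'`.
#[tokio::test]
#[ignore = "needs the host-network HDFS docker-compose fixture (Linux only)"]
async fn test_file_io_hdfs_read_write() {
    // ... build a FileIO against the fixture's hdfs:// endpoint and exercise
    // read/write/delete, mirroring the other backends' integration tests ...
}
```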

Are these changes tested?

Yes. Unit tests verify that operators are cached per name service, and the integration tests are at parity with the other storage backends.
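
A caching unit test in that spirit might look like this, reusing the hypothetical `HdfsOperators` sketch from above (whether building an operator for an unresolvable nameservice succeeds offline depends on hdfs-native's config handling, so treat this strictly as a sketch):

```rust
#[test]
fn operators_are_cached_per_name_service() {
    let storage = HdfsOperators::default();
    let _ = storage.operator_for("hdfs://nameservice1/warehouse/a").unwrap();
    let _ = storage.operator_for("hdfs://nameservice1/warehouse/b").unwrap();
    let _ = storage.operator_for("hdfs://nameservice2/warehouse/a").unwrap();
    // Two distinct authorities -> exactly two cached operators.
    assert_eq!(storage.cache.read().unwrap().len(), 2);
}
```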


jordepic commented May 12, 2026

@blackmwk, @kevinjqliu, @CTTY would you mind taking a look at this one? My org won't really be able to pick up iceberg-rust until we can read tables from Hadoop.

Made a few deliberate choices here:

  1. Went as Rust-native as possible, at the cost of ignoring tests by default: unfortunately, with HDFS's NN/DN design you need host networking with the Docker images (or some hacky code) to have the NN return a routable DN address. I copied OpenDAL's tests, which I think sets a good precedent, rather than trying to build an in-process mini cluster.
  2. Kept the cache for HDFS operators very simple and avoided external dependencies (a cache is needed because the operators maintain RPC connections). I removed HDFS configuration entirely because hdfs-native reads the conf files on the host itself. I tried to follow all the precedent set by the GCS, S3, etc. storage implementations.
  3. All the code should be in a passing state by the time you take a look. I'm really hoping to streamline the review here and to be very intentional about what I added, since I know you all have large review backlogs.

Thank you very much!


Development

Successfully merging this pull request may close these issues.

Support tables which live on HDFS
