Skip to content

SOLR-18130: new param "solrConnection" in solrj-streaming, support HTTP#4320

Open
vyatkinv wants to merge 45 commits into
apache:mainfrom
vyatkinv:SOLR-18130-solrj-streaming-solr-cloud-parameter
Open

SOLR-18130: new param "solrConnection" in solrj-streaming, support HTTP#4320
vyatkinv wants to merge 45 commits into
apache:mainfrom
vyatkinv:SOLR-18130-solrj-streaming-solr-cloud-parameter

Conversation

@vyatkinv
Copy link
Copy Markdown
Contributor

@vyatkinv vyatkinv commented Apr 22, 2026

https://issues.apache.org/jira/browse/SOLR-18130

Description

This PR updates the solrj-streaming module by replacing all usages of zkHost with solrCloud to enable support for HTTP-based quorum configurations.

Solution

Parameters, fields and variables in solrj-streaming, named as zkHost renamed to solrCloud
For backward compatibility, specifying zkHost is still supported.
A shared method has been introduced in an abstract class to resolve the effective solrCloud value using a priority-based approach (e.g., explicit parameter → legacy zkHost → default Zookeeper host).

Tests

Introduced method, that randomly provides either an HTTP or Zookeeper record in different scenarios. Additionally, all zkHost parameters have been replaced with connectionString, using getSolrConnection().toString() for substitution.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

@vyatkinv vyatkinv marked this pull request as draft April 22, 2026 11:15
@vyatkinv vyatkinv changed the title [WIP] SOLR-18130: rename parameter "zkHost" to "solrCloud" in "solrj-streaming" WIP: SOLR-18130: rename parameter "zkHost" to "solrCloud" in "solrj-streaming" Apr 22, 2026
@vyatkinv vyatkinv changed the title WIP: SOLR-18130: rename parameter "zkHost" to "solrCloud" in "solrj-streaming" SOLR-18130: rename parameter "zkHost" to "solrCloud" in "solrj-streaming" Apr 22, 2026
@vyatkinv vyatkinv force-pushed the SOLR-18130-solrj-streaming-solr-cloud-parameter branch from 27f4ae5 to c877f2a Compare April 22, 2026 13:36
@epugh
Copy link
Copy Markdown
Contributor

epugh commented Apr 22, 2026

Shouldn't solrCloud be connectionString? I was hoping down the road we could put more into the connection string like username password or other details that are needed....

@vyatkinv
Copy link
Copy Markdown
Contributor Author

vyatkinv commented Apr 23, 2026

Shouldn't solrCloud be connectionString? I was hoping down the road we could put more into the connection string like username password or other details that are needed....

@dsmiley previously suggested naming this parameter solrCloud in the jira ticket comments. However, it looks like there might be some disagreement here, so it would be good to reach a consensus before finalizing the name.

@epugh
Copy link
Copy Markdown
Contributor

epugh commented Apr 23, 2026

Shouldn't solrCloud be connectionString? I was hoping down the road we could put more into the connection string like username password or other details that are needed....

@dsmiley previously suggested naming this parameter solrCloud in the jira ticket comments. However, it looks like there might be some disagreement here, so it would be good to reach a consensus before finalizing the name.

Agreed... I believe the intent is to specify how you connect to your running Solr? In the first PR I believe it was connectionString which makes sense as "hey, I'm going to have to parse this thing, and it tells me how to connect". @dsmiley ?

Copy link
Copy Markdown
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here! Clearly this was a bit more involved than a rename ;-)

"connectionString" is somewhat long and it's ambiguous as to what we're even connecting to. But it's not bad. How about "solrCloudString"? If you don't like that, I'll capitulate and be satisfied with "connectionString".

Incorporating auth is out-of-scope but I'll say now that I'm suspicious that it makes sense to embed secret credentials into this. A conversation for another time.

I imagine there should be some ref-guide updates on this in addition to major-changes-in-solr-10.adoc.

I can see the potential of a follow-on issue to update CLIUtils.getZkHost (and thus CLI commands that call it) to instead resolve a SolrCloud connection string that is not necessarily ZK.

Answers to the PR questions:

  1. It's a test issue -- good catch. Glad you made resolving this mandatory. I wouldn't call this "mandatory SolrCloud parameter" since the param can be blank when it's used inside Solr, defaulting to the server's connection.
  2. Normalize; do not round-trip with "zkHost".
  3. IMO SolrParams should be used for "parameters" like this, not Map. Perhaps too much scope creep here but I leave that to you.
  4. Let's not add support for that to ScoreNodes now, albeit a single comment where we initialize it would be nice. Like "for now limited to the current cluster but we could expand support if needed".

I suggest that you change StreamFactory to not mention "ZK" whatsoever, instead using the name we choose (e.g. getDefaultCloudString). For backwards-compatibility, however, the older method names should exist as deprecated.

Comment thread changelog/unreleased/SOLR-18130-solrj-streaming-solr-cloud-parameter.yml Outdated
@epugh
Copy link
Copy Markdown
Contributor

epugh commented Apr 30, 2026

Hi all... So I'm following up after @dsmiley pinged me (thanks!)... What I am excited about this is the opportunity to make it simpler for non Solr experts to access solr. I don't think that you should need to know if it's solrCloud or standalone or zk that you are using, it's just a "connection string". Yes, every connection string has rules of formatting... I would like a name like solrConnection or solrConnString but not something that is as specific as "solrCloudConnection" that ties us to a specific mode of Solr. I know that "Streaming Expressions" is a solr cloud specific feature. However, from a user perspective, they may not know that, or care.

Someday I hpoe to see Streaming Expressions pushed more towards our Data Scientist / ML user base, and they don't care about Cloud versus non Cloud mode, and also don't care about zkHost... So, having a property that speaks to their concern "how do I connect" is perfect. Could we have a property like solr or connection?

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented Apr 30, 2026

+1 to "solrConnection"

Could we have a property like solr or connection

huh; you just agreed to "solrConnection"?

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 1, 2026

I was just wondering if in the context of a stream expression if saying "connection" was enough. I like "solrConnection" too. We might look at other connections and see what they are named...

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented May 1, 2026

Then lets move forward @vyatkinv with "solrConnection" throughout. If at the last minute Eric gets us to change our minds again, we shouldn't need to subsequently change the vast majority of this PR since "solrConnection" is a perfectly fine parameter & field & local-var name in all places we have such.

@vyatkinv
Copy link
Copy Markdown
Contributor Author

vyatkinv commented May 2, 2026

I’ve addressed most of the review comments. Summary of changes in the latest commits:

  1. Renamed solrCloud parameter to solrConnection.

  2. Switched Streams to use CloudSolrClientConnection instead of a raw string.
    To do this, I broke backward compatibility for constructors that accepted parameters directly. I believe this is acceptable since end users are not expected to instantiate stream implementations directly—they typically pass string expressions into StreamFactory. Most of these constructors are used internally, in tests, or not used at all.
    If needed, I can restore the old constructors as @Deprecated and delegate them to the new ones based on CloudSolrClientConnection.

  3. Fixed minor issues related to formatting and parameter ordering.

  4. Renamed getSolrConnection to buildSolrConnection.

  5. Added a new method to CloudSolrClient.Builder so internal Solr code can use the typed builder, while keeping the string-based API for end users.
    For the same reason, I replaced the string-based getCloudSolrClient with a typed variant.

  6. One concern: currently zkHost can accept HTTP URLs.
    It seems more consistent if:

    • zkHost only allows ZooKeeper connection strings,
    • Solr URLs only allow HTTP(S),
    • and the new solrConnection supports both.
      Should we add validation in StreamTool, the JDBC driver, and buildSolrConnection() to enforce that zkHost is strictly ZooKeeper? And allow both formats only via the new parameter?
  7. Renamed parameter-building method to buildSolrParamsExcept and refactored it to remove the Map<String, String> variant, reusing the unified implementation.

As follow-up work (either separate issue or PR), I suggest:
a) adding support for HTTP quorum in the StreamTool CLI
b) adding support for HTTP quorum in the JDBC driver

PS: Documentation and new tests are not added yet.

Comment thread changelog/unreleased/SOLR-18130-cli-tools-solr-connection-support.yml Outdated
Comment thread changelog/unreleased/SOLR-18130-cli-tools-solr-connection-support.yml Outdated
Comment thread changelog/unreleased/SOLR-18130-join-query-parser-solr-connection-support.yml Outdated
Comment thread changelog/unreleased/SOLR-18130-solr-sql-jdbc-driver-solr-connection-support.yml Outdated
Comment thread solr/core/src/java/org/apache/solr/core/InternalSolrClientCache.java Outdated
Comment thread solr/core/src/test/org/apache/solr/core/InternalSolrClientCacheTest.java Outdated
Comment on lines -107 to -110
// The same ZK Host is used, so the ZK ACLs should still be applied
cache.setDefaultZKHost(zkClient().getZkServerAddress() + "/random/chroot");
cache.getCloudSolrClient(
CloudSolrClient.CloudSolrClientConnection.parse(zkClient().getZkServerAddress()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a distinction we're loosing but I don't think it matters.
CC @HoustonPutman

Comment thread solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/sql/DriverImpl.java Outdated
registerV2Api(packageLoader.getPackageAPI().readAPI);
registerV2Api(ZookeeperRead.class);
} else {
solrClientCache = new InternalSolrClientCache(solrClientProvider.getSolrClient());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have no solrClientCache when no SolrCloud. SCC is linked to solrj-streaming, and it's only used for SolrCloud functions within solrj-streaming.

Copy link
Copy Markdown
Contributor Author

@vyatkinv vyatkinv May 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by the fact that StreamHandler, SqlHandler, ExportHandler, and others use SolrClientCache regardless of mode. At least they assign it to a field and pass it to the streaming context. It seems that eliminating SolrClientCache in non-cloud mode is not a trivial task.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try testing what happens when these handlers are called in standalone mode without CloudClientCache. If everything works, I think it's safe to remove it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is actually going to be supported, it'd need a test. If there is no test... don't worry about it. I remember when streaming expressions came about... it kind of lightly requires SolrCloud... not sure anyone put in any real concerted effort to truly not depend on SolrCloud. That said, if we notice easy/simple ways to help it potentially work then that's a good thing.

Copy link
Copy Markdown
Contributor

@epugh epugh May 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that none of those features were designed to work in non cloud mode, the assumption was that they wouldn't. There also was a relunctance to I think to start supporting it in non cloud mode if that isnt' a common use case, as then it increases our testing and maintenance burden. As a community, we don't have a shared understanding of what is in cloud mode versus not, and what our overall goal is. You are seeing this in the code ;-).

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented May 18, 2026

FYI consider #3740 (comment) (or skip for now as you wish)

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 18, 2026

I didn't notice at first that there are also integration tests for running CLI utilities (*.bats). Do I need to add tests for the --solr-connection parameter there?

Since we are moving to --solr-connection, please do update the bats tests.

Maybe one test just to prove that the bin/solr scripts properly pass through --solr-connection?

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 18, 2026

Looks like we refer in the error message to solr-connection, so we probably do need to think about it:

image

We could introduce it as --solr-connection and then deprecate --solr-url and then in the future make -s be the shortcut for --solr-connection?

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 18, 2026

I did some testing on this branch on what commands connect direct to ZK, and the results where what I expected.

docker run --name zookeeper --restart always -d -p 2181:2181 zookeeper 

# fails
bin/solr zk mkroot mychroot

# succeeds
bin/solr zk mkroot --zk-host localhost:2181 mychroot

# succeeds
bin/solr zk ls --zk-host localhost:2181 /

# fails due to unknown option name
bin/solr zk ls --solr-connection localhost:2181 /

# succeeds
bin/solr zk upconfig -z localhost:2181 -n mynewconfig -d ./server/solr/configsets/sample_techproducts_configs

# succeeds
bin/solr zk downconfig -z localhost:2181 -n mynewconfig -d /tmp

# succeeds
bin/solr cluster -z localhost:2181 --property urlScheme --value https 

vyatkinv added 5 commits May 24, 2026 15:27
# Conflicts:
#	solr/core/src/java/org/apache/solr/core/CoreContainer.java
#	solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java
@vyatkinv
Copy link
Copy Markdown
Contributor Author

I've taken all the comments into account, and this evening I'll try to add all the necessary integration tests and check whether it's safe to remove the solrClientCache initialization for standalone mode.

@vyatkinv
Copy link
Copy Markdown
Contributor Author

vyatkinv commented May 24, 2026

Integration tests via ./gradlew integrationTests does not works after removing metrics-core in SOLR-17855:

2026-05-24 13:06:33.753 INFO  (main) [] o.a.s.s.CoreContainerProvider Using logger factory org.apache.logging.slf4j.Log4jLoggerFactory
2026-05-24 13:06:33.760 INFO  (main) [] o.a.s.s.CoreContainerProvider  ___      _       Welcome to Apache Solr™ version 11.0.0-SNAPSHOT e804e358509b5278ff21252689e3632770a6fb3b [snapshot build, details omitted]
2026-05-24 13:06:33.760 INFO  (main) [] o.a.s.s.CoreContainerProvider / __| ___| |_ _   Starting in cloud mode on port 38229
2026-05-24 13:06:33.760 INFO  (main) [] o.a.s.s.CoreContainerProvider \__ \/ _ \ | '_|  Install dir: /home/vova/solr/solr/packaging/build/solr-11.0.0-SNAPSHOT
2026-05-24 13:06:33.760 INFO  (main) [] o.a.s.s.CoreContainerProvider |___/\___/_|_|    Start time: 2026-05-24T13:06:33.760931721Z
2026-05-24 13:06:33.763 INFO  (main) [] o.a.s.s.CoreContainerProvider Solr started with "-XX:+CrashOnOutOfMemoryError" that will crash on any OutOfMemoryError exception. The cause of the OOME will be logged in the crash file at the following path: /home/vova/solr/solr/packaging/build/test-output/solr-home/logs/jvm_crash_3608.log
[2,450s][warning][jfr,system] Could not initialize JDK events. access denied ("java.lang.RuntimePermission" "accessClassInPackage.jdk.internal.event")
Exception in thread "embeddedZkServer" java.lang.NoClassDefFoundError: com/codahale/metrics/Reservoir
	at org.apache.zookeeper.metrics.impl.DefaultMetricsProvider$DefaultMetricsContext.lambda$getSummary$2(DefaultMetricsProvider.java:151)
	at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
	at org.apache.zookeeper.metrics.impl.DefaultMetricsProvider$DefaultMetricsContext.getSummary(DefaultMetricsProvider.java:147)
	at org.apache.zookeeper.server.ServerMetrics.<init>(ServerMetrics.java:81)
	at org.apache.zookeeper.server.ServerMetrics.<clinit>(ServerMetrics.java:46)
	at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:133)
	at org.apache.solr.cloud.SolrZkServer.lambda$start$0(SolrZkServer.java:160)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.ClassNotFoundException: com.codahale.metrics.Reservoir
	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:593)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
	at org.eclipse.jetty.ee10.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:443)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:526)
	... 8 more
	Suppressed: java.lang.ClassNotFoundException: com.codahale.metrics.Reservoir
		at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
		at org.eclipse.jetty.ee10.webapp.WebAppClassLoader.findClass(WebAppClassLoader.java:563)
		at org.eclipse.jetty.ee10.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:467)
		... 9 more

Launching via bin/solr start fails for the same reason

Is there a workaround to run integration tests now?

Copy link
Copy Markdown
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do a couple of the suggestions I included in this review, finishing today.

I also want to ensure no stream creates a SolrClientCache() (no args) -- should always rely on StreamContext.

Comment thread solr/core/src/java/org/apache/solr/cloud/InternalSolrClientCache.java Outdated
Comment on lines +116 to +117
streamFactory.withCollectionSolrConnection(defaultCollection, solrConnection);
streamFactory.withDefaultSolrConnection(solrConnection);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line seems redundant with the previous

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The first line sets the default collection and connection to it, and the second line sets the default connection for all other collections if solrConnection is not specified in the expression itself.
If we remove any of them, the behavior will change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be relevant when explicitly enabling external clusters. By default, indeed, one connection is always established.

Comment thread solr/core/src/java/org/apache/solr/search/join/CrossCollectionJoinQuery.java Outdated
streamFactory.withDefaultZkHost(defaultZkhost);
var solrConnection = CloudSolrClient.CloudSolrClientConnection.parse(defaultZkhost);
streamFactory.withCollectionSolrConnection(defaultCollection, solrConnection);
streamFactory.withDefaultSolrConnection(solrConnection);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line appears redundant with the previous

Comment on lines +93 to +97
if (!solrConnection.isZookeeper()) {
throw new SQLException(
String.format(
Locale.ROOT, "Expected ZooKeeper connection string, but got: '%s'.", schemaName));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure there's any point in ensuring this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 25, 2026

Launching via bin/solr start fails for the same reason

Is there a workaround to run integration tests now?

This is a great thing to drop a note to dev@solr.apache.org about, and let the person who merged the code know about the issue!

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented May 25, 2026

I already did yesterday: https://issues.apache.org/jira/browse/SOLR-17855
Ignore that in this issue for now.

@vyatkinv
Copy link
Copy Markdown
Contributor Author

Added integration tests for CLI utilities with --solr-connection (locally i revert metrics-core remove). Still need coverage to ensure no connections are made to external clusters. I’ll look into the best approach tomorrow — the closest existing examples seem to be test_auth.bats and test_adminconsole_urls.bats

@vyatkinv
Copy link
Copy Markdown
Contributor Author

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented May 26, 2026

I'll pursue the SolrClientCache security matter in another JIRA issue TBD. Sorry to see this JIRA issue get so much scope creep. We can merge something here but just won't release it until the security matter is handled.

@vyatkinv
Copy link
Copy Markdown
Contributor Author

Good
A little later I implement InternalSolrClientCache#getFrom(CoreContainer cc) and security tests in separate PRs

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented May 26, 2026

InternalSolrClientCache#getFrom(CoreContainer cc) I'll get it; it was my idea anyway.
Can we also just back out ISCC altogether from this PR to de-scope it? The work will not be lost.

I leave it to @epugh to review the CLI aspects of this PR. Ideally that'd be a separate PR; it's a large scope divergence. I don't have time/experience or frankly interest to review the CLI.

@vyatkinv
Copy link
Copy Markdown
Contributor Author

vyatkinv commented May 26, 2026

Can we also just back out ISCC altogether from this PR to de-scope it? The work will not be

I'm rolling back InternalSolrClientCache from this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants