Skip to content

SOLR-17855: Remove Dropwizard dependency from core#4404

Open
mlbiscoc wants to merge 12 commits into
apache:mainfrom
mlbiscoc:SOLR-17855-dropwizard-removal
Open

SOLR-17855: Remove Dropwizard dependency from core#4404
mlbiscoc wants to merge 12 commits into
apache:mainfrom
mlbiscoc:SOLR-17855-dropwizard-removal

Conversation

@mlbiscoc
Copy link
Copy Markdown
Contributor

@mlbiscoc mlbiscoc commented May 7, 2026

https://issues.apache.org/jira/browse/SOLR-17855

After migration to OTEL, dropwizard metrics are not used anymore for metrics collection but there were still a few references to it in solr-core. This removes those dropwizard instruments so we can get rid of the dependency from core but it is still needed in test-framework where we embed zookeeper.

Copy link
Copy Markdown
Contributor Author

@mlbiscoc mlbiscoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few things to note:

  • The OVERSEERSTATUS api no longer including rates, percentiles and timing information. Only requests and error info. If we still care about this overseer info, I can properly collect to OTEL and have it exposed via our metrics api and OTLP instead.
  • Dropwizard metrics-core:3.2.5 is still being pulled in places where we embed ZooKeeper which is Solrs testing framework.

Copy link
Copy Markdown
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Solr has RTimer, which maybe might be an adequate substitute in some places where you removed stuff completely?

I'm okay with losing some metrics relating to the Overseer. Someone can invest in bringing it back if they so please. I'm more okay with this, I admit, because I want folks to disable the Overseer instead ;-)

It'll be very necessary to add to major-changes-in-solr-10.adoc here on metrics that have disappeared.

scheduledExecutor.shutdown();
} else {
log(meter.getCount() + " docs at " + meter.getMeanRate() + " doc/s");
double rate = count / ((System.nanoTime() - meterStartNanos) / 1e9);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to see something that requires less trust in the time math as a code reviewer... like some JDK API calls to produce a readable equivalent.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we wanted to bring back such timings, I'd rather it be done with annotations similar to tracing @WithSpan. Alas, OTEL has no metric equivalent, which would be for something more light-weight that isn't span-worthy. I think talking to ZK is span-worthy as it's another process/system, albeit it's so fast that someone would typically prefer to disable such. Ideally Spans would have levels like logging but they don't; people have proposed it. I did see this interesting development, which is somewhat related as it allows disabling spans for an entire "scope".

@epugh
Copy link
Copy Markdown
Contributor

epugh commented May 8, 2026

@mlbiscoc I think we may be able to elminate JettySolrRunnerWithMetrics as well... I poked around and it appears that it is no longer used in testing, it exists for two classes, but isn't actively used in the asserts, it just sits there...

I played around a bit and got this patch file:
remove_jettymetricstestframework.patch. This could also be in another PR... Maybe this is a breaking chnage to our test-framework? I think though, it's minimal if it is, and in a perfect world would have been part of our Solr 10 Test Framework changes anyway.

@mlbiscoc
Copy link
Copy Markdown
Contributor Author

@mlbiscoc I think we may be able to elminate JettySolrRunnerWithMetrics as well... I poked around and it appears that it is no longer used in testing, it exists for two classes, but isn't actively used in the asserts, it just sits there...

I played around a bit and got this patch file: remove_jettymetricstestframework.patch. This could also be in another PR... Maybe this is a breaking chnage to our test-framework? I think though, it's minimal if it is, and in a perfect world would have been part of our Solr 10 Test Framework changes anyway.

I took a look and applied it to this PR. I think it's worth adding. The jetty metrics were essentially a no-op in 10 at this point anyways.

@@ -0,0 +1,9 @@
title: Remove Dropwizard Metrics as a direct Solr dependency. The `OVERSEERSTATUS` response no longer includes per-operation timing metrics or queue stats; success/error counts are preserved.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also advise that tracing is a substitute for per-operation metrics.

}

private static void logIndexingRate(long docCount, long startNanos) {
long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we're logging at docs per SECOND, why convert to milliseconds at an intermediate step? It's okay if we round to a whole number... docs/sec should be rather high anyway. It's also okay if you want to add a fraction.


==== Overseer Status Metrics Removed

The `OVERSEERSTATUS` Collection API response no longer includes per-operation timing metrics but per-operation `requests` and `errors` counts still exist.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention that OTEL tracing can be used to get similar information indirectly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants