I think I didn't explain myself correctly.
The Just In Time Compiler in a Java Virtual Machine which does a final compilation step at runtime from JVM assembly to native code will make the typical code in algorithms run as fast as in C++ because ultimatelly they both end up as the same assembly. I've actually measure this by the way, though it was years ago.
However things like memory architectures are different between those languages and the one in C++ can easilly be made faster than in Java, though the downside of that memory architecture is nastier bugs.
Further, and this time about general performance, if you're worried about performance above all, then in terms of general performance, C is generally faster than C++: for example virtual functions in C++ objects have calling overheads which function calls when there is no inheritance do not have so unless you don't use inheritance at all, some function calls in C++ will be slower.
As for parallelization, I've actually worked both in massive parallel Java with distributed computing and GPU computing and they're completelly different kinds of parallelization with different capabilities and optimal use cases: good luck making a CUDA application optimized for serving paralled requests from millions of clients sourcing data from multiple sources and good luck making an LLM runtime with distributed computing in Java were parts of the pre-trained neural network are in different machines - GPU computing can't arbitrally decide it needs some data and fetch it whilst the overhead of synching two parts of a Java system running in different machines is way (millions of times?) larger than the overhead of synching read-then-write-access to the same position in a RWStructuredBuffer from 2 different processing units.
Comparing these two kind of parallelism is not an apple and oranges comparison, it'sm more like an apples and horseshoes comparison.