Yeah - if you click on an instance there, it shows one figure for communities but only lists a small subset of them. lemmy.ml, for example, says 104 but only lists 14.
there seem to be two separate issues relating to that.
the number at the top includes "all" communities, including those marked as nsfw.
on a quick glance, it seems all the nsfw marked ones are correctly marked as such, in the sense of also being nsfw on lemmy.
there are also a large number of communities missing overall, but at least the number next to the community tab matches the number of listed communities when the filter is set to show nsfw communities as well.
there is also either some kind of data corruption going on or there may have been some strange spam communities on lemmy.world in the past, as it shows a bunch of communities with random numbers in the name and display names like oejwfiojwwqpofioqwfiowqiofkwqeifjwefwefoejwfiojwwqpofioqwfiowqiofkwqeifjwefwefoejwfiojwwqpofioqwfiowqiofkwqeifjwefwefoejwfiojwwqpofioqwfiowqiofkwqeifjwefwefoejwfiojwwqpofioqwfiowqiofkwqeifjwefwefoejwfiojwwqpofioqwfiowqiofkwqeifjwefwef which don't currently exist on lemmy.world.
Ah, right - I see now thanks (it didn't occur to me to click 'show nsfw').
As for the 'oejwfiojwwqpofioqwfiowqiofkwqeifjwefwef' - I remember them as being spam. It was maybe a year or so ago now, but a LW user tried creating communities with every name imaginable as a way to squat on them. They got to about 2000 before they were stopped, and in retaliation they created about 4000 of the 'wefwef' ones (I guess the deletion of them from LW didn't propagate somewhere, and so something out there still thinks they exist).
from a quick look, the crawler doesn't seem to use federation at all; it seems to just iterate over the community list api for each tracked instance. it probably doesn't have logic to remove entries that no longer exist, considering that they're still in there.
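to illustrate what that missing removal logic could look like, here is a minimal sketch. it is not taken from the crawler's code: the `reconcile` function, the `stored` dict, and the idea of keeping one set of community names per instance are all assumptions made up for this example. the point is just that each full crawl of an instance replaces the stored set, so anything the instance no longer lists gets dropped.

```python
from typing import Dict, Iterable, Set, Tuple


def reconcile(
    stored: Dict[str, Set[str]],
    instance: str,
    crawled: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Replace the stored community names for one instance with the
    result of the latest full crawl.

    `stored` is a hypothetical in-memory stand-in for whatever
    database the crawler really uses. Returns (added, removed).
    """
    current = set(crawled)
    previous = stored.get(instance, set())

    added = current - previous      # new since the last crawl
    removed = previous - current    # no longer listed by the instance

    # Overwriting (rather than merging) is what prunes stale entries.
    stored[instance] = current
    return added, removed


# Example: a spam community that was deleted upstream disappears
# from the stored data after the next crawl.
stored = {"lemmy.world": {"asklemmy", "wefwef123"}}
added, removed = reconcile(stored, "lemmy.world", ["asklemmy", "technology"])
```

something like this would only be safe to run after a crawl of that instance completed successfully; otherwise a timeout or partial response would wipe out communities that still exist.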