High school student uses AI to reveal 1.5 million previously unknown objects in space.

Exploring Space with AI

Very cool work! I read the abstract of the paper. I don't think it needs the "AI" buzzword, though; his work is already impressive and stands on its own, and it has nothing to do with LLMs.
It uses a neural net that he designed and trained, so it is AI. The public's view of "AI" seems to be mostly the generative stuff like chatbots and image gen, but deep learning is perfect for science and medical fields.
Exactly. Artificial intelligence is the parent category.
AI is far more than LLMs. Why does everyone on Lemmy think AI is nothing but LLMs?!
Because actual examples of analytical and practical AI are rarely posted here. AI analyzing medical imaging to catch tumors early isn't something we discuss very much, for example.
What we do get is the marketing hype, AI images, crappy AI search results, ridiculous investment in AI to get rid of human workers, AI’s wasteful power requirements, and everything else under the sun with corporations basically trying to make a buck off AI while screwing over workers. This is the AI we see day to day. Not the ones making interesting and useful scientific discoveries.
The term “artificial intelligence” is supposed to refer to a computer simulating the actions/behavior of a human.
LLMs can mimic human communication and therefore fit the AI definition.
Generative AI for images is a much looser fit, but it still fulfills a purpose that until recently most of us thought only humans could do, so some people think it counts as AI.
However, some of the earliest AIs in computer programs were just NPCs in video games, looong before deep learning became a widespread thing.
Enemies in video games (typically referring to the algorithms used for their pathfinding) are AI whether they use neural networks or not.
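To make that concrete, here's a minimal, made-up example of classic "game AI" pathfinding: A* on a tiny grid, with no neural network anywhere. The grid and costs are invented purely for illustration.

```python
# Classic game-style pathfinding, no neural network anywhere: A* on a small grid.
# This is the sort of algorithm people meant by "game AI" long before deep
# learning; the grid and costs here are made up for illustration.
import heapq

def astar(grid, start, goal):
    """Return a shortest path from start to goal on a 0/1 grid (1 = wall)."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nxt = (nr, nc)
                heapq.heappush(frontier, (cost + 1 + h(nxt), cost + 1, nxt, path + [nxt]))
    return None   # no route exists

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))   # route around the wall via (1, 2)
```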
Deep learning neural networks are predictive mathematical models that can be tuned from data, much like linear regression. This, in itself, is not AI.
Transformers are a special structure that can be built into a neural network to selectively weight certain inputs. (This is how ChatGPT can act like it has object permanence or any sort of memory when it doesn't.) Again, this kind of predictive model is not AI any more than using Simpson's Rule to estimate an integral from a table of data points would be AI.
Neural networks can be used to mimic human actions, and when they do, that fits the definition. But the techniques and math behind the models are not AI.
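To illustrate that point, here's a minimal sketch (mine, not from the article) fitting the same noisy data two ways: closed-form linear regression and a tiny one-hidden-layer neural net trained by gradient descent. Both are just predictive models tuned to data; only what we use them for differs.

```python
# Minimal sketch (not from the article): a one-hidden-layer neural net fit by
# gradient descent, next to an ordinary least-squares line. Both are just
# predictive models tuned to data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(x) + 0.1 * rng.standard_normal(x.shape)

# Ordinary linear regression: closed-form least squares.
X = np.hstack([x, np.ones_like(x)])
w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)

# Tiny neural net: 1 -> 16 -> 1 with tanh, tuned by plain gradient descent.
W1 = rng.standard_normal((1, 16)) * 0.5
b1 = np.zeros(16)
W2 = rng.standard_normal((16, 1)) * 0.5
b2 = np.zeros(1)
lr = 0.05
for _ in range(5000):
    h = np.tanh(x @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # network output
    err = pred - y                    # same squared-error criterion as regression
    # Backpropagation: chain rule applied to the loss.
    dW2 = h.T @ err / len(x)
    db2 = err.mean(0)
    dh = err @ W2.T * (1 - h ** 2)
    dW1 = x.T @ dh / len(x)
    db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print("linear fit MSE:", float(np.mean((X @ w_lin - y) ** 2)))
print("neural net MSE:", float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2)))
```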
The only people who refer to non-AI things as AI are people who don’t know what they’re talking about, or people who are using it as a buzzword for financial gain (in the case of most corporate executives and tech-bros it is both)
I don't know why, but reading this is hilarious to me: picturing the high schooler logging into ChatGPT, asking it "how many unknown objects are there in space", and presenting the response as his result.
Every day a new Einstein is born, and their life and choices are dictated by the level of wealth and opportunity they are born into.
We would see stories like this every week if wealth and opportunities were equally distributed.
I largely agree, except s/equally/equitably.
I didn't see where the article was about capitalism. Did you comment the right post? It seems off-topic.
This doesn't seem off topic to me. A smart person had access to the tools and support system to enable them to do something incredible, but thousands of people equally capable didn't have the opportunity. Seems pretty easy to follow the logic
You might not be that new Einstein...
I was hoping the article would tell us more about the technique he developed.
The model I implemented can be used for other time domain studies in astronomy, and potentially anything else that comes in a temporal format
All I gathered from it is that it is a time-series model.
I found his paper: https://iopscience.iop.org/article/10.3847/1538-3881/ad7fe6 (no paywall 😃)
From the intro:
VARnet leverages a one-dimensional wavelet decomposition in order to minimize the impact of spurious data on the analysis, and a novel modification to the discrete Fourier transform (DFT) to quickly detect periodicity and extract features of the time series. VARnet integrates these analyses into a type prediction for the source by leveraging machine learning, primarily CNN.
They start with some good old-fashioned signal processing before feeding the result into a neural net. The NN was trained on synthetic data.
FC = fully connected layer, so they're mixing FC layers with mostly convolutional layers in their NN. I haven't read the whole paper, so I'm happy to be corrected.
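For a feel of the ingredients the paper names (wavelet cleanup, DFT-based periodicity features, a mostly-convolutional classifier with an FC head), here's a rough sketch. To be clear: the wavelet choice, layer sizes, class count, and the plain rFFT standing in for their modified DFT are all my own guesses for illustration, not the actual VARnet architecture or training code.

```python
# Rough sketch of the kind of pipeline the paper describes (wavelet denoising ->
# DFT periodicity features -> CNN + FC classifier). Details are assumptions.
import numpy as np
import pywt              # classical 1-D wavelet decomposition
import torch
import torch.nn as nn

def preprocess(light_curve: np.ndarray, n_bins: int = 256) -> torch.Tensor:
    """Classical signal processing before the neural net."""
    # 1. Wavelet decomposition: drop the finest-scale coefficients to suppress
    #    spurious points, then reconstruct a cleaned series.
    coeffs = pywt.wavedec(light_curve, "db4", level=3)
    coeffs[-1] = np.zeros_like(coeffs[-1])
    cleaned = pywt.waverec(coeffs, "db4")[: len(light_curve)]
    # 2. DFT amplitudes as crude periodicity features (the paper uses a
    #    modified DFT; a plain rFFT stands in for it here).
    spectrum = np.abs(np.fft.rfft(cleaned, n=2 * n_bins))[:n_bins]
    spectrum /= spectrum.max() + 1e-12
    return torch.tensor(spectrum, dtype=torch.float32).unsqueeze(0)  # (1, n_bins)

class ToyVariabilityClassifier(nn.Module):
    """Mostly-convolutional net with an FC head, as the comment describes."""
    def __init__(self, n_classes: int = 10):   # class count is a placeholder
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 8, n_classes))

    def forward(self, x):                 # x: (batch, 1, n_bins)
        return self.fc(self.conv(x))

# Tiny usage example on a synthetic light curve (the real model is trained on
# large batches of synthetic data, per the paper).
lc = np.sin(np.linspace(0, 40, 1000)) + 0.3 * np.random.randn(1000)
features = preprocess(lc).unsqueeze(0)    # (1, 1, 256)
logits = ToyVariabilityClassifier()(features)
print(logits.shape)                       # torch.Size([1, 10])
```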
The model was run (and I think trained?) on very modest hardware:
The computer used for this paper contains an NVIDIA Quadro RTX 6000 with 22 GB of VRAM, 200 GB of RAM, and a 32-core Xeon CPU, courtesy of Caltech.
That's essentially a double-VRAM Nvidia RTX 2080 Ti plus a Skylake-era Intel Xeon, an aging circa-2018 setup. With room for a batch size of 4096, no less! Though they did run into a preprocessing bottleneck on the CPU/RAM side.
The primary concern is the clustering step. Given the sheer magnitude of data present in the catalog, without question the task will need to be spatially divided in some way, and parallelized over potentially several machines
That's not modest. AI hardware requirements are just crazy.
For an individual yes. But for an institution? No.
I mean, "modest" may be too strong a word, but a 2080 TI-ish workstation is not particularly exorbitant in the research space. Especially considering the insane dataset size (years of noisy, raw space telescope data) they’re processing here.
Also, that's not always true. Some "AI" models, especially old-school ones, run fine on old CPUs. There are also efforts (like BitNet) to make larger models run fast on cheap hardware.
So a 5090, 5950x3d & 192gb of RAM would run it on "consumer" hardware?
That’s even overkill. A 3090 is pretty standard in the sanely priced ML research space. It’s the same architecture as the A100, so very widely supported.
5090 is actually a mixed bag because it’s too new, and support for it is hit and miss. And also because it’s ridiculously priced for a 32G card.
And most CPUs with tons of RAM are fine, depending on the workload, but the constraint is usually “does my dataset fit in RAM” more than core speed (since just waiting 2X or 4X longer is not that big a deal).
I've managed to run AI on hardware even older than that. The issue is it's just painfully slow. I have no idea if it has any impact on the actual results though. I have a very high spec AI machine on order, so it'll be interesting to run the same tests again and see if they're any better, or if they're simply quicker.
I have no idea if it has any impact on the actual results though.
Is it a PyTorch experiment? Other than maybe different default data types on CPU, the results should be the same.
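A quick way to check is to run the same seeded model on CPU and GPU and compare the outputs; the kernels differ, so you'd expect tiny floating-point discrepancies rather than different answers. The model and data below are placeholders, not anything from the paper.

```python
# Small sanity check: run the same seeded model on CPU and, if available, on GPU,
# and confirm the outputs agree to within float tolerance.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 4))
x = torch.randn(128, 64)

with torch.no_grad():
    out_cpu = model(x)
    if torch.cuda.is_available():
        out_gpu = model.to("cuda")(x.to("cuda")).cpu()
        # Different kernels -> tiny float discrepancies, not different results.
        print("max abs diff:", (out_cpu - out_gpu).abs().max().item())
        print("close:", torch.allclose(out_cpu, out_gpu, atol=1e-5))
```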
My mans look like he about to be voted most likely to agent 47 a health insurance ceo
Let them live in fear.
Anything but the fuckin' metric system...
Begging your pardon Sir but it's a bigass sky to search.
Been wanting that gif and been too lazy to record it!
AI accomplishing something useful for once?!
AI has been used for tons of useful stuff for ages; you just never heard about it unless you were in the field, until LLMs came around.
How many are hallucinations
I haven't read the paper, and I'm sure he did a great job. Regardless of that, and in principle, anyone can do this in less than an hour. The trick is to get external confirmation for all the discoveries you've made.
Think of all the astronomers he put out of work. :(
This isn't exactly the type of work tons of astronomers are doing, nor does it cut into their jobs. Astronomers have already been using ML/algorithms/machine vision/similar stuff like this for this kind of work for years.
Besides, whenever a system identifies objects like this, they still need to be confirmed. This kind of thing just means telescope time is more efficient and it leaves more time for the kinds of projects that normally don't get much telescope time.
Also, space is big. 1.5 million possible objects is NOTHING.