@KingRandomGuy

KingRandomGuy@lemmy.world · 4 days ago (edited)

What info have you heard about Fenghua 3? I’d last read that it’s not strictly an AI accelerator but can actually do graphics tasks, which is neat. Would make it more of a competitor to a professional workstation card like an RTX PRO 6000.

I’m most curious about their CUDA compatibility claim. I would expect that to cause a pretty significant performance hit since when writing high-performance CUDA kernels, you generally need to specialize the kernel to the individual GPU (an H100 kernel will look quite different compared to a 4090 kernel, for example). But if in spite of that it can achieve H100 performance, that’d be cool.

KingRandomGuy@lemmy.world · 3 months ago

I’ll need to give this a read, but I’m not super sure what’s novel here. The core idea sounds a lot like GaussianImage (ECCV '24), in which they basically perform 3DGS except with 2D gaussians to fit an image with fewer parameters than implicit neural methods. Thanks for the breakdown!
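
For anyone unfamiliar with the idea, here's a rough sketch of what "an image as a set of 2D Gaussians" means. This is only the forward rendering step, and it's heavily simplified: isotropic, grayscale Gaussians summed together. It is not GaussianImage's actual parameterization or fitting procedure, and the `render` function and the tuple layout are my own assumptions for illustration.

```python
import math

def render(gaussians, width, height):
    """Sum isotropic 2D Gaussians into a grayscale image (a list of rows)."""
    image = [[0.0] * width for _ in range(height)]
    # Each Gaussian is (center_x, center_y, sigma, weight): 4 parameters.
    for cx, cy, sigma, weight in gaussians:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                image[y][x] += weight * math.exp(-d2 / (2 * sigma ** 2))
    return image

gaussians = [(2.0, 2.0, 1.0, 1.0), (5.0, 4.0, 1.5, 0.5)]
image = render(gaussians, width=8, height=6)

num_params = 4 * len(gaussians)  # parameters describing the image
num_pixels = 8 * 6               # values a raw pixel grid would store
```

Fitting would then mean adjusting those per-Gaussian parameters by gradient descent until the rendered image matches the target.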

KingRandomGuy@lemmy.world · 6 months ago

Yeah, you can certainly get it to reproduce some pieces (or fragments) of work exactly, but definitely not everything. Even a frontier LLM’s weights are far too small to fully memorize most of its training data.

KingRandomGuy@lemmy.world · 7 months ago

Some apps only require ‘basic’ play integrity verification, but now check to see if they’re installed via the Play Store. They refuse to run if they’re installed via an alternative source.

This has been a problem for GrapheneOS, since some apps filter themselves out of the Play Store search if you don’t pass strong play integrity, despite the fact that they don’t require it. Luckily Graphene now has a bypass for this.

KingRandomGuy@lemmy.world · 7 months ago

Yep, since this is using Gaussian Splatting you’ll need multiple camera views and an initial point cloud. You get both for free from video via COLMAP.

KingRandomGuy@lemmy.world · 8 months ago

The general framework for evolutionary methods/genetic algorithms is indeed old but it’s extremely broad. What matters is how you actually mutate the algorithm being run given feedback. In this case, they’re using the same framework as genetic algorithms (iteratively building up solutions by repeatedly modifying an existing attempt after receiving feedback) but they use an LLM for two things:

1. Overall better sampling (the LLM has better heuristics for figuring out what to fix compared to handwritten techniques), meaning higher efficiency at finding a working solution.
2. “Open set” mutations: you don’t need to pre-define what changes can be made to the solution. The LLM can generate arbitrary mutations instead. In particular, AlphaEvolve can modify entire codebases as mutations, whereas prior work only modified single functions.
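
To make the framework concrete, here's a minimal sketch of the generic loop on a toy problem. This is not AlphaEvolve's code; `evolve`, `score`, `random_mutate`, and the toy target are all made up for illustration. The point is that `mutate` is a pluggable function: here it's a handwritten, "closed set" mutation, while in an LLM-driven system that call would be replaced by a model proposing an arbitrary edit.

```python
import random

def evolve(initial, mutate, score, generations=200, population_size=8, seed=0):
    """Repeatedly mutate a good candidate and keep the best ones found so far."""
    rng = random.Random(seed)
    population = [initial]
    for _ in range(generations):
        # Pick a parent, biased toward higher scores (tournament of up to 2).
        contenders = rng.sample(population, k=min(2, len(population)))
        parent = max(contenders, key=score)
        # In an LLM-driven system, this call would be the model proposing an edit.
        child = mutate(parent, rng)
        population.append(child)
        # Feedback step: keep only the top-scoring candidates.
        population = sorted(population, key=score, reverse=True)[:population_size]
    return population[0]

# Toy problem: match a hidden target list.
TARGET = [3, 1, 4, 1, 5]

def score(candidate):
    return -sum(abs(a - b) for a, b in zip(candidate, TARGET))

def random_mutate(candidate, rng):
    # Handwritten mutation: bump one element by +1 or -1.
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

best = evolve([0, 0, 0, 0, 0], random_mutate, score)
```

A smarter `mutate` doesn't change the loop at all; it just wastes fewer generations on bad proposals, which is the sampling-efficiency point above.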

Section 5 (“Related Work”) of their whitepaper is probably what you’re looking for; see here.

KingRandomGuy@lemmy.world · 9 months ago

I agree that pickle works well for storing arbitrary metadata, but my main gripe is that there’s no exact standard for how the metadata should be formatted. For FITS, for example, there are keywords for metadata such as the row order, CFA matrices, etc. that all FITS processing and displaying programs need to follow to properly read the image. So to make working with multi-spectral data easier, it’d definitely be helpful to have a standard set of keywords and encoding format.

It would be interesting to see if photo editing software will pick up multichannel JPEG. As of right now there are very few sources of multi-spectral imagery for consumers, so I’m not sure what the target use case would be. The closest thing I can think of is narrowband imaging in astrophotography, but normally you process those in dedicated astronomy software (e.g. Siril, PixInsight), though you can also re-combine different wavelengths in traditional image editors.

I’ll also add that HDF5 and Zarr are good options to store arrays in Python if standardized metadata isn’t a big deal. Both of them have the benefit of user-specified chunk sizes, so they work well for tasks like ML where you may have random accesses.
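
As a rough illustration of why chunk sizes matter for random access, here's the index math behind a chunked layout. This is a conceptual sketch only, not the h5py or Zarr API; `chunk_for_index` and the chunk shape are hypothetical.

```python
def chunk_for_index(index, chunk_shape):
    """Return (chunk id, offset within that chunk) for one array element."""
    chunk_id = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk_id, offset

# A single element read only has to load the one chunk that contains it.
result = chunk_for_index((1000, 37, 5), (64, 64, 8))
print(result)  # ((15, 0, 0), (40, 37, 5))
```

So if your typical read is one sample at a time, you'd want chunks shaped roughly like one sample, rather than long strips that span many samples.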

KingRandomGuy@lemmy.world · 9 months ago

I guess part of the reason is to have a standardized method for multi- and hyperspectral images, especially for storing things like metadata. Simply storing a numpy array may not be ideal if you don’t keep metadata on what is being stored and in what order (e.g. axis order, what channel corresponds to each frequency band, etc.). Plus it seems like they extend lossy compression to this modality, which could be useful in some circumstances (though for scientific use you’d probably want lossless).
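
To show what I mean by keeping metadata with the data, here's a small stdlib-only sketch that packs a flat list of floats (standing in for the array) together with a JSON header. The layout and the key names (`axis_order`, `bands`) are made up for this example and don't follow any existing standard, which is exactly the problem a real format would solve.

```python
import json
import struct

def pack_cube(values, shape, metadata):
    """Serialize float32 values plus a JSON metadata header into bytes."""
    header = json.dumps({"shape": shape, **metadata}).encode("utf-8")
    body = struct.pack(f"<{len(values)}f", *values)
    # Layout: 4-byte little-endian header length, header, float32 payload.
    return struct.pack("<I", len(header)) + header + body

def unpack_cube(blob):
    """Recover the values and the metadata header written by pack_cube."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    count = 1
    for dim in header["shape"]:
        count *= dim
    values = list(struct.unpack_from(f"<{count}f", blob, 4 + header_len))
    return values, header

# Hypothetical 3-band cube, 1 row by 2 columns per band.
blob = pack_cube(
    [0.0, 0.5, 1.0, 1.5, 2.0, 2.5],
    shape=[3, 1, 2],
    metadata={"axis_order": ["band", "row", "col"], "bands": ["Ha", "OIII", "SII"]},
)
values, header = unpack_cube(blob)
```

Without the header, a reader has no way to tell whether those six numbers are 3 bands of 1x2 pixels or a 2x3 single-channel image.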

If compression isn’t the concern, certainly other formats could work to store metadata in a standardized way. FITS, the image format used in astronomy, comes to mind.

KingRandomGuy@lemmy.world · 9 months ago

I guess you’d measure whose GenAI models are performing the best on benchmarks (currently OpenAI, generally, though top models from China are not crazy far behind), as well as metrics like number of publications at top venues (NeurIPS, ICML, and ICLR for ML; CVPR, ICCV, and ECCV for vision; etc.).

A lot of great papers come out of Chinese institutions so I’m not sure who would be ahead in that metric either, though.